Today it is easy and comparatively cheap to buy hardware capable of storing terabytes of data. Now we need the tools to put this data to good use: discovering trends in streamed user-generated content, supplementing market research processes with automatic learning approaches, or recommending items to users in an online shopping system.
Apache Mahout is a project that provides tools for building these applications. Its mission is to build a suite of scalable machine-learning algorithms. Scalable here means that the implementations can cope with today's quantities of data. The project has an active community that helps with complex setups and problem settings. Its license is commercially friendly, which allows a wide range of applications to be built on top of Apache Mahout.
This talk provides a beginner-friendly introduction to the topic of machine learning. It presents a broad set of applications that benefit from machine learning, as well as a high-level overview of Mahout. It also covers the types of tasks that can be solved with each algorithm, and the pitfalls to look out for along the way.