Summary
In this chapter, we have introduced the main concepts of machine learning. We started with some basic mathematical definitions so that we have a clear view of data formats, standards, and certain kinds of functions. This notation will be adopted in the rest of the chapters in this book, and it's also among the most widely used in technical publications. We also discussed how scikit-learn seamlessly works with multi-class problems, and when one strategy is preferable to another.
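As a quick refresher, the two multi-class strategies can be sketched with scikit-learn's built-in wrappers. This is a minimal example on a toy three-class dataset, with an SVC chosen arbitrarily as the base binary classifier:

```python
from sklearn.datasets import load_iris
from sklearn.svm import SVC
from sklearn.multiclass import OneVsRestClassifier, OneVsOneClassifier

X, y = load_iris(return_X_y=True)  # 3 classes

# One-vs-rest: trains one binary classifier per class (3 here)
ovr = OneVsRestClassifier(SVC()).fit(X, y)

# One-vs-one: trains one classifier per pair of classes (3*2/2 = 3 here)
ovo = OneVsOneClassifier(SVC()).fit(X, y)

print(len(ovr.estimators_))  # 3
print(len(ovo.estimators_))  # 3
```

With more classes the difference grows quickly: for k classes, one-vs-rest trains k models while one-vs-one trains k(k-1)/2, which is why the latter is usually preferred only when each binary problem is small.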
The next step was the introduction of some fundamental theoretical concepts regarding learnability. The main questions we tried to answer were: how can we decide whether a problem can be learned by an algorithm, and what is the maximum accuracy we can achieve? PAC learning is a generic but powerful framework that can be adopted to define the boundaries of an algorithm. A PAC-learnable problem, in fact, can not only be managed by a suitable algorithm, but can also be learned in polynomial time.

Then, we introduced some common statistical learning concepts, in particular the MAP and ML learning approaches. The former tries to pick the hypothesis that maximizes the a posteriori probability, while the latter optimizes the likelihood, looking for the hypothesis that best fits the data. The ML strategy is one of the most widely used in machine learning because it isn't affected by a priori probabilities and is easy to implement in many different contexts. We also gave a physical interpretation of a loss function as an energy function: the goal of a training algorithm is always to find the global minimum, which corresponds to the deepest valley in the error surface.

At the end of this chapter, there was a brief introduction to information theory and how we can reinterpret our problems in terms of information gain and entropy. Every machine learning approach should aim to minimize the amount of information needed to recover the original (desired) outcomes from the predictions.
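The difference between the two approaches can be made concrete with a tiny numeric sketch. The hypotheses, priors, and likelihoods below are hypothetical values chosen only to show how MAP and ML can disagree when the prior is strongly skewed:

```python
import numpy as np

# Two candidate hypotheses with assumed (made-up) probabilities
hypotheses = ["h1", "h2"]
prior = np.array([0.9, 0.1])        # P(h): a priori probabilities
likelihood = np.array([0.3, 0.8])   # P(data | h)

# ML picks the hypothesis that maximizes the likelihood alone
h_ml = hypotheses[np.argmax(likelihood)]

# MAP weights the likelihood by the prior: posterior ∝ P(data | h) * P(h)
posterior = likelihood * prior       # [0.27, 0.08] before normalization
h_map = hypotheses[np.argmax(posterior)]

print(h_ml)   # 'h2': it fits the data best
print(h_map)  # 'h1': the strong prior overrides the better fit
```

With a uniform prior the two criteria always agree, which is one way to see why ML is a natural default when no reliable a priori information is available.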
In the next chapter, Chapter 3, Feature Selection and Feature Engineering, we're going to discuss the fundamental concepts of feature engineering, which is the first step in almost every machine learning pipeline. We're going to show you how to manage different kinds of data (numerical and categorical) and how it's possible to reduce dimensionality without a dramatic loss of information.