Understanding KNNs
The k-nearest neighbors, or KNN, algorithm is a fundamental supervised learning technique used for both classification and regression tasks. The idea is simple: we plot our existing data points in feature space, and when a new data point arrives, we estimate its characteristics from the existing points nearest to it, as measured by a distance metric.

When we qualify and/or quantify the characteristics of something (like the car or cheetah in our introductory example), we can describe it with a variety of measurements. For example, color and shape would be qualitative measurements, while length, width, and height would be quantitative. In combination, these measurements let us compare a new something against the set of somethings we already have and make a reasoned guess about where it belongs among them. This next recipe will teach you the basic...
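Before moving on, here is a minimal sketch of the idea in plain Python, not the recipe's implementation: the function names (euclidean_distance, knn_predict) and the toy (length, height) measurements for cars and cheetahs are illustrative assumptions chosen to mirror the example above.

```python
# A minimal from-scratch sketch of KNN classification.
# Feature values and labels below are illustrative, not taken from the recipe.
from collections import Counter
import math

def euclidean_distance(a, b):
    """Distance metric: straight-line distance between two points in feature space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(training_points, training_labels, new_point, k=3):
    """Label a new point by majority vote among its k nearest existing points."""
    # Measure how far the new point is from every existing point.
    distances = [
        (euclidean_distance(p, new_point), label)
        for p, label in zip(training_points, training_labels)
    ]
    # Keep the k closest neighbors and let their labels vote.
    nearest = sorted(distances, key=lambda d: d[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy quantitative features: (length in m, height in m) for known "somethings".
points = [(4.5, 1.5), (4.2, 1.4), (1.3, 0.8), (1.2, 0.9)]
labels = ["car", "car", "cheetah", "cheetah"]

# A new, unlabeled something: long and low, so its nearest neighbors are cars.
print(knn_predict(points, labels, new_point=(4.4, 1.45), k=3))  # -> "car"
```

Euclidean distance is only one possible choice of distance metric here; any metric that captures how "close" two sets of measurements are can play the same role.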