Introduction to Unsupervised Learning & Clustering


What is Unsupervised Learning?


Unsupervised learning, as the name suggests, is about learning without supervision. What does it mean to learn without supervision? What can you learn without supervision? As we saw in regression, the supervision comes from a part of the input data set: the part needed to check the correctness of your hypothesis. The input data has the questions as well as the correct answers, and the machine just needs to learn the mapping between them.
Unsupervised learning is quite different from this. Here, we do not have any answers. No questions either! Then what do we have? We have raw data, and we have no idea about its structure. There are no labeled examples to learn from. Unsupervised learning is about making sense of the data in hand. This is not just a theoretical fantasy. Unsupervised learning is widely used for analyzing data that we know nothing about.
When faced with such a situation, a huge chunk of unknown data, the natural tendency is to categorize it into groups based on the parameters we already know. Then, we can identify the tendencies of each of these clusters and get a meaningful mapping of the data. If your parameters capture enough of what distinguishes the data, predictions based on these clusters are more likely to be useful.
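To make the idea of grouping concrete, here is a minimal sketch of one common clustering approach, k-means, written with only the Python standard library. The function name kmeans, the sample points, and the deterministic "farthest point" starting rule are all assumptions made for this illustration; practical implementations usually pick starting centroids randomly or with a smarter scheme.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def kmeans(points, k, iterations=10):
    """Group points into k clusters; returns (centroids, labels)."""
    # Deterministic start (an assumption made for this sketch): take the first
    # point, then keep adding the point farthest from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )

    def nearest(p):
        # Index of the centroid closest to point p.
        return min(range(k), key=lambda i: dist(p, centroids[i]))

    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [nearest(p) for p in points]
        # Update step: move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:  # leave the centroid in place if its cluster is empty
                centroids[i] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, [nearest(p) for p in points]


# Hypothetical raw data: six unlabeled 2-D points forming two loose groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 9.0), (9.0, 8.0), (9.0, 9.0)]

centroids, labels = kmeans(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Nothing in the input says which point belongs to which group. The labels come out of the data alone, and each centroid summarizes the "tendency" of its cluster.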
With this background, we can now look into some important aspects of unsupervised learning.

Python Implementation

Next, we go into the implementation of some of these algorithms.