Supervised Learning and Unsupervised Learning
Let’s talk about supervised learning vs unsupervised learning. This is the agenda for today. First, I will quickly go over the background of machine learning. Then I will dive into supervised learning, talk about classification and regression with some examples, and go over some supervised learning algorithms. After that, I’ll dive into unsupervised learning, go over clustering and association with some examples, and cover some unsupervised learning algorithms. Lastly, we’ll have a grand summary.
Machine learning is used for many different purposes these days.
Take recommendation systems, for example. You may have noticed them in applications like Netflix, YouTube, and Spotify. Right on the front page, they normally show recommendations for you. Those recommendations are based on your data as a user: what you have been watching and what you have been listening to.
What is machine learning exactly?
Machine learning is the process of computers learning from past data and improving from experience without being explicitly programmed. Essentially, you give the computer past data and have it learn the patterns in that data. Then, when you feed new data into the computer, it should be able to make the right prediction based on what it has learned so far.
This is the general machine learning flow. Normally you get the data, then you clean it, for example by taking out NAs (missing values) or adjusting the variables accordingly. Then you train the model. To train the model, we use a machine learning algorithm, and this is where supervised learning and unsupervised learning come in.
By training a model, you are teaching the computer to learn the patterns so that it can make the right predictions in the future. Then we test the model to see how accurate it is. If it’s good, we deploy it; if it’s not, we go back and improve the model. You don’t need to know the details of this flow right now to learn about supervised and unsupervised learning, but it is good background knowledge to have.
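To make the flow a bit more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the data is made up, the "model" is just a threshold halfway between the two class averages, and the names `clean`, `train`, `predict`, and `accuracy`, as well as the 0.9 cut-off for deploying, are hypothetical and not taken from any library.

```python
# Made-up toy data: (feature value, label). None marks a missing value (NA).
raw_data = [
    (1.0, 0), (2.0, 0), (None, 1), (3.0, 0), (6.0, 1),
    (7.0, 1), (8.0, None), (9.0, 1), (2.5, 0), (7.5, 1),
]

def clean(rows):
    """Clean the data: drop every row that contains a missing value."""
    return [row for row in rows if None not in row]

def train(rows):
    """Train the model: learn a threshold halfway between the class averages."""
    zeros = [x for x, label in rows if label == 0]
    ones = [x for x, label in rows if label == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2

def predict(threshold, x):
    """Predict label 1 above the learned threshold, otherwise label 0."""
    return 1 if x > threshold else 0

def accuracy(threshold, rows):
    """Test the model: fraction of rows whose prediction matches the label."""
    hits = sum(predict(threshold, x) == label for x, label in rows)
    return hits / len(rows)

data = clean(raw_data)                       # 8 of the 10 rows survive
train_rows, test_rows = data[:6], data[6:]   # simple hold-out split
threshold = train(train_rows)
test_accuracy = accuracy(threshold, test_rows)

# Hypothetical rule: deploy only if the model is accurate enough on test data.
if test_accuracy >= 0.9:
    print("good enough: deploy the model")
else:
    print("not good enough: go back and improve the model")
```

In a real project each step would be far more involved, but the order of the steps is the same: get the data, clean it, train, test, and then either deploy or improve.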
There are two main types of machine learning that we will cover here: supervised learning and unsupervised learning.
What is supervised learning?
So, let’s go into supervised learning. Supervised learning is the process of learning from labelled data and predicting the output based on that experience. But what is labelled data exactly? Labelled data is data where you already know the target answer.
So, let’s go over an example. Let’s say we have data consisting of red and blue circles, and we have the target answers: red for the red circle and blue for the blue circle. We feed this data into the computer and have it learn the patterns behind these target answers. Then, after it has learned the patterns, we feed the computer new data, say a red circle. It should be able to output red based on what it learned during training.
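Here is a small sketch of the red and blue example. It assumes each circle is described by a made-up RGB colour value, and it uses a nearest-centroid classifier, which is just one simple way to learn from labelled data. The function names `train` and `predict` are hypothetical.

```python
import math

# Labelled data: each circle is a made-up RGB colour plus its target answer.
labelled_circles = [
    ((255, 0, 0), "red"),
    ((200, 30, 30), "red"),
    ((0, 0, 255), "blue"),
    ((20, 40, 220), "blue"),
]

def train(examples):
    """Learn one average colour (centroid) per label."""
    grouped = {}
    for colour, label in examples:
        grouped.setdefault(label, []).append(colour)
    return {
        label: tuple(sum(channel) / len(colours) for channel in zip(*colours))
        for label, colours in grouped.items()
    }

def predict(centroids, colour):
    """Return the label whose centroid is closest to the new colour."""
    return min(centroids, key=lambda label: math.dist(centroids[label], colour))

centroids = train(labelled_circles)
print(predict(centroids, (230, 10, 20)))   # a new, reddish circle -> red
```

The key point is that the labels are part of the training data: the computer only knows what "red" means because we told it which examples were red.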
There are two types of supervised learning: classification and regression.
Classification is when the output is a category or class, for example red or blue, or disease or no disease.
Regression is when the output is a real value, for example sales, weight, or income. For regression, the model tries to find a relationship between the dependent variable and the independent variables.
Let’s go over a classification example under supervised learning. Let’s say we want to find out whether a patient has heart disease or not. We have general patient data such as height, weight, age, gender, smoker (Y/N), cholesterol level, etc., and we feed this data into the computer to have it learn the patterns.
For example, it learns that patients with these characteristics had heart disease and patients with those characteristics didn’t, and this is all based on past data. Once it has learned the patterns, we feed the computer new data: the height, weight, age, gender, smoking status, cholesterol level, etc. of a new patient. Based on those characteristics, the computer can then decide whether this patient is likely to have heart disease or not.
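As a rough sketch of this classification example, the code below uses k-nearest neighbours with k = 3, which is only one of several algorithms that could be used here. The patient records are entirely made up, only three of the characteristics (age, cholesterol level, smoker) are used, and the helper names `fit_scaler`, `scale`, and `knn_predict` are hypothetical.

```python
import math
from collections import Counter

# Made-up past patients: (age, cholesterol in mg/dL, smoker 1/0) and the known answer.
patients = [(45, 180, 0), (50, 190, 0), (38, 170, 0),
            (62, 260, 1), (58, 240, 1), (66, 280, 1)]
answers = ["no heart disease", "no heart disease", "no heart disease",
           "heart disease", "heart disease", "heart disease"]

def fit_scaler(rows):
    """Record the (min, max) of every column so features can be put on one scale."""
    return [(min(col), max(col)) for col in zip(*rows)]

def scale(row, bounds):
    """Min-max scale one row using the bounds learned from the training data."""
    return [(v - lo) / (hi - lo) if hi > lo else 0.0
            for v, (lo, hi) in zip(row, bounds)]

def knn_predict(train_rows, train_labels, new_row, k=3):
    """Majority vote among the k past patients most similar to the new one."""
    bounds = fit_scaler(train_rows)
    query = scale(new_row, bounds)
    by_distance = sorted(
        (math.dist(scale(row, bounds), query), label)
        for row, label in zip(train_rows, train_labels)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

print(knn_predict(patients, answers, (60, 250, 1)))   # heart disease
print(knn_predict(patients, answers, (42, 175, 0)))   # no heart disease
```

The features are scaled first so that cholesterol, which has the largest numbers, does not dominate the distance calculation.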
Let’s go over a regression example. Let’s say we have the independent variable of car age and the dependent variable of car price. Generally, if a car is older, then it has a lower price, and if a car is newer, then it has a higher price. Let’s say we feed this data into the computer to have it start learning the patterns. Once it has learned the patterns, we feed the computer new data. So, we feed it a new car age, and it should be able to predict the price of the car based on what it has learned.
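The car example can be sketched with a simple straight-line fit (ordinary least squares with one independent variable). This is a minimal sketch: the ages and prices are made up, prices are assumed to be in thousands, and `fit_line` and `predict_price` are hypothetical helper names.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up past data: car age in years and price in thousands.
car_ages = [1, 2, 3, 4, 5]
car_prices = [28.0, 26.5, 23.5, 22.0, 20.0]

slope, intercept = fit_line(car_ages, car_prices)

def predict_price(age):
    """Predict the price of a car from its age using the fitted line."""
    return slope * age + intercept

print(slope)             # negative: older cars are cheaper
print(predict_price(6))  # about 17.85 for a six-year-old car
```

The negative slope is the pattern the model has learned: every extra year of age lowers the predicted price.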
So, these are some of the supervised learning algorithms:
- Decision trees
- Random forests
- K-nearest neighbours
- SVM, which is support vector machines
- Logistic regression
And there are many more.
What is unsupervised learning?
All right, let’s dive into unsupervised learning. Unsupervised learning is the opposite of supervised learning: it’s the process of learning from unlabelled data and predicting the output from the patterns detected. Unlabelled data is data where you don’t know the answer. So, for unsupervised learning, you don’t have the answers, but you have the computer find the patterns for you.
So, let’s go over an example of unsupervised learning. This is very similar to the red and blue example we had before. Let’s say we feed the computer the same data, except this time we don’t tell the computer whether each circle is red or blue. The computer will then look at the data and start recognizing the patterns.
Once it recognizes the patterns, it will start grouping the data based on those patterns and similarities. So, in this case, the red circles end up grouped together and the blue circles end up grouped together.
There are two types of unsupervised learning: one is clustering and the other is association. Clustering finds a structure or pattern in the data and groups similar objects into clusters. For example, it groups customers by purchasing behaviour. Association discovers rules that explain relationships between variables in the data. For example, people who buy X also tend to buy Y.
Okay, let’s go over a clustering example under unsupervised learning. Let’s say we want to observe the performance of students in a class. We want to find the patterns between the number of hours these students study and their final test scores.
Let’s say we feed this data into the computer and have it recognize the patterns. Once it recognizes the patterns, it clusters the students into groups. Group A is what we would expect: the more hours you study, the better your final test score should be. Group B is a little unexpected: although they didn’t score as high as group A, they scored pretty decently considering how few hours they studied.
So, for this group, we would recommend that they study more so that they can reach their full potential. Group C is very concerning because they studied a considerable number of hours but were still not able to score well. In this case, we would recommend a new study plan for this group.
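To see how a computer could find groups like A, B, and C on its own, here is a minimal k-means sketch. The student data is made up, and no labels are given to the algorithm. To keep the result repeatable, the starting centroids are picked with a farthest-first rule instead of at random, which is an assumption of this sketch; the names `farthest_first_init` and `kmeans` are hypothetical.

```python
import math

# Made-up, unlabelled data: (hours studied, final test score) per student.
students = [(9, 92), (2, 70), (9, 45), (10, 95), (3, 72),
            (10, 40), (8, 90), (1, 68), (8, 48)]

def farthest_first_init(points, k):
    """Start from the first point, then keep adding the point farthest from all chosen ones."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids

def kmeans(points, k, max_iter=100):
    """Alternate between assigning points to the nearest centroid and recomputing centroids."""
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda i: math.dist(p, centroids[i])) for p in points
        ]
        if new_labels == labels:      # assignments stopped changing
            break
        labels = new_labels
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:               # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

labels, centroids = kmeans(students, k=3)
for student, label in zip(students, labels):
    print(student, "-> cluster", label)
```

On this toy data the three clusters line up with the three kinds of students described above. The algorithm only returns cluster numbers, though; deciding that one cluster needs more study hours and another needs a new study plan is still up to us.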
Now let’s go over an association example. Let’s say it’s Super Bowl night and you run to the grocery store to get the stuff you need. You are Customer A, and you buy tortilla chips, salsa, and a cool drink. Customer B walks in and buys tortilla chips, salsa, Sour Patch Kids, and Coke. Then Customer C walks in and buys tortilla chips. Since the previous customers who bought tortilla chips also bought salsa, Customer C will most likely buy salsa, too.
A relationship is determined based on these observed behaviours. After looking at these observed behaviours, the management of the grocery store can start making bundle recommendations.
Some of the unsupervised learning algorithms are k-means, PCA (which is principal component analysis), hierarchical clustering, the Apriori algorithm, etc.
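Going back to the grocery store, the "people who buy X also tend to buy Y" idea can be put into numbers with support and confidence, the two measures that association rule methods such as Apriori build on. The sketch below is not the Apriori algorithm itself; it only scores two candidate rules on the baskets of Customer A and Customer B, and the function names `support` and `confidence` are hypothetical.

```python
# The two baskets observed so far.
baskets = [
    {"tortilla chips", "salsa", "cool drink"},                # Customer A
    {"tortilla chips", "salsa", "Sour Patch Kids", "Coke"},   # Customer B
]

def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)

def confidence(baskets, antecedent, consequent):
    """Of the baskets containing the antecedent, the fraction that also contain the consequent."""
    antecedent_support = support(baskets, antecedent)
    if antecedent_support == 0:
        return 0.0
    both = support(baskets, set(antecedent) | set(consequent))
    return both / antecedent_support

print(confidence(baskets, {"tortilla chips"}, {"salsa"}))  # 1.0
print(confidence(baskets, {"tortilla chips"}, {"Coke"}))   # 0.5
```

With only two baskets these numbers mean very little, but the idea scales: a store with thousands of baskets would keep the rules with high support and confidence and use them for bundle recommendations.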
This is a summary of what we went over today. Supervised learning uses labelled data as input, has two types (classification and regression), and can improve based on feedback. Unsupervised learning, on the other hand, does not use labelled data as input, has two types (clustering and association), and doesn’t get feedback the way supervised learning does.