Transcript:

Hello all, my name is Krishna, and welcome to my YouTube channel. Today we will be discussing cosine similarity and cosine distance. In my previous video I explained Euclidean distance and Manhattan distance, but I forgot to mention one point: Euclidean distance is also called the L2 norm, and Manhattan distance is also called the L1 norm. There is a reason for this naming. We have a more general term, the Minkowski distance, which is the L-p norm. If p is 1, the L-p norm becomes L1, which is the Manhattan distance; similarly, if p is 2, it becomes L2, which is the Euclidean distance. If you have not seen that video, please have a look. I explained it clearly there, along with the practical applications, and it is already in my playlist.
Now I am going to continue the discussion with cosine similarity and cosine distance. This topic is widely used in recommendation systems, and I will give a very good example of why. First of all, take two terms: similarity and distance. Suppose I have two points, P1 and P2. As the distance between the points increases, the similarity between them decreases. Similarly, if the distance between the points decreases, the similarity between them increases.
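
Going back to the L1 and L2 point for a moment: the relationship between the Minkowski distance and its two special cases can be sketched in a few lines of Python. This is only an illustration; the function name `minkowski_distance` and the two sample points are assumptions made here, not something from the video.

```python
def minkowski_distance(a, b, p):
    """L-p (Minkowski) distance between two points with the same features."""
    if len(a) != len(b):
        raise ValueError("points must have the same number of features")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


# Hypothetical points, chosen only so the arithmetic is easy to follow.
p1 = (1.0, 2.0)
p2 = (4.0, 6.0)

manhattan = minkowski_distance(p1, p2, p=1)  # L1 norm: |1-4| + |2-6|
euclidean = minkowski_distance(p1, p2, p=2)  # L2 norm: sqrt(3^2 + 4^2)
print(manhattan, euclidean)
```

With p = 1 the sum is just the absolute differences added up (Manhattan), and with p = 2 it is the square root of the summed squares (Euclidean).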
So in short, I can relate cosine similarity and cosine distance with a small equation. Let me write it down: 1 - cosine similarity = cosine distance. I will explain how to find the cosine similarity in another minute or two; for now, just remember this formula.
Now let us understand what cosine similarity is. Let me take a good example, and I like to show it in a geometrical way. Suppose I have two features, f1 and f2, as my axes, and two points, P1 and P2. As you know, if I want the Euclidean distance, I compute this value d, the straight-line distance between the two points. You already know the Euclidean distance formula, which comes from the Pythagorean theorem. Let me also draw the other two lines, from the origin to each point. Cosine similarity says that in order to find the similarity between these two points, I have to find the angle between them. Suppose this angle is 45 degrees. Cosine similarity is represented by cos θ, where θ is the angle between P1 and P2. That is what cosine similarity is all about. And always remember: cosine similarity ranges from -1 to +1. How? I will explain in a moment. First, take this example. Here the angle is 45 degrees, so let me substitute it: cos 45° is approximately 0.707.
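
The video works from the angle directly. When only the coordinates of the two points are known, cos θ is usually obtained from the standard dot-product formula, cos θ = (a · b) / (|a| |b|). The sketch below assumes that formula (it is not written out in the video) and uses two hypothetical points that sit 45 degrees apart.

```python
import math


def cosine_similarity(a, b):
    """cos(theta) between vectors a and b via the dot-product formula."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical points: p1 lies on the f1 axis, p2 is 45 degrees above it.
p1 = (1.0, 0.0)
p2 = (1.0, 1.0)

similarity = cosine_similarity(p1, p2)
angle_degrees = math.degrees(math.acos(similarity))
print(round(similarity, 3), round(angle_degrees, 1))
```

Recovering the angle with `math.acos` is only there as a check that these two points really are 45 degrees apart.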
Okay, so with cos 45° at about 0.707, this is saying that P1 and P2 are roughly 71% similar, based on the angle between them. Now let me take another example; I am rubbing this out. Suppose P1 is here and P2 is here. You can see that the distance between them is very large. Suppose the angle between them is 90 degrees. If I substitute that into the same formula, cos 90° is 0, which indicates that these two points are not similar. There is a huge gap between them: if we calculate d, that will also be very high, and since the angle is 90 degrees, the formula gives cos 90° = 0.
Now let us consider one more case. Instead of putting P2 over there, suppose P2 lies along the same line as P1, so the two vectors point in the same direction. The angle between them is now zero, and if I substitute into the same equation, cos 0° is 1. If you know trigonometry, if you have learned about sin θ, cos θ, tan θ and all those things, you will understand what I am saying. The angle between them is 0 degrees, so the value becomes 1. That indicates that these two points are very similar, because the vectors are in the same direction.
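
The two cases above can be checked numerically. The sketch below uses the standard dot-product form of cos θ and hypothetical coordinates picked for this illustration. It also prints the Euclidean distance d next to each result, which shows a detail worth keeping in mind: cosine similarity looks only at direction, so two points on the same line get a similarity of 1 even when d is not zero.

```python
import math


def cosine_similarity(a, b):
    """cos(theta) between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance d between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical coordinates for the two cases.
p1 = (2.0, 0.0)
p2_perpendicular = (0.0, 3.0)   # 90 degrees away from p1
p2_same_direction = (5.0, 0.0)  # 0 degrees away from p1, but farther out

sim_perpendicular = cosine_similarity(p1, p2_perpendicular)
sim_same_direction = cosine_similarity(p1, p2_same_direction)

print(sim_perpendicular, euclidean_distance(p1, p2_perpendicular))
print(sim_same_direction, euclidean_distance(p1, p2_same_direction))
```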
Okay, and when the vectors are in the same direction, that indicates the two points are similar. Now I will explain something more. Let me extend these lines. As you know, in geometry the two axes divide the plane into four quadrants: this is 90 degrees, this is 180 degrees, and this is 270 degrees. When the angle is 0, the output is 1. When the angle is 90, the output is 0. When the angle is 180, cos 180° is -1. When I find the value of cos 270°, it is 0 again. And finally, when I calculate cos 360°, the value is 1. This indicates that as the point moves farther away from this axis, the value becomes 0 and then -1. Then it becomes 0 again as the point comes back toward the axis, and finally, when it completes the full circle at 360 degrees, it is on the same axis again, so the value is 1. You can verify this through the maths as well: cos 90° is 0, cos 180° is -1, cos 270° is 0, and cos 360° is 1. That is why cosine similarity ranges from -1 to +1.
So that is cosine similarity. Now remember the equation I gave you for cosine distance. Once you have calculated the cosine similarity between two points, all I have to write is: cosine distance = 1 - cosine similarity. In the first case, the angle between the two points was 45 degrees and cos 45° is about 0.707, so the cosine distance is 1 - 0.707, which is about 0.293. What about the case where the angle between the two points is 90 degrees? What will happen there?
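
The angles walked through above, together with the distance that 1 - cosine similarity gives for each one, can be tabulated with a short sketch. The helper name `similarity_and_distance` is made up for this illustration. Because of floating-point rounding, values such as cos 90° come out as tiny non-zero numbers, so the sketch rounds before printing.

```python
import math


def similarity_and_distance(angle_degrees):
    """Return (cos(theta), 1 - cos(theta)) for an angle given in degrees."""
    similarity = math.cos(math.radians(angle_degrees))
    return similarity, 1.0 - similarity


for angle in (0, 45, 90, 180, 270, 360):
    similarity, distance = similarity_and_distance(angle)
    # Adding 0.0 turns a rounded -0.0 into 0.0 so the table reads cleanly.
    print(angle, round(similarity, 3) + 0.0, round(distance, 3) + 0.0)
```

One consequence of the formula is that cosine distance runs from 0 (same direction) up to 2 (opposite direction, 180 degrees), since the similarity itself runs from +1 down to -1.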
So in the 90-degree case, the cosine distance will be 1 - cosine similarity, and the cosine similarity is 0, so the distance is 1. That indicates the distance is large. Similarly, in the case where the angle is 0 and cos 0° is 1, the cosine distance is 1 - 1, which is 0. This indicates that the distance between the two points is very small. That is how cosine similarity and cosine distance relate to each other.
Now let me show you an example of how this technique is used in recommendation systems. I am going to rub this out; I hope you understood what I explained so far. Consider a movie recommendation system where I have two features for each movie: this axis is my action parameter, and this axis is my comedy parameter. Take a movie called Avengers. You know that Avengers is an action movie, so its action value will be 1, and suppose it is not much of a comedy, so I will take comedy as 0. Now take Minions. Everybody knows it is a comedy. There is a little bit of action, but just understand that I am making action 0 and comedy 1. These are two-dimensional vectors. If I want the angle between these two points, I know it is 90 degrees, so I find the cosine similarity, which is cos θ with θ = 90. cos 90° is 0. This indicates that a person who has seen Avengers will not get a recommendation for Minions. I am just taking this as an example, guys. In practice you will not have just two parameters or two dimensions; I am using two only because this is a two-dimensional diagram.
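
The Avengers and Minions comparison can be reproduced from the (action, comedy) values given above. This is a minimal sketch using the standard dot-product form of cos θ; the variable names are illustrative, and the function works unchanged for vectors with more than two genres.

```python
import math


def cosine_similarity(a, b):
    """cos(theta) between two non-zero feature vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# (action, comedy) values as given in the example.
avengers = (1.0, 0.0)
minions = (0.0, 1.0)

similarity = cosine_similarity(avengers, minions)
distance = 1.0 - similarity
print(similarity, distance)
```

A similarity of 0 and a distance of 1 is the numeric version of "the angle is 90 degrees, so Minions is not recommended".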
Here I have just taken action and comedy. There will be other parameters too, for other genres of movies. Now, similarly, let me consider one more movie: instead of Minions, I will take Iron Man. Iron Man is also an action movie, so its vector will also be (1, 0), and the angle between it and Avengers will be 0 degrees. When the angle is 0 degrees, you substitute θ = 0 and get cos 0° = 1. If you want the distance, use the formula I discussed: distance = 1 - cosine similarity, so 1 - 1 is 0. The cosine distance between Iron Man and Avengers is 0 because the two vectors point in the same direction; here they are the same unit vector. That is why cosine similarity is very heavily used in recommendation systems. And not only movie recommendation systems: it may be Amazon product recommendations in the Amazon app, and many other apps and websites. This is used a lot whenever you want to create a recommendation system. There are also other techniques, like Pearson correlation and many more, but this is one of the techniques, where you use cosine similarity and cosine distance to find out which movies should be recommended.
That is all about cosine similarity and cosine distance. I hope you liked this video. So guys, please make sure you subscribe to the channel and share it with all your friends, and I will see you all in the next video. Have a great day. Thank you, one and all.
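
As a recap of the movie example, here is a hedged sketch of how ranking by cosine distance might look for the three movies mentioned. The `catalog` dictionary and the `recommend` helper are hypothetical names invented for this illustration; a real recommendation system would use many more features and a far larger catalog.

```python
import math


def cosine_similarity(a, b):
    """cos(theta) between two non-zero feature vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def recommend(title, catalog):
    """Rank every other movie by cosine distance to `title`, closest first."""
    target = catalog[title]
    ranked = [
        (other, 1.0 - cosine_similarity(target, vector))
        for other, vector in catalog.items()
        if other != title
    ]
    return sorted(ranked, key=lambda pair: pair[1])


# (action, comedy) values taken from the example.
catalog = {
    "Avengers": (1.0, 0.0),
    "Iron Man": (1.0, 0.0),
    "Minions": (0.0, 1.0),
}

recommendations = recommend("Avengers", catalog)
print(recommendations)
```

Iron Man comes first with a distance of 0, and Minions comes last with a distance of 1, matching the two comparisons worked through by hand.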