Transcript:
In this video, we will go over F1 score, precision, recall, true positive, true negative, and all those terms, because if you are trying to learn machine learning or data science, knowing these terms is extremely important; they are used everywhere.

So I have this dataset of dog images, where six images are dog images and four images are not dog images, so this is a binary classification: dog versus not a dog. Let's say you build a machine learning model, or let's say you ask someone to guess what these images are, and this is your prediction. Since it is a prediction, it will not be perfect; it will make some mistakes.

First, I have grayed out the "not a dog" predictions. Think of predictions as your base, and let's only talk about the positive predictions, which are the dog predictions; that is why I am graying out the "not a dog" predictions. Forget those for now. You now have a total of seven dog predictions. Out of those seven, how many are correct? You basically mark each prediction as correct or not correct by comparing it with reality. For your positive class, which is dog, whichever predictions are correct are called true positives. The second word, "positive," refers to your class: our positive class, shown in green, is dog. The first word, "true," refers to reality: when we compare with reality, we find that only four predictions were true for dog. The other three were wrong, and those are called false positives: the outcome of the prediction was positive for these three samples, marked in red, but in reality they are false.

Now think about the "not a dog" predictions, so forget all those greens; that is why I gray them out this time. Again I compare these three predictions with reality: one is right and two are wrong. The correct one is a true negative, where the second word, "negative," refers to your class, "not a dog," which is the negative class in our case, and out of these three predictions only one was true; that is why true negatives = 1 and false negatives = 2. Basically, out of our three negative predictions, two are false.

So now I have marked all my predictions, and this shows which of all my predictions are correct and which are wrong. We got a total of five right, and that is accuracy: how many of your predictions you got right, no matter whether they are positive or negative predictions. We got five out of ten right, and hence accuracy is 0.5.
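To make these counts concrete, here is a minimal Python sketch of the same example. The exact ordering of the ten images is an assumption for illustration; only the counts (TP = 4, FP = 3, FN = 2, TN = 1) come from the presentation.

    # Truth and prediction for the ten images in the video.
    # The ordering is assumed; only the counts match the presentation.
    truth      = ["dog"] * 6 + ["not_dog"] * 4
    prediction = ["dog"] * 4 + ["not_dog"] * 2 + ["dog"] * 3 + ["not_dog"]

    pairs = list(zip(truth, prediction))
    tp = sum(t == "dog"     and p == "dog"     for t, p in pairs)  # true positives:  4
    fp = sum(t == "not_dog" and p == "dog"     for t, p in pairs)  # false positives: 3
    fn = sum(t == "dog"     and p == "not_dog" for t, p in pairs)  # false negatives: 2
    tn = sum(t == "not_dog" and p == "not_dog" for t, p in pairs)  # true negatives:  1

    # Accuracy: correct predictions over all predictions.
    accuracy = (tp + tn) / len(truth)
    print(tp, fp, fn, tn, accuracy)  # 4 3 2 1 0.5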
Now, in this diagram, we are going back to only the positive class, so take the "not a dog" predictions out of your visualization. We saw that true positives were four and false positives were three. Precision is: out of all dog predictions, how many did you get right? You have a total of seven dog predictions, seven things you predicted to be dog, and out of those only four are dog, so precision is 4/7 = 0.57. So the formula for precision is true positives divided by (true positives plus false positives).

Now let's talk about recall. When you are thinking about recall, you always think about truth as your baseline. So what is my truth? In my truth, a total of six samples are dog samples, but my predictions only got four of them correct for dog. Recall is: out of all your dog truth samples, how many did you get right? You had a total of six dog samples and out of those you got four right, hence 4/6 = 0.67 is your recall. So recall is basically: here is my truth, and in my truth I have six dog samples; out of those, how many were we able to predict correctly? Precision and recall have a subtle difference: when you are thinking about precision, always think about predictions as the baseline; when you are thinking about recall, think about truth as your baseline.

Now let's talk about the negative class. By the way, precision and recall are per class, so what we saw previously was precision and recall for the dog class; now we are looking at precision and recall for the "not a dog" class. For the "not a dog" class, precision is 1/3, because when you are thinking about precision, you are always thinking about predictions: how many predictions do I have for the "not a dog" class? Three. And how many of them are correct? One. So 1/3 = 0.33. For recall, pause this video and try to guess what it will be. For recall, you think about truth as your baseline. So what is my truth? In my truth, how many "not a dog" samples do I have? One, two, three, four. And how many of them did I get right? Only one; see, there is a green check mark here. Out of the four "not a dog" images I had, I was able to predict only one correctly, and that is why 1/4 = 0.25.

Now you will ask me: what is the F1 score? Because you see the F1 score everywhere along with recall and precision. Per the definition on Wikipedia, it is two times precision times recall, divided by precision plus recall. It is just the harmonic mean of precision and recall, and it gives you the overall health, or overall performance, of your model.
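Here is a small sketch that works out those per-class numbers by hand. The counts are the ones from the video's example, with dog as the positive class; notice how the counts swap roles when "not a dog" is treated as the class of interest.

    # Counts from the video's example (dog is the positive class).
    tp, fp, fn, tn = 4, 3, 2, 1

    # Dog class: predictions are the baseline for precision,
    # truth is the baseline for recall.
    precision_dog = tp / (tp + fp)  # 4/7 ≈ 0.57
    recall_dog    = tp / (tp + fn)  # 4/6 ≈ 0.67

    # "Not a dog" class: same idea, with the negative counts.
    precision_not_dog = tn / (tn + fn)  # 1/3 ≈ 0.33
    recall_not_dog    = tn / (tn + fp)  # 1/4 = 0.25

    def f1(precision, recall):
        # Harmonic mean of precision and recall.
        return 2 * precision * recall / (precision + recall)

    print(f1(precision_dog, recall_dog))          # ≈ 0.62
    print(f1(precision_not_dog, recall_not_dog))  # ≈ 0.29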
So now we are going to write some code in Python to check all these things. I have imported some libraries in my Jupyter notebook, and I have written this function; actually, I sourced this function from a gist, so thank you to whoever wrote it. It just plots a confusion matrix. Let me show you what a confusion matrix is. I have all these samples, dog and not a dog; these are the same samples that we saw in our presentation, and then the prediction. When you build a confusion matrix, you need to supply truth and prediction, and when you print the confusion matrix, this is how it looks: on the y-axis you have truth, on the x-axis your prediction. What the 4 means is that four times I had dog as the truth and I predicted it to be a dog, so four times I got it right; these are my true positives. Two times the truth was dog (look at the y-axis) but I actually predicted it to be not a dog, so I got it wrong. Similarly, three times it was not a dog but I predicted it to be dog, and one time it was not a dog and I predicted it to be not a dog. So anything that you see on the diagonal is a correct prediction, and everything else is an error: the 2 and the 3 are errors.

Then, in sklearn, there is a classification report that you can always print after building your machine learning model. In that report, just see: the accuracy says 0.50, and if you look at our presentation, the accuracy was 0.50. The precision for dog was 0.57 and the recall was 0.67, just as in the presentation, and similarly "not a dog" was 0.33 and 0.25. What is the F1 score? If I just use the formula, two times precision times recall divided by precision plus recall, the F1 score for the dog class comes to almost 0.62, which is what the report shows, and for the "not a dog" class it comes to 0.28 while the report shows 0.29. That is because these numbers are not rounded the same way everywhere, so there is a little rounding going on, but it is the same value, believe me.

So now you understand precision, recall, F1 score, true positive, and true negative. This understanding will help you a lot when you are learning machine learning, or statistics in general. I hope you are liking this deep learning series so far. I have many videos in this series, so please watch them. In these videos I also provide exercises, and I provide very simple explanations, which is perfect for beginners; I have theory, code, and exercises in all my tutorials, so please make sure you check the video description and follow the deep learning series. The link to this Jupyter notebook is available in the video description below. Thank you, bye.
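For reference, here is a minimal, self-contained sketch of the sklearn part of the notebook, assuming the same ten samples as above (the confusion-matrix plotting helper from the gist is omitted).

    from sklearn.metrics import confusion_matrix, classification_report

    # Same ten samples as in the presentation (ordering assumed).
    truth      = ["dog"] * 6 + ["not_dog"] * 4
    prediction = ["dog"] * 4 + ["not_dog"] * 2 + ["dog"] * 3 + ["not_dog"]

    # Rows are truth, columns are prediction, in the given label order.
    print(confusion_matrix(truth, prediction, labels=["dog", "not_dog"]))
    # [[4 2]
    #  [3 1]]

    # Per-class precision, recall, and F1, plus overall accuracy (0.50).
    print(classification_report(truth, prediction))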