Transcript:
Hey, I’m Mandy from deeplizard. In this episode, we’ll introduce MobileNets, a class of lightweight deep convolutional neural networks that are much smaller and faster than many of the mainstream, well-known models. MobileNets are small, low-power, low-latency models that can be used for things like classification, detection, and other tasks that CNNs are typically good for, and because of their small size, these models are considered great for mobile devices, hence the name MobileNets. So I have some stats down here just to give a quick comparison in regards to size. The full VGG16 network that we’ve worked with in the past few episodes is about 553 megabytes on disk, so pretty large, generally speaking. The size of one of the currently largest MobileNets is only about 17 megabytes, so that’s a pretty huge difference, especially when you think about deploying a model to run on a mobile app, for example. This vast size difference is due to the number of parameters, or weights and biases, contained in the model.
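As a quick sanity check on those on-disk sizes: a model’s footprint is roughly its parameter count times four bytes, since weights are typically stored as 32-bit floats. Here’s a tiny illustration using the parameter counts mentioned in this episode (about 138 million for VGG16 and about 4.2 million for the largest MobileNet):

```python
# Rough rule of thumb: size on disk ≈ parameter count × 4 bytes (float32 weights).
def approx_model_size_mb(num_params, bytes_per_param=4):
    return num_params * bytes_per_param / 1e6

print(approx_model_size_mb(138_000_000))  # 552.0 -> close to VGG16's ~553 MB
print(approx_model_size_mb(4_200_000))    # 16.8  -> close to MobileNet's ~17 MB
```

The estimates line up closely with the sizes quoted above, which shows the disk footprint really is dominated by the weights themselves.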
So for example, VGG16, as we saw previously, has about 138 million total parameters, so a lot, and the 17-megabyte MobileNet that we talked about, which is the largest MobileNet, has only about 4.2 million parameters, so that is much, much smaller on a relative scale than VGG16 with its 138 million. Aside from the size on disk being a consideration when comparing MobileNets to other, larger models, we also need to consider memory: the more parameters a model has, the more space in memory it will take up. So while MobileNets are faster and smaller than big, hefty competitors like VGG16, there is a catch, or a trade-off, and that trade-off is accuracy. MobileNets are not as accurate as some of the big players like VGG16. But don’t let that discourage you. While it is true that MobileNets aren’t as accurate as these resource-heavy models, the trade-off is actually pretty small, with only a relatively small reduction in accuracy. In the corresponding blog for this episode, I have a link to a paper that goes more in depth into this relatively small accuracy difference, if you’d like to check that out further. Let’s now see how we can work with MobileNets in code with Keras. All right, so we are in our Jupyter notebook, and the first thing we need to do is import all of the packages that we will be making use of, not only for this video, but for the next several videos where we will be covering MobileNet. And as I mentioned earlier in this course, a GPU is not required, but if you’re running a GPU, then you want to run this cell.
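For reference, the GPU-setup cell described here is usually along these lines; this is a sketch that is harmless to run on a CPU-only machine, where the device list simply comes back empty:

```python
import tensorflow as tf

# List any GPUs TensorFlow can see; on a CPU-only machine this list is empty.
physical_devices = tf.config.experimental.list_physical_devices('GPU')
print('Num GPUs Available:', len(physical_devices))
for device in physical_devices:
    # Allocate GPU memory on demand instead of reserving it all up front.
    tf.config.experimental.set_memory_growth(device, True)
```

Setting memory growth to true keeps TensorFlow from grabbing the entire GPU memory at startup, which is handy when other processes share the card.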
This is the same cell that we’ve seen a couple of times already in this course, where we are just making sure that TensorFlow can identify our GPU if we’re running one, and setting memory growth to true if we do have a GPU. So again, don’t worry if you don’t have a GPU, but if you do, then run this cell. Similar to how we downloaded the VGG16 model when we were working with it in previous episodes, we take that same approach to download MobileNet here, so we call tf.keras.applications.mobilenet.MobileNet(), and the first time we call it, it is going to download MobileNet from the internet, so you need an internet connection. Subsequent calls are just going to load the saved model from disk into memory. So we are going to do that now and assign the result to this mobile variable. MobileNet was originally trained on the ImageNet library, just like VGG16, so in a few minutes we will be passing some images to MobileNet that I’ve saved on disk. They are not images from the ImageNet library, but they are images of some general things, and we’re going to get an idea of how MobileNet performs on these random images. But first, in order to be able to pass these images to MobileNet, we’re going to have to do a little bit of processing. So I’ve created this function called prepare_image, and it takes a file name. Inside the function, we have an image path, which points to the location on disk where I have the saved image files that we’re going to use to get predictions from MobileNet. We then load the image by using the image path and appending the file name that we pass in. So, say I pass in ‘1.PNG’ here, then we’re going to take that file path,
append ‘1.PNG’ to it, and pass that to the load_img function with a target size of 224 by 224. Now, this load_img function is from the Keras API, so what we are doing here is just taking the image file and resizing it to be of size 224 by 224, because that is the size of images that MobileNet expects. Then we take this image and convert it to be in the format of an array. Then we expand the dimensions of this image, because that’s going to put the image in the shape that MobileNet expects. And then finally, we pass this newly processed image to our last function, which is tf.keras.applications.mobilenet.preprocess_input. This is similar to the function we saw a few episodes back when we were working with VGG16, which had its own preprocess_input. MobileNet has its own preprocess_input function, which processes images in the way that MobileNet expects, so it’s not the same way as VGG16. Actually, it’s just scaling all the RGB pixel values from a scale of 0 to 255 to a scale from minus 1 to 1. So overall, this entire function is just resizing the image, putting it into an array format with expanded dimensions, applying MobileNet’s preprocessing, and then returning this processed image. Okay, so that’s kind of a mouthful, but that’s what we’ve got to do to images before we pass them to MobileNet. All right, so we will just define that function, and now we are going to display our first image, called ‘1.PNG’, from our MobileNet samples directory that I told you I set up with a few random images, and we’re going to plot that in our Jupyter notebook here. And what do you know, it’s a lizard. So that is our first image. Now we are going to pass this image to our prepare_image function that we defined right above, which we just finished talking about, so that it pre-processes the image accordingly. Then we are going to pass
the pre-processed image returned by the function to our MobileNet model, and we’re going to do that by calling predict on the model, just like we’ve done in previous videos when we’ve called predict on models to use them for inference. Then, after we get the prediction for this particular image, we are going to give this prediction to the imagenet_utils decode_predictions function. This is a function from Keras that is just going to return the top five predictions out of the 1000 possible ImageNet classes, and it’s going to tell us the top five classes that MobileNet is predicting for this image. So let’s run that and then print out those results, and maybe you’ll have a better idea of what I mean once you see the printed output. So we run this, and we have our output. These are the top five results, in order, from the ImageNet classes that MobileNet is predicting for this image. It’s assigning a 58% probability to this image being an American chameleon, 28% probability to green lizard, 13% to agama, and then we have some small percentages here, under 1%, for the other two types of lizards. So it turns out, if you’re not aware, and I don’t know how you could not be aware of this because everyone should know this, but this is an American chameleon. I don’t know, I’ve always called these things green anoles, but I looked it up, and they’re also known as American chameleons, so MobileNet got it right. Yeah, it assigned a 58 percent probability to that, the most probable class. Next was green lizard, and I’d say that is still really good for a second-place prediction. I don’t know if green lizard is supposed to be more general. And then agama, which is also a similar-looking lizard, if you didn’t know. So between these top three classes that it predicted, we have almost 100% combined. I would say MobileNet did a pretty good job on this prediction.
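Putting the steps from this walkthrough together, a minimal sketch of the whole flow might look like the following. The sample directory `data/MobileNet-samples/` and the file name `1.PNG` are taken from this episode’s setup, so adjust them to wherever your own images live; the prediction part is guarded so it only runs if the sample image actually exists on disk:

```python
import os
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications import imagenet_utils

# First call downloads MobileNet's ImageNet weights; later calls use the local cache.
mobile = tf.keras.applications.mobilenet.MobileNet()

def prepare_image(file):
    # Directory assumed to hold the sample images; adjust to your own path.
    img_path = 'data/MobileNet-samples/'
    # Resize to 224x224, the input size MobileNet expects.
    img = image.load_img(img_path + file, target_size=(224, 224))
    img_array = image.img_to_array(img)
    # Add a batch dimension: (224, 224, 3) -> (1, 224, 224, 3).
    img_array_expanded = np.expand_dims(img_array, axis=0)
    # MobileNet's own preprocessing: scale pixels from [0, 255] to [-1, 1].
    return tf.keras.applications.mobilenet.preprocess_input(img_array_expanded)

if os.path.isfile('data/MobileNet-samples/1.PNG'):
    preprocessed_image = prepare_image('1.PNG')
    predictions = mobile.predict(preprocessed_image)
    # Map the 1000 ImageNet class probabilities to the top-5 human-readable labels.
    results = imagenet_utils.decode_predictions(predictions)
    print(results)
```

Each entry in `results` is a (class id, class name, probability) tuple, which is where the “58% American chameleon” style of output comes from.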
So let’s move on to number two. All right, so now we are going to plot our second image, and this is a cup of, well, I originally thought that it was espresso, and then someone called it a cup of cappuccino. So let’s just say I’m not sure. I’m not a coffee connoisseur, although I do like both espresso and cappuccinos. This looks like it has some cream in it. So now we’re going to go through the same process as we just did for the lizard image: we pass this new image to our prepare_image function so that it undergoes all of the pre-processing, then we pass the pre-processed image to the predict function of our MobileNet model, and finally we get the top 5 results from the predictions for this model in regards to the ImageNet classes. So let’s see. All right, so according to MobileNet, this is an espresso, not a cappuccino. I don’t know, but it predicts a 99% probability of espresso as the most probable class for this particular image, and I’d say that is pretty reasonable. So let me know in the comments: what do you think, is this espresso or cappuccino? I don’t know if ImageNet even had a cappuccino class, so if it didn’t, then I’d say this is pretty spot-on. You can see that the other four predictions are all less than 1%, but they are reasonable. I mean, the second one is cup, third eggnog, fourth coffee mug. Fifth, wooden spoon, gets a little bit weird, but there is wood, and there is a circular shape going on here. These are all under 1%, though, so they’re pretty negligible. I would say MobileNet did a pretty great job at giving a 99% probability to espresso for this image. All right, we have one more sample image, so let’s bring that in. And this is a strawberry, or multiple strawberries,
if you consider the background. So, same thing: we pre-process the strawberry image, then we get a prediction from MobileNet for this image, and then we get the top 5 most probable predictions among the 1000 ImageNet classes. We see that MobileNet, with 99.999 percent probability, correctly classifies this image as a strawberry, so very well done, and the rest are well, well under 1%, but they are all fruits, so interesting. Another really good prediction from MobileNet. So even with the small reduction in accuracy that we talked about at the beginning of this episode, you can probably tell from just these three random samples that the reduction is maybe not even noticeable when you’re just doing tests like the ones we ran through. In upcoming episodes, we’re actually going to be fine-tuning this MobileNet model to work on a custom data set, and this custom data set is not one that was included in the original ImageNet library. It’s going to be a brand new data set, and we’re going to do more fine-tuning than what we’ve done in the past, so stay tuned for that. By the way, we are currently in Vietnam filming this episode, if you didn’t know. We also have a vlog channel where we document our travels and share a little bit more about ourselves, so check that out at deeplizard vlog on YouTube. Also be sure to check out the corresponding blog for this episode, along with other resources available on deeplizard.com, and check out the deeplizard hivemind, where you can gain exclusive access to perks and rewards. Thanks for contributing to collective intelligence. I’ll see you next time!