PyTorch Transfer Learning | PyTorch Tutorial 15 – Transfer Learning

Python Engineer


Hi everybody, welcome to a new PyTorch tutorial. In this tutorial, we will talk about transfer learning and how it can be applied in PyTorch. Transfer learning is a machine learning method where a model developed for a first task is reused as the starting point for a model on a second task. For example, we can train a model to classify birds and cats, then take the same model, modify it only a little bit in the last layer, and use the new model to classify bees and dogs. It's a popular approach in deep learning that allows rapid generation of new models, and this is super important because training a completely new model can be very time-consuming. It can take multiple days or even weeks. If we use a pretrained model, we typically exchange only the last layer and do not need to train the whole model again. Still, transfer learning can achieve pretty good performance, and that's why it's so popular nowadays.

So let's have a look at this picture here. We have a typical CNN architecture that I already showed you in the last tutorial. Let's say this has already been trained on a lot of data and we have the optimized weights. Now we only want to take the last fully connected layer, so this one here, modify it, and train that last layer on our new data. Then we have a new model that has been trained and tweaked in the last layer. This is the concept of transfer learning.

Now let's have a look at a concrete example in PyTorch. In this example, we are using the pretrained ResNet-18 CNN. This is a network that is trained on more than a million images from the ImageNet database. It is 18 layers deep and can classify images into 1,000 object categories. In our example we have only two classes: we only want to detect bees and ants. So let's start. In this session, I also want to show you two other new things.
First, datasets.ImageFolder and how we can use it, then how to use a scheduler to change the learning rate, and then, of course, how transfer learning is used. I already imported the things that we need, and now we set up the data. Last time we used the built-in datasets from torchvision.datasets. Here we use datasets.ImageFolder because we saved our data in a folder, and it has to have a structure like this: we have the data folder, and in it a training and a validation folder, so train and val. In each one we have a folder for each class, so here we have ants and bees, and in the validation folder we also have ants and bees. In each class folder we have the images. For example, here we have some ants, and let's also have a look at some bees, so here we have a bee. You must structure your folder like this, and then you can call datasets.ImageFolder and give it the path. We also give it some transforms here, and then we get the class names by calling .classes on the image dataset.

Then here I define the train_model function, where I did the loop with the training and the evaluation. I will not go into detail here. You should already know from the last tutorials what a typical training and evaluation loop looks like. You can also check the whole code on GitHub. I will provide the link in the description, so have a look at it yourself.

Now let's use transfer learning. First of all, we want to import the pretrained model, so let's set up this model. It is available in the torchvision.models module. I imported torchvision models already, so I can say model = models.resnet18 and pass pretrained=True. This gives us the optimized weights that were trained on the ImageNet data.
Now what we want to do is exchange the last fully connected layer. First of all, let's get the number of input features of the last layer. We say num_features = model.fc.in_features, where fc is the fully connected layer. This is the number of input features of the last layer, which we need. Then let's create a new layer and assign it to the last layer: model.fc = nn.Linear, which gets the number of input features and, as the number of outputs, 2, because we have two classes now. Then we send our model to the device if we have GPU support. We created our device in the beginning as always, so this is CUDA or simply the CPU.

Now that we have our new model, we can again, as always, define our loss and optimizer. We say criterion = nn.CrossEntropyLoss(), and the optimizer comes from the optimization module: optimizer = optim.SGD, stochastic gradient descent, which has to optimize the model parameters, and we have to specify the learning rate, let's say lr=0.001.

Now, as a new thing, let's use a scheduler. This will update the learning rate. We create it by saying step_lr_scheduler = lr_scheduler.StepLR. The lr_scheduler is also available in the torch optimization module, which we already imported. We have to give it the optimizer, then step_size=7 and gamma=0.1. This means that every 7 epochs our learning rate is multiplied by this value, so every 7 epochs the learning rate is reduced to 10% of its previous value. This is how you set up a scheduler. Typically, we then use it in our loop over the epochs.
So for epoch in range, let's say, 100, we typically do the training here, where we also call optimizer.step(). Then we want to evaluate, and then we also have to call scheduler.step(). This is how we use a scheduler. Please have a look at the whole loop here yourself.

Now that we have set up the scheduler, let's call the training function. Here we say model = train_model, which is the function that I created, and I have to pass it the model, the criterion, the optimizer, the scheduler, and also the number of epochs, so num_epochs, let's say 20. This is how we can use transfer learning. In this case, we use a technique that is called fine-tuning, because here we train the whole model again, but only a little bit. We fine-tune all the weights based on the new data and with the new last layer. This is one option.

For the second one, I copy and paste the same thing. As a second option, we can freeze all the layers in the beginning and only train the very last layer. For this, we have to loop over all the parameters right after we get our model. We say for param in model.parameters() and set the requires_grad attribute to False, so param.requires_grad = False. This will freeze all the layers in the beginning. Then we set up the new last layer. We create a new layer here, and by default it has requires_grad=True. Then again we set up the loss, the optimizer, and the scheduler, and call the training function again. This is even faster. Let's run this and then have a look at both evaluations. I also print out the time that it took.
So let's save this and run it by saying python transfer.py. First it will download all the images, and this might take a couple of seconds because I don't have GPU support here on my MacBook, so I will skip this and see you in a second.

All right, now I'm back. This took super long on my computer, so I reset the number of epochs to just 2 in this example. Let's have a look at the results after only 2 epochs. This is the first training, where we did the fine-tuning of the whole model. It took three and a half minutes, and the best accuracy is 0.92, so 92%. And this is the second training, where we only trained the last layer. It took only approximately one and a half minutes, and the accuracy is already over 80%. Of course, it's not as good as when we trained the whole model, but it's still pretty good for only two epochs. Now imagine if we set the number of epochs even higher.

This is why transfer learning is so powerful: we have a pretrained model, we fine-tune it only a little bit for a completely new task, and we still achieve pretty good results. So now I hope you understood how transfer learning can be applied in PyTorch. If you enjoyed this tutorial, please subscribe to the channel, and see you next time. Bye.
