Hey, guys, welcome to a new PyTorch tutorial. Today I want to show you how we can save and load our model. I will show you the different methods and saving options you have to know, and also what you have to consider when you're using a GPU. So let's start.

These are the only three methods you have to remember: torch.save, torch.load, and model.load_state_dict. I will show you all of them in detail. torch.save can take tensors, models, or any dictionary as the object to save. So you should know that we can save any dictionary with it, and I will show you later how we can use this in our training pipeline. torch.save makes use of Python's pickle module to serialize the objects and save them, so the result is serialized and not human readable.

Now, for saving our model we have two options. The first one is the lazy method: we just call torch.save on our model, and we also have to specify the path or the file name. Later, when we want to load our model, we set it up by saying model = torch.load and the file name again, and then we also want to set our model to evaluation mode by calling model.eval(). So this is the lazy option, and the disadvantage of this approach is that the serialized data is bound to the specific classes and the exact directory structure that was used when the model was saved. So there is a second option, which is the recommended way of saving our model.
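To make the point about torch.save taking tensors or any dictionary concrete, here is a minimal sketch. The file name and the dictionary contents are made up for illustration and do not come from the video.

```python
import os
import tempfile

import torch

# Hypothetical file location; any writable path works.
path = os.path.join(tempfile.gettempdir(), "example_objects.pth")

# torch.save accepts tensors and plain dictionaries, not only models.
payload = {
    "weights": torch.arange(6, dtype=torch.float32),
    "note": "any dictionary",
}
torch.save(payload, path)

# torch.load restores the same structure from the serialized file.
restored = torch.load(path)
print(restored["note"])
print(restored["weights"])
```

The file on disk is binary, so opening it in a text editor shows unreadable bytes, which matches the "not human readable" remark above.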
If we just want to save our trained model and use it later for inference, then it is enough to save only the parameters. As you should remember, we can save any dictionary with torch.save, so we can save the parameters by calling torch.save with model.state_dict(), which holds the parameters, and then the path. Later, when we want to load our model again, we first have to create the model object, and then we call model.load_state_dict, and inside this we call torch.load with the path. So be careful here: load_state_dict doesn't take a path, it takes the loaded dictionary. And then again we set our model to evaluation mode. So this is the preferred way that you should remember.

Now let's jump to the code to see the different ways of saving in practice. Here I have a little script where I defined a small model class, and here I created our model. Let me show you the lazy method first. First we define our file name, so we say FILE equals, and let's call this model.pth. It's common practice to use the ending .pth, short for PyTorch. Then we save the whole model by saying torch.save, then the model and the file. Let's save this and run the script, so let's say python save_load.py. Now if we open up the file explorer (we can ignore this warning here), we see the model.pth file. And if we open it, we see that this is serialized data, so it is not human readable.

Now let's go back to our code and load our model. We can comment this out, and we can also comment this out, and then we load our model by saying model = torch.load and the file. And remember, we want to set it to evaluation mode, so we say model.eval(), and then we can use our model.
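The lazy method walked through above might look roughly like this. Two assumptions to flag: the model here is a built-in nn.Sequential stand-in rather than the custom class from the video, so the sketch runs in isolation, and weights_only=False is passed because recent PyTorch releases default torch.load to loading plain tensors and containers only. Older releases may not accept that argument at all.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Stand-in model (assumption): a built-in module instead of the video's custom class.
model = nn.Sequential(nn.Linear(6, 1), nn.Sigmoid())

FILE = os.path.join(tempfile.gettempdir(), "model.pth")

# Lazy method: serialize the entire model object with pickle.
torch.save(model, FILE)

# Unpickling a full module requires weights_only=False on recent versions.
# Only do this with files you trust, since pickle can execute code on load.
loaded = torch.load(FILE, weights_only=False)
loaded.eval()  # switch to evaluation mode before inference

for param in loaded.parameters():
    print(param)
```

With a custom class, the class definition has to be importable from the same location at load time, which is the disadvantage mentioned earlier.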
For example, we can inspect the parameters, so let's say for param in model.parameters() and then print param. Let's save this, clear the terminal, and run our script again, and let me make this larger for you. Now if we run our script again, we can see that we loaded our model and we can use its parameters. So this is the lazy option.

Now let me show you the preferred way of doing this. Instead of saying torch.save with the model here, what we want to do is save the state dict. So here we have our model again, and then we say torch.save with model.state_dict() and the file. Let's run this: let me clear this, open up the explorer, and delete this file. Now if we run our script again, we again have the file, but now it only saved the state dict.

If we want to load our model again, we first have to define it. Let's call this loaded_model, so loaded_model equals Model, and the number of input features is the same, so 6. Then we call loaded_model.load_state_dict, and inside this, remember, we have to call torch.load with the file name. Then again we set our loaded model to evaluation mode. Let's print the params of the loaded model, and up here let's also print the params of our normal model. If we don't do any training here, our model is still initialized with some random parameters. So let's run the script and check whether the parameters are the same. Yeah, we see that it worked: it first printed the parameters of the normal model and then those of the loaded model, and they are the same. We have a tensor with the weights and a tensor with the bias, and these are the same for both of our models. So we see that this worked too.
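A minimal sketch of the state dict approach described above. The Model class is a reconstruction (a single linear layer followed by a sigmoid, with 6 input features); the video only shows that it has a layer named linear, so treat the exact definition as an assumption.

```python
import os
import tempfile

import torch
import torch.nn as nn


class Model(nn.Module):
    """Assumed stand-in for the small model class used in the video."""

    def __init__(self, n_input_features):
        super().__init__()
        self.linear = nn.Linear(n_input_features, 1)

    def forward(self, x):
        return torch.sigmoid(self.linear(x))


model = Model(n_input_features=6)
FILE = os.path.join(tempfile.gettempdir(), "model.pth")

# Recommended method: save only the parameters.
torch.save(model.state_dict(), FILE)

# To load, create the model object first, then pass the loaded dictionary
# (not the path) to load_state_dict.
loaded_model = Model(n_input_features=6)
loaded_model.load_state_dict(torch.load(FILE))
loaded_model.eval()

# Compare the original and the loaded parameters tensor by tensor.
for original, loaded in zip(model.parameters(), loaded_model.parameters()):
    print(torch.equal(original, loaded))
```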
So this is the recommended way of doing it: we save model.state_dict(), and then when we load it, we call the load_state_dict method. Here we saved just the state dict, which holds the parameters. Let me show you what this state dict looks like. When we have our model, let's print model.state_dict(), save this, clear the terminal, and run the script. Then we see our state dict: we have linear.weight, which holds the tensor with the weights, and we also have the bias tensor. So this is our state dict.

Now let me show you a common way of saving a whole checkpoint during training. As you know, we can save any dictionary here. Let's say we also have an optimizer. We define a learning rate, let's say 0.001, and an optimizer, let's say torch.optim, and let's use stochastic gradient descent, SGD. Here we want to optimize the model parameters, and we also have to give it the learning rate by saying lr equals learning_rate. Our optimizer also has a state dict, so we can print optimizer.state_dict() as well. If we clear this and run it, we see the state dictionary of the optimizer, where we can see, for example, the learning rate and the momentum.

Now, let's say we want to stop at some point during training and save a checkpoint. We can do it like this. We create our checkpoint, and this must be a dictionary, so let's create a dictionary. The first thing we want to save is, for example, the epoch, so the key is called epoch, and let's say we are in epoch 90. Then we want to save the model state, so we have to give it a key, let's say model_state, and here we use model.state_dict(). And we also want to save the optimizer state dict, so the key is, let's say, optim_state, and as the value we call optimizer.state_dict().
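The checkpoint dictionary described above could be put together like this. The Model class is again an assumed reconstruction, and the key names epoch, model_state, and optim_state follow the ones spoken in the video.

```python
import os
import tempfile

import torch
import torch.nn as nn


class Model(nn.Module):
    """Assumed stand-in for the small model class used in the video."""

    def __init__(self, n_input_features):
        super().__init__()
        self.linear = nn.Linear(n_input_features, 1)

    def forward(self, x):
        return torch.sigmoid(self.linear(x))


model = Model(n_input_features=6)

learning_rate = 0.001
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

# Both the model and the optimizer expose a state dict.
print(list(model.state_dict().keys()))
print(optimizer.state_dict()["param_groups"][0]["lr"])

# A checkpoint is just a dictionary collecting everything needed to resume.
checkpoint = {
    "epoch": 90,
    "model_state": model.state_dict(),
    "optim_state": optimizer.state_dict(),
}

FILE = os.path.join(tempfile.gettempdir(), "checkpoint.pth")
torch.save(checkpoint, FILE)
```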
So this is our checkpoint, and now we can call torch.save and save the whole checkpoint. So let's say torch.save, then checkpoint, and as the file name let's call this checkpoint.pth. Again, let me show you the explorer and let's run the script. We clear this and run it, and then we see we have our checkpoint here.

Now, when we load this, we want to load the whole checkpoint. We can comment this out, and we also don't need this. So let's say loaded_checkpoint equals torch.load, and the file name is the same as this one. Now we have to set up the model and optimizer again. We can get the epoch right away by saying epoch equals loaded_checkpoint, and since this is a dictionary, we can just access the epoch key. For the model, remember, we have to create our model again, so let's say model equals Model with the number of input features equal to 6. The optimizer is the same as this one, but we actually don't have to use the same learning rate. Let me just grab this one and paste it down here. For example, we can use a learning rate of 0, and later we will see that the correct learning rate gets loaded into the optimizer.

So now let's say model.load_state_dict, and here we give it the checkpoint and access the key we called model_state. This will load all the parameters into our model. It's the same with our optimizer: we call optimizer.load_state_dict, and then we use the checkpoint, and here we called the key optim_state. So now we have the loaded model and the optimizer and also the current epoch, so we can continue our training. Let me show you that this is all correct by printing optimizer.state_dict(). Notice that here we set the learning rate to zero, and then we loaded the correct state dict.
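Here is a sketch of restoring from such a checkpoint. To stay runnable on its own, it first writes a checkpoint and then loads it back; the Model class is an assumed reconstruction, and the fresh optimizer deliberately starts with a learning rate of 0 so that the effect of load_state_dict is visible.

```python
import os
import tempfile

import torch
import torch.nn as nn


class Model(nn.Module):
    """Assumed stand-in for the small model class used in the video."""

    def __init__(self, n_input_features):
        super().__init__()
        self.linear = nn.Linear(n_input_features, 1)

    def forward(self, x):
        return torch.sigmoid(self.linear(x))


FILE = os.path.join(tempfile.gettempdir(), "checkpoint.pth")

# Write a checkpoint first so this sketch is self-contained.
trained = Model(n_input_features=6)
trained_optim = torch.optim.SGD(trained.parameters(), lr=0.001)
torch.save(
    {
        "epoch": 90,
        "model_state": trained.state_dict(),
        "optim_state": trained_optim.state_dict(),
    },
    FILE,
)

# Load the whole checkpoint dictionary back.
loaded_checkpoint = torch.load(FILE)
epoch = loaded_checkpoint["epoch"]

# Recreate model and optimizer; the learning rate here is a placeholder.
model = Model(n_input_features=6)
optimizer = torch.optim.SGD(model.parameters(), lr=0)

model.load_state_dict(loaded_checkpoint["model_state"])
optimizer.load_state_dict(loaded_checkpoint["optim_state"])

# The saved learning rate replaces the placeholder value.
print(epoch)
print(optimizer.state_dict()["param_groups"][0]["lr"])
```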
So now if we run this and, as a last thing, print the optimizer state dict, we see that we have the same learning rate as in the initial optimizer. So this worked too. This is how we can save and load a whole checkpoint, and these are the three ways of saving you have to know.

As a last thing, I want to show you what you have to consider when you are using a GPU during training. If you are doing both training and loading on the CPU, then you don't have to change anything, so you can just use it like I did here. But if you save your model on the GPU and later you want to load it on the CPU, then you have to do it this way. Let's say somewhere during your training you set up your CUDA device, you send your model to the device, and then you save it by using the state dict. Then you want to load it on the CPU, so you specify your CPU device, then you create your model again, and then you call load_state_dict with torch.load and the path, and here you have to specify the map_location, and you give it the CPU device. So this is if you want to save on the GPU and load on the CPU.

Now, if you want to do both on the GPU, so you sent your model to the CUDA device and saved it, and then you also want to load it on the GPU, then you just do it like this: you set up your model, you load the state dict, and then you send your model to the CUDA device.

As a third option, let's say you saved your model on the CPU, so you didn't call model.to with the CUDA device anywhere, but later during loading you want to load it onto the GPU. Then you have to do it like this. First, you specify your CUDA device.
Then you create your model, and then you call model.load_state_dict with torch.load and the path, and as map_location you specify cuda, then a colon, then whichever GPU device number you want, so for example cuda:0. And then you also have to call model.to(device) with the CUDA device. This will send the model to the device, and also all the loaded parameter tensors. Then you can continue with your training or inference on the GPU. And, of course, you also have to send all the training samples that you use for the forward pass to the device.

So, yeah, this is what you have to consider when you're using a GPU, and now you know all the different ways of saving and loading your model. That's all for now. I hope you enjoyed this tutorial, and if you liked it, then please consider subscribing to the channel. See you next time, bye.
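The device handling described for the GPU cases can be sketched as follows. This is an illustration under assumptions: the Model class is a reconstruction, the file name is made up, and the code falls back to the CPU when no CUDA device is available so that it still runs on a machine without a GPU.

```python
import os
import tempfile

import torch
import torch.nn as nn


class Model(nn.Module):
    """Assumed stand-in for the small model class used in the video."""

    def __init__(self, n_input_features):
        super().__init__()
        self.linear = nn.Linear(n_input_features, 1)

    def forward(self, x):
        return torch.sigmoid(self.linear(x))


FILE = os.path.join(tempfile.gettempdir(), "model_device.pth")

# Use the first GPU when present; otherwise fall back to the CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Save from whichever device the model lives on.
model = Model(n_input_features=6).to(device)
torch.save(model.state_dict(), FILE)

# Saved on GPU, loaded on CPU: remap the tensors with map_location.
cpu_model = Model(n_input_features=6)
cpu_model.load_state_dict(torch.load(FILE, map_location=torch.device("cpu")))

# Loading onto the GPU: map the tensors to the target device,
# then move the model itself there as well.
gpu_model = Model(n_input_features=6)
gpu_model.load_state_dict(torch.load(FILE, map_location=device))
gpu_model.to(device)
gpu_model.eval()

# Inputs for the forward pass must live on the same device as the model.
x = torch.randn(4, 6).to(device)
y = gpu_model(x)
print(y.shape)
```

When both saving and loading happen on the GPU, the map_location argument can be left out, as described above, as long as model.to(device) is still called after loading.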