Transcript:
[MUSIC] In my last video, I went through how to build a simple neural network in PyTorch: how all of the imports work, how to load the data, the hyperparameters, the loss function, training the network, and lastly checking the accuracy of the model on the training and test data. All we want to do now is modify the code so that instead of a simple fully connected neural network, we use a convolutional neural network. Convolutional neural networks work a lot better on images than fully connected ones, so I believe we're going to get better performance on the MNIST dataset. Let's see what we get, but first we have to create the network.

So let's make a class CNN that inherits from nn.Module. In the __init__ function we're going to take as input the number of in channels of the image. For MNIST this is going to be 1; if you use something like the CIFAR-10 dataset, or anything else with color, the in channels are probably going to be 3 because of RGB, but in our case it's just going to be 1. Then the number of classes, which is going to be 10, and I guess we could also set that to a default of ten. Then we need to call super().__init__().

What we want to do now is create a convolutional layer, so we can do self.conv1 = nn.Conv2d, and we can see its arguments here: in channels, out channels, kernel size, stride, padding. I'm not going to go into depth on how those work or how convolutions work, just the implementation; in the comments I'll link some resources if you want to learn more about how they actually work. The in channels, as I said, is one, then we choose some arbitrary out channels, let's say 8. For the kernel size we're going to use 3x3, for the stride 1x1, and for the padding 1x1. These might be the defaults, I'm not really sure, so you might not have to write them explicitly, but it's good to do just to know which arguments we can send.

The 3x3 kernel is again a fairly arbitrary choice, but there is a reason I chose it together with this padding and stride: it's called a same convolution. Normally the output size changes depending on the kernel, stride, and padding we use, but with these specific values it turns out that we keep the same dimensions, so if the input is 28x28, the output is still going to be 28x28. I just want to show you that with the formula. I'm not going to derive it, but the output size is n_out = floor((n_in + 2 * padding - kernel_size) / stride) + 1, where n_in is the number of input features, which is 28 for the MNIST dataset. The padding we chose is 1, so we get 28 + 2*1 - 3 = 27, divide by the stride and take the floor, which is still 27, and then plus 1, which is 28. So n_out is 28, and n_in was also 28.
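To make that concrete, here is a minimal sketch of the layer and the size check described so far. It assumes a batch of 64 single-channel MNIST-sized images; the standalone variable names are just for illustration, since in the video this layer lives inside the CNN class.

    import torch
    import torch.nn as nn

    # First convolutional layer as described above: a "same" convolution.
    conv1 = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, stride=1, padding=1)

    # Same-convolution check: n_out = floor((n_in + 2*padding - kernel_size) / stride) + 1
    n_in, padding, kernel_size, stride = 28, 1, 3, 1
    n_out = (n_in + 2 * padding - kernel_size) // stride + 1
    print(n_out)  # 28, so the spatial size is preserved

    x = torch.randn(64, 1, 28, 28)   # batch of 64 single-channel 28x28 images
    print(conv1(x).shape)            # torch.Size([64, 8, 28, 28])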
What we want to do next is use a pooling layer, nn.MaxPool2d. We're going to use a 2x2 kernel and a 2x2 stride, and using the same formula as before, if we have 28x28 and run it through this max pool, it becomes 14x14. Then let's just do the same thing again for conv2, and lastly we're going to have a fully connected layer. We set the out channels of conv1 to eight, so the in channels of conv2 need to be eight, and let's set its out channels to maybe sixteen. Then for the linear layer we need sixteen times... well, we haven't actually implemented the forward yet, but I'm thinking we're going to use the max pooling layer twice, so the input gets halved two times: twenty-eight becomes 14, and half of that again becomes 7. So the in features of the linear layer are 16 * 7 * 7, and the out features are just the number of classes.

Then we define forward. x is F.relu(self.conv1(x)), then x is self.pool(x), and then again the same thing with conv2 and self.pool(x); we can reuse the same pooling layer. Then we're going to reshape x, because remember it's still a three-dimensional tensor, or I guess a four-dimensional tensor if you include the mini-batches. We want to reshape it so that we keep the batch dimension, the number of examples that we send in, and then just -1 for the rest. Then we run self.fc1(x) and return x. So this is our CNN.

What we can do, as we did before, is a basic check: model = CNN(), and x = torch.randn(64, 1, 28, 28), remembering the commas between the dimensions. Then we call the model and print(model(x).shape). It's the same kind of basic check that we did in the last video; we do the same here for the convolutional network just to make sure that, okay, we've implemented it and it at least passes the basic check.

After that, we can reuse a lot of the code that we wrote before, except instead of the input size we now use in_channels, and we're going to instantiate CNN instead. We actually don't have to pass anything, since we set the defaults, so we can just do it like this. Then remember that the input is already in the correct shape, so we don't want to flatten it like we did before; that flattening now happens inside the network, so we can just remove that part. The rest stays exactly the same: we still do the forward pass, the backward pass, and the gradient descent step. Similarly, in the accuracy check we remove the reshape part, and I think that should be it. We'll train the network for five epochs and I'll come back to you when we get the result.

All right, so we got the result: on the training data we get 98.58%, and on the test data 98.36%. All right, not terrible, pretty good. So yeah, this is how you code a simple CNN. If you have any questions, write them in the comments, and thank you so much for watching the video.
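For reference, here is a minimal sketch of the full network and the shape check as described above. The layer sizes follow the choices made in the video; the batch size of 64 in the check is just an assumption for illustration.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CNN(nn.Module):
        def __init__(self, in_channels=1, num_classes=10):
            super().__init__()
            # Same convolutions: 3x3 kernel, stride 1, padding 1 keep 28x28 -> 28x28.
            self.conv1 = nn.Conv2d(in_channels, 8, kernel_size=3, stride=1, padding=1)
            self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
            self.conv2 = nn.Conv2d(8, 16, kernel_size=3, stride=1, padding=1)
            # Two 2x2 poolings halve 28 twice: 28 -> 14 -> 7, hence 16 * 7 * 7 in features.
            self.fc1 = nn.Linear(16 * 7 * 7, num_classes)

        def forward(self, x):
            x = self.pool(F.relu(self.conv1(x)))
            x = self.pool(F.relu(self.conv2(x)))
            x = x.reshape(x.shape[0], -1)  # flatten, keeping the batch dimension
            return self.fc1(x)

    # Basic shape check: output should be torch.Size([64, 10]).
    model = CNN()
    x = torch.randn(64, 1, 28, 28)
    print(model(x).shape)

    # In the training loop, the only change from the fully connected version is
    # that the images are no longer flattened before being passed to the model.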