PyTorch Data Augmentation | PyTorch Data Augmentation Using Torchvision

Aladdin Persson








[MUSIC] In this video, we want to learn how to use PyTorch’s inbuilt transforms on images. First, if you’re unfamiliar with data augmentation or wonder why you should use it: essentially, more data is generally better when we train a network, and if we can get more for free by doing some transformations to our images, it’s almost always a good thing. In this case we’re loading our data using a custom dataset, which I showed how to do in a previous video, but that’s not really the focus of this video. Any way you manage to load the data is fine in this case. The data that we’re working with is two pictures of cats. We want to apply some transformations to those, we want to see how they look after we’ve applied the transformations, and perhaps most importantly, we want to see how we actually apply the transformations.

So let’s see what we do. First, we can just do for image, label in dataset, and print image.shape. So we get two images which are colored, so three channels, RGB, and 224 by 224 in size. The first thing I want to show you how to do is use transforms.Compose. All right, let’s go back. Here you can see we do one single transformation, which is to convert the, in this case, NumPy array to a tensor. But let’s say we want to do more transformations than this one here. transforms.Compose is what we’re going to use to combine several different transformations. I’ll talk a little bit more about that later.

The first thing we want to do is use transforms.Compose, and inside it we’re gonna do transforms.ToPILImage. That’s the first thing we do because the transformations work on this PIL image format, so that’s usually what we do first. Then there are a bunch of different transformations you can apply to images. We’re gonna go through some of them, but really, there are many configurations of them.
You might want to read the documentation for those specifics, but let’s say we want to do transforms.RandomHorizontalFlip, and we can input a probability here, which by default is 0.5. Then in the end, we can do transforms.ToTensor. The name of the flip is pretty self-explanatory: what it does is flip the image horizontally. To be able to visualize this, we can use torchvision.utils.save_image, which I’ve imported here. So we call save_image on the image, with the name image plus the string of the image number, which we’re going to define here, plus the PNG extension, and then we just do image number plus equals one. Let’s say we do that ten times, so in total we will have 20 images, and we just run that now. If we go back to our folder here, we have 20 images. We can see that this one is flipped horizontally, I believe, and that one. So some of them are flipped horizontally, and that’s kind of a simple transformation to do.

One thing we could do as well is transforms.ColorJitter with brightness 0.5. What this does is apply a random change to the brightness of the image. There are more things you can input here as well, but really I just want to go through some of the most common ones, and you can choose which one you think is best for your specific case. Another one we can do is transforms.Resize, so we first resize the image to, let’s say, 256 by 256, and then we do a random crop of that image. We do the crop to 224 by 224. We could also apply rotation to the image, right, a rotated image of a cat is still a cat, so we can do transforms.RandomRotation and we input degrees with 45, for example. One more thing we could do is transforms.RandomVerticalFlip.
It’s quite uncommon to see vertically flipped images of cats, so maybe we give this a low probability of 0.05. So those are some examples of what we can do. Actually, we can add some more. We can do transforms.RandomGrayscale with a probability of, let’s say, twenty percent, so this will convert the image to grayscale with a twenty percent probability.

Another thing that is quite important to do, which often improves the training quite a bit, is transforms.Normalize, which you do after ToTensor. For this, you input a mean and a standard deviation. Essentially, for each channel, and in this case we have three channels, you want to find the mean for that specific channel across all training examples and over all of the pixel values, and you would also find the standard deviation across all training examples and all pixel values. You would define those for each channel, so there will be three values each in this case. Then you would input them like this and like this. Now, of course, I don’t know the mean and the standard deviation for those two images that I have in this case, but you find those values first and then you pass them in. What it does is take each value for that channel, subtract the mean that you input, and then divide by the standard deviation. So in this case, this would actually not do anything, right, since it would just subtract zero and divide by one. Perhaps we can write a note: this does nothing. But in practice, you would find those values first, and then you would use them. I just want to add a comment here about transforms.Compose.
It applies all the transformations that we wrote inside to the image that we send in, and it does so in the order that we wrote them, so it performs ToPILImage before Resize, and the random crop after Resize, et cetera. Since we have a lot of random transformations in the transforms.Compose, each time we send in an image, we will get a different image as output. So now we’ve added a lot of transforms on our images. Let’s run this. Let’s see, yes, we need a comma here, and another comma here.

Okay, so now we’ve run it, and let’s see. Here we got a rotation, and it’s also grayscale. Grayscale again. So essentially, you can see that the images still all look like cats, except they are variants of the original image. And this one is a bit weird, but still a cat. So that’s an example of how you would use it. Now, here we just loop over the images, but really what you would do when training is create a training loader here using the DataLoader, and then you would train a network. So if you have any questions, leave them in the comments below. Hopefully this was useful, and thank you so much for watching the video.
