Transcript:
What is going on everybody, and welcome to part two of the TensorFlow Object Detection API tutorial. In this video, what I'm going to be showing you guys is how we can do something a little more custom than just loading in simple images. We can also load in video, or, what I'm going to be doing here, use a webcam. If you are not familiar with OpenCV, which is what we're going to be using, you can just come to pythonprogramming.net, search for OpenCV, and it'll be this series here. Coming down to part two is where we're actually loading in from a webcam source, but you can also load in an actual video file if you don't have a webcam or whatever.

So anyway, this is just to show you how quick the video detection actually is, at least on a GPU, and also how to start adapting this code to something you might actually want to use it with. So anyway, minimizing this, I'm going to come over here, and the first thing I want to do is convert that notebook to a Python file. So I'm just going to run jupyter notebook. If you want to keep it in Jupyter, by all means keep it in Jupyter; I just want to convert it myself. If you want to convert it: File, Download as, Python. And then I'm going to rename it with an underscore suffix. Good. And I'm going to edit this.

Also, if you need to install OpenCV or whatever, you can use that tutorial series. So if you don't have it already installed, pause this, go to the tutorial, and make sure you install cv2.

Okay, so now, first of all, we need to import cv2, so we're going to do import cv2, not in all caps, and then we're going to say cap = cv2.VideoCapture. If you just have one webcam, it would be 0. I'm using one of my webcams to record me right now, so I'm actually going to use index 1, which is my second webcam.
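A minimal sketch of the capture setup described above, not the exact code from the video. The helper names `parse_source` and `open_capture` are hypothetical; the only OpenCV calls assumed are `cv2.VideoCapture`, which accepts either an integer device index or a path to a video file, and its `isOpened()` method.

```python
def parse_source(arg):
    """Turn a string into a cv2.VideoCapture source.

    Digit-only strings become integer device indexes (0 = first webcam,
    1 = second webcam); anything else is treated as a video file path.
    """
    return int(arg) if arg.isdigit() else arg


def open_capture(source=0):
    """Open a webcam index or video file and fail loudly if unavailable."""
    import cv2  # imported here so parse_source works without OpenCV installed

    cap = cv2.VideoCapture(source)
    if not cap.isOpened():
        raise RuntimeError(f"Could not open video source: {source!r}")
    return cap
```

With one webcam, `open_capture(parse_source("0"))` would be the call to try; passing a file path instead should read frames from that video file.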
Basically, that's why I'm putting that in there, but if you're following along and you have a webcam, I would put in 0.

So now that we have that, the other thing we need to do is come down here and get rid of this, because I'm not going to use matplotlib anymore to show anything. And then, if there's anything else we need to get rid of: this matplotlib line, we don't need that either, so I'm just going to delete that as well. That was the line, just in case that was too quick. It's this get_ipython magic stuff. Just delete that, because that's unnecessary as well.

Okay, so once we've got the video capture, all we're going to do is head down to that main loop area. Scrolling down here, this for loop is basically that main loop area, and what it was doing was iterating through images in a directory. Instead, we can have it go through the frames from a webcam, or iterate through frames in a video, or whatever. We're going to be using the webcam, so what I'm going to do is say while True. And then what are we going to do? Well, rather than this image being image = Image.open and then converting it to a NumPy array and all that, I'm actually going to delete this, and we're just going to say ret, which is just return information, and then image_np, so ret, image_np = cap.read(), because this is going to come back already as a NumPy array. We don't have to convert it.

So once we've done that, we're ready to come down here and actually visualize it. So again, the only changes we're making are right here with the images.
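The frame-reading change described above can be sketched as a small generator. This is an illustration under stated assumptions rather than the code from the video: `read_frames` is a hypothetical name, and `cap` is assumed to behave like `cv2.VideoCapture`, whose `read()` returns a `(ret, frame)` pair.

```python
def read_frames(cap):
    """Yield frames from an opened capture until it stops returning them.

    `cap` is anything with a cv2.VideoCapture-style read() method, which
    returns (ret, frame): ret is False when no frame could be grabbed, and
    frame is already a NumPy array, so no PIL conversion is needed.
    """
    while True:
        ret, image_np = cap.read()
        if not ret:  # camera unavailable or end of the video file
            break
        yield image_np
```

Each `image_np` would then go through the notebook's existing detection code in place of the array that used to come from `Image.open`. One detail the video does not go into: OpenCV delivers frames in BGR channel order, while PIL images are RGB, so a conversion may be worth trying if detections look off.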
I mean, we could probably delete all this stuff, and this too. I'm not going to really clean it up too much, though you could clean it up a little bit more. These are just the simple changes that we need to make to get this to work.

So let's actually show this. We're going to do cv2.imshow, and we're just going to call it object detection. This is the window name; you can call the window whatever you want. And then we're going to do cv2.resize, just to make sure it fits. I think this would display based on the default camera resolution, and I don't really know what that is. I just know it's not a 16 by 9 aspect ratio. So anyway, I'm just going to use 800 by 600, so it's not distorted and it's not too large to show on video.

And then we just throw in this little bit of code: if cv2.waitKey(25) & 0xFF == ord('q'). What this is saying is, if we press Q, we'll exit. Actually, I think we can get away with just having the waitKey here, but we'll go ahead and check for the Q as well, so you can press Q to close. Then cv2.destroyAllWindows, camel case, if that's the case, and then break.

And if you don't have this code here, for whatever reason, and it's kind of surprising to me, I don't know exactly how that works on the back end, but if you don't have this code here, it won't actually keep iterating. It's kind of weird, because this is only here for if you want to break the entire running of this loop, but if you don't have this if check here, it just won't iterate. I'm not really sure why it works that way, but it does. Someone can comment below and educate me on how that works in OpenCV. Maybe it's really obvious and I'm just missing it. Anyway, I hit run.
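The display step above can be sketched as follows. The function names `is_quit_key` and `show_frames` are hypothetical; `cv2.imshow`, `cv2.resize`, `cv2.waitKey`, and `cv2.destroyAllWindows` are the OpenCV calls named in the video. The likely explanation for the puzzle raised above is that `cv2.waitKey` is also what lets OpenCV's window process its GUI events, so without a call to it the window never gets a chance to redraw.

```python
def is_quit_key(key_code, quit_char="q"):
    """Return True if a cv2.waitKey() result corresponds to quit_char.

    waitKey returns -1 when no key was pressed during the wait; masking
    with 0xFF keeps only the low byte, which holds the character code.
    """
    return key_code != -1 and (key_code & 0xFF) == ord(quit_char)


def show_frames(frames, window_name="object detection", size=(800, 600)):
    """Display frames in a window until they run out or Q is pressed."""
    import cv2  # imported here so is_quit_key can be used without OpenCV

    try:
        for image_np in frames:
            # size is (width, height), matching the 800 by 600 in the video
            cv2.imshow(window_name, cv2.resize(image_np, size))
            # waitKey(25) waits up to 25 ms and lets the window redraw
            if is_quit_key(cv2.waitKey(25)):
                break
    finally:
        cv2.destroyAllWindows()
```

Note that the key name is case-sensitive here: with caps lock on, the key code is that of an uppercase Q, which this check would not treat as a quit request.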
I pressed F5, anyway, and it's just going to take a little bit for everything to load up. It's got to load in the model and get the GPU version of TensorFlow going, but once it does, it'll pop up, and then I can show you guys some examples of objects being detected. I'm going to go ahead and pause it while we wait. It shouldn't take too much longer, but I won't waste any of your time.

Yep, it was just a second longer, but anyway, here it is, and it looks like I did not camel case waitKey. What an amateur. All right, running again. Pausing. For some reason, I swear... oh, my caps lock is on. Okay, really camel casing now. Pausing again.

All right, so it is ready. It's detecting a person on top of my shelf. That's not very good. And my dogs are barking. Lovely. Anyway, the show must go on. So, okay, it is running, and like I was saying before, the frames per second, it's not like it's doing 60, but it's probably 15 to 20, which is actually pretty good for a running stream of detection. And with all the objects it's detecting here, I mean, I know the list here is 90, but I think the whole COCO, Common Objects in Context, I think it's 300, but I'm not positive. It might be 90. Anyway, there are a lot of possible objects.

Well, anyway, just for an example, we've got person here, which is correct. Chair; sometimes it calls it a couch. Yeah, but that's pretty close, and I think it just sees cushions. It should be able to do a bottle. Mmm, there we go. Oh, apparently it's a wine glass. It's a bottle now. So you've got a bottle. It can probably do a phone, is my guess. Yeah, cell phone. So pretty cool.

So that's just one example of adapting this code to do something other than just classifying images, because, again, to me, classifying the images isn't that exciting, since we've been able to do that with something like the HOG plus
SVM algorithm, whereas classifying video at a decent frames per second is a much more impressive task. And this does it really well, and it's a pretty lightweight model and everything. So this is pretty cool. Now I'm pressing Q.

In the future, what I'd like to do, because I showed some code for it, is basically use this object detection algorithm to create an aimbot in Grand Theft Auto. I'll probably do that, but I won't be including it in this series, so if you want to see that, you can go to the Python Plays GTA series.

Otherwise, the next thing I want to do in this series is show how we can create our own classifier, or our own object, basically. There are a few things we could do. We could train our own object detection algorithm completely and just train it on one object, or we can add this object to an already pre-existing model, or we can use that model's weights to at least give us a head start on training a new object. So anyway, that's the next thing I'd like to cover with this TensorFlow Object Detection API, because up to this point we really haven't trained any model at all. We've just been using a pre-trained model, and honestly there are so many things we can do with that already, but there's still a little bit more that we can get out of this Object Detection API, so I definitely want to show that as well.

So anyway, if you have any questions, comments, concerns, whatever, feel free to leave them below. Otherwise, I will see you in the next tutorial.
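The 15 to 20 frames per second mentioned earlier is an eyeballed estimate. A minimal sketch for measuring it, assuming you call `tick()` once per processed frame inside the main loop; the `FpsMeter` class is hypothetical and not part of OpenCV or the Object Detection API.

```python
import time
from collections import deque


class FpsMeter:
    """Rolling frames-per-second estimate over the last `window` frames."""

    def __init__(self, window=30, clock=time.perf_counter):
        self._stamps = deque(maxlen=window)
        self._clock = clock  # injectable so the meter can be tested

    def tick(self):
        """Record one processed frame and return the current FPS estimate."""
        self._stamps.append(self._clock())
        if len(self._stamps) < 2:
            return 0.0  # not enough frames yet to measure an interval
        elapsed = self._stamps[-1] - self._stamps[0]
        # N timestamps span N - 1 frame intervals
        return (len(self._stamps) - 1) / elapsed if elapsed > 0 else 0.0
```

Printing the value returned by `tick()` every so often should show how the detection rate holds up on your own hardware; the number will depend on the GPU, the model, and the camera.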