Transcript:
Hello, everybody, and welcome to Data Talks, where today we’re going to be doing exercises together. We’re going to be doing pandas exercises. If you want to learn a little more about the pandas exercises we’re doing, there’s a repository link in the description, and there’s also an introduction to this series, if you could call it that, above. So without further ado, let’s get started.
This is the inaugural episode, so we’re starting with getting and knowing your data, and starting with the Chipotle one, whichever one was first. I think my students have actually done this, but I’ve never done it myself, so I’ll be going through it for the first time with you. Hopefully there are no hiccups, but let’s get to it.
So, importing the necessary libraries. We’ll go ahead and time this. So, import pandas as pd. The name pandas comes from panel data, so we’ll go ahead and import the panel data library.
Import the dataset from this address. Okay, so we’ve got a URL with tab-separated values. Lovely. I don’t know if you guys know this, but a really nifty thing about pandas’ read_csv is that it can read from URLs. Another super nifty thing is that it can also read tab-separated values. So, sep, there we go. So we’ve got our DataFrame.
Assign it to a variable called chipo. chipo equals df. Great.
See the first 10 entries. You guys should know this one. Oh, I messed up. Got it. So, chipo.head(). This shows the first entries, and we can specify a number here to get those 10. So, the first 10 entries. Looks like we’ve got order IDs, okay, we’ve got quantities, we’ve got the name of the item, and then we have the choices associated with the order, right? So if you’ve got a chicken bowl, you choose which salsa you get and whether you get, like, black beans or stuff like that. And then we’ve got the price.
Okay, the number of observations in the dataset. This is pretty easy as well.
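The loading steps above can be sketched roughly as follows. This is a minimal sketch, not the code from the video: the rows are made up for illustration, the column names are assumed to match the dataset described in the video, and an in-memory string stands in for the URL, which read_csv would accept in the same position.

```python
import io

import pandas as pd

# Made-up sample rows (assumption); the real exercise reads a URL instead.
SAMPLE_TSV = "\n".join([
    "order_id\tquantity\titem_name\tchoice_description\titem_price",
    "1\t1\tChips\tNULL\t$2.39",
    "1\t1\tIzze\t[Clementine]\t$3.39",
    "2\t2\tChicken Bowl\t[Salsa, [Rice]]\t$16.98",
    "3\t1\tChicken Bowl\t[Salsa, [Rice]]\t$10.98",
    "3\t1\tCanned Soda\t[Diet Coke]\t$1.09",
    "4\t1\tCanned Soda\t[Diet Coke]\t$1.09",
])

# sep="\t" tells read_csv that the fields are tab-separated.
# A URL string could be passed in place of the StringIO buffer.
chipo = pd.read_csv(io.StringIO(SAMPLE_TSV), sep="\t")

# head(n) returns the first n rows (head() with no argument returns 5).
first_rows = chipo.head(3)
print(first_rows)
```

Note that read_csv treats the literal text NULL as a missing value by default, so the first row’s choice_description comes back as NaN.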
You can always just do .info(). This will tell you the number of observations, so 4622. It’ll also tell you which ones are null and which ones aren’t. But you can also do, and I think this is probably what they wanted, .shape. This looks at the underlying NumPy array, so it says there that it’s a matrix that is 4622 by 5.
Okay, the number of columns in the dataset: so, five. And print the name of all the columns. To print the names of the columns, it’s very simple: we just go ahead and do .columns.
How is the dataset indexed? You could look that up with the head, or you can always do .index.
Oh, man, we are zooming. There are 17, okay, so we’re almost halfway done.
Which was the most-ordered item? Okay, let’s look at this. So we have items and we have quantities, so let’s go ahead and do a groupby. We’ll group by item_name, and we’ll pass this in here as a list, so it gives us back a DataFrame. Then .agg. If you guys are interested in learning a little bit about groupby, I’ve done a nice little pandas tutorial on it, so check it out. And we’ll go ahead and aggregate the quantity field. We’re just going to sum that up, and this will give us item quantities, right? Okay. Let’s go ahead. Oh, delete this extra space. Print this out. Okay.
And then the only thing we need to do now is a value sort. Okay, .value_sort. DataFrame object has no attribute value_sort. Okay, .set_index... let’s go ahead and just cheat here. So instead of setting the index, we’ll do this. Now we can do .value_sort... yeah, this is not a DataFrame. item_quants sort... I wish I were getting autocomplete here; unfortunately, it’s not working at the moment. Oh, sort_values. Yep, so always remember to check: I had the thing backwards. So, sort_values.
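The inspection and group-by steps above might look something like this. It is a sketch under stated assumptions: the small DataFrame is invented for illustration, and the variable name item_quants is a guess at what was typed in the video.

```python
import pandas as pd

# Invented stand-in for the Chipotle data (assumption).
chipo = pd.DataFrame({
    "order_id": [1, 1, 2, 3, 3, 4],
    "quantity": [1, 1, 2, 1, 1, 1],
    "item_name": ["Chips", "Izze", "Chicken Bowl",
                  "Chicken Bowl", "Canned Soda", "Canned Soda"],
})

# shape is a (rows, columns) tuple; info() prints a similar summary
# along with non-null counts and dtypes.
n_rows, n_cols = chipo.shape
column_names = list(chipo.columns)
index = chipo.index  # a RangeIndex when no index column was set

# Sum the quantity per item, then sort from largest to smallest.
# The method is sort_values, and it needs by= when called on a DataFrame.
item_quants = (
    chipo.groupby("item_name")
    .agg({"quantity": "sum"})
    .sort_values(by="quantity", ascending=False)
)
top_item = item_quants.index[0]
print(item_quants.head(5))
```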
I’m missing the by argument. Okay, so in this case, that’s what I thought. We’ll go ahead and sort the values by the quantity. Okay, and then we’ll make it descending. Let’s just try it. Okay: by, axis, ascending equals True, so we want ascending equals False. Okay, that should do it. Now, there are a lot of them here, so we’ll just look at the top five. Great. So the chicken bowl, that is the number one. Yep, it took a little bit of time, but we’re a little bit rusty. We’ll get into it.
For the most-ordered item, how many items were ordered? Well, there you go.
What was the most ordered item in the choice_description column? We’re going to do the exact same thing, so let’s put it down here, and instead of item_name, we’re just going to do choice_description. Diet Coke. It sort of depends on how you count this. If you consider rice as one of the items, instead of this sort of combination package, then this will be very, very different. Though to get rice, we’d need to do some really fancy string plus array work here, so let’s just assume this is what they wanted. Okay.
How many items were ordered in total? So, 159? Or it wants all of the items in total, period, in which case we can just go ahead and do chipo.quantity.sum(), which will give us all the items.
Turn the item price into a float. Okay, what does that mean? Right, so item_price right now is a string with this sort of nasty thing on the front of it here, so in order to make it into a float, we’ve got a couple of ways to do it. First we need to do .item_price.str.slice and then just take it from one. You’ll notice this gives us this, and then maybe we can just do astype float. There we go, great. That’s exactly what I wanted. Dot notation is always a little bit messy, so let’s go ahead and look at chipo.info(). This will tell us whether we did this right.
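As a rough illustration of the three steps just described (most-ordered choice, total quantity, and turning the price into a float), here is a minimal sketch. The data is invented, and the column names are assumed to match the dataset in the video.

```python
import pandas as pd

# Invented rows (assumption); prices are strings with a leading dollar sign.
chipo = pd.DataFrame({
    "quantity": [1, 2, 1, 1],
    "choice_description": ["[Diet Coke]", "[Salsa, [Rice]]",
                           "[Diet Coke]", "[Diet Coke]"],
    "item_price": ["$1.09", "$16.98", "$1.09", "$1.09"],
})

# Same pattern as for item_name, grouped on choice_description instead.
# Each distinct description string counts as one "item" here.
choice_quants = (
    chipo.groupby("choice_description")
    .agg({"quantity": "sum"})
    .sort_values(by="quantity", ascending=False)
)
top_choice = choice_quants.index[0]

# Total number of items across all rows.
total_items = chipo["quantity"].sum()

# Drop the first character (the dollar sign), then cast to float.
chipo["item_price"] = chipo["item_price"].str.slice(1).astype(float)
print(chipo.dtypes)
```

Using bracket assignment, chipo["item_price"] = ..., rather than attribute assignment avoids some of the messiness of dot notation mentioned above.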
Look at that: item_price, float64. Now, normally what I recommend doing is using pd.to_numeric. This is kind of your go-to function to convert strings into numeric types, but since we’re going for a speedrun, that’s what we did.
Check the item price type. Okay, looks like I just did that myself.
Create a lambda function and change the type of item price. Okay, so this probably wants us to do something like a chipo.apply with a lambda function. Yep. So I generally don’t really recommend using .apply if you can do it in steps like I did up here. It’s a little bit slower, generally speaking. We’re going to need to rerun these cells up here just so we can get the original chipo dataset back. So we won’t do this; instead, we’ll just do a head here, just so we can sort of guarantee that it was... okay, kill that one.
Okay, so create a lambda function. apply takes in axis equals one; I believe that’s the rows. What else? It takes a function, and we’re going to give it a lambda function, since they want us to do that. This will take the row. Let’s just make this chipo.item_price, so we’ll take the item price, slice off the first character, and then do a float thing to it, and that looks good.
Lambda got an unexpected keyword argument axis. So, a float return. Oh, God, why would they want us to make this into a lambda function? So it’s saying our lambda function got an unexpected keyword argument axis. But I wrote it right here. So, lambda x... let me just make sure: lambda, and a colon goes to... okay. Well, let’s just put this outside of here to make it a little bit easier. So this is our lambda function. We’ll put it here, and we’ll apply this lambda function down here.
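The pd.to_numeric route recommended above could look roughly like this. It is only a sketch: the price strings are invented, and the errors="coerce" line is an extra illustration that is not used in the video.

```python
import pandas as pd

# Invented price strings with a leading dollar sign (assumption).
prices = pd.Series(["$2.39", "$3.39", "$16.98"])

# Strip the dollar sign, then let to_numeric pick a numeric dtype.
numeric_prices = pd.to_numeric(prices.str.slice(1))

# With errors="coerce", values that cannot be parsed become NaN
# instead of raising an exception.
messy = pd.to_numeric(pd.Series(["1.5", "n/a"]), errors="coerce")

print(numeric_prices.dtype)
print(messy)
```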
So it got an unexpected keyword argument. Oh, of course, we don’t need the axis in this case, because we literally have the column here, not the whole DataFrame. Okay, well, there you go. So that’s one way you can do it. I really wouldn’t recommend doing it this way, because I think it’s just generally faster to do it the other way, and it’s better to get to know the environment. But if you like lambda functions, go for it.
Check the type. The type is obviously float64. So let’s go ahead and just assign this, since they really wanted us to do it. So, chipo.item_price. Cool.
How much was the revenue for the period in the dataset? It’s pretty simple; we can just do a sum on this. 24 dollars? Remember to take this 5 off. This is what you get when you try to work really, really quickly. Okay, but we got it. So we’ll apply the lambda and take the sum. This looks more like it: about 34,000 dollars.
How many orders were made during this period? Just look at the quantity sum: 4972.
What was the average revenue per order? We get the item price divided by the quantity. I guess order in this case doesn’t mean quantity; it probably means the exact order, so all you really need to do is find the number of orders, which is actually the shape of the dataset, so 4622. We could just go ahead and take this, so about seven bucks. Cool, cool, cool. The second way you can do this is to just take the average, or the mean. That’s probably what they wanted.
And how many different items are sold? Oh, we are so close. Let’s go ahead and do chipo.item_name.nunique(). Fifty.
Okay, so that was a speedrun through the Chipotle exercises. It took a little bit longer than I thought it would, 15 minutes, but maybe this helps you a little bit. If it did, please go ahead and leave a comment.
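The lambda version and the summary numbers above can be sketched as follows. Series.apply forwards extra keyword arguments to the function it is given, which would explain the unexpected keyword argument error when axis was passed; on a single column there are no rows and columns to choose between, so no axis is needed. The data here is invented, and the final order_id count is one alternative reading of "orders", not what was computed in the video.

```python
import pandas as pd

# Invented rows (assumption); prices chosen so the sums are exact.
chipo = pd.DataFrame({
    "order_id": [1, 1, 2, 3],
    "quantity": [1, 1, 2, 1],
    "item_name": ["Chips", "Izze", "Chicken Bowl", "Chicken Bowl"],
    "item_price": ["$2.50", "$3.50", "$17.00", "$9.00"],
})

# Lambda that drops the dollar sign and converts one string to float.
strip_dollar = lambda price: float(price[1:])

# Applied element by element to the column; no axis argument.
chipo["item_price"] = chipo["item_price"].apply(strip_dollar)

# Revenue as computed in the video: the sum of the price column.
revenue = chipo["item_price"].sum()

# Average per row, two equivalent ways.
n_rows = chipo.shape[0]
average_by_division = revenue / n_rows
average_by_mean = chipo["item_price"].mean()

# Number of distinct item names.
distinct_items = chipo["item_name"].nunique()

# Alternative reading (assumption): count distinct order IDs, since
# several rows can share one order_id.
n_orders = chipo["order_id"].nunique()

print(revenue, average_by_mean, distinct_items, n_orders)
```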
If you’d like to see me do other types of exercises for you, so you can finish your data science homework... no. Well, honestly, I kind of think this is a nice thing to do, to see an actual data scientist try to solve these types of problems. Maybe next time I should go a little more slowly, in a careful, plodding manner. But otherwise, I hope you guys enjoyed it, and if you did, leave a like, subscribe to the channel, and let me know what else you’d want. Thanks.