WHT

Pd Scatter Matrix | Python Pandas Tutorial 31 | Python Data Visualization | How To Create Scatter Matrix

Data Science Tutorials

Transcript:

Hi there! This is Abhishek, and in this video I will talk about the scatter plot matrix that you can create with the help of the pandas library. What I am showing you here right now is the scatter plot matrix. On the x-axis you see Sales, Discount, Profit, all of those different numerical variables, and you see the same numerical variables on the y-axis. The benefit is that in one simple graph you get an overview of how your different variables are correlated with each other. For example, Sales and Profit show a linear relationship: as sales go higher, profit increases as well. So you basically try to see whether any correlation exists or not. For example, between Unit Price and Discount there doesn't seem to be a correlation, and similarly between Profit and Discount, or between Shipping Cost and Discount, there doesn't seem to be a visible linear relationship. The same goes for the histograms, which are presented along the diagonal of this matrix: they don't show a normal distribution, they are skewed, mostly towards the left side.

So how can we go ahead and create this within a Python notebook? I will show you in a few minutes. Okay, so here we are on the scatter plot matrix. I just import pandas as pd, and to get the plots inline I use %matplotlib inline. After that, let's get our dataset. We read it with the help of the pandas library, using pd.read_excel.
In read_excel, the first argument is the path of the file that we need to access, and 0 indicates that we want the first sheet, which is the Orders sheet. We store all of this data in the orders object. After that, let me show you the first couple of observations. So here we have Sales, Discount, Profit, Unit Price, all of those values. If I want to see all the columns, I can use orders.columns, and all my columns are listed here. If I want to know how many rows and columns are present in total, I can use shape, so we know that there are 8,399 rows and 21 columns. If I want more information about these columns, I can use the info function, which tells me, for each column, how many non-null values there are and what the dtype is: whether it is a float, an integer, a datetime, and so on. These are some useful functions that you will use in every analysis just to get an idea about your dataset.

Okay, now let's come to the business of the scatter plot matrix. We call pd.scatter_matrix, and if I show you the parameters that we can fill in, there are frame, alpha, figsize, and others such as diagonal, which by default is a histogram. All of those options have default values, so let's go ahead and create the scatter matrix with the default parameters. What we do in this case is write orders.loc: we want all the rows, and then we want a range of columns.
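
The loading and inspection steps described above can be sketched roughly as follows. The path of the Excel workbook is not given in the transcript, so this sketch builds a small made-up `orders` frame instead; the column names, their spelling, and all values are assumptions chosen to resemble the columns mentioned in the video.

```python
import numpy as np
import pandas as pd

# With the real workbook the call would look something like this
# (the path is a placeholder, it is not stated in the transcript):
# orders = pd.read_excel("<path to the workbook>", sheet_name=0)

# Made-up stand-in data; names and values are assumptions for illustration.
rng = np.random.default_rng(0)
n = 200
sales = rng.gamma(2.0, 500.0, n)
orders = pd.DataFrame({
    "Sales": sales,
    "Discount": rng.uniform(0.0, 0.1, n),
    "Ship Mode": rng.choice(["Regular Air", "Express Air", "Delivery Truck"], n),
    "Profit": 0.3 * sales + rng.normal(0.0, 100.0, n),
    "Unit Price": rng.gamma(2.0, 40.0, n),
    "Shipping Cost": rng.gamma(2.0, 6.0, n),
})

print(orders.head())     # first few observations
print(orders.columns)    # all column labels
print(orders.shape)      # (number of rows, number of columns)
orders.info()            # non-null counts and dtype per column
```

With the real Orders sheet, `shape` would report the 8,399 rows and 21 columns mentioned in the video rather than the size of this toy frame.
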
Over here we have Sales, Discount, then Profit, then Unit Price and Shipping Cost, so let's write 'Sales' : 'Shipping Cost' to cover all of these. What scatter_matrix does is pick up only the numerical columns. It will not take Ship Mode, which sits between Discount and Profit, and will just show Sales, Discount, Profit, Unit Price and Shipping Cost. Even if there were other object columns containing textual data in that range, it would skip them and give us back only the numerical columns. That is the basic information you need to provide, so let me hit Shift+Enter and see what it produces.

Here we have our scatter plot. The issue is that this plot is not properly sized. What we can do is go back up and add a parameter. If I show you the options by typing a comma and pressing Shift+Tab+Tab, there is figsize, which I use to change the size of this figure, and I will say 12 comma 10. Now let me press Shift+Enter again and see how it comes out. Okay, now it looks much better. You can experiment with the figsize parameter to get the figure size that you need: 12 by 10, or maybe 6 by 10, or 10 by 6. It depends on your requirement, so play around until you get it right.

Let's see another parameter, which is very interesting. Right now, for the diagonal representation, we have the histogram, but there is an option to change it to a more visually appealing diagram, the KDE, or kernel density estimate. Let's go ahead and see that. I could copy the previous code so that I don't have to write it again, or I can change it right here. The parameter is diagonal.
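
As a minimal sketch of the column selection and the figsize step, the snippet below uses a made-up `orders` frame (the names and values are assumptions, not the real dataset). It calls `pd.plotting.scatter_matrix`, which is where current pandas versions expose the function that the video refers to as pd.scatter_matrix. The `Agg` backend is only there so the script runs without a display; in a notebook, %matplotlib inline takes its place.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; use %matplotlib inline in a notebook
import numpy as np
import pandas as pd

# Made-up stand-in for the Orders sheet (assumed column names and values).
rng = np.random.default_rng(0)
n = 200
sales = rng.gamma(2.0, 500.0, n)
orders = pd.DataFrame({
    "Sales": sales,
    "Discount": rng.uniform(0.0, 0.1, n),
    "Ship Mode": rng.choice(["Regular Air", "Express Air", "Delivery Truck"], n),
    "Profit": 0.3 * sales + rng.normal(0.0, 100.0, n),
    "Unit Price": rng.gamma(2.0, 40.0, n),
    "Shipping Cost": rng.gamma(2.0, 6.0, n),
})

# All rows, and every column from "Sales" through "Shipping Cost" (label slices are inclusive).
subset = orders.loc[:, "Sales":"Shipping Cost"]

# The text column "Ship Mode" is part of the slice, but scatter_matrix
# plots only the numeric columns, so the grid is 5 x 5.
axes = pd.plotting.scatter_matrix(subset, figsize=(12, 10))
```

The returned `axes` is a 2-D array of matplotlib axes, one per pair of numeric columns, which is handy if individual panels need further tweaking.
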
If I show you the docstring with Shift+Tab+Tab, this diagonal parameter is 'hist' by default, so you can change it to 'kde' and press Shift+Enter. It takes a couple of seconds, and here now you have the KDE, which shows much better information than the histogram about the distribution of the data. You can clearly see that there is a peak here: the values start low, go up to this point, and then come down. That gives us an indication of what is really going on with the data. Is there any pattern that it is revealing, or any information inside the data that you should explore? So that's how you can change to the KDE.

Apart from that, as you must have seen, if I go here and press Shift+Tab again, there are marker and alpha, which are very basic ones, and then density_kwds, hist_kwds and range_padding. You can go to the documentation of scatter_matrix if you want to go deeper and understand all of these parameters, which you can use to create an impressive scatter matrix. The idea was to give you a quick overview of the scatter matrix and how you can use it. I find it very useful, and I always create this matrix whenever I am doing any data analysis or any data science project. So that's pretty much it, and I'll meet you in a new video with a new topic.
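
Switching the diagonal to a KDE might look like the sketch below, again on a made-up `orders` frame with assumed column names and values. Note that `diagonal="kde"` relies on SciPy being installed, and the `alpha` and `marker` values here are arbitrary choices for illustration, not settings taken from the video.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; use %matplotlib inline in a notebook
import numpy as np
import pandas as pd

# Made-up stand-in for the Orders sheet (assumed column names and values).
rng = np.random.default_rng(0)
n = 200
sales = rng.gamma(2.0, 500.0, n)
orders = pd.DataFrame({
    "Sales": sales,
    "Discount": rng.uniform(0.0, 0.1, n),
    "Profit": 0.3 * sales + rng.normal(0.0, 100.0, n),
    "Unit Price": rng.gamma(2.0, 40.0, n),
    "Shipping Cost": rng.gamma(2.0, 6.0, n),
})

# diagonal="kde" draws a kernel density estimate on the diagonal
# instead of the default histogram (diagonal="hist"); it needs SciPy.
axes = pd.plotting.scatter_matrix(
    orders,
    figsize=(12, 10),
    diagonal="kde",
    alpha=0.5,    # transparency of the scatter points
    marker=".",   # marker style for the scatter points
)
```
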
