Runtimewarning: Invalid Value Encountered In Double_scalars | Seaborn Pairplot | How To Make A Pairplot In Python And The Seaborn Pairplot Interpretation

Kimberly Fessel

Transcript:

Hi, welcome back to this intro to seaborn. My name is Kimberly Fessel, and today we’re talking about the seaborn pair plot. Coming up in this video, I’ll start off with the basics of the pair plot, including how to code up a basic pair plot in seaborn as well as the interpretation of the pair plot. I’ll then move on to the different kinds of figures that you can display on your pair plot, as well as how to use categorical variables and how to select certain variables to appear. Finally, I’ll leave you with some advanced styling so that you can make your pair plot look exactly the way that you want it. So with that said, let’s dive into some seaborn code. By the way, you can check out all of the code I’m about to show you on my GitHub page, and if you have any additional questions, feel free to leave me a comment below.

So in the seaborn code, the first thing I’m going to do is, of course, load the seaborn library as well as the pyplot and NumPy libraries, and I’m also loading in some data from seaborn called tips. Each row in this data frame represents the tip amount that a server received based on one set of customers. I’ll also go ahead and set my seaborn style to be darkgrid just for aesthetic purposes, and now we’re ready to create our first pair plot. To do that, I’m just calling up the seaborn library and referencing pairplot, and now what I want to do is actually pass the full tips data frame. What seaborn has done is now created multiple different figures based on the variables in my data frame. If I take a look at this data frame’s data types, what I would see is that I have three different numerical values and four categorical values in that original data frame. Seaborn is actually detecting that I have three numerical values, and that is what it is plotting on my pair plot. So taking a look at what some of these figures are, I see a nice histogram in this upper left corner.
That one corresponds to the total bill: if I look at my y label, it’s total bill, and my x label is also total bill. So I’m going to see histograms or some kind of distributional plot along the diagonal, which just gives me an estimate of the marginal distribution of each numerical feature in my data frame. On the off diagonals, however, I’m going to see some kind of relationship plot. Right now I see a scatter plot, and what I can see in this first figure is total bill versus my tip amount. So this could actually help me try to figure out whether there is some kind of relationship between total bill and tip amount. It looks like that’s roughly linear; I might even decide to build a linear regression model based on what I see in this pair plot. One final thing to point out before I move on: note that these tick labels actually correspond to the relational plots and not the histogram. So if I actually drew out a histogram for just that size feature, I would find that I have over 150 parties that fall into this size 2 category. But when I look at my pair plot, I do not see that on the y-axis here at all, because these tick labels correspond to the relationship plot and not the histogram.

So the seaborn pair plot actually combines together many different seaborn figures all on one plot, so you may see a scatter plot or a regplot on the off diagonals and a histogram or a KDE on the diagonals. Let’s take a look at the seaborn code that allows you to change these options. First up, just notice that the default here is going to have scatter plots on the off diagonals and histograms along the diagonals. But you can switch what is on your diagonal by referencing this argument, diag_kind. I’ll just go ahead and switch that over to a KDE plot instead. You also have the option to change what is on your off diagonals. Again, right now we have a scatter plot, but I can go ahead and switch the off diagonals by referencing this argument, kind, and I’ll just switch that to a regplot. I also just wanted to mention that the pair plot actually returns a pair grid, so if I create a pair plot and then save the return object as g, I can check the type of that object: it is a PairGrid, a seaborn PairGrid. I’ll talk more about these pair grids and also facet grids in upcoming videos. But for now, just note that you could save this pair grid and then actually add to it. So here I’m going to add a KDE plot on the upper triangle part of this pair grid, and here I’ve added in that bivariate KDE plot.

So far, we’ve just looked at numeric data. But what can we do with categorical values? Well, it turns out that the seaborn pair plot allows you to display categories through a color or hue property. We can also display only certain variables along either the x- or y-axis. So let’s take a look at the seaborn code that allows us to do this. First off, I’m going to create a new column of data called weekend, and this basically just tells me if this day is a Saturday or Sunday, so true or false. Now I’m actually going to use this new weekend column in my pair plot, so here I’m going to specify that the hue, or the color, of my pair plot should correspond to this new weekend column. What I’m seeing now is that true has been mapped to orange and false has been mapped to blue. I do see each of these scatter points reflecting that hue change, and I’ll also see that the diagonal has now switched over to a default of KDE, and I see two different distributions for weekend and not weekend. This can be really helpful in diagnosing whether or not this category is important for a particular feature. For example, it looks like people tend to spend a little bit more on the weekends as compared to the weekdays.
One other thing that happens, though: I get this error, and I’m actually getting a whole new column and row of figures. What’s happening here is that seaborn is noticing that I’ve created a new column, and it is actually a Boolean column. Boolean columns will be treated as numeric for this pair plot, so we actually see a brand new row and column of figures. These figures are definitely not super helpful, so what I might want to do is not display that weekend column of data in my pair plot, and I can specify the exact variables that I would like to include. So let’s say I actually just want to see the total bill and the tip amount. Now seaborn has switched down to a two-by-two pair plot with only those two variables, the total bill and the tip amount. I can also be super specific about exactly which variables should appear on which axes. So let’s say this was a regression problem, and I’m really just interested in how a couple of these variables factor into the amount that someone leaves as a tip. Let’s say I’m actually only interested in total bill as well as the party size. Here I’ve specified that along the y-axis I should only have that tip column, that tip feature, whereas along the x-axis I actually do want to see both the total bill as well as the size, and I’ve still included weekend as my hue value. So I’m able to see that there are some differences on weekend versus not weekend, especially over here in the regression between tip and total bill.

Now that we’ve got the basics down, let’s take a look at some advanced styling for the seaborn pair plot.
The first thing I’m going to do is actually delete that weekend column that I just created, just to streamline things a little bit. One thing that we can do for styling that can actually make quite a bit of difference is change the height or aspect ratio of your pair plots, so here is a tiny little widget in order for us to see the difference. These two properties are called height and aspect. The default for height is 2.5 and the default for aspect is 1, so let’s just take a look at what changing these two will do. If I start decreasing the height, let’s drop it down a little bit, you’ll see what happens: the pair plot just becomes smaller. Or I can, of course, increase that and make it a little bit bigger; now I’ve got larger figures. The aspect ratio does probably exactly what you’d think. If you start decreasing that, you’ll see these really skinny sort of plots occur, or, of course, you can increase that and make wider plots. This is basically just the aspect ratio, the width versus the height. Another option you have, of course, is to change the color palette. Right now, the default palette makes things look like blue, orange, green, and red. If you’d like to change that, you can do this with this palette option. Let’s actually change this over to plasma, and you can choose from any one of seaborn’s different palettes to style this exactly the way you’d like.

And we can pass keyword arguments to specific parts of our pair plot: either to the off diagonals with plot_kws, to the diagonal with diag_kws, or to the grid with grid_kws. Here’s that seaborn code. There are many, many different keyword arguments that you can pass into these pair plots in order to make them look exactly the way that you’d like, and the two main ways that you’re going to do this are with either diag_kws or plot_kws. Let’s start with diag_kws.
This is going to accept a dictionary, and whatever keyword arguments you put in this dictionary will be passed to the plot type that is on the diagonal. Right now we have that histogram, so, of course, I could change the color. Let’s switch that over to gray. Now, this is just changing what is on the diagonal because we used diag_kws, and we could also potentially switch this to be a little bit more transparent with alpha. So the kind of plot you have on the diagonal will determine what kind of keywords you can pass to this diag_kws argument. You can also, of course, change the styling of your plots on the off diagonal. Right now I actually have a regplot. I can change the keyword arguments of that regplot by referencing plot_kws, and this again accepts a dictionary. What I’m passing in here are any keyword arguments that the regplot would normally take. So one thing I know I can do in a regplot, if I want to turn this shaded area about my line off, is use a property called ci, for confidence interval, and I just switch that over to None. Now that completely turns off that shading. I can also switch the color if I’d like; let’s switch it to xkcd’s salmon color, a nice, bright pink here. If that turns out to be a little too bright for you, you can actually change the color of only the scatter points. We do this by referencing scatter_kws, and this again takes another dictionary, but what we’re doing now is referencing the color of the scatter points only. So notice how this structure works: plot_kws just passes the keywords to the regplot, the regplot has an argument called scatter_kws, which accepts a dictionary, and so on. So we’re really just nesting these keywords in each other, and, of course, we can change maybe the size of those dots as well.

So thanks so much for joining me to learn about the pair plot today.
If you want to learn more about any of those figures you saw on the pair plot, go ahead and check out some of my past content, for example, these videos on the regplot and the distplot. Thanks so much, and I’ll see you in the next one.
