Transcript:

The name logistic regression derives from the logit transformation, which is used in the background of the regression. How is it different from the ordinary least-squares, or linear, regression that we've covered in our previous chapters? In logistic regression, the outcome, or the dependent variable, is always a binary variable; that is, it can take only two categories or two states: yes/no, on/off, one or zero. An example would be whether a customer will pay or default on a loan. The credit risk of a customer is considered a binary outcome variable. When a customer applies to a bank for a loan, which could be a home loan or a personal loan, the bank wants to predict whether the customer will be able to pay back the loan or will default on it. So that's an example where logit regression can be used. Another example is whether the customer will respond to or ignore a marketing offer that a company may make; that's an example of marketing response. In this case, the company can predict whether a customer will take up an offer or not, and that can determine whether the company will actually make the offer, or spend money on mailing an offer to that customer, or not. If you can already predict that the customer will default or will not take up the offer, then you would not want to market to that potential customer or give a loan to that applicant. Another example is whether the customer will churn out or stay loyal to your service.
For example, in telecom, millions of customers change their telecom subscription or telecom provider every month. So if the telecom provider already knows, or can predict, that a particular customer is likely to churn out or turn down the service, they can go ahead and make some offers or try out some retention techniques on those customers, so that the churn can be avoided. All of these cases require the organization to be able to predict, or to model, a binary behavior, and this is where logistic regression comes into play, because logistic regression is a technique in which a binary dependent variable can be modeled. In ordinary least-squares regression, by contrast, the dependent variable is a continuous value ranging from, let's say, minus infinity to plus infinity. For example: what is the price of a house? How much can the customer pay? What is the average time taken to complete a particular task? Those are all continuous outcomes where ordinary least-squares regression would be used. But in the cases that we just discussed, it's a binary dependent variable, where we are looking at only two particular states: on/off, 1/0, yes or no. In these cases, a logistic regression makes more sense.
So how does logistic regression actually work in the background? This is a very basic example where we are trying to predict whether a student will pass or fail an exam based on the number of hours studied. Common sense tells us that if you do not study for an exam at all, you are unlikely to pass, and if you do study, then depending on how much you study, you are likely to pass. This particular behavior, however, does not follow a linear trend. First of all, getting linear trends in probabilities is very difficult, and here we are looking at the probability of passing or failing.
Now, what you see over here is that we are trying to predict the probability of passing or failing. If we do not study at all, the probability of passing is very low, but it's never zero. As you continue studying more and more, the probability of passing starts going up and up, and beyond a certain number of hours it doesn't matter whether you study for 40 hours or for 100 or 200 hours for that exam. You cannot over-pass. There are only two possible cases, you either pass or fail, and the probability starts to taper off. So you get asymptotic approaches on both sides: at the low end the curve trends toward a zero probability, and as the number of hours studied goes up, it starts going up and then again approaches an asymptote toward one. So there is no guarantee that you will pass even after studying for a hundred hours, and there is no guarantee that you will fail if you go and attend the exam without doing any studying. You could just mark everything off as A's and you might still pass. So there is always a nonzero probability of passing with no studying, and there is never full certainty of passing even after studying a lot. This particular graph that you see, of the probability changing based on the values of X, is what we try to model in logistic regression.
So can we train a regression model on this relationship? Can I figure out the probability depending on the value of X? If I study for 30 hours and I fit this model, it tells me that the probability is 0.6, or that I have a 60% chance of passing this exam. If I study for only 25 hours, I have about a 0.1, or 10%, chance of passing the exam. And similarly, if I go ahead and study for 40 hours, I have a virtually 100% chance, or let's say not 100% but a 99.999 percent chance, of passing this exam. This behavior is what we are trying to model now.
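To make the shape of that curve concrete, here is a minimal Python sketch. It is not the fitted model from the lecture: it simply assumes a one-variable logistic curve and solves for the two coefficients that pass through two of the readings quoted above, taken here as 0.1 at 25 hours and 0.6 at 30 hours. The names `pass_probability`, `BETA_0`, and `BETA_1` are made up for illustration.

```python
import math


def logit(p):
    """Log of the odds for a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def logistic(z):
    """Inverse of the logit: maps a real number to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Assumed readings from the graph: (hours studied, probability of passing).
H1, P1 = 25.0, 0.1
H2, P2 = 30.0, 0.6

# On the log-odds scale the curve is a straight line, so two points fix it.
BETA_1 = (logit(P2) - logit(P1)) / (H2 - H1)  # slope per hour studied
BETA_0 = logit(P1) - BETA_1 * H1              # intercept at zero hours


def pass_probability(hours):
    """Probability of passing under the illustrative two-point curve."""
    return logistic(BETA_0 + BETA_1 * hours)


for hours in (0, 10, 25, 30, 40):
    print(f"{hours:>3} hours -> P(pass) = {pass_probability(hours):.6f}")
```

Under these assumptions the printed values stay tiny but positive for zero hours, pass through 0.1 and 0.6 at 25 and 30 hours by construction, and sit close to, but still below, 1 at 40 hours, which is the S-shaped behavior described above. A real model would estimate the coefficients from many students' outcomes rather than from two points read off a graph.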
This could be just one of the behaviors. Another variable could be the number of lectures attended for this class. That could be another dimension I could use. Obviously, if you start cutting too many lectures or you don't show up, you are unlikely to pass that particular subject. Similarly, there could be other factors which determine your probability of passing, for example, how many marks you got in your previous subjects, and so on. So this is an example where we are only looking at the number of hours studied on one side and whether the student passed or failed on the other. Once we build this particular relationship, we can try to fit a regression equation on top of it.
So again, a few concepts of logistic regression. We try to model the probability of an event rather than the measure itself. The measure is always 0 or 1, but the probabilities will always range from 0 to 1: the minimum probability of any event is 0 and the maximum is 1. So we need to create the dependent variable as a probability range, and this requires a transformation of the binary nominal variable in the regression. A logit transformation is required on the dependent variable, hence the name logistic regression. The assumptions of OLS regression are still valid; however, deviations are tolerated to a large extent, and in many cases we mostly require a rank order. So what does a rank order mean? This is something that we will cover later on in our lectures.
So let's start looking at some basic mathematics behind logistic regression. In a logistic regression, we actually model the log of the odds of an event. Now, what do we mean by odds? Odds means probability divided by 1 minus probability, or equivalently, probability is equal to the odds divided by 1 plus the odds. So let's say the probability of an event happening is 0.5. What do we say are the odds of that event?
The odds of the event then become 0.5 divided by 1 minus 0.5, which essentially becomes 1 by 1, or we say the odds of the event happening are 1 is to 1. You may have heard this term, odds, in terms of bookies, when we say a particular match is happening and the odds of a country or a team winning the match are, let's say, 5 is to 4. What does this mean? When we model 5 is to 4 as a probability, it is simply the odds divided by 1 plus the odds, or 5 by 4 divided by 1 plus 5 by 4. An easy way of saying it is that out of 9 chances, 5 are wins, so the probability is 5 by 9. You can then calculate the probability as roughly 60%, or 0.6; in fact, it's less than 60%, it's about 55% over here. This is what we mean by odds, and we try to fit a regression equation which models the odds of an event happening.
In a logistic regression equation, the dependent variable that we are trying to model is the log of the odds, and this is an equation that you've seen before. This is the regression equation itself, which is beta naught plus beta 1 X1 plus beta 2 X2, and so on, where beta naught is the intercept, beta 1 to beta n are the estimates against the variables, and X1, X2, X3 are your independent variables. So this is similar to a regression equation. However, we know that probability is always restricted from 0 to 1; it cannot go below 0 and it cannot go above 1. But what happens when we convert this into the log odds, or the logit? For example, what do the log odds become when the probability is equal to 0? The odds become 0 over 1 minus 0.
What we have over here is the log of 0 over 1 minus 0, which becomes the log of 0, which tends to minus infinity. And if the probability is 1, then this becomes the log of 1 divided by 1 minus 1, which is the log of 1 divided by 0, which tends to positive infinity. So in this case, even though the probability is fixed between 0 and 1, the log odds range from minus infinity to plus infinity. I've done a lot of redlining and it has become a little confusing to read over here, so let's discard all the annotation that we've done and go back.
So again, this is just a quick recap. In logistic regression, we are modeling the log of the odds, also known as the logit, where the variables x1, x2, x3 are in a data set similar to what you would expect in any regression problem. The probability is always fixed between 0 and 1; however, under the same restriction, the log odds can range from minus infinity to plus infinity.
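The arithmetic in this recap can be checked with a short Python sketch. It only restates the definitions given in the lecture (odds = p / (1 - p), probability = odds / (1 + odds), logit = log of the odds). The function names are invented for illustration, and the probabilities in the loop are arbitrary values chosen to show the trend toward the two infinities.

```python
import math


def odds_from_probability(p):
    """Odds of an event: probability divided by 1 minus probability."""
    return p / (1.0 - p)


def probability_from_odds(odds):
    """Probability of an event: odds divided by 1 plus odds."""
    return odds / (1.0 + odds)


def log_odds(p):
    """The logit: natural log of the odds, defined for 0 < p < 1."""
    return math.log(odds_from_probability(p))


# A probability of 0.5 gives odds of 1 is to 1.
print("odds at p = 0.5:", odds_from_probability(0.5))

# Odds of 5 is to 4 give a probability of 5/9, a little under 56%.
print("probability at odds 5:4:", probability_from_odds(5 / 4))

# The probability stays strictly between 0 and 1, while the log odds grow
# without bound in either direction as p approaches 0 or 1.
for p in (1e-9, 0.001, 0.5, 0.999, 1 - 1e-9):
    print(f"p = {p:<12} log odds = {log_odds(p):+.3f}")
```

The endpoints p = 0 and p = 1 are deliberately left out of the loop: the log of 0 and division by 0 are undefined, which is the code-level counterpart of saying the logit tends to minus or plus infinity there.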