Transcript:
[MUSIC] Hey, everyone, how’s it going? In this video, we’ll be talking about something that I’ve gotten a lot of requests for, and that I’ve also wanted to make for a while: the Dickey Fuller and the augmented Dickey Fuller tests. As we know, a big deal in time series modeling is to make sure that the series you’re modeling is stationary. If you’re not familiar with that term, I’ll link my video on stationarity in the description and you can check that out, but it basically means that the properties of the time series, such as the mean and standard deviation, are not changing over time. Of course, we can always look at a time series and say "I think it’s stationary" or "I think it’s not stationary," but we need some kind of robust formal test to tell us. That is what the Dickey Fuller and augmented Dickey Fuller tests are for.

In this video, we’ll go through those tests at a pretty high level, so I’m not going to delve into the weeds of the mathematics. If that’s something you want, go ahead and leave a comment and I’ll try to do that in the future. So this is going to be high level: first the Dickey Fuller test, then how we extend that to the augmented Dickey Fuller test, and then we’ll even look at the code in Python, which will be made available to you.

So let’s start off with the Dickey Fuller test. The Dickey Fuller test basically assumes that our time series in question is an AR(1), or autoregressive order-one, process. In a nutshell, that just means that our time series y sub t is equal to some constant mu, plus some coefficient phi 1 times that same time series lagged one period. So this is the crucial part right here, which makes this an AR(1) model: it just means that our time series is a linear function of itself lagged one time period in the past. Okay, now the null hypothesis H naught for the Dickey
Fuller test is that phi 1, this coefficient in front of the lagged time series, is equal to 1. And, of course, if you’ve watched my unit roots video, which I’ll also post below, you know that this would mean that the time series has a unit root, and that means that if we were to draw a graph of the time series, it would not be stationary. We know that if this phi 1 is 1 or greater, the time series is not stationary. The alternative hypothesis is that phi 1 is less than 1, in which case the time series would be stationary. So now we see why this null and alternative hypothesis do indeed test for stationarity.

Now, the first thing that we’re going to do is subtract y sub t minus 1 from both sides, so we subtract it from the left-hand side, and we subtract it from the right-hand side as well. We introduce the notation Delta y sub t, the first difference, which is equal to y sub t minus y sub t minus 1. That’s basically how we go from this equation to the equation written just below. We also introduce a new symbol here, lowercase delta, and delta is simply equal to phi 1 minus 1. With this transformed version, we can write the null and alternative hypotheses in terms of delta instead of phi 1. If phi 1 were equal to 1, that would mean delta is equal to zero, and if phi 1 is less than 1, that would mean that delta is less than zero. Okay, so now we’re dealing with this model and this null and alternative hypothesis.

The reason we do this transformation is so that we can make the left-hand side stationary. What I mean by that is: assume the null hypothesis is true, so assume that delta is equal to zero. Then the lagged term drops out, and Delta y sub t is equal to mu, a constant, plus epsilon sub t, which is assumed to be some normally distributed random noise. So that would mean that under the null hypothesis, our time series in question, which is now Delta y sub t, is stationary.
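Written out, the transformation described above looks roughly like the following. This is a sketch reconstructed from the spoken description, not a copy of what is shown on screen; the error term epsilon sub t in the first line is assumed from its later mention.

```latex
\begin{align*}
y_t &= \mu + \phi_1\, y_{t-1} + \varepsilon_t \\
y_t - y_{t-1} &= \mu + (\phi_1 - 1)\, y_{t-1} + \varepsilon_t \\
\Delta y_t &= \mu + \delta\, y_{t-1} + \varepsilon_t,
  \qquad \delta = \phi_1 - 1 \\[4pt]
H_0:\ \phi_1 = 1 \ &\Longleftrightarrow\ \delta = 0 \\
H_1:\ \phi_1 < 1 \ &\Longleftrightarrow\ \delta < 0 \\[4pt]
\text{under } H_0:\quad \Delta y_t &= \mu + \varepsilon_t
\end{align*}
```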
The reason we can’t just do a simple t-test for the value of delta is that y sub t minus 1 is still non-stationary. Why is y sub t minus 1 non-stationary? Because y sub t is assumed to be non-stationary under the null hypothesis, which is that we have a unit root. Okay, but it turns out that we can compute the exact same t-statistic. It’s just that the distribution that we compare it against is not the t-distribution, but instead a specialized distribution called the Dickey Fuller distribution.

So explicitly, what we do next is say that the t-statistic for delta hat (delta again being the coefficient in front of the lagged version of the time series) is equal to delta hat divided by the standard error of delta hat. What’s happening here is nothing special. You’re just computing the same old t-statistic you would if this was just any old regression. The only difference is that instead of comparing the t-statistic we just calculated against the normal t-distribution, we need to compare it against the Dickey Fuller distribution, and I’ll leave a link to that distribution in the description below. Again, the reason we need to do this is that under the null hypothesis, y sub t minus 1 has a unit root and is therefore non-stationary. If we knew that it was stationary, we could just do the same thing that we’ve always been doing with the t-distribution.

So we go ahead and compare this t-stat against the Dickey Fuller distribution, and we find either that the t-statistic is less than the critical value or that it’s greater than the critical value. If it’s less than (more negative than) the critical value, we reject the null hypothesis. What that means in real terms is that we reject that it has a unit root, which means that we say that the time series is in fact stationary.
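As a rough illustration of the steps just described, here is a minimal sketch that computes the t-statistic for delta by ordinary least squares and compares it against a Dickey Fuller critical value. The function name `dickey_fuller_tstat`, the simulated series, and the seed are hypothetical choices for this example, and the value -2.86 is the approximate large-sample 5% critical value for the constant-only case, hard-coded here as an assumption rather than looked up from a table.

```python
import numpy as np

def dickey_fuller_tstat(y):
    """t-statistic for delta in  dy_t = mu + delta * y_{t-1} + eps_t  (OLS)."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                      # dy_t = y_t - y_{t-1}
    y_lag = y[:-1]                       # y_{t-1}
    X = np.column_stack([np.ones_like(y_lag), y_lag])  # columns: [1, y_{t-1}]
    coef, *_ = np.linalg.lstsq(X, dy, rcond=None)      # [mu_hat, delta_hat]
    resid = dy - X @ coef
    dof = len(dy) - X.shape[1]           # observations minus fitted parameters
    sigma2 = (resid @ resid) / dof       # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)
    se_delta = np.sqrt(cov[1, 1])        # standard error of delta_hat
    return float(coef[1] / se_delta)

# Assumed approximate 5% critical value (regression with a constant, large n).
CRIT_5PCT = -2.86

rng = np.random.default_rng(0)           # arbitrary seed for reproducibility
eps = rng.normal(size=500)

# Illustrative stationary AR(1) with phi_1 = 0.5 (value chosen for the example).
stationary = np.zeros(500)
for t in range(1, 500):
    stationary[t] = 0.5 * stationary[t - 1] + eps[t]

# Illustrative unit-root series: a random walk (phi_1 = 1).
random_walk = np.cumsum(eps)

for name, series in [("AR(1), phi=0.5", stationary), ("random walk", random_walk)]:
    t_stat = dickey_fuller_tstat(series)
    decision = "reject H0" if t_stat < CRIT_5PCT else "do not reject H0"
    print(f"{name}: t = {t_stat:.2f} -> {decision}")
```

The regression itself is ordinary; only the reference distribution for the t-statistic changes, which is why the critical value is a separate constant instead of coming from a t-table.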
If the t-statistic is instead greater than the critical value, we do not reject H naught, and we do not have evidence to say that it is stationary. Okay, so that’s how we decide robustly, for a simple AR(1) model, whether the time series is or is not stationary.

Now, of course, the natural question is that time series models can and will be more complicated than AR(1). So how do we extend this to something more complicated? That’s where the augmented Dickey Fuller test comes in. The augmented Dickey Fuller test starts with the assumption that the time series is not a simple AR(1) model. It says that it’s a more complicated AR(p) model, which is written right here. We begin with the same transformation: we subtract y sub t minus 1 from both sides. On the left-hand side, we get Delta y sub t. Oh, this is a mistake; what we get is Delta y sub t equal to mu, plus delta times y sub t minus 1, plus all this other stuff over here, which is the lagged differences with their beta coefficients. Okay, so the null and alternative hypotheses for the augmented Dickey Fuller test are actually exactly the same: we’re again testing whether delta is equal to zero versus delta is less than zero, and the process is the same. We go ahead and calculate the t-statistic for delta the same exact way we did here, compare it against the same Dickey Fuller distribution, and go through the same conclusion process. So if that t-stat is less than the critical value, we say it’s stationary. Otherwise, we say it is non-stationary.

Now, the one extra step we do with the augmented Dickey Fuller test is that we have all of these other coefficients, beta i, and for these we can use a typical t-distribution. We can go ahead and just calculate the t-stat for each of these betas, which is simply beta i hat divided by the standard error of beta i hat, for each of them.
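The augmented regression described above can be written as follows. The exact form shown on screen is not captured in the transcript, so this is the standard textbook form, given here as an assumption about what "all this other stuff" refers to.

```latex
\begin{align*}
y_t &= \mu + \sum_{i=1}^{p} \phi_i\, y_{t-i} + \varepsilon_t \\
\Delta y_t &= \mu + \delta\, y_{t-1}
  + \sum_{i=1}^{p-1} \beta_i\, \Delta y_{t-i} + \varepsilon_t,
  \qquad \delta = \sum_{i=1}^{p} \phi_i - 1 \\[4pt]
t_{\hat\delta} &= \frac{\hat\delta}{\operatorname{SE}(\hat\delta)}
  \quad \text{(compare with the Dickey Fuller distribution)} \\
t_{\hat\beta_i} &= \frac{\hat\beta_i}{\operatorname{SE}(\hat\beta_i)}
  \quad \text{(compare with the usual } t\text{-distribution)}
\end{align*}
```

With this form, delta equals zero exactly when the AR coefficients sum to 1, which is the unit-root case.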
We compare this t-statistic against the critical value from just the typical t-distribution, and then we go ahead and make a conclusion about whether each of these is significant. So this was a pretty high-level look at the Dickey Fuller and augmented Dickey Fuller tests.

Let’s round out this video by looking at some very simple code in Python you can run to do this for yourself. All right, you’ll just need the one special library, statsmodels, specifically statsmodels.tsa.stattools, and from there you’ll import adfuller. So this is the augmented Dickey Fuller test. I have a function here to just generate some simulated AR data. I won’t really explain it too much, but that’s what it’s for. And here is the function to perform the test. It’s very simple. It just accepts the time series, you put the time series right into the adfuller function, and from what comes back we use two things: result[0] gives you the test statistic, and the p-value, which is what we care about more, is stored in result[1]. So again, if the p-value is less than .05, we’ll say that it is stationary. However, if the p-value is greater than .05, we’ll say that it is not stationary.

So I just generate an AR(1) process here with phi 1 equal to .05. We know for a fact this is stationary, and if we look at it, it looks visually stationary, but we never know, so we go ahead and just perform the test, putting in our AR(1) process, and we get a p-value of 0.000, a bunch of zeros, then a one, so strong evidence that this is stationary, which in fact it is.

Now, what if we put in a non-stationary process? So I generated an AR(1) process with phi equal to 1, so this clearly has a unit root, and we can even see visually that there’s strong evidence of sticking to high and low values. If I perform the ADF test here, we get a p-value of 0.66, which is very high, so again we get evidence that this is not stationary.

We can do this on more complicated ones, so we can do an AR(2) process. I have phi 1 equal to .05 and phi 2 equal to 0.3, so this is stationary.
If I look at the graph, of course, it does look stationary. If I run this data through the augmented Dickey Fuller test, I get a very, very low p-value again, which reinforces that this is a stationary time series. And just to round out the story, here’s an AR(2) process whose coefficients sum up to 1; therefore, it is not stationary. We can see that pretty clearly visually: it sticks to high and low values. If I put this AR(2) process through my augmented Dickey Fuller test, I get .52, which is clearly higher than .05, so this is not stationary.

So this code I’ll link in the description below, and it’s a very, very nice, easy way to tell if your time series is stationary or is not stationary. Okay, so please like and subscribe. Hope you liked this video, and I’ll see you next time.