Transcript:
In this video, I will talk about time series ARIMA models in R. Before you watch this video, please make sure that you have watched my other video called Time Series ARIMA Models and another one called Time Series ARIMA Models Example. I have opened up the R program here and I have executed it, and we will go over it. You can download the program and the data and see a lot more information on the website; click on the link below the video. So let's start.
The first thing to do is install the package tseries. So remove the comment sign, install the package, and load it with library. The next thing is to read the CSV file called timeseries_ppi, for the producer price index. PPI would be our dependent variable, our Y variable, and the difference of this variable would be called d.Y, where d is the difference operator. The quarter would be denoted as t in the program.
So we can do some summary statistics. If you do summary of Y, here are the results: the mean is about 64, and the mean of the differenced variable is 0.46.
The next thing you can do is plot the data. You can plot t against Y and see what the variable looks like. From this graph, the variable doesn't seem stationary; you see it increasing. So let's look at the differenced variable. The mean is a little bit more constant, but unfortunately the variance got a little bit bigger. We're not going to worry about the variance in this video.
Okay, so what are the formal tests for stationarity? They're called the Dickey-Fuller tests, and the command for that is adf.test. You put in the variable that you're testing, and you can test either with the alternative being stationarity or with the alternative being explosive.
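The differenced variable d.Y described above is just each observation minus the previous one. As a minimal sketch in plain Python rather than the R program used in the video, with made-up numbers standing in for the PPI data (the values and the helper name difference are assumptions for illustration):

```python
from statistics import mean

def difference(series):
    """Return the first difference: series[t] - series[t-1]."""
    return [curr - prev for prev, curr in zip(series, series[1:])]

# Hypothetical quarterly index values, invented for illustration (not the PPI data).
y = [100.0, 101.5, 103.0, 104.0, 106.5, 108.0]
t = list(range(1, len(y) + 1))   # time index, one step per quarter
d_y = difference(y)              # one observation shorter than y

print("mean of y:  ", mean(y))
print("mean of d.y:", mean(d_y))
```

The level series keeps climbing while its differences hover around a roughly constant value, which is the pattern the two plots in the video are meant to show.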
Okay, in the adf.test call, k = 0 is the number of additional lags that you put in there. So let's look at the results. Here's the Dickey-Fuller test: the test statistic is -0.729, we don't have any additional lags, the p-value is very large, and the alternative hypothesis is stationarity. With this kind of p-value, we cannot reject the null; therefore, we have non-stationarity. Now, if we have the alternative being explosive, the p-value is smaller than 0.05, and therefore we accept the alternative hypothesis, which is explosive, which means again that the series is not stationary. Typically other software, and a lot of the data examples that I give you, use the first form, where the alternative is stationarity. That is a little bit counterintuitive, because the null is non-stationarity, so we want significant results in order to say we have stationarity.
Okay, so you can also run the Dickey-Fuller test as a regression. These commands here will not run with your data, so you need to change them: here you would need to put a regression of the differenced variable on the lagged variable. You can also regress the differenced variable on the lagged variable with the trend variable added. Again, all these variables are already in my data set, so you would need to have these predefined in your data set or this part of the code will not run for you.
Okay, so here is the summary. These are the coefficients, and this is the coefficient on the lagged PPI; the dependent variable is the difference, and this is the lagged PPI, and this is the t-value for it. Now, if you look at the one with trend, this is the coefficient, and this right here is the test statistic. If you remember what we had above, the test statistic reported by the Dickey-Fuller test is the same as this one, so it's coming from running this regression.
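The regression behind the Dickey-Fuller statistic can be sketched in a few lines of plain Python. This is an illustration under simplifying assumptions, not the R code from the video: it includes only a constant (the video matches the reported statistic to the regression that also includes the trend), the two example series are invented, and the function name dickey_fuller_t is hypothetical.

```python
from math import sqrt

def dickey_fuller_t(series):
    """OLS of diff(y)[t] = alpha + rho * y[t-1] + error.

    Returns (rho, t-statistic of rho). No trend term and no extra lags.
    """
    x = series[:-1]                                   # lagged level y[t-1]
    d = [b - a for a, b in zip(series, series[1:])]   # first difference
    n = len(x)
    x_bar, d_bar = sum(x) / n, sum(d) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxd = sum((xi - x_bar) * (di - d_bar) for xi, di in zip(x, d))
    rho = sxd / sxx
    alpha = d_bar - rho * x_bar
    rss = sum((di - alpha - rho * xi) ** 2 for xi, di in zip(x, d))
    se_rho = sqrt(rss / (n - 2) / sxx)                # standard error of rho
    return rho, rho / se_rho

# Invented examples: one series that keeps rising, one that snaps back each period.
trending = [10.0, 10.4, 11.1, 11.3, 12.0, 12.6, 12.9, 13.7]
reverting = [1.0, -0.8, 1.1, -0.9, 1.2, -1.0, 0.9, -1.1]

rho_trend, t_trend = dickey_fuller_t(trending)
rho_rev, t_rev = dickey_fuller_t(reverting)
print(f"trending:  rho={rho_trend:.3f}, t={t_trend:.2f}")
print(f"reverting: rho={rho_rev:.3f}, t={t_rev:.2f}")
```

A coefficient on the lagged level that is close to zero, with a small t-value, is the non-stationary case. Note that this t-value is judged against Dickey-Fuller critical values rather than the usual t tables.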
That regression is the differenced dependent variable on its first lag and the trend, and we're looking at this t-value to decide whether or not we have stationarity. So again, here this coefficient is not significantly different from 0, which means we do not have stationarity, and we need to difference this variable in order to get stationarity.
So here's the augmented Dickey-Fuller test. This is the variable, we have the alternative of stationarity, and here I have left out the number of lags, and you see that you can have up to five lags. Again, we need to look at the p-value to decide whether or not we have stationarity, but regardless of the number of lags, we do not have a stationary process here.
Now, if we're running the Dickey-Fuller test on the differenced variable, we can do this by using d.Y, with or without additional lags. You can see now that the test statistic is -6.83, and that is the p-value, and the alternative hypothesis is stationarity. So if you reject the null, you have stationarity, which is a good thing. With lag order equal to 5 we get the same result, that we have stationarity. So the conclusion from here is that we need to use the differenced variable in our ARIMA models.
OK, the next thing we can do is look at the ACF and PACF, the autocorrelation function and the partial autocorrelation function. Here you put the variable in acf, and you can just run it and take a look at it. This is the result: this is at lag 1, 2, 3 and so on up to 25. When you see a very, very slowly decaying function, this is an indication of non-stationarity. So again, our original variable is non-stationary. We're seeing this in many different ways: with the Dickey-Fuller test, from just looking at the plot, and now in the ACF as well.
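To make the slowly decaying ACF concrete, here is a small plain-Python sketch of the sample autocorrelation function. The series is a simulated random walk with drift, chosen as an assumption to mimic a steadily increasing variable; it is not the PPI data, and the helper name acf only mirrors the R function mentioned in the video.

```python
import random

def acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    c0 = sum(d * d for d in dev)   # lag-0 sum of squares
    return [sum(dev[i] * dev[i + k] for i in range(n - k)) / c0
            for k in range(max_lag + 1)]

# Simulated stand-in for a non-stationary series: random walk with drift.
random.seed(0)
y = [0.0]
for _ in range(199):
    y.append(y[-1] + 0.5 + random.gauss(0.0, 1.0))
d_y = [b - a for a, b in zip(y, y[1:])]   # first difference

acf_y = acf(y, 5)
acf_dy = acf(d_y, 5)
print("levels:     ", [round(r, 2) for r in acf_y])
print("differences:", [round(r, 2) for r in acf_dy])
```

For the levels, the autocorrelations stay high from one lag to the next, while for the differences they drop to near zero right away. In this simulation the differences are pure noise, so unlike the PPI differences they show no AR(1) pattern.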
If you look at the PACF for the original variable Y, we have a very strong first lag, and then the correlation at the other lags isn't significant.
OK, so now we can look at the ACF of the differenced variable, and you see a function that is tailing off rather rapidly, which is good, and the differenced variable is stationary. So we have tailing off on the ACF. If we run the PACF, we have a very strong first lag, and then the other lags are not significant: they're inside the confidence band indicated by the blue lines. So again, this is an indication of tailing off on the ACF and cutting off after the first lag on the PACF, and that's an indication of an AR(1) process for the differenced series.
So now let's go ahead and estimate a few ARIMA models. I'm going to estimate all kinds of combinations, and we're going to look at them and see which one we will select at the end. I will start with the ones for the original variable. The way to read the ARIMA order: the first number is p, the number of autoregressive terms; the next one is d, whether the variable is differenced or not; and the next one is q, the number of lags on the MA component. So here we only have the first one; that's basically an AR(1) process. After we estimate it, here's the coefficient of 0.99, which is very significant, and again that coefficient is very close to one, so this does not seem like a stationary process. You can estimate an ARIMA(2,0,0), which is the same as AR(2). This is the command for ARIMA: you put the variable, then order equals c with the three numbers, and they are in the order that we've seen for the ARIMA model: p, d and q. So for ARIMA(2,0,0), again, both the first and the second autoregressive terms are significant, but it gives you a warning.
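Why an AR(1) coefficient near one signals trouble can be seen with a least-squares fit of y[t] on y[t-1]. The sketch below is plain Python under stated assumptions: it uses ordinary least squares rather than the likelihood-based fit that R's arima performs, the two series are constructed by hand, and fit_ar1 is a hypothetical helper name.

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1] + error. Returns (c, phi)."""
    x, z = series[:-1], series[1:]          # lagged values and current values
    n = len(x)
    x_bar, z_bar = sum(x) / n, sum(z) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    phi = sxz / sxx
    return z_bar - phi * x_bar, phi

# A series that rises by one every period: behaves like the non-stationary case.
trending = [float(i) for i in range(1, 21)]

# A series built from y[t] = 1 + 0.5 * y[t-1]: it settles toward a fixed level.
reverting = [10.0]
for _ in range(10):
    reverting.append(1.0 + 0.5 * reverting[-1])

c_trend, phi_trend = fit_ar1(trending)
c_rev, phi_rev = fit_ar1(reverting)
print(f"trending:  c={c_trend:.3f}, phi={phi_trend:.3f}")
print(f"reverting: c={c_rev:.3f}, phi={phi_rev:.3f}")
```

The steadily rising series gives a coefficient of one, while the series that settles down gives a coefficient well below one.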
Next, you can estimate an MA(1) process, and again, look at this coefficient: it is close to 1. Or you can estimate an ARIMA(1,0,1), which is an ARMA(1,1); again, this coefficient here is very close to 1. So anyway, these are the first models that we estimated; I'm just showing you how to write the correct code.
Now, we decided that we will use ARIMA on the differenced variable, and this is what we're doing here. ARIMA(1,1,0) would have one autoregressive term on the differenced variable. Look at how this changed: instead of using Y I'm using d.Y, because that second number is the difference, the first difference. And here I'm actually using (1,0,0). Why? Because I already took the difference in d.Y, so I don't want to put the difference in the order as well. The results are here, and look, we have a coefficient that's not close to 1 anymore, and it is also significant. So this is actually a good contender for a model. You can look at the AIC; we're going to be comparing these to decide on the model with a good fit.
The next one we can do is ARIMA(0,1,1). Here we have an MA component, but again, looking at the ACF and PACF, we determined that we need to have an autoregressive term in here. So on this one I'm putting both an AR(1) and an MA(1) on the differenced series, and here we have both coefficients, on the AR and the MA. Then you could put three lags on the MA term, and these are the coefficients on the lags. Let me just compare it to the handout that I gave you for the example. Okay, so they're just a little bit different from what I report in the handout, which comes from Stata, but they're very close in magnitude.
Okay, and here's the last one, the ARIMA(2,1,3), and this is also a model that I like a lot. I have two autoregressive terms.
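The AIC comparison mentioned above can be sketched for two simple candidates on a differenced series: a constant-only model and an AR(1). This is an illustration under assumptions, not the video's R output: the series is simulated, the fit is least squares, the likelihood is a Gaussian approximation based on the residual variance, and the helper names are hypothetical. The numbers will therefore not match what R's arima reports.

```python
import random
from math import log, pi

def gaussian_aic(residuals, n_params):
    """AIC = 2k - 2 log L, with Gaussian errors and variance RSS / n."""
    n = len(residuals)
    sigma2 = sum(e * e for e in residuals) / n
    loglik = -0.5 * n * (log(2 * pi * sigma2) + 1)
    return 2 * n_params - 2 * loglik

def ar1_residuals(series):
    """Residuals from a least-squares fit of y[t] = c + phi * y[t-1]."""
    x, z = series[:-1], series[1:]
    n = len(x)
    x_bar, z_bar = sum(x) / n, sum(z) / n
    phi = (sum((a - x_bar) * (b - z_bar) for a, b in zip(x, z))
           / sum((a - x_bar) ** 2 for a in x))
    c = z_bar - phi * x_bar
    return [b - c - phi * a for a, b in zip(x, z)]

# Simulated stand-in for a differenced series with AR(1) structure.
random.seed(1)
d_y = [0.5]
for _ in range(199):
    d_y.append(0.2 + 0.6 * d_y[-1] + random.gauss(0.0, 1.0))

# Both candidates are scored on the same observations (from the second one on).
target = d_y[1:]
target_mean = sum(target) / len(target)
aic_mean = gaussian_aic([v - target_mean for v in target], n_params=2)  # mean, variance
aic_ar1 = gaussian_aic(ar1_residuals(d_y), n_params=3)                  # c, phi, variance

print(f"AIC constant only: {aic_mean:.1f}")
print(f"AIC AR(1):         {aic_ar1:.1f}")
```

The lower AIC is preferred. The penalty of 2 per parameter is what makes a model with more terms have to earn its better fit.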
Both autoregressive terms are significant, and it has three MA terms: this one and that one are significant, the one in the middle is not. I'm looking at the AIC on this one as a criterion. So I really like this model here, and the ARIMA with one AR and one MA term, which is this model here. As you can see, they have similar AIC. This one is a little bit lower, but it does have more terms. This is significant, and that one is significant, so it's a coin toss which one you want to use, but you can see how selecting the model is more of an art than a science. Based on what I see here, I would probably go with the ARIMA with one AR and one MA term, or you can go with this one, based on everything we learned so far.
The last thing we can do here is forecast the variable for future values. You can estimate an ARIMA model with one AR and one MA term; this one is actually for the original variable. You can get the predicted values, and then you can plot the original variable and the predicted values, and then you put a confidence interval of minus 2 and plus 2 standard deviations around the prediction. I'm going to run this code, and this is what we see: this was the variable that we had before, this is the forecast, and this is the confidence interval. So I think it's a pretty good prediction.
Now we will forecast the differenced variable. See, now we are estimating an ARIMA on the differenced variable, and we will be predicting ahead with the same syntax. I'm going to run this code, and this is what we see: this was the variable that we had, this is the prediction, and see, it has a much wider confidence interval around it.
So that was what I had on time series ARIMA models with R. Thank you for watching.
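As a footnote to the forecasting step, the plus and minus 2 band around a prediction can be sketched for the simplest case, an AR(1) with known parameters. This is a plain-Python illustration under assumptions: the parameter values are invented, the model has no MA term (unlike the model forecast in the video), and forecast_ar1 is a hypothetical helper name.

```python
from math import sqrt

def forecast_ar1(last_value, c, phi, sigma, steps):
    """Forecast y[t] = c + phi * y[t-1] + error for the next `steps` periods.

    Returns a list of (point, lower, upper) with a band of 2 standard errors.
    """
    out = []
    point, var = last_value, 0.0
    for h in range(1, steps + 1):
        point = c + phi * point                    # iterate the AR(1) recursion
        var += sigma ** 2 * phi ** (2 * (h - 1))   # forecast-error variance at horizon h
        half = 2 * sqrt(var)
        out.append((point, point - half, point + half))
    return out

# Invented parameters: the series starts at its long-run level c / (1 - phi) = 2.
bands = forecast_ar1(last_value=2.0, c=1.0, phi=0.5, sigma=1.0, steps=3)
for h, (point, lo, hi) in enumerate(bands, start=1):
    print(f"h={h}: {point:.3f} [{lo:.3f}, {hi:.3f}]")
```

The band widens with the horizon because each additional step adds forecast-error variance, and the closer phi is to one, the more it keeps widening.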