Transcript:
In the last few videos we started with some weird distribution. It doesn’t have to be weird. It could be a nice normal distribution. But to make the point that you don’t need a normal distribution, I like to use weird ones. Let’s say you have some kind of weird distribution that looks something like this. It could look like anything. As we’ve seen many times, you take samples from this strange distribution. Let’s say you take samples of n equal to 10. We take 10 separate values of this random variable, we find their arithmetic mean, and we plot it. That’s one instance. We keep doing this. We take 10 more values of this random variable, we find the arithmetic mean, and we plot it again. You do this many times, in theory infinitely many times, and you begin to approach the sampling distribution of the sample mean. For n equal to 10, this will not be a perfectly normal distribution, but it will be close. It would be perfect only if n were infinite. But let’s say that eventually, over all our samples, we get a lot of arithmetic means that land there. They pile up here, and here. And eventually we start approaching something that looks like this. From the last video we saw that, first, if we did this one more time, this time with n equal to 20, the distribution we get will be more normal. Perhaps in future videos we will delve deeper into things like kurtosis and skew. But this will be more normal. And even more important here, or I suppose even more obvious to us from the experiment, it will have a lower standard deviation. They will all have the same mean. Let’s say the mean here is 5. Then the mean here will also be 5. The mean of the sampling distribution of the sample mean will be 5. It doesn’t matter what our n is. If our n is 20, it will still be 5. But our standard deviation will be smaller in these scenarios. We saw this through experimentation.
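The experiment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the video does not say which distribution the simulation app uses, so an exponential distribution with mean 5 stands in for the “weird” one, and the helper names `draw_skewed` and `sample_means` are made up for this sketch.

```python
import random
import statistics

def draw_skewed(rng):
    """One value from a deliberately non-normal, right-skewed distribution.

    Assumption: an exponential distribution with mean 5 (and standard
    deviation 5) stands in for the video's unspecified distribution.
    """
    return rng.expovariate(1 / 5)

def sample_means(n, trials, rng):
    """Take `trials` samples of size n and return the mean of each sample."""
    return [statistics.fmean(draw_skewed(rng) for _ in range(n))
            for _ in range(trials)]

rng = random.Random(0)  # fixed seed so the run is repeatable
results = {}
for n in (10, 20, 100):
    means = sample_means(n, trials=5000, rng=rng)
    # Store (mean of the sample means, standard deviation of the sample means).
    results[n] = (statistics.fmean(means), statistics.pstdev(means))
    print(f"n={n:3d}  mean of means={results[n][0]:.2f}  "
          f"sd of means={results[n][1]:.2f}")
```

If the sketch behaves as the video describes, the mean of the sample means should stay near 5 for every n, while their standard deviation should get smaller as n goes from 10 to 20 to 100.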
It might look like this. It will be more normal, but it will have a narrower standard deviation. And if we did this with a larger sample size, let me do this in a different color. If we do this with an even larger sample size, n equal to 100, then we get something that fits a normal distribution even better. We take 100 separate values of this random variable, we take the arithmetic mean, and we plot it. Another 100 values of this random variable, we take the arithmetic mean, we plot it. We keep doing this. If we keep doing this, we get something even more normal than either of these. It will be a much better fit to a true normal distribution but, even more obvious to the human eye, it will be even narrower. It will have a very low standard deviation. It will look like this. I’ll show you this in the simulation app, probably later in this video. So two things happen. As you increase the sample size for each arithmetic mean you take, two things happen: the distribution becomes more normal, and its standard deviation becomes smaller. So the question may arise: “Is there a formula?” If I know the standard deviation, this is the standard deviation of my original probability density function. This is the mean of my original probability density function. I know the standard deviation, and I know that n will change depending on how many values I take each time I calculate a sample mean. So I know the standard deviation or, perhaps, I know the variance. The variance is simply the standard deviation squared. If you don’t remember that, you might want to review those videos. But if I know the variance of my original distribution, and if I know my n, how many values I take every time I calculate an arithmetic mean to put one point in the sampling distribution of the sample mean, is there a way to predict what the mean of these distributions will be?
Or rather, the standard deviation of these distributions. So that you don’t get confused between this and that, let me say “variance”. If you know the variance, you can find the standard deviation, because one is just the square root of the other. This is the variance of our original distribution. To show that this is the variance of our sampling distribution of the sample mean, we will write it here. This is the variance of the sample mean. Remember, the true mean is this: the Greek letter mu (μ) is the true mean. This is equal to the mean. Whereas x with a bar over it (x̄) means the sample mean. So here we are saying that this is the variance of our sample means. Now, this is a true distribution. This is not an estimate. If we magically knew the distribution, there is some true variance here. And of course, this has a mean too. This here, if we want our notation to be right, is the mean of the sampling distribution of the sample mean. This is the mean of our means. It just happens to be the same thing. This is the mean of our sample means. It will be the same as this, especially if we run the trial over and over again. But the point of this video is whether there is a way to find this variance, given the variance of the original distribution and n. It turns out that there is. I will not show the proof here. I want to give you the intuition. I think you already have the sense that on any one trial, if you take a sample of 100, you are much more likely, when you take the arithmetic mean, to get close to the true mean than if you take n of 2 or n of 5. You are much less likely to be far from it with a sample of 100 than with a sample of five. So I think you can see that it must, somehow, be inversely proportional to n. The larger n is, the smaller the standard deviation. It turns out that this is about as simple as it could be. This is one of those magical things in math.
One day I will prove it to you. I want to give you working knowledge first. In statistics, it is always hard for me to decide whether I should be more formal and give you rigorous proofs, but I have come to the conclusion that in statistics it is more important to get the working knowledge first and then, later, once you understand all this, we can get into the really deep math and prove it to you. For now I think experimental evidence is all you need, using these simulations to show you that this is really true. It turns out that the variance of the sampling distribution of the sample mean is equal to the variance of the original distribution, this one here, divided by n. That is all. Let’s say that this one up here has a variance equal to 20. I just made up that number. And let’s say n is 20. Then for the variance of the sampling distribution of the sample mean for n equal to 20, you take the variance up here, which is 20, and divide it by n, which is 20. So here the variance will be 20 divided by 20, which is equal to 1. This is the variance of the original probability distribution, and this is your n. What will the standard deviation be? What is the square root of this? The standard deviation will be the square root of 1, which is also 1. We could write that down as well. We can take the square root of both sides of this and say that the standard deviation of the sampling distribution of the sample mean, which is often called the standard deviation of the mean, is also called, and I will write this down, the standard error of the mean. All of these names I just mentioned mean the standard deviation of the sampling distribution of the sample mean. That’s why it’s confusing: you use the words “mean” and “sample” over and over. And if that confuses you, let me know. I’ll make another video or pause and repeat, or something.
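As a small sketch of the rule just stated, here is the same arithmetic in Python, using the made-up variance of 20 and n of 20 from the example. The function names are hypothetical and chosen only for this illustration.

```python
import math

def variance_of_sample_mean(population_variance, n):
    """Variance of the sampling distribution of the sample mean: sigma^2 / n."""
    return population_variance / n

def standard_error(population_variance, n):
    """Standard error of the mean: the square root of sigma^2 / n."""
    return math.sqrt(variance_of_sample_mean(population_variance, n))

var_mean = variance_of_sample_mean(20, 20)  # 20 divided by 20
se = standard_error(20, 20)                 # square root of the line above
print(var_mean, se)  # prints: 1.0 1.0
```

Keeping the variance function as the starting point mirrors the order of the explanation: divide the variance by n first, then take a square root only when the standard deviation is wanted.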
But if we just take the square root of both sides, the standard error of the mean, or the standard deviation of the sampling distribution of the sample mean, is equal to the standard deviation of the original probability density function, which can be quite non-normal, divided by the square root of n. I just took the square root of both sides of this equation. Personally, I prefer to remember this one, that the variance is inversely proportional to n, and then work back to that, because it’s easier. You just take the variance divided by n. If I want the standard deviation, I just take the square root of both sides and I get this formula. So here, when n is 20, the standard deviation of the sampling distribution of the sample mean is 1. Here, when n is 100, the variance of the sampling distribution of the sample mean, or the variance of the mean, will be equal to 20, the variance of that, divided by n. With n equal to 100, this is equal to one fifth. The standard deviation of this, or the standard deviation of the sampling distribution of the sample mean, or the standard error of the mean, will be the square root of that. That is, 1 over the square root of 5. So this one will have a standard deviation a little under one-half, while this one here has a standard deviation of 1. You can see it is definitely smaller. I know what you’re saying now. “Sal, you just gave me a formula. I don’t have to trust you.” Let’s see if we can convince ourselves using the simulation. Just for fun, I’ll mess with the distribution a bit. This is my new distribution. Let me pick values of n that are easy to take the square root of, because we are looking at standard deviations. Let’s say we take n of 16 and n of 25. Let’s do 10,000 trials. In this case, on each trial we will take 16 samples from here, take the arithmetic mean, and add it to a frequency plot.
Here we will do 25 at a time and then take the arithmetic mean. Just to remind ourselves, I’ll run it animated once. I take 16 samples, I plot the mean here. I take 16 samples as described by this probability density function, or in this one 25, and I plot them here. What would I get if I did this 10,000 times? Okay. Here, just visually, you can tell that where n was larger, the standard deviation is smaller. It’s more squeezed together. But let me write this down. Let’s see if I can remember it. Here n is 16. With this random distribution I made, the standard deviation was 9.3. I will remember these. The standard deviation for the original was 9.3. The standard deviation here was 2.33 and the standard deviation here is 1.87. Let’s see if this fits our formula. I’ll take this off the screen for a moment and go back and do some calculations. It’s on my other screen, so I can remember those numbers. In the experiment we did, my crazy distribution had a standard deviation of 9.3. When n was 16, we ran the experiment, did a bunch of trials, took the arithmetic means and so on, and we got the standard deviation of the sampling distribution of the sample mean, or the standard error of the mean. We determined experimentally that this is 2.33. Then, when n was equal to 25, we found that the standard error of the mean is equal to 1.87. Let’s see if this fits our formulas. We know that the variance of the mean, or the variance of the sampling distribution of the sample mean, is equal to the variance of our original distribution divided by n. We take the square root of both sides. Then we get that the standard error of the mean is equal to the standard deviation of the original distribution divided by the square root of n. Let’s see if that works for these two cases. If I take 9.3, let me do this case first. 9.3 divided by the square root of n, where n is 16, so divided by the square root of 16, which is 4.
What do I get? 9.3 divided by 4. Let me get out a calculator. Let’s see. We want to divide 9.3 by 4. 9.3 divided by the square root of n, where n was 16, so divided by 4, is equal to 2.325. That’s about 2.32, which is very, very close to 2.33. And that was after 10,000 trials. Maybe right after this I’ll see what happens if we do 20,000 or 30,000 trials in which we take samples of 16 and take the arithmetic mean. Now let’s look at this one. Here we take 9.3. Let me draw a little line here. Maybe I’ll scroll down. That might be better. We take the standard deviation of the original distribution. The formula that we derived here tells us that our standard error should be equal to the standard deviation of the original distribution, 9.3, divided by the square root of n, which is the square root of 25. The 4 before was just the square root of 16. So this is equal to 9.3 divided by 5. Let’s see if that’s 1.87. Let me get out my calculator again. If I compute 9.3 divided by 5, what do I get? 1.86, and that’s pretty close to 1.87. So in this case we have 1.86. As you can see, what we got experimentally, and that’s after 10,000 trials, was almost exactly what you would expect. Let’s do another 10,000. So that’s 10,000 more trials, and we are still in the same ballpark. Maybe I can’t hope to get the exact number, rounded or whatever. But as you can see, I hope this is enough to satisfy you that the variance of the sampling distribution of the sample mean will be equal to the variance of the original distribution, no matter how crazy your distribution is, divided by the sample size, the number of values you take for each group whose arithmetic mean you compute. I guess that is the best way to think about it. Sometimes this can be confusing, because you are taking samples of arithmetic means that are themselves based on samples.
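The calculator check above can be repeated in Python. The three numbers (9.3, 2.33 and 1.87) are the ones read off the simulation in the video; the variable names are just labels chosen for this sketch.

```python
import math

population_sd = 9.3              # standard deviation of the original distribution
observed = {16: 2.33, 25: 1.87}  # standard errors read off the simulation

# Predicted standard error of the mean: sigma divided by the square root of n.
predicted = {n: population_sd / math.sqrt(n) for n in observed}

for n, seen in observed.items():
    gap = abs(predicted[n] - seen)
    print(f"n={n}: predicted {predicted[n]:.3f}, observed {seen:.2f}, gap {gap:.3f}")
```

A small gap between the predicted and observed values is what you would expect from a finite number of trials, so the comparison supports the formula without proving it.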
When someone says sample size, you wonder: “Is the sample size the number of times I took an arithmetic mean, or the number of values I’m averaging each time?” It doesn’t hurt to clarify that. Usually, when people talk about sample size, they are talking about n. And, at least in my mind, when I think of the trials, you take a sample of 16 and take the arithmetic mean, and that’s one trial. Then you plot it. Then you do it again, and that’s another trial. Then you do it again and again. I hope this helps clarify things. And now you also know how to get the standard error of the mean.
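To make the difference between the sample size n and the number of trials concrete, here is a rough Python sketch. It rests on an assumption: a uniform distribution on 0 to 32, whose standard deviation is about 9.24, stands in for the video’s distribution, and `sd_of_sample_means` is a name invented for this illustration.

```python
import random
import statistics

def sd_of_sample_means(n, trials, rng):
    """Standard deviation of `trials` sample means, each from a sample of size n.

    Assumption: values are drawn uniformly from 0 to 32 as a stand-in
    for the video's unspecified distribution.
    """
    means = [statistics.fmean(rng.uniform(0, 32) for _ in range(n))
             for _ in range(trials)]
    return statistics.pstdev(means)

rng = random.Random(1)  # fixed seed so the run is repeatable

few_trials = sd_of_sample_means(n=16, trials=2000, rng=rng)
many_trials = sd_of_sample_means(n=16, trials=20000, rng=rng)
bigger_n = sd_of_sample_means(n=25, trials=20000, rng=rng)

print(f"n=16,  2,000 trials: {few_trials:.2f}")
print(f"n=16, 20,000 trials: {many_trials:.2f}")
print(f"n=25, 20,000 trials: {bigger_n:.2f}")
```

Running more trials should only make the estimate of the standard error steadier. It is increasing n, the number of values averaged in each trial, that should make the standard error itself smaller.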