Experiment 1: With varying numbers of samples
Descriptive Statistics | First 100 | First 1K | First 10K | First 100K |
Mean | 0.02000071 | 0.020000066 | 0.020000004 | 0.02 |
Standard Error | 2.12714E-06 | 7.53406E-07 | 2.51164E-07 | 9.69855E-08 |
Median | 0.020005 | 0.020004 | 0.020004 | 0.020004 |
Mode | 0.020005 | 0.020005 | 0.020005 | 0.020005 |
Standard Deviation | 2.12714E-05 | 2.38248E-05 | 2.51164E-05 | 3.06695E-05 |
Sample Variance | 4.52471E-10 | 5.67621E-10 | 6.30831E-10 | 9.40618E-10 |
Kurtosis | 28.87137928 | 21.46428225 | 19.07376827 | 12.23083198 |
Skewness | -5.453831468 | -4.509853108 | -3.831289593 | -2.003065575 |
Range | 0.000135 | 0.000252 | 0.000277 | 0.000374 |
Minimum | 0.01988 | 0.019872 | 0.019868 | 0.019815 |
Maximum | 0.020015 | 0.020124 | 0.020145 | 0.020189 |
Sum | 2.000071 | 20.000066 | 200.000044 | 1999.999951 |
Count | 100 | 1000 | 10000 | 100000 |
Confidence Level(95.0%) | 4.2207E-06 | 1.47844E-06 | 4.92331E-07 | 1.9009E-07 |
Transcript
Now, we can ask, "How does this vary with the number of samples?" So, if I were to take just the first hundred values that I measure, what's the mean value? We can look at this table, and as we go from the first hundred to a thousand to ten thousand to a hundred thousand, we see it doesn't affect the mean very much. The standard error goes down quite a bit as we increase our number of samples, but notice the median is almost unchanged. That means that if we're simply looking for the median in this set, we only needed a hundred samples to get it rather accurately; taking a hundred thousand samples didn't change it by much. And the mode is unchanged.

The standard deviation, by contrast, does not shrink as we add samples: in this table it grows from about 2.1E-05 to about 3.1E-05, and the variance roughly doubles, although both stay very small compared with the mean. We can look at the kurtosis and skewness, the range, etc.

Our 95% confidence level, however, does change as we increase the number of samples, exactly as you would expect: it shrinks from about 4.2E-06 to about 1.9E-07, so the band in which we expect the true mean to lie gets narrower. But is that really giving us the data we want?
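To make the comparison in the table above concrete, here is a minimal sketch of how the same statistics could be computed for the first N samples of a measurement series. The names `summarize_prefix` and `standard_error` are hypothetical helpers introduced for illustration, and the `samples` list is synthetic stand-in data, not the measurements from the experiment. The last part only checks that the table's Standard Error row equals the Standard Deviation row divided by the square root of the Count.

```python
import math
import random
import statistics


def standard_error(stdev, n):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return stdev / math.sqrt(n)


def summarize_prefix(samples, n):
    """Descriptive statistics for the first n values of `samples` (hypothetical helper)."""
    prefix = samples[:n]
    stdev = statistics.stdev(prefix)  # sample standard deviation (n - 1 denominator)
    return {
        "count": len(prefix),
        "mean": statistics.fmean(prefix),
        "median": statistics.median(prefix),
        "stdev": stdev,
        "variance": statistics.variance(prefix),
        "std_error": standard_error(stdev, len(prefix)),
        "range": max(prefix) - min(prefix),
    }


# Synthetic stand-in data (an assumption): NOT the measured values from the experiment.
rng = random.Random(0)
samples = [rng.gauss(0.02, 2.5e-05) for _ in range(100_000)]

for n in (100, 1_000, 10_000, 100_000):
    s = summarize_prefix(samples, n)
    print(f"first {n:>6}: mean={s['mean']:.9f} median={s['median']:.6f} "
          f"stdev={s['stdev']:.3e} std_error={s['std_error']:.3e}")

# Values copied from the table: Count -> (Standard Deviation, Standard Error).
table = {
    100: (2.12714e-05, 2.12714e-06),
    1_000: (2.38248e-05, 7.53406e-07),
    10_000: (2.51164e-05, 2.51164e-07),
    100_000: (3.06695e-05, 9.69855e-08),
}
table_consistent = all(
    math.isclose(standard_error(sd, n), se, rel_tol=1e-5)
    for n, (sd, se) in table.items()
)
print("table standard errors match stdev / sqrt(n):", table_consistent)
```

With synthetic Gaussian data the standard deviation stays roughly constant across prefixes while the standard error falls with the square root of the sample count, which is the pattern to compare against the measured table.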