Simulations and Demonstrations for
Introduction to the Practice of Statistics
Chapter 5

NOTE: This page has been delinked.  It is no longer being maintained, and information on this page may be out of date.

The following example uses the heads program.  If you don't have the heads program, you can download it from within Stata by typing findit heads (see How can I use the findit command to search for programs and get additional help? for more information about using findit).
The heads program can be used to produce results simulating those shown in Figure 5.1.  The example in the book examines the number of bad switches, so getting a "head" with the heads program corresponds to getting a bad switch. The example draws 10 switches (coins) at a time and does this for 1000 trials. The data for this can be generated with the heads command below.
heads , save 
Then click on quit. A graph like Figure 5.1 can be produced with the histogram command below.
histogram heads, discrete xlabel(0(1)6)
The graph we got is shown below; it looks much like, but not exactly like, Figure 5.1. You can vary the number of trials, and you will find that as you increase the number of trials beyond 1000, the graph looks more and more like Figure 5.1.
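The same experiment can be imitated outside Stata for readers who do not have the heads program. The following is a minimal Python sketch, not the heads program itself: the function names, the seed, and the probability of 0.1 for a bad switch are assumptions made for illustration, while the 10 switches and 1000 trials come from the example above. It tallies the number of bad switches per trial and reports how far the simulated proportions are from the exact binomial probabilities.

```python
import math
import random
from collections import Counter


def simulate_bad_switches(n_switches=10, p_bad=0.1, n_trials=1000, seed=1):
    """Return a Counter mapping 'number of bad switches' -> number of trials."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    counts = Counter()
    for _ in range(n_trials):
        # Each switch is bad with probability p_bad (a "head" in heads terms).
        bad = sum(rng.random() < p_bad for _ in range(n_switches))
        counts[bad] += 1
    return counts


def binomial_pmf(k, n, p):
    """Exact binomial probability of k bad switches out of n."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def max_gap(counts, n_switches=10, p_bad=0.1):
    """Largest absolute gap between simulated proportions and exact probabilities."""
    n_trials = sum(counts.values())
    return max(
        abs(counts[k] / n_trials - binomial_pmf(k, n_switches, p_bad))
        for k in range(n_switches + 1)
    )


if __name__ == "__main__":
    # Compare a small and a large number of trials.
    for n_trials in (1000, 100000):
        counts = simulate_bad_switches(n_trials=n_trials)
        print(n_trials, "trials, largest gap:", round(max_gap(counts), 4))

    # Rough text histogram of the 1000-trial run (one star per 10 trials).
    counts = simulate_bad_switches()
    for k in sorted(counts):
        print(f"{k:2d} | {'*' * (counts[k] // 10)} {counts[k]}")
```

With more trials the largest gap typically shrinks, which is the sense in which the simulated graph settles down as the number of trials grows.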

The heads program can be used to produce results like those in Figure 5.4. The only difference from Figure 5.1 above is that 100 switches are drawn at a time (the equivalent of tossing 100 coins at a time).  This is illustrated below.
heads , save

tab heads

# heads out |
     of 100 |
     tossed |      Freq.     Percent        Cum.
------------+-----------------------------------
          2 |          1        0.10        0.10
          3 |          5        0.50        0.60
          4 |         18        1.80        2.40
          5 |         42        4.20        6.60
          6 |         60        6.00       12.60
          7 |         88        8.80       21.40
          8 |        123       12.30       33.70
          9 |        131       13.10       46.80
         10 |        120       12.00       58.80
         11 |        119       11.90       70.70
         12 |         86        8.60       79.30
         13 |         74        7.40       86.70
         14 |         50        5.00       91.70
         15 |         39        3.90       95.60
         16 |         23        2.30       97.90
         17 |          5        0.50       98.40
         18 |         11        1.10       99.50
         19 |          1        0.10       99.60
         20 |          1        0.10       99.70
         21 |          3        0.30      100.00
------------+-----------------------------------
      Total |      1,000      100.00
Below we show a graph of this.
histogram heads, discrete xlabel(2(1)21)

Figure 5.5 illustrates the area under the curve using the normal approximation to the binomial. There is an excellent demonstration of this at the Rice Virtual Lab in Statistics at http://www.ruf.rice.edu/~lane/stat_sim/normal_approx/index.html . If you choose an N of 100, a P of .1, and show the probability from 0 to 9, you get the results shown at the bottom of page 386 corresponding to figure 5.10.  The exact probability is .45, versus .43 from the normal approximation.  You can vary N and see that as N decreases, the discrepancy between these two results increases, and as N increases, the discrepancy decreases.  In other words, the accuracy of the "normal approximation" improves as N gets larger.
[Image: screenshot of the Rice Virtual Lab normal approximation demonstration]
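The .45 and .43 quoted above can be checked directly. The following Python sketch computes the exact binomial probability of 0 to 9 successes with N = 100 and P = .1 and compares it with a normal curve that has the same mean and standard deviation. The continuity correction (evaluating the normal curve at 9.5 rather than 9) is an assumption about how the demonstration computes its area; it is used here because it reproduces the .43. The function names are illustrative only.

```python
import math
from statistics import NormalDist


def binom_cdf(k, n, p):
    """Exact P(X <= k) for a binomial(n, p) count."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_approx_cdf(k, n, p):
    """Normal approximation to P(X <= k), with a continuity correction (assumed)."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return NormalDist(mean, sd).cdf(k + 0.5)


def max_discrepancy(n, p):
    """Largest |exact - approximate| over all cutoffs k = 0..n.

    Uses a running sum for the exact probability; adequate for n up to
    about 1000, beyond which the integer-to-float conversion can overflow.
    """
    normal = NormalDist(n * p, math.sqrt(n * p * (1 - p)))
    running, worst = 0.0, 0.0
    for k in range(n + 1):
        running += math.comb(n, k) * p**k * (1 - p) ** (n - k)
        worst = max(worst, abs(running - normal.cdf(k + 0.5)))
    return worst


if __name__ == "__main__":
    print("exact :", round(binom_cdf(9, 100, 0.1), 4))
    print("normal:", round(normal_approx_cdf(9, 100, 0.1), 4))
    for n in (10, 100, 1000):
        print("N =", n, "largest discrepancy:", round(max_discrepancy(n, 0.1), 4))
```

The last loop uses the largest discrepancy over all cutoffs as one possible summary of accuracy; it shrinks as N grows, consistent with what the demonstration shows.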

The following example uses the clt program.  If you don't have the clt program, you can download it from within Stata by typing findit clt (see How can I use the findit command to search for programs and get additional help? for more information about using findit).

Figure 5.5 illustrates how the distribution of sample means becomes more and more normal as the sample size increases.  The first figure (5.5a) appears like an exponential distribution with sample size of 1, and the following figures have a sample size of 2, 10 and 25.  We can use the clt program (central limit theorem) to illustrate this.  The examples below draw 1000 sample means from an exponential distribution with sample sizes of 1, 2, 10 and 25.

clt 
Sample size of 1
Sample size of 2
Sample size of 10
Sample size of 25
In addition to the examples above, you can try any sample size you like using the N per sample pulldown. Likewise, you can try other distributions, including a log distribution, a normal bimodal distribution, or a uniform distribution.
Below we show one more example, in which we used a log normal distribution with a sample size of 100, drew 5000 sample means, and added a normal overlay so we can compare the results to a normal distribution.
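The exponential runs above can also be imitated outside Stata. The following is a minimal Python sketch, not the clt program itself: the function names, the seed, and the exponential rate of 1 are assumptions made for illustration, while the 1000 sample means and the sample sizes of 1, 2, 10 and 25 come from the example. It summarizes each set of sample means by its skewness, which is 0 for a normal distribution and equals 2 divided by the square root of the sample size for means of exponential draws.

```python
import random
import statistics


def sample_means(sample_size, n_means=1000, seed=1):
    """Draw n_means sample means, each from sample_size exponential(1) values."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(n_means)
    ]


def skewness(values):
    """Population skewness: mean cubed z-score (0 for a symmetric distribution)."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean(((v - m) / sd) ** 3 for v in values)


if __name__ == "__main__":
    for n in (1, 2, 10, 25):
        means = sample_means(n)
        print(
            f"sample size {n:2d}: "
            f"mean={statistics.fmean(means):.3f} "
            f"sd={statistics.pstdev(means):.3f} "
            f"skewness={skewness(means):.2f}"
        )
```

The skewness moving toward 0 as the sample size grows is a numerical counterpart to the histograms looking more and more normal.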
You can also experiment with producing figures like Figure 5.9 using the Rice Virtual Lab in Statistics demonstration of the Central Limit Theorem at http://onlinestatbook.com/stat_sim/sampling_dist/index.html
Below we started with a parent population that was skewed, and chose to see the distribution of the mean with N=10 and N=20, and a normal overlay.  You can see that as the N went from 10 to 20, the distribution became more normal in shape.  This demonstration allows you to choose other parent populations, and even allows you to make a custom population by clicking the mouse on the parent distribution to alter the shape of the distribution.
[Image: screenshot of the Rice Virtual Lab sampling distribution demonstration with N=10 and N=20]

The content of this web site should not be construed as an endorsement of any particular web site, book, or software product by the University of California.