Stata FAQ
How do I use the Stata survey (svy) commands?

The examples below use Stata 9.  If you are using Stata versions 7 or 8, please see this page.

Here is a tiny example showing how to use the survey commands in Stata 9.  Consider the data file we call svysmall shown below.

use http://www.ats.ucla.edu/stat/stata/faq/svysmall, clear

list 

house    eth     wt      y     x1     x2     x3 
    1      1     .4      3      4      5      3  
    1      1     .9      9      4      5      6  
    2      1    1.2      9      8      7      3  
    2      1      1      8      7      4      2  
    2      1    1.1      8      7      6      3  
    3      2     .8      8      7      3      2  
    4      2     .4      8      2      0      3  
    4      2     .7      8      2      5      3  
In this tiny example, house identifies the household, eth is the ethnicity, and wt is the sampling weight for the person.  You use the svyset command to tell Stata about these design features, and Stata remembers them.  If you save the data file, the settings are saved with it, so you do not need to enter them again the next time you use the file.  Below, we tell Stata that the PSU (primary sampling unit) is the household (house), that the sample was stratified (strata) by ethnicity (eth), and that the probability weight (pweight) is the variable wt.

The syntax of the svyset command differs among Stata versions 7, 8, and 9.  If you are not using Stata 9, the syntax below will not work; please see this page for examples.  In the Stata 9 example below, notice that the PSU variable is given before the pweight, which is given in square brackets.
svyset house [pweight = wt], strata(eth)
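Before running any estimation command, it can help to confirm what Stata has stored. The lines below are a minimal sketch, assuming the svysmall data are in memory and the svyset command above has already been run. The command name svydes is the Stata 9 spelling; later releases document it as svydescribe, so treat the exact name as version dependent.

```stata
* Display the current survey settings (pweight, strata, and PSU variables)
svyset

* Tabulate the strata and the number of PSUs and observations in each
svydes
```

With these data you would expect two strata (eth) containing two households each.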
Once Stata knows about the survey design via the svyset command, you can use the svy: prefix with syntax that is very similar to the non-survey versions of the commands.  For example, the svy: regress command below looks just like a regular regress command, but it takes the survey design information you have provided into account in its computations.
svy: regress y x1 x2 x3

The output is below.  Its header reports the number of strata and PSUs, along with the number of observations and the estimated population size, so you can confirm that the design was set as intended.

Survey: Linear regression

Number of strata   =         2                  Number of obs      =         8
Number of PSUs     =         4                  Population size    = 6.5000001
                                                Design df          =         2
                                                F(   2,      1)    =         .
                                                Prob > F           =         .
                                                R-squared          =    0.2216

------------------------------------------------------------------------------
             |             Linearized
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   .3321757    .294268     1.13   0.376    -.9339573    1.598309
          x2 |   -.138397   .2335074    -0.59   0.613    -1.143098    .8663043
          x3 |   .5504173   .3170068     1.74   0.225    -.8135527    1.914387
       _cons |   5.050307   2.040247     2.48   0.132    -3.728167    13.82878
------------------------------------------------------------------------------
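Two figures in the output header can be checked by hand. This is a rough sketch, not part of the original example. The population size is the sum of the pweights, and the design degrees of freedom are the number of PSUs minus the number of strata. It assumes the svysmall data are still in memory.

```stata
* Sum the probability weights; summarize leaves the total in r(sum)
quietly summarize wt
display r(sum)

* Design df = number of PSUs minus number of strata (4 households, 2 strata)
display 4 - 2
```

The first value should agree with the Population size line (about 6.5) and the second with the Design df line.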
