Stata FAQ
How do I use the Stata survey (svy) commands?

The examples below use Stata 7 or 8.  Stata 9 and later use a different syntax (the svy: prefix), which is covered on a separate page.

Here is a tiny example showing how to use the survey commands in Stata.  Consider the data file we call svysmall shown below.

use http://www.ats.ucla.edu/stat/stata/faq/svysmall, clear

list 

house    eth     wt      y     x1     x2     x3 
    1      1     .4      3      4      5      3  
    1      1     .9      9      4      5      6  
    2      1    1.2      9      8      7      3  
    2      1      1      8      7      4      2  
    2      1    1.1      8      7      6      3  
    3      2     .8      8      7      3      2  
    4      2     .4      8      2      0      3  
    4      2     .7      8      2      5      3  
In this tiny example, house identifies the household, eth is ethnicity, and wt is the person-level sampling weight.  The svyset command tells Stata about these design features, and Stata remembers them.  If you save the data file, the settings are saved with it, so you do not need to enter them again the next time you use the file.  Below, we tell Stata that the primary sampling unit (psu) is the household (house), that the sample was stratified (strata) by ethnicity (eth), and that the probability weight (pweight) variable is wt.

Note that in Stata versions 6 and 7, each design feature is set with its own svyset command.  Starting with Stata 8, a single svyset command sets all of them at once.  Both forms are shown below.
* Stata 7 commands
svyset psu house
svyset strata eth
svyset pweight wt

* Stata 8 command
svyset [pweight=wt], psu(house) strata(eth)
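
Before running any analysis, it can help to confirm what Stata has stored.  The lines below are a minimal sketch, assuming the svysmall data are in memory and the survey settings have been declared as above.  Typing svyset with no arguments should redisplay the current settings, and svydes should summarize the strata and PSUs.

```stata
* Redisplay the stored survey settings (pweight, strata, psu)
svyset

* Describe the design: PSUs and observations within each stratum
svydes
```

Judging from the listing above, the description should show two strata (eth 1 and 2), each with two households as PSUs.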
Once Stata knows about the survey design via svyset, you can use the svy commands (svyreg, svymean, svytab, and so on), whose syntax is quite similar to the non-survey versions of the commands.  For example, the svyreg command below looks just like a regular reg command, but it takes the survey design information you have provided into account when doing the computations.
svyreg y x1 x2 x3
The output is below, and it tells you the pweight, strata, and psu variables so you can confirm the right variables have been chosen.
Survey linear regression

pweight:  wt                                      Number of obs    =         8
Strata:   eth                                     Number of strata =         2
PSU:      house                                   Number of PSUs   =         4
                                                  Population size  = 6.5000001
                                                  F(   1,      2)  =      0.35
                                                  Prob > F         =    0.6135
                                                  R-squared        =    0.2216

------------------------------------------------------------------------------
       y |      Coef.    Std. Err.       t     P>|t|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
      x1 |   .3321757     .294268      1.129   0.376     -.9339573    1.598309
      x2 |   -.138397    .2335074     -0.593   0.613     -1.143098    .8663043
      x3 |   .5504173    .3170068      1.736   0.225     -.8135527    1.914387
   _cons |   5.050307    2.040247      2.475   0.132     -3.728167    13.82878
------------------------------------------------------------------------------
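
Two figures in this output can be checked by hand.  The population size is the sum of the pweights, and the t statistics and confidence intervals use the design degrees of freedom, which is the number of PSUs minus the number of strata (4 - 2 = 2).  The lines below are a minimal sketch, assuming the same data are in memory and that your version of Stata provides the invttail() function.

```stata
* Sum of the sampling weights; should match "Population size"
quietly summarize wt
display r(sum)

* Lower 95% confidence limit for x1, using a t distribution with
* 2 degrees of freedom (number of PSUs minus number of strata)
display .3321757 - invttail(2, .025)*.294268
```

The weights in the listing add up to 6.5, so the trailing digits in 6.5000001 are most likely rounding from wt being stored as a float.  The second display should come out close to the lower limit shown for x1, about -.934.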
