### Stata FAQ

How do I use the Stata survey (svy) commands?

The examples below use Stata 9. If you are using Stata
version 7 or 8, please see this page.

Here is a tiny example showing how to use the survey
commands in Stata 9. Consider the data file we call **svysmall** shown below.

**use http://www.ats.ucla.edu/stat/stata/faq/svysmall, clear**

**list**

    house  eth   wt   y  x1  x2  x3
        1    1   .4   3   4   5   3
        1    1   .9   9   4   5   6
        2    1  1.2   9   8   7   3
        2    1    1   8   7   4   2
        2    1  1.1   8   7   6   3
        3    2   .8   8   7   3   2
        4    2   .4   8   2   0   3
        4    2   .7   8   2   5   3

In this tiny example, **house** is the household, **eth** is the ethnicity, and
**wt** is the sampling weight for the person. You can use the **svyset** command
to tell Stata about these design features, and Stata remembers them. If you save
the data file, the settings are saved with it, so you do not need to enter them
again the next time you **use** the data file. Below, we tell Stata that the
**psu** (primary sampling unit) is the household (**house**). Further, the
sampling scheme included stratified sampling (**strata**) based on ethnicity
(**eth**). Finally, the weighting variable (**pweight**) is called **wt**.
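Before declaring the design, it can help to confirm how the three design variables relate to one another. The following is only a sketch using standard Stata commands on the example file; it is not part of the original FAQ. In these data each household should fall under a single ethnicity, which is what allows **house** to serve as a PSU nested within the **eth** strata.

```stata
* Load the example data
use http://www.ats.ucla.edu/stat/stata/faq/svysmall, clear

* Cross-tabulate households (the PSUs) by ethnicity (the strata);
* each household row should have counts in only one eth column
tabulate house eth

* Summarize the sampling weights separately within each stratum
bysort eth: summarize wt
```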

The syntax of the **svyset** command differs among Stata versions 7, 8, and 9.
If you are not using Stata 9, the syntax below will not work; please see this
page for examples. In the Stata 9 command below, notice that the PSU variable
is given before the p-weight, which is given in square brackets.

**svyset house [pweight = wt], strata(eth)**
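As a rough illustration of the point that the settings travel with a saved data file, the sketch below declares the design, saves a copy, reloads it, and then types **svyset** with no arguments to display the stored settings. The filename `mysvysmall` is a hypothetical name chosen for this example, and the file is written to the current working directory.

```stata
* Load the example data and declare the survey design
use http://www.ats.ucla.edu/stat/stata/faq/svysmall, clear
svyset house [pweight = wt], strata(eth)

* Save a local copy (mysvysmall is a hypothetical filename)
save mysvysmall, replace

* Reload the copy; svyset with no arguments displays the stored settings
use mysvysmall, clear
svyset
```

If the settings were saved as described, the final **svyset** should report **wt** as the pweight, **eth** as the strata variable, and **house** as the sampling unit without your having to retype them.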

Once Stata knows about the survey design via the **svyset** command, you can
use the **svy:** prefix with syntax that is quite similar to the non-survey
versions of the commands. For example, the **svy: regress** command below looks
just like a regular **regress** command, but it uses the information you have
provided about the survey design and takes it into account in the computations.

**svy: regress y x1 x2 x3**

The output is below, and it tells you the **pweight**,
**strata**, and **psu** variables so you can confirm the right
variables have been chosen.

    Survey: Linear regression
    Number of strata   =         2                  Number of obs      =         8
    Number of PSUs     =         4                  Population size    = 6.5000001
                                                    Design df          =         2
                                                    F(   2,      1)    =         .
                                                    Prob > F           =         .
                                                    R-squared          =    0.2216
    ------------------------------------------------------------------------------
                 |             Linearized
               y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              x1 |   .3321757    .294268     1.13   0.376    -.9339573    1.598309
              x2 |   -.138397   .2335074    -0.59   0.613    -1.143098    .8663043
              x3 |   .5504173   .3170068     1.74   0.225    -.8135527    1.914387
           _cons |   5.050307   2.040247     2.48   0.132    -3.728167    13.82878
    ------------------------------------------------------------------------------
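The header of this output can be checked against the data. The sketch below is not from the original FAQ and rests on two assumptions: that the survey estimation leaves its design summaries in the stored results `e(N_strata)`, `e(N_psu)`, `e(N_pop)`, and `e(df_r)`, and that the design degrees of freedom equal the number of PSUs minus the number of strata (here 4 - 2 = 2). It also compares the reported population size with the sum of the weights, and ends with **svy: mean** as one more example of the **svy:** prefix on a different command.

```stata
* Load the data, declare the design, and fit the model without showing output
use http://www.ats.ucla.edu/stat/stata/faq/svysmall, clear
svyset house [pweight = wt], strata(eth)
quietly svy: regress y x1 x2 x3

* Design summaries kept in the stored estimation results
display "strata = " e(N_strata) "  PSUs = " e(N_psu) "  design df = " e(df_r)
display "PSUs - strata = " e(N_psu) - e(N_strata)

* The population size should match the sum of the sampling weights
quietly summarize wt
display "sum of weights = " r(sum) "  population size = " e(N_pop)

* The same prefix applies to other estimation commands, e.g. a weighted mean
svy: mean y
```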
