### R Paper Examples Using SAS Proc Mixed to Fit Multilevel Models, Hierarchical Models, and Individual Growth Models by Judith Singer

This paper can be downloaded from Professor Singer's web site at http://gseweb.harvard.edu/%7Efaculty/singer/Papers/sasprocmixed.pdf

Most of the examples use the nlme package which comes with R and can be made available by typing

> library(nlme)

First set of examples using the hsb12 data file.

To input the hsb12.csv data file in R, use the following code. Change d:/data/ to the where you stored the data files on your computer (and do use forward slashes to separate directory names)

> hsb12 <- read.csv("d:/data/hsb12.csv", header=T)

Page 329
In order to use the lme function the data object must already have the multilevel structure which is imposed by the groupedData function.

> hsb12.int <- groupedData( mathach ~ 1 | school, data=hsb12)

> pg329 <- lme(mathach ~ 1, data=hsb12.int, random = ~ 1 | school)
> summary(pg329)

Linear mixed-effects model fit by REML
Data: hsb12.int
AIC      BIC   logLik
47122.79 47143.43 -23558.4

Random effects:
Formula:  ~ 1 | school
(Intercept) Residual
StdDev:    2.934966 6.256862

Fixed effects: mathach ~ 1
Value Std.Error   DF  t-value p-value
(Intercept) 12.63697 0.2443936 7025 51.70747  <.0001

Standardized Within-Group Residuals:
Min        Q1        Med        Q3      Max
-3.063125 -0.753874 0.02670132 0.7606217 2.742626

Number of Observations: 7185
Number of Groups: 160 

Page 331  Table 2, random intercept, predictor = meanses

> hsb12.meanses <- groupedData(mathach ~ 1 | school, data=hsb12, outer= ~ meanses)
> pg331 <- lme(mathach ~ meanses, data=hsb12.meanses, random = ~ 1 | school)
> summary(pg331)

Linear mixed-effects model fit by REML
Data: hsb12.meanses
AIC     BIC    logLik
46969.28 46996.8 -23480.64

Random effects:
Formula:  ~ 1 | school
(Intercept) Residual
StdDev:     1.62441 6.257562

Fixed effects: mathach ~ meanses
Value Std.Error   DF  t-value p-value
(Intercept) 12.64944 0.1492801 7025 84.73622  <.0001
meanses  5.86354 0.3614580  158 16.22191  <.0001

Standardized Within-Group Residuals:
Min         Q1        Med        Q3      Max
-3.134798 -0.7525634 0.02408657 0.7677326 2.785014

Number of Observations: 7185
Number of Groups: 160 

Page 335

# compute cses
> hsb12$cses <- (hsb12$ses - hsb12$meanses) > hsb12.cses <- groupedData( mathach ~ cses | school, data=hsb12) # use unstructured correlation (default in lme function, corr = NULL) > pg335 <- lme(mathach ~ cses, data=hsb12.cses, random = ~ 1 + cses | school ) > summary(pg335) Linear mixed-effects model fit by REML Data: hsb12.cses AIC BIC logLik 46726.24 46767.52 -23357.12 Random effects: Formula: ~ cses | school Structure: General positive-definite StdDev Corr (Intercept) 2.9462163 (Inter cses 0.8328983 0.011 Residual 6.0580812 Fixed effects: mathach ~ cses Value Std.Error DF t-value p-value (Intercept) 12.64925 0.2444948 7024 51.73627 <.0001 cses 2.19291 0.1282521 7024 17.09842 <.0001 Standardized Within-Group Residuals: Min Q1 Med Q3 Max -3.096421 -0.7313269 0.01767674 0.7537266 2.90052 Number of Observations: 7185 Number of Groups: 160  Pages 337-338, > hsb12.big <- groupedData( mathach ~ cses | school, data=hsb12) > pg337 <- lme(mathach ~ cses + meanses + sector + cses:meanses + cses:sector, data=hsb12.big, random = ~ 1 + cses | school, method="REML", corr=NULL) > summary(pg337) Linear mixed-effects model fit by REML Data: hsb12.big AIC BIC logLik 46524.79 46593.58 -23252.40 Random effects: Formula: ~1 + cses | school Structure: General positive-definite, Log-Cholesky parametrization StdDev Corr (Intercept) 1.54128855 (Intr) cses 0.01829966 0.006 Residual 6.06348796 Fixed effects: mathach ~ cses + meanses + sector + cses:meanses + cses:sector Value Std.Error DF t-value p-value (Intercept) 12.113821 0.1986555 7022 60.97905 0e+00 cses 2.935840 0.1507202 7022 19.47874 0e+00 meanses 5.342946 0.3690010 157 14.47949 0e+00 sector 1.214628 0.3061363 157 3.96761 1e-04 cses:meanses 1.044054 0.2910687 7022 3.58697 3e-04 cses:sector -1.642061 0.2331191 7022 -7.04387 0e+00 Correlation: (Intr) cses meanss sector css:mn cses 0.005 meanses 0.245 0.001 sector -0.697 -0.003 -0.356 cses:meanses 0.001 0.284 0.005 -0.002 cses:sector -0.003 -0.694 -0.002 0.005 -0.351 Standardized Within-Group Residuals: Min Q1 Med Q3 Max -3.17005035 -0.72487781 0.01487615 0.75426738 2.96552094 Number of Observations: 7185 Number of Groups: 160  > pg339 <- lme(mathach ~ cses + meanses + sector + cses:meanses + cses:sector, data=hsb12.big, random = ~ 1 | school, method="REML", corr=NULL) > summary(pg339) Linear mixed-effects model fit by REML Data: hsb12.big AIC BIC logLik 46520.79 46575.83 -23252.40 Random effects: Formula: ~1 | school (Intercept) Residual StdDev: 1.541212 6.063506 Fixed effects: mathach ~ cses + meanses + sector + cses:meanses + cses:sector Value Std.Error DF t-value p-value (Intercept) 12.113821 0.1986485 7022 60.98117 0e+00 cses 2.935841 0.1507053 7022 19.48068 0e+00 meanses 5.342945 0.3689877 157 14.48001 0e+00 sector 1.214627 0.3061252 157 3.96775 1e-04 cses:meanses 1.044086 0.2910422 7022 3.58741 3e-04 cses:sector -1.642070 0.2330966 7022 -7.04459 0e+00 Correlation: (Intr) cses meanss sector css:mn cses 0.005 meanses 0.245 0.001 sector -0.697 -0.003 -0.356 cses:meanses 0.001 0.284 0.005 -0.002 cses:sector -0.003 -0.694 -0.002 0.005 -0.351 Standardized Within-Group Residuals: Min Q1 Med Q3 Max -3.17006274 -0.72487339 0.01488267 0.75425043 2.96552444 Number of Observations: 7185 Number of Groups: 160 > anova(pg337, pg339)  Model df AIC BIC logLik Test L.Ratio p-value pg337 1 10 46524.79 46593.58 -23252.40 pg339 2 8 46520.79 46575.83 -23252.40 1 vs 2 0.003256624 0.9984 Second set of examples using willet data file. To input the willett.csv data file in R, use the following code. Change d:/data/ to the where you stored the data files on your computer (and do use forward slashes to separate directory names) > willett <- read.csv("d:/data/willett.csv", header=T) Page 342 Let's describe the grouping structure. > willettg <- groupedData( y ~ time | id, data=willett) Page 342 Including both the intercept and the slope of time as random components. Note: The intercept is included in the random argument even though it is not explicitly listed in the random argument. In order to have only the slope of time included as a random component the random argument would look like this: random=~ time -1 and it would be the -1 which would indicate to R that the intercept is not random. > wfit1 <- lme(y ~ time, data=willettg, random = ~ 1 + time | id) > summary(wfit1) Linear mixed-effects model fit by REML Data: willettg AIC BIC logLik 1278.823 1296.386 -633.4114 Random effects: Formula: ~ 1 + time | id Structure: General positive-definite StdDev Corr (Intercept) 34.62336 (Inter time 11.50654 -0.45 Residual 12.62843 Fixed effects: y ~ time Value Std.Error DF t-value p-value (Intercept) 164.3743 6.118849 104 26.86360 <.0001 time 26.9600 2.166604 104 12.44344 <.0001 Standardized Within-Group Residuals: Min Q1 Med Q3 Max -2.272063 -0.6003579 0.02631387 0.6073068 1.584396 Number of Observations: 140 Number of Groups: 35 Page 344 #centering the covar variable > willett$ccovar <- willett$covar - mean(willett$covar)

#grouping the data
> willettg1 <- groupedData( y ~ time | id, data=willett, outer=~ccovar)
> wfit2 <- lme(y ~ time*ccovar, data=willettg1, random = ~ time | id)
> summary(wfit2)

Linear mixed-effects model fit by REML
Data: willettg1
AIC      BIC    logLik
1276.285 1299.586 -630.1424

Random effects:
Formula: ~time | id
Structure: General positive-definite, Log-Cholesky parametrization
StdDev   Corr
(Intercept) 35.16281 (Intr)
time        10.35610 -0.489
Residual    12.62844

Fixed effects: y ~ time * ccovar
Value Std.Error  DF   t-value p-value
(Intercept) 164.37429  6.206120 103 26.485836  0.0000
time         26.96000  1.993878 103 13.521386  0.0000
ccovar       -0.11355  0.504014  33 -0.225297  0.8231
time:ccovar   0.43286  0.161928 103  2.673155  0.0087
Correlation:
(Intr) time   ccovar
time        -0.522
ccovar       0.000  0.000
time:ccovar  0.000  0.000 -0.522

Standardized Within-Group Residuals:
Min           Q1          Med           Q3          Max
-2.248168979 -0.618725671  0.004285166  0.614719013  1.556883000

Number of Observations: 140
Number of Groups: 35 

Bottom of page 346 and top of page 347

> #compound symmetry
> fit.gls.cs <- gls(y~time, data=willett, corr=corCompSymm(, form=~time | id))
> summary(fit.gls.cs)
Generalized least squares fit by REML
Model: y ~ time
Data: willett
AIC      BIC    logLik
1308.340 1320.049 -650.1698

Correlation Structure: Compound symmetry
Formula: ~time | id
Parameter estimate(s):
Rho
0.7061649

Coefficients:
Value Std.Error  t-value p-value
(Intercept) 164.3743  5.774154 28.46725       0
time         26.9600  1.465864 18.39189       0

Correlation:
(Intr)
time -0.381

Standardized residuals:
Min          Q1         Med          Q3         Max
-2.47038764 -0.71517526  0.03314509  0.80858053  1.80988181

Residual standard error: 35.77345
Degrees of freedom: 140 total; 138 residual
> #autoregressive
> fit.gls.ar1 <- gls(y~time, data=willett, corr=corAR1(, form=~time | id))
> summary(fit.gls.ar1)
Generalized least squares fit by REML
Model: y ~ time
Data: willett
AIC      BIC    logLik
1281.465 1293.174 -636.7327

Correlation Structure: AR(1)
Formula: ~time | id
Parameter estimate(s):
Phi
0.824912

Coefficients:
Value Std.Error  t-value p-value
(Intercept) 164.33842  6.136374 26.78103       0
time         27.19786  1.919857 14.16661       0

Correlation:
(Intr)
time -0.469

Standardized residuals:
Min          Q1         Med          Q3         Max
-2.42825412 -0.71561366  0.03192972  0.78792580  1.76110724

Residual standard error: 36.37940
Degrees of freedom: 140 total; 138 residual

Table on page 347. Note that unstructured is omitted since this covariance structure is not provided. Note you need to multiply logLik by -2 to get the -2RLL value in the table.

> anova(fit.gls.cs, fit.gls.ar1)

            Model df      AIC      BIC    logLik
fit.gls.cs      1  4 1308.340 1320.049 -650.1698
fit.gls.ar1     2  4 1281.465 1293.174 -636.7327

p. 348

> wfit2.ar1 <- lme( y ~ time*ccovar, data=willettg1, random = ~ 1 + time | id,
corr=corAR1(, form =~ time | id))
> summary(wfit2.ar1)

Linear mixed-effects model fit by REML
Data: willettg1
AIC      BIC    logLik
1278.045 1304.259 -630.0223

Random effects:
Formula:  ~ 1 + time | id
Structure: General positive-definite
StdDev   Corr
(Intercept) 35.46865 (Inter
time 10.53770 -0.488
Residual 11.88619

Correlation Structure: AR(1)
Formula:  ~ 1 + time | id
Parameter estimate(s):
Phi
-0.1374558
Fixed effects: y ~ time * ccovar
Value Std.Error  DF   t-value p-value
(Intercept)  164.4229  6.198601 103  26.52581  <.0001
time   26.9080  1.978001 103  13.60363  <.0001
ccovar   -0.1234  0.503403  33  -0.24517  0.8078
time:ccovar    0.4357  0.160638 103   2.71256  0.0078

Standardized Within-Group Residuals:
Min         Q1         Med        Q3      Max
-2.318034 -0.6248881 -0.01432131 0.6548167 1.567537

Number of Observations: 140
Number of Groups: 35 

The content of this web site should not be construed as an endorsement of any particular web site, book, or software product by the University of California.