We will demonstrate the how to conduct pairwise comparisons in R and the different options for adjusting the p-values of these comparisons given the number of tests conducted. We will be using the hsb2 dataset and looking at the variable write by ses. We will first look at the means and standard deviations by ses.
hsb2<-read.table("http://www.ats.ucla.edu/stat/r/notes/hsb2.csv", sep=",", header=T) attach(hsb2) tapply(write, ses, mean) high low middle 55.91379 50.61702 51.92632 tapply(write, ses, sd) high low middle 9.442874 9.490391 9.106044
In R, we can run the ANOVA with the aov command.
a1 <- aov(write ~ ses) summary(a1) Df Sum Sq Mean Sq F value Pr(>F) ses 2 858.7 429.4 4.9696 0.007842 ** Residuals 197 17020.2 86.4 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
From this output, we can see that ses is significant in the 2-degrees of freedom test, but we do not know which pairs of ses levels are significantly different from each other. However, this will require three tests (high vs. low, high vs. middle, low vs. middle), so we wish to adjust what we consider to be statistically significant to account for this multiplicity of tests. For an one-way ANOVA (ANOVA with a single factor) We can first see the unadjusted p-values using the pairwise.t.test command and indicating no adjustment of p-values:
pairwise.t.test(write, ses, p.adj = "none") Pairwise comparisons using t tests with pooled SD data: write and ses high low low 0.0041 - middle 0.0108 0.4306 P value adjustment method: none
With this same command, we can adjust the p-values according to a variety of methods. Below we show Bonferroni and Holm adjustments to the p-values and others are detailed in the command help.
pairwise.t.test(write, ses, p.adj = "bonf") Pairwise comparisons using t tests with pooled SD data: write and ses high low low 0.012 - middle 0.032 1.000 P value adjustment method: bonferroni pairwise.t.test(write, ses, p.adj = "holm") Pairwise comparisons using t tests with pooled SD data: write and ses high low low 0.012 - middle 0.022 0.431 P value adjustment method: holm
We can see that the adjustments all lead to increased p-values, but consistently the high-low and high-middle pairs appear to be significantly different at alpha = .05. The pairwise.t.test command does not offer Tukey post-hoc tests, but there are other R commands that allow for Tukey comparisons. Below, we show code for using the TukeyHSD (Tukey Honest Significant Differences).
TukeyHSD(a1) Tukey multiple comparisons of means 95% family-wise confidence level Fit: aov(formula = write ~ ses) $ses diff lwr upr p adj low-high -5.296772 -9.604818 -0.9887256 0.0114079 middle-high -3.987477 -7.645265 -0.3296892 0.0289035 middle-low 1.309295 -2.605257 5.2238465 0.7096950
We can see that these results are significant with what we saw using other adjustments for the p-values.
You may be fitting an ANOVA with multiple factors. Below we look at write on ses and female.
a2 <- aov(write ~ ses + female) summary(a2) Df Sum Sq Mean Sq F value Pr(>F) ses 2 858.7 429.4 5.3869 0.005279 ** female 1 1398.1 1398.1 17.5410 4.247e-05 *** Residuals 196 15622.1 79.7 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
We can look at pair-wise comparisons of the ses levels after adjusting for female. The TukeyHSD command still works well, though now we must specify which factor is of interest.
TukeyHSD(a2, "ses") Tukey multiple comparisons of means 95% family-wise confidence level Fit: aov(formula = write ~ ses + female) $ses diff lwr upr p adj low-high -5.296772 -9.434764 -1.1587797 0.0079527 middle-high -3.987477 -7.500879 -0.4740753 0.0216707 middle-low 1.309295 -2.450736 5.0693251 0.6896199
The content of this web site should not be construed as an endorsement of any particular web site, book, or software product by the University of California.