Often we are dealing with experiments or studies with more than one explanatory variable. In such situation, more than one treatment is applied simultaneously and we want to know what’s their relative effect on the response variable. If both explanatory variables are categorical, the analysis traditionally used is a two-way ANOVA.
Example: Carbon Dioxide Uptake in Grass Plants
The CO2 data frame has 84 rows and 5 columns of data from an experiment on the cold tolerance of the grass species Echinochloa crus-galli.
The CO2 uptake of six plants from Quebec and six plants from Mississippi was measured at several levels of ambient CO2 concentration. Half the plants of each type were chilled overnight before the experiment was conducted.
library(lattice) # quick and dirty way of displaying "complicated" data. xyplot(uptake ~ Treatment, group=Type, type = c("p","r"), auto.key=T, # adds legend data=CO2)
bwplot(uptake ~ Treatment | Type, data=CO2)
# Define the models to compare: CO2.lm.full <- lm(uptake ~ Treatment * Type, data=CO2) CO2.lm.additive <- lm(uptake ~ Treatment + Type, data=CO2) CO2.lm.null <- lm(uptake ~ 1, data=CO2)
anova(CO2.lm.additive, CO2.lm.null) anova(CO2.lm.full, CO2.lm.additive) anova(CO2.lm.additive, CO2.lm.full) summary(CO2.lm.additive)
# models can be also compared via AIC: AIC(CO2.lm.full, CO2.lm.additive, CO2.lm.null)
(Here are some considerations on AIC by Brian McGill.)
summary() provides mean differences between treatments as well as p-values from pairwise t-tests, but it does so only in comparison to a reference level (the one that comes first alpha-numerically). It is possible to produce all possible contrasts by changing the reference level by hand using function
relevel(), but I find the following function way more practical:
pairwise.t.test(x = CO2$uptake, g = CO2$Treatment, p.adj="none") # Oooops, it only handles one response variable at a time! # easy fix: CO2$treatment_combined <- paste(CO2$Treatment, CO2$Type, sep="_") # check: unique(CO2$treatment_combined) pairwise.t.test(x = CO2$uptake, g = CO2$treatment_combined, p.adj="none")
Similarly, it’s possible to obtain mean and SD for each treatment combination from
summary(), but the method shown below is much faster:
library(doBy) msd <- summaryBy(uptake ~ Treatment * Type, data=CO2, FUN=c(mean, sd, length) )
Now we can plot the data as a barplot:
mids <- barplot(msd$uptake.mean, xlab="Treatment", ylab="CO2 uptake", ylim=c(0,50), names.arg = msd$Treatment, col=c(rep(c(grey(0.8), grey(0.3)),2)) ) # Add error bars using arrows(): # ?arrows with(msd, arrows(mids, uptake.mean-uptake.sd, mids, uptake.mean + uptake.sd, code=3, angle=90, length=0.1)) # length=0.1mm legend("topright", title="Origin", legend=unique(msd$Type), fill=c(grey(0.8), grey(0.3)) )