The following is a graphical explanation of the advantage of accounting for non-independence of observations when testing for differences between groups.
rm(list = ls())
set.seed(66)
# simulate body fat percentages under the two diets
x1 <- rnorm(n = 100, mean = 30, sd = 10)
x2 <- rnorm(n = 100, mean = 60, sd = 10)
# arrange the data in a dataset
dd <- data.frame(ID = rep(paste("ID", seq(1, 100, by = 1), sep = "_"), 2),
                 response = c(x1, x2),
                 diet = c(rep("A", 100), rep("B", 100))
                 )
# response2: both diets sorted in the same direction, so each subject's value
# under Diet B exceeds its value under Diet A by a similar amount
dd$response2 <- c(sort(x1, decreasing = FALSE), sort(x2, decreasing = FALSE))
Diets A and B refer to the same 100 subjects. For example, they may represent the body fat percentage of 100 pigs when fed diet A and diet B. The body fat distribution for Diet A is the same in response and response2; so is the body fat distribution for Diet B.
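This can be checked directly. The snippet below (a quick sketch, not part of the original post) rebuilds the simulated data and confirms that, within each diet, response and response2 contain exactly the same values, only assigned to different subjects:

```r
set.seed(66)
x1 <- rnorm(n = 100, mean = 30, sd = 10)   # Diet A
x2 <- rnorm(n = 100, mean = 60, sd = 10)   # Diet B
dd <- data.frame(response  = c(x1, x2),
                 response2 = c(sort(x1), sort(x2)),
                 diet      = rep(c("A", "B"), each = 100))
# sorting both columns within a diet gives identical vectors,
# so the per-diet distributions coincide
all.equal(sort(dd$response[dd$diet == "A"]), sort(dd$response2[dd$diet == "A"]))  # TRUE
all.equal(sort(dd$response[dd$diet == "B"]), sort(dd$response2[dd$diet == "B"]))  # TRUE
```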
In response, data points from groups A and B are not associated according to any pattern (Fig. 1: a, c): body fat percentage is higher overall for Diet B, but each subject responds differently to the change in diet. In response2, body fat percentage is higher for Diet B than it is for Diet A in every subject (Fig. 1: b, d). The scenario described by response2 represents a more uniform response to the change in diet: in other words, the response to the change in diet has less intra-individual variability.
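This difference in intra-individual variability can be made concrete by looking at the standard deviation of the within-subject differences (a quick sketch, not part of the original post; it assumes the uniform-response pairing sorts both diets in the same direction):

```r
set.seed(66)
x1 <- rnorm(n = 100, mean = 30, sd = 10)   # Diet A
x2 <- rnorm(n = 100, mean = 60, sd = 10)   # Diet B
# response: subjects are paired with no particular pattern
sd(x2 - x1)
# response2: i-th smallest value under A paired with i-th smallest under B,
# so every subject shows roughly the same increase
sd(sort(x2) - sort(x1))   # much smaller
```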
If we test for differences between the two diets, in response and in response2, without accounting for subject identity, the two t-tests give exactly the same results:
t.test(dd$response[which(dd$diet == "A")],
       dd$response[which(dd$diet == "B")],
       paired = FALSE, var.equal = TRUE
       )
# Two Sample t-test
# data: dd$response[which(dd$diet == "A")] and dd$response[which(dd$diet == "B")]
# t = -22.9, df = 198, p-value <2e-16
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
# -32.549 -27.388
# sample estimates:
# mean of x mean of y
# 30.403 60.371
t.test(dd$response2[which(dd$diet == "A")],
       dd$response2[which(dd$diet == "B")],
       paired = FALSE, var.equal = TRUE
       )
# Two Sample t-test
# data: dd$response2[which(dd$diet == "A")] and dd$response2[which(dd$diet == "B")]
# t = -22.9, df = 198, p-value <2e-16
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
# -32.549 -27.388
# sample estimates:
# mean of x mean of y
# 30.403 60.371
By running paired t-tests we account for the non-independence of the observations under Diet A and Diet B. This allows us to reduce the amount of unexplained variability, thus increasing the signal-to-noise ratio:
t.test(dd$response[which(dd$diet == "A")],
       dd$response[which(dd$diet == "B")],
       paired = TRUE   # var.equal is ignored in paired tests
       )
# Paired t-test
# data: dd$response[which(dd$diet == "A")] and dd$response[which(dd$diet == "B")]
# t = -22.5, df = 99, p-value <2e-16
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
# -32.607 -27.331
# sample estimates:
# mean of the differences
# -29.969
t.test(dd$response2[which(dd$diet == "A")],
       dd$response2[which(dd$diet == "B")],
       paired = TRUE
       )
# Paired t-test
# data: dd$response2[which(dd$diet == "A")] and dd$response2[which(dd$diet == "B")]
# t = -177, df = 99, p-value <2e-16
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
# -30.305 -29.632
# sample estimates:
# mean of the differences
# -29.969
The t-value measures the size of the difference between the two groups relative to the variability in the data. The closer t is to zero, the weaker the evidence for a real difference between the groups. Note how the absolute value of the t-statistic is much larger in the latter case (|t| = 177 for response2 versus |t| = 22.5 for response).
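The paired t-statistic is simply the mean within-subject difference divided by its standard error. Computing it by hand (a sketch, not part of the original post; paired_t is a helper defined here, not a base R function) makes explicit how lower intra-individual variability inflates |t|:

```r
set.seed(66)
x1 <- rnorm(n = 100, mean = 30, sd = 10)   # Diet A
x2 <- rnorm(n = 100, mean = 60, sd = 10)   # Diet B

# t = mean(d) / (sd(d) / sqrt(n)), where d are the within-subject differences
paired_t <- function(a, b) {
  d <- a - b
  mean(d) / (sd(d) / sqrt(length(d)))
}

paired_t(x1, x2)             # random pairing: reproduces t.test(..., paired = TRUE)
paired_t(sort(x1), sort(x2)) # uniform response: same mean difference, smaller sd(d)
```

The mean of the differences is the same in both cases; only the denominator, driven by sd(d), changes.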
Did you enjoy this? Consider joining my online course “First steps in data analysis with R” and learn data analysis from zero to hero!