# Why does repeated measures ANOVA assume sphericity?

user1205901 05/12/2014. 2 answers, 407 views

Why does repeated measures ANOVA assume sphericity?

By sphericity I mean the assumption that the variance of all pairwise differences between groups should be the same.

In particular, I don't understand why this should be the assumption and not that the variances of the observed group scores themselves be the same.

1 ttnphns 05/12/2014
As I've commented here: because the difference variables between the RM levels are tied by their origin, sphericity implies that they have the same variances.
1 John 05/12/2014
Before answering, it would be helpful to know whether you understand why independent-measures ANOVA has an assumption of homogeneity of variance.
user1205901 05/13/2014
@John My understanding is that the answer given at stats.stackexchange.com/questions/81914/… correctly answers that question.
user1205901 05/13/2014
@ttnphns Unfortunately I don't quite understand your answer. Would you or some other poster be interested in spinning it out into a more detailed response?

amoeba 05/19/2016.

## Intuition behind sphericity assumption

One of the assumptions of common (non-repeated-measures) ANOVA is equal variance in all groups.

(We can understand it because equal variance, also known as homoscedasticity, is needed for the OLS estimator in linear regression to be BLUE and for the corresponding t-tests to be valid, see Gauss–Markov theorem. And ANOVA can be implemented as linear regression.)

So let's try to reduce the RM-ANOVA case to the non-RM case. For simplicity, I will be dealing with one-factor RM-ANOVA (without any between-subject effects) that has $n$ subjects recorded in $k$ RM conditions.

Each subject can have their own subject-specific offset, or intercept. If we subtract the values in one group from the values in all other groups, we cancel these intercepts and arrive at a situation where we can use non-RM-ANOVA to test whether these $k-1$ group differences are all zero. For this test to be valid, we need the assumption that these $k-1$ differences have equal variances.

Now we can instead subtract group #2 from all other groups, again arriving at $k-1$ differences that should also have equal variances. For each of the $k$ groups, the variances of the corresponding $k-1$ differences should be equal. It quickly follows that the variances of all $k(k-1)/2$ possible pairwise differences should be equal.

Which is precisely the sphericity assumption.
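To make this concrete, here is a small numerical sketch. The covariance matrix below is a made-up example of the Huynh–Feldt "type H" form $\sigma_{ab} = c_a + c_b + \lambda\delta_{ab}$, which satisfies sphericity without being compound symmetric; the values of $c$ and $\lambda$ are arbitrary:

```python
import numpy as np

# Assumed example: a "type H" covariance matrix
# sigma_ab = c_a + c_b + lam * delta_ab (Huynh-Feldt form),
# which satisfies sphericity but is NOT compound symmetric.
lam = 2.0
c = np.array([1.0, 2.0, 3.0, 5.0])
sigma = c[:, None] + c[None, :] + lam * np.eye(4)

k = sigma.shape[0]
# Var(y_a - y_b) = sigma_aa + sigma_bb - 2 * sigma_ab for every pair
diff_vars = [sigma[a, a] + sigma[b, b] - 2 * sigma[a, b]
             for a in range(k) for b in range(a + 1, k)]
print(diff_vars)        # every pairwise difference has variance 2*lam = 4.0
print(np.diag(sigma))   # yet the group variances themselves all differ
```

Note that the diagonal entries (the group variances) are all different, which already hints at the answer to the second question below: sphericity does not require equal group variances.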

## Why shouldn't group variances be equal themselves?

When we think of RM-ANOVA, we usually think of a simple additive mixed-model-style model of the form $$y_{ij}=\mu+\alpha_i + \beta_j + \epsilon_{ij},$$ where $\alpha_i$ are subject effects, $\beta_j$ are condition effects, and $\epsilon\sim\mathcal N(0,\sigma^2)$.

For this model, group differences will follow $\mathcal N(\beta_{j_1} - \beta_{j_2}, 2\sigma^2)$, i.e. will all have the same variance $2\sigma^2$, so sphericity holds. But each group will follow a mixture of $n$ Gaussians with means at $\alpha_i$ and variances $\sigma^2$, which is some complicated distribution with variance $V(\vec \alpha, \sigma^2)$ that is constant across groups.

So in this model, indeed, group variances are the same too. Group covariances are also the same, meaning that this model implies compound symmetry. This is a more stringent condition as compared to sphericity. As my intuitive argument above shows, RM-ANOVA can work fine in the more general situation, when the additive model written above does not hold.
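A quick simulation of this additive model (with assumed values $\sigma_\alpha=1.5$ and $\sigma=1$ for the subject and noise standard deviations, and made-up condition effects) shows compound symmetry emerging in the sample covariance matrix across conditions:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 100_000, 4
sigma_a, sigma_e = 1.5, 1.0             # assumed subject / noise SDs
beta = np.array([0.0, 0.5, 1.0, 1.5])   # assumed condition effects

alpha = rng.normal(0.0, sigma_a, size=(n, 1))  # subject intercepts alpha_i
y = alpha + beta + rng.normal(0.0, sigma_e, size=(n, k))

C = np.cov(y, rowvar=False)   # k x k covariance across conditions
# The additive model implies compound symmetry:
#   diagonal entries     ~= sigma_a^2 + sigma_e^2 = 3.25
#   off-diagonal entries ~= sigma_a^2             = 2.25
print(np.round(C, 2))
# ...and hence sphericity: each pairwise difference has variance ~2*sigma_e^2
print(round(np.var(y[:, 0] - y[:, 1]), 2))
```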

## Precise mathematical statement

I am going to add here something from Huynh & Feldt (1970), Conditions Under Which Mean Square Ratios in Repeated Measurements Designs Have Exact $F$-Distributions.

## What happens when sphericity breaks?

When sphericity does not hold, we can probably expect RM-ANOVA to (i) have inflated size (more type I errors), (ii) have decreased power (more type II errors). One can explore this by simulations, but I am not going to do it here.

It turns out that the effect of violating sphericity is a loss of power (i.e. an increased probability of a Type II error) and a test statistic (the F-ratio) that simply cannot be compared to the tabulated values of the F-distribution. The F-test becomes too liberal (i.e. the proportion of rejections of the null hypothesis is larger than the alpha level when the null hypothesis is true).

A precise investigation of this subject is very involved, but fortunately Box (1954) wrote a paper about it: https://projecteuclid.org/download/pdf_1/euclid.aoms/1177728786

In short, the situation is as follows. Suppose we have a one-factor repeated-measurements design with $S$ subjects and $A$ experimental treatments. In this case the effect of the independent variable is tested by computing the F statistic, which is the ratio of the mean square of the effect to the mean square of the interaction between the subject factor and the independent variable. When sphericity holds, this statistic follows an F-distribution with $\upsilon_{1}=A-1$ and $\upsilon_{2}=(A-1)(S-1)$ degrees of freedom.
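This F statistic is easy to simulate. The sketch below (assumed design: $S=20$ subjects, $A=4$ treatments, no true treatment effects, and a made-up covariance matrix that strongly violates sphericity) estimates the actual rejection rate of the uncorrected test at nominal $\alpha=0.05$; under such a strong violation it typically comes out well above 5%, illustrating the "too liberal" behaviour:

```python
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(0)
S, A, alpha_level, n_sim = 20, 4, 0.05, 2000

# Assumed covariance matrix that strongly violates sphericity
# (one treatment is far more variable than the others)
sigma = np.diag([1.0, 1.0, 1.0, 16.0])

crit = f.ppf(1 - alpha_level, A - 1, (A - 1) * (S - 1))
rejections = 0
for _ in range(n_sim):
    # null hypothesis is true: no treatment effects
    y = rng.multivariate_normal(np.zeros(A), sigma, size=S)  # S x A
    grand = y.mean()
    subj = y.mean(axis=1, keepdims=True)    # subject means
    treat = y.mean(axis=0, keepdims=True)   # treatment means
    ms_effect = S * ((treat - grand) ** 2).sum() / (A - 1)
    ms_inter = ((y - subj - treat + grand) ** 2).sum() / ((A - 1) * (S - 1))
    rejections += int(ms_effect / ms_inter > crit)

print(rejections / n_sim)   # empirical type I error rate, nominal 0.05
```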

In the above article Box showed that, when sphericity fails, the correct degrees of freedom of the F ratio depend on a sphericity index $\epsilon$ like so: $$\upsilon_{1} = \epsilon(A-1)$$ $$\upsilon_{2} = \epsilon(A-1)(S-1)$$

Box also introduced the sphericity index $\epsilon$, which applies to the population covariance matrix. If we call $\xi_{a,a'}$ the entries of this $A\times A$ matrix, then the index is

$$\epsilon = \frac{\left ( \sum_{a}^{ }\xi_{a,a} \right )^{2}}{\left ( A-1 \right )\sum_{a,a'}^{ }\xi_{a,a'}^{2}}$$

The Box index of sphericity is best understood in relation to the eigenvalues of a covariance matrix. Recall that covariance matrices belong to the class of positive semi-definite matrices and therefore always have positive or null eigenvalues. Thus, the sphericity condition is equivalent to having all (nonzero) eigenvalues equal to a constant.
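Here is a small sketch of the index in its eigenvalue form. I compute it from the double-centered covariance matrix, as is done for the Greenhouse–Geisser correction (this is my assumption about the intended formula; applied to the raw matrix the eigenvalue criterion would not match RM-ANOVA's sphericity condition):

```python
import numpy as np

def box_epsilon(sigma):
    """Sphericity index epsilon in eigenvalue form, computed from the
    double-centered covariance matrix (as in Greenhouse-Geisser).
    epsilon == 1 iff sphericity holds; its minimum is 1/(A-1)."""
    A = sigma.shape[0]
    P = np.eye(A) - np.ones((A, A)) / A      # centering matrix
    lam = np.linalg.eigvalsh(P @ sigma @ P)  # eigenvalues (one is ~0)
    return lam.sum() ** 2 / ((A - 1) * (lam ** 2).sum())

# compound symmetry satisfies sphericity, so epsilon = 1
cs = 0.5 * np.ones((4, 4)) + 0.5 * np.eye(4)
print(round(box_epsilon(cs), 4))   # -> 1.0

# a strongly non-spherical matrix gives epsilon well below 1
ns = np.diag([1.0, 1.0, 1.0, 16.0])
print(round(box_epsilon(ns), 4))
```

Plugging the resulting $\epsilon$ into $\upsilon_{1}=\epsilon(A-1)$ and $\upsilon_{2}=\epsilon(A-1)(S-1)$ gives the corrected degrees of freedom mentioned above.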

So, when sphericity is violated we should apply some correction to our F statistic; the most notable examples of such corrections are Greenhouse–Geisser and Huynh–Feldt.

Without any correction your results will be biased and therefore unreliable. Hope this helps!

amoeba 05/18/2016
+1. I will comment more later, but for now your first paragraph mixes together the power and the size of the test. What is impaired when sphericity is violated? The type I error rate under the null? Or the power? Or both? You probably mean both, but formulation is not very clear (I think). Also, it's not "Box et al", it's Box alone :)
I think the power will mostly be impaired, because as Box showed, when sphericity is violated we have to rely on a completely different statistic (with different degrees of freedom). If we don't, then depending on how strong the violation is, we will have a larger proportion of rejections of the null hypothesis.
amoeba 05/18/2016
Sorry, still confused, now by your comment: "larger proportion of rejections of the null" -- you mean when the null is actually true? But this has nothing to do with power, this is type I error rate.
amoeba 05/19/2016
+10. I award my bounty to this answer: it's good and also it's the only answer that appeared in the bounty period. I am not fully satisfied with your answer (yet?) and I started writing my own answer (currently incomplete, but already posted), but I have only a partial understanding of the underlying math. Your answer definitely helped and the reference to Box 1954 is very helpful too.
amoeba 05/19/2016
Some further confusing moments. (1) Where does Box introduce the sphericity index $\epsilon$ in this paper? I don't see it at all. The formula for $\epsilon$ does not appear in this paper. (2) Are you sure that $\xi$s in this formula are the eigenvalues of the $A\times A$ covariance matrix? I don't think it's true: when this matrix satisfies RM-ANOVA's "sphericity condition" its eigenvalues don't have to be equal.