9.2 One-Way ANOVA
Analysis of variance extends the comparison of two groups to several, each a level of a categorical variable (factor). A one-way ANOVA hypothesis test determines whether several population means are equal: the null hypothesis is that the mean of the response is the same in every group, and the alternative hypothesis is that at least one group mean differs from the others. The test statistic follows an F distribution, which has two degrees-of-freedom parameters, one for the numerator and one for the denominator. The test rests on the following conditions:
- Each population from which a sample is taken is assumed to be normal.
- All samples are randomly selected and independent.
- The populations are assumed to have equal standard deviations (or variances).
- The factor is a categorical variable.
- The response is a numerical variable.
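As a compact restatement of the hypotheses described in this section, and assuming the notation \(k\) for the number of groups and \(\mu_j\) for the population mean of group \(j\) (symbols introduced here for illustration only), the test can be written as:

\[
H_0:\ \mu_1 = \mu_2 = \cdots = \mu_k
\qquad \text{versus} \qquad
H_a:\ \text{at least one } \mu_j \text{ differs from the others.}
\]

Rejecting \(H_0\) indicates only that the group means are not all equal; it does not identify which group differs.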
9.3 The F-Distribution and the F-Ratio
Analysis of variance compares the means of a response variable for several groups. It does so by comparing the variation among the group means (between-groups variance) to the variation within each group (within-groups variance). The ratio of the between-groups variance to the within-groups variance is the \(F\)-statistic, which follows an \(F\)-distribution with (number of groups minus 1) numerator degrees of freedom and (number of observations minus number of groups) denominator degrees of freedom. These statistics are summarized in the ANOVA table.
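The quantities in the ANOVA table can be written out explicitly. The notation here is an assumption made for illustration: \(k\) groups, \(n_j\) observations in group \(j\), \(N\) observations in total, \(\bar{x}_j\) for the mean of group \(j\), and \(\bar{x}\) for the mean of all observations.

\[
SS_{\text{between}} = \sum_{j=1}^{k} n_j \left(\bar{x}_j - \bar{x}\right)^2,
\qquad
SS_{\text{within}} = \sum_{j=1}^{k} \sum_{i=1}^{n_j} \left(x_{ij} - \bar{x}_j\right)^2
\]

\[
MS_{\text{between}} = \frac{SS_{\text{between}}}{k - 1},
\qquad
MS_{\text{within}} = \frac{SS_{\text{within}}}{N - k},
\qquad
F = \frac{MS_{\text{between}}}{MS_{\text{within}}}
\]

The two divisors, \(k - 1\) and \(N - k\), are the numerator and denominator degrees of freedom of the \(F\)-distribution.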
The \(F\)-distribution is defined only for nonnegative values and its graph is skewed right, though the shape can be mounded or exponential depending on the combination of numerator and denominator degrees of freedom. The \(F\)-statistic is the ratio of a measure of the variation in the group means to a similar measure of the variation within the groups. If the null hypothesis is correct, the numerator and denominator both estimate the same population variance, so the \(F\)-statistic should be close to 1. The area under the \(F\)-curve to the right of the statistic will then be large, representing a large \(p\)-value. When the null hypothesis of equal group means is incorrect, the numerator should be large compared to the denominator, giving a large \(F\)-statistic and a small area (small \(p\)-value) to the right of the statistic under the \(F\)-curve.
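The behavior described above can be illustrated with a minimal Python sketch. The function name `f_statistic` and both data sets are made up for illustration, and the code uses only the standard library. It computes the \(F\)-statistic and its degrees of freedom; obtaining a \(p\)-value would additionally need the \(F\)-distribution's tail area from a table or a statistics package.

```python
from statistics import mean


def f_statistic(groups):
    """Return (F, df_between, df_within) for a list of numeric samples.

    Hypothetical helper for illustration: each inner list is one group.
    """
    k = len(groups)                          # number of groups
    n_total = sum(len(g) for g in groups)    # number of observations
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]

    # Variation among the group means, weighted by group size.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of each observation around its own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within


# Made-up data: group means that are close together ...
similar = [[4, 5, 6], [5, 6, 7], [4, 5, 6]]
# ... and group means that are far apart, with the same within-group spread.
shifted = [[4, 5, 6], [9, 10, 11], [14, 15, 16]]

for label, data in (("similar", similar), ("shifted", shifted)):
    f_value, df1, df2 = f_statistic(data)
    print(f"{label}: F = {f_value:.2f} with ({df1}, {df2}) degrees of freedom")
```

For the first data set the statistic comes out at about 1, consistent with equal group means; for the second it is 75, which lies far in the right tail of an \(F\)-distribution with 2 and 6 degrees of freedom.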