Analysis of Variance (ANOVA) is a parametric statistical technique used to compare data sets. The technique was invented by R.A. Fisher, hence it is also referred to as Fisher’s ANOVA. It is similar to techniques such as the t-test and z-test in that it compares means and the relative variance between them.

A t-test can be used to compare two sample means. What if we want to compare more than two means? Analysis of variance (ANOVA) is best applied where more than 2 populations or samples are to be compared.

It is used to test the statistical significance of the relationship between a dependent variable (“Y”) and one or more independent variables (“X’s”).

## Types of ANOVA

- One-way
  - Measures a single factor from multiple sources
  - Uses only one technician / one measurer
- Two-way (without replicates)
  - Measures 2 factors
  - Uses only one technician (unless technicians are one of the factors)
- Two-way (with replicates)
  - Measures 2 factors, but has multiple repetitions of each combination
  - Uses only one technician (unless technicians are one of the factors)

### ANOVA Sum of squares correction factor

- Grand total of all runs (G) = ΣX
- N = total number of runs
- Correction factor (CF) = (ΣX)^{2}/N = (G)^{2}/N
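
The correction factor can be sketched in a few lines of Python. This is a minimal illustration: the function name `correction_factor` and the observations are hypothetical, and the last step uses the standard shortcut SS_{T} = ΣX^{2} - CF, which is assumed here rather than taken from the text above.

```python
def correction_factor(runs):
    """Return (G, N, CF) for a flat list of observations."""
    grand_total = sum(runs)               # G = ΣX
    n_runs = len(runs)                    # N = total number of runs
    cf = grand_total ** 2 / n_runs        # CF = G^2 / N
    return grand_total, n_runs, cf


# Hypothetical observations, for illustration only.
runs = [4, 6, 5, 9]
G, N, CF = correction_factor(runs)

# Assumed shortcut: total sum of squares = ΣX^2 - CF.
ss_total = sum(x ** 2 for x in runs) - CF
print(G, N, CF, ss_total)
```

Subtracting CF from the raw sum of squares gives the same total sum of squares as summing squared deviations from the grand mean, which is why the correction factor is convenient for hand calculation.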

**Terms used in ANOVA**

- **Degrees of Freedom (df)**: The number of independent conclusions that can be drawn from the data.
- **SS_{Factor}**: Measures the variation of each group mean from the overall mean across all groups.
- **SS_{Error}**: Measures the variation of each observation within each factor level from the mean of that level.
- **Main effect**: The effect of one variable considered in isolation, ignoring the other variables in the study.
- **Interaction**: An interaction effect occurs where the effect of one variable differs across levels of one or more other variables.
- **Mean Square Error (MSE)**: The sum of squares of the residual error divided by its degrees of freedom.
- **F-test statistic**: The null hypothesis that the category means are equal in the population is tested by the F statistic, which is the ratio of the mean square related to X to the mean square related to error.
- **P-value**: The smallest level of significance that would lead to rejection of the null hypothesis (H_{0}). If α = 0.05 and the p-value ≤ 0.05, reject the null hypothesis; if the p-value > 0.05, fail to reject the null hypothesis.

**Analysis of variance (ANOVA) has three types:**

- One-way analysis
- Two-way analysis
- K-way analysis: K-Way ANOVA can be two-way ANOVA or three-way ANOVA or multiple ANOVA

**One way ANOVA**

One-way ANOVA (one-way analysis of variance) is a statistical method to compare means of two or more populations.

**Assumptions of One-way ANOVA**

- The sample data drawn from k populations are unbiased and representative.
- The data of k populations are continuous.
- The data of k populations are normally distributed.
- The variation within each factor or factor treatment combination is the same, and hence it is also called homogeneity of variance.
- Finally, the variances of the k populations are equal.

## Steps for computing one-way ANOVA:

- Establish the hypotheses. H_{0}: µ1 = µ2 = µ3 and H_{1}: at least one of the group means is different from the others.
- In ANOVA, the total variance is subdivided into two independent components: the variance due to the treatment and the variance due to random error.
  - SS_{T} = SS_{b} + SS_{w}
- Calculate the ANOVA table with degrees of freedom (df) for the group, error and total sums of squares, using the following notation:
  - SS_{b} = sum of squares between treatments
  - SS_{w} = sum of squares due to error
  - MS_{b} = mean square for treatments
  - MS_{w} = mean square for error
  - SS_{T} = total sum of squares
  - k = number of treatment levels
  - n = number of runs at a particular level
  - N = total number of runs
  - F = the calculated F statistic, with k-1 and N-k degrees of freedom
- Determine the critical value. The F critical value comes from the F distribution table.
- Finally, draw the statistical conclusion. If Fcalc < Fcritical, fail to reject the null hypothesis; if Fcalc > Fcritical, reject the null hypothesis.
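
The steps above can be sketched in plain Python. This is an illustrative sketch, not a reference implementation: the function name `one_way_anova`, the dictionary keys and the sample data are all hypothetical. It computes the sums of squares directly from deviations about the means, and the critical value is still looked up in an F table and passed in by hand.

```python
def one_way_anova(groups):
    """Return the one-way ANOVA quantities for a list of groups (lists of numbers)."""
    k = len(groups)                               # number of treatment levels
    N = sum(len(g) for g in groups)               # total number of runs
    grand_mean = sum(sum(g) for g in groups) / N
    means = [sum(g) / len(g) for g in groups]

    # Between-treatment variation: group means about the grand mean.
    ss_b = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-treatment (error) variation: observations about their group mean.
    ss_w = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    # Total variation, computed independently so SS_T = SS_b + SS_w can be checked.
    ss_t = sum((x - grand_mean) ** 2 for g in groups for x in g)

    ms_b = ss_b / (k - 1)
    ms_w = ss_w / (N - k)
    return {"SS_b": ss_b, "SS_w": ss_w, "SS_T": ss_t,
            "df_b": k - 1, "df_w": N - k,
            "MS_b": ms_b, "MS_w": ms_w, "F": ms_b / ms_w}


# Hypothetical data: three treatment levels with three runs each.
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])

# Critical value for α = 0.05 with d.f. (2, 6), read from an F table.
f_critical = 5.14
reject_h0 = table["F"] > f_critical
print(table, reject_h0)
```

For this made-up data the between and within sums of squares add up to the total, and the calculated F exceeds the tabulated value, so the null hypothesis of equal means would be rejected.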

**Example** One Way ANOVA

A car manufacturer plans to test the performance of 3 different brands of 12V batteries, so he selected 5 batteries from each brand and discharged them under controlled conditions. Assume the battery lifetimes are normally distributed and test, at the 95% confidence level, the hypothesis that there is no difference between the three brands with respect to lifetime.

H_{0}: µ1= µ2= µ3

H_{1}: At least one of the brand mean life is different from the others.

k = 3, n = 5, N = 15, T1 = 71, T2 = 85, T3 = 116, and G = 272

ΣX^{2} = 1021 + 1489 + 2730 = 5240

Correction factor (CF) = (ΣX)^{2} /N = (G)^{2} /N = (272)^{2}/15

For α = .05, the critical value for F with d.f. (2, 12) is 3.89.

Fcalc > Fcritical, hence we reject the null hypothesis.

The calculated value lies in the critical region. Therefore, there is evidence, at the 5% significance level, that the mean lifetimes of the three brands of batteries differ.
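
The figures behind this conclusion can be reproduced from the summary values given above. The sketch below assumes the usual shortcut formulas SS_{T} = ΣX^{2} - CF and SS_{b} = Σ(T_{i}^{2}/n) - CF, which are not spelled out in the example itself; the variable names are illustrative.

```python
# Summary values taken from the battery example.
k, n, N = 3, 5, 15
totals = [71, 85, 116]              # T1, T2, T3
sum_sq = 1021 + 1489 + 2730         # ΣX^2 = 5240

G = sum(totals)                     # grand total, 272
cf = G ** 2 / N                     # correction factor

# Assumed shortcut formulas (not shown in the example text).
ss_t = sum_sq - cf                              # total sum of squares
ss_b = sum(t ** 2 / n for t in totals) - cf     # between brands
ss_w = ss_t - ss_b                              # within brands (error)

ms_b = ss_b / (k - 1)               # d.f. = 2
ms_w = ss_w / (N - k)               # d.f. = 12
f_calc = ms_b / ms_w

f_critical = 3.89                   # from the example, α = .05, d.f. (2, 12)
print(round(cf, 2), round(ss_t, 2), round(ss_b, 2), round(ss_w, 2))
print(round(f_calc, 2), f_calc > f_critical)
```

Under these assumptions the calculated F is about 13.31, well above the critical value of 3.89, which agrees with the decision to reject the null hypothesis.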

## Example One Way ANOVA

Three different brands of chlorine are used at a local pool over the course of the summer. The staff start with one brand and, as each brand’s four-week supply runs out, begin using the next brand. Management is interested in finding out whether the different brands have a significant effect on the ability to maintain safe pH levels. Once the previous brand is no longer present in any measurable amount, the pH levels for each brand are collected through random sampling. Is there a significant difference between the three brands of chlorine? Test at the 5% significance level. Complete the ANOVA table (except the p-value) manually.

**Two way ANOVA**

Two-way ANOVA performs an analysis of variance to test the equality of population means when treatments are classified by two categorical (independent) variables or factors.

**Assumptions of Two-way ANOVA**

- The cells contain independent samples
- Two main effects and an interaction
- The populations should have equal variance
- Balanced data and fixed factors

**Steps for computing Two-way ANOVA:**

- Establish the hypotheses. The null hypotheses for each of the sets are given below:
  - The population means of the first factor are equal. This is like the one-way ANOVA for the row factor.
  - The population means of the second factor are equal. This is like the one-way ANOVA for the column factor.
  - There is no interaction between the two factors. This is similar to performing a test for independence with contingency tables.
- Calculate the test statistic and the rejection region.
- Calculate SS_{AB(Interaction)} as SS_{t} - SS_{w} - SS_{A} - SS_{B}, using the following notation:
  - a: number of levels in factor A
  - b: number of levels in factor B
  - r: total number of trials
  - X_{i..}: mean of the i-th level of factor A
  - X_{...}: overall mean of all observations
  - X_{.j.}: mean of the j-th level of factor B
  - X_{ij.}: mean of observations at the i-th level of factor A and the j-th level of factor B
- Calculate the ANOVA table with degrees of freedom (df) and F values.
- Determine the critical value. Remember, F(critical) comes from the F distribution table.
- Finally, draw the statistical conclusion. If Fcalc < Fcritical, fail to reject the null hypothesis; if Fcalc > Fcritical, reject the null hypothesis.
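
As a rough sketch of these steps for a two-way layout with replicates, the Python below builds the sums of squares, degrees of freedom and F values. It assumes a balanced design with the same number of replicates in every cell; the function name `two_way_anova`, the `reps` variable (replicates per cell) and the data are hypothetical, and the degrees of freedom follow the standard fixed-effects formulas rather than a table from this article.

```python
def two_way_anova(cells):
    """cells[i][j] holds the replicate observations at level i of A and level j of B."""
    a = len(cells)                 # levels of factor A
    b = len(cells[0])              # levels of factor B
    reps = len(cells[0][0])        # replicates per cell (balanced design assumed)

    def mean(xs):
        return sum(xs) / len(xs)

    all_obs = [x for row in cells for cell in row for x in cell]
    grand = mean(all_obs)
    mean_a = [mean([x for cell in row for x in cell]) for row in cells]
    mean_b = [mean([x for row in cells for x in row[j]]) for j in range(b)]

    ss_t = sum((x - grand) ** 2 for x in all_obs)
    ss_a = b * reps * sum((m - grand) ** 2 for m in mean_a)
    ss_b = a * reps * sum((m - grand) ** 2 for m in mean_b)
    ss_w = sum((x - mean(cell)) ** 2 for row in cells for cell in row for x in cell)
    ss_ab = ss_t - ss_w - ss_a - ss_b          # interaction, by subtraction

    df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1), "W": a * b * (reps - 1)}
    ms_w = ss_w / df["W"]
    ss = {"A": ss_a, "B": ss_b, "AB": ss_ab, "W": ss_w, "T": ss_t}
    f = {name: (ss[name] / df[name]) / ms_w for name in ("A", "B", "AB")}
    return {"SS": ss, "df": df, "F": f}


# Hypothetical 2 x 2 layout with 2 replicates per cell.
result = two_way_anova([[[1, 3], [5, 7]],
                        [[2, 4], [10, 12]]])
print(result)
```

Each F value is then compared with the critical value for its own degrees of freedom, so the two main effects and the interaction can lead to different conclusions.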

**Example:** A researcher is conducting an experiment to check the effectiveness of a coating. The new coating is applied to 2 different materials, and the research is conducted at 2 different laboratories. Each laboratory tested 5 samples from each of the treated materials. Find the results from the table below:

For α = .05, the critical value for F with d.f. (1, 16) is 4.49.

For both Laboratory and Material, Fcalc > Fcritical, hence we reject the null hypothesis. For the interaction, Fcalc < Fcritical, hence we fail to reject the null hypothesis.

**Multivariate analysis of variance** (MANOVA)

Multivariate analysis of variance (MANOVA) is simply an ANOVA with several dependent variables. That is to say, ANOVA tests for the difference in means between two or more groups, while MANOVA tests for the difference in two or more vectors of means.

MANOVA examines the dependence relationship between a set of dependent measures across a set of groups. It is typically used in experimental design, usually where a relationship between the dependent measures is hypothesized.

**Assumptions of MANOVA**

- The response variables are continuous
- The residuals follow the multivariate-normal probability distribution with means equal to zero.
- The variance-covariance matrices of each group of residuals are equal.
- The individuals are independent.

Interestingly, in addition to detecting differences in the average values, MANOVA can also detect differences in the correlations among the dependent variables between the different levels of the independent variable.

**Benefits of MANOVA**

There are various advantages of MANOVA over ANOVA.

- Study any interaction between the factors.
- Studying two or more factors simultaneously increases the model’s efficiency.
- MANOVA reduces the chances of alpha risk.
- Less residual variation in the model when more factors are in the study.

## Additional Resources for ANOVA

http://psychnut.com/statistics/even-more-about-1-way-anova/

http://www.psychstat.missouristate.edu/introbook/sbk27.htm (ANOVA detailed)

http://oak.ucc.nau.edu/rh232/courses/EPS525/Handouts/Understanding%20the%20One-way%20ANOVA.pdf

## Six Sigma Black Belt Certification ANOVA Questions:

**Question:** To assess the significance of factors in either a fractional or a full-factorial experiment structure, a black belt can use: (Taken from ASQ sample Black Belt exam.)

(A) analysis of variance (ANOVA)

(B) fault tree analysis (FTA)

(C) failure mode and effects analysis (FMEA)

(D) evolutionary operation (EVOP)


## Comments (4)

What is the purpose of ANOVA?

An ANOVA usually is used to compare the means of three or more factors by using the F Distribution.

Hey,

In your Two Way ANOVA example of the lab/materials, I don’t understand what I am supposed to sum up for SStotal to get 201. I was hoping there was a walkthrough of how you got each of those numbers for SStotal, SSwithin

Hello Alex,

I have updated the article to include detailed calculations of SStotal, SSwithin, SSrowfactor, SScolumnfactor etc.

Hope this clarifies!

Thanks