## What is the F Distribution?

The F-distribution, also known as the Fisher–Snedecor distribution, is widely used to test for equality of variances of two normal populations. It is named after R.A. Fisher, who initially developed the concept in the 1920s. It is the probability distribution of an F-statistic.

The F-distribution is generally a skewed distribution and is closely related to the chi-squared distribution. It is the distribution of the ratio of X_{1}, a chi-square random variable with degrees of freedom ϑ_{1}, to X_{2}, a chi-square random variable with degrees of freedom ϑ_{2}, where each chi-square random variable has first been divided by its own degrees of freedom.

The shape of the distribution depends on the degrees of freedom of the numerator, ϑ_{1}, and of the denominator, ϑ_{2}.
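As an illustrative sketch (the degrees of freedom below are arbitrary choices, not values from the article), the chi-square ratio definition can be checked by simulation:

```python
import numpy as np

# Minimal sketch: an F-distributed value is the ratio of two independent
# chi-square random variables, each divided by its own degrees of freedom.
rng = np.random.default_rng(0)
d1, d2 = 5, 10  # assumed example degrees of freedom

x1 = rng.chisquare(d1, size=200_000) / d1
x2 = rng.chisquare(d2, size=200_000) / d2
f_samples = x1 / x2

# For d2 > 2, the theoretical mean of F(d1, d2) is d2 / (d2 - 2).
print(round(f_samples.mean(), 2))  # close to 10 / 8 = 1.25
```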

**What are the properties of an F Distribution?**

- The F distribution curve is positively skewed (to the right), with a range from 0 to ∞
- The value of F is always positive or zero; it never takes negative values
- The shape of the distribution depends on the degrees of freedom of the numerator ϑ_{1} and the denominator ϑ_{2}
- The degree of skewness decreases as the degrees of freedom of the numerator and denominator increase
- The F distribution curve is never truly symmetrical, but as the degrees of freedom increase it becomes closer to symmetrical

**When would you use the F Distribution?**

The F-test, which uses the F distribution, compares the levels of an independent variable across multiple groups. It is most commonly used in ANOVA calculations: whenever you need to compare more than two groups, use the F distribution for the F-test.

**Example:** In a manufacturing unit, torque is a key parameter in terminal squeeze welding. To check whether various torque values have a significant effect on the squeeze welds, an operator sets up trials at 5 Nm, 8 Nm, 10 Nm, and 12 Nm on four randomly selected batches of 30 terminals. ANOVA can determine whether the means of these 4 trials are different; it uses F-tests to statistically test the equality of means.

**Assumptions of F distribution**

- Both populations are assumed to be normally distributed
- The two populations are independent of each other
- The larger sample variance always goes in the numerator, which makes the test right-tailed; right-tailed tests are easier to calculate


## What is an F Test?

The F test determines whether two independent estimates of population variance differ significantly, or whether two samples have been drawn from normal populations with the same variance. In both cases the F ratio is

F = S_{1}^{2} / S_{2}^{2}

with σ_{1}^{2} > σ_{2}^{2} and S_{1}^{2} > S_{2}^{2}; in other words, the larger estimate of variance always goes in the numerator and the smaller estimate of variance in the denominator.

Degrees of freedom (ϑ):

- DF of the larger variance (i.e., the numerator): ϑ_{1} = n_{1} - 1
- DF of the smaller variance (i.e., the denominator): ϑ_{2} = n_{2} - 1
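A minimal sketch of computing the F ratio and its degrees of freedom from two samples. The measurements below are invented for illustration only:

```python
import numpy as np

# Hypothetical sample data (not from the article).
sample_a = np.array([10.2, 9.8, 10.5, 10.1, 9.9, 10.4])
sample_b = np.array([10.0, 10.1, 9.9, 10.2, 10.0])

var_a = sample_a.var(ddof=1)  # sample variance, n - 1 in the denominator
var_b = sample_b.var(ddof=1)

# The larger estimate of variance goes in the numerator; DF = n - 1 each.
if var_a >= var_b:
    f_ratio, df_num, df_den = var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
else:
    f_ratio, df_num, df_den = var_b / var_a, len(sample_b) - 1, len(sample_a) - 1

print(f_ratio >= 1, df_num, df_den)
```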

**What is an F Statistic?**

The F statistic, also known as the F value, is used in ANOVA and regression analysis to identify whether the means of two or more populations are significantly different. In other words, the F statistic is a ratio of two variances (variance is simply a measure of dispersion: it tells how far the data are spread from the mean). The F statistic accounts for the corresponding degrees of freedom when estimating the population variance.

The F statistic is closely related to the t statistic: a t-test tells you whether a single variable is statistically significant, whereas an F test tells you whether a group of variables is jointly statistically significant.

F statistics are based on a ratio of mean squares: the F statistic is the ratio of the mean square for treatment (between groups) to the mean square for error (within groups).

F = MS Between / MS Within

If the calculated F value is greater than the appropriate F critical value (found in a table or provided by software), the null hypothesis can be rejected. (This is how ANOVA reaches its conclusion.)

The calculated F-statistic for a known source of variation is found by dividing the mean square of the known source of variation by the mean square of the unknown source of variation.
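As a sketch of the ratio above, MS Between and MS Within can be computed by hand for a tiny one-way layout. The three groups below are invented illustration data, not from the article:

```python
import numpy as np

# Three hypothetical groups for a one-way ANOVA F statistic.
groups = [np.array([4.0, 5.0, 6.0]),
          np.array([6.0, 7.0, 8.0]),
          np.array([8.0, 9.0, 10.0])]

n_total = sum(len(g) for g in groups)
grand_mean = np.concatenate(groups).mean()

# MS Between: variation of the group means around the grand mean,
# with degrees of freedom = (number of groups - 1).
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ms_between = ss_between / (len(groups) - 1)

# MS Within: pooled variation inside each group,
# with degrees of freedom = (total n - number of groups).
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
ms_within = ss_within / (n_total - len(groups))

f_stat = ms_between / ms_within
print(round(f_stat, 2))  # 12.0 for this data
```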

**When would you use an F Test?**

Different types of F tests exist for different purposes.

- In statistics, an F-test of equality of variances tests the null hypothesis that two normal populations have the same variance.
- An F-test can also test the equality of several means; this is what ANOVA uses to test whether group means are equal.
- The F-test for a linear regression model tests whether any of the independent variables in a multiple linear regression is significant. It indicates whether there is a linear relationship between the dependent variable and at least one of the independent variables.

**Steps to conduct F test**

- Choose the test: note the independent variables and the dependent variable, and verify that the samples can be assumed normally distributed
- Calculate the F statistic: put the highest variance in the numerator and the lowest variance in the denominator, each with degrees of freedom (n - 1)
- Determine the statistical hypothesis
- State the level of significance
- Compute the critical F value from the F table (use α/2 for a two-tailed test)
- Calculate the test statistic
- Finally, draw the statistical conclusion: if F_{calc} > F_{critical}, reject the null hypothesis; if F_{calc} < F_{critical}, fail to reject the null hypothesis
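The steps above can be sketched as a small helper for the two-sample variance case. This is a minimal sketch assuming SciPy is available; the sample data in the usage line are invented for illustration:

```python
import numpy as np
from scipy import stats

def f_test_variances(a, b, alpha=0.05):
    """Two-tailed F test of equal variances, following the steps above."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    va, vb = a.var(ddof=1), b.var(ddof=1)  # sample variances
    # Highest variance in the numerator; degrees of freedom are n - 1.
    if va >= vb:
        f_calc, df1, df2 = va / vb, len(a) - 1, len(b) - 1
    else:
        f_calc, df1, df2 = vb / va, len(b) - 1, len(a) - 1
    # Two-tailed test: use alpha / 2 in the upper tail.
    f_crit = stats.f.ppf(1 - alpha / 2, df1, df2)
    return f_calc, f_crit, f_calc > f_crit  # True means reject H0

# Hypothetical measurements, for illustration only.
f_calc, f_crit, reject_h0 = f_test_variances(
    [3.1, 2.9, 3.4, 3.0, 3.2], [3.0, 3.1, 3.0, 2.9, 3.1])
```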

## What is an Example of an F Test in DMAIC?

In the Measure and Analyze phases of DMAIC, the F test determines whether two independent estimates of population variance differ significantly, or whether two samples have been drawn from normal populations with the same variance.

**Example:** A botanical research team wants to study the growth of plants with the use of urea. The team conducted 8 tests, with a variance of 600, during the initial state; after 6 months, 6 tests were conducted, with a variance of 400. The purpose of the experiment is to determine whether there is any improvement in plant growth after 6 months, at the 95% confidence level.

- Degrees of freedom: ϑ_{1} = 8 - 1 = 7 (highest variance in the numerator)
- ϑ_{2} = 6 - 1 = 5
- Statistical hypothesis:
    - Null hypothesis H_{0}: σ_{1}^{2} ≤ σ_{2}^{2}
    - Alternative hypothesis H_{1}: σ_{1}^{2} > σ_{2}^{2}
- Since the team wants to see an improvement, it is a one-tailed (right) test
- Level of significance: α = 0.05
- Compute the critical F value from the table: 4.88
- Reject the null hypothesis if the calculated F value is greater than or equal to 4.88
- Calculate the F value: F = S_{1}^{2} / S_{2}^{2} = 600/400 = 1.5
- F_{calc} < F_{critical}; hence, fail to reject the null hypothesis

#### p-value

From the F table we can find the critical values of F that leave a certain area to the right. From the table, the area to the right of 4.88 is 0.05 and the area to the right of 3.37 is 0.100, so the area to the right of 1.5 must be more than 0.100. The exact p-value can be found very easily with any statistical tool or Excel.

- Statistical conclusion: the calculated value does not lie in the critical region. Hence, fail to reject the null hypothesis at the 95% confidence level.
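As a check of the numbers above, SciPy (assumed available) can reproduce the critical value and the exact p-value for this example:

```python
from scipy import stats

# Worked example: df1 = 7, df2 = 5, alpha = 0.05, one-tailed (right) test.
f_crit = stats.f.ppf(0.95, 7, 5)   # upper 5% point; table quotes 4.88
p_value = stats.f.sf(1.5, 7, 5)    # exact area to the right of F = 1.5

print(round(f_crit, 2))  # 4.88
print(p_value > 0.10)    # consistent with the table bracketing above
```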

## F Test Sample Questions

In a manufacturing facility, two Six Sigma Green Belts are monitoring parts that run on 2 different stamping presses. Each press runs the same progressive die. Student A says that he is 90% confident that the stamping presses have the same variance, while Student B says that at the 90% confidence level the variances are different. Which student is right? Press 1: s = 0.035, n = 16; Press 2: s = 0.057, n = 10.
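One way to check your work on this question, assuming SciPy is available (this is a two-tailed test at 90% confidence, so α = 0.10 and α/2 = 0.05 goes in the upper tail):

```python
from scipy import stats

# Press 2 has the larger standard deviation, so it goes in the numerator.
s1, n1 = 0.057, 10   # Press 2
s2, n2 = 0.035, 16   # Press 1

f_calc = s1**2 / s2**2                           # ratio of sample variances
f_crit = stats.f.ppf(1 - 0.05, n1 - 1, n2 - 1)   # F(0.05, 9, 15)

# If f_calc exceeds f_crit, the variances differ at the 90% confidence level.
print(round(f_calc, 2), f_calc > f_crit)
```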

## Comments (6)

How to find p-value say if my F- Test value is 9.46

What have you tried to do, Afeef?

to calculate with the variances … the highest goes first always (in the numerator)?

I’ll follow up with you in the member area, Rodney.

How to find p-value if my F-test value is 9.48

What have you tried so far, Ajith?