What is the F-Distribution?

The F-distribution, also known as the Fisher-Snedecor distribution, is widely used to test the equality of variances of two normal populations. It is named after R.A. Fisher, who initially developed the concept in the 1920s. It is the probability distribution of the F-statistic.

The F-distribution is generally skewed and is related to the chi-squared distribution. Specifically, it is the distribution of the ratio of two independent chi-square random variables, X1 with ϑ1 degrees of freedom and X2 with ϑ2 degrees of freedom, where each chi-square variable has first been divided by its degrees of freedom: F = (X1/ϑ1) / (X2/ϑ2).

The shape of the distribution depends on the degrees of freedom of numerator ϑ1 and denominator ϑ2.  
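The ratio definition can be checked with a small simulation. The sketch below is only an illustration, not part of the original article: the degrees of freedom (5 and 20), the seed, and the helper name sample_f are assumptions chosen for the example. It draws chi-square values through the standard library's gamma generator (a chi-square variable with k degrees of freedom is a gamma variable with shape k/2 and scale 2), forms the scaled ratio, and compares the simulated mean with the known mean ϑ2 / (ϑ2 − 2).

```python
import random
import statistics


def sample_f(d1, d2, rng):
    """Draw one F(d1, d2) value as a ratio of scaled chi-square draws."""
    # A chi-square variable with k degrees of freedom is Gamma(shape=k/2, scale=2).
    x1 = rng.gammavariate(d1 / 2, 2.0)
    x2 = rng.gammavariate(d2 / 2, 2.0)
    return (x1 / d1) / (x2 / d2)


rng = random.Random(42)  # fixed seed so the run is repeatable
d1, d2 = 5, 20           # illustrative degrees of freedom (assumed, not from the article)
draws = [sample_f(d1, d2, rng) for _ in range(100_000)]

sim_mean = statistics.fmean(draws)
sim_median = statistics.median(draws)
theory_mean = d2 / (d2 - 2)  # mean of F(d1, d2), valid only when d2 > 2

print(f"simulated mean   = {sim_mean:.3f}")
print(f"theoretical mean = {theory_mean:.3f}")
print(f"simulated median = {sim_median:.3f}  (below the mean: right skew)")
print(f"smallest draw    = {min(draws):.4f}  (never negative)")
```

The mean sitting above the median is one simple sign of the right skew described in the next section.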

What are the properties of an F-distribution?

  • The F-distribution curve is positively skewed (skewed to the right), with a range from 0 to ∞.
  • The value of F is always positive or zero; it is never negative.
  • The shape of the distribution depends on the degrees of freedom of the numerator ϑ1 and the denominator ϑ2.
  • The degree of skewness decreases as the degrees of freedom of the numerator and denominator increase.
  • The F-distribution curve is never exactly symmetrical, but it becomes closer to symmetrical as the degrees of freedom increase.

When would you use the F-distribution?

The F-distribution is used in the F-test, which compares the means of multiple groups across the levels of an independent variable. It is most commonly applied in ANOVA calculations, where the F-test is the standard way to compare more than two groups.

Example: In a manufacturing unit, torque values are key parameters in terminal squeeze welding. To check whether different torque (Nm) settings have a significant effect on the squeeze welding, an operator set up trials at 5 Nm, 8 Nm, 10 Nm, and 12 Nm on terminals in four randomly selected batches of 30. ANOVA can determine whether the means of these four trials are different. ANOVA uses F-tests to statistically test the equality of means.

Assumptions of F-distribution

  • Assumes both populations are normally distributed.
  • Both populations are independent of each other.
  • By convention, the larger sample variance goes in the numerator so that the test is right-tailed, which is simpler to evaluate with standard F tables.

What is an F-Test?

The F-test determines whether two independent estimates of population variance differ significantly. In this case, the F-ratio is:

F = σ₁² / σ₂²

or

whether two samples drawn from normal populations have the same variance. In this case, the F-ratio is:

F = S₁² / S₂²

In both cases, σ₁² > σ₂² and S₁² > S₂²; in other words, the larger estimate of variance always goes in the numerator and the smaller estimate of variance in the denominator.

Degrees of freedom (ϑ)

  • DF of larger variance (i.e., numerator) = n1 − 1
  • DF of smaller variance (i.e., denominator) = n2 − 1
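As a rough sketch of the ratio and its degrees of freedom, the snippet below computes F from two samples and always places the larger sample variance in the numerator. The function name variance_ratio and the measurements are made up for illustration; they do not come from the article.

```python
import statistics


def variance_ratio(sample_a, sample_b):
    """Return (F, numerator df, denominator df) with the larger variance on top."""
    var_a = statistics.variance(sample_a)  # sample variance, n - 1 in the divisor
    var_b = statistics.variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1


# Made-up measurements, for illustration only.
sample_a = [10, 12, 14, 16, 18]
sample_b = [9, 10, 11, 12, 13, 14]

f_ratio, df_num, df_den = variance_ratio(sample_a, sample_b)
print(f"F = {f_ratio:.3f} with ({df_num}, {df_den}) degrees of freedom")
```

Swapping the two arguments gives the same result, because the function orders the variances itself rather than relying on the caller.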

What is an F-Statistic?

The F-statistic, also known as the F-value, is used in ANOVA and regression analysis to identify whether the means of different populations are significantly different. In other words, the F-statistic is the ratio of two variances (variance is a measure of dispersion; it tells how far the data are spread from the mean). The F-statistic takes the corresponding degrees of freedom into account when estimating the population variance.

The F-statistic is similar to the t-statistic. A t-test assesses whether a single variable is statistically significant, whereas an F-test assesses whether a group of variables is jointly significant.

F-statistics are based on the ratio of mean squares: the F-statistic is the ratio of the mean square for treatment (between groups) to the mean square for error (within groups).

F = MS Between / MS Within
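A minimal sketch of this ratio for a one-way layout is shown below. The helper name anova_f and the three small groups are hypothetical and exist only to make the arithmetic easy to follow; real data would replace them.

```python
import statistics


def anova_f(groups):
    """Return (F, df between, df within) for a one-way layout."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups
    )
    # Within-group sum of squares: observations around their own group mean.
    ss_within = sum(
        sum((x - statistics.fmean(g)) ** 2 for x in g) for g in groups
    )

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within


# Three small made-up groups, for illustration only.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_value, df_b, df_w = anova_f(groups)
print(f"F = {f_value:.2f} with ({df_b}, {df_w}) degrees of freedom")
```

For these groups the between-group mean square is 13 and the within-group mean square is 1, so F comes out at 13 with (2, 6) degrees of freedom.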

If the calculated F-value is greater than the F critical value (found in a table or provided by software), the null hypothesis can be rejected. This comparison is the decision rule used in ANOVA.

The calculated F-statistic for a known source of variation is found by dividing the mean square of the known source of variation by the mean square of the unknown source of variation.

When would you use an F-Test?

There are different types of F tests, each for a different purpose.

  • In statistics, an F-test of equality of variances is a test for the null hypothesis that two normal populations have the same variance.
  • The F-test is used to test the equality of several means; this is the test that ANOVA performs.
  • F-tests for linear regression models test whether any of the independent variables in a multiple linear regression are significant. A significant result indicates a linear relationship between the dependent variable and at least one of the independent variables.

Steps to Conduct an F-test

  • Choose the test: Note the independent and dependent variables and assume the samples are normally distributed.
  • Identify the larger and smaller sample variances: the larger goes in the numerator and the smaller in the denominator, each with n − 1 degrees of freedom.
  • Determine the statistical hypothesis.
  • State the level of significance.
  • Compute the critical F value from the F table. (use α/2 for a two-tailed test)
  • Calculate the test statistic.
  • Finally, draw the statistical conclusion. Reject the null hypothesis if the test statistic falls in the critical region.

Examples of Using an F-Test in a DMAIC Project

In the Measure and Analyze phases of DMAIC, the F-test is used to determine whether two independent estimates of population variance differ significantly, or whether two samples drawn from normal populations have the same variance.

Using an F-Test in Hypothesis Testing

A statistical hypothesis is an assumption about a population parameter. The assumption may or may not be true. A one-tailed test has its area of rejection in only one direction, whereas a two-tailed test has areas of rejection in both directions. The choice between a one-tailed and a two-tailed test depends on the problem.

In the F-sampling distribution, F is calculated by dividing the variance of one sample by the other sample’s variance.

For the right-tailed and two-tailed tests, keep the highest variance as the numerator and the lowest variance as the denominator. So, the numerator’s degree of freedom is the degree of freedom for the sample with the highest variance. The degree of freedom for the denominator is the degree of freedom for the sample with the lowest variance.

Meanwhile, for the left-tailed test, keep the lowest variance as the numerator and the highest variance as the denominator. So, the numerator’s degree of freedom is the degree of freedom for the sample with the lowest variance. The degree of freedom for the denominator is the degree of freedom for the sample with the highest variance.

Using an F-Test in Hypothesis testing: Right-tailed test

Example: A botanical research team wants to study the growth of plants using urea. The team conducted eight tests with a variance of 600 during the initial state, and after six months, six tests were conducted with a variance of 400. The experiment aims to know if there is any improvement in plant growth after six months at a 95% confidence level.

  • Degrees of freedom ϑ1 = 8 -1 = 7 (highest variance in the numerator)
  • ϑ2 = 6 – 1 = 5
  • Statistical hypothesis:
    • Null hypothesis H0: σ₁² ≤ σ₂²
    • Alternative hypothesis H1: σ₁² > σ₂²
  • Since the team wants to see the improvement, it is a one-tail (right) test
  • Level of significance α= 0.05
  • Critical F from the table for α = 0.05 with (7, 5) degrees of freedom = 4.88
  • Reject the null hypothesis if the calculated F-value is greater than or equal to 4.88
  • Calculate the F-value: F = S₁²/S₂² = 600/400 = 1.5
  • Fcalc < Fcritical; hence, fail to reject the null hypothesis
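The arithmetic of this example can be retraced in a few lines. This is just a sketch using the figures quoted in the example; the variable names are illustrative, and the critical value is typed in from the F table rather than computed.

```python
# Figures taken from the botanical example.
n1, var1 = 8, 600.0   # initial tests (larger variance, numerator)
n2, var2 = 6, 400.0   # tests after six months (denominator)
f_critical = 4.88     # table value for alpha = 0.05 with (7, 5) degrees of freedom

df1, df2 = n1 - 1, n2 - 1
f_calc = var1 / var2

# Right-tailed rule: reject H0 when the calculated F reaches the critical value.
reject_null = f_calc >= f_critical
print(f"F = {f_calc} with ({df1}, {df2}) df; reject H0: {reject_null}")
```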

p-value

The F table gives critical values that leave a certain area to the right. For (7, 5) degrees of freedom, the area to the right of 4.88 is 0.05, and the area to the right of 3.37 is 0.100. So, the area to the right of 1.5 must be more than 0.100. However, we can easily find the exact p-value using any statistical tool or Excel.
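For readers without a statistics package, the right-tail area can also be approximated in plain Python. The sketch below is one possible approach, not the method used by the article or by Excel: it relies on the standard identity that the area to the right of F equals the regularized incomplete beta function I_x(ϑ2/2, ϑ1/2) with x = ϑ2 / (ϑ2 + ϑ1·F), and evaluates that function with a continued fraction. The function names are assumptions made for this example.

```python
from math import exp, lgamma, log


def _beta_cf(a, b, x, max_iter=300, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((a - 1.0 + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (a + b + m) * x / ((a + m2) * (a + 1.0 + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b)
                + a * log(x) + b * log(1.0 - x))
    # Use the symmetry relation where the continued fraction converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_right_tail(f_value, df_num, df_den):
    """Area to the right of f_value under the F(df_num, df_den) curve."""
    x = df_den / (df_den + df_num * f_value)
    return reg_inc_beta(df_den / 2.0, df_num / 2.0, x)


p_value = f_right_tail(1.5, 7, 5)
print(f"p-value for F = 1.5 with (7, 5) df: {p_value:.3f}")
print(f"area right of 4.88: {f_right_tail(4.88, 7, 5):.3f}")
print(f"area right of 3.37: {f_right_tail(3.37, 7, 5):.3f}")
```

The last two lines are a sanity check against the table values quoted above: the areas should come out close to 0.05 and 0.100.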

How to calculate p-value in Excel

  • Enter numerator degrees of freedom in the B1 cell.
  • Then, enter the denominator degrees of freedom value in the B2 cell.
  • Enter the calculated F value in the B3 cell.
  • Now, calculate the p-value with the FDIST function. In the B4 cell, type =FDIST(B3, B1, B2) and press Enter to get the exact p-value for the right-tailed test. (Newer versions of Excel also provide the equivalent F.DIST.RT function.)

Interpret the results:

Compare the calculated F to the critical F. In hypothesis testing, a critical value is a point on the test distribution that is compared to the test statistic to determine whether to reject the null hypothesis. Since the calculated F-value (1.5) is less than the critical F-value (4.88), it does not fall in the rejection region. Hence, we fail to reject the null hypothesis at the 95% confidence level.
