Two sample t tests are hypothesis tests used to compare two process means. They are based on the t distribution.

Two Sample T Hypothesis Test

What is a Two Sample T Hypothesis Test?

A two sample t test is used to analyze the difference between two independent population means. It is typically used when two small samples (n < 30) are taken from two different populations and compared.

Assumptions of Two Sample T Hypothesis Test

  • The samples should be randomly selected from the two populations
  • The samples are independent of each other
  • The data should be continuous

When Would You Use a Two Sample T Hypothesis Test?

The two sample t test is most often used to compare two process means when the data has one nominal variable (the group) and one measurement variable. It is a hypothesis test of means.

The two sample t test is used to compare two population means, while analysis of variance (ANOVA) is the better option if more than two group means are to be compared.

The two sample t test is performed when the two samples are statistically independent of each other, while the paired t-test is used to compare the means of two dependent or paired groups.

Note: There are two types of Two Sample T Hypothesis tests:

  • Two Sample T Hypothesis Test (Equal Variance)
    • Variances of the two populations are equal
  • Two Sample T Hypothesis Test (Unequal Variance)
    • Variances of the two populations are NOT equal

Steps to Calculate Two Sample T Hypothesis Test (Equal Variance)

  • State the claim of the test and determine the null hypothesis and alternative hypothesis
  • Determine the level of significance
  • Calculate degrees of freedom
  • Find the critical value
  • Calculate the test statistic
    • t = (X Bar a – X Bar b) / (Sp × SQRT(1/na + 1/nb))
    • Sp = SQRT( {(na – 1) S²a + (nb – 1) S²b} / (na + nb – 2) ), where Sp is the pooled standard deviation
  • Make a decision: for a two-tailed test, the null hypothesis is rejected if the absolute value of the test statistic is greater than the critical value
  • Finally, interpret the decision in the context of the original claim.
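
The steps above can be sketched in Python. This is a minimal illustration, assuming only summary statistics (mean, standard deviation, sample size) are available for each sample and using the standard pooled-variance formulas. The function name `pooled_t_test` and its parameter names are hypothetical, and the critical value is still looked up separately in a t table.

```python
import math


def pooled_t_test(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (t, df, sp) for an equal-variance two sample t test.

    Hypothetical helper: takes summary statistics for samples a and b.
    """
    # Degrees of freedom for the pooled test
    df = n_a + n_b - 2
    # Pooled standard deviation Sp
    sp = math.sqrt(((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df)
    # Test statistic
    t = (mean_a - mean_b) / (sp * math.sqrt(1 / n_a + 1 / n_b))
    return t, df, sp
```

For a two-tailed test, the null hypothesis would be rejected when the absolute value of the returned `t` exceeds the table value for `df` degrees of freedom.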

Example of a Two Sample T Hypothesis Test (Equal Variance) in a DMAIC Project

The two sample t test is mostly performed in the Analyze phase of DMAIC to evaluate whether the difference between two process means is statistically significant or due to random chance. It is used to validate the root cause(s) or critical Xs (see the example below for more detail).

An apple orchard owner wants to compare two farms to see if there is any difference in the weight of the apples. From farm A, 15 randomly collected apples have an average weight of 86 g and a standard deviation of 7 g. From farm B, 10 collected apples have an average weight of 80 g and a standard deviation of 8 g. At a 95% confidence level, is there any difference between the farms?

  • Null Hypothesis (H0) : Mean apple weight of farm A is equal to farm B
  • Alternative Hypothesis (H1) : Mean apple weight of farm A is not equal to farm B

Significance level: α=0.05

Degrees of freedom df: 10+15-2= 23

Calculate critical value

For a two-tailed t test with α = 0.05 and df = 23, the critical value from the t-distribution table is 2.069.

If the calculated t value is less than -2.069 or greater than 2.069, then reject the null hypothesis.

Test Statistic

The calculated t statistic (about 1.98) is less than the critical value (2.069), so we fail to reject the null hypothesis (H0). There is no statistically significant difference between the mean weights of apples in farm A and farm B.
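
The test statistic for this example can be worked out from the summary statistics. The following is a sketch using the standard pooled-variance formulas; the subscripts A and B for the two farms are notation introduced here.

```latex
\begin{aligned}
S_p &= \sqrt{\frac{(n_A - 1)s_A^2 + (n_B - 1)s_B^2}{n_A + n_B - 2}}
     = \sqrt{\frac{14(7^2) + 9(8^2)}{23}}
     = \sqrt{\frac{1262}{23}} \approx 7.41 \\
t &= \frac{\bar{x}_A - \bar{x}_B}{S_p\sqrt{\frac{1}{n_A} + \frac{1}{n_B}}}
   = \frac{86 - 80}{7.41\sqrt{\frac{1}{15} + \frac{1}{10}}}
   \approx \frac{6}{3.02} \approx 1.98
\end{aligned}
```

Since 1.98 lies between -2.069 and 2.069, the statistic does not fall in the rejection region.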

2 SAMPLE T-TEST FOR MEANS (UNEQUAL VARIANCE)

Test for equality of two means. This test is used to determine whether the mean of one sample differs from the mean of another sample when the two variances cannot be assumed equal.

If the test is for determining whether the mean of one sample is different from that of another sample, the hypotheses are specified as a two-tailed test:

  • H0: μa – μb = d
  • H1: μa – μb != d

If the test is for determining whether the mean of one sample is larger or smaller than that of another sample, the hypotheses are specified as a one-tailed test:

  • H0: μa – μb = d
  • H1: μa – μb > d

Or,

  • H0: μa – μb = d
  • H1: μa – μb < d

d = hypothesized difference between the means of the two samples.

If the question to be answered is whether there is “any difference” between the mean of sample “a” and the mean of sample “b”, then d is set to 0.

ASSUMPTIONS:

  • Observations (data):
    • Are Normally distributed. 
      • Note: Although this method is relatively robust against non-normality the shape of distribution should at least be symmetric.
    • Within and between samples (groups) are independent of each other.
    • Are randomly “drawn”.
    • Are continuous.
      • Depending on some conditions (e.g. wide data range and field of application), data that is not continuous (e.g. counts) may still be used.
      • Do not use certain ratio-type data (e.g. proportions and percentages).
  • Variances are unknown and estimated by sample variances (S²).
  • Variances are not equal, S²a ≠ S²b

STEPS & FORMULA:

Step 1 – Calculate mean (X Bar) of each sample

  • X Bar = (x1 + x2 + … + xn) / n
  • n is the number of observations (data) in the sample (“na” = sample size in sample a and “nb” = sample size in sample b)
  • x1, x2, … xn are the observations in each sample

Step 2 – Calculate variance (S2) of each sample

  • S² = {(X Bar – x1)² + (X Bar – x2)² + … + (X Bar – xn)²} / (n – 1)
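
Steps 1 and 2 can be sketched in Python as below. The sample values are hypothetical and made up purely for illustration. The manual calculation uses the n – 1 divisor, which matches what the standard library's `statistics.variance` computes.

```python
import statistics

# Hypothetical sample, made up purely for illustration
sample = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(sample)
# Step 1: sample mean
x_bar = sum(sample) / n
# Step 2: sample variance with the n - 1 divisor
s2 = sum((x_bar - x) ** 2 for x in sample) / (n - 1)

# The standard library gives the same results
assert x_bar == statistics.mean(sample)
assert abs(s2 - statistics.variance(sample)) < 1e-12

print(x_bar, round(s2, 3))  # 5.0 4.571
```

Note that `statistics.pvariance` divides by n instead and would return 4.0 for this data, so it is not the function to use for S².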

Step 3 – Calculate t value

  • T-calc = {(X Bar a – X Bar b) – d} / SQRT( (S²a / na) + (S²b / nb) )
  • X Bar a & X Bar b are means of sample a and sample b
    • For practical reasons, (X Bar a – X Bar b) can be stated in the absolute form |X Bar a – X Bar b|
  • d is the hypothesized difference
  • S²a & S²b are variances of sample a and sample b
  • na & nb are sample sizes of sample a and sample b
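
Step 3 might be written in Python as follows. This is a sketch only: the helper name `welch_t` is hypothetical, and the numbers reuse the apple summary statistics from the equal-variance example (standard deviations 7 and 8, so variances 49 and 64) purely for illustration.

```python
import math


def welch_t(mean_a, var_a, n_a, mean_b, var_b, n_b, d=0.0):
    """t-calc for the unequal-variance test (hypothetical helper).

    var_a and var_b are the sample variances of samples a and b;
    d is the hypothesized difference between the means.
    """
    standard_error = math.sqrt(var_a / n_a + var_b / n_b)
    return ((mean_a - mean_b) - d) / standard_error


# Illustration only: apple summary statistics from the earlier example
t_calc = welch_t(86, 7**2, 15, 80, 8**2, 10)
print(round(t_calc, 2))  # 1.93
```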

Step 4 – Calculate degrees of freedom (DF)

  • DF = (S²a / na + S²b / nb)² / { [ (S²a / na)² / (na – 1) ] + [ (S²b / nb)² / (nb – 1) ] }
  • Round to nearest integer
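
The degrees of freedom formula in Step 4 is easy to mistype, so a short Python sketch may help. The helper name `welch_df` is hypothetical, and the inputs again reuse the apple summary statistics (variances 49 and 64, sample sizes 15 and 10) for illustration only.

```python
def welch_df(var_a, n_a, var_b, n_b):
    """Degrees of freedom for the unequal-variance test (hypothetical helper)."""
    a = var_a / n_a
    b = var_b / n_b
    return (a + b) ** 2 / (a**2 / (n_a - 1) + b**2 / (n_b - 1))


df = welch_df(7**2, 15, 8**2, 10)
print(round(df, 2), round(df))  # 17.59 18
```

The result is smaller than the 23 degrees of freedom of the pooled test on the same data.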

Step 5 – Compare t-calc with critical value in the t-distribution table.

  • Look up the t-critical value in the t-distribution table, using the calculated degrees of freedom and the column for α (significance level):
    • For a 1 tailed test the t-critical is based on the given α
      • The commonly used T-critical value for a 1 tailed test is in the 0.05 α column
    • For a 2 tailed test the t-critical is based on the given α divided by 2
      • The commonly used T-critical value for a 2 tailed test is in the 0.025 α column
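
The column choice in Step 5 can be illustrated with a tiny lookup in Python. This is a sketch under stated assumptions: the dictionary holds only four standard t-table entries (for 18 and 23 degrees of freedom), and the helper name `t_critical` is hypothetical. A real analysis would use a full table or a statistics package.

```python
# A few standard one-tail t-table entries, keyed by (df, alpha column)
T_TABLE = {
    (18, 0.05): 1.734,
    (18, 0.025): 2.101,
    (23, 0.05): 1.714,
    (23, 0.025): 2.069,
}


def t_critical(df, alpha, two_tailed):
    """Look up t-critical (hypothetical helper; the table is deliberately tiny)."""
    # A 2 tailed test splits alpha across both tails
    column = alpha / 2 if two_tailed else alpha
    return T_TABLE[(df, round(column, 3))]


print(t_critical(23, 0.05, two_tailed=True))   # 2.069
print(t_critical(23, 0.05, two_tailed=False))  # 1.714
```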

INTERPRETATION:

  • If |t-calc| > t-critical, there is a statistically significant difference (≠, > or <) between the mean of sample a and sample b. For a one-tailed test, the difference must also be in the direction stated in H1.
  • If |t-calc| ≤ t-critical, there is insufficient evidence that the means differ.
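
The decision rule can be expressed as a small Python function. This is a sketch, not a definitive implementation: the name `reject_null` and the `tail` labels are hypothetical, and `t_critical` is assumed to be the positive table value for the chosen α column.

```python
def reject_null(t_calc, t_critical, tail="two"):
    """Return True if H0 is rejected (hypothetical helper).

    tail: "two" (H1: not equal), "upper" (H1: >) or "lower" (H1: <).
    t_critical is the positive table value for the chosen alpha column.
    """
    if tail == "two":
        return abs(t_calc) > t_critical
    if tail == "upper":
        return t_calc > t_critical
    if tail == "lower":
        return t_calc < -t_critical
    raise ValueError("tail must be 'two', 'upper' or 'lower'")


# Equal-variance apple example: t-calc of about 1.98 (computed from its
# summary statistics) against the two-tailed critical value 2.069
print(reject_null(1.98, 2.069))  # False
```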

Comments (5)

In the blood pressure question, can you please explain how you got 7.3 for s? No matter what I do, I am always getting to 7.08

Jeremy,

I added additional detail in the calculation steps.

For these equations with so many variables I find it helpful to go slowly and write out the smaller operations of each part of the calculation.

Does this make sense?

Hi,

the above states formula for Sample Variation as
S2 = {( X Bar– x1)2 + (X Bar – x2)2 + … +(X Bar – xn)2} / n

however the IASSC Reference document is stating

S2 = {( X Bar– x1)2 + (X Bar – x2)2 + … +(X Bar – xn)2} / n-1

Could you please clarify

Thanks
Maria

Thank you Maria,

When we divide by n in the sample variance S2, it is not an unbiased estimate of the population variance. Hence it is always recommended to use n-1 instead of n.

I have updated the formula

Thanks
