Two-sample t-tests are hypothesis tests used to compare two process means. The test is based on the t distribution.

## Two Sample T Hypothesis Test

### What is a Two Sample T Hypothesis Test?

A two-sample t-test is used to analyze the difference between two independent population means. It is typically used when two small samples (n < 30) are taken from two different populations and compared.

### Assumptions of Two Sample T Hypothesis Test

- The samples should be randomly selected from the two populations
- The samples are independent of each other
- The data should be continuous

### When Would You Use a Two Sample T Hypothesis Test?

The two-sample t-test is most often used to compare two process means when the data has one nominal variable and one measurement variable. It is a hypothesis test of means.

The two-sample t-test compares two population means, while analysis of variance (ANOVA) is the better option when more than two group means are to be compared.

The two-sample t-test is performed when the two samples are statistically independent of each other, while the paired t-test is used to compare the means of two dependent or paired groups.

### Note: There are (2) types of Two Sample T Hypothesis tests!

- Two Sample T Hypothesis Test (Equal Variance)
  - The variances of the two populations are equal
- Two Sample T Hypothesis Test (Unequal Variance)
  - The variances of the two populations are NOT equal

## Steps to Calculate Two Sample T Hypothesis Test (Equal Variance)

- State the claim of the test and determine the null hypothesis and alternative hypothesis
- Determine the level of significance
- Calculate degrees of freedom
- Find the critical value
- Calculate the test statistic, using the pooled standard deviation $S_p$
- Make a decision: for a two-tailed test, reject the null hypothesis if the absolute value of the test statistic is greater than the critical value
- Finally, interpret the decision in the context of the original claim.
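
For reference, the pooled-variance test statistic used in these steps is conventionally written as shown below. The symbols $\bar{x}_1, \bar{x}_2$ (sample means), $s_1, s_2$ (sample standard deviations) and $n_1, n_2$ (sample sizes) are notation assumed here, since the formula itself does not appear in the text.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{S_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
S_p = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}},
\qquad
df = n_1 + n_2 - 2
```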

### Example of a Two Sample T Hypothesis Test (Equal Variance) in a DMAIC Project

The two-sample t-test is mostly performed in the Analyze phase of DMAIC to evaluate whether the difference between two process means is really significant or due to random chance. It is typically used to validate the root cause(s) or critical Xs (see the example below for more detail).

An apple orchard owner wants to compare two farms to see if there is any difference in the weight of the apples. From farm A, 15 randomly collected apples have an average weight of 86 g and a standard deviation of 7 g. From farm B, 10 apples have an average weight of 80 g and a standard deviation of 8 g. At a 95% confidence level, is there any difference between the farms?

- Null Hypothesis ($H_0$): Mean apple weight of farm A is equal to farm B
- Alternative Hypothesis ($H_1$): Mean apple weight of farm A is not equal to farm B

Significance level: α=0.05

Degrees of freedom df: 10+15-2= 23

Calculate critical value

If the calculated t value is less than -2.069 or greater than 2.069, then reject the null hypothesis.

#### Test Statistic

The calculated t statistic is less than the critical value, so we fail to reject the null hypothesis ($H_0$). There is no significant difference between the mean weights of apples in farm A and farm B.
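
The arithmetic for the apple example can be checked with a short script. This is a minimal sketch, assuming the standard pooled-variance formula; the function name `pooled_t` is made up for illustration, and the critical value 2.069 is taken from the text rather than computed.

```python
import math


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Equal-variance two-sample t statistic from summary statistics.

    Returns (t, df, sp), where sp is the pooled standard deviation.
    """
    df = n_a + n_b - 2
    # Pooled standard deviation: weighted average of the two sample variances
    sp = math.sqrt(((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df)
    t = (mean_a - mean_b) / (sp * math.sqrt(1 / n_a + 1 / n_b))
    return t, df, sp


# Farm A: n = 15, mean = 86, sd = 7; Farm B: n = 10, mean = 80, sd = 8
t_stat, df, sp = pooled_t(86, 7, 15, 80, 8, 10)

t_crit = 2.069  # two-tailed critical value for alpha = 0.05, df = 23 (from the text)
reject = abs(t_stat) > t_crit

print(f"Sp = {sp:.2f}, df = {df}, t = {t_stat:.2f}, reject H0: {reject}")
```

The t statistic comes out at roughly 1.98, which lies inside the interval from -2.069 to 2.069, matching the decision to not reject the null hypothesis.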


### Additional Two Sample T Hypothesis Test Resources

A good example of two-sample t-tests is available here:

- http://www.cliffsnotes.com/math/statistics/univariate-inferential-tests/two-sample-t-test-for-comparing-two-means (Two-sample t-test for comparing two means. DF for separate s: the smaller of $n_1 - 1$ and $n_2 - 1$. DF for pooled s: $df = n_1 + n_2 - 2$.)

**2 SAMPLE T-TEST FOR MEANS (UNEQUAL VARIANCE)**

Test for equality of two means. This test is used to determine if the mean of one sample is not equal to the mean of another sample.

If the test is for determining whether the mean of one sample is different from that of another sample, the test hypothesis is specified as a two-tailed test:

- $H_0$: $\mu_a - \mu_b = d$
- $H_1$: $\mu_a - \mu_b \ne d$

If the test is for determining whether the mean of one sample is larger or smaller than that of another sample, the test hypothesis is specified as a one-tailed test:

- $H_0$: $\mu_a - \mu_b = d$
- $H_1$: $\mu_a - \mu_b > d$

Or,

- $H_0$: $\mu_a - \mu_b = d$
- $H_1$: $\mu_a - \mu_b < d$

d = hypothesized difference between the means of the two samples.

If the question to be answered is “any difference” between the mean of sample “a” and the mean of sample “b”, then d is set to 0.

**ASSUMPTIONS:**

- Observations (data):
  - Are Normally distributed.
    - Note: Although this method is relatively robust against non-normality, the shape of the distribution should at least be symmetric.
  - Within and between samples (groups) are independent of each other.
  - Are randomly “drawn”.
  - Are continuous.
    - Depending on some conditions (e.g. wide data range and field of application), data that is not continuous (e.g. count) *may still* be used.
    - Not to use certain ratio type data (e.g. proportions and percentages).
- Variances are unknown and estimated by sample variances ($S^2$).
- Variances are not equal, $S_a^2 \ne S_b^2$.

**STEPS & FORMULA:**

#### Step 1 – Calculate mean (X Bar) of each sample

- $\bar{X} = (x_1 + x_2 + \dots + x_n) / n$
- $n$ is the number of observations (data)
- $x_1, x_2, \dots, x_n$ are the observations in each sample ($n_a$ = sample size of sample a and $n_b$ = sample size of sample b)

#### Step 2 – Calculate variance ($S^2$) of each sample

- $S^2 = \{(\bar{X} - x_1)^2 + (\bar{X} - x_2)^2 + \dots + (\bar{X} - x_n)^2\} / (n - 1)$
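
Steps 1 and 2 can be sketched with Python's standard `statistics` module. The data values below are hypothetical and chosen only for illustration; `statistics.variance` divides by n - 1, matching the formula above.

```python
import statistics

# Hypothetical observations for one sample (illustrative values only)
sample_a = [84.0, 91.0, 79.0, 88.0, 86.0]

# Step 1: sample mean (X Bar)
mean_a = statistics.mean(sample_a)

# Step 2: sample variance (S^2), using the n - 1 denominator
var_a = statistics.variance(sample_a)

# The same variance written out term by term, as in the formula
manual_var_a = sum((mean_a - x) ** 2 for x in sample_a) / (len(sample_a) - 1)

print(f"mean = {mean_a:.2f}, variance = {var_a:.2f}, manual = {manual_var_a:.2f}")
```

Using `statistics.pvariance` instead would divide by n, which is the population variance and not the estimate this test calls for.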

#### Step 3 – Calculate t value

- T-calc = $\dfrac{(\bar{X}_a - \bar{X}_b) - d}{\sqrt{S_a^2 / n_a + S_b^2 / n_b}}$
- $\bar{X}_a$ and $\bar{X}_b$ are the means of sample a and sample b
  - For practical reasons, $(\bar{X}_a - \bar{X}_b)$ can be stated in the absolute form $|\bar{X}_a - \bar{X}_b|$
- $d$ is the hypothesized difference
- $S_a^2$ and $S_b^2$ are the variances of sample a and sample b
- $n_a$ and $n_b$ are the sample sizes of sample a and sample b

#### Step 4 – Calculate degrees of freedom (DF)

- DF = $\dfrac{\left(S_a^2 / n_a + S_b^2 / n_b\right)^2}{\dfrac{(S_a^2 / n_a)^2}{n_a - 1} + \dfrac{(S_b^2 / n_b)^2}{n_b - 1}}$
- Round to nearest integer

#### Step 5 – Compare t-calc with critical value in the t-distribution table.

- Look up the t-critical value in the t-distribution table, given the calculated degrees of freedom, in the column for α (significance level):
  - For a 1 tailed test the t-critical is based on the given α
  - For a 2 tailed test the t-critical is based on the given α divided by 2
    - The commonly used t-critical value for a 2 tailed test is in the 0.025 α column

**INTERPRETATION:**

- If t-calc > t-critical, then there is a statistically significant difference (≠, > or <) between the mean of sample a and sample b
- If t-calc < t-critical, there is insufficient evidence that the means differ.
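
Steps 3 to 5 can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `welch_t` is made up, the inputs reuse the apple summary statistics from the equal-variance example purely as sample numbers, and the critical value 2.101 is assumed from a standard t table (df = 18, 0.025 column) rather than computed.

```python
import math


def welch_t(mean_a, var_a, n_a, mean_b, var_b, n_b, d=0.0):
    """Unequal-variance two-sample t statistic and degrees of freedom.

    var_a and var_b are sample variances (S^2); d is the hypothesized difference.
    """
    se2_a = var_a / n_a  # S^2_a / n_a
    se2_b = var_b / n_b  # S^2_b / n_b
    # Step 3: t-calc
    t_calc = ((mean_a - mean_b) - d) / math.sqrt(se2_a + se2_b)
    # Step 4: degrees of freedom, before rounding
    df = (se2_a + se2_b) ** 2 / (
        se2_a**2 / (n_a - 1) + se2_b**2 / (n_b - 1)
    )
    return t_calc, df


# Summary statistics reused from the apple example (sd 7 and 8, so variances 49 and 64)
t_calc, df = welch_t(86, 7**2, 15, 80, 8**2, 10)
df_rounded = round(df)  # Step 4: round to nearest integer

# Step 5: two-tailed comparison; 2.101 is an assumed table value for df = 18
t_crit = 2.101
significant = abs(t_calc) > t_crit

print(f"t-calc = {t_calc:.2f}, DF = {df:.2f} (rounded {df_rounded}), "
      f"significant: {significant}")
```

With these inputs the degrees of freedom are lower than the 23 used by the pooled test, so the critical value is slightly larger and the unequal-variance test is a little more conservative.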

## Comments (5)

In the blood pressure question, can you please explain how you got 7.3 for s? No matter what I do, I am always getting to 7.08

Thanks for the head’s up, Jeremy. I see an opportunity for improvement on both of the examples listed. I’ll update asap.

Jeremy,

I added additional detail in the calculation steps.

For these equations with so many variables I find it helpful to go slowly and write out the smaller operations of each part of the calculation.

Does this make sense?

Hi,

the above states formula for Sample Variation as

S2 = {( X Bar– x1)2 + (X Bar – x2)2 + … +(X Bar – xn)2} / n

however the IASSC Reference document is stating

S2 = {( X Bar– x1)2 + (X Bar – x2)2 + … +(X Bar – xn)2} / n-1

Could you please clarify

Thanks

Maria

Thank you Maria,

When we divide by n in the sample variance S2, it is not an unbiased estimate of the population variance. Hence it is always recommended to use n-1 instead of n.

I have updated the formula

Thanks