One and Two Sample T tests are used to resolve hypothesis tests around comparing process means. The underlying chart makes use of the T distribution.

## One Sample T Hypothesis Test

### What is a One Sample T Hypothesis Test?

The one sample t test compares a sample mean to a hypothesized or known population mean to determine whether they are significantly different. For example, if we know the average weight of chickens on a farm is 3 lb, we may wish to compare the average weight of a sample of black hens to that population mean.

### One Sample T Test Hypothesis

Null hypothesis (H_{0}): The difference between population mean and the hypothesized value is equal to zero

Alternative hypothesis (H_{1}):

- The population mean is not equal to the hypothesized value (two-tailed)
- The population mean is greater than the hypothesized value (upper-tailed)
- The population mean is less than the hypothesized value (lower-tailed)

### Assumptions of One Sample T Hypothesis Test

- Data is continuous and quantitative (in other words, measured on a ratio or interval scale)
- The sample should be randomly selected from the population
- Observations are independent of each other
- Data should follow a normal probability distribution
- The dependent variable should have no extreme outliers

### When Would You Use a One Sample T Hypothesis Test?

The one sample t test is a parametric test because it assumes the data are normally distributed. It tests whether the sample mean is significantly different from a population mean when the standard deviation of the population is unknown. Hence the t test is used when the population standard deviation is unknown and the sample size is below 30; otherwise use the Z-test (for known variance).

### Steps to Calculate One Sample T Hypothesis Test

- State the claim of the test and determine the null hypothesis and alternative hypothesis
- Determine the level of significance
- Calculate degrees of freedom
- Find the critical value
- Calculate the test statistic:

$$t = \frac{\bar{x} - \mu_{0}}{s / \sqrt{n}}$$

- Where
  - x̅ is the observed sample mean
  - μ_{0} is the population mean (the hypothesized value)
  - s is the sample standard deviation
  - n is the number of observations in the sample

- Make a decision: reject the null hypothesis if the test statistic falls beyond the critical value (for a two-tailed test, if its absolute value is greater than the critical value); otherwise fail to reject it
- Finally, interpret the decision in the context of the original claim.
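The calculation steps above can be sketched in Python. This is a minimal illustration using only the standard library; the function name `one_sample_t` and the sample values are hypothetical and not part of the original article. The critical value still has to be looked up in a t table for the chosen significance level and degrees of freedom.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t, df) for a one sample t test of `sample` against mu0."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    t = (x_bar - mu0) / (s / math.sqrt(n))
    return t, n - 1


# Hypothetical measurements, used only to show the call.
weights = [2, 4, 4, 4, 5, 5, 7, 9]
t_stat, df = one_sample_t(weights, mu0=4)
print(f"t = {t_stat:.3f}, df = {df}")
```

The returned t value is then compared with the table value for `df` degrees of freedom.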

### Example of a One Sample T Hypothesis Test in a DMAIC Project

The One Sample T test is mostly performed in the Analyze phase of DMAIC to check for a significant difference between the population mean and the sample mean, while a paired t-test can be performed in the Measure phase to compare the process before and after improvement (see the example below for more details).

According to the American health association, the average blood pressure of a pregnant woman is 120 mm Hg. A random sample of 15 pregnant women was collected to check whether the sample blood pressure differs from the accepted standard blood pressure.

- Null Hypothesis: No difference between sample data and population blood pressure (H_{0}: μ = 120)
- Alternative Hypothesis: There is a difference between sample data and population blood pressure (H_{1}: μ ≠ 120)

Significance level: α=0.05

Degrees of freedom:15-1= 14

Calculate the critical value

If the calculated t value is less than -2.145 or greater than 2.145, then reject the null hypothesis.

Test statistic

- x̅ = 123
- μ_{0} = 120

The calculated t statistic is less than the critical value, so we fail to reject the null hypothesis (H_{0}). There is no significant difference between the sample mean and the population mean.
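As a rough check of this decision, the arithmetic can be sketched in Python. The example as listed gives only x̅ and μ_{0}; the sample standard deviation `s = 7.3` below is the value mentioned in the reader comments at the end of this page, so treat it as an assumption rather than a confirmed figure.

```python
import math

x_bar = 123.0   # sample mean from the example
mu0 = 120.0     # hypothesized population mean
n = 15          # sample size from the example
s = 7.3         # ASSUMPTION: standard deviation quoted in the reader comments
t_crit = 2.145  # two-tailed critical value for alpha = 0.05, df = 14 (from the example)

t_calc = (x_bar - mu0) / (s / math.sqrt(n))
reject_h0 = abs(t_calc) > t_crit

print(f"t = {t_calc:.3f}, reject H0: {reject_h0}")
```

Under this assumption t is about 1.59, which lies between -2.145 and 2.145, in line with the decision stated above.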

### Additional One Sample T Hypothesis Test Resources

http://www.cliffsnotes.com/math/statistics/univariate-inferential-tests/one-sample-t-test (One sample T test)

## Two Sample T Hypothesis Test

### What is a Two Sample T Hypothesis Test?

A two sample t test is used to analyze the difference between two independent population means. It is used when two small samples (n < 30) are taken from two different populations and compared.

### Assumptions of Two Sample T Hypothesis Test

- The samples should be randomly selected from the two populations
- Samples are independent of each other
- Variances of the two populations are equal
- Data should be continuous

### When Would You Use a Two Sample T Hypothesis Test?

The two sample t test is most often used to compare two process means when the data has one nominal variable and one measurement variable. It is a hypothesis test of means.

The two sample t test is used to compare two population means, while analysis of variance (ANOVA) is the best option if more than two group means are to be compared.

The two sample t test is performed when the two group samples are statistically independent of each other, while the paired t-test is used to compare the means of two dependent or paired groups.

### Steps to Calculate Two Sample T Hypothesis Test

- State the claim of the test and determine the null hypothesis and alternative hypothesis
- Determine the level of significance
- Calculate degrees of freedom
- Find the critical value
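- Calculate the test statistic:

$$t = \frac{\bar{x}_{1} - \bar{x}_{2}}{S_{p}\sqrt{\frac{1}{n_{1}} + \frac{1}{n_{2}}}}, \qquad S_{p} = \sqrt{\frac{(n_{1} - 1)s_{1}^{2} + (n_{2} - 1)s_{2}^{2}}{n_{1} + n_{2} - 2}}$$

Where S_{p} is the pooled standard deviation, x̅_{1} and x̅_{2} are the sample means, s_{1} and s_{2} are the sample standard deviations, and n_{1} and n_{2} are the sample sizes.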

- Make a decision: reject the null hypothesis if the test statistic falls beyond the critical value (for a two-tailed test, if its absolute value is greater than the critical value); otherwise fail to reject it
- Finally, interpret the decision in the context of the original claim.
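The calculation steps above can be sketched in Python, assuming the usual pooled standard deviation Sp = sqrt(((n1 - 1)·s1² + (n2 - 1)·s2²) / (n1 + n2 - 2)). The function name `pooled_two_sample_t` and the summary statistics in the call are hypothetical and serve only as an illustration.

```python
import math


def pooled_two_sample_t(mean1, s1, n1, mean2, s2, n2):
    """Return (t, df, sp) for an equal-variance two sample t test."""
    df = n1 + n2 - 2
    # Pooled standard deviation: weighted average of the two sample variances.
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)
    t = (mean1 - mean2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, df, sp


# Hypothetical summary statistics, used only to show the call.
t_stat, df, sp = pooled_two_sample_t(
    mean1=50.0, s1=4.0, n1=12, mean2=47.0, s2=4.0, n2=12
)
print(f"t = {t_stat:.3f}, df = {df}, Sp = {sp:.3f}")
```

Working from summary statistics keeps the sketch close to how such problems are usually stated; with raw data, the means and standard deviations would be computed first.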

### Example of a Two Sample T Hypothesis Test in a DMAIC Project

The Two Sample T test is mostly performed in the Analyze phase of DMAIC to evaluate whether the difference between two process means is really significant or due to random chance. It is typically used to validate the root cause(s) or critical Xs (see the example below for more detail).

An apple orchard owner wants to compare two farms to see if there is any difference in the weight of the apples. From farm A, 15 apples were randomly collected, with an average weight of 86 g and a standard deviation of 7 g. From farm B, 10 apples were collected, with an average weight of 80 g and a standard deviation of 8 g. At a 95% confidence level, is there any difference between the farms?

- Null Hypothesis (H_{0}): Mean apple weight of farm A is equal to farm B
- Alternative Hypothesis (H_{1}): Mean apple weight of farm A is not equal to farm B

Significance level: α=0.05

Degrees of freedom df: 10+15-2= 23

Calculate critical value

If the calculated t value is less than -2.069 or greater than 2.069, then reject the null hypothesis.

#### Test Statistic

The calculated t statistic is less than the critical value, so we fail to reject the null hypothesis (H_{0}). There is no significant difference between the mean weights of apples in farm A and farm B.
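For reference, the arithmetic behind this result can be written out as follows, assuming the pooled (equal variance) form of the test with df = 23 used in this example; the values are rounded.

```latex
\begin{aligned}
S_p &= \sqrt{\frac{(15-1)\,7^2 + (10-1)\,8^2}{15+10-2}}
     = \sqrt{\frac{686 + 576}{23}} \approx 7.41 \\
t &= \frac{86 - 80}{S_p\sqrt{\frac{1}{15} + \frac{1}{10}}}
   \approx \frac{6}{7.41 \times 0.408} \approx 1.98
\end{aligned}
```

Since 1.98 lies between -2.069 and 2.069, the statistic does not fall in the rejection region.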

### Additional Two Sample T Hypothesis Test Resources

Good example of two sample T tests here.

- http://www.cliffsnotes.com/math/statistics/univariate-inferential-tests/two-sample-t-test-for-comparing-two-means (Two sample T test for comparing two means). DF for separate s: the smaller of n_{1} – 1 and n_{2} – 1. DF for pooled s: df = n_{1} + n_{2} – 2.

**2 SAMPLE T-TEST FOR MEANS (UNEQUAL VARIANCE)**

Test for equality of two means. This test is used to determine if the mean of one sample is not equal to the mean of another sample.

If the test is for determining whether the mean of one sample is different than another sample, the test hypothesis is then specified as:

- H_{0}: μ_{a} – μ_{b} = d
- H_{1}: μ_{a} – μ_{b} ≠ d
- This is called a 2 tailed test

If the test is for determining whether the mean of one sample is larger or smaller than another sample, the test hypothesis is then specified as:

- H_{0}: μ_{a} – μ_{b} = d
- H_{1}: μ_{a} – μ_{b} > d
- This is called a 1 tailed test

Or,

- H_{0}: μ_{a} – μ_{b} = d
- H_{1}: μ_{a} – μ_{b} < d
- This is called a 1 tailed test

d = hypothesized difference between mean of each sample.

If the question to be answered is “any difference” between the mean of sample “a” and the mean of sample “b”, then d is set to 0.

**ASSUMPTIONS:**

- Observations (data):
  - Are Normally distributed.
    - Note: Although this method is relatively robust against non-normality, the shape of the distribution should at least be symmetric.
  - Within and between samples (groups) are independent of each other.
  - Are randomly “drawn”.
  - Are continuous.
    - Depending on some conditions (e.g. wide data range and field of application), data that is not continuous (e.g. count) *may still* be used.
    - Certain ratio type data (e.g. proportions and percentages) should not be used.
- Variances are unknown and estimated by sample variances (S^{2}).
- Variances are not equal, S_{a}^{2} ≠ S_{b}^{2}

**STEPS & FORMULA:**

#### Step 1 – Calculate mean (X Bar) of each sample

- X Bar = (x1 + x2 + … + xn) / n
- n is the number of observations (data)
- x1, x2, … xn are observations in each sample (“na” = sample size in sample a and “nb” = sample size in sample b)

#### Step 2 – Calculate variance (S^{2}) of each sample

- S^{2} = {(X Bar – x1)^{2} + (X Bar – x2)^{2} + … + (X Bar – xn)^{2}} / (n – 1)
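Steps 1 and 2 can be sketched in Python as follows. The observations are hypothetical, and the sketch uses the n - 1 denominator (the usual sample variance, which matches the n - 1 terms in the degrees of freedom formula in Step 4); the standard library `statistics` module is used only as a cross-check.

```python
import statistics

# Hypothetical observations for one sample, used only for illustration.
sample_a = [10.0, 12.0, 9.0, 11.0, 13.0]

n = len(sample_a)
x_bar = sum(sample_a) / n                               # Step 1: mean
s2 = sum((x_bar - x) ** 2 for x in sample_a) / (n - 1)  # Step 2: sample variance

# The standard library gives the same results.
assert x_bar == statistics.mean(sample_a)
assert abs(s2 - statistics.variance(sample_a)) < 1e-12

print(f"mean = {x_bar}, variance = {s2}")
```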

#### Step 3 – Calculate t value

- T-calc = {(X Bar a – X Bar b) – d} / SQRT( (S^{2}a / na) + (S^{2}b / nb) )
  - X Bar a & X Bar b are means of sample a and sample b
    - For practical reasons, (X Bar a – X Bar b) can be stated in the absolute form |X Bar a – X Bar b|
  - d is the hypothesized difference
  - S^{2}a & S^{2}b are variances of sample a and sample b
  - na & nb are sample sizes of sample a and sample b
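The T-calc formula of Step 3 can be sketched in Python as below. The function name `welch_t` is hypothetical. The inputs reuse the summary statistics of the apple example earlier in the article purely as sample numbers; that example itself assumes equal variances, so this is an illustration of the formula and not a rework of that example.

```python
import math


def welch_t(mean_a, var_a, n_a, mean_b, var_b, n_b, d=0.0):
    """T-calc for the unequal-variance two sample t test."""
    standard_error = math.sqrt(var_a / n_a + var_b / n_b)
    return ((mean_a - mean_b) - d) / standard_error


# Summary statistics borrowed from the apple example
# (variances are the squared standard deviations 7 and 8).
t_calc = welch_t(mean_a=86, var_a=7**2, n_a=15, mean_b=80, var_b=8**2, n_b=10)
print(f"T-calc = {t_calc:.3f}")
```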

#### Step 4 – Calculate degrees of freedom (DF)

- Calculate DF as:

$$DF = \frac{\left(\frac{s_{a}^{2}}{n_{a}} + \frac{s_{b}^{2}}{n_{b}}\right)^{2}}{\frac{\left(s_{a}^{2}/n_{a}\right)^{2}}{n_{a} - 1} + \frac{\left(s_{b}^{2}/n_{b}\right)^{2}}{n_{b} - 1}}$$

- Round to nearest integer
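The degrees of freedom formula of Step 4 is easy to mistype, so a small Python sketch may help. The function name `welch_df` is hypothetical, and the inputs again borrow the apple example's summary statistics only as sample numbers.

```python
def welch_df(var_a, n_a, var_b, n_b):
    """Degrees of freedom for the unequal-variance two sample t test."""
    term_a = var_a / n_a
    term_b = var_b / n_b
    numerator = (term_a + term_b) ** 2
    denominator = term_a**2 / (n_a - 1) + term_b**2 / (n_b - 1)
    return numerator / denominator


df_exact = welch_df(var_a=7**2, n_a=15, var_b=8**2, n_b=10)
df_rounded = round(df_exact)  # rounded to the nearest integer, as in Step 4
print(f"DF = {df_exact:.2f}, rounded to {df_rounded}")
```

Note that the result is generally not a whole number and is never larger than na + nb - 2.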

#### Step 5 – Compare t-calc with critical value in the t-distribution table.

- Look up the t-critical value in the t-distribution table, given the calculated degrees of freedom, in column α (significance level):
  - For a 1 tailed test the t-critical is based on the given α
  - For a 2 tailed test the t-critical is based on the given α divided by 2
    - The commonly used T-critical value for a 2 tailed test is in the 0.025 α column

**INTERPRETATION:**

- If t-calc > t-critical (using the absolute value of t-calc) then there is a statistically significant difference (≠, > or <) between the mean of sample a and sample b
- If t-calc < t-critical there is insufficient evidence that the means differ.
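The interpretation rule can be sketched as a small Python helper. The function name `interpret`, the `tail` labels, and the numbers in the calls are hypothetical; `t_critical` is assumed to be the positive value read from the t table for the chosen α and DF.

```python
def interpret(t_calc, t_critical, tail="two"):
    """Return True when the difference is statistically significant.

    tail: "two", "upper" (H1: difference > d) or "lower" (H1: difference < d).
    t_critical is the positive table value for the chosen alpha and DF.
    """
    if tail == "two":
        return abs(t_calc) > t_critical
    if tail == "upper":
        return t_calc > t_critical
    if tail == "lower":
        return t_calc < -t_critical
    raise ValueError(f"unknown tail: {tail!r}")


# Hypothetical values, used only to show the call.
print(interpret(t_calc=2.5, t_critical=2.101, tail="two"))
print(interpret(t_calc=-1.2, t_critical=1.734, tail="lower"))
```

Keeping the sign of t-calc for the one tailed cases avoids declaring significance when the difference points in the direction opposite to the alternative hypothesis.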

## Comments (3)

In the blood pressure question, can you please explain how you got 7.3 for s? No matter what I do, I am always getting to 7.08

Thanks for the head’s up, Jeremy. I see an opportunity for improvement on both of the examples listed. I’ll update asap.

Jeremy,

I added additional detail in the calculation steps.

For these equations with so many variables I find it helpful to go slowly and write out the smaller operations of each part of the calculation.

Does this make sense?