Often used in science, mathematics, and other fields of knowledge, the Basic Hypothesis Testing Process provides a framework for identifying a question you would like to answer, determining the data you need in order to achieve significant results, choosing the correct test for your hypothesis, interpreting the results, and making a decision based on the data.
Identify Question
It’s vitally important to figure out which question you’re actually trying to answer. Once you have that, you can develop your null and alternative hypotheses. Generally speaking, you’re looking to create a simple question that asks whether factor x affects scenario y, with an answer of ‘yes’ or ‘no’.
Examples of Hypothesis Questions
My friend has very pale blond hair. He tells me that his hair is fine because it’s blond. I disagree — I think that fine hair can be found in any hair color.
To figure this out once and for all, I could test his basic claim that blond hair is more likely to be fine than other hair colors.
A bad example of a question would be:
Is blond hair or other hair colors more likely to be fine?
Why? Because it doesn’t lend itself to a yes/no answer. It’s actually proposing three possible answers (‘blond’, ‘other’, ‘chance’), rather than offering a choice between ‘chance’ and ‘not chance’. Plus, it doesn’t actually define ‘fine’ hair in measurable terms.
A good example of a question would be:
Is hair classified as naturally ‘blond’ more likely than hair of other colors to measure less than 60 microns in diameter?
Identify your Null and Alternative Hypothesis
Once I’ve framed the question in a simple yes/no format without ambiguities, developing the null and alternative hypotheses is actually pretty simple. The null hypothesis should state that chance is the only factor in seeming correlations between hair color and hair diameter:
There is no correlation between hair color and the diameter of a single hair.
The alternative hypothesis should state that, as my friend claims, blond hair is more likely to be of small diameter:
Hair classified as naturally ‘blond’ is more likely than hair of other colors to measure less than 60 microns in diameter.
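Written symbolically, the two hypotheses for the hair question might look like the following. The notation is an assumption introduced here for illustration: p_blond and p_other stand for the proportion of hairs in each group that measure less than 60 microns.

```latex
H_0 : p_{\text{blond}} = p_{\text{other}}
\qquad
H_a : p_{\text{blond}} > p_{\text{other}}
```

Because the alternative hypothesis points in one direction only, this would be a one-tailed (right-tailed) test.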
Null Hypothesis ( H0 )
The assumption that experimental results are due to chance alone is called the null hypothesis.
First, a null hypothesis is what you would expect by chance alone.
Second, a null hypothesis assumes things to be equal.
Third, a null hypothesis is NOT your theory.
When the null hypothesis contains only an equal sign:
- The hypothesis test has two tails (or rejection regions).
- The alternative hypothesis contains a “not equal to” sign.
- The null hypothesis can be rejected when the test statistic is either significantly large or significantly small.
The null hypothesis is a statement of zero or no change. If the original claim includes equality (<=, =, or >=), it is the null hypothesis. If the original claim does not include equality (<, not equal, >), then the null hypothesis is the complement of the original claim. The null hypothesis always includes the equal sign, and the decision is always stated in terms of the null hypothesis.
Alternative Hypothesis ( H1 or Ha )
This is your theory.
The alternative hypothesis is the statement that is true if the null hypothesis is false. The type of test (left-tailed, right-tailed, or two-tailed) is based on the alternative hypothesis.
When the null hypothesis contains only an equal sign, the alternative hypothesis contains a “not equal to” sign.
Alternative Hypothesis for a Two-Tailed Test
H0: µnew = µcurrent
Ha: µnew ≠ µcurrent
Examples of Hypothesis Statements
Example: Are cycle times, error rates, or conversion rates statistically different across groups (people, processes followed, geography, level of training, age, etc.)?
Example: Has the cycle time of my transaction changed from year 1 to year 2?
H0: Average of Year 1 = Average of Year 2. No change occurred; any observed difference is due to chance alone.
Ha: Average of Year 1 ≠ Average of Year 2.
Example 2: Determine whether a new machining process has reduced the diameter of a product.
H0: It did not reduce the diameter.
Ha: It did reduce the diameter.
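The machining example is directional, so it can be stated as a one-tailed (left-tailed) test. A possible symbolic form, reusing the µnew and µcurrent notation from the two-tailed example and assuming they denote the mean diameter under the new and current processes:

```latex
H_0 : \mu_{\text{new}} \ge \mu_{\text{current}}
\qquad
H_a : \mu_{\text{new}} < \mu_{\text{current}}
```

The null hypothesis keeps the equality, and the alternative carries the claim being tested (a reduction).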
Beware of Hypothesis Testing Errors
- Type 1 Error (Alpha) – Becomes more likely as the significance level is set larger
- Type 2 Error (Beta) – Becomes more likely as the significance level is set smaller (for a fixed sample size)
Type I error (alpha risk)
Rejecting the null hypothesis when it is true (saying false when true). Usually the more serious error.
A Type I error is tied to the significance level. For example, if alpha = 5%, then in 5% of tests where the null hypothesis is actually true we will still reject it, concluding that there is a real difference when in fact there is none.
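A small simulation can make the alpha = 5% statement concrete. The sketch below is illustrative only: it assumes a one-sample, two-tailed z-test with a known standard deviation, and the function names, sample size, and number of trials are hypothetical choices, not part of the original process. Every sample is drawn from a population where the null hypothesis is true, so every rejection is a Type I error.

```python
import random
from statistics import NormalDist, mean


def two_tailed_z_p_value(sample, mu0, sigma):
    """p-value of a two-tailed one-sample z-test with known sigma."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def type_i_error_rate(alpha=0.05, trials=4000, n=30, seed=1):
    """Fraction of tests that reject H0 when H0 is actually true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Data come from the null distribution itself (mean 0, sd 1),
        # so any rejection here is a false rejection.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if two_tailed_z_p_value(sample, mu0=0.0, sigma=1.0) <= alpha:
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(f"Observed Type I error rate: {rate:.3f}")
```

The observed rate should land close to the chosen alpha, and it moves with alpha if you change that argument.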
Type II error (beta risk)
Failing to reject the null hypothesis when it is false (saying true when false).
alpha
Probability of committing a Type I error.
beta
Probability of committing a Type II error.
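To see how alpha and beta trade off, beta can be computed directly in a simple case. This is a sketch under stated assumptions: a right-tailed one-sample z-test with a known standard deviation, and hypothetical values for the null mean, the true mean, sigma, and the sample size.

```python
from statistics import NormalDist


def beta_risk(mu0, mu1, sigma, n, alpha=0.05):
    """Beta for a right-tailed one-sample z-test when the true mean is mu1."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    # How far the true mean sits from the null mean, in standard errors.
    shift = (mu1 - mu0) / (sigma / n ** 0.5)
    # We fail to reject whenever the z statistic lands below the critical value.
    return std_normal.cdf(z_crit - shift)


# Hypothetical scenario: H0 says the mean is 100, but it is really 103.
for alpha in (0.10, 0.05, 0.01):
    beta = beta_risk(mu0=100, mu1=103, sigma=10, n=30, alpha=alpha)
    print(f"alpha={alpha:.2f}  beta={beta:.3f}  power={1 - beta:.3f}")
```

With the sample size held fixed, lowering alpha raises beta, which is why the two risks have to be balanced rather than minimized independently.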
Determine Significance
The next step is to set the significance requirements for your test. This involves two basic elements: sample size and confidence level.
When it comes to the sample size, the ideal is to gather data for the whole population on which you’re focusing. However, trying to gather data on a whole population (for example, the entire population of the United States) is cost-prohibitive. So you need a sample of the population – one that is large enough to provide an acceptable cross-section of the population in terms of the hypotheses being tested.
You also need to decide on a confidence level. This sets how strong the evidence must be before you are willing to reject the null hypothesis. Once you’ve decided that, you can calculate the alpha level, which is simply (1 – confidence level). The standard confidence level is 95%, or 0.95, so the standard alpha level is 5%, or 0.05.
How certain do we need to be of the sampling? Remember, you only use hypothesis testing when analyzing a sample of an entire population.
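The relationship alpha = 1 – confidence level, and the critical values that follow from it, can be sketched in a few lines. This assumes a test statistic that follows the standard normal distribution (a z-test); the function name is a hypothetical helper.

```python
from statistics import NormalDist


def critical_z(confidence_level, tails=2):
    """Critical z value for a given confidence level and number of tails."""
    alpha = 1 - confidence_level
    # A two-tailed test splits alpha evenly between the two rejection regions.
    tail_area = alpha / tails
    return NormalDist().inv_cdf(1 - tail_area)


for level in (0.90, 0.95, 0.99):
    print(f"confidence={level:.2f}  alpha={1 - level:.2f}  "
          f"two-tailed z={critical_z(level):.3f}  "
          f"one-tailed z={critical_z(level, tails=1):.3f}")
```

At the standard 95% confidence level, the two-tailed critical value is approximately 1.96 and the one-tailed value is approximately 1.645.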
Choose Test
It’s important to choose the correct style of test to apply to your sample. This depends on the hypotheses you’re looking at and the kind of data that you’re using. Some basic questions that can help you to decide which test to use are:
- What level of measurement was used?
- How many different samples were used?
- What type of analysis do you need to do?
The questions that you need to ask could be far more complicated – see the National Center for Biotechnology Information’s in-depth article, How to choose the right statistical test?
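As a rough illustration of how the first two questions narrow the choice, the lookup below maps level of measurement and number of samples to a commonly taught starting point. It is a deliberately simplified sketch, not taken from this article: it assumes independent samples, and the table and function names are hypothetical. A real decision also depends on distributional assumptions and study design.

```python
# Simplified lookup, assuming independent samples. The key is
# (level of measurement, number of samples, with 3 meaning "3 or more").
TEST_TABLE = {
    ("nominal", 1): "chi-square goodness-of-fit test",
    ("nominal", 2): "chi-square test of independence",
    ("nominal", 3): "chi-square test of independence",
    ("ordinal", 1): "Wilcoxon signed-rank test",
    ("ordinal", 2): "Mann-Whitney U test",
    ("ordinal", 3): "Kruskal-Wallis test",
    ("interval", 1): "one-sample t-test",
    ("interval", 2): "two-sample t-test",
    ("interval", 3): "one-way ANOVA",
}


def suggest_test(measurement, n_samples):
    """Return a commonly taught starting-point test for the given inputs."""
    key = (measurement, min(n_samples, 3))
    if n_samples < 1 or key not in TEST_TABLE:
        raise ValueError(f"no suggestion for {measurement!r}, {n_samples} samples")
    return TEST_TABLE[key]


print(suggest_test("interval", 2))
```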
Interpret Results
Once you’ve run your data through the selected test, you’ll have your results. But that’s not the end of your work! The next step is to interpret those results. One of the key values supplied by any statistical test is the p-value, which gives you the probability of seeing results at least as extreme as yours if the null hypothesis were true.
Test statistic
The test statistic is calculated from sample data. To test the null hypothesis, that calculated value is compared to a critical value, and the decision depends on which side of the critical value the test statistic falls.
The NULL hypothesis is never accepted; we fail to reject it. We are always testing the NULL.
If the TEST STATISTIC falls in the rejection region (beyond critical value), then we REJECT THE NULL
Confidence level (95%) + Significance level (5%) = 100 %
When the null hypothesis is true, the test statistic will fall in the “Fail to reject” region 95% of the time and, due to chance alone, in the “Critical Region” or “Rejection Region” 5% of the time.
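The comparison of a test statistic against a critical value can be sketched with the hair question from earlier. The counts below are hypothetical, and the choice of a pooled two-proportion z-test is an assumption made for illustration; the article does not prescribe a specific test for that example.

```python
from statistics import NormalDist


def two_proportion_z(hits_a, n_a, hits_b, n_b):
    """z statistic for H0: p_a = p_b, using the pooled proportion."""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_err = (pooled * (1 - pooled) * (1 / n_a + 1 / n_b)) ** 0.5
    return (p_a - p_b) / std_err


# Hypothetical counts: hairs measuring under 60 microns in each group
# (group a = blond, group b = other colors), 100 hairs per group.
z = two_proportion_z(hits_a=62, n_a=100, hits_b=48, n_b=100)

# Right-tailed test at alpha = 0.05: reject only if z is beyond the critical value.
z_crit = NormalDist().inv_cdf(1 - 0.05)

decision = "reject H0" if z > z_crit else "fail to reject H0"
print(f"z = {z:.3f}, critical value = {z_crit:.3f}: {decision}")
```

Note that the only two possible outcomes are "reject H0" and "fail to reject H0"; the code never reports that the null hypothesis was accepted.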
P Value
P value: The probability of obtaining a sample result at least as extreme as the one observed if the null hypothesis is true, that is, if chance alone were at work.
If P is low, Null must Go
A p-value less than the alpha level decided upon in the Determine Significance step means that your results are statistically significant. The null hypothesis can be rejected, and the alternative hypothesis can be supported.
A p-value greater than the alpha level means that you cannot assume that your results are statistically significant, and hence cannot reject the null hypothesis.
- The P value is integral to using a hypothesis test to make a decision. It is the probability of seeing results at least as extreme as yours if the null hypothesis really is true; alpha, not the P value, is the risk of falsely rejecting a true null hypothesis.
- If the P value is less than or equal to the agreed-upon significance level (alpha), then you reject the null and can support the alternate hypothesis.
- If the P value is greater, then you cannot reject the null hypothesis (in stats terms, you fail to reject the null), and thus you cannot support the alternate.
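The decision rule in the bullets above can be sketched for the Year 1 versus Year 2 cycle-time example. The summary figures are hypothetical, and the large-sample z approximation is an assumption chosen to keep the example dependency-free; with small samples a t-test would be the more usual choice.

```python
from statistics import NormalDist


def two_sample_z_p_value(mean1, sd1, n1, mean2, sd2, n2):
    """Two-tailed p-value for H0: mean1 = mean2 (large-sample z approximation)."""
    std_err = (sd1 ** 2 / n1 + sd2 ** 2 / n2) ** 0.5
    z = (mean1 - mean2) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


alpha = 0.05

# Hypothetical cycle-time summaries (minutes) for Year 1 and Year 2.
p_value = two_sample_z_p_value(mean1=42.0, sd1=6.0, n1=120,
                               mean2=40.5, sd2=5.5, n2=130)

# "If P is low, Null must Go": reject only when p <= alpha.
if p_value <= alpha:
    decision = "reject H0"
else:
    decision = "fail to reject H0"
print(f"p = {p_value:.4f}: {decision}")
```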
Make Decision
The final step is to make a decision from your results and draw up your conclusion. There are two basic decisions that you can make when you’ve interpreted the results:
- Reject the null hypothesis and support the alternative hypothesis.
- Fail to reject the null hypothesis.
Once you’ve made the decision, you need to construct a conclusion. This should clearly communicate your original hypothesis, the sample on which it was tested, the decision you made, and any additional information that you think is important to convey.
Six Sigma Black Belt Certification Hypothesis Testing Questions:
Question: Which of the following terms is used to describe the risk of a type I error in a hypothesis test?
(Taken from ASQ sample Black Belt exam.)
(A) Power
(B) Confidence level
(C) Level of significance
(D) Beta risk
Answer:
C: Level of Significance. A Type I error is tied to the significance level. For example, if alpha = 5%, then in 5% of tests where the null hypothesis is actually true we will still reject it, concluding that there is a real difference when in fact there is none. The lower the alpha, the lower our chance of making a Type I error.