When we use a sample group to gain insight into an entire population, whether that population is people or products built in a factory, we risk the sample group not completely reflecting the whole population. Confidence intervals let us quantify that risk.

We use confidence intervals to express how precisely a sample estimates a value, such as the mean, for the entire population from which it was drawn. Another way of thinking of it is that if you drew the same-sized sample group hundreds of times and performed the same measurements each time, a certain percentage of the confidence intervals calculated from those sample groups would contain the population mean.

A confidence interval is a range of values calculated from a sample. The percentage attached to it tells you how often, across repeated samples from the same population, a range built this way would contain the population mean.
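The repeated-sampling idea can be checked with a small simulation. The sketch below is illustrative only: the population mean, standard deviation, sample size, and the function name `coverage` are all assumptions made for the example, and it treats the population standard deviation as known.

```python
import math
import random


def coverage(trials=2000, n=30, mu=50.0, sigma=10.0, z=1.96, seed=0):
    """Return the fraction of 95% intervals that contain the true mean mu.

    mu and sigma describe a hypothetical normal population.
    """
    rng = random.Random(seed)
    margin = z * sigma / math.sqrt(n)  # same margin for every sample
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = sum(sample) / n
        if mean - margin <= mu <= mean + margin:
            hits += 1
    return hits / trials


rate = coverage()
print(f"{rate:.1%} of the intervals contain the population mean")
```

The printed share should land close to 95%, though it varies a little with the seed and the number of trials.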

## Confidence Interval vs Confidence Level

A confidence *interval* is a range of values that probably contains the population mean.

A confidence *level*, however, is the percentage of certainty that, for any given sample, the confidence interval will contain the population mean.

## Confidence Interval vs Prediction Interval

Prediction intervals estimate the range in which the next individual data point will probably fall. They tell you about the distribution of data values, whereas confidence intervals tell you about the probable location of the population mean.
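To see why a prediction interval is wider than a confidence interval built from the same data, compare their half-widths. This is a rough sketch that assumes a normally distributed population and uses a Z-score as a large-sample approximation. The values of `s` and `n` are hypothetical and `interval_half_widths` is a name invented for this example.

```python
import math


def interval_half_widths(s, n, z=1.96):
    """Return (confidence half-width, prediction half-width).

    s is the sample standard deviation and n the sample size.
    """
    ci_half = z * s / math.sqrt(n)          # uncertainty about the mean
    pi_half = z * s * math.sqrt(1 + 1 / n)  # uncertainty about one new value
    return ci_half, pi_half


ci_half, pi_half = interval_half_widths(s=2.0, n=40)
print(f"confidence interval: mean +/- {ci_half:.3f}")
print(f"prediction interval: mean +/- {pi_half:.3f}")
```

The confidence interval shrinks as n grows, while the prediction interval never gets narrower than about z times the standard deviation, because a single new value still varies as much as the population does.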

## Estimating Population Mean and Standard Deviation

In most cases, we won’t know the mean and standard deviation for an entire population. However, we can calculate the mean and standard deviation for each of our samples. So we can use the sample figures to estimate those of the whole population. This is what we call a *point estimate*.

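Once we find the point estimate, we also need to know how accurate it is. The Central Limit Theorem says that for a large enough sample (a common rule of thumb is 30 or more observations), the distribution of the sample mean is approximately normal and centered on the population mean, with a standard deviation of σ / √n. This is why the mean of a larger sample tends to fall closer to the population mean.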
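A quick simulation can make the theorem concrete. The sketch below draws many samples of 30 from a deliberately non-normal (exponential) population whose mean and standard deviation are both 1, then checks that the sample means cluster around 1 with a spread near 1 / √30. The population and the number of trials are assumptions chosen for illustration.

```python
import math
import random
import statistics

rng = random.Random(1)
n = 30          # observations per sample
trials = 5000   # number of repeated samples

# Exponential population with rate 1: mean = 1, standard deviation = 1.
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(trials)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
expected_sd = 1.0 / math.sqrt(n)

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"sd of sample means:   {sd_of_means:.3f} (sigma/sqrt(n) = {expected_sd:.3f})")
```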

## Z Scores

A Z score is the number of standard deviations between a data point and its mean.

You can use a Z-score table to find the corresponding Z-score for common confidence levels. To look up other levels, first calculate the α value, then split it between the two tails of the distribution:

α = 1 - confidence level

area in each tail = α / 2

If your desired confidence level is 95%, then your calculation would look like this:

α = 1 - 0.95 = 0.05

α / 2 = 0.05 / 2 = 0.025

### Common confidence levels and corresponding Z scores

| Desired Confidence Level | Z Score |
| --- | --- |
| 90% | 1.645 |
| 95% | 1.96 |
| 99% | 2.576 |
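If no Z-score table is at hand, the same values can be computed from the standard normal distribution. This sketch uses `NormalDist` from Python's standard library. The helper name `z_for_confidence` is an assumption made for this example.

```python
from statistics import NormalDist


def z_for_confidence(level):
    """Two-sided Z-score for a confidence level given as a fraction (e.g. 0.95)."""
    tail = (1 - level) / 2                 # area left in each tail
    return NormalDist().inv_cdf(1 - tail)  # Z with that much area above it


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: Z = {z_for_confidence(level):.3f}")
```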

## Calculating Population Mean from a Z-Score

We don’t know the population mean, but we can estimate a range that contains it at a pre-selected confidence level. All we need is the Z-score and some data from our sample.

When we’re talking about confidence intervals, we’re working with the mean of a sample of n observations, not a single observation. That means we need to use the sample-mean version of the Z-score equation:

Z = (x - x̄) / (s / √n)

Where:

- **Z** is the Z-score
- **x** is the population mean
- **x̄** is the sample mean
- **s** is the standard deviation of the sample
- **n** is the number of observations in the sample

### Confidence Interval Example

We have a sample of 25 observations with a standard deviation of 0.17. The mean of the sample is 4.32.

We want a confidence level of 95%. Firstly, we look at a Z-score table. We find that the Z-score we need is 1.96.

Secondly, we plug these values into the equation. We get:

1.96 = (x - 4.32) / (0.17 / √25)

Thirdly, we get `x` by itself:

x = 1.96 * (0.17 / √25) + 4.32

x = 4.38664

But we also want to know our margin of error.

Fourthly, we use the equation `x - x̄`: the value we just found for x (the upper limit for the population mean) minus the mean of the sample group.

margin of error = 4.38664 - 4.32 = 0.06664

We say that ± 0.06664 is the margin of error. The 95% confidence interval for the population mean is 4.32 ± 0.06664, or about 4.25 to 4.39.
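The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch: the function name `z_interval` is made up for illustration, and the inputs are the figures from the example above.

```python
import math


def z_interval(sample_mean, sd, n, z):
    """Return (lower limit, upper limit, margin of error) for a mean."""
    margin = z * sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin, margin


low, high, margin = z_interval(sample_mean=4.32, sd=0.17, n=25, z=1.96)
print(f"margin of error = {margin:.5f}")
print(f"95% confidence interval = {low:.2f} to {high:.2f}")
```

Solving the Z equation for x and subtracting x̄ gives the same number as computing Z × s / √n directly, which is why the function calculates the margin first.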

## Calculating the Confidence Interval Comparing Two Population Means

You can also use confidence intervals to compare two population means, using samples from each population. You might use this method to compare two different manufacturing methods, or to look for differences between two groups of people (for example, smokers and non-smokers). You could also use it to decide whether or not it will be acceptable to pool your two population samples into one larger sample.

The confidence interval for a comparison between two means is a range of values in which the *difference* between those two means might lie.

We use a similar equation to the one we use to calculate a population mean. Instead of looking for x, though, we’re looking for the difference between means.

$$
Z = \frac{(\bar{x}_1 - \bar{x}_2) - (x_1 - x_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
$$

### Example

High blood pressure has been causally linked to smoking tobacco products. To test this, you want to compare systolic blood pressure between smokers and non-smokers. You’ll use a confidence level of 95%.

- You have 45 smokers and 56 non-smokers, with similar variation in age, gender, and health levels in each group.
- In the sample group of smokers, the mean systolic pressure is 138.
- In the sample group of non-smokers, the mean systolic pressure is 135.
- The standard deviation for smokers is 16.5.
- The standard deviation for non-smokers is 14.9.
- The Z-score you need is 1.96.

- $n_1 = 45$
- $n_2 = 56$
- $Z = 1.96$ (found by looking up the 95% confidence level on the chart)
- $s_1 = 16.5$
- $s_2 = 14.9$
- $\bar{x}_1 = 138$
- $\bar{x}_2 = 135$

Firstly, plug the numbers into the equation:

$$
Z = \frac{(\bar{x}_1 - \bar{x}_2) - (x_1 - x_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
$$

$$
\begin{aligned}
1.96 &= \frac{(138 - 135) - (x_1 - x_2)}{\sqrt{\dfrac{16.5^2}{45} + \dfrac{14.9^2}{56}}} \\
1.96 &= \frac{3 - (x_1 - x_2)}{\sqrt{\dfrac{272.25}{45} + \dfrac{222.01}{56}}} \\
1.96 &= \frac{3 - (x_1 - x_2)}{\sqrt{6.05 + 3.9645}} \\
1.96 &= \frac{3 - (x_1 - x_2)}{\sqrt{10.0145}} \\
1.96 &= \frac{3 - (x_1 - x_2)}{3.1646} \\
1.96 \times 3.1646 &= 3 - (x_1 - x_2) \\
6.2025 &= 3 - (x_1 - x_2) \\
6.2025 + (x_1 - x_2) &= 3 \\
x_1 - x_2 &= 3 - 6.2025 \\
x_1 - x_2 &= -3.2025
\end{aligned}
$$

To find the margin of error, use:

$$
\begin{aligned}
\text{margin of error} &= Z \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \\
&= 1.96 \sqrt{\frac{16.5^2}{45} + \frac{14.9^2}{56}} \\
&= 1.96 \sqrt{\frac{272.25}{45} + \frac{222.01}{56}} \\
&= 1.96 \sqrt{6.05 + 3.9645} \\
&= 6.2025
\end{aligned}
$$

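So our confidence interval, with a confidence level of 95%, is centered on the difference between the two sample means, $\bar{x}_1 - \bar{x}_2 = 3$:

confidence interval = difference in sample means ± margin of error

confidence interval = 3 ± 6.2025

confidence interval = -3.2025 to 9.2025

The lower limit, -3.2025, is the value of $(x_1 - x_2)$ solved for above. Because the interval contains 0, these samples alone do not show a difference in mean systolic pressure at the 95% confidence level.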
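The two-sample calculation can be sketched in Python as follows. It takes the interval to be the difference in sample means plus or minus the margin of error, using the figures from the smoking example. The function name `two_mean_interval` is invented for illustration.

```python
import math


def two_mean_interval(mean1, sd1, n1, mean2, sd2, n2, z):
    """Return (lower limit, upper limit, margin) for the difference of two means."""
    standard_error = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    margin = z * standard_error
    diff = mean1 - mean2  # difference between the two sample means
    return diff - margin, diff + margin, margin


low, high, margin = two_mean_interval(
    mean1=138, sd1=16.5, n1=45,   # smokers
    mean2=135, sd2=14.9, n2=56,   # non-smokers
    z=1.96,
)
print(f"margin of error = {margin:.4f}")
print(f"95% confidence interval = {low:.4f} to {high:.4f}")
```

An interval that straddles zero means the data are consistent with no difference between the two population means at the chosen confidence level.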

## Confidence Interval Question Using Z-Score

**Question:** We conduct a random survey of 500 newly-enrolled university students. We know that the standard deviation for university enrollment age is 8 years. The mean age of our sample is 24. Calculate, with 99% confidence level and to 3 decimal places, the confidence interval for all first-year university students.

**Calculation:** The first step is to consult a Z-score table. A 99% confidence level requires a Z-score of 2.576.

You need this equation:

Z = (x - x̄) / (σ / √n)

(note the use of **σ** instead of **s** in the equation above; this is because the population standard deviation is supplied)

Plug in the data:

2.576 = (x - 24) / (8 / √500)

x - 24 = 2.576 * (8 / √500)

x - 24 = 0.922

Stop here for a second, because that 0.922 figure is important – it’s your margin of error.

Now solve for x, which gives the upper limit, and apply the margin of error on either side of the sample mean to get your confidence interval.

x = 24.922

confidence interval = (sample mean - margin of error) to (sample mean + margin of error)

confidence interval = 23.078 to 24.922

**Answer:** The confidence interval for the mean age of first-year university students is 23.078–24.922, with a 99% confidence level.
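As a cross-check, the interval can be computed directly as the sample mean plus or minus Z × σ / √n. The sketch below gets the Z-score from Python's standard-library `NormalDist` rather than from a table, so it carries a few more decimal places than 2.576. The variable names are chosen for this example.

```python
import math
from statistics import NormalDist

n = 500             # students surveyed
sample_mean = 24.0  # mean age in the sample
sigma = 8.0         # known population standard deviation
level = 0.99        # confidence level

z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 2.576
margin = z * sigma / math.sqrt(n)
low, high = sample_mean - margin, sample_mean + margin

print(f"Z = {z:.3f}, margin of error = {margin:.3f}")
print(f"99% confidence interval = {low:.3f} to {high:.3f}")
```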

## Confidence Interval Question Using T-Score

**Question:** A factory produces tennis balls. A sample of 19 balls is taken from one day’s production in the factory. The mean weight of the sample balls is 58.2g. The standard deviation for the sample balls is 0.4g. Calculate the confidence interval with a confidence level of 95%.

**Calculation:** The population standard deviation is unknown and the sample is small (fewer than 30), so a Z-score is not appropriate. Instead, use a T-score, which comes from a t-distribution. A confidence interval for a mean has two sides, so use the two-tailed value from the table.

You’ll need an alpha score. To calculate it, use this simple equation:

α = (100% - confidence level%)

α = (100% - 95%)

α = 5%

You also need the degrees of freedom (df), which is the number of observations in the sample minus one. Or in equation form:

df = n - 1

df for this question is 18.

Use a T-table to look up the T-score needed for a two-tailed test with an α of 5% and a df of 18: the answer is `2.101`.

You need this equation:

T = (x - x̄) / (s / √n)

Plug in the data:

2.101 = (x - 58.2) / (0.4 / √19)

x - 58.2 = 2.101 * (0.4 / √19)

x - 58.2 = 0.193

x = 58.393

margin of error = 0.193

confidence interval = (58.2 - 0.193) to (58.2 + 0.193) = 58.007 to 58.393

**Answer:** The 95% confidence interval for the mean weight of tennis balls produced in the factory is 58.007–58.393g.
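The T-score version follows the same pattern as the Z-score version, with the critical value taken from a T-table. Python's standard library has no t-distribution lookup, so this sketch takes the table value 2.101 (two-tailed, α = 5%, df = 18) as an input. The function name `t_interval` is an assumption for illustration.

```python
import math


def t_interval(sample_mean, s, n, t_crit):
    """Return (lower limit, upper limit, margin) using a T-table value t_crit."""
    margin = t_crit * s / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin, margin


n = 19
df = n - 1      # degrees of freedom = 18
t_crit = 2.101  # looked up in a T-table for df = 18, two-tailed, alpha = 5%

low, high, margin = t_interval(sample_mean=58.2, s=0.4, n=n, t_crit=t_crit)
print(f"df = {df}, margin of error = {margin:.3f}g")
print(f"95% confidence interval = {low:.3f}g to {high:.3f}g")
```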


## What Is the Difference Between Control Limits and Confidence Intervals?

These are two entirely different concepts. One is used in the analysis of a process and the other in the control of a process.

Control limits depend on your population or sample’s distribution. They can be defined as the average ± 3 standard deviations. “Control limits are obtained based on the nature of the distribution of data that you collect, if a process is in control doesn’t mean that your process is stable, hence control limits gives the limits at that instant.” (See https://sixsigmastudyguide.com/statistical-process-control-spc/)

These are different from specification limits, which generally show up on control charts. They are assigned by the business according to what is viable for it. A colleague once described them this way: if the process goes above this level, we all must update our resumes. If it goes below this other limit, don’t worry about updating resumes, no one will ever hire us again!

Confidence intervals are a device of statistics for when you do not have perfect knowledge of all of the data. For example, imagine you are trying to infer the chance of some event happening by sampling from a population. Let’s take US voting polls. CNN can’t possibly get all of the voting data, but they can sample the population through exit polls. From that sample they can predict the winner. The question is, how sure and certain are they of how accurate their answer is? Are they 90% certain? Are they 95% certain? 99.5%? It all depends on the confidence level required. That’s how you get answers like “We are projecting that candidate X has an 80% chance of winning with a 95% level of confidence.”

“While confidence level signifies how confident you are that the population lies within the range one has specified and this range is nothing but the confidence interval, and this has nothing to do with the control limits as control limits keeps changing if a process is not in stable though its in control.”
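The contrast can be made concrete by computing both from the same data. The sketch below uses ten made-up measurements, takes control limits as the average ± 3 standard deviations as described above, and builds a 95% confidence interval for the mean using the T-table value 2.262 (df = 9, two-tailed). Real control charts often estimate the spread differently, so treat this only as an illustration.

```python
import math
import statistics

# Hypothetical measurements from a process (illustrative data only).
measurements = [10.2, 9.8, 10.1, 10.4, 9.9, 10.0, 10.3, 9.7, 10.1, 10.0]

n = len(measurements)
mean = statistics.fmean(measurements)
s = statistics.stdev(measurements)

# Control limits describe where individual values are expected to fall.
control_limits = (mean - 3 * s, mean + 3 * s)

# The confidence interval describes where the process mean probably lies.
t_crit = 2.262  # T-table value for df = 9, two-tailed, alpha = 5%
margin = t_crit * s / math.sqrt(n)
confidence_interval = (mean - margin, mean + margin)

print(f"control limits:      {control_limits[0]:.3f} to {control_limits[1]:.3f}")
print(f"confidence interval: {confidence_interval[0]:.3f} to {confidence_interval[1]:.3f}")
```

The control limits stay roughly the same width however many measurements you collect, while the confidence interval narrows as n grows.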

## Also See:

How to Calculate a Sample Size Given Standard Deviation, Confidence Interval and Margin of Error

## Other Confidence Intervals Problems

You can find some more confidence interval problems, with links to worked answers, here: Finding the Sample Size Needed for a Confidence Interval for a Single Population Mean.

## ASQ Six Sigma Black Belt Exam Confidence Intervals Questions


**Question:** Which of the following describes the 95% confidence interval of a 20% absentee rate in a department with 30 people?

(A) 6% to 34%

(B) 8% to 32%

(C) 13% to 27%

(D) 17% to 23%

**Answer:** (A) 6% to 34%.

This is a confidence interval for a proportion question. The formula is:

p ± Z(α/2) * SQRT( (p*(1-p))/n )

- p = 0.2
- α = 5%, so α/2 = 2.5% (use this to look up the Z score on the Z table)
- n = 30

0.2 ± Z(2.5%) * SQRT( (0.2*(1-0.2))/30 )

0.2 ± 1.96 * SQRT(0.005333)

0.2 ± 0.1431

0.2 + 0.1431 = 0.3431 => 34.31% => round to 34%

0.2 - 0.1431 = 0.0569 => 5.69% => round to 6%
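The proportion formula is easy to check in code. This is a minimal sketch using the figures from the question. The function name `proportion_interval` is made up for illustration.

```python
import math


def proportion_interval(p, n, z=1.96):
    """Return (lower limit, upper limit) of the interval for a proportion p."""
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin


low, high = proportion_interval(p=0.2, n=30)
print(f"95% confidence interval = {low:.2%} to {high:.2%}")
print(f"rounded: {round(low * 100)}% to {round(high * 100)}%")
```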

## Comments (2)

I don’t see a formula displayed for margin of error. In the sample workthroughs, its just assumed that a person working through it would know what value was the margin of error and how to appropriately apply it.

Alex,

I’ll see what I can add here. Remember that these materials should be supplementary to previous Six Sigma training.

Best, Ted