When we use a sample group to gain insight into an entire population – whether we’re talking people or a product built in a factory – we risk the sample group not completely reflecting the whole population. Therefore, we need confidence intervals.

We use confidence intervals to express how certain we are that an estimate taken from the sample group reflects the entire population from which it was drawn. Another way of thinking of it: if you drew a sample group of the same size hundreds of times, performed the same measurements, and built a confidence interval from each sample, a known percentage of those intervals (the confidence level) would contain the population mean.

A confidence interval is a range of values calculated from a sample. The percentage attached to it tells you how often ranges built this way, across repeated samples from that population, would contain the mean of the population.
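The repeated-sampling idea can be checked with a short simulation. This is only a sketch under stated assumptions: the population is normal, its mean (`mu`) and standard deviation (`sigma`) are hypothetical values chosen for illustration, and `sigma` is treated as known.

```python
import random
from statistics import NormalDist, mean

def coverage(trials=1000, n=30, mu=50.0, sigma=5.0, level=0.95, seed=0):
    """Fraction of simulated confidence intervals that contain the true mean."""
    rng = random.Random(seed)                       # fixed seed for repeatability
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # two-sided critical value
    margin = z * sigma / n ** 0.5                   # sigma is assumed known here
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials

# Should land close to the chosen confidence level of 0.95.
print(coverage())
```

Each simulated sample gets its own interval; the fraction of intervals that capture `mu` is what the confidence level describes.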

## Confidence Interval vs Confidence Level

A confidence interval is a range of values that probably contains the population mean.

A confidence level, by contrast, is the percentage of certainty that the confidence interval built from any given sample will contain the population mean.

## Confidence Interval vs Prediction Interval

A prediction interval gives a range in which the next data point is likely to fall. Prediction intervals tell you about the distribution of individual data values, whereas confidence intervals tell you about a probable population mean.
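The contrast can be made concrete with a small sketch. It assumes roughly normal data and uses the textbook large-sample approximations `z * s / √n` for the confidence interval half-width and `z * s * √(1 + 1/n)` for the prediction interval half-width. Neither formula comes from this article, and the sample values are hypothetical.

```python
from math import sqrt

def mean_ci_half_width(s, n, z=1.96):
    """Half-width of a confidence interval for the population mean."""
    return z * s / sqrt(n)

def prediction_half_width(s, n, z=1.96):
    """Half-width of an interval for a single future observation (approximation)."""
    return z * s * sqrt(1 + 1 / n)

s = 2.0  # hypothetical sample standard deviation
for n in (10, 100, 10000):
    print(n, round(mean_ci_half_width(s, n), 3), round(prediction_half_width(s, n), 3))
```

As the sample size grows, the confidence interval shrinks toward zero width, while the prediction interval levels off near `z * s` because individual data points still vary.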

## Estimating Population Mean and Standard Deviation

In most cases, we won’t know the mean and standard deviation for an entire population. However, we can calculate the mean and standard deviation for each of our samples. So we can use the sample figures to estimate those of the whole population. This is what we call a point estimate.

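Once we find the point estimate, we also need to know how accurate it is. The Central Limit Theorem says that for a large enough sample (a common rule of thumb is 30 or more observations), the distribution of sample means is approximately normal and centered on the population mean, whatever the shape of the population itself. That is what lets us use Z scores to put a range around the point estimate.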
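A quick simulation can illustrate how sample means behave at a sample size of 30. This sketch assumes a deliberately skewed population (exponential, with mean 1 and standard deviation 1, both hypothetical choices) and checks that the sample means cluster around the population mean with a spread near `σ / √n`.

```python
import random
from statistics import mean, stdev

def sample_mean_spread(trials=2000, n=30, seed=1):
    """Return the center and spread of many simulated sample means."""
    rng = random.Random(seed)
    # Exponential population: mean 1, standard deviation 1, strongly skewed.
    means = [mean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(trials)]
    return mean(means), stdev(means)

center, spread = sample_mean_spread()
# Compare the observed spread with the theoretical value sigma / sqrt(n).
print(center, spread, 1 / 30 ** 0.5)
```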

## Z Scores

A Z score is the number of standard deviations between a data point and its mean.

You can use a Z-score table to find the corresponding Z-score for common confidence levels. For a two-sided interval, look up the Z-score that leaves an area of α/2 in each tail, where α = 1 - confidence level:

`α/2 = (1 - confidence level) / 2`

If your desired confidence level is 95%, then your calculation would look like this:

`α/2 = (1 - 0.95) / 2 = 0.025`

### Common confidence levels and corresponding Z scores

| Desired Confidence Level | Z Score |
| --- | --- |
| 90% | 1.645 |
| 95% | 1.96 |
| 99% | 2.576 |
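If a Z-score table is not at hand, the same values can be computed. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `z_for_confidence` is made up for this illustration.

```python
from statistics import NormalDist

def z_for_confidence(level):
    """Two-sided Z score for a confidence level given as a fraction (e.g. 0.95)."""
    alpha = 1 - level
    # Leave alpha / 2 in each tail of the standard normal distribution.
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(level, round(z_for_confidence(level), 3))
```

Rounded to three decimal places, the results match the table values of 1.645, 1.96 and 2.576.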

## Calculating Population Mean from a Z-Score

We don’t know the population mean, but we can estimate a range for it at a pre-selected confidence level. All we need is the Z-score and some data from our sample.

When we’re talking about confidence intervals, we’re working with the mean of a sample of n observations, not a single observation. That means we need the sample-mean version of the Z-score equation, which divides by the standard error `s / √n`:

`Z = (x - x̄) / (s / √n)`

Where:

• Z is the Z-score
• x is the population mean
• x̄ is the sample mean
• s is the standard deviation of the sample
• n is the number of observations in the sample

### Confidence Interval Example

We have a sample of 25 observations with a standard deviation of 0.17. The sample mean is 4.32.

We want a confidence level of 95%. Firstly, we look at a Z-score table. We find that the Z-score we need is 1.96.

Secondly, we plug these values into the equation. We get:

`1.96 = (x - 4.32) / (0.17 / √25)`

Thirdly, we get `x` by itself:

```
x = 1.96 * (0.17 / √25) + 4.32

x = 4.38664
```

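But we also want to know our margin of error.

Fourthly, we use the equation `x - x̄`, that is, the bound we just found for the population mean minus the mean of the sample group.

`margin of error = 4.38664 - 4.32 = 0.06664`

We say that ±0.06664 is the margin of error. The value 4.38664 is the upper end of the interval, and subtracting the margin of error from the sample mean gives the lower end, 4.25336. The 95% confidence interval for the population mean is therefore about 4.25 to 4.39.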
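The same arithmetic can be sketched in a few lines of Python. The function name `mean_confidence_interval` is hypothetical, and the Z score is passed in as looked up from the table rather than computed.

```python
from math import sqrt

def mean_confidence_interval(sample_mean, sample_sd, n, z):
    """Return (lower, upper, margin) for a Z-based interval around a sample mean."""
    margin = z * sample_sd / sqrt(n)   # Z score times the standard error
    return sample_mean - margin, sample_mean + margin, margin

# Values from the example: n = 25, s = 0.17, sample mean = 4.32, Z = 1.96.
lower, upper, margin = mean_confidence_interval(4.32, 0.17, 25, 1.96)
print(round(margin, 5), round(lower, 2), round(upper, 2))
```

The margin is applied on both sides of the sample mean, which is why the interval runs from roughly 4.25 to 4.39.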

## Calculating the Confidence Interval Comparing Two Population Means

You can also use confidence intervals to compare two population means, using samples from each population. You might use this method to compare two different manufacturing methods, or to look for differences between two groups of people (for example, smokers and non-smokers). You could also use it to decide whether it is acceptable to pool your two population samples into one larger sample.

The confidence interval for a comparison between two means is a range of values in which the difference between those two means might lie.

We use a similar equation to the one we use to calculate a population mean. Instead of looking for x, though, we’re looking for the difference between means.

`Z = ((x̄1 - x̄2) - (x1 - x2)) / √((s1² / n1) + (s2² / n2))`

### Example

High blood pressure has been causally linked to smoking tobacco products. To test this, you want to compare systolic blood pressure between smokers and non-smokers. You’ll use a confidence level of 95%.

• You have 45 smokers and 56 non-smokers, with similar variation in age, gender, and health levels in each group.
• In the sample group of smokers, the mean systolic pressure is 138.
• In the sample group of non-smokers, the mean systolic pressure is 135.
• The standard deviation for smokers is 16.5.
• The standard deviation for non-smokers is 14.9.
• The Z-score you need is 1.96.

```
n1 = 45
n2 = 56
Z = 1.96 (found by looking up the 95% confidence level in the table)
s1 = 16.5
s2 = 14.9
x̄1 = 138
x̄2 = 135

Z = ((x̄1 - x̄2) - (x1 - x2)) / √((s1² / n1) + (s2² / n2))
```

Firstly, plug the numbers into the equation:

`Z = ((x̄1 - x̄2) - (x1 - x2)) / √((s1² / n1) + (s2² / n2))`

```
1.96 = ((138 - 135) - (x1 - x2)) / √((16.5² / 45) + (14.9² / 56))
1.96 = (3 - (x1 - x2)) / √((272.25 / 45) + (222.01 / 56))
1.96 = (3 - (x1 - x2)) / √(6.05 + 3.9645)
1.96 = (3 - (x1 - x2)) / √(10.0145)
1.96 = (3 - (x1 - x2)) / (3.1646)
1.96 * (3.1646) = 3 - (x1 - x2)
6.2025 = 3 - (x1 - x2)
6.2025 + (x1 - x2) = 3
(x1 - x2) = 3 - 6.2025
(x1 - x2) = -3.2025
```

To find the margin of error, use:

```
margin of error = Z * √((s1² / n1) + (s2² / n2))
margin of error = 1.96 * √((16.5² / 45) + (14.9² / 56))
margin of error = 1.96 * √((272.25 / 45) + (222.01 / 56))
margin of error = 1.96 * √(6.05 + 3.9645)
margin of error = 6.2025
```

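The interval is centered on the observed difference in sample means (x̄1 - x̄2 = 3). The -3.2025 solved for earlier is one end of the interval, not its center. So our confidence interval, with a confidence level of 95%, is:

```
confidence interval = difference in sample means ± margin of error
confidence interval = 3 ± 6.2025
confidence interval = -3.2025 to 9.2025
```

Because this interval includes 0, these samples do not show a difference in mean systolic pressure between the two groups at the 95% confidence level.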
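As a cross-check of the arithmetic, here is a minimal sketch of the two-sample calculation. The function name is hypothetical, and the interval is centered on the observed difference in sample means (138 - 135 = 3), with the margin of error applied on either side.

```python
from math import sqrt

def difference_confidence_interval(mean1, sd1, n1, mean2, sd2, n2, z):
    """Return (lower, upper, margin) for the difference between two means."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    margin = z * standard_error
    diff = mean1 - mean2               # observed difference in sample means
    return diff - margin, diff + margin, margin

# Smokers: mean 138, sd 16.5, n 45. Non-smokers: mean 135, sd 14.9, n 56.
lower, upper, margin = difference_confidence_interval(138, 16.5, 45, 135, 14.9, 56, 1.96)
print(round(margin, 4), round(lower, 4), round(upper, 4))
```

An interval that spans zero, as this one does, means the data are compatible with no difference between the two population means.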

## Confidence Interval Question Using Z-Score

Question: We conduct a random survey of 500 newly-enrolled university students. We know that the standard deviation for university enrollment age is 8 years. The mean age of our sample is 24. Calculate, at a 99% confidence level and to 3 decimal places, the confidence interval for the mean age of all first-year university students.

Calculation: The first step is to consult a Z-score table. A 99% confidence level requires a Z-score of 2.576.

You need this equation:

`Z = (x - x̄) / (σ / √n)`

(note the use of σ instead of s in the equation above; this is because the population standard deviation is supplied)

Plug in the data:

```
2.576 = (x - 24) / (8 / √500)
x - 24 = 2.576 * (8 / √500)
x - 24 = 0.922
```

Stop here for a second, because that 0.922 figure is important – it’s your margin of error.

Now solve for x, which gives the upper end of the interval, and apply the margin of error on either side of the sample mean to get your confidence interval.

```
x = 24.922
confidence interval = (sample mean - margin of error) to (sample mean + margin of error)
confidence interval = 23.078 to 24.922
```

Answer: The confidence interval for the mean age of first-year university students is 23.078–24.922, with a 99% confidence level.

## Confidence Interval Question Using T-Score

Question: A factory produces tennis balls. A sample of 19 balls is taken from one day’s production in the factory. The mean weight of the sample balls is 58.2g. The standard deviation for the sample balls is 0.4g. Calculate the confidence interval for the mean weight with a confidence level of 95%.

Calculation: The sample size is too small to use a Z-score. Instead, use a T-score, which uses a t-distribution. Finding a confidence interval for a mean is a two-tailed test.

You’ll need an alpha score. To calculate it, use this simple equation:

```
α = (100% - confidence level%)
α = (100% - 95%)
α = 5%
```

You also need the degrees of freedom (df), which is the sample size minus one. Or in equation form:

`df = n - 1`

df for this question is 18.

Use a T-table to look up the T-score needed for a two-tailed test with an α of 5% and a df of 18: the answer is `2.101`.

You need this equation:

`T = (x - x̄) / (s / √n)`

Plug in the data:

```
2.101 = (x - 58.2) / (0.4 / √19)
x - 58.2 = 2.101 * (0.4 / √19)
x - 58.2 = 0.193
x = 58.393
margin of error = 0.193
confidence interval = 58.007 to 58.393
```

Answer: The 95% confidence interval for the mean weight of tennis balls produced in the factory is 58.007–58.393g.
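The T-score version follows the same pattern, with the degrees of freedom deciding which table row to read. In this sketch the function name is hypothetical and the critical value `2.101` is passed in as looked up from a T-table, since Python's standard library has no t-distribution.

```python
from math import sqrt

def t_confidence_interval(sample_mean, sample_sd, n, t_critical):
    """Return (df, lower, upper, margin) for a T-based interval around a sample mean."""
    df = n - 1                                    # row to read in the T-table
    margin = t_critical * sample_sd / sqrt(n)     # T score times the standard error
    return df, sample_mean - margin, sample_mean + margin, margin

# Tennis ball example: n = 19, mean 58.2 g, s = 0.4 g, T = 2.101 (two-tailed, α = 5%).
df, lower, upper, margin = t_confidence_interval(58.2, 0.4, 19, 2.101)
print(df, round(margin, 3), round(lower, 3), round(upper, 3))
```

The margin of error is applied on both sides of the sample mean of 58.2g, so the sample mean sits at the center of the interval.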

## What Is the Difference Between Control Limits and Confidence Intervals?

These are two entirely different concepts. Confidence intervals are used in the analysis of a process, and control limits in the control of a process.

Control limits depend on your population or sample’s distribution. They can be defined as the average ± 3 standard deviations (see https://sixsigmastudyguide.com/statistical-process-control-spc/). “Control limits are obtained based on the nature of the distribution of data that you collect, if a process is in control doesn’t mean that your process is stable, hence control limits gives the limits at that instant.”

These are different from Specification limits which generally show up on control charts. They are assigned by business for what is viable for them. A colleague once described them as if the process goes above this level, we all must update our resumes. If it goes below this other limit, don’t worry about updating resumes, no one will ever hire us again!

Confidence intervals are a device of statistics for when you do not have perfect knowledge of all of the data. For example, imagine you are trying to infer the chance of some event happening by sampling from a population. Let’s take US voting polls. CNN can’t possibly get all of the voting data, but they can sample the population through exit polls. From that sample they can predict the winner. The question is, how sure and certain are they of how accurate their answer is? Are they 90% certain? Are they 95% certain? 99.5%?  It all depends on the confidence level required. That’s how you get answers like “We are projecting that candidate X has an 80% chance of winning with a 95% level of confidence.”

“While confidence level signifies how confident you are that the population lies within the range one has specified and this range is nothing but the confidence interval, and this has nothing to do with the control limits as control limits keeps changing if a process is not in stable though its in control.”

## Also See:

How to Calculate a Sample Size Given Standard Deviation, Confidence Interval and Margin of Error

## Other Confidence Intervals Problems

You can find some more confidence interval problems, with links to worked answers, here: Finding the Sample Size Needed for a Confidence Interval for a Single Population Mean.

## ASQ Six Sigma Black Belt Exam Confidence Intervals Questions

Question: Which of the following describes the 95% confidence interval of a 20% absentee rate in a department with 30 people?

(A) 6% to 34%
(B) 8% to 32%
(C) 13% to 27%
(D) 17% to 23%

```p + or - Z (α/2) * SQRT( (p*(1-p))/n )