Sampling is a data collection technique used to draw a statistically sound conclusion about a population from a subset of its data. In DMAIC terms, it is most common in the Measure phase.

## Why Use Data Sampling?

Sometimes gathering information on a complete population is simply cost-prohibitive. Think about CNN’s coverage of an election cycle in the United States. It is not possible to ask every voter how they voted, and even if it were, not all would answer. Instead, they use exit polls to draw statistical conclusions about the population as a whole.

## Concerns About Data Sampling

When taking a sample from a larger population, you must make sure that the sample is an appropriate size and is drawn without bias. You should address these concerns in your data collection plan.

For example, it is very helpful if the sample size is large enough for the sample means to be approximately normally distributed (per the central limit theorem), as this opens the door to an array of statistical tools.

Note: Here’s a helpful article on avoiding bias in sampling.

## How Large Should a Data Sample Be?

### The calculation for how large a sample data set should be depends on:

- The type of data (continuous or discrete) being measured.
- How precise you want your statistical inferences to be.
- The estimate of the standard deviation for the entire population.
- The confidence level desired.

### Sample size needed for hypothesis testing depends on:

- Desired Risk (Both alpha and beta)
- Minimum difference to be detected between the population means (μ − μ0)
- The variation in the characteristic being measured (s or σ), that is, the population standard deviation.
- Even parameter shift sensitivity
- (Population size does NOT come into the determination of how big a sample should be.)
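As one illustration of how these inputs combine, the sketch below uses the standard textbook approximation for a two-sided, one-sample z-test: n = ((Z(α/2) + Z(β)) * σ / δ)^2, where δ is the minimum difference to detect. This formula is not given in the text above, and the function name and example values are hypothetical. Note that population size does not appear anywhere in the calculation.

```python
import math
from statistics import NormalDist


def sample_size_for_mean_test(alpha, beta, delta, sigma):
    """Approximate n for a two-sided, one-sample z-test.

    alpha: acceptable Type I risk, beta: acceptable Type II risk,
    delta: minimum difference between means to detect,
    sigma: estimated population standard deviation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided alpha risk
    z_beta = NormalDist().inv_cdf(1 - beta)        # beta risk (power = 1 - beta)
    n = ((z_alpha + z_beta) * sigma / delta) ** 2
    return math.ceil(n)  # always round up to a whole unit


# Hypothetical example values, chosen only for illustration.
n_large_shift = sample_size_for_mean_test(alpha=0.05, beta=0.10, delta=2.0, sigma=5.0)
n_small_shift = sample_size_for_mean_test(alpha=0.05, beta=0.10, delta=1.0, sigma=5.0)
print(n_large_shift, n_small_shift)
```

Halving the difference to detect roughly quadruples the required sample size, which is why the minimum detectable shift matters so much in planning.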

## How to Calculate a Sample Size

The technique differs depending on your situation. Here are the most common ways to calculate a sample size.

### Variable Data Sample Size

n = Z^2 * σ^2 / E^2

Where n is the sample size, Z is the Z score for the desired risk, σ is the standard deviation, and E is the margin of error (the largest mean shift you are willing to miss).
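A minimal sketch of this formula in Python follows. The function name and the example inputs are hypothetical, and the result is rounded up because a fractional unit cannot be sampled.

```python
import math


def sample_size_continuous(z, sigma, margin_of_error):
    """n = Z^2 * sigma^2 / E^2, rounded up to a whole unit."""
    return math.ceil((z * sigma / margin_of_error) ** 2)


# Hypothetical inputs: Z = 1.96 (95% confidence), sigma = 4, E = 1.
n = sample_size_continuous(z=1.96, sigma=4.0, margin_of_error=1.0)
print(n)
```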

### Binomial Data Sample Size

n = Z^2 * (p bar) * (1 - p bar) / (Δp)^2

Where p bar is the estimated proportion, Δp is the desired margin of error on that proportion, and Z is the Z score for the desired risk.
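The same calculation can be sketched for proportion data. The function name and inputs are hypothetical. Using p bar = 0.5 is a common conservative choice when no estimate is available, since it maximizes p bar * (1 - p bar).

```python
import math


def sample_size_proportion(z, p_bar, delta_p):
    """n = Z^2 * p_bar * (1 - p_bar) / delta_p^2, rounded up."""
    return math.ceil(z**2 * p_bar * (1 - p_bar) / delta_p**2)


# Hypothetical inputs: 95% confidence (Z = 1.96), +/- 5 percentage points.
n_conservative = sample_size_proportion(z=1.96, p_bar=0.5, delta_p=0.05)
n_low_rate = sample_size_proportion(z=1.96, p_bar=0.1, delta_p=0.05)
print(n_conservative, n_low_rate)
```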

### How to Calculate a Sample Size Given Standard Deviation, Confidence Interval and Margin of Error

The equation we want to use is: Sample Size = (Z*σ / Margin of Error)^2

#### Step 1: Find the Z score

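We need Z(α/2), where α is the significance level (1 minus the confidence level). For example, a 95% confidence level gives α = 0.05 and Z(α/2) ≈ 1.96.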

You would just look that up on the Z table.
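If a Z table is not at hand, the lookup can be sketched with Python's standard library, taking α as 1 minus the confidence level. The helper name `z_for_confidence` is hypothetical.

```python
from statistics import NormalDist


def z_for_confidence(confidence_level):
    """Return the two-sided Z score, Z(alpha/2), for a confidence level."""
    alpha = 1 - confidence_level
    # inv_cdf gives the value with the stated area to its left.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(level, round(z_for_confidence(level), 3))
```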

#### Step 2: Apply the Equation Sample Size = (Z*σ / Margin of Error)^2

This is a simple plug-in calculation: Sample Size = (Z*σ / Margin of Error)^2. Round the result up to the next whole number.

See the example walkthrough below.

## Types of Sampling Techniques

It is important to choose the best plan for sampling.

Sampling plans for inspection & auditing consider validity, applicability, and known risks.

Here are a few possibilities:

### Process Sampling

Samples can be taken from a population or a process. If taking from a process, be sure to preserve the time order.

### Random Sampling

Choose units at random so that each data point has an equal chance of selection.
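A minimal sketch of simple random sampling, assuming the population can be listed as unit IDs (the IDs and sample size here are hypothetical):

```python
import random

# Hypothetical population: unit IDs 1 through 100.
population = list(range(1, 101))

# A seeded generator makes the draw reproducible for audits.
rng = random.Random(42)

# sample() draws without replacement, giving every unit an equal chance.
sample = rng.sample(population, k=10)
print(sample)
```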

### Stratified Random Sampling

Divide the population into groups and then take an equal percentage of each group as a sample.

Ex. A vat is suspected of not being homogeneous, so samples are taken from each region of the vat.

Ex. Poll x% of voters in each age range.

Ex. The hypothesis that cargo containers stacked at the end are disproportionately more likely to be damaged should be tested with the stratified method. Other methods could be used, but this one is quicker and easier.
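The voter example can be sketched as follows: each group (stratum) is sampled at the same percentage. The function name, group labels, and group sizes are hypothetical.

```python
import random


def stratified_sample(strata, fraction, seed=None):
    """Draw the same fraction at random from every stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        # At least one unit per stratum so no group is left out.
        k = max(1, round(len(units) * fraction))
        sample[name] = rng.sample(units, k)
    return sample


# Hypothetical voter lists grouped by age range.
strata = {
    "18-34": [f"voter_a{i}" for i in range(200)],
    "35-54": [f"voter_b{i}" for i in range(300)],
    "55+": [f"voter_c{i}" for i in range(500)],
}

picked = stratified_sample(strata, fraction=0.10, seed=1)
print({name: len(units) for name, units in picked.items()})
```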

### Systematic Sampling

Choose every Nth unit. Ex. Every 3rd person going through the airport screening process is chosen for a pat-down.
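A short sketch of the every-Nth-unit rule. Starting from a random offset is an added assumption (a common practice, not stated above), and the function name and data are hypothetical.

```python
import random


def systematic_sample(units, step, seed=None):
    """Select every `step`-th unit, starting from a random offset."""
    rng = random.Random(seed)
    start = rng.randrange(step)  # assumption: random starting point
    return units[start::step]


# Hypothetical queue of 30 passengers, in arrival order.
passengers = list(range(1, 31))
chosen = systematic_sample(passengers, step=3, seed=0)
print(chosen)
```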

### Subgroup Sampling

Take n samples at a regular time interval. Ex. Measure the chlorine in a pool 3 times every hour and then use the average value. (Also see rational subgroups.)
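The pool example might look like this in code. The readings are made-up illustrative values, and the range is computed alongside the average only as an extra summary of each subgroup.

```python
from statistics import mean

# Hypothetical chlorine readings (ppm): 3 measurements per hour.
readings_by_hour = {
    "09:00": [2.1, 2.3, 2.2],
    "10:00": [2.4, 2.2, 2.3],
    "11:00": [2.0, 2.1, 2.2],
}

# One average per subgroup, kept in time order.
subgroup_means = {hour: mean(vals) for hour, vals in readings_by_hour.items()}

# Range within each subgroup (max minus min).
subgroup_ranges = {hour: max(vals) - min(vals) for hour, vals in readings_by_hour.items()}

print(subgroup_means)
print(subgroup_ranges)
```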

### Sequential Sampling

- Often used in auditing.
- Products coming from a production stream.

### Discovery Sampling

- Often used in auditing.

### Skip-lot Sampling

- Products coming from a production stream.

**Sample Variance:** For a set of data, the average squared deviation from the mean, with a denominator of n-1
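As a quick check of this definition, the sketch below computes the sample variance by hand and compares it with `statistics.variance` from Python's standard library, which also uses the n-1 denominator. The data values are hypothetical.

```python
from statistics import mean, variance


def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    m = mean(data)
    return sum((x - m) ** 2 for x in data) / (len(data) - 1)


# Hypothetical measurements.
data = [4.0, 7.0, 6.0, 3.0, 5.0]

manual = sample_variance(data)
library = variance(data)
print(manual, library)
```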

## Sampling from a Controlled Process

- Ranges of the samples should vary.
- Means of the samples should differ slightly but remain consistent with the process average, centering on some central value.

## ASQ Six Sigma Black Belt Practice Questions

**Question:** When σ = 10, what sample size is needed to specify a 95% confidence interval of ±3 units from the mean?

(A) 7

(B) 11

(C) 32

(D) 43
