In the DMAIC methodology, a data collection plan is created during the Measure phase. It is a useful tool for focusing your data collection efforts.

Why Do You Need a Data Collection Plan?

Achieve Context

Simply getting “all of the data” and looking at it is not likely to yield great results; you can easily get overwhelmed. Or you could interpret the data in an incredibly unhelpful way. Even great data analysis can be undermined by failing to set the right context first.

Save Resources

How many people work on projects with unlimited resources, no deadlines, and an unlimited budget? Not very many, right? Getting data takes a lot of time and may be expensive. The rest of us need a better way.

Bounds of Reality

It may not be possible to get all of the data that we want.

Why you need a data collection plan.

This is Six Sigma! We want to be efficient! By creating a data collection plan you can focus your efforts on answering specific questions that have business value. This directed approach helps you avoid locating & measuring data just for the sake of doing so.

“Acknowledging what you don’t know is the dawning of wisdom.” —Charlie Munger

How to Create a Data Collection Plan

Step 1: Identify the questions we want to answer.

Our data must be relevant to the project. What is your project’s hypothesis? What are we trying to answer? The entire reason to have a DMAIC project is to improve a process, so these questions should be centered on the reality of your process. And that’s best discovered by defining the current state.


What happens if we just gather data instead of making a data collection plan?

Collecting data ‘just to see what’s out there’ is a poor approach that leads to bloat and wasted effort. It may lead you to collect the wrong data, or to collect the data in the wrong manner. By starting with the questions you want to ask, you can then determine what kind of data (collected in what manner) would help you definitively answer those questions. This will lead you to higher-quality solutions.

Don’t Forget: A data collection plan begins and ends with people.

“To better avoid errors, you should talk to people who disagree with you and you should talk to people who are not in the same emotional situation you are.” — Daniel Kahneman

“When a possibility is unfamiliar to us, we do not even think about it.” — Nate Silver

A best practice is to use the SIPOC as a guide for data collection.

Figure out the types of measurements we want to include:

  • 2-3 output measures
  • 1-2 input measures
  • 1 process measure

As part of creating the data collection plan, use the critical-to-quality tree to narrow the outputs brainstormed during process mapping down to two or three.

A Good Checklist for Collecting Data

Your data should:

  • Answer specific questions that are linked to your project’s goals.
  • Be feasible to collect (on time, within your budget, with appropriate effort).
  • Take related and influencing conditions into account.
  • Provide insight into the process.
  • Appear on the Input-Process-Output diagram of your SIPOC.


Step 2: What kind of data is available?

Now we break those questions down into their parts. What data exists that can give us these answers, or part of these answers?
Sometimes a single piece of data can answer multiple questions. Sometimes we need to explore that data in relation to other data.
Make a list of all of the data needed to answer the questions the project is centered on.

Step 3: What form does that data come in?

Determine what type of data we are measuring: is it continuous or discrete? Create a data collection form, and indicate on your data needs list the type of data each item is.
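One way to keep the data needs list and its data types together is a small structured record. The sketch below is only an illustration: the class names, field names, and the two example entries are hypothetical and not part of any standard Six Sigma template.

```python
from dataclasses import dataclass
from enum import Enum


class DataType(Enum):
    CONTINUOUS = "continuous"  # measured on a scale, e.g. time or weight
    DISCRETE = "discrete"      # counted or categorized, e.g. defects, pass/fail


@dataclass
class DataNeed:
    name: str            # what is being measured
    question: str        # the project question it helps answer
    data_type: DataType  # continuous or discrete


# Hypothetical entries, for illustration only.
data_needs = [
    DataNeed("order cycle time (minutes)",
             "How long does fulfilment take?",
             DataType.CONTINUOUS),
    DataNeed("defects per order",
             "How often do orders contain errors?",
             DataType.DISCRETE),
]

for need in data_needs:
    print(f"{need.name}: {need.data_type.value}")
```

Tying each item to a question keeps the list honest: an entry with no question attached is a candidate for removal.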

Step 4: How Much Data Do We Need?

We want to get enough data so that we can see patterns and trends. For each data element on the list, write down how much is needed.

Step 5: How are we going to measure this data?

Data can be measured in different ways: check sheets, survey answers, etc. The way we measure will depend on the kind of data we seek.

  • Decide on an operational definition for each measurement.
  • Identify the specification for the measurement. (It should be based on the customer’s limits of acceptability.)
  • Define the target values. (And in what direction do we want the process to go?)
  • Put a real, objective value on each target.

How will we be consistent in our measurements? Is a gauge R&R study required?
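The items above (operational definition, specification, target) can be recorded per measurement so that everyone collecting data applies the same rules. This is a minimal sketch: the `MeasurementSpec` class, its field names, and the cycle-time numbers are hypothetical examples, not values from a real process.

```python
from dataclasses import dataclass


@dataclass
class MeasurementSpec:
    name: str
    operational_definition: str  # exactly how and when the value is recorded
    lower_spec_limit: float      # customer's lower limit of acceptability
    upper_spec_limit: float      # customer's upper limit of acceptability
    target: float                # objective value we want the process to hit

    def in_spec(self, value: float) -> bool:
        """Return True if the value lies within the specification limits."""
        return self.lower_spec_limit <= value <= self.upper_spec_limit

    def deviation_from_target(self, value: float) -> float:
        """Signed distance from target; positive means above target."""
        return value - self.target


# Hypothetical example: order cycle time in minutes.
cycle_time = MeasurementSpec(
    name="order cycle time",
    operational_definition="Minutes from order receipt to shipment scan",
    lower_spec_limit=0.0,
    upper_spec_limit=30.0,
    target=20.0,
)

print(cycle_time.in_spec(25.0))                # True
print(cycle_time.deviation_from_target(25.0))  # 5.0
```

A record like this does not replace a measurement system analysis; it only makes the agreed definitions explicit.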

Step 6: To Sample or Not to Sample

Sometimes it is impractical to measure an entire population of data and you have to take a sample instead. How much of the parent population do you need to sample to make statistically sound judgements? How are you going to sample the data? How will you avoid measurement bias?
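As a rough starting point for the "how much" question, the textbook sample size formulas for estimating a mean, n = (z·s/E)², and a proportion, n = z²·p(1−p)/E², can be sketched as below. The function names and the example inputs are assumptions for illustration. Both formulas assume simple random sampling from a large population and apply no finite population correction.

```python
import math
import random
from statistics import NormalDist


def z_value(confidence: float) -> float:
    """Two-sided z value for a confidence level such as 0.95."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


def sample_size_for_mean(std_dev: float, margin: float,
                         confidence: float = 0.95) -> int:
    """Continuous data: n = (z * s / E)^2, rounded up."""
    z = z_value(confidence)
    return math.ceil((z * std_dev / margin) ** 2)


def sample_size_for_proportion(p: float, margin: float,
                               confidence: float = 0.95) -> int:
    """Discrete (proportion) data: n = z^2 * p * (1 - p) / E^2, rounded up."""
    z = z_value(confidence)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Hypothetical inputs: estimated standard deviation of 5 minutes,
# desired margin of error of 1 minute, 95% confidence.
print(sample_size_for_mean(std_dev=5, margin=1))          # 97

# Hypothetical inputs: unknown defect rate (p = 0.5 is the most
# conservative guess), margin of error of 5 percentage points.
n = sample_size_for_proportion(p=0.5, margin=0.05)
print(n)                                                  # 385

# Simple random sample of record IDs from a hypothetical population
# of 1,000 records; the seed only makes the draw repeatable.
population_ids = list(range(1, 1001))
sample_ids = random.Random(42).sample(population_ids, n)
```

Drawing the sample at random, rather than picking convenient records, is one simple guard against selection bias.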

Step 7: How will we display the data?

We can display data in many ways: control charts, Pareto diagrams, run charts, etc. Which graphical display tool is best suited to answer our questions?
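To make one of these displays concrete, the sketch below builds the table behind a Pareto diagram: categories sorted by frequency, with a running cumulative percentage. The `pareto_table` function and the defect observations are hypothetical; plotting is left to whichever charting tool you use.

```python
from collections import Counter


def pareto_table(observations):
    """Return (category, count, cumulative %) rows, most frequent first."""
    counts = Counter(observations)
    total = sum(counts.values())
    rows = []
    cumulative = 0
    for category, count in counts.most_common():
        cumulative += count
        rows.append((category, count, round(100 * cumulative / total, 1)))
    return rows


# Hypothetical check sheet results: one entry per observed defect.
defects = ["late"] * 5 + ["damaged"] * 3 + ["wrong item"] * 2

for category, count, cumulative_pct in pareto_table(defects):
    print(f"{category:<12} {count:>3} {cumulative_pct:>6}%")
```

With this data the first two categories account for 80% of the defects, which is the kind of concentration a Pareto diagram is meant to reveal.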

Proceed to the next tollgate, Baseline Sigma.


