A Scatter Diagram plots two variables against each other to show whether they are related, and gives a visual sense of how strongly they are correlated.

## Why You Would Use Scatter Analysis and Scatter Plots

A Scatter Analysis is used when you need to compare two data sets against each other to see if there is a relationship. Scatter plots are a way of visualizing that relationship; by plotting the data points you get a scattering of points on a graph. The analysis comes in when you try to discern what kind of pattern, if any, is present and what that pattern means.

This is the kind of analysis we mean when we are trying to get at the root cause of an issue.

Scatter Diagrams are used to explore a possible “cause-and-effect” relationship between two kinds of data, and to provide more useful information about a production process.

### Specific instances of when to utilize scatter diagrams:

- You have paired numerical data
- The dependent variable may have multiple values for each value of the independent variable
- You are trying to determine whether there is a relationship between two variables

## What Kind of Data Should You Use on Scatter Analysis?

Scatter analysis generally makes use of continuous data. (See notes on the different data types here.)

Discrete data is best suited to pass/fail measurements. Continuous data can take any value within a range, so it lets you measure in fine detail, and it is what scatter analysis generally uses.

You could use discrete data on one axis of a scatter plot and continuous data on the other axis. For the discrete data, you’d have to put it into some kind of quantified band, such as a 1-10 customer satisfaction score.

I suppose you also *could* put discrete data that comes out like pass/fail as one of two bands, but it would really depend on the data if you got any useful information out of it.

Best bet is continuous data.

If you are looking for a way to do graphical analysis on discrete data, you might try attribute charts.

## Scatter Plots and Correlation

Scatter plots only show correlation. They do not prove causation. The example often used is shark attacks and ice cream sales. There may be a correlation between the two, but ice cream does not cause shark attacks; the heat of the day drives both. On hot days more people are in the water, which leads to more shark attacks, and more people buy ice cream.
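To make the point concrete, here is a minimal Python sketch using purely synthetic data. The temperature range, the coefficients, and the noise levels are all invented for illustration and are not real measurements. It simulates a hidden variable (temperature) that drives both ice cream sales and shark attacks, then compares the correlation before and after the temperature effect is removed.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic data: every number below is an assumption made for illustration.
temperature = [rng.uniform(15, 35) for _ in range(200)]
ice_cream_sales = [10.0 * t + rng.gauss(0, 20) for t in temperature]
shark_attacks = [0.5 * t + rng.gauss(0, 2) for t in temperature]

# The two outcomes look related, even though neither causes the other.
raw_r = pearson(ice_cream_sales, shark_attacks)

# Subtract the (known, simulated) temperature effect from each series.
sales_residual = [s - 10.0 * t for s, t in zip(ice_cream_sales, temperature)]
attack_residual = [a - 0.5 * t for a, t in zip(shark_attacks, temperature)]
residual_r = pearson(sales_residual, attack_residual)

print(f"correlation, raw data: {raw_r:.2f}")
print(f"correlation, temperature removed: {residual_r:.2f}")
```

In this simulation the raw correlation is strong, while the correlation of what is left after removing temperature is close to zero. A scatter plot of the raw data alone would not reveal that.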

## How to Make a Scatter Diagram:

1. Collect pairs of data where a relationship is suspected.
2. Draw a graph in the shape of an “L,” and scale the axes in even multiples (e.g., 10, 20).
   - Place the independent variable on the horizontal (X) axis.
   - Place the dependent variable on the vertical (Y) axis.
   - Place a dot or a symbol where the x-axis value intersects the y-axis value.
   - If two dots fall together, place them side by side, so they are touching, and both are visible.
3. Review the pattern of points to determine if a relationship is present:
   - If the data clearly forms a line or a curve, you can stop, as the variables are considered correlated.
   - Use regression or correlation analysis at this point, if necessary. Otherwise, complete steps four through seven below.
4. Divide the points on the graph into four quadrants. If X points are present on the graph:
   - Count X/2 points from top to bottom and draw a horizontal line.
   - Count X/2 points from left to right and draw a vertical line.
   - If the number of points is odd, draw each line through the middle point.
5. Count the points in each quadrant.

   **NOTE: Do not count points on a line.**

6. Add the diagonally opposite quadrants, then find the smaller sum and the total of points in all quadrants:
   - A = points in upper left + points in lower right
   - B = points in upper right + points in lower left
   - Q = the smaller of A and B
   - N = A + B
7. Look up the limit for N on the trend test table:
   - If Q is less than the limit, the two variables are related.
   - If Q is greater than or equal to the limit, the pattern may have originated from random chance.
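The quadrant-counting steps above can be sketched in Python. This is an illustrative sketch, not a reference implementation. The function names `quadrant_sums` and `is_related` are made up here, and using the median to place the dividing lines is an assumption about how to count "X/2 points" in code. The trend test table is not reproduced, so the limit for N has to be looked up separately and passed in.

```python
from statistics import median


def quadrant_sums(points):
    """Return A, B, Q and N for a list of (x, y) pairs.

    The dividing lines are placed at the median x and median y, so about
    half of the points fall on each side. With an odd number of points the
    median is the middle point itself, so the line passes through it.
    """
    x_line = median(x for x, _ in points)
    y_line = median(y for _, y in points)

    counts = {"upper_left": 0, "upper_right": 0,
              "lower_left": 0, "lower_right": 0}
    for x, y in points:
        if x == x_line or y == y_line:
            continue  # points on a line are not counted
        vertical = "upper" if y > y_line else "lower"
        horizontal = "right" if x > x_line else "left"
        counts[f"{vertical}_{horizontal}"] += 1

    a = counts["upper_left"] + counts["lower_right"]
    b = counts["upper_right"] + counts["lower_left"]
    return {"A": a, "B": b, "Q": min(a, b), "N": a + b}


def is_related(q, limit):
    """Apply the final rule: related only if Q is less than the limit.

    `limit` must come from the trend test table for the computed N.
    """
    return q < limit


# Made-up points that rise steadily from lower left to upper right.
rising = [(1, 2), (2, 3), (3, 5), (4, 4), (5, 7), (6, 8), (7, 8.5), (8, 10)]
print(quadrant_sums(rising))
```

For a strongly linear pattern like the one in this example, almost every point lands in one diagonal pair of quadrants, so the other pair is nearly empty and Q comes out small. A small Q relative to the limit is what indicates that the variables are related.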

## Comments (3)

What data does scatter analysis use? Is it discrete data?

Hi Kelvin, I’ve addressed this question in the private member area. Thanks.

On a linear pattern, Q will be mostly equal to or greater than N.

Then these variables have some relation, right?

Please explain: “If Q is less than the limit, the two variables are related.

If Q is greater than or equal to the limit, the pattern may have originated from random chance.”