What is Central Tendency?
Central tendency refers to a statistical measure that identifies the center of the distribution of a data set. It is also called a measure of central location. In other words, it is the single value that is most representative of the entire data set.
Central tendency reflects the principle that, for normally distributed data, the three main measures (mean, median, mode) tend to be roughly the same.
When Do You Use Central Tendency
Central tendency is one of the most widely used tools in basic data analysis. Depending on the situation, the appropriate measure of central tendency could be the mean, the median, or the mode.
The mean is the best-known measure of central tendency and the most common way to measure the center of a continuous data set. The mean is the total of all data values divided by the number of data points; it is also called the arithmetic mean. Other types of mean exist, such as the geometric mean and the weighted mean. The population mean is denoted by μ (“mu”), whereas the sample mean is denoted by X̅.
X̅ = ΣX / n
Where X represents each data value and n is the sample size
An advantage of the mean is that the data need not be sorted, and it uses every data value in the calculation.
A disadvantage of the mean is that it is influenced by outliers; also, the mean may not equal the actual value of any data point.
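The definition and the outlier sensitivity described above can be sketched in a few lines of Python. The sample values are hypothetical and chosen only for illustration; they do not come from any data set in this article.

```python
from statistics import mean

# Hypothetical sample values, chosen only for illustration.
weights = [52, 54, 55, 55, 57]

# Arithmetic mean: the total of all values divided by the number of values.
x_bar = sum(weights) / len(weights)

# One extreme value pulls the mean away from the bulk of the data.
with_outlier = weights + [120]
x_bar_outlier = mean(with_outlier)

print(x_bar)
print(x_bar_outlier)
```

Note that neither result needs to coincide with a value that actually appears in the sample, and that no sorting was needed.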
The median is the middle value when the data are arranged in ascending or descending order. If the data set has an even number of values, the median is the average of the two middle values.
An advantage of the median is that it indicates where most of the data are located, and it is not affected by outliers.
A disadvantage of the median is that the data must be sorted, and the median varies more from sample to sample than the mean.
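As a minimal sketch of the rule above, the function below sorts the data and handles the odd and even cases separately. The function name and the sample values are hypothetical.

```python
def median(values):
    """Return the middle value of the sorted data.

    For an even number of values, return the average of the two middle values.
    """
    ordered = sorted(values)  # the median requires sorted data
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical samples; 120 plays the role of an outlier.
odd_sample = [57, 52, 55, 54, 120]
even_sample = [57, 52, 55, 54, 56, 120]

print(median(odd_sample))
print(median(even_sample))
```

In both samples the outlier sits at the end of the sorted list, so it has no effect on the middle value. Python's standard library offers the same calculation as `statistics.median`.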
The mode is the value that occurs most frequently in the data set. A data set can have more than one mode if several values share the highest frequency. In a histogram, the mode is the value with the highest bar.
An advantage of the mode is that, unlike the median, it does not require sorting the data, and it is not influenced by outliers.
A disadvantage of the mode is that a data set will sometimes have multiple modes and occasionally no mode at all.
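The three cases mentioned above (one mode, several modes, no mode) can be illustrated with a frequency count. This is a sketch under one assumption: a data set in which no value repeats is treated as having no mode. The function name and inputs are hypothetical.

```python
from collections import Counter


def modes(values):
    """Return every value that shares the highest frequency.

    Assumed convention: if no value repeats, the data set has no mode
    and an empty list is returned.
    """
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []
    return sorted(v for v, c in counts.items() if c == top)


print(modes([50, 55, 55, 60]))   # one mode
print(modes([1, 1, 2, 2, 3]))    # two modes
print(modes([1, 2, 3]))          # no mode under the assumed convention
```

The standard library's `statistics.multimode` is similar, but it returns every value when none repeats, so the convention should be chosen to suit the analysis.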
Notes about Central Tendency
The most common assumption of statistical tests is that the data are normally distributed. For normally distributed data we can use any measure of central tendency (mean, median or mode), because for symmetrical, single-peaked data all three values are equal. However, most statisticians use the mean, as it takes every value in the data set into account, so a change in any value affects the mean.
For skewed data, the median is the best measure of central tendency: as the skew increases, the difference between the mean and the median also increases.
A central tendency is rarely perfectly centered. Even if it were, it wouldn’t stay that way. Over time, the process mean can drift by as much as 1.5 standard deviations (1.5 sigma).
If there were no 1.5 sigma drift, a 6 Sigma process would generate only about 2 defects per billion! As it is, because of this drift, the defect rate is 3.4 per million.
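The two defect rates quoted above can be checked with normal tail probabilities. This sketch assumes a normally distributed process with specification limits at 6 sigma on either side of the target; the constant and function names are hypothetical.

```python
import math


def upper_tail(z):
    """P(Z > z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


SPEC_LIMIT = 6.0  # specification limits at +/- 6 sigma from the target
SHIFT = 1.5       # assumed long-term drift of the process mean, in sigma

# Centered process: defects fall in both tails beyond +/- 6 sigma.
centered = 2 * upper_tail(SPEC_LIMIT)

# Shifted process: the near limit is 4.5 sigma away, the far limit 7.5 sigma away.
shifted = upper_tail(SPEC_LIMIT - SHIFT) + upper_tail(SPEC_LIMIT + SHIFT)

print(f"{centered * 1e9:.2f} defects per billion")
print(f"{shifted * 1e6:.2f} defects per million")
```

Using `erfc` rather than subtracting a cumulative probability from 1 avoids losing precision in the far tails, where the probabilities are tiny.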
Examples of Central Tendency in DMAIC
Central tendency is widely used in the Measure phase of DMAIC.
Example 1: Find the mean, median and mode of a metal piece weight distribution
Mean: What is the mean (average) weight of a metal piece (in grams)?
Median: What is the median weight of a metal piece (in grams)?
Sort the data
Since N = 10 (an even number), average the two middle values: (55 + 55)/2 = 55
Mode: What is the mode of the weight distribution of a metal piece (in grams)?
Since the frequency of the 55 weight is 2, the mode of the weight distribution is 55.
Example 2: Find the measures of central tendency using a cumulative frequency distribution.
Mean, Median and Mode of grouped data
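Since we don’t have the exact marks of each individual, approximate mean and median values can be calculated using the frequency distribution
Find the mid mark (midpoint) of each class
Multiply the frequency of each class by its mid mark, then divide the sum of these products by the total frequency to get the mean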
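The grouped-mean steps above can be sketched as follows: take each class mid mark, weight it by the class frequency, and divide the sum by the total frequency. The frequency table here is hypothetical, since the original table is not reproduced in the text; it was chosen only to be consistent with the few figures quoted in this example (a total frequency of 35 and a frequency of 12 for the 18-20 class).

```python
# Hypothetical frequency table: (lower limit, upper limit, frequency).
classes = [
    (12, 14, 4),
    (15, 17, 8),
    (18, 20, 12),
    (21, 23, 7),
    (24, 26, 4),
]

# Total number of observations.
total_frequency = sum(freq for _, _, freq in classes)

# Sum of (mid mark x frequency) over all classes.
weighted_sum = sum(((low + high) / 2) * freq for low, high, freq in classes)

# Approximate mean of the grouped data.
grouped_mean = weighted_sum / total_frequency

print(total_frequency)
print(round(grouped_mean, 2))
```

The result is only an approximation, because every observation in a class is treated as if it sat at the class mid mark.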
Median = L + ((n/2 - FL) / FM) × C
- Where L is the lower boundary of the median class
- n = Total frequency
- FL = Cumulative frequency of the class just below the median class
- FM = Frequency of the median class
- C = Class width
Find the cumulative frequency
The smallest cumulative frequency greater than or equal to 17.5 is 24; the corresponding class, 18-20, is the median class
- L = (17 + 18)/2 = 17.5
- FL =12
- FM =12
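The values above plug into the standard grouped-data median formula, Median = L + ((n/2 - FL) / FM) × C. The sketch below uses L, FL and FM as given. Two inputs are inferred rather than stated: n = 35 (because n/2 is quoted as 17.5) and C = 3 (the 18-20 class runs from boundary 17.5 to 20.5). The function and parameter names are hypothetical.

```python
def grouped_median(lower_boundary, n, cum_freq_before, class_freq, class_width):
    """Approximate median of grouped data by interpolating inside the median class."""
    return lower_boundary + ((n / 2 - cum_freq_before) / class_freq) * class_width


median_marks = grouped_median(
    lower_boundary=17.5,  # L, from the text
    n=35,                 # inferred from n/2 = 17.5
    cum_freq_before=12,   # FL, from the text
    class_freq=12,        # FM, from the text
    class_width=3,        # C, inferred from class boundaries 17.5 to 20.5
)

print(median_marks)
```

The formula assumes that observations are spread evenly across the median class, which is why the result is an estimate rather than an exact value.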
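Mode = L + (p / (p + q)) × C, where L is the lower boundary of the modal class (the class with the highest frequency) and C is the class width
- p = highest frequency minus the frequency of the next lower class
- q = highest frequency minus the frequency of the next higher class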
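The definitions of p and q above fit the standard grouped-data mode formula, Mode = L + (p / (p + q)) × C, with L the lower boundary of the modal class and C the class width. The sketch below is illustrative only: the neighbouring frequencies (8 and 7) are hypothetical, since the text does not give them, and the function and parameter names are assumptions as well.

```python
def grouped_mode(lower_boundary, f_modal, f_lower, f_higher, class_width):
    """Approximate mode of grouped data by interpolating inside the modal class.

    Requires p + q > 0, i.e. the modal class must be strictly higher than
    at least one of its neighbours.
    """
    p = f_modal - f_lower   # highest frequency minus next lower class
    q = f_modal - f_higher  # highest frequency minus next higher class
    return lower_boundary + (p / (p + q)) * class_width


mode_marks = grouped_mode(
    lower_boundary=17.5,  # lower boundary of the assumed modal class 18-20
    f_modal=12,           # highest frequency
    f_lower=8,            # hypothetical frequency of the next lower class
    f_higher=7,           # hypothetical frequency of the next higher class
    class_width=3,
)

print(round(mode_marks, 2))
```

The estimate is pulled toward whichever neighbouring class has the frequency closer to the modal class, here the lower one.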