Non-Parametric Tests and Data Analysis

You use non-parametric tests when you don’t know, can’t assume, and can’t identify what kind of distribution you have.

This always reminds me of the scene in Ghostbusters when they get their first call and head into the hotel, where the manager says ‘I want this thing taken care of quickly!’ Venkman of course replies ‘Hold on, we don’t even know what you have yet.’

What are Non-parametric Tests?

Non-parametric tests, as their name tells us, are statistical tests that do not require you to characterize your population’s distribution with specific parameters (such as a mean and standard deviation). They are also referred to as distribution-free tests because they rest on fewer assumptions (e.g. they do not assume a normal distribution). They are particularly useful for testing hypotheses when the data are non-normal and resist transformation.

Because fewer assumptions are needed, these tests are relatively easy to perform. They are also more robust: outliers and variance heterogeneity have less effect on the results. They can be used for ordinal and sometimes even nominal data.

However, non-parametric tests have disadvantages as well. Firstly, they are usually less powerful than the corresponding parametric tests when the parametric assumptions actually hold, so a larger sample size is preferred if you adopt this approach. Secondly, their results are usually more difficult to interpret, because most non-parametric tests work with the ranks of the observations rather than the original values, which distorts our intuitive understanding of the data. Non-parametric tests are useful and important in many cases, but they are not always the ideal choice.
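To see what working with ranks instead of the original data looks like, here is a minimal Python sketch. The function name average_ranks and the cycle-time numbers are made up for illustration. It assumes the usual convention that tied values share the average of the positions they occupy.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


cycle_times = [12.1, 9.8, 250.0, 12.1, 10.4]  # made-up data with one extreme outlier
print(average_ranks(cycle_times))  # [3.5, 1.0, 5.0, 3.5, 2.0]
```

Notice that the outlier 250.0 simply becomes rank 5. How extreme it is no longer matters, which is why rank-based tests are less sensitive to outliers, and also why the results feel further removed from the original measurements.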

When to use Non-parametric testing?

Non-parametric methods can be used to study data that are ranked in order but have little or no clear numerical interpretation. Because so few assumptions are involved, non-parametric tests have a wide range of applications. They are usually used when little is known about the application in question. They are most appropriate when the data do not follow a normal distribution; normal data can still be analyzed this way, but a parametric test would usually be more powerful. A common check is the Anderson-Darling test, which assesses how well the data fit a specified distribution such as the normal. If the test result is statistically significant, there is evidence that the data do not follow a normal distribution and a non-parametric test is a reasonable choice. In these tests the hypotheses are not about population parameters. Instead, the null hypothesis is very general and usually states that the two populations are equal (in terms of their central tendency). Some situations in which the data do not follow a normal distribution and these tests can be applied easily are as follows:

  • When the outcome is a rank or an ordinal variable – For example, movie rankings.
  • When there are explicit outliers – The samples may show a continuous pattern with a few very extreme values.
  • When the outcome has a clear limit of detection – The measurement method cannot resolve values beyond a certain limit, so some outcomes are known only imprecisely.
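
The Anderson-Darling normality check mentioned above can be sketched with Python’s standard library alone. This is a rough illustration, not a full implementation: it assumes the mean and standard deviation are estimated from the sample, it returns only the A-squared statistic (no p-value), and the function name and both data sets are made up. A larger statistic means a poorer fit to the normal distribution; in practice you would compare it with tabulated critical values or let your statistics package report the p-value.

```python
from math import log
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(sample):
    """A-squared statistic for a normal fit with estimated mean and stdev."""
    n = len(sample)
    fitted = NormalDist(mean(sample), stdev(sample))
    cdf = [fitted.cdf(x) for x in sorted(sample)]
    # Sum over i = 1..n of (2i - 1) * [ln F(x_(i)) + ln(1 - F(x_(n+1-i)))]
    total = sum(
        (2 * i - 1) * (log(cdf[i - 1]) + log(1 - cdf[n - i]))
        for i in range(1, n + 1)
    )
    return -n - total / n


roughly_normal = [9.1, 9.6, 9.8, 10.0, 10.1, 10.3, 10.5, 10.9]  # made-up data
skewed = [1, 1, 1, 2, 2, 3, 5, 40]                              # made-up data

print(anderson_darling_normal(roughly_normal))  # small value: good fit
print(anderson_darling_normal(skewed))          # much larger value: poor fit
```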

Types of Non-parametric Tests:

There are many types of non-parametric tests. Some of them are discussed below:

  1. Sign Test – A simple test that can be applied when the usual conditions for the single-sample t-test are not met. It involves running a binomial test on the signs of the differences between each observation and the hypothesized median. It can be performed quite easily in Excel as well.
  2. Mood’s Median Test (for two samples) – A basic two-sample version of the sign test above. It tests whether the medians of two independent samples are equal. It can also be applied to more than two samples.
  3. Wilcoxon Signed-Rank Test for a Single Sample – If the requirements for the single-sample t-test are not fulfilled, this test can be used, provided the sample comes from a population with a symmetric distribution. It is also a fairly basic test. It has two subtypes: the exact test and the advanced one.
  4. Mann-Whitney Test for Independent Samples – This is a non-parametric alternative to the t-test for two independent samples. It is equivalent to the Wilcoxon rank-sum test. It has three types: the exact test, the median confidence interval and the advanced one.
  5. Wilcoxon Signed-Rank Test for Paired Samples – This test is mainly an alternative to the t-test for paired samples, i.e. if the requirements for the paired t-test are not satisfied, we can perform this test instead. It has three requirements, all of which must be satisfied. It has two methods: the exact one and the advanced one.
  6. McNemar Test – This is a matched-pair test used to analyze data collected before and after an event. It tells us whether there is a significant change between the two measurements. McNemar’s Test can be used with paired samples where the dependent variable is dichotomous.
  7. Runs Test – This test is usually used to determine whether a sequence of events is random. It comes in one-sample and two-sample versions, depending on the data available. It is also known as the Wald-Wolfowitz runs test. The two-sample version determines whether the two samples come from the same distribution.
  8. Resampling Procedures – These work on the assumption that the original population is distributed like the given sample. Treating the sample as a pseudo-population, we draw a large number of new samples from it and use them to reach conclusions.
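
As an illustration of the first test in the list, the sign test can be sketched in a few lines of Python. This is a minimal version under stated assumptions: the function name sign_test, the hypothesized median of 5.0 and the data are made up, observations equal to the hypothesized median are dropped, and the two-sided p-value is taken as twice the smaller binomial tail (capped at 1).

```python
from math import comb


def sign_test(sample, hypothesized_median):
    """Two-sided exact sign test; returns (positives, n used, p-value)."""
    # Observations equal to the hypothesized median carry no sign and are dropped.
    diffs = [x - hypothesized_median for x in sample if x != hypothesized_median]
    n = len(diffs)
    n_pos = sum(1 for d in diffs if d > 0)
    k = min(n_pos, n - n_pos)
    # P(X <= k) for X ~ Binomial(n, 0.5): under the null, each sign is a coin flip.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return n_pos, n, min(1.0, 2 * tail)


measurements = [5.2, 6.1, 4.8, 7.3, 6.6, 5.9, 6.8, 7.0, 6.4, 5.5]  # made-up data
n_pos, n, p_value = sign_test(measurements, hypothesized_median=5.0)
print(n_pos, n, p_value)  # 9 10 0.021484375
```

Here 9 of the 10 observations lie above 5.0, which would be unlikely if the true median were 5.0, so the small p-value is evidence against that hypothesis.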

The tests above are the most commonly used ones. Other examples of non-parametric tests are the Chi-square Test of Independence, the Kolmogorov-Smirnov (KS) Test, the Kruskal-Wallis Test, Spearman’s Rank Correlation, Kendall’s Tau Correlation, the Friedman Test and Cochran’s Q Test.

Non-Parametric notes:

Non-parametric tests can be applied to correlation studies
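
For example, Spearman’s rank correlation is simply the ordinary correlation coefficient computed on ranks rather than on the raw values. The sketch below is illustrative only: the function names and data are made up, ties receive average ranks, and it assumes neither variable is constant (otherwise the denominator is zero).

```python
from math import sqrt
from statistics import mean


def ranks(values):
    """1-based ranks; tied values get the average of their positions."""
    ordered = sorted(values)
    # First position of v plus half the number of additional tied copies.
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]


def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(sxx * syy)  # assumes neither variable is constant


x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 200]  # made-up data: increasing, but far from linear
print(spearman_rho(x, y))  # 1.0, because the ordering agrees perfectly
```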

Non-parametric tests of equality of population medians – Mood’s Median, Mann-Whitney, and Kruskal-Wallis
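
As a rough sketch of one of these, the Mann-Whitney U statistic can be computed by counting, over all pairs, how often a value from one sample exceeds a value from the other. The function name and data below are made up, and the p-value uses the plain normal approximation with no tie or continuity correction, so treat it as an approximation only; for small samples an exact test is preferable.

```python
from math import sqrt
from statistics import NormalDist


def mann_whitney_u(a, b):
    """Return (U for a, U for b, approximate two-sided p-value)."""
    # U for sample a: pairs where the value from a beats the value from b;
    # ties count as half a win.
    u_a = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    n1, n2 = len(a), len(b)
    u_b = n1 * n2 - u_a
    # Normal approximation: under the null, U has mean n1*n2/2.
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (min(u_a, u_b) - mu) / sigma  # never positive, since min(U) <= mu
    return u_a, u_b, min(1.0, 2 * NormalDist().cdf(z))


before = [1, 2, 3]  # made-up data
after = [4, 5, 6]   # made-up data
print(mann_whitney_u(before, after))  # U values are 0.0 and 9.0
```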

Non-Parametric Test of equality of population variances – Levene’s Test

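Levene’s test – evaluates equality of variances with an ANOVA F-test on the absolute deviations of each observation from its group center (with only two groups, this is equivalent to a t-test)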
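
A minimal sketch of the Levene statistic, assuming deviations are measured from each group’s mean (some versions use the median instead). It computes a one-way ANOVA F-ratio on the absolute deviations; with exactly two groups that ratio equals the square of the corresponding t statistic. The function name and data are made up, and the result would still need to be compared with an F distribution to get a p-value.

```python
from statistics import mean


def levene_statistic(*groups):
    """Levene's W: one-way ANOVA F-ratio on absolute deviations from group means."""
    z = []
    for g in groups:
        center = mean(g)
        z.append([abs(x - center) for x in g])
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    z_means = [mean(zi) for zi in z]
    grand = sum(sum(zi) for zi in z) / n_total
    between = sum(len(zi) * (m - grand) ** 2 for zi, m in zip(z, z_means))
    within = sum((v - m) ** 2 for zi, m in zip(z, z_means) for v in zi)
    # Assumes the deviations are not all identical within every group (within > 0).
    return (n_total - k) / (k - 1) * between / within


line_a = [1, 2, 3]  # made-up data, small spread
line_b = [0, 2, 7]  # made-up data, larger spread
print(levene_statistic(line_a, line_b))  # about 4.5
```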


Six Sigma Black Belt Certification Non-Parametric Tests and Data Analysis Questions:

Question: A black belt would use non-parametric statistical methods when:

(A) knowledge of the underlying distribution of the population is limited
(B) the measurement scale is either nominal or ordinal
(C) the statistical estimation is required to have higher assurance
(D) management requires substantial statistical analysis prior to implementing

Answer: A: knowledge of the underlying distribution of the population is limited.

You use non-parametric tests when you can’t identify or assume what kind of distribution you have, so A is the easy choice. You can eliminate C and D, as they have no bearing on the problem; B describes data that non-parametric tests can handle, but A is the more general reason.
