Sum of Squares (SS) is a statistical method to know the data dispersion and to determine mathematically best fit model in regression analysis. Sum of squares is one of the critical outputs in regression analysis. A large value of sum of squares indicates large variance. In other words, individual values are varying widely from the mean.

The Sum of Squares (SS) technique calculates a measure of the variation in an experiment. Used in Designed experiments and Anova.

Sum of Square formula

Sum of Squares
  • x̅ is the mean
  • xi is the frequency

Types of Sum of Squares

Sum of Squares are widely used in the regression analysis. There are three basic types of sum of squares

  • Total sum of squares
  • Sum of squares regression
  • Sum of squares error

Total sum of squares: TSS represents the total sum of squares. It is the squared values of the dependent variable to the sample mean. In other words, the total sum of squares measures the variation in a sample.

Sum of Squares
  • yi is the one of the value in the sample
  • y̅ is the sample mean

Sum of squares regression: Sum of squares due to regression is represented by SSR. It is the difference between the predicted value and the sample mean.

  • ŷi is the predicted value
  • y̅ is the sample mean

Sum of squares error: SSE represents sum of squares error, also known as residual sum of squares. It is the difference between the observed value and the predicted value.

Usually, the lower the sum of squares error better model the regression. SSE is that part of the total variation which is not modeled by the regression line.

Sum of Squares
  • yi is the one of the value in the sample
  • ŷi is the predicted value

The relationship between TSS, SSR and SSE is mathematically represented by

TSS = SSR+SSE

The Sum of Squares is used in ANOVA. ANOVA uses sum of squares between(SSB) group and sum of squares within groups (SSW).

Total sum of squares TSS = SSB+SSW

Uses of Sum of Squares

Sum of squares explains how many individual values are away from the mean, it helps to know the variability in the data. Standard deviation and variance are the two important parameters in statistics, but to compute these values first, we need to calculate the sum of squares.

How to calculate the Sum of Squares

Step 1: List all of the values.

Step 2: Calculate the mean (arithmetic average) of all values. Summing up all the values and divided by number of values.

Step 3: Subtract each value from the mean. If the value is less than the mean, then it produces a negative value.

Step 4: Square of each value, now the result is always positive.

Step 5: Add all the Squared values is the Sum of Squares.

Example Sum of Squares problem.

Calculate the sum of squares of 10 students’ weights (in lbs) are 67, 86,62,77,73,61,80,75,69,73.

Step 1: List all of the values.

Step 2: Calculate the mean (arithmetic average) of all values. Summing up all the values and divided by number of values.

67+86+62+77+73+61+80+75+69+73/10= 723/10=72.3

Step 3: Subtract each value from the mean. If the value is less than the mean, then it produces a negative value

Step 4: Square of each value, now the result is always positive

Step 5: Add all the Squared values is the Sum of Squares

Sum of squares = Σ(xi- x̅)2 = 28.09+187.7+106.1+22.09+0.49+127.7+59.29+7.29+10.89+0.49 = 550.1

Important Videos

Author

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.