*W.R. Wilcox, Clarkson University, October 2010*

**Explanation of results returned by the Regression tool in Excel’s Data Analysis**

The Regression tool in Excel’s Data Analysis determines the coefficients (a_{i}) that yield the smallest residual sum of squares (see below), which is equivalent to the greatest squared correlation coefficient, R^{2}, for equation (1). This is known as linear regression analysis.

y = a_{0} + a_{1}x_{1} + a_{2}x_{2} + a_{3}x_{3} + … (1)

where y is the dependent variable (response), a_{0} is the intercept, and x_{1}, x_{2}, x_{3}, etc. are the independent variables (factors). It is assumed that you have n observations of y versus different values of the x_{i}. Note that the x_{i} can be functions of the actual experimental variables, including products of different variables. Following are the meanings of the results produced by Excel:

· R Square = (Multiple R)^{2} = R^{2} = 1 - Residual SS / Total SS = Regression SS / Total SS

(roughly the fraction of the variation in y that is explained by equation 1).

· Multiple R = the correlation coefficient between the observed values of y and those predicted by equation (1)

· Adjusted R Square = 1 - (Total df / Residual df)(Residual SS / Total SS)

· Standard Error = (Residual MS)^{0.5}

· Note that “Error” does __not__ mean there’s a mistake or an experimental error; it is simply a term defined in relation to the residuals. The word is used throughout statistical analysis and is often misunderstood by those unfamiliar with statistics. Unfortunately, this might include your boss.
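As a concrete check of these definitions, here is a small pure-Python sketch that fits a one-variable form of equation (1) by least squares and reproduces R Square, Adjusted R Square, and the Standard Error. The data are made up purely for illustration:

```python
import math

# Hypothetical data: n = 5 observations of y versus one factor x
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

# Least-squares coefficients for y = a0 + a1*x (one-variable equation (1))
xbar = sum(x) / n
ybar = sum(y) / n
a1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
a0 = ybar - a1 * xbar

pred = [a0 + a1 * xi for xi in x]
residual_ss = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
total_ss = sum((yi - ybar) ** 2 for yi in y)

r_square = 1 - residual_ss / total_ss          # R Square
total_df = n - 1                               # Total df
residual_df = n - 1 - 1                        # one independent variable
adj_r_square = 1 - (total_df / residual_df) * (residual_ss / total_ss)
standard_error = math.sqrt(residual_ss / residual_df)   # (Residual MS)^0.5

print(a0, a1, r_square, adj_r_square, standard_error)
```

Running Excel's Regression tool on the same two columns should report these same values.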

__ANOVA = ANalysis Of VAriance__

· Regression df = regression degrees of freedom = number of independent variables (factors) in equation 1. See, for example, http://en.wikipedia.org/wiki/Degrees_of_freedom_%28statistics%29

· Regression SS = Total SS - Residual SS

· Regression MS = Regression SS / Regression df

· Regression F = Regression MS / Residual MS

· Significance F = FDIST(Regression F, Regression df, Residual df) = the probability, based on the F distribution, of obtaining a Regression F this large purely by chance if equation (1) actually explained none of the variation in y, i.e. that any fit is purely by chance. If Significance F is not less than 0.1 (10%), you do not have a meaningful correlation.

· Residual df = residual degrees of freedom = Total df - Regression df = n - 1 - number of independent variables (x_{i})

· Residual SS = sum of squares of the differences between the values of y predicted by equation 1 and the actual values of y. If the data exactly fit equation 1, then Residual SS would be 0 and R^{2} would be 1.

· Residual MS = mean square error = Residual SS / Residual df

· Total df = total degrees of freedom = n – 1

· Total SS = the sum of the squares of the differences between values of y and the average y = (n - 1)(standard deviation of y)^{2}
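The ANOVA entries can be reproduced directly from the definitions above. Here is a pure-Python sketch with a small made-up data set (one independent variable, so Regression df = 1); Significance F itself requires the F distribution, which Excel supplies via FDIST:

```python
# Hypothetical data: one independent variable
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

# Fit y = a0 + a1*x by least squares
xbar, ybar = sum(x) / n, sum(y) / n
a1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
a0 = ybar - a1 * xbar
pred = [a0 + a1 * xi for xi in x]

# ANOVA table entries
total_df = n - 1
regression_df = 1                         # one factor in equation (1)
residual_df = total_df - regression_df

total_ss = sum((yi - ybar) ** 2 for yi in y)
residual_ss = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
regression_ss = total_ss - residual_ss    # Regression SS = Total SS - Residual SS

regression_ms = regression_ss / regression_df
residual_ms = residual_ss / residual_df
f = regression_ms / residual_ms           # Regression F
# In Excel, Significance F for this fit would be =FDIST(f, 1, 3)

print(regression_ss, residual_ms, f)
```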

· Coefficients = values of a_{i} which minimize the Residual SS (maximize R^{2}). The Intercept Coefficient is a_{0} in equation 1.

· Standard error = the estimated standard deviation of the Coefficient in that row. It equals the square root of Residual MS multiplied by the corresponding diagonal element of (X^{T}X)^{-1}, where X is the matrix whose columns are the values of the x_{i}, with a column of ones for the intercept.

· t Stat = Coefficient for that variable / Standard error for that variable

· P-value = TDIST(|t Stat|, Residual df, 2) = the two-tailed probability from Student’s t distribution of obtaining a |t Stat| this large by chance if the true value of the coefficient were 0.

If one divides this by 2, it is the probability that the true value of the coefficient has the opposite sign to that found. You want this probability to be small in order to be sure that this variable really influences y, certainly less than 0.1 (10%). If near 50%, try another fit with this variable left out.
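The per-coefficient statistics can be sketched the same way. This pure-Python example uses a small made-up data set and the standard simple-regression formulas (a special case of the matrix formula for the Standard error); the P-value itself requires the t distribution, which Excel supplies via TDIST:

```python
import math

# Hypothetical data: one independent variable
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
a1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
a0 = ybar - a1 * xbar

residual_df = n - 2
residual_ms = sum((yi - (a0 + a1 * xi)) ** 2
                  for xi, yi in zip(x, y)) / residual_df

# Standard errors of the coefficients (simple-regression special case of
# sqrt(Residual MS * diagonal element of (X'X)^-1))
se_a1 = math.sqrt(residual_ms / sxx)
se_a0 = math.sqrt(residual_ms * (1 / n + xbar ** 2 / sxx))

t_a1 = a1 / se_a1   # t Stat; Excel: P-value = TDIST(ABS(t_a1), residual_df, 2)
t_a0 = a0 / se_a0

print(se_a0, se_a1, t_a0, t_a1)
```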

· There is a 95% probability that the true value of the coefficient lies between the Lower 95% and Upper 95% values; the probability is 2.5% that it lies below the lower value, and 2.5% that it lies above the upper. The narrower this range, the better. If the lower value is negative and the upper value positive, try correlating the data with this variable left out. If the resulting R^{2} and Significance F are either improved or little changed, then the data do not support inclusion of that variable in your correlating equation.
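To illustrate with a small made-up data set: for Residual df = 3 the two-tailed 5% critical value of Student's t is about 3.182 (from a t table, or TINV(0.05, 3) in Excel), so the Lower 95% and Upper 95% limits for the slope coefficient are:

```python
import math

# Hypothetical data: one independent variable
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
a1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
a0 = ybar - a1 * xbar
residual_ms = sum((yi - (a0 + a1 * xi)) ** 2
                  for xi, yi in zip(x, y)) / (n - 2)
se_a1 = math.sqrt(residual_ms / sxx)

t_crit = 3.182                     # two-tailed 5% point of t with 3 df
lower_95 = a1 - t_crit * se_a1     # Lower 95%
upper_95 = a1 + t_crit * se_a1     # Upper 95%

print(lower_95, upper_95)          # the interval spans zero
```

Because the lower limit is negative and the upper limit positive, these (made-up) data alone would not justify keeping this variable in the correlating equation.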

See any statistics textbook for more detailed information and fundamental understanding. Comments and suggestions on this page would be greatly appreciated: wilcox@clarkson.edu


*Last modified October 27, 2010*