How to Interpret Regression Results in Excel – Detailed Analysis

 

What Is Regression?

Regression analysis is often used in data analysis to determine the associations among multiple variables. Regression analysis allows you to choose what happens to the dependent variable if one of the independent variables is altered. It also mathematically shows which independent variables have an influence.

Simple linear regression is distinct from multiple linear regression in statistics. Using a linear function, simple linear regression analyzes the association between the variables and an independent variable. Multiple linear regression is when two or more factors are used to determine the variables. This article will focus on multiple linear regression on a data set.


How to Do Regression in Excel

The dataset below will be used for analysis.The independent variable is:  Price  and Sold . The independent column is Demand .

Interpret Regression Results in Excel

Steps

  • Go to the Data tab and click on Data Analysis.

Interpret Regression Results in Excel

  • In the new window; select the dependent variable and independent variable data range.
  • Check Labels and Confidence.
  • Click the output cell range box to select the output cell address.
  • Check Residual to calculate the residuals.
  • Check Residual Plots and Line Fit Plots.
  • Click OK.

Interpret Regression Results in Excel

  • The primary output parameters of the analysis will be displayed.

Interpret Regression Results in Excel

  • Other parameters, for example Significance value, will also be displayed in the ANOVA (Analysis of Variance) table.
  • df is the degree of freedom of the source of variance.
  • SS is the sum of squares. (For better results, the Residual SS should be smaller than the Total SS).
  • MS means square.
  • F is the F-test for the null hypothesis.
  • Significance F is the P-value of F.

Interpret Regression Results in Excel

  • You will also see the variable’s coefficients, significance value, etc in a table.

Interpret Regression Results in Excel

  • The coefficient table contains the residual value for each entry.

Interpret Regression Results in Excel

  • This is the Demand vs Price regression chart, with a trendline.

Interpret Regression Results in Excel

  • This is the Demand vs Sold regression chart with a trendline.

  • This chart shows the distribution of residuals for each entry of the Sold variable.

  • This chart shows the distribution of residuals for each entry of the Price variable.

 


How to Interpret Regression Results in Excel: Detailed Analysis

Multiple R-Squared Regression Value Analysis

The R-squared number indicates how closely the  elements in dataset are related and how well the regression line matches the data. The multiple linear regression analysis will be used to determine the impact of two or more variables on the main factor. The range of this coefficient is from -1 to 1:

  • 1 means a close positive relationship
  • 0 means there are no relationships among variables.
  • -1 means inverse or negative relationship among variables.

In the output results shown above, the multiple R-value of the given data sets is 0.7578(approx), which indicates strong relations between the variables.

R Squared

R squared value explains how the response of the dependent variables varies, according to the independent variable. Here, the value is 0.574(approx), which can be interpreted as a reasonable relationship between the variables.

Adjusted R-Squared

It is merely an alternative version of the R squared value. It shuffles the predictor variables, forecasting the response variable:

R^2 = 1 – [(1-R^2)*(n-1)/(n-k-1)]

Here, R^2: The R^2 value we got from the dataset.

n: the number of observations.

K: the number of predictor variables.

This value is significant in regression analysis between two predictor variables. If there is more than one predictor variable in the dataset, the R squared value will be inflated, which is highly undesirable. The adjusted R squared value adjusts this inflation and gives an accurate picture of the variables.

Standard Error

It is a goodness-of-fit metric that indicates the accuracy of your regression analysis; the lower the value, the more accurate is the regression analysis.

Standard Error is an empirical metric representing the average distance points deviate from the trendline. R2 represents the proportion of dependent variable variation. In this case, the value of Standard Error is 288.9 (approx), which means that our data points, on average, drop 288.9 from the trendline.

Observations

Indicate the number of observations or entries.

Determine Significant Variable

The Significance value indicates the trustworthiness (statistically sound) of the analysis. This value should be below 5%. But in this case, the significance value is  0.00117, which translates to 0.1% – well below the 5%. So the analysis is accurate.

P-value in Regression Analysis

The P-value represents the probability of the coefficient value being wrong: the association of the null hypothesis with the variables.

If your p-value < the Significance number, there is enough evidence to reject the null value hypothesis. This means there is a non-zero correlation between the variables.

If the p-value > Significance value, there will be insufficient evidence to dismiss the null hypothesis. There could be no correlation between the variables.

Here, the P-value of the variable Price =0.000948 < 0.00117 (significance value):

There is no null hypothesis, and there is enough evidence to declare a correlation between variables.

For the variable Sold, the (P-value) 0.0038515 < 0.0011723 (Significance value)

There could be a null hypothesis and there is not enough evidence to declare a non-zero correlation between variables.

 

Regression Equation

To determine the linear regression analysis in Excel, the trend line should also be linear. The general form is:

Y=mX+C.

Here, Y is the dependent variable.

And X is the independent variable: the effect of the change of variable x on variable Y will be determined.

C  is the value of the Y-axis intersection of the line.

The value of the C  intercept, here, is equal to 9502.109853

And the value of m for the two variables is -809.265 and 0.424818.

there’s a final equation for the two separate variables:

The first one is:

Y=-809.265771X+9502.12

And the second  is:

Y=0.4248X+9502.12

Coefficients

The coefficients, here, are m1=-809.2655 and m2=04248. And interceptor, C= 9502.12.

  • The interceptor value indicates that the demand will be 9502 when the price is zero.
  • And the values of m are the rate at which demand changes per unit of price change. The price coefficient value is -809.265, indicating that a per unit increase in price will drop the demand by roughly 809 units.
  • For the second variable, Sold, the m value is 0.424. This shows that the change per unit sold item will result in a 0424-time unit increase of the product.

Residuals

The Residual is the difference between the original and the calculated entry from the regression line. Residuals indicate how distant the actual value is from the line. For example, the computed entry in the regression analysis for the first entry is 9497. And the first original value is 9500. The residual is around 2.109.

T-Statistics Value

T-statics value is the division of coefficient by the standard value. The higher the value is, the more reliable are the coefficients.

There is another significance of this value, which is required to calculate the P-value in linear regression.

The 95% Confidence Interval

Here the confidence of the variable was set as 95 at the beginning. It can change, though.

  • Here, the coefficient value of the lower 95% is calculated as 8496.84. It means the upper 95% is calculated as 10507.37,
  • While the  main coefficient is about 9502.1, there is a high chance that the value is below 8496 for 95% of the cases and a 5% chance of being over 10507.37

Things to Remember

✎ The regression analysis method assesses the relationship among variables under examination. It doesn’t establish causation.

Regression analysis hampers heavily by outliers. All kinds of outliers must be removed before analysis is done.


Download Practice Workbook

Download this practice workbook.

 


Related Article


<< Go Back to Regression Analysis in Excel | Excel for Statistics | Learn Excel

Get FREE Advanced Excel Exercises with Solutions!
Rubayed Razib Suprov
Rubayed Razib Suprov

Rubayed Razib, holding a BSC degree in Naval Architecture & Engineering from Bangladesh University of Engineering and Technology, serves as a devoted member of the ExcelDemy project. He has contributed significantly by authoring numerous articles and showcasing proficiency in VBA. Razib efficiently automates Excel challenges using VBA macros and actively participates in the ExcelDemy forum, providing valuable solutions for user interface challenges. Apart from creating Excel tutorials, he is interested in Data Analysis with MS Excel,... Read Full Bio

We will be happy to hear your thoughts

Leave a reply

Advanced Excel Exercises with Solutions PDF

 

 

ExcelDemy
Logo