Regression analysis is in almost every type of statistical software like **SPSS**,** R, **and not to mention Excel. Regression can give us the big picture about relationships between variables. Linear regression can be done pretty quickly in Excel using the **Data Analysis** tool. This article will show how you can **interpret regression results in Excel.**

**Download Practice Workbook**

Download this practice workbook below.

## What Is Regression?

Regression analysis is often used in data analysis to determine the associations among multiple variables. Regression analysis allows you to choose what happens to the dependent variable if one of the independent variables alters. It also lets you figure out mathematically which independent variables have an influence.

**Simple linear regression** is distinct from a **multiple linear regression** in statistics. Using a linear function, simple **linear regression analyses** the association between the variables and one independent variable. **Multiple linear regression** is when two or more explanatory factors are used to determine the variables. Using nonlinear regression instead of the dependent variable is described as a nonlinear function since the data relationships are not linear. This article will concentrate on **multiple linear regression** to demonstrate how you can** interpret regression results in Excel.**

## Steps to Do Regression in Excel

For regression purposes, we will use the below dataset for analysis purposes. Here the independent variable will be the **Price** column and **Sold** column. The **independent** column will be the **Demand **column.

**Steps**

- We need to go to the
**Data**tab and click on the**Data Analysis**to do regression**.**

- There will be a new window; select the dependent variable and independent variable data range.
- Then tick the
**Labels**box and**Confidence**box. - Then click on the output cell range box to select the output cell address
- Next, tick on the
**Residual**to calculate the residuals. - After that, tick the
**Residual**plots and**Line Fit Plots boxes** - Click
**OK**after this.

- After clicking
**OK,**the primary output parameters of the analysis will be at the specified cells.

- Then you will also get some parameters such as
**Significance**value etc in the**ANOVA**(*Analysis of Variance*) table. - Here,
**df**denotes the degree of freedom related to the source of variance. **SS**denotes the sum of squares. Your model will reflect the data better if the**Residual**SS is smaller than the**Total SS.****MS**means square.**F**denotes the**F**-test for the null hypothesis.**Significance F**denotes the**P**-value of**F**.

- Then you will also get the variable’s coefficients, significance value, etc in a table.

- Then you will get a final table below the coefficient table which contains the residual value for each entry.

- Next, you will get the
**Demand**vs**Price**regression chart, with a trendline.

- After this, you get the
**Demand**vs**Sold**regression chart with a trendline.

- There is another chart showing the distribution of residuals of each entry from the
**Sold**variable.

- There is another chart showing the distribution of residuals of each entry from the
variable.**Price**

Next, we will show you how you can interpret these regression results in Excel.

**Read More:** **How to Do Logistic Regression in Excel (with Quick Steps)**

## How to Interpret Regression Results in Excel

The next thing you need to do after doing the regression analysis and interpret them. The outcomes are described and elaborated on below.

__Multiple R-Squared Regression Value Analysis__

The **R-squared** number indicates how closely the dataset’s elements are related and how well the regression line matches the data. We are going to use the multiple linear regression analysis, in which we are going to determine the impact of two or more variables on the main factor. This refers to how the dependent variable changes as one of the independent variables changes. The range of this coefficient is from -1 to 1. Here,

- 1 means a close positive relationship
- 0 means there are no relationships among variables. In other words, the data points are random.
- -1 means inverse or negative relationship among variables.

In the output results shown above, the multiple R-value of the given data sets is o.7578(*approx*), which indicates strong relations between the variables.

__R Squared__

**R squared **value explains how the response of dependent variables varies to the independent variable. In our case, the value is 0.574(approx), which can be interpreted as a reasonably okay relationship between the variables.

__Adjusted R-Squared__

This is merely an alternate version of the** R squared** value. This simply shuffles the **predictor **variables while forecasting the **response** variable. It calculates as

**R^2** = **1 – [(1-R^2)*(n-1)/(n-k-1)]**

Here, **R^2**: The **R^2** value we got from the dataset.

**n**: the number of observations.

**K**: the number of predictor variables.

The significance of this value arises while doing regression analysis between two **predictor** variables. If there is more than one **predictor** variable in the dataset, then the R squared value will be inflated, which is highly undesirable. The adjusted **R squared** value adjusts this inflation and gives an accurate picture of the variables.

__Standard Error__

Another goodness-of-fit metric that indicates the accuracy of your regression analysis; the lower the value, the more sure you can be in your regression analysis.

**Standard Error** is an empirical metric representing the average distance that the points deviate from the trendline. In contrast, **R2** represents the proportion of dependent variable variation. In this case, the value of **Standard Error** is 288.9 (*approx*), which denotes that our data points, on average, drop 288.9 from the trendline.

__Observations__

Indicate the number of observations or entries.

__Determine Significant Variable__

The **Significance value **indicates the trustworthiness (statistically sound) of our analysis. In other words, it denotes the probability of our dataset being wrong. This value should be below 5%. But in this case, our significance value is 0.00117, which translates to 0.1%, which is well below the 5%. So our analysis is ok. Otherwise, we may have to choose different variables for our analysis.

__P-value in Regression Analysis__

Closely linked to a significant value, the **P-value** denotes the probability of the coefficient value being wrong. P-value denotes the association of the null hypothesis with the variables.

If your **p-value** < the **Significance **number, there is enough evidence to reject the null value hypothesis. This means there is a non-zero correlation between the variables.

But if the **p-value **> **Significance **value, there will be insufficient evidence to dismiss the null hypothesis. That denotes that there could be no correlation between the variables.

In or case, the **P-value** of variable **Price **=0.000948 < 0.00117 (significance value),

So there is no null hypothesis going on here, and there is enough evidence to declare a correlation between variables.

On the other hand, for the variable **Sold**, the (P-value) 0.0038515 < 0.0011723 (Significance value)

So there could be a null hypothesis going on here, and there is not enough evidence to declare a non-zero correlation between variables.

In most cases, this P-value determines whether a variable will be in the dataset or not. For instance, we should remove the **Sold** variable to preserve the robustness of the dataset.

__Regression Equation__

As we determine the linear regression analysis in Excel, the trend line should also be linear. The general form is:

**Y=mX+C.**

Here,** Y** is the dependent variable.

And X is the independent variable here, meaning that we will determine the effect of the change of variable x on variable Y.

**C** will just be the value of the Y-axis intersection of the line.

In this case, the value of the C intercept is equal to 9502.109853

And the value of m for the two variables is -809.265 and 0.424818.

So, we have the final equation for the two separate variables.

The first one is:

**Y=-809.265771X+9502.12**

And the equation for the second variable is:

**Y=0.4248X+9502.12**

__Coefficients__

The coefficients we got are **m1=-809.2655** and **m2=04248**. And interceptor, **C= 9502.12**.

- First, the interceptor value indicates that the demand will be 9502 when the price is zero.
- And the values of
**m**denote the rate at which demand changes per unit of price change. The price coefficient value is -809.265, indicating that a per unit increase in price will drop the demand by roughly 809 units. - For the second variable, Sold, the m value is 0.424. This denotes that the change per unit sold item will result in a 0424-time unit increase of the product.

__Residuals__

The **Residual **difference between the original and the calculated entry from the regression line is the difference. **Residuals** indicate how distant the actual value is from the line. For example, the computed entry from the regression analysis for the first entry is 9497. And the first original value is 9500. So the residual is around 2.109.

__T-Statistics Value__

T-statics value is the division of coefficient by the standard value. The higher the value is, the better reliability of the coefficient indicates.

There is another significance of this value, which is required to** calculate the P-value.**

__The 95% Confidence Interval__

Here the confidence of the variable we set as 95 at the beginning. It can change, though.

- Here, the coefficient value of the lower 95% is calculated as 8496.84 means the upper 95% is calculated as 10507.37,
- This means that while our main coefficient is about, 9502.1.there is a high chance that the value could be below 8496 for 95% of cases and a 5% chance of it being over 10507.37

Read More:

**Things to Remember**

**✎ **The regression analysis method solely assesses the relationship among variables under examination. It doesn’t establish causation. In other ways, only the aspect of correlation considers. When some act causes something, it’s become causation. When an alteration of one variable creates changes, it can be considered causation.

**✎** Regression analysis hampers heavily by outliers. All kinds of outliers must be removed before analysis is done. To analyze and interpret regression analysis results in Excel, you need to consider these points.

## Conclusion

To sum it up, the question “how to interpret regression results in Excel” is answered by elaborately analyzing and later interpreting it. The analysis is done through the **Data Analysis** tool in the **Data** tab.

For this problem, a workbook is available for download where you can practice regression analysis and interpret it.

Feel free to ask any questions or feedback through the comment section. Any suggestion for the betterment of the **Exceldemy** community will be highly appreciable.