## What Is Regression?

Regression analysis is often used in data analysis to determine the associations among multiple variables. Regression analysis allows you to choose what happens to the dependent variable if one of the independent variables is altered. It also mathematically shows which independent variables have an influence.

**Simple linear regression** is distinct from multiple linear regression in statistics. Using a linear function, simple **linear regression analyzes** the association between the variables and an independent variable. **Multiple linear regression** is when two or more factors are used to determine the variables. This article will focus on **multiple linear regression on a data set**.

## How to Do Regression in Excel

The dataset below will be used for analysis.The independent variable is:Â *Price*Â and *Sold *. The *independent* column is *Demand* .

**Steps**

- Go to the
**Data**tab and click on**Data Analysis****.**

- In the new window; select the dependent variable and independent variable data range.
- Check
**Labels**and**Confidence**. - Click the output cell range box to select the output cell address.
- Check
**Residual**to calculate the residuals. - Check
**Residual Plots**and**Line Fit Plots**. - Click
**OK**.

- The primary output parameters of the analysis will be displayed.

- Other parameters, for example
**Significance**value, will also be displayed in the**ANOVA**(*Analysis of Variance*) table. **df**is the degree of freedom of the source of variance.**SS**is the sum of squares. (For better results, the**Residual**SS should be smaller than the**Total SS**).**MS**means square.**F**is the**F**-test for the null hypothesis.**Significance F**is the**P**-value of**F**.

- You will also see the variable’s coefficients, significance value, etc in a table.

- The coefficient table contains the residual value for each entry.

- This is the
*Demand*vs*Price*regression chart, with a trendline.

- This is the
*Demand vs Sold*regression chart with a trendline.

- This chart shows the distribution of residuals for each entry of the
*Sold*variable.

- This chart shows the distribution of residuals for each entry of the
variable.**Price**

## How to Interpret Regression Results in Excel: Detailed Analysis

__Multiple R-Squared Regression Value Analysis__

The **R-squared** number indicates how closely theÂ elements in dataset are related and how well the regression line matches the data. The multiple linear regression analysis will be used to determine the impact of two or more variables on the main factor. The range of this coefficient is from -1 to 1:

- 1 means a close positive relationship
- 0 means there are no relationships among variables.
- -1 means inverse or negative relationship among variables.

In the output results shown above, the multiple R-value of the given data sets is 0.7578(*approx*), which indicates strong relations between the variables.

__R Squared__

**R squared **value explains how the response of the dependent variables varies, according to the independent variable. Here, the value is 0.574(approx), which can be interpreted as a reasonable relationship between the variables.

__Adjusted R-Squared__

It is merely an alternative version of the** R squared** value. It shuffles the **predictor **variables, forecasting the **response** variable:

**R^2** = **1 â€“ [(1-R^2)*(n-1)/(n-k-1)]**

Here, **R^2**: The **R^2** value we got from the dataset.

**n**: the number of observations.

**K**: the number of predictor variables.

This value is significant in regression analysis between two **predictor** variables. If there is more than one **predictor** variable in the dataset, the R squared value will be inflated, which is highly undesirable. The adjusted **R squared**Â value adjusts this inflation and gives an accurate picture of the variables.

__Standard Error__

It is a goodness-of-fit metric that indicates the accuracy of your regression analysis; the lower the value, the more accurate is the regression analysis.

**Standard Error** is an empirical metric representing the average distance points deviate from the trendline. **R2** represents the proportion of dependent variable variation. In this case, the value of **Standard Error** is 288.9 (*approx*), which means that our data points, on average, drop 288.9 from the trendline.

__Observations__

Indicate the number of observations or entries.

__Determine Significant Variable__

The **Significance value **indicates the trustworthiness (statistically sound) of the analysis. This value should be below 5%. But in this case, the significance value isÂ 0.00117, which translates to 0.1% – well below the 5%. So the analysis is accurate.

__P-value in Regression Analysis__

The **P-value** represents the probability of the coefficient value being wrong: the association of the null hypothesis with the variables.

If your **p-value** < the **Significance **number, there is enough evidence to reject the null value hypothesis. This means there is a non-zero correlation between the variables.

If the **p-value **> **Significance **value, there will be insufficient evidence to dismiss the null hypothesis. There could be no correlation between the variables.

Here, the **P-value** of the variable **Price **=0.000948 < 0.00117 (significance value):

There is no null hypothesis, and there is enough evidence to declare a correlation between variables.

For the variable **Sold**, the (P-value) 0.0038515 < 0.0011723 (Significance value)

There could be a null hypothesis and there is not enough evidence to declare a non-zero correlation between variables.

__Regression Equation__

To determine the linear regression analysis in Excel, the trend line should also be linear. The general form is:

**Y=mX+C.**

Here,** Y** is the dependent variable.

And X is the independent variable: the effect of the change of variable x on variable Y will be determined.

**C**Â is the value of the Y-axis intersection of the line.

The value of the CÂ intercept, here, is equal to 9502.109853

And the value of m for the two variables is -809.265 and 0.424818.

there’s a final equation for the two separate variables:

The first one is:

**Y=-809.265771X+9502.12**

And the secondÂ is:

**Y=0.4248X+9502.12**

__Coefficients__

The coefficients, here, are **m1=-809.2655** and **m2=04248**. And interceptor, **C= 9502.12**.

- The interceptor value indicates that the demand will be 9502 when the price is zero.
- And the values of
**m**are the rate at which demand changes per unit of price change. The price coefficient value is -809.265, indicating that a per unit increase in price will drop the demand by roughly 809 units. - For the second variable, Sold, the m value is 0.424. This shows that the change per unit sold item will result in a 0424-time unit increase of the product.

__Residuals__

The **Residual **is the difference between the original and the calculated entry from the regression line. **Residuals** indicate how distant the actual value is from the line. For example, the computed entry in the regression analysis for the first entry is 9497. And the first original value is 9500. The residual is around 2.109.

__T-Statistics Value__

T-statics value is the division of coefficient by the standard value. The higher the value is, the more reliable are the coefficients.

There is another significance of this value, which is required to** calculate the P-value in linear regression.**

__The 95% Confidence Interval__

Here the confidence of the variable was set as 95 at the beginning. It can change, though.

- Here, the coefficient value of the lower 95% is calculated as 8496.84. It means the upper 95% is calculated as 10507.37,
- While theÂ main coefficient is about 9502.1, there is a high chance that the value is below 8496 for 95% of the cases and a 5% chance of being over 10507.37

**Things to Remember**

**âœŽÂ **The regression analysis method assesses the relationship among variables under examination. It doesn’t establish causation.

**âœŽ** Regression analysis hampers heavily by outliers. All kinds of outliers must be removed before analysis is done.

**Download Practice Workbook**

Download this practice workbook.

## Related Article

- How to Get Regression Statistics in Excel
- How to Interpret Linear Regression Results in Excel
- How to Interpret Multiple Regression Results in Excel
- How to Plot Least Squares Regression Line in Excel
- How to Do Logistic Regression in Excel

**<< Go Back to Regression Analysis in ExcelÂ | Excel for StatisticsÂ |Â Learn Excel**