This article illustrates how to perform multiple linear regression on data sets in Excel. You will learn 2 easy ways to do that here. Linear Regression is used to predict the value of a response variable dependent on another explanatory variable. We need to use Multiple Linear Regression instead when there are more than one explanatory or independent variables on which the value of the target variable is dependent. You can easily perform a Multiple Linear Regression analysis or data sets in Excel. Have a quick look through the article to learn how to do that.
What Is Multiple Linear Regression?
Multiple Linear Regression or Multiple Regression is a statistical method to forecast the outcome of a particular variable dependent on several other independent variables. Its purpose is to establish a linear relationship between a dependent variable and several other independent variables. It is the extended version of the Simple Linear Regression which uses only one independent variable.
Assume that there are two independent variables X1 and X2 affecting the value of a dependent variable Y. Then, the equation for Multiple Linear Regression will be:
Here,α stands for the Intercept (constant). α1and α2 represent the change in Y due to the changes in X1 and X2 respectively. On the other hand, ϵ signifies the residual or error.
Multiple Linear Regression assumes a linear relationship between the dependent and the independent variable. But, the relationship between the independent variables is statistically insignificant.
This technique is mainly and extensively used in financial inference and econometrics.
Multiple Linear Regression on Excel Data Sets: 2 Methods
Consider the following dataset. Here, the dataset contains a sample of 10 houses numbered 1 to 10. X1 indicates the age of the houses and X2 denotes the number of grocery stores near each of them. Then prices per unit area of the houses are expressed by Y. Therefore, Y is the dependent variable here. On the other hand, X1 and X2 are the independent variables.
You can perform a Multiple Regression on the above dataset in two ways in Excel. Follow the methods below to do that.
1. Multiple Linear Regression on Data Sets with Data Analysis
Performing a Multiple Linear Regression in Excel involves 3 easy steps as highlighted below.
⏩ Enable the Analysis ToolPak
- First, press ALT+F+T to open Excel Options. Next, go to the Add-ins tab. Then, select Excel Add-ins. Click on Go after that.
- Then, check the Analysis ToolPak checkbox. Next, select OK. After that, you can access the Data Analysis feature from the Data tab.
⏩ Perform the Regression Analysis
- Now, select Data >> Data Analysis as shown in the picture below.
- Next, scroll through the Analysis Tools in the Data Analysis dialog box. Then, select Regression and then click OK.
- Now, enter the entire range of Y ($E$4:$E$14) variables including the header cells for Input Y Range using the upward arrow. Next, do the same for Input X Range ($C$4:$D$14). Then, check the Labels checkbox. You can choose the Output Options as required. Finally, select the OK button.
- After that, you will see the analysis result in detail as shown in the following picture.
⏩ Interpret the Results
Here, I will explain the three components of the regression analysis: The Regression Statistics table, the ANOVA table, and the Regression Coefficients table.
Interpret the Regression Statistics Table:
- Let’s discuss the summary output or the Regression Statistics table first. Here, the R Square is of the greatest importance.
- R Square = 0.6998 means 69.98% of the variables can be explained by the regressors or the independent variables.
- The Standard Error signifies the estimated standard deviation for the residual or error.
Interpret the ANOVA Table:
- Now, the Analysis of Variance (ANOVA) table is given below.
- Here, df stands for the degree of freedom and SS signifies the sum of squares of variances.
- The Significance F column has a P-Value of 0148 which is less than 5%. So, we can reject the Null Hypothesis. It concludes that the impact of the independent variables on the dependent variable is statistically significant.
Interpret the Coefficients Table:
- The table below containing the coefficients and other outputs is of the most importance.
- We get the following coefficients from the table above. α0 = 43.11, α1 = – 0.82, α2 = 2.42, and ϵ = 6.58. Therefore, the equation becomes:
2. Multiple Linear Regression on Data Sets with LINEST Function
Alternatively, you can use the LINEST function in Excel to get those results. Enter the following formula in cell H5 to get the desired results.
Then, Excel may show errors in some cells. You can nest the LINEST function inside the IFERROR function to avoid that.
🔎 How Does the Formula Work?
The LINEST Function returns the statistics that describe a linear trend matching known data points, by fitting a straight line using the least-squares method.
- known_ys : Required. The dependent variable i.e. the Y range.
- [known_xs] : Optional. The independent variables i.e. the X1 and X2ranges.
- [const] : Optional. True – constants are calculated normally. False – constants are set equal to zero.
- [stats] : Optional. True – returns the additional regression statistics. False – do not return additional regression statistics.
Read More: How to Do Linear Regression in Excel
Things to Remember
- Excel is limited to a particular number of regressors, possibly 16. Therefore, you cannot use more independent variables than that.
- You must keep the regressors or the independent variables in adjoining columns.
- Excel assumes that the errors are independent with constant variance (homoskedastic). It does not provide any alternatives.
Download Practice Workbook
You can download the practice workbook from the download button below.
Now you know how to perform multiple linear regression on data sets in Excel. Please let us know if the methods were useful to you. You can also use the comment section below for further queries or suggestions. Stay with us and keep learning.