The article will show you some basic methods on how to do multiple regression analysis in Excel. This is a very important topic in the field of statistics. It helps us to predict a dependent variable concerning one or multiple dependent variables.
In the dataset, we have some information about some cars: their names, prices, maximum speeds in miles per hour, the peak power their engine can produce, and the maximum range of distance they can travel without refilling their tank.
Download Practice Workbook
What Is Multiple Regression?
Multiple regression is a statistical process by which we can analyze the relationship between a dependent variable and several independent variables. The purpose of regression is to predict the nature of dependent variables with respect to corresponding independent variables.
2 Steps to Do Multiple Regression Analysis in Excel
Step- 1: Enable the Data Analysis Tab
The Data Tab does not contain the Data Analysis ribbon by default. To activate this, go through the procedure below.
- First, go to File >> Options
- Then select Add-ins >> Excel Add-ins >> Go
- Check Analysis ToolPak in the Add-ins available: section and click OK.
After that, the Data Analysis Ribbon will appear in the Data Tab.
Step- 2: Creating the Multiple Regression Analysis in Excel
Here I’ll show you how to analyze multiple regression.
- From the Data tab >> select Data Analysis
- A dialog box will show up the select Regression and click OK.
A Regression dialog box will appear.
- We will predict the car price according to their maximum speed, peak power and range.
- Select the range of dependent variables (Input Y Range). In my case, it’s C4:C14.
- After that, select the range of independent variables (Input X Range). In my case, it’s D4:F14.
- Check Labels and select New Worksheet Ply: in the Output Options. If you want your regression analysis in the current sheet, put a cell reference where you want to start the analysis in the Output Range
You may choose Residuals if you want to do further analysis.
- After that, you will see the regression analysis in a new sheet. Format the analysis according to your convenience.
Thus you can do multiple regression analysis in Excel.
- How to Do Simple Linear Regression in Excel (4 Simple Methods)
- How to Interpret Regression Results in Excel (Detailed Analysis)
A Brief Discussion about Multiple Regression Analysis in Excel
The regression analysis leaves several values of certain parameters. Let’s see what they mean.
In the Regression Statistics portion, we see values of some parameters.
- Multiple R: This refers to the Correlation Coefficient that determines how strong the linear relationship among the variables is. The range of values for this coefficient is (-1, 1). The strength of the relationship is proportionate to the absolute value of Multiple R.
- R Square: It is another Coefficient to determine how well the regression line will fit. It also shows how many points fall on the regression line. In this example, the value of R2 is 86, which is good. It implies that 86% of the data will fit the multiple regression line.
- Adjusted R Square: This is the adjusted R squared value for the independent variables in the model. It is suitable for multiple regression analysis and so for our data. Here, the value of Adjusted R Square is 79.
- Standard Error: This determines how perfect your regression equation will be. As we are doing a random regression analysis, the value of Standard Error here is pretty high.
- Observations: The number of observations in the dataset is 10.
Analysis of Variance (ANOVA)
In the ANOVA analysis section, we also see some other parameters.
- df: The ‘degrees of freedom’ is defined by df. The value of df here is 3 because we have 3 types of independent variables.
- SS: SS refers to the sum of squares. If the Residual Sum of the Square is much smaller than the Total Sum of Square, your data will fit in the regression line more conveniently. Here, the Residual SS is much smaller than Total SS, so we can surmise that our data may fit in the regression line in a better way
- MS: MS is the mean square. The value of Regression and Residual MS is 78 and 5372210.11 respectively.
- F and Significance F: These values determine the reliability of the regression analysis. If the Significance F is less than 05, the multiple regression analysis is suitable to use. Otherwise, you may need to change your independent variable. In our dataset, the value of Significance F is 0.01 which is good for analysis.
Regression Analysis Output
Here, I will discuss the output of Regression Analysis.
- Coefficients and Others
In this section, we get the value of coefficients for the independent variables- Max. Speed, Peak Power and Range. We can also find the following information for each coefficient: its Standard Error, t Stat, P-value and other parameters.
2. Residual Output
The Residual Values help us to understand how much the predicted price deviates from its actual value and the standard value of residuals that would be acceptable.
The way the prediction by regression analysis works is given below.
Say, we want to predict the price of the first car according to its independent variables. The independent variables are the Max. Speed, Peak Power and Range whose values are 110 miles per hour, 600 horsepower and 130 miles, respectively. The corresponding regression coefficients are 245.43, 38.19 and 94.38. The y intercept value is -50885.73. So the predicted price will be 245.43*110+38.19*600+94.38*130-50885.73≈11295.
According to the dataset of this article, if you want to predict a car’s price which has a maximum speed of x mph, peak power of y hp and range of z miles, the predicted price will be 245.43*x+38.19*y+94.38*z.
Using Graph to Understand Multiple Linear Regression in Excel
If you want to visualize the regression line of your data, let’s go through the procedure below.
- First, from the Data tab >> Go to Data Analysis
- A Data Analysis dialog box will appear then select Regression.
- Finally, click OK.
Another dialog box of Regression will appear.
- Select Residual and Line Fit Plots.
- Click OK.
After that, you will see the graph of the regression line fits according to Max. Speed, Peak Power and Range in a new sheet along with analysis.
Below here, it represents the line fit according to Max. Speed.
And the following image shows the line fit according to Peak Power.
The below picture represents the line fit according to Range.
Please download the workbook and see the plots for a better understanding.
Here, I’m giving you the dataset of this article so that you can analyze multiple linear regression on your own.
Suffice to say, this article will help you understand how to do multiple regression analysis in Excel and it’s a brief description of the parameters. If you have any ideas or feedback regarding this article, please share them in the comment box. This will help me enrich my upcoming articles.