How to Create Multiple Regression Scatter Plot in Excel

When you are doing statistical tasks, I am pretty sure you need to perform multiple regression analysis when there are multiple independent variables that may be related to the dependent variable. This tutorial will demonstrate the steps to create a multiple regression scatter plot in Excel. Scatter plots (also referred to as XY scatter charts or x-y graphs) are useful for interpreting charts. You can use it to display the relationships between different variables. Let’s go through the whole article.


What Is Multiple Regression?

Multiple regression is a mathematical technique that makes statistically based predictions about a dependent variable by using several independent variables. The purpose of multiple regression is to predict the nature of dependent variables with respect to corresponding independent variables. Within the analysis, independent variables affect the dependent variable through the changes we make to them.


How to Create Multiple Regression Scatter Plot in Excel

To demonstrate how to create multiple regression scatter plots in Excel, I have taken a concise dataset of 11 rows and 6 columns, which are Student Name, Gender, Age, Height(cm), Weight(lbs), and Food Intake(gm). Here, Food Intake(gm) is the dependent variable and Age, Height(cm), and Weight(lbs) are the independent variables. In this article, we are going to use the Excel 365 version. You can use any other version according to your convenience.

Dataset for creating multiple regression scatter plots in Excel

Now, let’s go through the following part of this article.


Step-1: Enable the Data Analysis Tab

The Data tab does not contain the Data Analysis command by default. To activate this, go through the below steps.

📌 Steps:

  • First, go to File >> Options.
  • Select Add-ins >> Excel Add-ins >> Go
  • Check Analysis ToolPak in the Add-ins available: section and click OK.

As a result, the Data Analysis Ribbon will appear in the Data Tab.


Step-2: Create the Multiple Regression Analysis in Excel

In this part of the article, I’ll show you how to analyze multiple regression using the Data Analysis Toolpak. To know more, go through the following steps.

📌 Steps:

  • From the Data tab >> select Data Analysis
  • A dialog box will show up. Select Regression from Analysis Tools and then click OK.

selecting regression option from Analysis Tools

As a result, the Regression dialog box will appear. We have already mentioned that in our dataset, Food Intake(gm) is the dependent variable and Age, Height(cm), Weight(lbs) are the independent variables. We will predict the Food Intake(gm) according to Age, Height(cm), and Weight(lbs).

Now,

  • Select the range of dependent variables (Input Y Range). In my case, it’s G4:G14.
  • After that, select the range of independent variables (Input X Range). In my case, it’s D4:F14.
  • Check Labels and select Output Range: in the Output Options. I have selected $B$16 to show regression analysis in the current sheet. You can also select New Worksheet Ply: in the Output Options to show regression analysis.
  • Click OK.

selecting Y and X range to show regression analysis

At this point, you will see regression analysis in cell B16 of your current worksheet.

summary output of regression analysis


Step-3: Create the Multiple Regression Scatter Plot in Excel

Now in this part, we will create scatter plots based on the dependent variable and independent variables. To create a scatter plot, follow the easy steps below.

📌 Steps:

  • Check the Line Fit Plots of Regression dialog box and click OK.

checking Line Fit Plots of the Regression dialog box to create scatter plot

As a result, we have 3 scatter plots of 3 independent variables based on the dependent variable.

scatter plot of Age Vs Food Intake(gm)scatter plot of Height(cm) Vs Food Intake(gm)scatter plot of Weight(lbs) Vs Food Intake(gm)

From these scatter plots, it is clearly visible how these variables are connected with one another. There are three types of Correlation in scatter plotting:

  • Positive Correlation: If, with the increase in the x variable, the y variable also increases, then it is a positive correlation. In the Weight(lbs) Vs Food Intake(gm) scatter plot, we can see that with the increase in weight, the amount of food intake is also increasing.
  • Negative Correlation: If, with the increase in the x variable, the y variable also decreases, then it is a negative correlation.
  • No Correlation: If there is no connection between the variables, then there is no correlation. For example, students’ gender has no correlation with their food intake.

How to Interpret the Result of Excel Multiple Regression

The regression analysis that we have performed has left several values for certain parameters. Let’s see what they mean.

Regression Statistics:

In the Regression Statistics portion, we see values for some parameters.image of regression statistics portion

  • Multiple R: This refers to the Correlation Coefficient that determines how strong the linear relationship among the variables is. The range of values for this coefficient is (-1, 1). The strength of the relationship is proportionate to the absolute value of Multiple R.
  • R Square: It is another Coefficient to determine how well the regression line will fit. It also shows how many points fall on the regression line. In this example, the value of R2 is 94, which is good. It implies that 94% of the data will fit the multiple regression lines.
  • Adjusted R Square: This is the adjusted R squared value for the independent variables in the model. It is suitable for multiple regression analysis and so for our data. Here, the value of Adjusted R Square is 91.
  • Standard Error: This determines how perfect your regression equation will be.
  • Observations: The number of observations in the dataset is 10.

Analysis of Variance (ANOVA):

We can see some other parameters in the ANOVA analysis section.image of anova analysis section

  • df: The ‘degrees of freedom’ is defined by df. The value of df here is 3 because we have 3 types of independent variables.
  • SS: SS refers to the sum of squares. If the Residual Sum of the Square is much smaller than the Total Sum of Square, your data will fit in the regression line more conveniently. Here, the Residual SS is much smaller than Total SS, so we can predict that our data may fit in the regression line in a better way.
  • MS: MS is the mean square. The values of Regression and Residual MS are 411.2481309 and 12.37593454 respectively.
  • F and Significance F: These values determine the reliability of the regression analysis. If the Significance F is less than 05, the multiple regression analysis is suitable to use. Otherwise, you may need to change your independent variable. In our dataset, the value of Significance F is 0.000391613 which is good for analysis.

Regression Analysis Output:

In this part, we will discuss the output of Regression Analysis.

Coefficients and Others:

In this section, we get the values of the coefficients for the independent variables- Age, Height(cm) and Weight(lbs). We can also find the following information for each coefficient: its Standard Error, t Stat, P-value, and other parameters.

image of coefficient and others analysis output

Residual Output:

The Residual Values help us understand how much the predicted price deviates from its actual value and the standard value of residuals that would be acceptable.

image of residual out analysis


Difference Between Linear and Multiple Regression

Linear regression is a simple technique that involves fitting a straight line to a set of data points. It is used when there is only one independent variable and one dependent variable, and the relationship between the two variables is assumed to be linear. The goal of linear regression is to find the line of best-fit line that minimizes the distance between the observed data points and the predicted values.

On the other hand, multiple regression is used when there are multiple independent variables that may be related to the dependent variable. The goal of multiple regression is to find the best-fit equation that can predict the value of the dependent variable based on the values of the independent variables. Multiple regression can help us understand the relative importance of each independent variable in predicting the outcome.


Key Takeaways from This Article

  • In this article, I have shown how to create multiple regression scatter plots in Excel.
  • A chosen real-life dataset for better understanding.
  • Focusing on how to create different scatter plots based on independent and dependent variables.
  • Explained how to interpret the result of multiple regression.
  • Provide solutions to frequently asked questions by readers.
  • Overall, focused on procedures for creating multiple regression scatter plots in Excel.

Download Practice Workbook


Conclusion

In this article, I have tried to cover how you can easily create multiple regression scatter plots in Excel. I hope you found this article informative and enjoyable. For more Excel and Excel VBA-related articles, visit ExcelDemy. If you have any questions, comments, or recommendations, please let me know in the comment section.


Related Article


<< Go Back To Scatter Chart in Excel | Excel Charts | Learn Excel

Get FREE Advanced Excel Exercises with Solutions!
Nujat Tasnim
Nujat Tasnim

Nujat Tasnim, BSc in Computer Science and Engineering from American International University-Bangladesh, has a year-long tenure with ExcelDemy. Initially a Content Developer at SOFTEKO, she transitioned to the Software Team as Junior Software Quality Assurance Engineer. Currently serving as a Junior SQA Engineer at SOFTEKO, she excels in SDLC, STLC, and requirement analysis and possesses profound insights into OOP-related software development processes. Nujat is also proficient in design software like Photoshop and Illustrator, contributing diverse skills to her... Read Full Bio

We will be happy to hear your thoughts

Leave a reply

Advanced Excel Exercises with Solutions PDF

 

 

ExcelDemy
Logo