How to Find Outliers in Regression Analysis in Excel (3 Easy Ways)

Outliers in the Regression Analysis are a common issue. In most cases, Outliers are the result of experimental errors. Having this kind of Outliers can steer the direction of the analysis and create misconceptions about the outcome. If you are baffled to find out the Outliers from your Regression Analysis results, this article may come in handy for you. Here we discuss how you can find Outliers in Regression Analysis in Excel effectively.


How to Find Outliers in Regression Analysis in Excel: 3 Easy Ways

We are going to use 3 separate ways in order to distinguish the Outliers from the Regression Analysis results in Excel. The methods will involve using various types of functions and charts. For better understanding, prior knowledge of the Regression Analysis and basic charts is recommended.


1. Applying Combined Formula

Using the IF function we will get the Outliers from our dataset. We will also use the SQRT, ABS, and SUM functions. We can conduct the Regression Analysis and extract the Outliers from the output.

Steps

  • In the beginning, we need to conduct a Regression Analysis of our dataset.
  • In order to do this, go to Data > Data analysis.

Using IF Formula to Find Outliers in Regression Analysis Excel

  • Then in the new dialog box named Data Analysis, select the option Regression.
  • Then click OK.

Using IF Formula to Find Outliers in Regression Analysis Excel

  • In the Input Y Range box, select the range of cells C4:C15.
  • Then in the Input X Range box, select the range of cells B4:B15.
  • Next tick on the Labels box.
  • Then in the Output options group, select the Output Range and then select cell G5 in the range box.
  • After then, check the Residuals, Standardize Residuals, and Line Fit Plots boxes.
  • Click OK after this.

  • Then you will notice that the Regression Analysis results are now presented in the Excel sheet starting from the G5 cells.

Now we will need the Squared Sum value of the residuals and then we will use that value to calculate the Standard Deviation of the residuals.

  • Select the cell K26, and enter the following formula:
=I26^2
  • Entering this formula will calculate the square of the residual at cell K26.

Using IF Formula to Find Outliers in Regression Analysis Excel

  • Drag the Fill Handle to cell K36.
  • Doing this will fill the range of cells K26:K36 with the squared value of the Residuals.

Using IF Formula to Find Outliers in Regression Analysis Excel

  • Then select the cell K37 and enter the following formula:
=SUM(K26:K36)
  • Doing this will calculate the sum of the range of cells K26:K36, in other words, we got the Sum of the Squared Residuals(SSE).

Using IF Formula to Find Outliers in Regression Analysis Excel

  • Next, to calculate the Standard Deviation(S), we select the cell K38 and enter the following formula:
=2*SQRT(K37/(H10-2))

Using IF Formula to Find Outliers in Regression Analysis Excel

  • This value is the value of the Standard Deviation of the residuals of the given dataset.
  • Now we select the cell L26 and enter the following formula:
=IF(ABS(I26)>$K$38,"Outlier","Not Outlier")

Using IF Formula to Find Outliers in Regression Analysis Excel

  • This formula will check whether the absolute value of the residuals is greater than the Sum of the Squared Residuals(SSE) calculated in the earlier step. If the residual is greater than the SSE value, then the cell will show “Outlier”. Otherwise, it will show “Not Outlier”.
  • Drag the Fill Handle to the cell L36, and doing this will indicate whether the cell value is an Outlier or not in the range of cells L26:L36.
  • We can observe that the first value of the range of cell L26:L36 is showing Outlier And the rest of them are Not Outliers.

  • Next, select the cell D5 and enter the following formula:
=C5+$K$38

  • Then drag the Fill Handle to cell D15, which will fill the range of cell D5:D15 with the upper limit of the casualties Outlier value. Any value above the values will be considered an Outlier.

  • Similar way, select cell E5 and enter the following formula:
=C5-$K$38

  • Then drag the Fill Handle to cell E5. Doing this will fill the range of cell E5:E15 with the lower limit of the Outlier value of Casualties.

Using IF Formula to Find Outliers in Regression Analysis Excel

  • Then we created the graph with the values we obtained so far. Which constitutes the lower and the upper limit of the Outlier values.

Using IF Formula to Find Outliers in Regression Analysis Excel

We can see clearly from the graph that the first value is on the verge of the upper limit. So this value is the Outlier here.

Read More: How to Calculate Outliers in Excel


2. Applying Inter-Quartile Range

By calculating the first and the third Quartiles of a dataset and then estimating the Inter-quartile Range between them, we can separate the Outlier from the dataset. We will use the combination of QUARTILE.EXC and IF functions to achieve this.

Steps

  • In the beginning, select the cell F5 and enter the following formula:
=QUARTILE.EXC(C5:C11,1)

  • In the beginning, select the cell F5 and enter the following formula:
=QUARTILE.EXC(C5:C11,3)

Implementing IQR Method to Find Outliers in Regression Analysis Excel

  • In the beginning, select the cell F7 and enter the following formula:
=F6-F5

Implementing IQR Method to Find Outliers in Regression Analysis Excel

  • In the beginning, select the cell F8 and enter the following formula:
=F6+1.5*F7

Implementing IQR Method to Find Outliers in Regression Analysis Excel

  • In the beginning, select the cell F9 and enter the following formula:
=F5-(F7*1.5)

Implementing IQR Method to Find Outliers in Regression Analysis Excel

  • Then select the cell D5 and enter the following formula:
=IF(C5<G9,"Outlier","Not Outlier")

Implementing IQR Method to Find Outliers in Regression Analysis Excel

  • Then drag the Fill Handle to cell D11.

Implementing IQR Method to Find Outliers in Regression Analysis Excel

  • Doing this will fill the range of cell D5:D11 with text indicating whether the value is an Outlier or not.

Implementing IQR Method to Find Outliers in Regression Analysis Excel

When we created a scatter plot of the graph, we can easily distinguish the Outliers from the rest of the dataset in Excel while conducting regression analysis.

Read More: How to Show Outliers in Excel Graph


3. Using Charts to Find Outliers

Using charts is the fastest way to separate Outliers in Regression Analysis results. Although they can’t be 100 percent accurate and can vary from person to person. It can be used as a part of the primary screening process.

Steps

  • For this method, we need to create a Scatter chart.
  • To do this, go to the Insert tab and then click on the Scatter in the charts group.

Using Charts to Find Outliers in Regression Analysis Excel

  • Then right-click on the mouse and click on the Select Data.

Using Charts to Find Outliers in Regression Analysis Excel

  • After this, there will be a new dialog box named Select Data Source.
  • In that box click on the Add button.

  • Then in the Edit Series dialog box, click on the Series X values range box and select the range of cells B5:B11.
  • Then click on the Series Y value range box and select the range of cells C5:C11.
  • Click OK after this.

  • Then you will notice that the chart in the worksheet now has the data plotted.
  • In order to plot a Trendline, we need to click on the plus icon on the right side of the chart.
  • Then choose and tick the Trendline box.
  • There will be a Trendline after then.

Using Charts to Find Outliers in Regression Analysis Excel

  • Double click on the Trendline to select it and then right-click on it.
  • From the context menu, click on the Format Trendline.

Using Charts to Find Outliers in Regression Analysis Excel

  • On the side panel menu, tick the Display Equation on chart and the Display R Squared value boxes.

  • The equation and the r-squared value of the equations are now presented in the chart.

Using Charts to Find Outliers in Regression Analysis Excel

  • From the chart, it is quite evident the third value in the database is the Outlier, as it is quite a way from the Trendline.

💬 Note

The above method only works when the dataset is small. Where you can scan the value and the chart quickly. In the case of a large dataset, you better opt for the second and the first method.

And this is how we find out the Outliers in regression analysis in Excel using the scatter chart.

Read More: How to Find Outliers with Standard Deviation in Excel


Download Practice Workbook

Download this practice workbook below.


Conclusion

To sum it up, the question how to find Outliers in Regression Analysis Excel’ is answered here in 3 separate steps with elaborate explanations. For this problem, a workbook is available for download where you can practice these methods.


Related Articles


<< Go Back to Outliers in Excel | Excel for Statistics | Learn Excel

Get FREE Advanced Excel Exercises with Solutions!
Rubayed Razib Suprov
Rubayed Razib Suprov

Rubayed Razib, holding a BSC degree in Naval Architecture & Engineering from Bangladesh University of Engineering and Technology, serves as a devoted member of the ExcelDemy project. He has contributed significantly by authoring numerous articles and showcasing proficiency in VBA. Razib efficiently automates Excel challenges using VBA macros and actively participates in the ExcelDemy forum, providing valuable solutions for user interface challenges. Apart from creating Excel tutorials, he is interested in Data Analysis with MS Excel,... Read Full Bio

2 Comments
  1. This is very hard to follow for setting up with your own dataset. The column values are not consistent, so it is hard to know what you should be applying across different columns. The standard deviation calculation references G10 – but there is nowhere where G10 is referenced! The formula is also not the standard deviation formula I am familiar with, which has N on the bottom. Doing the maths to work out what G10 is – it is 9 – but there is NO VALUE 9 anywhere above – so it is very confusing. I’ve tried a number of different ways to work this out, and every time, either all my points are outliers or all are not. The graph it is very easy to see that there are outliers – but this regression method is very complicated to follow!

    • Hello KAREN W,
      First of all, we would like to apologize for the trouble. As you pointed out, there were in fact some issues with the cell referencing in the Regression mathod, luckily they have been updated.

      The Exceldemy team is grateful to you for sharing your thoughts and feedback. Hopefully, now you can obtain the desired result.

Leave a reply

Advanced Excel Exercises with Solutions PDF

 

 

ExcelDemy
Logo