How to Find Outliers in Regression Analysis in Excel (3 Easy Ways)

Outliers in the Regression Analysis are a common issue. In most cases, Outliers are the result of experimental errors. Having this kind of Outliers can steer the direction of the analysis and create misconceptions about the outcome. If you are baffled to find out the Outliers from your Regression Analysis results, this article may come in handy for you. Here we discuss how you can find Outliers in Regression Analysis in Excel effectively.


Download Practice Workbook

Download this practice workbook below.


3 Easy Ways to Find Outliers in Regression Analysis in Excel

We are going to use 3 separate ways in order to distinguish the Outliers from the Regression Analysis results in Excel. The methods will involve using various types of functions and charts. For better understanding, prior knowledge of the Regression Analysis and basic charts is recommended.


1. Applying Combined Formula

Using the IF function we will get the Outliers from our dataset. We will also use the SQRT, ABS, and SUM functions. We can conduct the Regression Analysis and extract the Outliers from the output.

Steps

  • In the beginning, we need to conduct a Regression Analysis of our dataset.
  • In order to do this, go to Data > Data analysis.

Using IF Formula to Find Outliers in Regression Analysis Excel

  • Then in the new dialog box named Data Analysis, select the option Regression.
  • Then click OK.

Using IF Formula to Find Outliers in Regression Analysis Excel

  • In the Input Y Range box, select the range of cells C4:C15.
  • Then in the Input X Range box, select the range of cells B4:B15.
  • Next tick on the Labels box.
  • Then in the Output options group, select the Output Range and then select cell E2 in the range box.
  • After then, check the Residuals, Standardize Residuals, and Line Fit Plots boxes.
  • Click OK after this.

  • Then you will notice that the Regression Analysis results are now presented in the Excel sheet starting from the E2 cells.

Now we will need the Squared Sum value of the residuals and then we will use that value to calculate the Standard Deviation of the residuals.

  • Select the cell J26, and enter the following formula:
=H26^2
  • Entering this formula will calculate the square of the residual at cell H26.

Using IF Formula to Find Outliers in Regression Analysis Excel

  • Drag the Fill Handle to the cell J36.
  • Doing this will fill the range of cells J26:J36 with the squared value of the Residuals.

Using IF Formula to Find Outliers in Regression Analysis Excel

  • Then select the cell K37 and enter the following formula:
=SUM(K26:K36)
  • Doing this will calculate the sum of the range of cells K26:K36, in other words, we got the Sum of the Squared Residuals(SSE).

Using IF Formula to Find Outliers in Regression Analysis Excel

  • Next, to calculate the Standard Deviation(S), we select the cell N26 and enter the following formula:
=2*SQRT(J37/(G10-2))

Using IF Formula to Find Outliers in Regression Analysis Excel

  • This value is the value of the Standard Deviation of the residuals of the given dataset.
  • Now we select the cell L26 and enter the following formula:
=IF(ABS(H26)>$N$26,"Outlier","Not Outlier")

Using IF Formula to Find Outliers in Regression Analysis Excel

  • This formula will check whether the absolute value of the residuals is greater than the Sum of the Squared Residuals(SSE) calculated in the earlier step. If the residual is greater than the SSE value, then the cell will show “Outlier”. Otherwise, it will show “Not Outlier”.
  • Drag the Fill Handle to the cell L28, and doing this will indicate whether the cell value is an Outlier or not in the range of cells L26:L36.
  • We can observe that the first value of the range of cell L26:K36 is showing Outlier And the rest of them are Not Outliers.

  • Next, select the cell D5 and enter the following formula:
=C5+$O$26

  • Then drag the Fill Handle to cell D15, which will fill the range of cell D5:D15 with the upper limit of the casualties Outlier value. Any value above the values will be considered an Outlier.

  • Similar way, select cell E5 and enter the following formula:
=C5-$O$26

  • Then drag the Fill Handle to cell E5. Doing this will fill the range of cell E5:E15 with the lower limit of the Outlier value of Casualties.

Using IF Formula to Find Outliers in Regression Analysis Excel

  • Then we created the graph with the values we obtained so far. Which constitutes the lower and the upper limit of the Outlier values.

Using IF Formula to Find Outliers in Regression Analysis Excel

We can see clearly from the graph that the first value is on the verge of the upper limit. So this value is the Outlier here.


2. Applying Inter-Quartile Range

By calculating the first and the third Quartiles of a dataset and then estimating the Inter-quartile Range between them, we can separate the Outlier from the dataset. We will use the combination of QUARTILE.EXC and IF functions to achieve this.

Steps

  • In the beginning, select the cell F5 and enter the following formula:
=QUARTILE.EXC(C5:C11,1)

  • In the beginning, select the cell F5 and enter the following formula:
=QUARTILE.EXC(C5:C11,3)

Implementing IQR Method to Find Outliers in Regression Analysis Excel

  • In the beginning, select the cell F7 and enter the following formula:
=F6-F5

Implementing IQR Method to Find Outliers in Regression Analysis Excel

  • In the beginning, select the cell F8 and enter the following formula:
=F6+1.5*F7

Implementing IQR Method to Find Outliers in Regression Analysis Excel

  • In the beginning, select the cell F9 and enter the following formula:
=F5-(F7*1.5)

Implementing IQR Method to Find Outliers in Regression Analysis Excel

  • Then select the cell D5 and enter the following formula:
=IF(C5<G9,"Outlier","Not Outlier")

Implementing IQR Method to Find Outliers in Regression Analysis Excel

  • Then drag the Fill Handle to cell D11.

Implementing IQR Method to Find Outliers in Regression Analysis Excel

  • Doing this will fill the range of cell D5:D11 with text indicating whether the value is an Outlier or not.

Implementing IQR Method to Find Outliers in Regression Analysis Excel

When we created a scatter plot of the graph, we can easily distinguish the Outliers from the rest of the dataset in Excel while conducting regression analysis.


3. Using Charts to Find Outliers

Using charts is the fastest way to separate Outliers in Regression Analysis results. Although they can’t be 100 percent accurate and can vary from person to person. It can be used as a part of the primary screening process.

Steps

  • For this method, we need to create a Scatter chart.
  • To do this, go to the Insert tab and then click on the Scatter in the charts group.

Using Charts to Find Outliers in Regression Analysis Excel

  • Then right-click on the mouse and click on the Select Data.

Using Charts to Find Outliers in Regression Analysis Excel

  • After this, there will be a new dialog box named Select Data Source.
  • In that box click on the Add button.

  • Then in the Edit Series dialog box, click on the Series X values range box and select the range of cells B5:B11.
  • Then click on the Series Y value range box and select the range of cells C5:C11.
  • Click OK after this.

  • Then you will notice that the chart in the worksheet now has the data plotted.
  • In order to plot a Trendline, we need to click on the plus icon on the right side of the chart.
  • Then choose and tick the Trendline box.
  • There will be a Trendline after then.

Using Charts to Find Outliers in Regression Analysis Excel

  • Double click on the Trendline to select it and then right-click on it.
  • From the context menu, click on the Format Trendline.

Using Charts to Find Outliers in Regression Analysis Excel

  • On the side panel menu, tick the Display Equation on chart and the Display R Squared value boxes.

  • The equation and the r-squared value of the equations are now presented in the chart.

Using Charts to Find Outliers in Regression Analysis Excel

  • From the chart, it is quite evident the third value in the database is the Outlier, as it is quite a way from the Trendline.

💬 Note

The above method only works when the dataset is small. Where you can scan the value and the chart quickly. In the case of a large dataset, you better opt for the second and the first method.

And this is how we find out the Outliers in regression analysis in Excel using the scatter chart.


Conclusion

To sum it up, the question “how to find Outliers in Regression Analysis Excel‘ is answered here in 3 separate steps with elaborate explanations.

For this problem, a workbook is available for download where you can practice these methods.

Feel free to ask any questions or feedback through the comment section. Any suggestion for the betterment of the Exceldemy community will be highly appreciable

Rubayed Razib Suprov
We will be happy to hear your thoughts

Leave a reply

ExcelDemy
Logo