How to Find Outliers in Excel (5 Useful Ways)

While dealing with datasets in Excel, it is often required to find outliers from the dataset. In this article, we are going to learn 5 effective methods on how to find the outliers in Microsoft Excel.


Download Practice Workbook


What Are the Outliers?

While dealing with datasets, by the word “Outliers” we mean data that are way higher or lower than the other data in the dataset. Outliers can quickly derail your analysis if they are not taken into account. In a word, finding outliers is therefore critical for any analysis.

As an illustration, think of a school bus full of 10 kids aged 8 years. But the bus driver is a very old man, who is 65 years old. Now, if we want to calculate the average age of the people on that bus, we get around 13 years, which is misleading information.


5 Ways to Find Outliers in Excel

In this article, we are going to learn 5 easy ways of finding Outliers in Excel. These methods are quick, precise, easy to use, and suitable for handling both small and large size of data sets.


1. Using Data Sorting Option to Find Outliers

Simply sorting the data and manually going through some of the top and bottom results is an easy approach to finding outliers in Excel.

Let me explain it to you with an example.

In the table below, there is data of some Swimmers about their swimming Time in seconds. Now let’s sort them and try to find the outliers.

Using Data Sorting Option to Find Outliers

Step 1: Select Header

First, select the header of the column that you want to sort. In this case, it is the Time (s) column (C4 cell).

Using Data Sorting Option to Find Outliers

Step 2: Select Home Tab

Next, click on the Home tab. It’s on the top-left side of your screen.

Using Data Sorting Option to Find Outliers

Step 3: Go to Sort & Filter Option

Then, click on the Sort & Filter. You can find it in the editing group at the top-right side of your screen.

Using Data Sorting Option to Find Outliers

Step 4: Choose Custom Sort Option

After that, click on Custom Sort from the drop-down.

Using Data Sorting Option to Find Outliers

Step 5: Edit Sort Dialogue Box

In the Sort dialogue box, from the Sort by drop-down select Time (s).

Now, from the Order drop-down, select Largest to Smallest.

Finally, click OK.

Using Data Sorting Option to Find Outliers

This should sort your data from the highest value to the lowest value. Now you can scan for the outliers at the top and the bottom of the column.

Using Data Sorting Option to Find Outliers

Here, you can see that the first figure is substantially larger than the remaining ones and the last two figures are substantially lower. So, we can identify them as Outliers.

Note: This method works well if you are dealing with a small data set. It’s not a scientific method but it can get the job done.


2. Applying QUARTILE Function in Excel

This method is better than the previous one. In statistics, quartile means one-fourth of a data set. For example, if you have 40 data points in a data set, then the bottom 10 values will be your 1st quartile (Q1), the next 10 data points will be the second quartile (Q2), and so on. To find outliers we need to calculate the 1st quartile, 3rd quartile, and the Inter-Quartile Range. Let’s proceed step by step.

Step 1: Calculate 1st and 3rd Quartile

Firstly, we will use the QUARTILE.INC function to calculate the quartiles. It needs two pieces of information. The range of data you want to select and which quartile of data you want.

To calculate Q1 we can use the following formula in cell G4.

=QUARTILE.INC(C5:C14,1)

Then press ENTER and here is our Q1.

Applying QUARTILE Function in Excel 

Similarly, we can find the 3rd quartile value by just changing the number from 1 to 3 in cell G5.

Applying QUARTILE Function in Excel

To calculate Q3 we can use the following formula in cell G5.

=QUARTILE.INC(C5:C14,3)

Applying QUARTILE Function in Excel

Step 2: Calculate Inter-Quartile Range

Now we will find the Inter-Quartile Range (IQR) by simply subtracting the 1st quartile from the 3rd  quartile. So, in cell G6 we can use the following formula.

=G5-G4

Press ENTER and we will get our IQR.

Applying QUARTILE Function in Excel

Step 3: Calculate Upper Limit and Lower Limit

For calculating the upper limit, we need to add the 3rd quartile value with 1.5 times the IQR. We can use the following formula in cell G7.

=G5+(1.5*G6)

Hit ENTER, and you will get the Upper Limit.

Applying QUARTILE Function in Excel

To calculate the Lower Limit we need to subtract 1.5 times IQR from the 1st Quartile. The formula for Lower Limit is given below.

=G4-(1.5*G6)

Now press ENTER, and you will get the Lower Limit.

Applying QUARTILE Function in Excel

Step 4: Identify the Outliers

Now we have all our necessary data to find the outliers. We will use the OR function here. In the OR function, we need to give some logical conditions. If these conditions are satisfied, it will return TRUE and if the conditions are not satisfied, it returns FALSE.

Now, any value that is greater than our Upper Limit or less than our Lower Limit, is an outlier. In cell D5 we can use the following formula.

=OR(C5>$G$7,C5<$G$8)

Now hit ENTER, and we will get the result for the first data point.

Applying QUARTILE Function in Excel

Note: We need to use absolute cell reference while selecting the cells of Upper Limit and Lower Limit. Because if we don’t then the cell reference will change when we drag the formula to find all the outliers. Just hit F4 after selecting the cell. This will make it an absolute cell reference.

Now just drag the Fill Handle up to the end of the data set or just double click it.

Applying QUARTILE Function in Excel

The cells in which TRUE is displayed are our Outliers.

Applying QUARTILE Function in Excel

Here, the times of Jonathan and Peter are lower than our Lower Limit and the time of Andy is higher than our Upper Limit. So, these three values are the Outliers.


3. Utilizing LARGE and SMALL Functions in Excel to Find Outliers

This comes in handy when we deal with a large data set. By utilizing the LARGE and SMALL functions in Excel, we can extract the highest 3 data or the lowest 3 data. We can extract more data if we want. After that, we can look into those data and see if there are any outliers.

The LARGE function needs two pieces of information. The range of data from where to extract and how much data you want to be extracted at a time.

Step 1: Extract Top 3 Values

We can use the following formula in cell E5. This will bring the top 3 values.

=LARGE($C$5:$C$14,ROW($1:3))

Press ENTER, and we will get our top 3 values.

Utilizing the LARGE and SMALL functions in Excel to Find Outliers

Note: If we wanted to bring the top 5 values, then ROW($1:3) would be changed to ROW($1:5).

Step 2: Extract Smallest 3 Values

We can use the following formula in cell F5. This will bring the bottom 3 values.

=SMALL($C$5:$C$14,ROW($1:3))

Press ENTER and we will get our bottom 3 values.

Utilizing the LARGE and SMALL functions in Excel to Find Outliers

Step 3: Identify the Outliers

Now, we can identify the Outliers by looking at our top and bottom 3 values.

Utilizing the LARGE and SMALL functions in Excel to Find Outliers


4. Applying Mean and Standard Deviation to Find Outliers

Another effective way to find the Outliers in Excel is by applying the concept of Mean and Standard Deviation.

Mean is simply the average of a data set. On the other hand, Standard Deviation is a measure of how much the points in a distribution deviate from the distribution’s mean value. Generally, a rule of thumb is that if a value is 2 to 3 times the standard deviation of the distribution, then it is considered an Outlier.

Let’s proceed one step at a time.

Step 1: Calculate the Mean

To calculate the Mean we need to use the AVERAGE function. This function needs only one piece of information, that is the range of the data. We can use the following formula at cell G4 to determine the mean of our data set.

=AVERAGE(C5:C14)

Now press ENTER, and we will have the mean of our data.

Applying Mean and Standard Deviation to Find Outliers

Step 2: Calculate Standard Deviation

In Excel to calculate standard deviation, we can use the STDEV.S function. It only requires the range of data set of which we want to calculate the standard deviation. We can use the following formula at cell G5 to determine the standard deviation of the distribution.

=STDEV.S(C5:C14) 

Now press ENTER.

Applying Mean and Standard Deviation to Find Outliers

Step 3: Evaluate the Upper Limit

The formula to evaluate the Upper Limit is Mean+(2*Standard Deviation). We can use the following formula at cell G6 to determine the Upper Limit.

=G4+(2*G5)

Press ENTER and here is our Upper Limit.

Applying Mean and Standard Deviation to Find Outliers

Step 4: Evaluate the Lower Limit

The formula to evaluate the Lower Limit is Mean-(2*Standard Deviation). We can use the following formula at cell G7 to determine the Lower Limit.

=G4-(2*G5)

Hit ENTER and here is our Lower Limit.

Applying Mean and Standard Deviation

Step 5: Identify the Outliers

Now, we will again use the OR function. So, the logic that we will use here is any value that is higher than our Upper Limit or lower than our Lower Limit is an Outlier. We can use the following formula in cell D5.

=OR(C5>$G$6,C5<$G$7)

Applying Mean and Standard Deviation to Find Outliers

Note: We need to use absolute cell reference while selecting the cells of Upper Limit and Lower Limit. Because if we don’t then the cell reference will change when we drag the formula to find all the outliers. Just hit F4 after selecting the cell. This will make it an absolute cell reference.

Finally, to obtain the rest of the values drag the Fill Handle or double-click it.

Applying Mean and Standard Deviation to Find Outliers

As a result, the cells that are displaying TRUE are our Outliers.


5. Using Z-Score to Find Outliers

Another effective way to find the outliers in Excel is by using the Z-Score. The z value is also known as the Standardized Value. We will again need the Mean and the Standard Deviation of the data set. The formula to calculate the Z-score of data is given below.

Z=( X- Mean ) / Starndard Deviation

We can also use the STANDARDIZE function in Excel to calculate the Z-score of data directly.

Step 1: Calculate Mean

Initially, we can use the following formula in cell H4 for calculating the Mean of our data set.

=AVERAGE(C5:C14)

After that, press ENTER we will have our Mean.

Using Z-score  to Find Outliers

Step 2: Calculate Standard Deviation

Next, we can use the following formula in cell H5 for calculating the Standard Deviation of our data set.

=STDEV.S(C5:C14)

Then hit ENTER.

Using Z-score to Find Outliers

Step 3: Evaluate Z-Score

The STANDARDIZE function needs 3 pieces of information to execute. 1st information is the data of which we are determining the Z-score, 2nd information is the Mean and the 3rd information is the Standard Deviation.

So, we will use the following formula in cell D5 to calculate the Z-score.

=STANDARDIZE(C5,$H$4,$H$5)

Using Z-score to Find Outliers

Note: We need to use absolute cell reference while selecting the cells of Mean and Standard Deviation. Because if we don’t then the cell reference will change when we drag the formula to find all the outliers. Just hit F4 after selecting the cell. This will make it an absolute cell reference.

Dragging the Fill Handle to the end of the data set will you the rest of the Z-scores.

Using Z-score to Find Outliers

Step 4: Identify Outliers

Again, we will use the OR function. Theoretically, for a value to be an Outlier, its Z-Score should be greater than +3 or less than -3. But in practice, the z-values that are close to 3 are pretty far from the mean and we can consider them as Outliers.

So, the logic here is any z-score that is greater than +2.5 or less than -2.5 will be an Outlier.

We can use the following formula in cell E5.

=OR(D5>2.5,D5<-2.5)

Using Z-score to Find Outliers

Eventually, after dragging the Fill Handle, you’ll get the following output.

Using Z-score to Find Outliers 

In the above screenshot, Our Outlier is the cell in column E with the value TRUE.


Conclusion

Thank you for taking the time to read this article. I’m hoping that the processes described above will be able to handle any difficulty with finding outliers in an Excel data set. Please feel free to leave a comment if you have any questions or any recommendations for improving the article.

Zahid
We will be happy to hear your thoughts

Leave a reply

ExcelDemy
Logo