While working with a large dataset in Excel, there is a possibility that you are getting the same duplicate values from different rows or columns. Sometimes we may need to remove those duplicate values based on some specific criteria to get a clear concept about the worksheet. Excel provides some built-in features and formulas from which you can easily remove duplicates based on criteria. Today, in this article, we will learn some methods to remove duplicates based on criteria in Excel.
Download Practice Workbook
Download this practice sheet to exercise while you are reading this article.
4 Easy Ways to Remove Duplicates Based on Criteria in Excel
Consider a situation where you are given a dataset of some customer names and sold products in the “Customer Name” and “Item Sold” columns. Now in the dataset, these rows and columns are containing some duplicate values. We need to remove those duplicate values based on some criteria. To do that, we will use four different criteria discussed below.
1. Remove Duplicates Based on Criteria with the Advanced Filter
One of the key features of the advanced filter is that it can filter out duplicate values very easily. Using that feature we can remove duplicates from the dataset. Let’s learn this method!
- To remove duplicates based on criteria using the advanced filter, select the whole dataset, go to the Data tab, then in the Sort & Filter group, click on Advanced.
Data → Sort & Filter → Advanced
- In the Advanced Filter window, check on Filter the List, in-Place to filter the dataset in its current location. Select your List-Range Check on Unique Records Only. Click OK to get the result.
- Our dataset is filtered and the duplicates are removed based on the selected criteria.
- If we want to make our unique dataset list in another place in the worksheet, just click on Copy to Another Place in the Advanced Filter Select your place by inserting values in the Copy to box. Click OK to continue.
- Our duplicate values from the dataset are removed and a new unique dataset is created.
Read more: How to Remove Duplicate Rows in Excel
2. Remove Duplicates Based on Columns in Excel
We can remove the duplicates based on single or multiple columns in Excel. These procedures are discussed below.
i. Remove Duplicates Based on a Single Column
The COUNTIF function can help you to find if there are any duplicates in the columns. Then you can filter the result to remove those duplicates based on the specific column.
- Create two new columns named Combined Text and Count in the existing dataset. Here we will apply the formula and filter the duplicate values based on criteria.
- In cell D4 of the Combine Text column, apply this formula,
- This formula will add cells B4, C4, D4
- Press Enter to apply the formula and copy this formula down to the last cell. Now we will remove duplicates from the dataset based on this new Combined Text column
- Now we will apply the COUNTIF function in cell F4. The formula is
- Where E$4:E4 is the range and the criterion is F4.
- Get the result by pressing Enter. Apply the same function to the rest of the cells. If the function returns more than one value then the value is a duplicate. That’s how we will find out the duplicates in the dataset.
- So we have got the status of our dataset. Now we will filter the dataset. Click on the Count column header and go to Sort & Filter in the Editing Group and click on Filter
Home → Sort & Filter → Filter
- Now we have the filter option in every column of our dataset.
- Click on the drop-down filter icon in the Count From the given option, uncheck 2 to filter the dataset, and click OK.
- We have successfully removed the duplicates from the dataset based on a single column.
ii. Remove Duplicates Based on Multiple Columns
We can remove duplicates based on criteria using the Remove Duplicates feature. The Remove Duplicates is an amazing feature by which you can easily remove your duplicates from the dataset based on criteria. Here we will use this feature to remove duplicated based on multiple columns.
- Select the whole dataset. Go to the Data tab then select the Remove Duplicates option in the Data Tools
Data → Remove Duplicates
- In the Remove Duplicates window, check on the My Data has Headers We want to remove our duplicate values based on the Customer Name and the Item Sold column. Make sure to check on those two columns and then click OK to continue.
- A new window appears telling that our duplicate values are removed. Click OK to proceed
- Now we have successfully removed all the duplicate values based on multiple columns.
- How to Remove Duplicate Rows Based on One Column in Excel
- How to Remove Duplicates and Keep the First Value in Excel (5 Methods)
- How to Remove Duplicates Using VLOOKUP in Excel (2 Methods)
3. Remove Duplicates Based on Rows in Excel
- Create a new column named Status in the existing dataset. Here we will apply the formula and filter the duplicate rows.
- In cell E4 of the Status column, combine the IF and the COUNTIFS Insert the values into the formula and the final form is,
- Where $B$4:$B4, $C$4:$C4, $D$4:$D4 are the ranges
- $B4, $C4, $D4 are the criteria
- If the argument is true then the formula will return Unique
- If the argument is false then the formula will return Duplicate Row
- Press Enter to apply the formula.
- And apply the same to the rest of the cells.
- So we have got the status of our dataset. Now we will filter the dataset. Click on the Status column header and go to Sort & Filter in the Editing Group and click on Filter
Home → Sort & Filter → Filter
- After adding the filter to the dataset, click on the drop-down filter icon in the Status column. From the given option, uncheck Duplicate Row to filter the dataset and click OK
- Our duplicate values are now filtered out based on rows
- In the previous steps, we removed duplicates based on rows without the first occurrence. You can easily filter the duplicates based on rows including the first occurrence too. To do this, the formula is,
- Where the new range is $B$4:$B$13, $C$4:$C$13, $D$4:$D$13. And the rest arguments are the same as before.
- Press Enter to apply this function and apply the same to the rest of the columns.
- We will now filter them as we used in the previous steps. Apply the filter in those columns and in the Status column, filter the data using only the Unique
- Click OK to remove those duplicate rows. And our final result is here.
4. Use VBA Codes to Remove Duplicates Based on Criteria
You can also use a VBA code to remove duplicates from the dataset. Let’s learn this method!
- Go to a net worksheet. We will use the previous dataset. Press “CTRL+F11” to open the VBA window
- In the new VBA window, select Insert and click on Module
- Now write down the VBA We have written the code for you. You can just copy-paste the code into your VBA window. The code is,
Sub Delete_duplicate_rows() Dim Rng As Range Set Rng = Selection Rng.RemoveDuplicates Columns:=Array(1), Header:=xlYes End Sub
- Close the VBA Now we will apply the code. To do that, select the range of cells that you want to deal with (B3:E13).
- Select the Developer tab and click on the Macros option
- This will open the Macro window where our written code is present. Now click on Run to apply the code.
- And our duplicate values are successfully removed.
Things to Remember
👉 When you select your Range, you have to use the absolute cell references ($) to block the array.
👉 You can filter your data in other places in the worksheet while using the Advanced Filter option.
In this article, we have shown four simple yet effective methods to remove duplicates based on criteria. We hope this article proves useful to you. If you have any ideas or suggestions, you are welcome to share your thoughts in the comment box.