Today I will be showing how you can compare rows in Excel for finding duplicates.
Download Practice Workbook
How to Compare Rows in Excel for Duplicates
Let us first have a look at this data set. We have the employee records of a company named Jupyter Group.
We have the First Names of the employees, the Last Names, their Starting Salaries and Present salaries in columns B, C, D and E respectively.
Now we will try to compare duplicate data in various rows and columns.
1. Duplicates in the Same Row but Different Columns
First of all, let us try to compare the duplicate data in the same row, but in different columns.
Let us try to find out which rows in columns D and E have the same data.
That means, which employees have the same starting salaries and present salaries.
We can do it in two ways.
- By examining each row and then deciding whether the two salaries are the same or not in a new column.
- By highlighting the rows with the same salaries.
1.1 Deciding in a New Column
First of all, let’s try to decide on a new column.
To accomplish that, take a new column and write this formula in the first cell of the column. I take a new column F, name it as Decision Column and write this formula in the cell F4:
Then drag the Fill Handle to decide for the rest of the rows.
See, you have identified pretty conveniently which rows have the same two salaries and which rows have not.
Note: Obviously, you can use anything other than “Same” and “Not Same” in the formula. Like “True” and “False”, “Match” and “Not Match” etc.
1.2 Highlighting the Particular Rows
- First of all select the column you want to compare. I select columns D and E.
- Then go to Home>Conditional Formatting>New Rule in the Excel Toolbar.
- You will get the New Formatting Rule dialogue box. Select Use a formula to determine which cells to format. Then in the formula box, write this formula:
Then click on Format.
- You will get the Format Cells Box. Choose any format you like. I chose the color Red from the Fill menu.
- Then click OK. You will find the cells where the salaries are the same highlighted in red.
2. Duplicates in Different Rows and Different Columns
Now consider a different scenario. Jupyter group has scheduled an interview session for some candidates.
Now we will try to find out whether any candidate called on the first week has been mistakenly called on the second week too.
2.1 Deciding in a New Column
Like section 1.1, take a new column. I take a new column C in between the two columns
Then insert this formula in the first cell of the new column:
Then drag the Fill Handle to execute it for the rest of the candidates.
If you want, instead of deciding whether it matches or not, you can directly determine how many times each name of the first week appears in the second week.
The formula will be:
See, the formula determines how many times each name of the first week appears in the second week.
2.2 Highlighting the Cells with Duplicates
Like section 1.2, select column B (Only column B, not B and C), then go to Home>Conditional Formatting>New Rule option in the Excel Toolbar.
In the New Formatting Rule box, choose to Use a formula to determine which cells to format.
Then in the formula box, write this formula:
Then click on Format and select your desired format.
And then click OK.
You will get the names in column B also appeared in column C highlighted like this.
3. Duplicates in the Same Column
Now we come back to our original data set, the Employee Record of Jupyter Group.
This time we focus on finding out the duplicates in the same column.
For example, first, we try to sort out which names have duplicates in the First Name column.
3.1 Deciding in a New Column
Take a new column and insert this formula in the first cell of the column:
=IF(COUNTIF(B$4:B$20,B4)>1,"Have Duplicates","Don't Have Duplicates")
Then drag the Fill Handle to decide for the rest of the names.
Now if you wish, you can drag the Fill Handle row-wise to decide for the Last Names too.
And again, if you want, you can count how many duplicates each name has in the columns rather than taking only a decision.
To execute that, enter this formula in cell F4 and then drag the Fill Handle through the rows and columns.
See, we have got the number of duplicates of both the first names and the last names.
3.2 Highlighting the Cells with Duplicates
First select columns B and C.
Then like section 1.1, go to Home>Conditional Formatting>New Rule.
And then in the New Formatting Rule box, select Use a formula to determine which cells to format.
Then in the formula box, write this formula:
Then click on Format. Choose your desired format.
After that, click OK.
You will find the duplicate values in each column have been highlighted like this.
4. Duplicates in the Whole Dataset
Finally, if you wish you can highlight the duplicate values in the whale data set. This is pretty simple.
First, select the whole data set.
Then go to Home>Conditional Formatting>Highlight Cells Rules>Duplicate Values.
Select Duplicate Values. You will get the Duplicate Values dialogue box.
Then select Duplicate. And then select any format you like. I am selecting Red Text.
You will get the duplicate values from the whole data set highlighted like this.
Using these methods, you can use any value in a set of data to search for duplicates in both rows and columns. Do you know any other method? Let us know in the comment section.