In this article, we are going to analyze how to find unique values in Excel. Duplication of values in Excel can occur in a single column or across multiple columns and rows in Excel. It is often necessary to eliminate duplication, for correct data analysis and data integration purposes.
Other times, one may just want to note recurring values, rather than remove them. There are few key ways to get rid of duplicate values in Excel or to detect unique values. So, we are going to examine a few of these in this post.
Right, let’s get started with an example.
High-Performance Liquid Chromatography (HPLC) is a well-known and widely utilized analytical chemistry technique. It allows for the separation, identification, and quantification of compounds in a mixture. In this tutorial, we will be working on a spreadsheet in which different research assistants will be conducting different HPLC columns that can detect different compounds. We will be analyzing the duplications and uniqueness of the columns in different processes.
The source data is shown below, there are 200 rows of data with 5 columns.
Excel List Of Unique Values Based On Criteria
1) First things first, create five new sheets in order to store the extracted unique values, in these sheets. It is always a good idea to keep the original data intact.
2) We now have a sheet called UniversitiesResearchInstitutes, and we are going to extract all the unique university and research institution names to this sheet using Advanced Filter.
3) So we start on our destination sheet, which is the UniversitiesResearchInstitutes sheet as shown below.
4) Go to Data>Sort & Filter> Advanced.
5) Select the Copy to another location checkbox.
6) For the Copy to option, select $A$1 on the UniversitiesResearchInstitutes sheet, and tick unique records only as shown below.
7) Select the List range, and then navigate to the OriginalDataset sheet and select column $A$4:$A$204, which is our Name of University or Research Institute column, as shown below.
8) Click Ok, while still on the OriginalDataset sheet.
9) The unique university and research institute names are then extracted to the UniversitiesResearchInstitutes sheet.
10) We then follow the same process for all the other columns in the original dataset.
11) We have now removed the duplication to obtain the unique list of universities, HPLC Columns, compounds advertised by the suppliers, the compound identification and detection actually taking place at the labs, and research assistants using Advanced Filter.
This is one process of extracting the unique values from a spreadsheet.
Remove Duplicate in Excel to Find The Unique Values
One can also remove duplicate values, using the Remove Duplicates Feature on the Data Tab.
1) First things first, make a copy of the Original Dataset sheet, since it is always best to keep the source data intact as mentioned and call this copy, the RemoveDuplicates1 sheet.
2) Select the range, go to Data>Data Tools>Remove Duplicates.
3) Make sure the My data has headers checkbox is ticked, and then choose the column that you’d like the duplicates to be removed from or leave all the columns checked.
4) If you choose to remove duplication for one particular column in the data set, you’ll get the same results as for the Advanced Filter, unique extracts example above.
5) For example, choosing Name of University or Research Institute as shown below.
6) Results in a dialog box popping up, showing that based on the Name of University or Research Institute column 194 duplicate values were removed, and six unique values remain.
7) This confirms the Advanced Filter results and additionally highlights the number of duplicate values, within this specific column.
8) Doing the same thing for the HPLC Column yields 182 duplicate values found, 18 unique values remain.
9) Removing duplicates and ensuring all the columns are checked as shown below.
10) This results in Excel returning an answer that says that there were 3 duplicate values and 197 unique values. This means that there were three rows in the dataset where the values for Name of University or Research Institute column, HPLC Column column, Compounds the HPLC Column can Detect (Supplier) column, Compounds the HPLC Column is actually used to detect (specific lab) column and Research Assistant column were all equal.
11) We now want to look at the data in context and find out, outside of the Research Assistant column, how many values are duplicated.
12) We get 23 duplicate values removed, and 177 unique values remain in the data set as shown below.
After this target, these duplicate values and simply delete them to get the unique values.
Find Excel Unique Values with Formula
One can also use the COUNTIF function to detect duplication or recurring values.
1) For example in our Compounds the HPLC Column is actually used to detect (specific lab) column, we want to see the number of times C8 compounds, are repeated.
2) We will make a copy of our OriginalDataset sheet and work on this copy. In a cell in the sheet, we enter the formula:
3) We get 18 results returned.
We can then use the EXACT function, in conjunction with conditional formatting in order to show us where exactly these values in the column are occurring.
1) We highlight the Compounds the HPLC Column is actually used to detect (specific lab) column as shown below.
2) We select the following Home>Styles >Conditional Formatting.
3) We select New Rule from the drop-down options given.
4) We choose the Use a formula to determine which cells to format option.
5) In the Format values where this formula is true: text box, enter the following formula:
6) Choose Format…….
7) Choose a dark blue fill, with white bold text from the options given, as shown below.
8) Click Ok once done, in order to preview what the formatting will look like.
9) Click Ok again.
10) The recurring instances of C8 compounds are now highlighted through conditional formatting.
11) In order to quickly see all the rows, containing the formatting, highlight the entire range, and go to Data>Sort & Filter>Sort.
12) In the Sort dialog box, for the Sort by option select the Compounds the HPLC Column is actually used to detect (specific lab) column. For Sort On, select the Cell Color option. For the Order, choose the cell color as shown, and then select on top and click Ok.
13) Now you can see all the C8 compound rows, on top, sorted by this color formatting. This type of conditional formatting, followed by a color sort to get all the values formatted on top, is ideal for data sets, which have large numbers of rows.
Filter Unique Values in Excel
You can also use the built-in features of Excel Tables, in order to detect and remove duplication
1) Make a copy of the sheet containing the original data.
2) With a cell in the data set selected, press CTRL-T on your keyboard or go to Insert>Tables >Table.
3) Using the filtering options, on the table also allows for isolation of certain values. Clicking on the drop-down arrow next to the Name of University or Research Institute column, allows one to select certain universities. Ticking just University ABC isolates all the records for University ABC in the data set as shown below.
4) One can also filter a table using more than one column, for example, let’s say one wanted to see only the records in the table from University ABC, that were compiled by research assistant Jennifer Smith. One would first tick University ABC as in the above example and then click the drop-down arrow next to the Research Assistant column and select Jennifer Smith as shown.
5) Going to Data>Sort & Filter and clicking on Clear, in order to toggle custom filters you have applied, return the table back to its original unfiltered state.
Using Slicers with Tables in order to filter and detect repeat values
1) Slicers provide a quick way to filter tables visually. In order to insert a slicer, select a cell in the table, go to the Design tab in the Table Tools context-sensitive menu>Tools> Insert Slicer.
2) In the Insert Slicers dialog box, tick the Compounds the HPLC Column can Detect (Supplier) column and the Compounds the HPLC Column is actually used to detect (specific lab) column as shown below.
3) The slicers below are inserted.
4) Selecting C8 compounds from the Compounds the HPLC Column can Detect (Supplier) slicer, filters the table instantly to show only the rows where C8 compounds were listed, in this column as shown below.
5) We then choose C8 compounds again in the Compounds the HPLC Column is actually used to detect (specific lab) slicer as shown below. We want to check, where the value is repeated in order to see which labs are actually following the specifications set by the supplier.
6) We can now also see, which labs are not using the HPLC column in the manner specified by the manufacturer, by clearing the filter on the slicer on Compounds the HPLC Column is actually used to detect (specific lab) slicer. Consequently, we then see all the other carbon number compounds the HPLC Column is being used for, in addition to the one specified by the supplier.
7) We can see from just applying the filter to Compounds the HPLC Column can Detect (Supplier) slicer, that the compounds this HPLC Column, is also used for in addition to the supplier’s specification are C10 compounds, C11 compounds, C12 compounds, C15 compounds, C16 compounds, C17 compounds, C18 compounds, C19 compounds, C22 compounds, C5 compounds, C7 compounds, and C9 compounds. The greyed-out values indicate the column is not being used to analyze these other compounds. One can thus use slicers to conduct quick visual analysis and filtering.
Remove Duplicates in Excel Table
1) One can also remove duplicates in an Excel Table. With a cell in the table selected, go to the Design tab in the Table Tools context-sensitive menu>Tools > Remove Duplicates as shown below.
2) We can then select the columns, we’ll just leave all of the columns checked for this example.
3) We then get the number of duplicates detected using these settings, which is the same value, that was returned using the Remove Duplicates Feature on the Data Tab example above
Download Excel File
Download the file that has been used to make this tutorial.
Excel provides a wide array of tools to detect recurring values and/or remove duplicate values. It is sometimes necessary when cleaning datasets in Excel for future database integration (which requires removal of redundancy and duplication) to use more than one tool. As mentioned and illustrated in some of the examples above, we don’t always want to remove recurring values instead we just want to count them or note them, scenarios that are catered to in Excel as well. Hope this article will help you to find unique values in excel.
Please feel free to comment on the attachment and tell us your tips for removing duplicate values/detecting recurring values and if you use HPLC analysis in your lab.