Microsoft Excel is one of the most widely used data-handling applications. Data often needs to be cleaned in Excel before we begin working with it.
Why Is Data Cleaning So Crucial?
Data is the foundation of everything we do in Excel. If the data has not been cleaned properly, problems such as stray spaces, inconsistent formatting, and duplicate or mistyped entries can lead to errors in later calculations. For this reason, it is essential to clean our data thoroughly before we begin using it.
7 Methods to Clean Data in Excel
In this section, we’ll apply 7 simple and effective ways to ensure that our data is clean and ready to be used. Use these strategies to get the job done quickly and easily!
1. Using TRIM Function to Remove Extra Space in Excel
The TRIM function is straightforward to use. It removes leading and trailing spaces from a text string and reduces repeated spaces between words to a single space. Its only argument is the text that we want to correct.
Let’s go one step at a time.
Some of the texts in the dataset below have erroneous spacing in various places. We need to eliminate these spaces.
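- To trim the first text, enter a TRIM formula in cell C5 that refers to the cell holding that text, for example =TRIM(B5) if the first text is in cell B5 (the cell reference is an assumption here). Then press ENTER.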
- Then drag the Fill Handle to the end of the data. This will copy the formula and remove all the extra spaces from the texts.
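For readers who also script their clean-up outside Excel, the effect of TRIM can be sketched in Python. This is only an approximation under stated assumptions: the function name `excel_trim` and the sample texts are hypothetical, and only the ordinary space character is handled.

```python
import re

def excel_trim(text: str) -> str:
    """Approximate Excel's TRIM: remove leading and trailing spaces and
    collapse repeated spaces between words into a single space."""
    # Only the ordinary space character is touched; tabs and
    # non-breaking spaces are left alone in this sketch.
    return re.sub(r" +", " ", text).strip(" ")

# Hypothetical sample texts with erroneous spacing
samples = ["  Apple   Pie ", "Banana  Split", "Cherry"]
cleaned = [excel_trim(s) for s in samples]
print(cleaned)
```

Applying the function to every item in the list plays the same role as dragging the Fill Handle down the column.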
2. Removing Duplicates to Clean Data in Excel
To remove the duplicates, we will identify the duplicates in our datasheet first, then we will remove the duplicates.
The following dataset contains Sales data for some Items in a shop.
- First, select the data set and click on Conditional Formatting from the Home tab.
- After that, select Highlight Cells Rules from the drop-down, and then select Duplicate Values.
- Next, choose your preferred highlighting option from the Duplicate Values dialogue box and click OK.
- Now, you will be able to see that the duplicate values are highlighted in the worksheet.
- At this stage, select the data set again and click on Remove Duplicates from the Data tab.
- Then, in the Remove Duplicates dialogue box, check the box My data has headers if your selection includes a header row. Next, check the boxes of the Columns that Excel should compare when searching for duplicates.
- Press OK. This will remove the duplicate data.
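The logic behind removing duplicates can be sketched in Python as follows. This is a minimal sketch, not Excel's implementation: the function name `remove_duplicates` and the sample sales rows are hypothetical, rows are compared on all columns, and the comparison is exact.

```python
def remove_duplicates(rows):
    """Keep the first occurrence of each row and preserve the original order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # compare on every column of the row
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

# Hypothetical (Item, Sales) rows; the third row repeats the first
sales = [("Pen", 120), ("Book", 300), ("Pen", 120), ("Bag", 450)]
print(remove_duplicates(sales))
```

Comparing on fewer columns would correspond to ticking only some of the column boxes in the Remove Duplicates dialogue box.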
3. Utilizing Find and Replace Option to Clean Data in Excel
Data can be easily manipulated by using Excel’s Find and Replace tool. The data set below lists the marks of a group of students from a certain class. A few students did not show up for the test, and their marks are recorded as Absent. We will replace Absent with N/A. After that, we’ll replace the formatting of the headers as well.
3.1 Using Find and Replace Feature for Replacing Data
First, we will learn how we can replace data by using Find and Replace feature in Excel.
- First, select the data and press CTRL + H, the shortcut key for Find & Replace. In the Find and Replace dialogue box, type Absent in the Find what box. Then type N/A in the Replace with box.
- Afterward, press Replace All.
- A dialogue box will then open to report the replacements. Click OK.
Afterward, Excel will replace words automatically as shown in the following screenshot.
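The same replacement can be sketched in Python. This is a minimal sketch assuming whole-cell, case-sensitive matching; the function name `replace_cells` and the marks are hypothetical.

```python
def replace_cells(cells, find, replace_with):
    """Return a new list where cells equal to `find` become `replace_with`."""
    return [replace_with if cell == find else cell for cell in cells]

# Hypothetical marks column with absent students
marks = [78, "Absent", 91, "Absent", 64]
print(replace_cells(marks, "Absent", "N/A"))
```

The sketch only changes cells whose entire content equals the search text, so numeric marks pass through unchanged.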
3.2 Using Find and Replace Feature for Replacing Formatting
Now we will try to replace the formatting of the Headers.
- First, select the data set and press CTRL + H. If the Format buttons are not visible, click Options. Then open the Format drop-down beside the Find what box, select Choose Format From Cell, and click one of the header cells.
- Second, open the Format drop-down beside the Replace with box and select the Format option.
- Third, choose the Color and Formatting that you want to apply and click OK. Then click Replace All.
Eventually, the new format replaces the old one in all matching cells of your selection.
4. Deleting All Formatting to Clean Data in Excel
To delete all formatting, we can use a very simple Excel operation: the Clear Formats option.
- Select the data set and click on the Clear option in the Editing group of the Home tab. Next, select Clear Formats from the drop-down.
At this stage, you will find that all of the formatting from the data has been removed.
5. Applying Find & Select Feature to Treat All Blank Cells in Excel
We can also use Excel’s Find & Select tool to fill all the blank cells of a data set at once. Let’s proceed step by step.
- Start by selecting the data, then click on Find & Select in the Editing group of the Home tab. Next, select Go To Special from the drop-down.
- Then check the Blanks option and press OK.
- Now just type the text you want to display in the Blank cells. Then press CTRL + ENTER. After that, your desired text should replace all the blanks.
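Filling every blank with the same text can be sketched in Python like this. It is a minimal sketch: the function name `fill_blanks`, the filler text, and the sample column are hypothetical, and a blank is assumed to be either `None` or an empty string.

```python
def fill_blanks(cells, filler):
    """Replace blank cells (None or empty string) with `filler`."""
    return [filler if cell is None or cell == "" else cell for cell in cells]

# Hypothetical column with two blank cells
column = ["East", "", "West", None, "North"]
print(fill_blanks(column, "Unknown"))
```

A cell that holds the number 0 is not treated as blank here, just as Go To Special does not select cells that contain a zero.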
6. Adjusting Text to Lower, Upper, and Proper Case to Clean Data in Excel
Here, we are going to use 3 Excel functions: LOWER, UPPER, and PROPER. These functions convert a text to lower case, upper case, and proper case respectively. Each function takes a single argument, the text to convert.
In the following data, we have some texts and we will convert these texts to lower case, upper case, and proper case texts respectively.
- In cell C5, we can use a LOWER formula to convert the text to lowercase, for example =LOWER(B5) if the first text is in cell B5 (the cell reference is an assumption here).
- Then we drag the Fill Handle to the end of the data set to obtain the rest of the lowercase texts.
- To convert the texts to uppercase, we use an UPPER formula in cell D5, for example =UPPER(B5) under the same assumption.
By dragging the Fill Handle we will get the rest of our data.
- We can use a PROPER formula in cell E5, for example =PROPER(B5), to change the texts into proper case.
- Now we can see our texts converted to Lowercase, Uppercase, and Proper Case forms.
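The three conversions can be sketched in Python with the built-in string methods. This is an approximation under stated assumptions: the sample text is hypothetical, and `str.title()` is used as a rough stand-in for PROPER, so edge cases may differ from Excel.

```python
# Hypothetical sample text with mixed capitalization
text = "excel DATA cleaning"

lower_text = text.lower()   # counterpart of LOWER
upper_text = text.upper()   # counterpart of UPPER
proper_text = text.title()  # rough counterpart of PROPER

print(lower_text)
print(upper_text)
print(proper_text)
```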
7. Marking Errors to Clean Data in Excel
In Excel, it is often required to mark the cells where errors have occurred. At first, we will highlight the errors with a color. After that, we will replace the errors with our desired texts.
7.1 Highlighting Error in Data Set
- First, we select our data and then we select Conditional Formatting. From the Conditional Formatting drop-down select New Rule.
- Then choose Format only cells that contain and after that select Errors from the drop-down.
- Thereafter click on Format.
- Then choose your desired color for formatting and press OK.
- At this point, you will be able to see the errors in your data set highlighted with your specified color.
7.2 Replacing Error with a Text in Data Set
Now we will replace the errors with our desired texts. Let’s proceed one step at a time.
- After selecting the data set hit F5 and a Go To dialogue box will open. Then click on Special.
- Now check the Formulas option and untick everything except Errors. Then press OK.
- Then simply type the text you want to display. In this case, we are typing Error.
- Then press CTRL + ENTER. The cells with errors are now replaced with your desired text. Note that this overwrites the formulas in those cells with the typed text.
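Replacing error values with a text can be sketched in Python as follows. This is a minimal sketch that assumes the errors arrive as the strings Excel displays (for instance after exporting the sheet); the function name `replace_errors` and the sample results are hypothetical.

```python
# Error values as Excel displays them
EXCEL_ERRORS = {"#DIV/0!", "#N/A", "#NAME?", "#NULL!", "#NUM!", "#REF!", "#VALUE!"}

def replace_errors(cells, text="Error"):
    """Replace any cell holding an Excel error string with `text`."""
    return [text if cell in EXCEL_ERRORS else cell for cell in cells]

# Hypothetical column of results containing two errors
results = [12.5, "#DIV/0!", 8, "#N/A", 3.2]
print(replace_errors(results))
```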
Converting Numbers Stored as Text into Numbers in Excel
It’s not uncommon to enter data with the wrong data type, for instance numbers stored as text. Such values can give wrong results in calculations, so we should convert them before using the data.
The following data set has 3 values. The 1st is a normal number, the 2nd was formatted as Text from the Number group under the Home tab, and the 3rd has a leading apostrophe, so Excel stores it as text. Our goal is to convert these two text values into numbers.
- First, Copy the cells B5:B7 and Paste them into cells C5:C7.
- Now, we take 1 as our reference number. Type 1 in cell D5, select the cell, and copy it. Then select cells C5:C7 and open the Paste drop-down on the Home tab.
- After that, click on Paste Special from the drop-down.
- Now, in the Paste Special dialogue box, check Multiply from the Operation group and click OK.
After that, all the values that were stored as text are converted into numbers, because multiplying by 1 forces Excel to treat them as numbers without changing their value.
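The multiply-by-1 idea can be sketched in Python. This is a minimal sketch: the function name `to_number` and the sample values are hypothetical, and text values are assumed to contain only digits with optional surrounding spaces.

```python
def to_number(value):
    """Convert a number stored as text into a real number.
    Values that are already numbers are returned unchanged."""
    if isinstance(value, str):
        # Parsing the text and multiplying by 1 mirrors the
        # Paste Special > Multiply step described above.
        return float(value.strip()) * 1
    return value

# Hypothetical data: one real number and two numbers stored as text
values = [150, "275", "320"]
converted = [to_number(v) for v in values]
print(converted)
```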
How to Use Spell Check in Excel
Excel does not automatically check spelling as you type data into cells. However, you can run a spell check manually whenever you need one.
Some words have been misspelled in the following data. We will use Excel’s Spell Check tool to find and fix them.
- At first, Copy the cells from B5:B8 and Paste them to the cells C5:C8.
- Then select all of the cells that contain the misspelled words and press F7 on your keyboard.
- A Spelling dialogue box will then pop up. Excel will suggest corrections for the misspelled words one at a time.
- If you find your desired word in the pop-up box then select that word and click Change.
- Then you will be moved to the next misspelled word.
- When spell check is complete, Excel will show a pop-up message saying Spell check complete. You’re good to go.
How to Break Data Using Text to Columns Feature in Excel
In Excel, there are occasions when we need to split the contents of one cell across multiple cells. Excel’s Text to Columns tool is what we’ll be using to do this. We’ll work our way through it step by step now.
In the data set given below, each cell holds several words separated by commas. Our aim is to divide these words into separate cells.
- Select the data set, then go to the Data tab and click on Text to Columns.
- After that, a dialogue box will open. As our data is not of a fixed length, we need to check the Delimited option from that dialogue box. Then click Next.
- Now we need to tick the option by which we are going to separate our words. In this case, we will select Comma. Once you’ve done that, simply click Next.
- To choose where the separated words will be stored, we must now define the Destination cell. Click the range selector at the end of the Destination box.
- Now, select cell C5 and then click the range selector again to return to the wizard.
- Then click on Finish.
After that, you will see that the words are separated into different cells.
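Splitting delimited text can be sketched in Python with `str.split`. This is a minimal sketch: the function name `text_to_columns` and the sample cells are hypothetical, and the comma is the only delimiter considered.

```python
def text_to_columns(cell, delimiter=","):
    """Split one cell's text into a list of values, one per output column."""
    return cell.split(delimiter)

# Hypothetical cells with comma-separated words
cells = ["Red,Green,Blue", "Small,Large"]
rows = [text_to_columns(c) for c in cells]
print(rows)
```

Each inner list corresponds to one row of output, with its items placed in adjacent cells starting from the Destination cell.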
Things to Remember
- When deleting duplicates, check My data has headers in the Remove Duplicates dialogue box only if your selection includes a header row. Otherwise the header row may be treated as data.
- Your Cell Borders and Text Alignments will be gone once you’ve cleared all formatting. Simply select the desired border and alignment options to get them back.
- When you are typing the desired text after using Go To Special from Find & Select, make sure you don’t click anything, because clicking cancels the selection. Just type the text and then press CTRL + ENTER.
- After selecting the Delimiter (Comma) in the Text to Columns tool, make sure you choose the Destination cell correctly so that existing data is not overwritten.
We’ve arrived at the end of the article. Thank you for reading this far! I hope you’ve found the answers you were looking for. Let us know in the comments below if you have any other questions or concerns about this topic. We welcome your thoughts and ideas on how to improve our articles. Happy learning!