While working with larger data sets, many times we try to filter out similar types of values. One of these types of matching is called Fuzzy Match, where the values are not exactly the same, but still, they are matched based on their similarity.
Today I will be showing how you use the VLOOKUP function of Excel to search for Fuzzy Match.
An Introduction to Fuzzy Match
A Fuzzy Match is a type of partial match.
In these types of matches, one text does not match fully to the other text. But the important sections of the text do match the other text.
In the given example, the book “The History of India during the World War” contains three important sections: History, India, and World War.
Therefore all the books that contain either one or more than one of these sections will fuzzily match the book.
So, the fuzzy matches are:
- The History of the Second World War
- History of the Ancient Greece
- World War: Causes and Effects
- The Indus Civilization: An Ancient History
- India Wins Freedom
- Adolf Hitler: Before and After the World War
- The Discovery of India
Download Practice Workbook
VLOOKUP Fuzzy Match in Excel
Here we’ve got a data set with the Names of some books of a bookshop called Kingfisher Bookstore.
Our objective today is to use the VLOOKUP function of Excel to generate some Fuzzy Matches.
1. VLOOKUP Fuzzy Match Using Wildcards (Matching the Entire Lookup_Value)
First of all, we shall generate some fuzzy matches using the wildcard character the Asterisk (*) symbol.
But remember, you have to match the entire lookup_value in this method, not the separate parts of the lookup_value.
For example, we can find a book containing the text “Second World War” in this way.
Only the books having the full text “Second World War” will match.
The formula is simple. Place an Asterisk (*) symbol on both ends of the lookup_value text.
The formula will be:
=VLOOKUP("*Second World War*",B4:B21,1,FALSE)
Or you can use a cell reference in place of the original text too. Use the Ampersand (&) symbol to merge them into a single text. Like this:
2. Fuzzy Match Using VBA
The method in the previous section fulfills our purpose partially, but not to the fullest.
Now we will derive a formula using a VBA code that will fulfill our purpose almost completely.
Open a VBA window and inserts the following VBA code in a new module:
Function FUZZYMATCH(str As String, rng As Range) str = LCase(str) Dim Remove_1(5) As Variant Remove_1(0) = "," Remove_1(1) = "." Remove_1(2) = ":" Remove_1(3) = "-" Remove_1(4) = ";" Remove_1(5) = "?" Dim Rem_Str_1 As String Rem_Str_1 = str Dim rem_count_1 As Variant For Each rem_count_1 In Remove_1 Rem_Str_1 = Replace(Rem_Str_1, rem_count_1, "") Next rem_count_1 Words = Split(Rem_Str_1) Dim i As Variant For i = 0 To UBound(Words) If Len(Words(i)) = 1 Or Len(Words(i)) = 2 Then Words(i) = Replace(Words(i), Words(i), " bt ") End If Next i Dim Final_Remove(26) As Variant Final_Remove(0) = "the" Final_Remove(1) = "and" Final_Remove(2) = "but" Final_Remove(3) = "with" Final_Remove(4) = "into" Final_Remove(5) = "before" Final_Remove(6) = "after" Final_Remove(7) = "beyond" Final_Remove(8) = "here" Final_Remove(9) = "there" Final_Remove(10) = "his" Final_Remove(11) = "her" Final_Remove(12) = "him" Final_Remove(13) = "can" Final_Remove(14) = "could" Final_Remove(15) = "may" Final_Remove(16) = "might" Final_Remove(17) = "shall" Final_Remove(18) = "should" Final_Remove(19) = "will" Final_Remove(20) = "would" Final_Remove(21) = "this" Final_Remove(22) = "that" Final_Remove(23) = "have" Final_Remove(24) = "has" Final_Remove(25) = "had" Final_Remove(26) = "during" Dim w As Variant Dim ww As Variant For w = 0 To UBound(Words) For Each ww In Final_Remove If Words(w) = ww Then Words(w) = Replace(Words(w), Words(w), " bt ") Exit For ' End If Next ww Next w Dim Lookup As Variant Dim x As Integer x = rng.Rows.count ReDim Lookup(x - 1) Dim j As Variant j = 0 Dim k As Variant For Each k In rng Lookup(j) = k j = j + 1 Next k Dim Lower As Variant ReDim Lower(UBound(Lookup)) Dim u As Variant For u = 0 To UBound(Lookup) Lower(u) = LCase(Lookup(u)) Next u Dim out As Variant ReDim out(UBound(Lookup), 0) Dim count As Integer co = 0 mark = 0 Dim m As Variant For m = 0 To UBound(Lower) Dim n As Variant For Each n In Words Dim o As Variant For o = 1 To Len(Lower(m)) If Mid(Lower(m), o, Len(n)) = n Then out(co, 0) = Lookup(m) co = co + 1 mark = mark + 1 Exit For End If Next o If mark > 0 Then Exit For End If Next n mark = 0 Next m Dim output As Variant ReDim output(co - 1, 0) Dim z As Variant For z = 0 To co - 1 output(z, 0) = out(z, 0) Next z FUZZYMATCH = output End Function
This code builds a function called FUZZYMATCH.
Save it (Visit this article to see how to save a VBA code in Excel)
This FUZZYMATCH function finds out all the Fuzzy Matches of a Lookup Value directly.
The Syntax of this FUZZYMATCH function is:
To find out the Fuzzy Matches of the book “The History of India during the World War”, enter this lookup_value in a cell (D4 in this example) and enter this formula in another cell:
See, we have found out all the Fuzzy Matches of the book “The History of India during the World War”
- Here D4 is the cell reference of the lookup_value (“The History of India during the World War”).
- B4:B21 is the lookup_range.
Let’s find out the Fuzzy Matches of another book called “A Notebook of the Causes behind the Crime of Big Cities”.
Enter this lookup_value in a cell (D4 in this example) and enter this formula in another cell:
Explanation of the Formula
- The FUZZYMATCH function is the function we built in VBA. It takes a string called lookup_value and a range of cells called lookup_range and returns an array of all the Fuzzy Matches of the string.
- Therefore FUZZYMATCH(D4,B4:B21) returns an array of all the Fuzzy Matches of the string in cell D4 from the range B4:B21.
3. Fuzzy Match Using Excel’s Fuzzy Lookup Add-in
Microsoft Excel provides an Add-in called Fuzzy Lookup. Using it, you can match two tables for Fuzzy Lookup.
Download and install the Add-in from this link.
After downloading and installing it successfully, you will find the Fuzzy Lookup Add-in your Excel Toolbar like so.
Then arrange the data sets into two tables that you want to match.
Here I’ve got two tables containing two lists of books from two bookshops called Robert Bookshop and Martin Bookshop.
Next, go to Fuzzy Lookup>Fuzzy Lookup in Excel Toolbar.
Click it. You will get a Fuzzy Lookup table created in the side panel of your workbook.
In the Left Table and Right Table options, choose the names of the two tables.
For the sake of this example, choose Robert and Martin.
Then in the Columns section, choose the names of the columns of each table.
In the Match Column section, select the type of match that you want between the two columns. For Fuzzy Match, select Default.
Finally, click on Go. You will get the matching ratio of the tables in a new table.
Using these methods, you can use the VLOOKUP function of Excel to search for Fuzzy Match. Though these methods are not 100% efficient, still they are very useful. Do you have any questions? Feel free to ask us.