How to Calculate Levenshtein Distance in Excel (4 Easy Methods)

In this article, we will show you 4 methods of how to calculate Levenshtein distance in Excel. To demonstrate our methods, we have selected a dataset with 3 columns: “Company”, “Name (Actual)”, and “Name (Automated)”. This dataset represents employee names collected through 2 methods, and we are going to find the distance between these two.

how to calculate levenshtein distance in excel Intro


Download Practice Workbook


What Is Levenshtein Distance?

In 1965, Soviet mathematician Vladimir Levenshtein first shared the thought about the distance between two words. Because of that, this distance is called so. This distance tells us how many edits need to be done to make two strings the same.

For example, “Ross Geller” and “Ros Gella” needs the following edits –

  • First, we need to add an “s”.
  • Next, we need to replace a” with “e”.
  • Finally, “r” is added to make both words the same.

Thus the Levenshtein distance between those two words is 3. Similarly,  “Apex” and “Apox” will have 1 Levenshtein distance between them (need to change “e” or “o”).

Moreover, we use this distance in the field of Computer Science and information technology. This distance is also known as the Edit distance.


4 Ways to Calculate Levenshtein Distance in Excel

1. Using Custom Function to Calculate Levenshtein Distance

For the first method, we will create a User Defined function to calculate Levenshtein distance in Excel. We can create a custom function inside the VBA Module. Moreover, we have added a new column Distance” in our dataset.

Steps:

Before, typing our code we need to bring up the VBA Module. To do that –

  • First, from the Developer tab >>> select Visual Basic.

Alternatively, you can press ALT+F11 to do this too. The “Microsoft Visual Basic for Application” will appear after this.

how to calculate levenshtein distance in excel VBA Module

  • Secondly, from Insert >>> select Module.

Here, we will type our code.

  • Thirdly, type the following code inside the Module.
Option Explicit
Public Function Levenshtein_Distance(x1 As String, x2 As String)
Dim p As Integer
Dim q As Integer
Dim r As Integer
Dim s As Integer
Dim lDistance() As Integer
Dim t As Integer
Dim u As Integer
r = Len(x1)
s = Len(x2)
ReDim lDistance(r, s)
For p = 0 To r
    lDistance(p, 0) = p
Next
For q = 0 To s
    lDistance(0, q) = q
Next
For p = 1 To r
    For q = 1 To s
        If Mid(x1, p, 1) = Mid(x2, q, 1) Then
            lDistance(p, q) = lDistance(p - 1, q - 1)
        Else
            t = lDistance(p - 1, q) + 1
            u = lDistance(p, q - 1) + 1
            If u < t Then
                t = u
            End If
            u = lDistance(p - 1, q - 1) + 1
            If u < t Then
                t = u
            End If
            lDistance(p, q) = t
        End If
    Next
Next
Levenshtein_Distance = lDistance(r, s)
End Function

how to calculate levenshtein distance in excel VBA Code

VBA Code Breakdown

  • First, we are calling our Function Levenshtein_Distance.
  • Then, we define the variable types.
  • After that, we are finding the string lengths of two values, which we will provide inside the function.
  • Then, we use the For Next Loop to go through our all alphabets.
  • Thus, we can find our target distance, we just need to run this function inside our dataset.

Now, we are going to type the function.

  • First, select cell E5 and type the following formula.

=Levenshtein_Distance(C5,D5)

We are providing the two strings in cells C5 and D5 to find our distance.

How this distance is found, we have already discussed earlier. Thus, we have shown you how to use a custom function to calculate Levenshtein distance in Excel. Moreover, the final step should look like this.

how to calculate levenshtein distance in excel


2. Calculate Levenshtein Distance by Creating Matrix

In this section, we will create a Matrix to find out the Levenshtein distance between two strings. We have already added a few cells to our dataset. Moreover, we are using MID, IF, OR, LEN, MIN, and OFFSET functions here.

We will calculate the distance between “Exceldemy” and “Microsoft”. There are 12 rows and columns in the Matrix, you can add as many as you want to facilitate your need.

Steps:

  • First, select the cell range D4:O4.
  • Secondly, type the following formula.

=MID($R$11,D5,1)

The MID function returns a number of characters from a value. Here, the value of cell D5 is 1. Therefore, we are telling the function to return the first character from the “Exceldemy” string. Hence, we will get “E” as the output.

how to calculate levenshtein distance in excel Matrix

  • Next, press CTRL+ENTER.

This will AutoFill our formula to the selected cells.

Then, we want that a similar result for the string “Microsoft” on columnB”.

  • Select the cell range B6:B17 and type this formula.

=MID($R$12,C6,1)

Again, the MID function returns a number of characters from a value. Here, the value of cell C6 is 1. Therefore, we are telling the function to return the first character from the “Microsoft” string. Hence, we will get “M” as the output.

how to calculate levenshtein distance in excel Matrix 2

  • Next, press CTRL+ENTER.

This will AutoFill our formula to the selected cells. We will type another formula.

  • Select the cell range D6:O17 and type this formula.

=IF(OR(D$5>LEN($R$11),$C6>LEN($R$12)),"",IF(D$4=$B6,OFFSET(D6,-1,-1,1,1), MIN(OFFSET(D6,-1,0,1,1)+1,OFFSET(D6,0,-1,1,1)+1,OFFSET(D6,-1,-1,1,1)+1) ))

Formula Breakdown

  • We are using nested IF functions here. The first part is checking if our value lengths are more than the Matrix size or not. If it is then, we will output a blank cell. Else, another IF function will kick in.
  • OR(D$5>LEN($R$11),$C6>LEN($R$12))
    • Output: FALSE.
    • We are checking whether both strings are more than the row and column number respectively. Value of D5 and C6 both are 1. And the length of the strings is more than 1 also. Therefore it will provide False output. So, the next part of the formula will run.
  • Then we check if the single character of the two strings matches by position.
  • IF(D$4=$B6,OFFSET(D6,-1,-1,1,1), MIN(OFFSET(D6,-1,0,1,1)+1,OFFSET(D6,0,-1,1,1)+1,OFFSET(D6,-1,-1,1,1)+1) )
    • Output: 1.
    • This part checks whether the strings match or not. As our first character does not match then the MIN portion will execute. From that part, we will get MIN(2,2,1). The smallest value is 1. Therefore we will get 1.

  • Next, press CTRL+ENTER.

This will AutoFill our formula to the selected cells.

how to calculate levenshtein distance in excel Matrix 3

Again, we will use another formula.

  • Type this formula in cell R13.

=OFFSET($C$5,LEN($R$12),LEN($R$11),1,1)

Formula Breakdown

  • The LEN function returns the string lengths.
  • Our formula reduces to -> OFFSET($C$5,9,9,1,1)
    • Output: 8.
    • Our reference point is cell C5. From that cell, it will move 9 cells down and then 9 cells to the right. The two ones at the end refer to the height and width of the reference. This will refer to cell L14, which has the value 8. Thus, we are getting the output.

  • Next, press ENTER.

It will calculate the Levenshtein distance as 8, meaning we need to perform 8 changes to make both strings the same.

how to calculate levenshtein distance in excel Matrix Final

Additionally, we can change the strings to verify our Matrix works perfectly to calculate the distance.

Read More: How to Calculate Mahalanobis Distance in Excel (Step by Step)


Similar Readings


3. Use of LAMBDA Function to Calculate Levenshtein Distance

For the third method, we can use the LAMBDA function to calculate the Levenshtein distance in Excel. This function helps us to create a custom function without using VBA. However, this function is only available on Office 365.

Steps:

  • First, from the Formulas tab >>> select Name Manager.

how to calculate levenshtein distance in excel Name Range

A window will appear.

  • Then, select New.

Another window will appear.

  • Next, type “LEVDISTANCE” inside the Name field.
  • After that, type this formula in the “Refers to” field.

=LAMBDA(x,y,[p],[q],[z],
        LET(
            i,IF(ISOMITTED(p),1,p),
            j,IF(ISOMITTED(q),1,q),
            x_i,MID(x,i,1),
            y_j,MID(y,j,1),
            init_array,MAKEARRAY(
                    LEN(x)+1,
                    LEN(y)+1,
                    LAMBDA(r,c,IFS(r=1,c-1,c=1,r-1,TRUE,0))
                    ),
            qq,N(NOT(x_i=y_j)),
            this_z,IF(ISOMITTED(z),init_array,z),
            option_x,INDEX(this_z,i+1-1,j+1)+1,
            option_y,INDEX(this_z,i+1,j+1-1)+1,
            option_c,INDEX(this_z,i+1-1,j+1-1)+qq,
            new_val,MIN(option_x,option_y,option_c),
            overlay,MAKEARRAY(
                    LEN(x)+1,
                    LEN(y)+1,
                    LAMBDA(r,c,IF(AND(r=i+1,c=j+1),new_val,0))
                    ),
            new_z,this_z+overlay,
            new_i,IF(i=LEN(x),IF(j=LEN(y),i+1,1),i+1),
            new_j,IF(i<>LEN(x),j,IF(j=LEN(y),j+1,j+1)),
            is_end,AND(new_i>LEN(x),new_j>LEN(y)),
            IF(is_end,new_val,LEVDISTANCE(x,y,new_i,new_j,new_z))
            )
)

Here, we are creating a LAMBDA function called “LEVDISTANCE”. It has 5 arguments, and out of those only 2 are mandatory. We are mainly implementing the Wagner-Fischer algorithm to find out the Levenshtein distance here.

  • Then, press OK.

how to calculate levenshtein distance in excel LAMBDA Function

Now, our LAMBDA function is ready. We can start typing inside our dataset to calculate the Levenshtein distance.

  • Type the following formula in cell E5.

=LEVDISTANCE(C5,D5)

This function finds the Levenshtein distance between two strings.

  • Then, press ENTER.

how to calculate levenshtein distance in excel LAMBDA Function 2

We can AutoFill the formula, our output looks like this.


4. Applying VBA to Find Levenshtein Distance in Excel

For the last method, we will use Excel VBA to calculate the Levenshtein distance.

Steps:

Sub L_Distance()
Dim p As String, q As String
    Application.ScreenUpdating = False
    With Sheets("VBA")
        For t = 5 To 10
            If Trim(.Cells(t, "C").Value) = Trim(.Cells(t, "D").Value) Then
                .Cells(t, "E").Value = 1
                GoTo Done
            End If
            p = Trim(.Cells(t, "C").Value)
            q = Trim(.Cells(t, "D").Value)
            Len1 = Len(p)
            Len2 = Len(q)
            Same = 0
            For z = 1 To Len2
                If InStr(1, p, Mid(q, z, 1), 1) Then
                    Same = Same + 1
                    p = Replace(p, Mid(q, z, 1), "*", 1, 1)
                End If
            Next z
            .Cells(t, "E").Value = Same / Len1
Done:
            p = LCase(.Cells(t, "C"))
            q = LCase(.Cells(t, "D"))
            maxlen = IIf(Len(p) > Len(q), Len(p), Len(q))
            .Cells(t, "F").Value = (maxlen - VBA_Lev_Distance(p, q)) / maxlen
        Next t
    End With
    Application.ScreenUpdating = True
End Sub

Public Function VBA_Lev_Distance(p As String, q As String)
Dim L_D() As Long, x As Long, y As Long, z As Long, v As Long, w As Long
    x = Len(p)
    y = Len(q)
    ReDim L_D(0 To x, 0 To y)
    For z = 1 To x
        L_D(z, 0) = z
    Next z
    For v = 1 To y
        L_D(0, v) = v
    Next v
    For v = 1 To y
        For z = 1 To x
            w = IIf(Mid(p, z, 1) = Mid(q, v, 1), 0, 1)
            L_D(z, v) = WorksheetFunction.Min(L_D(z - 1, v) + 1, _
            L_D(z, v - 1) + 1, L_D(z - 1, v - 1) + w)
        Next z
    Next v
    VBA_Lev_Distance = L_D(x, y)
End Function

how to calculate levenshtein distance in excel Code part 1

This is the second part of the code.

VBA Code Breakdown

  • First, we are calling our Sub Procedure L_Distance.
  • Then, we define the variable types.
  • After that, we are setting our Sheet as “VBA”.
  • Then, we use the For Next Loop to go through all alphabets in our strings.
  • Next, we call a function named  VBA_Lev_Distance.
  • We have declared the variable types, and used another For Next Loop to find the Levenshtein distance.

This is what our original dataset looks like.

how to calculate levenshtein distance in excel VBA Dataset

Now, we will execute our code.

  • First, Save this Module.
  • Secondly, click inside our code.
  • Finally, press the Run button.

After executing the code, it will tell us the normal similarity between the two strings in column E, and the Levenshtein distance similarity in column F. In conclusion, we have shown you all 4 methods of how to achieve our goal of finding Levenshtein distance.

how to calculate levenshtein distance in excel Outro


Practice Section

We have added a practice dataset for each method in the Excel file. Therefore, you can follow along with our methods easily.


Conclusion

We have shown you 4 quick-and-easy to understand methods of how to calculate Levenshtein distance in Excel. If you face any problems regarding these methods or have any feedback for me, feel free to comment below. Moreover, you can visit our site Exceldemy for more Excel-related articles. Thanks for reading, keep excelling!


Related Articles

Rafiul Haq

Rafiul Haq

Hello! This is Rafiul. I am passionate about all things related to data and MS Excel is my favorite application. I want to make people's life easier by writing easy-to-follow and in-depth guides here at Exceldemy.

We will be happy to hear your thoughts

Leave a reply

ExcelDemy
Logo