RegEx in Excel (For Matching, Extracting and Replacing)

In this Excel tutorial, you will learn how to use RegEx in Excel.  We will first learn about RegEx and then explore how we can use different VBA user-defined functions that use RegEx in Excel for various kinds of applications, such as finding strings and numbers, extracting matched strings or texts, and replacing matched texts with other texts.

Here, we prepared the dataset using the Microsoft 365 version of Excel. However, you can also use older versions of Excel, starting from Excel 2007,  to use the functions and features shown in this article.

RegEx is very useful when it comes to matching patterns, extracting data, and replacing strings. Unfortunately, Excel doesn’t have any built-in functions that can use RegEx. Hence, we need to use some user-defined functions to utilize the power of RegEx in Excel.

 RegEx in Excel


Download Practice Workbook


What Is RegEx in Excel?

RegEx stands for “Regular Expression”. It refers to a series of characters that form a search pattern. Numerous programming languages, text editors, and tools make extensive use of regular expressions for operations like searching, validating, extracting data, and manipulating text.


Are There RegEx Functions in Excel?

Unfortunately, there is no RegEx function in Excel. But there is a library called Microsoft VBScript Regular Expression 5.5. Using this library, we can write custom or User-defined functions that can work with RegEx. To enable the library, follow the steps below.

  • Open the VBA Editor window by clicking Alt+F11 and then go to Tools >> References to open the reference library.

Opening VBA Reference Library

  • Now click on the Microsoft VBScript Regular Expressions 5.5 Reference and then click OK.

Adding Microsoft VBScript Regular Expressions 5.5 Reference

Now you can use the RegExp type object in your code, which will allow us to create RegEx in custom functions.


How to Match Text Using Regex in Excel?

To search for a specific pattern using RegEx, we created a custom function named RegEx_Match. To create the function, write down the following code in the Visual Basic Editor in Excel.

VBA Code for Creating Custom Function to Match RegEx

Function RegEx_Match( _
ByVal My_Text As Variant, _
ByVal Text_Pattern As Variant, _
Optional IfMatched As Variant = "Matched", _
Optional IFUnMatched As Variant = "Unmatched", _
Optional CaseMatch As Boolean = True)
Dim regEX As New RegExp
On Error GoTo ErrHndlr

    If TypeName(My_Text) = "Range" Then
        My_Text = My_Text.Value
    End If
    If TypeName(Text_Pattern) = "Range" Then
        Text_Pattern = Text_Pattern.Value
    End If
    
    If Text_Pattern <> "" Then
        With regEX
            .Global = True
            .MultiLine = True
            .pattern = Text_Pattern
            If CaseMatch = True Then
                .IgnoreCase = False
            Else
                .IgnoreCase = True
            End If
        End With
        
        If regEX.Test(My_Text) Then
        RegEx_Match = IfMatched
        Else
            RegEx_Match = IFUnMatched
        End If
    Else
    RegEx_Match = "Invalid Pattern"
    End If
    Exit Function
    
ErrHndlr:
    RegEx_Match = "Value Error"
End Function

How Does the Code Work?

Function RegEx_Match(ByVal My_Text As Variant,ByVal Text_Pattern As Variant,Optional IfMatched As Variant = "Matched",Optional IFUnMatched As Variant = "Unmatched",Optional CaseMatch As Boolean = True)
  • Here, I have declared a public function named RegEx_Match. The syntax of the function is as follows:

RegEx_Match(My_Text,Text_Pattern,IfMatched,IFUnMatched,CaseMatch)

This function will have 5 arguments, of which 3 are optional. The details of the arguments are as follows:

  1. My_Text= This argument is mandatory. It takes a text or reference to a cell containing texts where we need to search a RegEx pattern.
  2. Text_Pattern= This argument is mandatory. It takes the RegEx expression or a reference to a cell containing RegEx.
  3. IfMatched = This argument is optional. It is the text that we will get in return when My_Text matches the Text_Pattern. The default value is “Matched”.
  4. IFUnMatched =  This argument is optional. It is the text that we will get in return when the My_Text does not match the Text_Pattern. The default value is “Unmatched”.
  5. CaseMatch = It is an optional boolean-type argument. Set this argument to “TRUE” if we want to match the case. Otherwise, set the argument to “FALSE”. The default value is TRUE.
Dim regEX As New RegExp
  • This line creates a new RegExp object. We use the RegExp object to perform regular expression operations in VBA.
On Error GoTo ErrHndlr
  • If any error occurs while running the code, it will jump to another section named ErrHndlr.
If TypeName(My_Text) = "Range" Then
My_Text = My_Text.Value
End If
If TypeName(Text_Pattern) = "Range" Then
Text_Pattern = Text_Pattern.Value
End If
  • These lines of code check to see if the My_Text and Text_Pattern arguments are ranges. If they are, then the values of the arguments are set to the values of the cells in the ranges.
If Text_Pattern <> "" Then
    With regEX
        .Global = True
        .MultiLine = True
        .pattern = Text_Pattern
        If CaseMatch = True Then
            .IgnoreCase = False
        Else
            .IgnoreCase = True
        End If
    End With
  • These lines of code check to see if the Text_Pattern argument is empty. If it is not empty, then the RegExp object is configured with the settings specified by the Global, MultiLine, IgnoreCase, and CaseMatch arguments. If the CaseMatch is TRUE then the IgnoreCase is set to False. Else, it is set to TRUE.
        If regEX.Test(My_Text) Then
        RegEx_Match = IfMatched
        Else
            RegEx_Match = IFUnMatched
        End If
  • In these lines of code, we check to see whether My_Text matches Text_Pattern. If they are matched, then IfMatched is returned. Otherwise, IFUnMatched is returned.
 Else
    RegEx_Match = "Invalid Pattern"
    End If
    Exit Function
  • If the Text_Pattern is empty, then the function will return “Invalid Pattern”. Then the function will exit.
ErrHndlr:
    RegEx_Match = "Value Error"
End Function
  • In the ErrHndlr section, the RegEx_Match will return “Value Error” and end the function.

Now, we will apply the RegEx_Match function on nine different occasions. Let’s explore them one by one.


How Many Ways Can We Use RegEx in Excel?

There are at least nine different applications of RegEx in Excel. Some of them are:

1. Matching Valid Email Addresses
2. Matching SKU Codes
3. Checking Social Security Number
4. Validating Phone Numbers
5. Applying Excel IF Function with RegEx


1. How to Use RegEx for Matching Valid Email Addresses in Excel?

Sometimes we need to verify whether a cell contains a valid mail address or not. To do so, we can use the RegEx pattern of email address.

  • A mail address usually starts with alphanumeric characters, which is the local part, then there is a @ sign and finally a dot(.) and the top-level domain (TDL) at the end. Hence, the RegEx of an email address can be written as ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$. Here,
    • ^ states the starting point of the string.
    • [a-zA-Z0-9._%+-]+ matches single or multiple instances of an alphanumeric character, dot (.), underscore (_), percent (%), plus (+), or hyphen (). This section represents the part of the email address that comes before the @
    • @ matches the literal @
    • [a-zA-Z0-9.-]+ matches single or multiple instances of any alphanumeric character, dot (.), or hyphen (). This part refers to the email address’s domain name.
    • \. corresponds to the literal dot (.) character. Because dot (.) is a special metacharacter in RegEx, it must be escaped with a backslash (\).
    • [a-zA-Z]{2,} matches two or more occurrences of any alphabetic character. This part represents the email address’s top-level domain (TLD).
    • $ asserts the end of the string.
  • In the following dataset, we verify whether cell C5 contains the RegEx of a valid email address (C16). We use the following formula to detect valid email addresses:
=RegEx_Match(C5,$C$16,"Valid","Invalid")
Using RegEx to Validate Mail Addresses

Click here to enlarge the image

Here, we set the IfMatched argument to “Valid” so that when the address matches with the RegEx, it results in “Valid”. For the same reason,  we set the IfUnmatched argument to “Invalid”.

Read More: Use REGEX without VBA


2. Can We Use RegEx for Matching SKU Codes?

Sure. In this example, we have a bunch of SKU codes for products in a shop. As we know, every category item has its own SKU pattern. By matching a specific SKU pattern, we can determine which category a product is in.

  • Here, we need to identify the codes that match a certain pattern, COM-\d{3}-[A-Z]{3}$ which is the SKU pattern of the Computer category.
  • In this situation, we use the following formula:
=RegEx_Match(C5,$C$16)

Using RegEx_Match function to Match SKU Codes

As we can see, the bottom two SKU codes do not match with the RegEx, hence we get the “Unmatched” result.

Read More: Perform Pattern Matching in Excel


3. How to Use RegEx for Checking Social Security Numbers?

Different countries provide their citizens with a unique number that can be used to identify them and provide different kinds of services. In the USA, the unique number is called the Social Security Number (SSN). The social security number has a unique pattern that contains a total of nine numbers.

  • The RegEx pattern of a SSN is \d{3}-\d{2}-\d{4}$.
  • Now, to verify whether a given number is actually a valid SSN or not, we can use our custom function RegEx_Match.
  • In the dataset below, we checked whether the given numbers are of SSN pattern.
  • The SSN RegEx pattern is in cell C16. I have used the following formula to verify the pattern in cell C5:
=RegEx_Match(C5,$C$16)

Using RegEx_Match function to Check Social Security Numbers

Here we can see that the numbers in cells C8 and C10 are not valid SSNs. So, we got the result “Unmatched”.


4. Is It Possible to Validate Phone Numbers Using Regex?

Yes, it is easily possible. But to do that, we need to write the RegEx of valid phone numbers. As we know, every country has its own unique country code for phone numbers within that country.

  • For example, let’s assume the phone number in the USA starts with 555. We can use the RegEx ^\(555\) \d{3}-\d{4}$ to check whether a number is from the USA or not.
  • In the dataset below, I verified the number in cell C5 using the RegEx in cell C16. To do that, I have used the following formula:
=RegEx_Match(C5,$C$16)

Using RegEx_Match function to Validate the Number of USA

Read More: Use REGEX to Match Patterns


5. Can We Combine Excel IF Function with RegEx?

Certainly. The IF function allows us to set the custom returning values for both TRUE and FALSE cases. Using this function, we can customize the returning value of the RegEx_Match function.

  • In the dataset below, we have a list of people with their respective addresses.
  • Now we need to find out who is from outside the state of Texas.
  • We can combine the IF function with our user-defined RegEx_Match function.
  • The RegEx in this context is ^(?!.*Texas).*$.
  • The combined formula is as follows:
=IF(RegEx_Match(C5,$C$16)="Matched","Yes","No")

Using IF Function with RegEx

In this formula, when the RegEx_Match function returns “Matched”, the IF function will return “Yes” dictating that the address is not from Texas. Otherwise, the IF function returns “No”.

Read More: How to Find RegEx Patterns in Excel


6. How to Use RegEx to Lookup Unmatching String?

In the previous example, instead of using the IF function to show who is from outside the state of Texas, we could use the optional arguments of the RegEx_Match function: IfMatched and IfUnmatched. These two arguments allow us to set the custom return value for both Matched and Unmatched cases.

  • In the dataset below, I have again found out who is outside the Texas state but using the IfMatched and IfUnmatched arguments.
  • The formula that we have used in this case is as follows:
=RegEx_Match(C5,$C$16,"Yes","No")

Using RegEx to Look up Unmatched String

Here, we set the IfMatched argument to “No” so that when the address matches with the RegEx, it results in “No”. For the same reason, we set the IfUnmatched argument to “Yes”.


7. When to Use Excel COUNTIF Function with RegEx?

We can use the COUNTIF function with RegEx when we need to count anything that meets a certain criterion.

  • For example, in the below dataset, I have calculated how many people were from outside Texas.
  • Here, I used the following formula:
=COUNTIF(D5:D14,"Yes")

Using COUNTIF & RegEx

As there were a total of 7 “Yes” in the D5:D14 range, we got the result 7.


8. How to Use RegEx to Lookup Unmatching Character?

In RegEx, we can use the “[^ ]” sign to match any string that does not contain any of the characters that are inside the square bracket after the ^ sign.

  • For example, the RegEx ^[^\+]*$ matches any strings that don’t have the “+” at the beginning.
  • Using this RegEx, we can verify whether a phone number contains the Country Code or not as a phone number with a country code will always start with a “+” sign.
  • In the dataset below, I have used the following formula to verify phone numbers that don’t begin with a + sign:
=RegEx_Match(C5,$C$16,"No","Yes")

Using RegEx to Check Unmatched Character

Here, the strings that don’t have the “+” sign at the beginning resulted in “No” as expected.


9. How to Perform Case-Insensitive Matching with RegEx?

Our custom function RegEx_Match is by default a case-sensitive function, as the optional argument CaseMatch has the default value of “TRUE”. But if we can assign “FALSE” to the CaseMatch argument, then it will perform Case Insensitive matching.

  • In the dataset of the 2nd example, we altered the case of a character in the first Product SKU cell.
  • Then, by keeping the same RegEx, we used the following formula to perform case-insensitive matching:
=RegEx_Match(C5,$C$16,"Yes","No",TRUE)

Performing Case Insensitive Matching with RegEx

As a result, we can see that the result for the first cell comes out to be “No” as the SKU is not in the same case as compared to the RegEx.


How to Extract Strings Using RegEx in Excel?

On occasion, we not only need to check whether a string matches a pattern but also extract the matched portion of the string. To extract the matched portion, we write another VBA custom function named RegExtract. The VBA code for this function is as follows:

VBA Code of Custom Function to Extract Strings Using RegEx

Click here to enlarge the image.

Function RegExtract(My_Text As Variant, _
Text_Pattern As Variant, _
Optional Inst_Num As Variant = 0, _
Optional CaseMatch As Boolean = True)
Dim regEX As New RegExp
Dim Txt_Match_Array() As String
Dim match_indx_num As Integer
On Error GoTo ErrHndlr
If TypeName(My_Text) = "Range" Then
    My_Text = My_Text.Value
End If
If TypeName(Text_Pattern) = "Range" Then
    Text_Pattern = Text_Pattern.Value
End If
If Text_Pattern <> "" Then
    With regEX
        .Global = True
        .MultiLine = True
        .pattern = Text_Pattern
        If CaseMatch = True Then
            .IgnoreCase = False
        Else
            .IgnoreCase = True
        End If
    End With
    Set matches_list = regEX.Execute(My_Text)
    If matches_list.Count > 0 Then
      If (Inst_Num = 0) Then
        ReDim Txt_Match_Array(matches_list.Count - 1, 0)
        For match_indx_num = 0 To matches_list.Count - 1
          Txt_Match_Array(match_indx_num, 0) _
          = matches_list.Item(match_indx_num)
        Next match_indx_num
        RegExtract = Txt_Match_Array
      Else
        RegExtract = matches_list.Item(Inst_Num - 1)
      End If
    End If
End If
Exit Function
ErrHndlr:
RegExtract = "Value Error"
End Function
  • In the example below, we have a dataset containing some names of employees in a company and their Unique ID/Code.
  • The unique code is formatted in such a way that it contains their first and last names’ initial letters, their date of joining (DoJ), and their period of service (PoS).
  • In the unique code, the RegEx pattern of Date of Joining can be written as \d{2}-\d{2}-\d{4}.
  • Now, our target is to extract the joining date part from the unique code.
  • To do that, we can use the following formula:
=RegExtract(C5,$C$16)

Using RegEx to Extract Strings

As a result, we were able to extract the joining date successfully.

How Does the Code Work?

Function RegExtract(My_Text As Variant, _
Text_Pattern As Variant, _
Optional Inst_Num As Variant = 0, _
Optional CaseMatch As Boolean = True)
  • Here, I have declared a public function named RegExtract. The syntax of the function is given below.

RegExtract(My_Text,Text_Pattern,Inst_Num,CaseMatch)

This function will have 4 arguments, of which 2 are optional. The details of the arguments are given as follows:

  1. My_Text= This argument is mandatory. It takes a text or reference to a cell containing texts where a RegEx pattern will be searched.
  2. Text_Pattern= This argument is mandatory. It takes the RegEx expression or a reference to a cell containing RegEx.
  3. Inst_Num= This argument is optional. It denotes the serial number of the match, which will be returned when multiple matches are found. The default value is 0 which will return a column array of all the matches.
  4. CaseMatch = It is an optional boolean-type argument. Set this argument to “TRUE” if we want to match the case. Otherwise, set the argument to “FALSE”. The default value is TRUE.
Dim regEX As New RegExp
  • This line creates a new RegExp object. The RegExp object is used to perform regular expression operations in VBA.
Dim Txt_Match_Array() As String
Dim match_indx_num As Integer
  • Here, we declared Txt_Match_Array() which is a string type array. We will use it to store all the matches. We also declared match_indx_num to use it as an integer.
On Error GoTo ErrHndlr
  • If any error occurs while running the code, it will jump to another section named ErrHndlr.
If TypeName(My_Text) = "Range" Then
    My_Text = My_Text.Value
End If
If TypeName(Text_Pattern) = "Range" Then
    Text_Pattern = Text_Pattern.Value
End If
  • These lines of code check to see if the My_Text and Text_Pattern arguments are Range type. If they are, then the values of the arguments are set to the values of the cells in the ranges.
If Text_Pattern <> "" Then
    With regEX
        .Global = True
        .MultiLine = True
        .pattern = Text_Pattern
        If CaseMatch = True Then
            .IgnoreCase = False
        Else
            .IgnoreCase = True
        End If
    End With
  • These lines of code check to see if the Text_Pattern argument is empty. If it is not empty, then the RegExp object is configured with the settings specified by the Global, MultiLine, IgnoreCase, and CaseMatch arguments. If the CaseMatch is TRUE then the IgnoreCase is set to False. Else, it is set to TRUE.
    Set matches_list = regEX.Execute(My_Text)
  • This line of code executes the regular expression against My_Text and returns a Matches collection which is stored in matches_list.
    If matches_list.Count > 0 Then
  • This statement checks if the Matches collection has any matches. If it does, it will execute the following code:
      If (Inst_Num = 0) Then
        ReDim Txt_Match_Array(matches_list.Count - 1, 0)
        For match_indx_num = 0 To matches_list.Count - 1
          Txt_Match_Array(match_indx_num, 0) _
          = matches_list.Item(match_indx_num)
        Next match_indx_num
        RegExtract = Txt_Match_Array
      Else
        RegExtract = matches_list.Item(Inst_Num - 1)
      End If
    End If
End If
Exit Function
  • This code checks if Inst_Num is 0. If it is, the matches are stored in an array, and RegExtract is set to the array. If Inst_Num is not 0, RegExtract is set to the Inst_Num’th match in the Matches collection,matches_list.
ErrHndlr:
RegExtract = "Value Error"
End Function
  • In the ErrHndlr section, the RegExtract will return “Value Error” and end the function.

How to Replace or Remove Strings Using RegEx in Excel?

On many occasions, we not only need to detect strings with a specific pattern but also to replace those matched parts with another substring. To do this, we have created a custom function named RegReplace. The VBA code of this function is given below.

VBA Code of Custom Function to Replace Strings Using RegEx

Click here to enlarge the image.

Function RegReplace(My_Text As Variant, Text_Pattern As Variant, _
Replaced_Text As String, Optional Inst_Num As Variant = 0, _
Optional CaseMatch As Boolean = True) As Variant
    Dim regEX As New RegExp
    Dim matches_list As Object
    Dim match_indx_num As Integer
    Dim replaced_text_result As String
    On Error GoTo ErrHndlr
    If TypeName(My_Text) = "Range" Then
        My_Text = My_Text.Value
    End If
    If TypeName(Text_Pattern) = "Range" Then
        Text_Pattern = Text_Pattern.Value
    End If
    If Text_Pattern <> "" Then
        With regEX
            .Global = True
            .MultiLine = True
            .pattern = Text_Pattern
            If CaseMatch = True Then
                .IgnoreCase = False
            Else
                .IgnoreCase = True
            End If
        End With
        Set matches_list = regEX.Execute(My_Text)
        If matches_list.Count > 0 Then
            replaced_text_result = My_Text
            For match_indx_num = 0 To matches_list.Count - 1
                replaced_text_result _
                = regEX.Replace(replaced_text_result, Replaced_Text)
            Next match_indx_num
            RegReplace = replaced_text_result
        Else
            RegReplace = My_Text
        End If
    Else
        RegReplace = My_Text
    End If
    Exit Function
ErrHndlr:
    RegReplace = "Value Error"
End Function
  • In the example below, we have the same dataset with employee names and their corresponding unique codes of a special pattern.
  • Now, our target is to encrypt some parts of the unique code that match a certain format or pattern. We are actually concerned about encrypting the date of joining(DoJ) and years of service(YoS) parts.
  • The RegEx of this pattern is \d{2}-\d{2}-\d{4}-PoS-\d. And we want to replace it with the following RegEx: xx-xx-xxxx-PoS-x.
  • For that, we used the following formula:
=RegReplace(C5,$C$16,$C$17)

Using RegEx to Replace Strings

As a result, we can see that we have successfully encrypted the unique codes.

How Does the Code Work?

Function RegReplace(My_Text As Variant, Text_Pattern As Variant, _
Replaced_Text As String, Optional Inst_Num As Variant = 0, _
Optional CaseMatch As Boolean = True) As Variant
  • Here, I have declared a public function named RegReplace. The syntax of the function is given below.

RegReplace(My_Text,Text_Pattern,Replaced_Text,Inst_Num,CaseMatch)

This function will have 4 arguments, of which 2 are optional. The details of the arguments are given below:

  1. My_Text= This argument is mandatory. It takes a text or reference to a cell containing texts where a RegEx pattern will be searched.
  2. Text_Pattern= This argument is mandatory. It takes the RegEx expression or a reference to a cell containing RegEx.
  3. Replaced_Text= This argument is mandatory. It is the text by which the matched string will be replaced.
  4. Inst_Num= This argument is optional. It denotes the serial number of the match, which will be replaced when multiple matches are found. The default value is 0 which will replace all the matches.
  5. CaseMatch = It is an optional boolean-type argument. Set this argument to “TRUE” if we want to match the case. Otherwise, set the argument to “FALSE”. The default value is TRUE.
    Dim regEX As New RegExp
    Dim matches_list As Object
    Dim match_indx_num As Integer
    Dim replaced_text_result As String
  • Here we declared one RegExp type variable, regEX, an object named matches_list which will store all the matched strings in it. The match_indx_num  is a counter variable. We use this in the loop to iterate over the matches. The replaced_text_result will store the intermediate result of the replaced text.
    On Error GoTo ErrHndlr
  • If any error occurs while running the code, it will jump to another section named ErrHndlr.
    If TypeName(My_Text) = "Range" Then
        My_Text = My_Text.Value
    End If
    If TypeName(Text_Pattern) = "Range" Then
        Text_Pattern = Text_Pattern.Value
    End If
  • These lines of code check to see if the My_Text and Text_Pattern arguments are ranges. If they are, then we set the values of the arguments to the values of the cells in the ranges.
    If Text_Pattern <> "" Then
        With regEX
            .Global = True
            .MultiLine = True
            .pattern = Text_Pattern
            If CaseMatch = True Then
                .IgnoreCase = False
            Else
                .IgnoreCase = True
            End If
        End With
  • These lines of code check to see if the Text_Pattern argument is empty. If it is not empty, then the RegExp object is configured with the settings specified by the Global, MultiLine, IgnoreCase, and CaseMatch arguments. If the CaseMatch is TRUE then the IgnoreCase is set to False. Else, it is set to TRUE.
        Set matches_list = regEX.Execute(My_Text)

This line of code executes the regular expression against My_Text and returns a Matches collection which we stored in matches_list.

        If matches_list.Count > 0 Then
  • This statement checks if the Matches collection has any matches. If it does, it will execute the following:
            replaced_text_result = My_Text
            For match_indx_num = 0 To matches_list.Count - 1
                replaced_text_result _
                = regEX.Replace(replaced_text_result, Replaced_Text)
            Next match_indx_num
            RegReplace = replaced_text_result
  • Here, We initiate the temporary string that will store the texts after a replacement with My_Text value. Then, We execute For Loop to loop through each matching string in replaced_text_result and replace it with the Replaced_Text using the Replace method of regEx object. Finally, the RegReplace function returns the resulting replaced_text_result.
    Else
        RegReplace = My_Text
    End If
    Exit Function
  • If it does not find any match, RegReplace returns the original My_Text. Then the function exits.
ErrHndlr:
    RegReplace = "Value Error"
End Function
  • In the ErrHndlr section, the RegReplace will return “Value Error” and end the function.

Can We Use Regular Expression Formula Without VBA in Excel?

Yes, we can. In this example, we will see an alternative way of matching strings using RegEx without VBA in Excel.

  • For illustration,  I have taken the same dataset that we used in the matching SKU example.
  • As we know, the SKU codes have a certain pattern. It has a total of 11 characters, of which the first three are uppercase letters. Then there is a hyphen () and 3-digit numbers. The codes finish with another hyphen () and three uppercase letters.
  • Before using a formula, we need to create two named lists in Excel, which we will use later in the formula.
  • For that, first, we need to use the Named Range feature in Excel to name the list of numbers (0-9) as Numbers and the list of uppercase characters (A-Z) as Letters.
  • To do that, go to the Formula tab in the ribbon and select the Name Manager option.

Opening Name Manager from Formulas Tab

  • Now, click on the New option from the pop-out dialogue box.

Creating New Named List

  • Now, write the Name and its elements on their corresponding boxes. Finally, click OK. For the Letters list, the element will be all the uppercased alphabets (A-Z). For Numbers, it will be “1234567890”.

Creating New Named List Letters

  • After creating two lists, we need to click on Close to go back to the worksheet.

Two Newly Created Named Lists Letters and Numbers

  • The formula that we used here to match the format is below:
=AND(LEN(C5)=11, COUNT(FIND(MID(LEFT(C5,3),ROW(INDIRECT("1:"&LEN(LEFT(C5,3)))),1),UPPER(Letters)))=LEN(LEFT(C5,3)), COUNT(FIND(MID(MID(C5,5,3),ROW(INDIRECT("1:"&LEN(MID(C5,5,3)))),1), Numbers))=LEN(MID(C5,5,3)), ISNUMBER(FIND(RIGHT(C5,3), Letters)))

Regular Expression Formula Without VBA

  • Without using any regular expression directly, we matched each SKU code with this format using the above code. As we can see, the codes in cells C10 and C14 came out to be incompatible with the format.

How Does the Formula Work?

The entire formula has 4 parts that check 4 distinct criteria and the AND function connects all those logical operations. We will understand those 4 parts separately.

1st Part: Checking Entire length of the SKU codes

  • LEN(C5)=11

This condition checks if the length of cell C5 is equal to 11 characters. It ensures that the input has a length of exactly 11 characters.

2nd Part: Checking the first three characters whether they are uppercase letters

  • LEFT(C5,3)

Extracts the first three characters of C5.

  • ROW(INDIRECT(“1:”&LEN(LEFT(C5,3))))

Generates an array of sequential numbers from 1 to the length of the first three characters.

  • MID(LEFT(C5,3),ROW(INDIRECT(“1:”&LEN(LEFT(C5,3)))),1)

Retrieves each individual character from the first three characters using the generated array.

  • FIND(MID(LEFT(C5,3),ROW(INDIRECT(“1:”&LEN(LEFT(C5,3)))),1),UPPER(Letters))

Searches for each character in the named range of uppercase letters, Letters

  • COUNT(FIND(MID(LEFT(C5,3),ROW(INDIRECT(“1:”&LEN(LEFT(C5,3)))),1),UPPER(Letters)))

Counts the number of matches found.

  • COUNT(FIND(MID(LEFT(C5,3),ROW(INDIRECT(“1:”&LEN(LEFT(C5,3)))),1),UPPER(Letters)))=LEN(LEFT(C5,3))

Compare the count with the length of the first three characters of C5, which returns True if all the characters are indeed from the Letters list.

3rd Part: Checks Whether the 5th to 7th Characters are digits or not

  • MID(C5,5,3)

Extracts three characters starting from the fifth position of C5 which is after the hyphen(-).

  • ROW(INDIRECT(“1:”&LEN(MID(C5,5,3))))

Generates an array of sequential numbers from 1 to the length of the extracted three characters.

  • MID(MID(C5,5,3),ROW(INDIRECT(“1:”&LEN(MID(C5,5,3)))),1)

Retrieves each individual character from the extracted three characters using the generated array.

  • FIND(MID(MID(C5,5,3),ROW(INDIRECT(“1:”&LEN(MID(C5,5,3)))),1), Numbers)

Searches for each character in the named range of numbers, Numbers.

  • COUNT(FIND(MID(MID(C5,5,3),ROW(INDIRECT(“1:”&LEN(MID(C5,5,3)))),1), Numbers))

Counts the number of matches found.

  • COUNT(FIND(MID(MID(C5,5,3),ROW(INDIRECT(“1:”&LEN(MID(C5,5,3)))),1), Numbers))=LEN(MID(C5,5,3))

Compare the count with the length of the extracted three characters.

4th Part: Check Whether the last 3 Characters are uppercase letters

  • RIGHT(C5,3)

Extract the last three characters of C5.

  • MID(RIGHT(C5,3),{1,2,3},1)

Retrieves each individual character from the last three characters.

  • FIND(MID(RIGHT(C5,3),{1,2,3},1),Letters)

Searches for each character in the range of uppercase letters (Letters).

  • COUNT(…)

Counts the number of matches found.

  • COUNT(FIND(MID(MID(C5,5,3),ROW(INDIRECT(“1:”&LEN(MID(C5,5,3)))),1), Numbers))=LEN(MID(C5,5,3))

Compare the count with 3. If the number of matches found is 3 then returns True. That means all the characters are from the Letters named range.


RegEx Cheat Chart for Excel

Cheat Sheet of RegEx
Pattern Elements
. Matches any single character except newline (\n)
[ ] Match exactly one of the objects inside the brackets
[a-z] Matches any single lowercase letter from a to z
[0-9] Matches any single digit from 0 to 9
[^ ] Match any character not in the brackets
^ Matches the start of the string
$ Matches the end of the string
Quantifiers
* Match zero or more occurrences of the preceding pattern
=+ Match one or more occurrences of the preceding pattern
? Match zero or one occurrence of the preceding pattern
{m} Match exactly m occurrences of the preceding pattern
{m,n} Match between m and n occurrences of the preceding pattern
{m,n}? Match between m and n occurrences of the preceding pattern (non-greedy)
Predefined Character Abbreviations
\d Matches any single digit (equivalent to [0-9])
\D Matches any single character that’s not a digit
\w Matches any word character (letters, digits, underscore)
\W Matches any non-word character
\s Matches any whitespace character (space, tab, newline)
\S Matches any non-whitespace character
\n Matches a newline character (\n)
Grouping and Capturing
( ) Groups different matches for return purposes
(?: ) Groups the pattern without capturing it
Alternation
| OR operator, matches either the pattern before or after it
(a|b) Matches either “a” or “b”
Modifiers
i Case-insensitive matching
g Global matching (continues to search for all occurrences)
m Multi-line matching
Escape Characters
\ Escapes special characters (., [, ], (, ), ?, +, *, {, }, |, ^, $, \)
Examples
^\d{3}-\d{3}-\d{4} Matches a phone number in the format ###-###-####
[A-Za-z]+ Matches one or more consecutive letters
\d{2,4} Matches a number with 2 to 4 digits

Which Things to Remember While Using RegEx in Excel?

  • You have to save the Excel file as a macro-enabled file (.xlsm) to use the newly created custom functions after reopening the file. Otherwise, you will lose your written functions while closing the file.
  • You have to be careful enough to use the correct RegEx format, otherwise, you can get incorrect results.
  • In all three custom functions(RegEx_Match, RegExtract, and RegReplace), there are multiple optional arguments that you can use to get results according to your specific requirements.

Frequently Asked Questions

1. Can we use RegEx in VBA?

Answer: Yes, we can use RegEx by calling a library called Microsoft VBScript Regular Expression 5.5 from the reference library.

2. Can I use RegEx in Excel to replace the matched string?

Answer: Yes, you can use the provided RegReplace function to replace the matched substring with another substring.

3. Can I perform Case Insensitive RegEx operation in Excel?

Answer: Sure, in our provided custom functions, you can set the CaseMatch argument to False for performing case-insensitive operations.


Conclusion

In this article, we gave a brief idea of RegEx and provided 3 VBA codes of user-defined functions to match, extract, and replace strings using RegEx. We also gave multiple examples of using the functions in different circumstances such as validating phone numbers, matching SKU codes, performing case-sensitive searches, etc. In addition to that, we also showed how to use RegEx without VBA in Excel. Lastly, we presented a valuable cheat sheet of RegEx. I hope that, after going through the article, you built a solid foundation for using RegEx in Excel. If you like our article, you can share it with your friends and colleagues. Moreover, feel free to share your thoughts and suggestions in the comment section of this post. Goodbye!


RegEx in Excel: Knowledge Hub


<< Go Back to String Manipulation | Learn Excel

Get FREE Advanced Excel Exercises with Solutions!
Aniruddah Alam
Aniruddah Alam

Md. Aniruddah Alam, a graduate of Bangladesh University of Engineering & Technology, Dhaka, holds a BSc.Engg in Naval Architecture & Marine Engineering. Joining Softeko in August 2022 as an Excel and VBA content developer, he has authored 90+ articles, covering fundamental to advanced Excel topics. He actively develops VBA codes for Excel automation and offers solutions in the Exceldemy Forum. Beyond work, he finds leisure in reading books, and taking relaxing walks, showcasing well-rounded blend of professional and... Read Full Bio

2 Comments
  1. I can’t run the same code for any of the 3 RegEx here, in a second xlsm file. Can the code be amended so it will work in multiple files?
    Error in a second file is “Compile error: User defined type not defined”
    “regEx As New RegExp” is highlighted.

    • Reply Lutfor Rahman Shimanto
      Lutfor Rahman Shimanto Apr 18, 2024 at 11:55 AM

      Hello James Martin

      Thanks for visiting our blog and sharing your problem. When working with this code in another workbook, you are facing “Compile error: User-defined type not defined“. This error typically occurs when the required library or reference is not activated in the VBA Editor.

      Initial Problem:

      To overcome the problem, you must refer to the Microsoft VBScript Regular Expressions 5.5 library in VBA Editor.

      SOLUTION Overview:
      Press Alt+F11 >> Tools >> References >> Microsoft VBScript Regular Expressions 5.5 >> OK.

      I have improved the codes and solved your problem in a workbook. You can find these codes in the modules. Hopefully, you have found the solution helpful. I have attached the solution workbook as well; good luck.

      DOWNLOAD SOLUTION WORKBOOK

      Regards
      Lutfor Rahman Shimanto
      Excel & VBA Developer
      ExcelDemy

Leave a reply

Advanced Excel Exercises with Solutions PDF

 

 

ExcelDemy
Logo