
This is our Regex crash course for Excel power users.
What Is Regex?
Regex (regular expressions) is a pattern language for text. Instead of searching or matching an exact word, regex lets you describe the shape of text. In Excel, regex is like a super-powered FIND + MID + SUBSTITUTE combined, where one pattern can validate, extract, or clean text.
You can match common structures in Excel using regex: email addresses, phone numbers, dates in many formats, or any ID that follows a rule.
In modern Excel, you use it mainly with:
- REGEXTEST (check if a pattern exists)
- REGEXEXTRACT (pull the matching part)
- REGEXREPLACE (replace/clean matches)
Regex Basics: Building Patterns
Regex patterns are strings that define what you’re looking for. They combine literal characters (e.g., “abc”) with special metacharacters. You can use this table as a quick reference.
Key Metacharacters and Syntax:
| Building Block | Meaning | Quick Example Pattern | Matches |
|---|---|---|---|
| abc | Literal text | TX- | TX- |
| . | Any single character | a.c | abc, a-c |
| \d | Digit (0–9) | \d{4} | 2026 |
| \D | Not a digit | \D+ | TX- |
| \w | Word char (A–Z, a–z, 0–9, _) | \w+ | amina_01 |
| \W | Not a word char | \W+ | – ( ) . |
| \s | Whitespace | \s+ | spaces/tabs |
| \S | Not whitespace | \S+ | [email protected] |
| [ABC] | One of these characters | [AEIOU] | A |
| [A-Z] | Range (uppercase) | [A-Z]{2} | TX |
| [a-z] | Range (lowercase) | [a-z]+ | rahman |
| [0-9] | Digit range | [0-9]+ | 1054 |
| [^0-9] | NOT these (negated class) | [^0-9]+ | TX- |
| * | 0 or more | \d* | “”, 7, 1054 |
| + | 1 or more | \d+ | 7, 1054 |
| ? | Optional (0 or 1) | -? | – or nothing |
| {n} | Exactly n times | \d{2} | 12 |
| {n,} | At least n times | \d{4,} | 202601 |
| {n,m} | Between n and m | \d{2,4} | 12, 2026 |
| ^ | Start of text | ^TX- | TX-1054 (at start) |
| $ | End of text | \.com$ | ends with .com |
| \b | Word boundary | \bTX-\d{4}\b | TX-1054 as a whole token |
| ( … ) | Capturing group | (\d{4}) | captures 1054 |
| (?: … ) | Non-capturing group | (?:TX|CA)-\d{4} | TX-1054 or CA-1054 |
| \ | Escape special chars | company\.com | literal company.com |
| .* | Any chars (greedy) | \(.*\) | matches from first ( to last ) |
| .*? | Any chars (non-greedy) | \(.*?\) | matches smallest ( … ) |
| (?i) | Case-insensitive mode (not supported in Excel regex patterns; use the function’s case_sensitivity argument instead) | — | Use REGEXTEST(text, pattern, 1) |
- Quantifiers (*, +, ?, {}) apply to the element before them.
- Flags: Excel’s regex functions don’t support inline flags like (?i) in patterns; use the function argument for case sensitivity (e.g., the third argument of REGEXTEST), or use character classes (e.g., [Aa]).
- Greedy vs. Lazy: By default, quantifiers are greedy (match as much as possible). Add ? after a quantifier to make it lazy (e.g., .*?).
Excel’s Regex Functions
Excel offers three regex functions natively. These functions take a text string and a regular expression pattern as inputs and validate or transform the text based on the pattern. Let’s get to know each function before diving into the course.
Syntax:
=REGEXTEST(text, pattern, [case_sensitivity])
This function checks whether the pattern matches any part of the provided text. Returns TRUE if the pattern matches anywhere in the text; FALSE otherwise.
- text: The text or cell reference you want to test.
- pattern: The regular expression (regex) pattern that describes the text you want to match.
- [case_sensitivity]: Determines whether the match is case-sensitive. By default, the match is case-sensitive.
- 0: Case sensitive
- 1: Case insensitive
=REGEXTEST(A1, "\d{3}")
This function returns TRUE if A1 contains three consecutive digits.
Syntax:
=REGEXEXTRACT(text, pattern, [return_mode], [case_sensitivity])
This function extracts the matching text.
- [return_mode]: A number that specifies what strings you want to extract. By default, return mode is 0.
- 0: Return the first string that matches the pattern
- 1: Return all strings that match the pattern as an array
- 2: Return capturing groups from the first match as an array
=REGEXEXTRACT(A1, "\d{3}-\d{3}-\d{4}")
Extracts a phone number like “123-456-7890”.
Syntax:
=REGEXREPLACE(text, pattern, replacement, [occurrence], [case_sensitivity])
This function replaces matches with new text.
- replacement: The text you want to use in place of the pattern. You can reference capture groups with $1, $2, etc.
- occurrence: Specifies which instance of the pattern you want to replace. By default, occurrence is 0, which replaces all instances. A negative number replaces that instance, searching from the end.
=REGEXREPLACE(A1, "\d{3}", "***")
Masks every three-digit sequence.
These functions spill if returning arrays (e.g., multiple extractions).
Practical Examples for Excel Power Users
Suppose you have messy data. Let’s apply regex to common scenarios.
Example 1: Validating Email Addresses
You can use REGEXTEST to flag valid company emails (case-insensitive matches are often safer).
Use the following pattern to validate emails.
- Pattern: ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
=REGEXTEST(A2, "^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")
Use Conditional Formatting to Highlight Invalid Emails:
- Go to the Home tab >> select Conditional Formatting >> select New Rule
- Choose Use a formula to determine which cells to format
- Insert the following formula
- Select Format >> choose a Fill Color >> click OK

- Invalid email addresses are highlighted in red

You can also use REGEXTEST to flag valid company emails.
=REGEXTEST(B2,"@company\.com$",1)
- The third argument (1) makes the match case-insensitive, so Company.com still passes
- \. escapes the dot. A plain . means “any character”
Example 2: Extracting Data From Messy Notes
Suppose you have mixed data, like an order ID or a phone number, in notes. You can extract the phone numbers or order IDs by using the REGEXEXTRACT function.
Extract Order ID:
=REGEXEXTRACT(D2,"TX-\d{4}")
- If some rows don’t have it, wrap with IFERROR:
=IFERROR(REGEXEXTRACT(D2,"TX-\d{4}"),"")

Extract Phone Numbers:
- Data: “Call me at 123-456-7890 or (987) 654-3210.”
- Pattern: (\d{3}[-. )]+){2}\d{4}
=REGEXEXTRACT(A2, "(\d{3}[-. )]+){2}\d{4}", 0, 1)
This formula extracts phone numbers from the text.
Example 3: Standardizing Phone Numbers by Removing Non-Digits
- Create a digits-only version first.
=REGEXREPLACE(C2,"[^\d]","")
- Now you can format (example: last 10 digits as a US number when available):
=LET(x,REGEXREPLACE(C2,"[^\d]",""),
d,RIGHT(x,10),
IF(LEN(d)=10,"("&LEFT(d,3)&") "&MID(d,4,3)&"-"&RIGHT(d,4),""))
This approach removes all non-digit characters and formats the resulting phone number.

Example 4: Parsing Dates and Reformatting
You can parse the date and reformat it using regex.
- Pattern: (\d{1,2})/(\d{1,2})/(\d{4})
=REGEXREPLACE(A2, "(\d{1,2})/(\d{1,2})/(\d{4})", "$3-$2-$1")
This formula parses the date and then reformats it into a valid ISO-like format. You can also use capture groups () to reference parts in replacements.
Example 5: Cleaning Messy Data (e.g., Removing Extra Spaces)
You can use regex to clean messy data, such as removing extra spaces and normalizing formatting.
To remove extra spaces, use the following pattern.
- Pattern: \s+
=REGEXREPLACE(A2, "\s+", " ")
This formula replaces runs of whitespace with a single space.
Example 6: Removing Anything in Parentheses (Including the Parentheses)
- Select a cell and insert the following formula:
=REGEXREPLACE(D2,"\s*\(.*?\)","")
- .*? is a non-greedy match. It stops at the first closing parenthesis instead of the last.
This formula removes all data within parentheses, including the parentheses themselves.

Example 7: Filtering Rows That Contain an Order ID
=FILTER(A2:F16,REGEXTEST(D2:D16,"TX-\d{4}"))
This formula filters data based on the regex pattern.

Advanced Regex Techniques in Excel
- Lookaheads/Lookbehinds: Assert conditions without consuming text.
- Positive lookahead: (?=…) e.g., \d+(?=%)$ for numbers before a % sign.
- Use in REGEXEXTRACT to extract prices: \d+\.\d{2}(?=\sUSD)
- Non-Capturing Groups: (?:…) for grouping without capture.
- Alternation: For options, e.g., (http|https)://\S+ for URLs.
- Handling Arrays: If REGEXEXTRACT returns multiple groups, use INDEX or spill ranges.
- Combining With Other Functions: Nest with IF, FILTER, or LAMBDA for powerful workflows.
Tips and Best Practices
- Test Patterns: Use online tools like regex101.com with the “ECMAScript” flavor (closest to Excel’s regex).
- Performance: Regex can be slow on large datasets; test on samples first.
- Errors: If there is no match, REGEXEXTRACT returns #N/A; handle with IFNA or IFERROR.
- Limitations: Excel’s regex is based on a subset, without full PCRE support, so avoid advanced features like recursion.
- Learning More: Practice with real datasets. Regex gets easier with use!
Download Practice Workbook
Wrapping Up
Excel power users can follow our regex crash course to accelerate data cleaning, validation, and automation. Once you’re comfortable with the basic building blocks, explore lookaheads and lookbehinds for more sophisticated pattern matching. This crash course gets you started: first understand the expression, then experiment with patterns in a blank workbook. Regex can transform how you handle text in Excel!
Get FREE Advanced Excel Exercises with Solutions!


Perfect
Please also share data file for better practice.
Hello Usman,
Thanks for your feedback and appreciation. Glad to hear that you liked our tutorial.
We attached the sample file in our article for better practice.
Regards,
ExcelDemy