{ sniptools }
Text

Regex Tester

Build, test, and debug regular expressions in real time

What is Regex?

A regular expression (regex) is a sequence of characters that defines a search pattern. Originating from formal language theory in the 1950s, regexes are used in virtually every programming language and text editor to find, match, and manipulate strings. They range from simple literal matches like "hello" to complex patterns that validate email formats or parse structured log output.

At their core, regular expressions describe the shape of text rather than its exact content. A pattern like \d{3}-\d{4} does not search for one specific phone number — it matches any string of three digits, a hyphen, and four more digits. This declarative approach makes regex a uniquely powerful tool for validation, extraction, and transformation across codebases of any size.

Despite their power, regexes have a reputation for being difficult to read. The syntax is dense — a single token like \b or (?=) carries significant meaning — and complex patterns quickly become deeply nested. A regex tester with real-time match highlighting and capture group inspection turns the development process from guesswork into a tight, interactive feedback loop.

How it works

Regular expression engines generally use one of two strategies: backtracking (NFA-based) or deterministic matching (DFA-based). JavaScript's built-in RegExp uses a backtracking NFA engine, meaning it tries to match the pattern left to right, explores alternatives at each branch point, and retreats when a path fails. This design supports powerful features like backreferences and lookaheads, but can exhibit poor performance on pathological patterns with excessive branching.

When you write a regex like /^([a-z]+)@([a-z]+)\.com$/i, the engine compiles it into an internal state machine and walks through the input string character by character. The ^ and $ anchors constrain the match to span the entire string. Parenthesized groups (...) record the username and domain substrings independently, enabling you to extract structured data from flat text via numbered or named references.

Flags modify how the engine processes the pattern at a fundamental level. The global flag (g) continues searching after the first match. The case-insensitive flag (i) folds letter matching to ignore case. The multiline flag (m) redefines ^ and $ to match line boundaries rather than string boundaries. The dotAll flag (s) allows the . metacharacter to match newline characters, which it normally skips.

How to use this tool

  1. Enter your regular expression in the pattern field at the top. Do not include surrounding slashes — type the pattern directly (e.g., \d+\.\d+ rather than /\d+\.\d+/).
  2. Paste or type your test string in the input area. Matches highlight in real time as you type, giving immediate visual feedback on what your pattern captures.
  3. Toggle regex flags (g, i, m, s) using the flag buttons to change matching behavior. Hover over each flag for a short description of its effect.
  4. Examine capture groups in the results panel. Groups are numbered and color-coded so you can see which parenthesized subpattern captured which substring.
  5. Refine your pattern iteratively — modify any character and watch the match highlights update instantly. This tight loop is the fastest way to develop and debug complex regex.
  6. When your pattern works correctly, copy it into your codebase. Remember that backslashes may need doubling in string literals depending on your language (e.g., "\\d+" in Java).

Examples

Validating email addresses

The pattern ^[\w.-]+@[\w.-]+\.[a-zA-Z]{2,}$ matches typical email addresses. \w covers word characters, the dot and hyphen are literal within the class, and {2,} ensures the TLD is at least two letters. This handles most real-world addresses, though the full RFC 5322 grammar is far more complex.

Extracting ISO dates from log lines

The pattern (\d{4})-(\d{2})-(\d{2}) matches dates like 2025-03-15, with three capture groups for year, month, and day. Wrapping in \b word boundaries prevents partial matches against longer numeric strings in log output.

Matching IPv4 addresses

Use \b(?:\d{1,3}\.){3}\d{1,3}\b to find IP-shaped strings like 192.168.1.1. The non-capturing group (?:...) repeats "one-to-three digits plus dot" three times. Note this validates format only — 999.999.999.999 also matches, so range validation should happen in application code.

Extracting URLs from prose

The pattern https?://[\w.-]+(?:/[\w./?&=%#-]*)? captures HTTP and HTTPS URLs embedded in plain text. The s? makes "https" match both protocols, [\w.-]+ covers the domain, and the optional trailing group handles paths, query strings, and fragment identifiers.

About this tool

Write and test regular expressions against live input with real-time match highlighting, capture group inspection, and plain-English explanations of what each part of your pattern does.

Common use cases

  • Testing and refining regex patterns for form validation (emails, phone numbers, URLs)
  • Building patterns to extract data from log files or structured text
  • Debugging capture groups and backreferences in complex search-and-replace operations
  • Learning regular expression syntax with real-time match highlighting and explanations

Frequently asked questions

Which regex flavor does this tool use?
This tool uses JavaScript's built-in RegExp engine, which supports most Perl-compatible syntax including lookaheads, lookbehinds (ES2018+), named capture groups, and Unicode property escapes.
Can I test regex with flags like global or case-insensitive?
Yes. You can toggle common flags (g for global, i for case-insensitive, m for multiline, s for dotAll) to see how they affect matches in real time.
Will my regex work the same in other languages?
Most basic patterns are portable across languages, but advanced features like lookbehinds, Unicode categories, and named groups may differ. Always test in your target language for production use.
What are capture groups and how do I use them?
Capture groups are portions of a regex enclosed in parentheses — e.g., (\d{3})-(\d{4}). Each group records a substring you can reference by index or name in replacement strings and application code. They are essential for extracting structured data from text (like pulling area codes from phone numbers) or rearranging matched segments during search-and-replace. Named groups like (?<year>\d{4}) make complex patterns self-documenting.
How do I avoid catastrophic backtracking?
Catastrophic backtracking happens when the engine tries exponentially many ways to match — typically caused by nested quantifiers like (a+)+. To prevent it, avoid nested repetition, prefer specific character classes over broad wildcards, and use atomic groups or possessive quantifiers where available. In this browser-based tool a pathological pattern will freeze the tab rather than crash a server, but efficient patterns are still essential for production.
What is the difference between greedy and lazy quantifiers?
Greedy quantifiers (*, +, {n,m}) consume as much text as possible, while lazy variants (*?, +?, {n,m}?) consume as little as possible. For example, given "<b>bold</b>", the greedy <.*> matches from the first < to the last >, but the lazy <.*?> stops at the first >. Use lazy quantifiers when you need the shortest match between delimiters.