https://github.com/ieviev/resharp-dotnet
RE# - A high-performance, automata based regex engine with first-class support for intersection and complement operations.
https://github.com/ieviev/resharp-dotnet
regex regex-engine
Last synced: 4 months ago
JSON representation
RE# - A high-performance, automata based regex engine with first-class support for intersection and complement operations.
- Host: GitHub
- URL: https://github.com/ieviev/resharp-dotnet
- Owner: ieviev
- License: mit
- Created: 2026-02-21T19:59:20.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2026-02-26T11:55:44.000Z (4 months ago)
- Last Synced: 2026-02-26T14:24:58.587Z (4 months ago)
- Topics: regex, regex-engine
- Language: F#
- Homepage:
- Size: 27 MB
- Stars: 39
- Watchers: 1
- Forks: 0
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
Awesome Lists containing this project
README
# RE#
[](https://www.nuget.org/packages/Resharp)
A high-performance, automata based regex engine with first-class support for **intersection** and **complement** operations.
RE# compiles patterns into deterministic automata. All matching is non-backtracking with guaranteed linear-time execution. RE# extends `System.Text.RegularExpressions` syntax with intersection (`&`), complement (`~`), and a universal wildcard (`_`), enabling patterns that are impossible or impractical to express with standard regex.
[web playground](https://ieviev.github.io/resharp-webapp/) | [paper](https://dl.acm.org/doi/10.1145/3704837) | [blog post](https://iev.ee/blog/resharp-how-we-built-the-fastest-regex-in-fsharp/)
## Install
```
dotnet add package Resharp
```
## Usage
```csharp
// contains "cat", "dog", AND is 8-15 characters long
var re = new Resharp.Regex(@".*cat.*&.*dog.*&.{8,15}");
// instances are thread-safe, compile once and reuse
re.Matches("the cat and the dog");
```
## Syntax extensions
RE# supports standard .NET regex syntax plus three extensions:
### `_` -- universal wildcard
Matches any character including newlines (`[\s\S]`).
### `&` -- intersection
Both sides must match. Intersection has higher precedence than alternation.
```
_*cat_*&_*dog_* contains both "cat" and "dog"
_*cat_*&_*dog_*&_{5,30} ...and is 5-30 characters long
```
### `~(...)` -- complement
Matches everything the inner pattern does not match.
```
~(_*\d\d_*) does not contain two consecutive digits
~(_*\n\n_*) does not contain a double newline
```
why `_*` and not `.*`?
`.*` does not match newlines, so it does not mean "any string" (rather "any one line").
we specifically included `_` in the syntax so it's more intuitive to use with complement.
### Combining operators
```
F.*&~(_*Finn) starts with 'F', does not end with "Finn"
~(_*\d\d_*)&[a-zA-Z\d]{8,} 8+ alphanumeric, no consecutive digits
```
## Performance
RE# uses several optimizations: start-set inference, literal prefix scanning, and optional full DFA precompilation. RE# shares many optimizations with .NET's (and `RegexOptions.NonBacktracking` even shares some RE# techniques and strengths that many do not know of!) but RE# is designed from the ground up and returns a different kind of matches (leftmost-longest).
RE# particularly excels with large patterns and will often outperform .NET's regex engine (and all others) for complex patterns, especially those with a large set of alternatives, loops or using context-awareness - [RE# supports lookarounds](./docs/syntax.md#lookarounds), which is unique among automata engines.
To illustrate, here is a little comparison of RE# with .NET's most used compiled and source-generated regex engines on these patterns, you can also find wider comparisons [in the paper](https://dl.acm.org/doi/10.1145/3704837):
On [curated benchmarks from rebar](https://github.com/BurntSushi/rebar) (AMD Ryzen 7 5800X, .NET 10.0):
| Pattern | RE# | .NET Compiled | .NET SourceGenerated | Speedup |
|---------|----:|-----:|-----:|--------:|
| date validation | 1,737 us | 273,822 us | 318,070 us | 158x |
| dictionary search | 105 us | 45,832 us | 26,410 us | 252x |
And on some extensions we added ourselves:
| Pattern | RE# | .NET Compiled | .NET SourceGenerated | Speedup |
|---------|----:|-----:|-----:|--------:|
| dictionary, case-insensitive | 576 us | 29,368 us | 21,146 us | 37x |
| unicode dictionary | 336 us | 62,053 us | 38,613 us | 115x |
| unicode dictionary, case-insensitive | 321 us | 484,135 us | 537,814 us | 1,508x |
| dictionary + context window | 621 us | 48,893 us | 55,383 us | 79x |
| dictionary + context window, unicode | 692 us | 24,105,091 us | 34,706,982 us | 34,833x |
Where is RegexOptions.NonBacktracking?
Conveniently left out for shock effect `:^)`. `NonBacktracking` is actually much closer to RE#, but still behind. See the [paper](https://dl.acm.org/doi/10.1145/3704837) for a fairer comparison.
For critical paths, you can use `ValueMatches` for memory-pooled matching and `ResharpOptions.HighThroughputDefaults` for more aggressive optimization.
```csharp
var re = new Resharp.Regex("pattern", ResharpOptions.HighThroughputDefaults);
using var slices = re.ValueMatches(chars);
foreach (var s in slices)
Console.WriteLine($"match at {s.Index}..{s.Index + s.Length}");
```
## Documentation
- [Syntax reference](docs/syntax.md) -- full pattern syntax including `&`, `~`, `_`
- [API reference](docs/api.md) -- all public types and methods
## Examples
Runnable scripts in [`examples/`](examples/):
| File | Description |
|------|-------------|
| [basic-syntax.fsx](examples/basic-syntax.fsx) | wildcards, intersection, complement |
| [paragraph.fsx](examples/paragraph.fsx) | paragraph extraction with complement and intersection |
| [validation.fsx](examples/validation.fsx) | date, IP, password validation with intersection |
| [replace.fsx](examples/replace.fsx) | string and function-based replacement |
| [lookaround.fsx](examples/lookaround.fsx) | lookahead, lookbehind, combined with intersection |
| [high-throughput.fsx](examples/high-throughput.fsx) | zero-allocation matching for large inputs |
| [Basic.cs](examples/Basic.cs) | C# usage |
Have fun!