# safe-regex

Detect potentially
[catastrophic](http://regular-expressions.mobi/catastrophic.html)
[exponential-time](http://perlgeek.de/blog-en/perl-tips/in-search-of-an-exponetial-regexp.html)
regular expressions by limiting the
[star height](https://en.wikipedia.org/wiki/Star_height) to 1.

WARNING: This module has both false positives and false negatives.
Use [vuln-regex-detector](https://github.com/davisjam/vuln-regex-detector) for improved accuracy.

[![Build Status](https://travis-ci.org/davisjam/safe-regex.svg?branch=master)](https://travis-ci.org/davisjam/safe-regex)

## Example

Suppose you have a script named `safe.js`:

``` js
var safe = require('safe-regex');
var regex = process.argv.slice(2).join(' ');
console.log(safe(regex));
```

This is its behavior:

```
$ node safe.js '(x+x+)+y'
false
$ node safe.js '(beep|boop)*'
true
$ node safe.js '(a+){10}'
false
$ node safe.js '\blocation\s*:[^:\n]+\b(Oakland|San Francisco)\b'
true
```
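The four results above are consistent with a simple nesting count of repetition operators. The following is a rough sketch of that idea, not this module's actual implementation: `starHeight` and `looksSafe` are hypothetical names, the scanner understands only a small subset of regex syntax (escapes, character classes, groups, `*`, `+`, and `{n}`-style counts), and it ignores `?` and `opts.limit`, so its answers may differ from `safe()` on other inputs.

```js
// Simplified illustration only; NOT the safe-regex implementation.
// Returns how deeply repetition operators (*, +, {n}, {n,}, {n,m}) are nested.
function starHeight(source) {
  const stack = [{ max: 0 }]; // one frame per open group; max = deepest nesting seen in it
  let last = 0;               // nesting depth of the most recently completed atom

  for (let i = 0; i < source.length; i++) {
    const ch = source[i];
    const frame = stack[stack.length - 1];
    // A counted repetition such as {10}, {2,} or {2,5} starting at position i, if any.
    const brace = ch === '{' ? /^\{\d+(?:,\d*)?\}/.exec(source.slice(i)) : null;

    if (ch === '\\') {
      i++;                    // skip the escaped character
      last = 0;
    } else if (ch === '[') {
      i++;                    // skip the whole character class
      while (i < source.length && source[i] !== ']') {
        if (source[i] === '\\') i++;
        i++;
      }
      last = 0;
    } else if (ch === '(') {
      stack.push({ max: 0 });
      if (source[i + 1] === '?') i += 2; // skip simple prefixes like ?: ?= ?!
      last = 0;
    } else if (ch === ')') {
      if (stack.length > 1) {
        const inner = stack.pop().max;
        const parent = stack[stack.length - 1];
        parent.max = Math.max(parent.max, inner);
        last = inner;         // the closed group is now the current atom
      }
    } else if (ch === '*' || ch === '+' || brace) {
      if (brace) i += brace[0].length - 1;
      last += 1;              // repeating an atom adds one level of nesting
      frame.max = Math.max(frame.max, last);
    } else {
      last = 0;               // literal, alternation bar, lazy marker, etc.
    }
  }
  return stack[0].max;
}

// Hypothetical predicate mirroring "limit the star height to 1".
function looksSafe(source) {
  return starHeight(source) <= 1;
}

console.log(looksSafe('(x+x+)+y'));     // false
console.log(looksSafe('(beep|boop)*')); // true
console.log(looksSafe('(a+){10}'));     // false
```

Counting `{10}` as a repetition is a choice made here so that the sketch agrees with the `(a+){10}` result shown above.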

## Methods

``` js
const safe = require('safe-regex')
```

### const ok = safe(re, opts={})

Returns a boolean `ok` indicating whether the regex `re` is considered safe,
that is, not possibly catastrophic.

`re` can be a `RegExp` object or just a string.

If `re` is a string that is not a valid regex, `safe` returns `false`.

* `opts.limit` - maximum number of allowed repetitions in the entire regex.
Default: `25`.
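
One way to use this signature is as a gate in front of `new RegExp`. The sketch below is an assumption about how a caller might wire it up, not part of this module: `compileIfSafe` is a hypothetical helper, and the checker is passed in as a parameter so the snippet runs without any dependency installed.

```js
// Hypothetical helper: compile a pattern only if the supplied checker accepts it.
// `isSafe` is expected to have the shape documented above: (re, opts) => boolean.
function compileIfSafe(source, isSafe, opts = {}) {
  if (!isSafe(source, opts)) {
    throw new Error(`Rejected potentially unsafe regex: ${source}`);
  }
  return new RegExp(source);
}

// Intended wiring (requires the package to be installed):
//   const safe = require('safe-regex');
//   const re = compileIfSafe(userPattern, safe, { limit: 25 });
```

Because of the false positives and false negatives noted in the warning at the top, a gate like this reduces risk but does not eliminate it.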

## Install

With [npm](https://npmjs.org) do:

```
npm install safe-regex
```

## Resources

### What should I do if my project has a super-linear regex?

1. Confirm that it is *reachable* by untrusted input.
2. If it is, you can consider whether you can prevent worst-case behavior by trimming the input, revising the regex, or replacing the regex with another algorithm like string functions. For examples, see Table 5 in [this article](http://people.cs.vt.edu/davisjam/downloads/publications/DavisCoghlanServantLee-EcosystemREDOS-ESECFSE18.pdf).
3. If none of those solutions looks feasible, you might also consider changing regex engines. The [RE2 bindings](https://www.npmjs.com/package/re2) might work, though test carefully to confirm there are no [semantic portability problems](https://medium.com/@davisjam/why-arent-regexes-a-lingua-franca-esecfse19-a36348df3a2?source=friends_link&sk=d21be7f8f723e2080dc993385c6973d1).
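
As a small illustration of step 2, take the `(x+x+)+y` pattern from the Example section. Any match of that pattern ends in `xxy`, and `xxy` is itself a match, so when the caller only needs a yes/no answer from an unanchored search, a substring check accepts the same inputs. The function name `hasRunThenY` and the brute-force comparison are hypothetical and only meant as a sketch; the substitution does not apply if you need the match position or captured groups.

```js
// Original pattern from the Example section: nested repetition.
const risky = /(x+x+)+y/;

// Hypothetical replacement for callers that only need a boolean.
function hasRunThenY(input) {
  return input.includes('xxy');
}

// Generate every string over `alphabet` with length 0..maxLen.
function* allStrings(alphabet, maxLen) {
  let layer = [''];
  for (let len = 0; len <= maxLen; len++) {
    yield* layer;
    layer = layer.flatMap((s) => [...alphabet].map((c) => s + c));
  }
}

// Spot-check that both versions agree on all short inputs over {x, y, z}.
let mismatches = 0;
for (const s of allStrings('xyz', 6)) {
  if (risky.test(s) !== hasRunThenY(s)) mismatches++;
}
console.log(mismatches); // 0
```

Checking the rewrite against the original on small inputs, as done here, is a cheap way to catch a revision that accidentally changes which strings are accepted.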

### Further reading

The following documents may be edifying:

- [Research brief on the extent of super-linear regexes in practice](https://medium.com/@davisjam/introduction-987fdc4c7b0?source=friends_link&sk=ceefa4a4ca9617e08ab782c3b1580aea)
- [Research brief on the variability of super-linear regex behavior across programming languages](https://medium.com/@davisjam/why-arent-regexes-a-lingua-franca-esecfse19-a36348df3a2?source=friends_link&sk=d21be7f8f723e2080dc993385c6973d1)
- [Comparing regex matching algorithms](https://swtch.com/~rsc/regexp/regexp1.html)

## Project policies

### Versioning

This project follows [Semantic Versioning 2.0 (semver)](https://semver.org/).

Here are the project-specific meanings of MAJOR, MINOR, and PATCH updates:

- MAJOR: "Incompatible" API changes were introduced. There are two types in this module:
  - Changes that modify the interface
  - Changes that cause any regexes to be marked as unsafe that were formerly marked as safe
- MINOR: Functionality was added in a backwards-compatible manner. There are two types in this module:
  - Refactoring the analyses but not changing their results
  - Modifying the analyses to reduce false positives, without affecting negatives (false or true)
- PATCH: I don't anticipate using PATCH for this module

### License

[MIT](https://github.com/davisjam/safe-regex/blob/master/LICENSE)