https://github.com/davisjam/safe-regex
Detect possibly catastrophic, exponential-time regular expressions
- Host: GitHub
- URL: https://github.com/davisjam/safe-regex
- Owner: davisjam
- License: other
- Fork: true (inno-v/safe-regex)
- Created: 2017-05-09T18:56:03.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2022-11-01T15:52:16.000Z (about 2 years ago)
- Last Synced: 2024-10-29T13:07:50.067Z (3 months ago)
- Language: JavaScript
- Homepage:
- Size: 73.2 KB
- Stars: 167
- Watchers: 5
- Forks: 10
- Open Issues: 11
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
Awesome Lists containing this project
- awesome-hacking-lists - davisjam/safe-regex - Detect possibly catastrophic, exponential-time regular expressions (JavaScript)
README
# safe-regex
Detect potentially
[catastrophic](http://regular-expressions.mobi/catastrophic.html)
[exponential-time](http://perlgeek.de/blog-en/perl-tips/in-search-of-an-exponetial-regexp.html)
regular expressions by limiting the
[star height](https://en.wikipedia.org/wiki/Star_height) to 1.

WARNING: This module has both false positives and false negatives.
Use [vuln-regex-detector](https://github.com/davisjam/vuln-regex-detector) for improved accuracy.

[![Build Status](https://travis-ci.org/davisjam/safe-regex.svg?branch=master)](https://travis-ci.org/davisjam/safe-regex)
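
To see why nested quantifiers are a concern, here is a minimal sketch that uses only the built-in `RegExp` and does not call this module. The helper name `timeMatch` and the chosen input lengths are assumptions made for illustration. It times the star-height-2 pattern `(x+x+)+y` against inputs that almost match:

``` js
// Hypothetical helper, not part of safe-regex: time a single match attempt.
function timeMatch(re, input) {
  const start = process.hrtime.bigint();
  const matched = re.test(input);
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  return { matched, ms };
}

// A quantified group that itself contains quantifiers (star height 2).
const nested = /(x+x+)+y/;

// A run of x's with no trailing 'y' cannot match, so a backtracking engine
// tries many ways of splitting the x's between the quantifiers before failing.
for (const n of [8, 12, 16, 20]) {
  const { matched, ms } = timeMatch(nested, 'x'.repeat(n));
  console.log(`n=${n} matched=${matched} ms=${ms.toFixed(3)}`);
}
```

Exact timings depend on the engine and the machine. In a backtracking engine, the time for the failing inputs tends to grow quickly as `n` increases.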
## Example
Suppose you have a script named `safe.js`:
``` js
var safe = require('safe-regex');
var regex = process.argv.slice(2).join(' ');
console.log(safe(regex));
```

This is its behavior:
```
$ node safe.js '(x+x+)+y'
false
$ node safe.js '(beep|boop)*'
true
$ node safe.js '(a+){10}'
false
$ node safe.js '\blocation\s*:[^:\n]+\b(Oakland|San Francisco)\b'
true
```

## Methods
``` js
const safe = require('safe-regex')
```

### const ok = safe(re, opts={})
Returns a boolean `ok` indicating whether the regex `re` is considered safe,
i.e. not possibly catastrophic.

`re` can be a `RegExp` object or just a string.
If `re` is a string and is not a valid regex, `safe` returns `false`.
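
One way a caller might rely on this contract is sketched here. `compileIfSafe` and `stubCheck` are hypothetical names, not part of this module. `check` stands for any function with the same shape as `safe(re, opts)`, and the stub only mimics the documented "invalid string returns `false`" behavior so that the snippet runs without the package installed:

``` js
// Hypothetical helper, not part of safe-regex. `check` is any function with
// the same shape as safe(re, opts), for example require('safe-regex').
function compileIfSafe(pattern, check, opts = {}) {
  if (!check(pattern, opts)) {
    return null; // rejected: flagged as unsafe, or an invalid regex string
  }
  return pattern instanceof RegExp ? pattern : new RegExp(pattern);
}

// Stand-in checker so the sketch is self-contained. It only mirrors the
// documented result for invalid strings; it does NOT analyse star height.
function stubCheck(re) {
  try {
    new RegExp(re);
    return true;
  } catch (err) {
    return false;
  }
}

console.log(compileIfSafe('(beep|boop)*', stubCheck)); // /(beep|boop)*/
console.log(compileIfSafe('(unclosed', stubCheck));    // null
```

With the real module, pass `require('safe-regex')` as `check` in place of the stub.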
* `opts.limit` - maximum number of allowed repetitions in the entire regex.
  Default: `25`.

## Install
With [npm](https://npmjs.org) do:
```
npm install safe-regex
```

## Resources
### What should I do if my project has a super-linear regex?
1. Confirm that it is *reachable* by untrusted input.
2. If it is, you can consider whether you can prevent worst-case behavior by trimming the input, revising the regex, or replacing the regex with another algorithm like string functions. For examples, see Table 5 in [this article](http://people.cs.vt.edu/davisjam/downloads/publications/DavisCoghlanServantLee-EcosystemREDOS-ESECFSE18.pdf).
3. If none of those solutions looks feasible, you might also consider changing regex engines. The [RE2 bindings](https://www.npmjs.com/package/re2) might work, though test carefully to confirm there are no [semantic portability problems](https://medium.com/@davisjam/why-arent-regexes-a-lingua-franca-esecfse19-a36348df3a2?source=friends_link&sk=d21be7f8f723e2080dc993385c6973d1).

### Further reading
The following documents may be edifying:
- [Research brief on the extent of super-linear regexes in practice](https://medium.com/@davisjam/introduction-987fdc4c7b0?source=friends_link&sk=ceefa4a4ca9617e08ab782c3b1580aea)
- [Research brief on the variability of super-linear regex behavior across programming languages](https://medium.com/@davisjam/why-arent-regexes-a-lingua-franca-esecfse19-a36348df3a2?source=friends_link&sk=d21be7f8f723e2080dc993385c6973d1)
- [Comparing regex matching algorithms](https://swtch.com/~rsc/regexp/regexp1.html)

## Project policies
### Versioning
This project follows [Semantic Versioning 2.0 (semver)](https://semver.org/).
Here are the project-specific meanings of MAJOR, MINOR, and PATCH updates:
- MAJOR: "Incompatible" API changes were introduced. There are two types in this module:
  - Changes that modify the interface
  - Changes that cause any regexes to be marked as unsafe that were formerly marked as safe
- MINOR: Functionality was added in a backwards-compatible manner. There are two types in this module:
  - Refactoring the analyses but not changing their results
  - Modifying the analyses to reduce false positives, without affecting negatives (false or true)
- PATCH: I don't anticipate using PATCH for this module.

### License
[MIT](https://github.com/davisjam/safe-regex/blob/master/LICENSE)