https://github.com/chen0040/java-regex-cultivator
A regex generator that uses genetic programming to evolve grok patterns and automatically discover a regex from a set of texts with similar structure
- Host: GitHub
- URL: https://github.com/chen0040/java-regex-cultivator
- Owner: chen0040
- License: mit
- Created: 2017-06-21T15:06:07.000Z
- Default Branch: master
- Last Pushed: 2017-06-23T08:17:03.000Z
- Last Synced: 2025-06-27T16:50:55.110Z
- Topics: evolutionary-computation, expression-generator, genetic-programming, grok, regex
- Language: Java
- Size: 73.2 KB
- Stars: 4
- Watchers: 2
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# java-regex-cultivator
A regex generator that uses genetic programming to evolve grok patterns and automatically discover a regex from a set of texts with similar structure.
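The search idea can be illustrated with a toy example. The sketch below is not the library's implementation: it swaps in a plain genetic algorithm over fixed-length token sequences (one token per whitespace-separated field) in place of genetic programming. The class name `TokenEvolutionSketch`, the four-token dictionary, and the cost function are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;
import java.util.regex.Pattern;

class TokenEvolutionSketch {
    // Hypothetical token set: names echo common grok patterns, regexes are simplified.
    static final String[] NAMES = {"WORD", "INT", "IPV4", "NOTSPACE"};
    static final Pattern[] REGEXES = {
        Pattern.compile("[A-Za-z]+"),
        Pattern.compile("\\d+"),
        Pattern.compile("\\d{1,3}(?:\\.\\d{1,3}){3}"),
        Pattern.compile("\\S+")
    };

    // Assumed cost: fraction of fields the candidate fails to match, plus a small
    // penalty for the catch-all NOTSPACE token so that more specific tokens win.
    static double cost(int[] candidate, String[] fields) {
        double c = 0;
        for (int i = 0; i < fields.length; i++) {
            if (!REGEXES[candidate[i]].matcher(fields[i]).matches()) c += 1.0;
            else if (candidate[i] == 3) c += 0.1;
        }
        return c / fields.length;
    }

    // Evolve one token index per field: elitism, one-point crossover, point mutation.
    static int[] evolve(String message, int populationSize, int generations, long seed) {
        String[] fields = message.split(" ");
        Random rnd = new Random(seed);
        int[][] pop = new int[populationSize][fields.length];
        for (int[] ind : pop)
            for (int i = 0; i < ind.length; i++) ind[i] = rnd.nextInt(NAMES.length);
        Comparator<int[]> byCost = Comparator.comparingDouble(ind -> cost(ind, fields));
        int half = Math.max(1, populationSize / 2);
        for (int g = 0; g < generations; g++) {
            Arrays.sort(pop, byCost);                 // lowest cost first
            int[][] next = new int[populationSize][];
            next[0] = pop[0].clone();                 // elitism: keep the best unchanged
            for (int k = 1; k < populationSize; k++) {
                int[] a = pop[rnd.nextInt(half)];     // parents drawn from the better half
                int[] b = pop[rnd.nextInt(half)];
                int cut = rnd.nextInt(fields.length);
                int[] child = new int[fields.length];
                for (int i = 0; i < child.length; i++) child[i] = i < cut ? a[i] : b[i];
                child[rnd.nextInt(child.length)] = rnd.nextInt(NAMES.length); // mutation
                next[k] = child;
            }
            pop = next;
        }
        Arrays.sort(pop, byCost);
        return pop[0];
    }

    // Render a candidate as a grok-style expression, e.g. "%{WORD} %{IPV4}".
    static String toGrok(int[] candidate) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < candidate.length; i++) {
            if (i > 0) sb.append(' ');
            sb.append("%{").append(NAMES[candidate[i]]).append('}');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int[] best = evolve("user root login at 127.0.0.1", 50, 30, 42L);
        System.out.println(toGrok(best));
    }
}
```

Because the best individual is copied unchanged into each new generation, the best cost never increases from one generation to the next.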
# Install
Add the following dependency to your POM file:
```xml
<dependency>
  <groupId>com.github.chen0040</groupId>
  <artifactId>java-regex-cultivator</artifactId>
  <version>1.0.1</version>
</dependency>
```
# Usage
The sample code below shows how the GP regex cultivator discovers a regex for a sample log message:
```java
import java.util.ArrayList;
import java.util.List;
// Imports for GpCultivator, Grok and Match are omitted here.

GpCultivator generator = new GpCultivator();
generator.setDisplayEvery(2);
generator.setPopulationSize(1000);
generator.setMaxGenerations(50);

// Training data: messages that share the same structure
List<String> trainingData = new ArrayList<>();
trainingData.add("user root login at 127.0.0.1");

Grok generatedGrok = generator.fit(trainingData); // this is the grok interpreter generated
System.out.println("user root login at 127.0.0.1");
System.out.println(generator.getRegex()); // this is the regex generated

// Apply the generated grok to a message and print the captured fields as JSON
Match matched = generatedGrok.match("user root login at 127.0.0.1");
matched.captures();
System.out.println(matched.toJson());
```
The sample code above prints the following:
```bash
...
Generation: 4 (Pop: 1000), elapsed: 3 seconds
Global Cost: 0.2 Current Cost: 0.2
...
Global Cost: 0.14285714285714285 Current Cost: 0.16666666666666666
user root login at 127.0.0.1
%{LOGLEVEL} %{USER} %{URIPROTO} %{URIHOST} %{IPV4}
{"IPORHOST":"at","IPV4":"127.0.0.1","LOGLEVEL":"er","URIHOST":"at","URIPROTO":"login","USER":"root"}
```
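To read the generated expression: each `%{NAME}` token in a grok pattern stands for a named regular-expression fragment, and matching returns the text captured by each name. The following is a minimal sketch of that expansion using only `java.util.regex`. The class `GrokExpansionSketch` and its two-entry dictionary are hypothetical and much simpler than real grok pattern definitions, and the sketch does not use the library's own classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GrokExpansionSketch {
    // Hypothetical, simplified dictionary; real grok definitions are more elaborate.
    static final Map<String, String> PATTERNS = new LinkedHashMap<>();
    static {
        PATTERNS.put("WORD", "\\w+");
        PATTERNS.put("IPV4", "\\d{1,3}(?:\\.\\d{1,3}){3}");
    }

    // Matches %{NAME}; names are limited to what Java accepts as a group name.
    static final Pattern TOKEN = Pattern.compile("%\\{([A-Za-z][A-Za-z0-9]*)\\}");

    // Replace each %{NAME} with a named capturing group built from the dictionary.
    // Each name may appear only once, since Java group names must be unique.
    static String expand(String grok) {
        Matcher m = TOKEN.matcher(grok);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String name = m.group(1);
            String body = PATTERNS.get(name);
            if (body == null) {
                throw new IllegalArgumentException("Unknown pattern: " + name);
            }
            m.appendReplacement(sb, Matcher.quoteReplacement("(?<" + name + ">" + body + ")"));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    // Return name -> captured text, or an empty map if the whole text does not match.
    static Map<String, String> extract(String grok, String text) {
        Map<String, String> out = new LinkedHashMap<>();
        Matcher m = Pattern.compile(expand(grok)).matcher(text);
        if (m.matches()) {
            Matcher names = TOKEN.matcher(grok);
            while (names.find()) {
                out.put(names.group(1), m.group(names.group(1)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String grok = "user %{WORD} login at %{IPV4}";
        System.out.println(expand(grok));
        // prints {WORD=root, IPV4=127.0.0.1}
        System.out.println(extract(grok, "user root login at 127.0.0.1"));
    }
}
```

The sketch uses `matches()`, so the expression must cover the whole message. Whether the library anchors its generated patterns the same way is not stated in this README.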