https://github.com/idf/rake4j
A Java implementation of the Rapid Automatic Keyword Extraction (RAKE)
- Host: GitHub
- URL: https://github.com/idf/rake4j
- Owner: idf
- License: mit
- Created: 2014-12-19T05:23:28.000Z (over 11 years ago)
- Default Branch: master
- Last Pushed: 2017-05-30T15:16:27.000Z (about 9 years ago)
- Last Synced: 2025-04-03T19:38:15.356Z (about 1 year ago)
- Language: Java
- Homepage:
- Size: 206 KB
- Stars: 7
- Watchers: 4
- Forks: 5
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
README
rake4j
======
This is a rewrite of [Python RAKE](https://github.com/aneesha/RAKE) in Java.
An implementation of the Rapid Automatic Keyword Extraction (RAKE) algorithm as described in: [Rose, S., Engel, D., Cramer, N., & Cowley, W. (2010). Automatic Keyword Extraction from Individual Documents](http://scholar.google.com.sg/scholar?q=Automatic+Keyword+Extraction+from+Individual+Documents&btnG=&hl=en&as_sdt=0%2C5&as_vis=1)
# Run
## Sample
Normal run
```java
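// `text` holds the document content to analyze
Document doc = new Document(text);
RakeAnalyzer rake = new RakeAnalyzer();
rake.loadDocument(doc);
// Extract keywords without offset information
rake.runWithoutOffset();
// Print the extracted terms as a list
System.out.println(doc.termListToString());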
```
Run with offset information and stemming
```java
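// `text` holds the document content to analyze
Document doc = new Document(text);
RakeAnalyzer rake = new RakeAnalyzer();
rake.loadDocument(doc);
// Extract keywords with offset information and stemming
rake.run();
// Print the extracted terms as a map
System.out.println(doc.termMapToString());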
```
# Features
* Recognizes keywords with the stop-word-based RAKE algorithm.
* Adjoining keywords, to recognize phrases that contain interior stop words, such as "axis of evil".
* KStem stemming algorithm ported from Lucene, e.g. stemming "university students" to "university student".
* Builds an index of keywords with term frequency `tf` and document frequency `df`.
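
To illustrate the stop-word-based idea behind the first feature, here is a minimal, self-contained sketch of RAKE candidate extraction and scoring as described by Rose et al. It is not rake4j's code: the class `RakeSketch`, its methods, and the tiny stop list are hypothetical names and values chosen for illustration, and the sketch leaves out adjoining keywords, stemming, and the `tf`/`df` index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class RakeSketch {
    // Tiny illustrative stop list (an assumption; real stop lists are much larger).
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
            "a", "an", "and", "the", "of", "for", "in", "is", "are", "to", "on", "over"));

    /** Splits text into candidate phrases at punctuation and stop words. */
    static List<List<String>> candidates(String text) {
        List<List<String>> phrases = new ArrayList<>();
        // Punctuation always ends a phrase, so split on it first.
        for (String fragment : text.toLowerCase().split("[.,;:!?()\\[\\]\"]+")) {
            List<String> current = new ArrayList<>();
            for (String word : fragment.trim().split("\\s+")) {
                if (word.isEmpty() || STOP_WORDS.contains(word)) {
                    // A stop word closes the phrase collected so far.
                    if (!current.isEmpty()) phrases.add(current);
                    current = new ArrayList<>();
                } else {
                    current.add(word);
                }
            }
            if (!current.isEmpty()) phrases.add(current);
        }
        return phrases;
    }

    /** Scores each distinct phrase as the sum of deg(w) / freq(w) over its words. */
    static Map<String, Double> score(List<List<String>> phrases) {
        Map<String, Integer> freq = new HashMap<>();
        Map<String, Integer> degree = new HashMap<>();
        for (List<String> phrase : phrases) {
            for (String word : phrase) {
                freq.merge(word, 1, Integer::sum);
                // deg(w) counts the words sharing a phrase with w, including w itself.
                degree.merge(word, phrase.size(), Integer::sum);
            }
        }
        Map<String, Double> scores = new LinkedHashMap<>();
        for (List<String> phrase : phrases) {
            double total = 0.0;
            for (String word : phrase) {
                total += (double) degree.get(word) / freq.get(word);
            }
            scores.put(String.join(" ", phrase), total);
        }
        return scores;
    }

    public static void main(String[] args) {
        String text = "linear constraints and linear equations over the set of natural numbers";
        score(candidates(text)).forEach((phrase, s) -> System.out.println(phrase + " -> " + s));
    }
}
```

Because every stop word ends a candidate, a phrase like "axis of evil" would be split into "axis" and "evil" by this sketch; the adjoining-keywords feature above exists to recover such phrases.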
# Dependencies
`pom.xml` requires one additional custom Maven module as a dependency:
```xml
<dependency>
    <groupId>io.deepreader.java.commons</groupId>
    <artifactId>commons-util</artifactId>
    <version>1.0-SNAPSHOT</version>
</dependency>
```
The module is hosted [here](https://github.com/idf/commons-util). You can get it manually with:
```
git clone https://github.com/idf/commons-util
```
# References
* [Python RAKE](https://github.com/aneesha/RAKE)
* [Python RAKE (forked)](https://github.com/idf/RAKE)
* [Java RAKE](https://github.com/Neuw84/RAKE-Java)