https://github.com/idf/rake4j

A Java implementation of the Rapid Automatic Keyword Extraction (RAKE)

Host: GitHub
URL: https://github.com/idf/rake4j
Owner: idf
License: mit
Created: 2014-12-19T05:23:28.000Z (over 11 years ago)
Default Branch: master
Last Pushed: 2017-05-30T15:16:27.000Z (about 9 years ago)
Last Synced: 2025-04-03T19:38:15.356Z (about 1 year ago)
Language: Java
Homepage:
Size: 206 KB
Stars: 7
Watchers: 4
Forks: 5
Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE

rake4j
======

This is a rewrite of [Python RAKE](https://github.com/aneesha/RAKE) in Java.

It implements the Rapid Automatic Keyword Extraction (RAKE) algorithm described in [Rose, S., Engel, D., Cramer, N., & Cowley, W. (2010). Automatic Keyword Extraction from Individual Documents](http://scholar.google.com.sg/scholar?q=Automatic+Keyword+Extraction+from+Individual+Documents&btnG=&hl=en&as_sdt=0%2C5&as_vis=1).
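For readers new to RAKE, the core idea from the paper can be sketched in a few lines: split the text into candidate phrases at stop words and punctuation, score each word as its degree divided by its frequency, and score each phrase as the sum of its word scores. The class `RakeSketch`, its tiny stop list, and its method names are illustrative assumptions and are not part of rake4j's API; the sketch also treats every non-ASCII or non-alphanumeric character as a phrase delimiter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of RAKE scoring; not the rake4j implementation.
public class RakeSketch {
    // Tiny illustrative stop list; a real run would load a full stop-word file.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
            "a", "an", "and", "the", "of", "for", "in", "is", "are", "to", "over", "on"));

    /** Splits text into candidate phrases at punctuation and stop words. */
    static List<List<String>> candidatePhrases(String text) {
        List<List<String>> phrases = new ArrayList<>();
        // Anything that is not a letter, digit, or whitespace ends a phrase.
        for (String fragment : text.toLowerCase().split("[^a-z0-9\\s]+")) {
            List<String> current = new ArrayList<>();
            for (String word : fragment.trim().split("\\s+")) {
                if (word.isEmpty() || STOP_WORDS.contains(word)) {
                    // A stop word closes the phrase collected so far.
                    if (!current.isEmpty()) phrases.add(current);
                    current = new ArrayList<>();
                } else {
                    current.add(word);
                }
            }
            if (!current.isEmpty()) phrases.add(current);
        }
        return phrases;
    }

    /** Scores each phrase as the sum of deg(w) / freq(w) over its words. */
    static Map<String, Double> score(String text) {
        List<List<String>> phrases = candidatePhrases(text);
        Map<String, Integer> freq = new HashMap<>();
        Map<String, Integer> degree = new HashMap<>();
        for (List<String> phrase : phrases) {
            for (String word : phrase) {
                freq.merge(word, 1, Integer::sum);
                // deg(w) counts co-occurrences within phrases, including w itself.
                degree.merge(word, phrase.size(), Integer::sum);
            }
        }
        Map<String, Double> scores = new LinkedHashMap<>();
        for (List<String> phrase : phrases) {
            double total = 0.0;
            for (String word : phrase) {
                total += (double) degree.get(word) / freq.get(word);
            }
            scores.put(String.join(" ", phrase), total);
        }
        return scores;
    }

    public static void main(String[] args) {
        score("Compatibility of systems of linear constraints over the set of natural numbers")
                .forEach((phrase, s) -> System.out.println(phrase + " -> " + s));
    }
}
```

Longer phrases made of words that rarely appear alone score highest, which is why multi-word terms such as "linear constraints" rank above single words in this example.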

# Run

## Sample

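Normal run:

```java
// `text` is the raw input string to extract keywords from.
Document doc = new Document(text);
RakeAnalyzer rake = new RakeAnalyzer();
rake.loadDocument(doc);
// Run RAKE without recording offset information.
rake.runWithoutOffset();
System.out.println(doc.termListToString());
```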

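Run with offset information and stemming:

```java
// `text` is the raw input string to extract keywords from.
Document doc = new Document(text);
RakeAnalyzer rake = new RakeAnalyzer();
rake.loadDocument(doc);
// Run RAKE with offset information and stemming.
rake.run();
System.out.println(doc.termMapToString());
```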

# Features

* Keyword recognition from the algorithm, based on stop words.
* Adjoining keywords, to recognize phrases that contain interior stop words, such as "axis of evil".
* KStem stemming algorithm ported from Lucene, to stem "university students" to "university student".
* Index construction over keywords, with term frequency `tf` and document frequency `df`.
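
The last feature can be illustrated with a minimal sketch of a keyword index. It assumes that `tf` counts every occurrence of a keyword across all indexed documents and that `df` counts the documents containing it at least once; rake4j may define or store these differently. The class `KeywordIndexSketch` and its `addDocument` method are hypothetical names, not rake4j's API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a tf/df keyword index; not the rake4j implementation.
public class KeywordIndexSketch {
    // tf: total occurrences of a keyword across all indexed documents (assumed definition).
    final Map<String, Integer> tf = new HashMap<>();
    // df: number of documents that contain the keyword at least once.
    final Map<String, Integer> df = new HashMap<>();

    /** Adds the keywords extracted from one document to the index. */
    void addDocument(List<String> keywords) {
        for (String keyword : keywords) {
            tf.merge(keyword, 1, Integer::sum);
        }
        // De-duplicate so each document contributes at most 1 to df.
        for (String keyword : new HashSet<>(keywords)) {
            df.merge(keyword, 1, Integer::sum);
        }
    }

    public static void main(String[] args) {
        KeywordIndexSketch index = new KeywordIndexSketch();
        index.addDocument(Arrays.asList("university student", "linear constraints", "university student"));
        index.addDocument(Arrays.asList("university student"));
        System.out.println("tf = " + index.tf.get("university student"));
        System.out.println("df = " + index.df.get("university student"));
    }
}
```

Stemming keywords before indexing, as in the "university students" example above, keeps inflected variants from being counted as separate entries.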

# Dependencies

The `pom.xml` requires one additional custom Maven module as a dependency:

```xml
<dependency>
    <groupId>io.deepreader.java.commons</groupId>
    <artifactId>commons-util</artifactId>
    <version>1.0-SNAPSHOT</version>
</dependency>
```

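You can get the module manually from [its repository](https://github.com/idf/commons-util):

```
git clone https://github.com/idf/commons-util
```

Because the dependency is a snapshot version, it typically has to be installed into your local Maven repository (for example with `mvn install` from the cloned directory) before rake4j will build.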

# References

[Python RAKE](https://github.com/aneesha/RAKE)  

[Python RAKE (forked)](https://github.com/idf/RAKE)  

[Java RAKE](https://github.com/Neuw84/RAKE-Java)
