Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/eellak/nlpbuddy
A text analysis application for performing common NLP tasks through a web dashboard interface and an API
fasttext gensim natural-language-processing spacy text-analysis text-classification
- Host: GitHub
- URL: https://github.com/eellak/nlpbuddy
- Owner: eellak
- License: agpl-3.0
- Created: 2018-07-27T10:23:42.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2019-01-18T18:06:35.000Z (about 6 years ago)
- Last Synced: 2024-10-01T05:41:24.334Z (4 months ago)
- Topics: fasttext, gensim, natural-language-processing, spacy, text-analysis, text-classification
- Language: HTML
- Homepage: http://www.nlpbuddy.io/
- Size: 929 KB
- Stars: 124
- Watchers: 21
- Forks: 28
- Open Issues: 5
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# NLPBuddy - Open Source Text Analysis Tool
## About the project
NLPBuddy is a text analysis application for performing common NLP tasks through a web dashboard interface and an API.
It uses [spaCy](https://spacy.io) for the NLP tasks and [Gensim's](https://github.com/RaRe-Technologies/gensim) implementation of the TextRank algorithm for text summarization.
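To give a feel for how TextRank-style summarization works, here is a minimal pure-Python sketch. It is not Gensim's implementation: the naive sentence splitter, the word-overlap similarity and the function names (`split_sentences`, `textrank`, `summarize`) are all assumptions made for illustration. Sentences are graph nodes, similarities are edge weights, and a PageRank-style iteration ranks the sentences.

```python
import math
import re


def split_sentences(text):
    """Naive splitter on '.', '!' and '?' (an assumption; NLPBuddy uses an NLP library)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(a, b):
    """Shared words, normalised by the log of each sentence's vocabulary size."""
    wa = set(re.findall(r"\w+", a.lower()))
    wb = set(re.findall(r"\w+", b.lower()))
    if len(wa) < 2 or len(wb) < 2:
        return 0.0  # avoids log(1) = 0 in the denominator
    return len(wa & wb) / (math.log(len(wa)) + math.log(len(wb)))


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences with a weighted PageRank iteration over the similarity graph."""
    n = len(sentences)
    weights = [
        [similarity(a, b) if i != j else 0.0 for j, b in enumerate(sentences)]
        for i, a in enumerate(sentences)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping)
            + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(text, max_sentences=2):
    """Return the top-ranked sentences, kept in their original order."""
    sentences = split_sentences(text)
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return " ".join(sentences[i] for i in sorted(top[:max_sentences]))
```

Sentences that share vocabulary with many others accumulate score, while an off-topic sentence stays near the baseline of `1 - damping`.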
It supports texts in the following languages: Greek, English, German, Spanish, Portuguese, French, Italian and Dutch. Language identification is performed automatically through [langid](https://github.com/saffsd/langid.py).
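As a rough sketch of how automatic language identification could feed model selection: `langid.classify` returns a `(language_code, score)` tuple, and `langid.set_languages` restricts the candidates. The mapping to spaCy model names below is an assumption for illustration; the models NLPBuddy actually loads may differ.

```python
# Hypothetical mapping from ISO 639-1 codes to spaCy model names
# (an assumption; not taken from the NLPBuddy source).
SPACY_MODELS = {
    "el": "el_core_news_sm",
    "en": "en_core_web_sm",
    "de": "de_core_news_sm",
    "es": "es_core_news_sm",
    "pt": "pt_core_news_sm",
    "fr": "fr_core_news_sm",
    "it": "it_core_news_sm",
    "nl": "nl_core_news_sm",
}


def detect_language(text):
    """Identify the language of `text`, restricted to the supported languages."""
    import langid  # third-party; imported lazily so the mapping works without it

    langid.set_languages(list(SPACY_MODELS))
    lang, _score = langid.classify(text)
    return lang


def model_for(lang_code):
    """Look up the model name for a language code, failing loudly if unsupported."""
    try:
        return SPACY_MODELS[lang_code]
    except KeyError:
        raise ValueError(f"unsupported language: {lang_code!r}") from None
```

With langid installed, `model_for(detect_language(text))` would give the name of the model to load for a given input.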
Tasks include:
1. Text tokenization
2. Sentence splitting (lemmatized sentences too)
3. Part-of-speech tagging (verbs, nouns, etc.)
4. Named Entity Recognition (Location, Person, Organisation, etc.)
5. Text summarization (using the TextRank algorithm, as implemented by Gensim)
6. Keyword extraction
7. Language identification
8. Text categorization (Greek only)

Text can either be provided directly or imported by specifying a URL. For the import we use the [python-readability](https://github.com/buriy/python-readability) library plus [BeautifulSoup4](https://www.crummy.com/software/BeautifulSoup/).
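Tasks 1 to 4 in the list above correspond closely to attributes of a spaCy `Doc`. The sketch below shows one way to collect them; the function names and the default model `en_core_web_sm` are assumptions, not NLPBuddy's actual code.

```python
def load_pipeline(model_name="en_core_web_sm"):
    """Load a spaCy pipeline; the model must be downloaded beforehand."""
    import spacy  # third-party; imported lazily so analyze() stays importable without it

    return spacy.load(model_name)


def analyze(text, nlp):
    """Run a loaded pipeline over `text` and collect the results of tasks 1 to 4."""
    doc = nlp(text)
    sentences = list(doc.sents)
    return {
        # 1. Tokenization
        "tokens": [token.text for token in doc],
        # 2. Sentence splitting, plus a lemmatized version of each sentence
        "sentences": [sent.text for sent in sentences],
        "lemmatized_sentences": [
            " ".join(token.lemma_ for token in sent) for sent in sentences
        ],
        # 3. Part-of-speech tags
        "pos_tags": [(token.text, token.pos_) for token in doc],
        # 4. Named entities with their labels
        "entities": [(ent.text, ent.label_) for ent in doc.ents],
    }
```

With spaCy and a model installed, `analyze("Some text.", load_pipeline())` returns a plain dictionary that is easy to serialize as JSON for an API response.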
The Greek classifier is built with [FastText](https://fasttext.cc) and is trained on 20,000 articles labeled in these categories.
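For readers unfamiliar with fastText's supervised mode, here is a hedged sketch of how such a classifier can be trained and queried. fastText expects one example per line, prefixed with `__label__<category>`. The helper names, the file path and the `sports` label used in the usage note are hypothetical, and the actual NLPBuddy training setup may differ.

```python
def to_fasttext_line(label, text):
    """Format one example as '__label__<label> <text>' on a single line."""
    clean = " ".join(text.split())  # collapse newlines and repeated whitespace
    return f"__label__{label} {clean}"


def write_training_file(examples, path):
    """Write (label, text) pairs to `path` in fastText's training format."""
    with open(path, "w", encoding="utf-8") as fh:
        for label, text in examples:
            fh.write(to_fasttext_line(label, text) + "\n")


def train_and_predict(train_path, text):
    """Train a supervised model and return (category, probability) for `text`."""
    import fasttext  # third-party; imported lazily so the helpers above work without it

    model = fasttext.train_supervised(input=train_path)
    # predict() rejects newlines, so normalise whitespace first
    labels, probs = model.predict(" ".join(text.split()))
    return labels[0].replace("__label__", ""), float(probs[0])
```

For example, `to_fasttext_line("sports", "Goal in the last minute")` produces the line `__label__sports Goal in the last minute`.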
## Demo
A working demo can be found at [http://www.nlpbuddy.io/](http://www.nlpbuddy.io/).
## Usage
Enter text and hit 'Analyze it'.

![alt text](https://raw.githubusercontent.com/eellak/text-analysis/master/static/img/screenshot1.jpg)
## API Usage
[https://github.com/eellak/text-analysis/wiki/API-usage](https://github.com/eellak/text-analysis/wiki/API-usage)
## Installation
Find development and deployment instructions here: https://github.com/eellak/text-analysis/wiki/Install
## License
The code is provided under the GNU AGPL v3.0 License.