Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/softmarshmallow/inked-engine
natural language processing out of the box
- Host: GitHub
- URL: https://github.com/softmarshmallow/inked-engine
- Owner: softmarshmallow
- Created: 2018-07-13T11:33:22.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2022-11-22T02:47:41.000Z (almost 2 years ago)
- Last Synced: 2024-10-03T13:24:07.235Z (about 1 month ago)
- Topics: django, nlp, nltk, python
- Language: Jupyter Notebook
- Homepage:
- Size: 87.7 MB
- Stars: 9
- Watchers: 2
- Forks: 0
- Open Issues: 9
Metadata Files:
- Readme: README.md
# inked engine
> A toolkit for news analysis.
Main features
* news data indexing
* news data processing
* provide an API for the service server

After receiving new news data from Inked-news-crawler, the engine indexes and pre-processes it. It analyzes the information requested by the service server and passes the results to the service server, which in turn serves the news information to clients.
## News data model
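As an illustration, the fields listed in this section could be represented in Python roughly as follows. This is a sketch, not the project's actual model: the class names `NewsTags` and `NewsItem`, the field types, and the meaning given to `origin` and `time` are assumptions; only the field names come from the list in this section.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime
from typing import List


@dataclass
class NewsTags:
    # Field names follow the README; a list of strings is an assumed type.
    company: List[str] = field(default_factory=list)
    namedEntities: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)


@dataclass
class NewsItem:
    title: str
    content: str
    origin: str      # assumed: where the article came from
    time: datetime   # assumed: publication time
    tags: NewsTags = field(default_factory=NewsTags)


# Hypothetical example values, for illustration only.
item = NewsItem(
    title="Example headline",
    content="Example body text.",
    origin="https://example.com/news/1",
    time=datetime(2018, 7, 13, 11, 33),
    tags=NewsTags(keywords=["example"]),
)
record = asdict(item)  # nested plain dict mirroring the field list
```

Keeping `tags` as its own small type makes the three tag groups explicit instead of leaving them as an untyped dictionary.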
- tags: `{ company: [], namedEntities: [], keywords: [] }`
- content
- origin
- title
- time

# How to install virtualenv
### Install **pip** first
`sudo apt-get install python3-pip`
### Then install **virtualenv** using pip3
`sudo pip3 install virtualenv`
### Now create a virtual environment
`virtualenv venv`
## KoNLPy setup
http://konlpy.org/en/v0.4.4/install/
`sudo apt-get install g++ openjdk-8-jdk`
`bash <(curl -s https://raw.githubusercontent.com/konlpy/konlpy/master/scripts/mecab.sh)`

## start the engine server
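Before starting the server, it can be worth confirming that the KoNLPy and MeCab setup from the previous section actually works. The sketch below is an assumed sanity check, not part of the project: the function name `check_konlpy` and the sample sentence are illustrative, while `konlpy.tag.Mecab` and its `nouns` method belong to KoNLPy's public API.

```python
def check_konlpy(sample="뉴스 분석을 위한 툴킷입니다."):
    """Return the nouns MeCab extracts from `sample`, or an error message string."""
    try:
        from konlpy.tag import Mecab
    except ImportError:
        return "konlpy is not installed in this environment"
    try:
        tagger = Mecab()  # fails if the MeCab binary or dictionary is missing
    except Exception as exc:
        return f"MeCab could not be loaded: {exc}"
    return tagger.nouns(sample)


print(check_konlpy())
```

A list of nouns means the tagger is usable; a string explains which part of the setup is missing.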
`daphne server.asgi:application`

## supervisor ctrl
Restart the server:
`sudo supervisorctl restart asgi_daphne`

## IMPORTANT: seed credential files
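Because the project will not run without the two files named in this section, a small pre-flight check can catch a missing file before the server starts. This is a minimal sketch under assumptions: the function name `missing_credential_files` and the idea of running it from the repository root are not from the project; only the two paths are.

```python
from pathlib import Path

# Paths taken from the project's .gitignore entries.
REQUIRED_FILES = (
    "server/settings/production.py",
    "credentials/db-connection.json",
)


def missing_credential_files(root="."):
    """Return the required files that do not exist under `root`."""
    base = Path(root)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_credential_files()
    if missing:
        print("Missing credential files:", ", ".join(missing))
    else:
        print("All credential files are present.")
```

The check only tests for existence; it does not validate the contents of either file.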
The following entries in `.gitignore` name the two files you must provide manually to run this project:
```gitignore
server/settings/production.py
credentials/db-connection.json
```

## modules
- duplicate news checker ✅
- spam news detector 🚫
- word2vec ✅ (wiki) 🚫 (news)

## used by
* [wor.io](https://github.com/softmarshmallow/wor.io)
* [inked-server](https://github.com/softmarshmallow/inked-server)

## developed by
Developed by softmarshmallow