https://github.com/oroszgy/hungarian-text-mining-workshop
Materials for the Text Mining workshop held in the HuNLP meetup, June 2017
- Host: GitHub
- URL: https://github.com/oroszgy/hungarian-text-mining-workshop
- Owner: oroszgy
- License: mit
- Created: 2017-06-22T20:11:07.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2022-04-06T18:45:01.000Z (almost 3 years ago)
- Last Synced: 2024-10-26T23:37:34.536Z (3 months ago)
- Topics: classification, hungarian, information-extraction, keyword-extraction, machine-learning, meetup, natural-language-processing, nlp, python, scikit-learn, sentiment-analysis, spacy, spacy-models, text-mining, text-mining-workshop, textacy, tutorial, workshop
- Language: Jupyter Notebook
- Size: 18.6 MB
- Stars: 20
- Watchers: 3
- Forks: 6
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE.md
# Text mining workshop
## Preparation for the workshop
Please come prepared with:
* basic knowledge of Python
* experience in using Jupyter notebooks

During the course we will use a little bit of Pandas ([10 minute intro](https://pandas.pydata.org/pandas-docs/stable/10min.html)) and [scikit-learn](http://scikit-learn.org/stable/) to build simple machine learning models.
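To give a rough idea of the level of Pandas and scikit-learn familiarity that is useful, here is a minimal sketch of a text classifier. It is not taken from the workshop notebooks: the toy sentences, the labels, and the choice of `CountVectorizer` with `LogisticRegression` are assumptions made purely for illustration.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy data; the workshop notebooks use their own datasets.
df = pd.DataFrame({
    "text": [
        "great talk and very useful examples",
        "I really enjoyed the workshop",
        "boring slides and poor audio",
        "the session was a waste of time",
    ],
    "label": ["pos", "pos", "neg", "neg"],
})

# Bag-of-words features followed by a linear classifier.
model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(df["text"], df["label"])

# Predict a label for an unseen sentence.
prediction = model.predict(["useful and enjoyable talk"])
print(prediction[0])
```

If this snippet reads comfortably, the Pandas and scikit-learn parts of the course should not be an obstacle.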
## Install dependencies and run the notebooks
### The easy way: using Docker
Get the docker image: `docker pull oroszgy/hungarian-text-mining-workshop`
Start Jupyter Notebook: `make start`
### The hard way: installing the packages manually
0. Make sure you have Python 3.5+ installed (preferably a conda distribution)
1. Clone this repository: `git clone http://github.com/oroszgy/hungarian-text-mining-workshop && cd hungarian-text-mining-workshop`
2. Install the necessary packages: `pip install -r requirements.txt`
3. Download the English and the Hungarian NLP models for spaCy:
* `python -m spacy download en`
* `pip install https://github.com/oroszgy/spacy-hungarian-models/releases/download/hu_tagger_web_md-0.1.0/hu_tagger_web_md-0.1.0.tar.gz`
4. Install HuNlpy
* `pip install https://github.com/oroszgy/hunlp/releases/download/0.2/hunlp-0.2.0.tar.gz`

Start Jupyter Notebook: `jupyter notebook`
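To confirm that the manual installation worked, a short check like the one below can report which modules Python is able to find. This is only a sketch: the import names `hu_tagger_web_md` and `hunlp` are assumptions inferred from the archive names in the steps above, and `check_modules` is a hypothetical helper, not part of the workshop materials.

```python
from importlib.util import find_spec

# Import names for the packages installed above.
# "hu_tagger_web_md" and "hunlp" are assumed from the archive names.
WORKSHOP_MODULES = [
    "pandas",
    "sklearn",
    "spacy",
    "textacy",
    "hu_tagger_web_md",
    "hunlp",
]


def check_modules(names):
    """Return a dict mapping each top-level module name to True if it can be found."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules(WORKSHOP_MODULES).items():
        print("{:20s} {}".format(name, "ok" if found else "MISSING"))
```

The check only looks the modules up without importing them, so it runs quickly and does not load any models.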
## Table of Contents
1. [Practical NLP in Python: `spaCy` and `textacy`, Describing documents with words](./1_Intro.ipynb)
2. [Document categorization, Sentiment analysis](./2_TextCategorization.ipynb)
3. [Extracting named entities and concepts](./3_EntitiesAndConcepts.ipynb)

## Software used
* [spaCy](https://spacy.io)
* [Hungarian model for spaCy](https://github.com/oroszgy/spacy-hungarian-models)
* [textacy](http://textacy.readthedocs.io/)
* [scikit-learn](http://scikit-learn.org/stable/)
* [HuNlp](https://github.com/oroszgy/hunlp)
* [DBpedia Spotlight](http://www.dbpedia-spotlight.org/)

---
(c) Gyorgy Orosz, 2017