{"id":15519784,"url":"https://github.com/oroszgy/hungarian-text-mining-workshop","last_synced_at":"2025-04-23T04:17:54.240Z","repository":{"id":22153224,"uuid":"95152596","full_name":"oroszgy/hungarian-text-mining-workshop","owner":"oroszgy","description":"Materials for the Text Mining workshop held in the  HuNLP meetup, June 2017","archived":false,"fork":false,"pushed_at":"2022-04-06T18:45:01.000Z","size":19461,"stargazers_count":20,"open_issues_count":7,"forks_count":5,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-23T04:17:45.334Z","etag":null,"topics":["classification","hungarian","information-extraction","keyword-extraction","machine-learning","meetup","natural-language-processing","nlp","python","scikit-learn","sentiment-analysis","spacy","spacy-models","text-mining","text-mining-workshop","textacy","tutorial","workshop"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/oroszgy.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-06-22T20:11:07.000Z","updated_at":"2023-09-08T17:26:41.000Z","dependencies_parsed_at":"2022-08-07T10:01:34.505Z","dependency_job_id":null,"html_url":"https://github.com/oroszgy/hungarian-text-mining-workshop","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oroszgy%2Fhungarian-text-mining-workshop","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oroszgy%2Fhungarian-text-mining-workshop/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/Git
Hub/repositories/oroszgy%2Fhungarian-text-mining-workshop/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oroszgy%2Fhungarian-text-mining-workshop/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/oroszgy","download_url":"https://codeload.github.com/oroszgy/hungarian-text-mining-workshop/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250366715,"owners_count":21418772,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["classification","hungarian","information-extraction","keyword-extraction","machine-learning","meetup","natural-language-processing","nlp","python","scikit-learn","sentiment-analysis","spacy","spacy-models","text-mining","text-mining-workshop","textacy","tutorial","workshop"],"created_at":"2024-10-02T10:22:42.458Z","updated_at":"2025-04-23T04:17:54.213Z","avatar_url":"https://github.com/oroszgy.png","language":"Jupyter Notebook","funding_links":[],"categories":["Learning resources"],"sub_categories":["Tutorials"],"readme":"# Text mining workshop\n\n## Preparation for the workshop\n\nPlease be prepared with\n\n* basic knowledge of Python\n* experience in using Jupyter notebooks\n\nDuring the course we will use a little bit of Pandas ([10 minute intro](https://pandas.pydata.org/pandas-docs/stable/10min.html)) and [scikit-learn](http://scikit-learn.org/stable/) to build simple machine learning models.\n\n## Install dependencies and run the notebooks\n\n### The easy way: using Docker\n\nGet the Docker image: `docker pull 
oroszgy/hungarian-text-mining-workshop`\n\nStart Jupyter Notebook: `make start`\n\n### The hard way: installing the packages manually\n\n0. Make sure you have Python 3.5+ installed (preferably a conda distribution)\n1. Clone this repository: `git clone http://github.com/oroszgy/hungarian-text-mining-workshop \u0026\u0026 cd hungarian-text-mining-workshop`\n2. Install the necessary packages: `pip install -r requirements.txt`\n3. Download the English and the Hungarian NLP models for spaCy:\n    * `python -m spacy download en`\n    * `pip install https://github.com/oroszgy/spacy-hungarian-models/releases/download/hu_tagger_web_md-0.1.0/hu_tagger_web_md-0.1.0.tar.gz`\n4. Install HuNlp\n    * `pip install https://github.com/oroszgy/hunlp/releases/download/0.2/hunlp-0.2.0.tar.gz`\n\nStart Jupyter Notebook: `jupyter notebook`\n\n## Table of Contents\n\n1. [Practical NLP in Python: `spaCy` and `textacy`, Describing documents with words](./1_Intro.ipynb)\n2. [Document categorization, Sentiment analysis](./2_TextCategorization.ipynb)\n3. [Extracting named entities and concepts](./3_EntitiesAndConcepts.ipynb)\n\n## Software used\n\n* [spaCy](https://spacy.io)\n* [Hungarian model for spaCy](https://github.com/oroszgy/spacy-hungarian-models)\n* [textacy](http://textacy.readthedocs.io/)\n* [scikit-learn](http://scikit-learn.org/stable/)\n* [HuNlp](https://github.com/oroszgy/hunlp)\n* [DBpedia Spotlight](http://www.dbpedia-spotlight.org/)\n\n---\n\n(c) Gyorgy Orosz, 2017\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foroszgy%2Fhungarian-text-mining-workshop","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Foroszgy%2Fhungarian-text-mining-workshop","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foroszgy%2Fhungarian-text-mining-workshop/lists"}