{"id":13935101,"url":"https://github.com/bonzanini/nlp-tutorial","last_synced_at":"2025-07-19T20:30:37.523Z","repository":{"id":55872926,"uuid":"63003667","full_name":"bonzanini/nlp-tutorial","owner":"bonzanini","description":"Tutorial: Natural Language Processing in Python","archived":false,"fork":false,"pushed_at":"2018-05-29T11:14:48.000Z","size":2114,"stargazers_count":276,"open_issues_count":0,"forks_count":152,"subscribers_count":31,"default_branch":"master","last_synced_at":"2024-08-08T23:20:03.800Z","etag":null,"topics":["natural-language-processing","nlp","python"],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bonzanini.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2016-07-10T14:52:20.000Z","updated_at":"2024-03-07T15:39:01.000Z","dependencies_parsed_at":"2022-08-15T08:20:13.650Z","dependency_job_id":null,"html_url":"https://github.com/bonzanini/nlp-tutorial","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bonzanini%2Fnlp-tutorial","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bonzanini%2Fnlp-tutorial/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bonzanini%2Fnlp-tutorial/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bonzanini%2Fnlp-tutorial/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bonzanini","download_url":"https://codeload.github.com/bonzanini/nlp-tutorial/tar.gz/refs/heads/master","hos
t":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":226666518,"owners_count":17665040,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["natural-language-processing","nlp","python"],"created_at":"2024-08-07T23:01:23.877Z","updated_at":"2025-07-19T20:30:37.513Z","avatar_url":"https://github.com/bonzanini.png","language":"Jupyter Notebook","funding_links":[],"categories":["Jupyter Notebook"],"sub_categories":[],"readme":"Tutorial: Natural Language Processing in Python\n=====\n\nThis repo contains material for a workshop on Natural Language Processing with Python.\n\nAudience\n-----\n\nThe target audience of this workshop are students, researchers, developers, hobbyists and anyone interested in knowing more about Natural Language Processing and Text Analytics.\n\nSome very basic knowledge of Python is assumed (e.g. if you have seen some Python script before, you're good to go), but no previous NLP knowledge is required.\n\n\nPresentations\n-----\n\nDifferent versions of this workshop have been delivered at different events:\n\n- PyCon UK 2016: 3h session (slides ``presentations/2016-pyconuk-slides.pdf``)\n- PyCon Ireland 2016: 1.5h session (slides ``presentations/2016-pyconie-slides.pdf``)\n- PyCon Italy 2017: 3.5h session (slides ``presentations/2017-pyconitaly-slides.pdf``)\n\n\nEnvironment Set up\n-----\n\nThe code has been tested with Python ``3.4`` and ``3.5``. 
Support for Python ``2.7`` is best-effort; if you find an issue, please report it.\n\nThis section describes how to set up your environment locally.\n\nStep 1 - clone this repo::\n\n    git clone https://github.com/bonzanini/nlp-tutorial\n    cd nlp-tutorial\n\nStep 2 - create and activate a Python virtual environment::\n\n    virtualenv nlp-venv\n    source nlp-venv/bin/activate\n\nStep 2 (alternative) - create a Conda environment::\n\n    conda create --name nlp-venv python=3.5\n    source activate nlp-venv\n\nStep 3 - install libraries::\n\n    pip install -r requirements.txt\n\nThis will download and install NLTK, scikit-learn and jupyter (plus dependencies).\n\nNLTK requires some data to be installed separately (more details on `the NLTK website \u003chttp://www.nltk.org/data.html\u003e`_).\n\nFrom the command line, you can download the required packages::\n\n    python -m nltk.downloader punkt stopwords reuters\n\nAlternatively, from a Python interactive shell::\n\n    \u003e\u003e\u003e import nltk\n    \u003e\u003e\u003e nltk.download()\n\nThen use the GUI to select the required packages (punkt, stopwords, reuters).\n\nTip: although you can use \"all\" as the package name to install all the NLTK data, doing so over a flaky conference wi-fi is not a good idea. This will download approx. 2 GB, and if we all do it at the same time we'll kill the conference wi-fi :)\n\nFinally - run Jupyter::\n\n    jupyter notebook\n\nTo test that your environment is set up correctly, please open the notebook \"00 Environment Test\" and follow the instructions.\n\n\nmatplotlib backend issues\n-----\n\nThere might be a few issues related to ``matplotlib`` backends, as described `in their documentation \u003chttp://matplotlib.org/faq/virtualenv_faq.html\u003e`_, especially on macOS.\n\nEditing or creating the file ``~/.matplotlib/matplotlibrc`` with the following line::\n\n    backend: TkAgg\n\nshould fix the issue. 
If not, please refer to the `matplotlib docs \u003chttp://matplotlib.org/faq/virtualenv_faq.html\u003e`_\n\n\nAuthors\n-----\n\nMain authors:\n\n- Marco Bonzanini (`@MarcoBonzanini \u003chttp://www.twitter.com/marcobonzanini\u003e`_)\n- Miguel Martinez-Alvarez (`@MiguelMAlvarez \u003chttp://www.twitter.com/miguelmalvarez\u003e`_)\n\n\nLicense\n-----\n\nCode (mainly in `notebooks` folder) under MIT license.\n\nDocumentation and slides under CC-BY license.\n\n\nData\n-----\n\n- Documents in `data/recipes` are public domain from Project Gutenberg\n- Documents in `data/pyconuk2016` are the abstracts from https://github.com/PyconUK/2016.pyconuk.org\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbonzanini%2Fnlp-tutorial","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbonzanini%2Fnlp-tutorial","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbonzanini%2Fnlp-tutorial/lists"}