https://github.com/md-emon-hasan/nlp-codebasics
Collection of basic Natural Language Processing examples that cover essential techniques like tokenization, text representation, and text classification.
- Host: GitHub
- URL: https://github.com/md-emon-hasan/nlp-codebasics
- Owner: Md-Emon-Hasan
- License: apache-2.0
- Created: 2024-12-11T14:06:47.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2025-01-17T17:51:03.000Z (about 1 year ago)
- Last Synced: 2025-09-04T00:43:16.945Z (7 months ago)
- Topics: bag-of-words, bow, gensim, gensim-word2vec, lematization, nlp, nlp-library, nlp-machine-learning, nltk, nltk-python, python3, spacy, text-classification, text-processing, tokenization
- Language: Jupyter Notebook
- Homepage:
- Size: 21.1 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Roadmap: Roadmap/Roadmap.md
README
# NLP CodeBasics
This repository provides basic code examples for Natural Language Processing (NLP). It covers core tasks such as tokenization, part-of-speech tagging, named entity recognition, and text classification. The examples are implemented with the popular Python libraries **spaCy**, **NLTK**, and **Gensim**.
Key Features:
- Tokenization with **spaCy**
- **Part-of-Speech Tagging** and **Named Entity Recognition** (NER)
- Text representation with **TF-IDF**, **Word Vectors**, and **Word2Vec**
- **Text Classification** using **spaCy** and **Gensim**
- Comparative studies: **spaCy vs NLTK**
- Advanced NLP techniques for feature extraction and preprocessing
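
To make the text-representation feature concrete, here is a minimal sketch of bag-of-words counts and TF-IDF weights using only the Python standard library. It is not taken from the repository's notebooks: the helper names `tokenize`, `bag_of_words`, and `tf_idf`, the toy corpus, and the regex tokenizer are all assumptions made for illustration. It uses the plain textbook weighting (term frequency times `log(N / df)`); library implementations often apply smoothed variants, so their numbers will generally differ.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Naive tokenizer (illustrative stand-in for spaCy or NLTK):
    lowercase, then keep runs of letters, digits, and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


def bag_of_words(tokens):
    """Bag-of-words representation: map each token to its count."""
    return Counter(tokens)


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    weight = (count / document length) * log(N / document frequency)
    This is the unsmoothed textbook formula, chosen for readability.
    """
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = bag_of_words(tokens)
        total = len(tokens)
        vectors.append(
            {term: (count / total) * math.log(n_docs / df[term])
             for term, count in counts.items()}
        )
    return vectors


# Hypothetical toy corpus, used only to exercise the functions.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
]
vectors = tf_idf(corpus)

# Print the terms of the first document, highest weight first.
for term, weight in sorted(vectors[0].items(), key=lambda kv: -kv[1]):
    print(f"{term:>5}  {weight:.4f}")
```

In this sketch a term that appears in every document gets a weight of zero, and terms unique to one document are weighted highest, which is the intuition behind using TF-IDF rather than raw counts as classifier features.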