https://github.com/shakilgithub20/text-preprocessing
- Host: GitHub
- URL: https://github.com/shakilgithub20/text-preprocessing
- Owner: Shakilgithub20
- Created: 2021-09-19T19:10:02.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2021-09-19T19:23:27.000Z (over 3 years ago)
- Last Synced: 2025-01-10T10:29:55.236Z (4 months ago)
- Topics: contractions, corpus-processing, matplotlib-venn, nltk, preprocessing, text-preprocessing
- Language: Jupyter Notebook
- Homepage:
- Size: 3.77 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Text Preprocessing in NLP
The text preprocessing steps are:

- Tokenization
- Lowercasing
- Stop word removal
- Stemming
- Lemmatization
- `remove_hashtag`
- `remove_urls`
- `remove_numbers`
- `remove_special_characters`
- `remove_extra_whitespace_tab`
- `remove_punctuation`
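
The cleaning steps listed above could be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the notebook's actual code: the function names are taken from the list, but their bodies, the regular expressions, the `preprocess` helper, and the tiny `STOP_WORDS` set are all assumptions made for the example. Tokenization here is a plain whitespace split. Stemming and lemmatization are left out because they normally rely on a library such as NLTK (for example `PorterStemmer` and `WordNetLemmatizer`).

```python
import re
import string

# Tiny illustrative stop-word set (an assumption; NLTK ships a much fuller list).
STOP_WORDS = {"a", "an", "the", "is", "are", "and", "or", "to", "of", "in", "at"}


def remove_urls(text):
    # Replace anything that starts with http://, https:// or www. with a space.
    return re.sub(r"(https?://|www\.)\S+", " ", text)


def remove_hashtag(text):
    # Drop the whole hashtag, e.g. "#NLP" (assumed behaviour; the original
    # notebook might keep the word and strip only the "#").
    return re.sub(r"#\w+", " ", text)


def remove_numbers(text):
    # Replace runs of digits with a space.
    return re.sub(r"\d+", " ", text)


def remove_punctuation(text):
    # Delete every ASCII punctuation character.
    return text.translate(str.maketrans("", "", string.punctuation))


def remove_special_characters(text):
    # Replace anything other than ASCII letters, digits and whitespace.
    return re.sub(r"[^A-Za-z0-9\s]", " ", text)


def remove_extra_whitespace_tab(text):
    # Collapse spaces, tabs and newlines into single spaces and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


def preprocess(text):
    # Hypothetical helper that chains the steps, then lowercases,
    # tokenizes on whitespace and filters stop words.
    text = remove_urls(text)
    text = remove_hashtag(text)
    text = remove_numbers(text)
    text = remove_punctuation(text)
    text = remove_special_characters(text)
    text = remove_extra_whitespace_tab(text)
    tokens = text.lower().split()
    return [token for token in tokens if token not in STOP_WORDS]


if __name__ == "__main__":
    print(preprocess("Check https://example.com #NLP  in 2021: the\tBEST tools!"))
```

In this sketch, URLs and hashtags are removed before punctuation, since stripping `:`, `/` or `#` first would leave fragments that the URL and hashtag patterns no longer match.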