https://github.com/play3rzer0/wordcount
Using NLP To Find Most Common Words In Text Documents
natural-language-processing nlp nltk-python python text-processing
- Host: GitHub
- URL: https://github.com/play3rzer0/wordcount
- Owner: Play3rZer0
- Created: 2019-03-10T23:01:43.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2019-03-15T21:11:43.000Z (over 5 years ago)
- Last Synced: 2024-07-11T14:25:42.054Z (4 months ago)
- Topics: natural-language-processing, nlp, nltk-python, python, text-processing
- Language: Python
- Size: 6.84 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# WordCount
Using NLP to find the most common words in text documents. The script counts how many times the most common words occur in a text document. It uses NLTK (Natural Language Toolkit) 3.0 in a Python 2.7.1x or 3.x environment. I was originally using Python 2.7.10 as a virtual environment within a Python 3.6.5 Anaconda installation. The code was originally written for 2.7.10, but it was modified to support the latest version.

The following modules must be imported:
- collections
- re
- nltk

The stopwords must also be imported from `nltk.corpus`.
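
As a rough illustration of how the modules listed above can fit together, here is a minimal sketch. It is not the repository's actual code: the function name `most_common_words`, the regular expression, and the small fallback stopword set are all assumptions made for this example. The NLTK stopword list needs a one-time `nltk.download("stopwords")`; if NLTK or its data is missing, the sketch falls back to the hypothetical short list so it still runs.

```python
import re
from collections import Counter

try:
    from nltk.corpus import stopwords
    # Raises LookupError if the stopwords corpus has not been downloaded yet.
    STOP_WORDS = set(stopwords.words("english"))
except (ImportError, LookupError):
    # Assumption: a tiny illustrative fallback, not NLTK's real list.
    STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "on"}


def most_common_words(text, n=10):
    """Return the n most frequent non-stopword words as (word, count) pairs."""
    # Lowercase the text and extract words, allowing one inner apostrophe.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    # Drop stopwords before counting.
    filtered = [w for w in words if w not in STOP_WORDS]
    return Counter(filtered).most_common(n)


if __name__ == "__main__":
    sample = "The cat sat on the mat. The cat saw a bird, and the bird saw the cat."
    print(most_common_words(sample, 3))
```

Filtering stopwords before counting keeps words such as "the" and "and" from dominating the result, which is presumably why the stopwords import is required.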
The following is the syntax to run the code from a terminal (Linux, macOS, Unix) or command prompt (Windows):
python .py