Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/play3rzer0/wordcount

Using NLP To Find Most Common Words In Text Documents

natural-language-processing nlp nltk-python python text-processing


Host: GitHub
URL: https://github.com/play3rzer0/wordcount
Owner: Play3rZer0
Created: 2019-03-10T23:01:43.000Z (over 5 years ago)
Default Branch: master
Last Pushed: 2019-03-15T21:11:43.000Z (over 5 years ago)
Last Synced: 2024-07-11T14:25:42.054Z (4 months ago)
Topics: natural-language-processing, nlp, nltk-python, python, text-processing
Language: Python
Size: 6.84 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# WordCount
Using NLP To Find Most Common Words In Text Documents.

Counts how many times the most common words occur in a text document. This makes use of NLTK
(Natural Language Toolkit) 3.0 in a Python 2.7.1x or 3.x environment. I originally used Python 2.7.10 in a virtual environment inside a Python 3.6.5 Anaconda installation. The code was originally written for 2.7.10, but it was modified to
support the latest version.

The following modules must be imported:
- collections
- re
- nltk

The stopwords must also be imported from nltk.corpus.
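
As a rough illustration of how these modules fit together, here is a minimal sketch. It is not the repository's actual code: the function names `load_stopwords` and `most_common_words`, the `[a-z]+` tokenizing pattern, and the small fallback stopword set are all assumptions made for this example.

```python
import re
from collections import Counter


def load_stopwords():
    """Return NLTK's English stopwords, or a tiny fallback set.

    The fallback is an assumption for illustration only; it is used when
    NLTK is not installed or its stopwords corpus has not been downloaded.
    """
    try:
        from nltk.corpus import stopwords
        return set(stopwords.words("english"))
    except (ImportError, LookupError):
        return {"the", "a", "an", "and", "of", "in", "to", "is"}


def most_common_words(text, n=10, stop=None):
    """Return the n most frequent non-stopword words as (word, count) pairs."""
    stop = load_stopwords() if stop is None else set(stop)
    # Lowercase the text and keep runs of letters as words (assumed tokenizer).
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in stop)
    return counts.most_common(n)
```

Calling `most_common_words(open("document.txt").read(), n=10)` would then list the ten most frequent words in a hypothetical `document.txt`. If NLTK is installed but the corpus is missing, `nltk.download("stopwords")` fetches it.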

The following is the syntax to run the code from a terminal (Linux, macOS, Unix) or a command prompt (Windows):

python .py