https://github.com/reykan-kp/wordprobabilitycalculator
Analyze text documents and discover the top 5 words with their probabilities and counts. Available in Java, Python, and JavaScript.
- Host: GitHub
- URL: https://github.com/reykan-kp/wordprobabilitycalculator
- Owner: ReyKan-KP
- License: MIT
- Created: 2023-10-07T05:35:44.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-10-07T05:51:57.000Z (over 1 year ago)
- Last Synced: 2024-11-24T18:16:59.058Z (about 2 months ago)
- Language: Java
- Size: 1.18 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Word Probability Calculator
This repository contains three implementations of a Word Probability Calculator in different programming languages: Java, Python, and JavaScript. The calculator analyzes a collection of text documents, calculates the probability of each word occurring, and displays the top 5 words with their probabilities and counts.
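The README does not spell out how a word's probability is defined. A minimal sketch, assuming it is the word's relative frequency (its count divided by the total number of words across all documents), might look like the following. The function name `top_words` and the regex tokenizer are illustrative assumptions, not code from the repository, whose Python version relies on NLTK.

```python
import re
from collections import Counter


def top_words(documents, n=5):
    """Return the n most frequent words as (word, probability, count) tuples.

    Assumption: probability = count of the word / total number of words.
    """
    counts = Counter()
    for text in documents:
        # Lowercase the text and extract alphabetic words, optionally with one
        # inner apostrophe (a simple stand-in for a real tokenizer).
        counts.update(re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower()))

    total = sum(counts.values())
    if total == 0:
        return []
    return [(word, count / total, count) for word, count in counts.most_common(n)]


if __name__ == "__main__":
    docs = ["the cat sat on the mat", "the dog sat"]
    for word, prob, count in top_words(docs):
        print(f"{word}: probability={prob:.3f}, count={count}")
```

This sketch does no stopword filtering, so common words such as "the" dominate the result; the actual implementations remove stopwords first.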
## Java Implementation
### Dependencies
- Java 8 or higher
### How to Run
1. Compile the Java code:
```bash
javac WordProbabilityCalculator.java
```
2. Run the Java program:
```bash
java WordProbabilityCalculator
```
## Python Implementation
### Dependencies
- Python 3.x
- NLTK library (Natural Language Toolkit)

You can install NLTK using the following command:
```bash
pip install nltk
```
### How to Run
1. Navigate to the Python code directory.
2. Run the Python program:
```bash
python word_probability_calculator.py
```
## JavaScript Implementation
### How to Run
1. Navigate to the JavaScript code directory.
2. Run the JavaScript program:
```bash
node wordProbabilityCalculator.js
```
## Stopwords
All implementations use a list of English stopwords to filter out common words. You can customize the list by modifying the `stopwords.txt` file.
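As a rough illustration of the filtering step, the sketch below assumes `stopwords.txt` holds one word per line; the README does not state the file format, and the helper names `load_stopwords` and `remove_stopwords` are hypothetical.

```python
from pathlib import Path


def load_stopwords(path="stopwords.txt"):
    """Read stopwords into a set, assuming one word per line."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    # Skip blank lines and normalize case so lookups are case-insensitive.
    return {line.strip().lower() for line in lines if line.strip()}


def remove_stopwords(words, stopwords):
    """Keep only the words that are not in the stopword set."""
    return [w for w in words if w.lower() not in stopwords]
```

Storing the stopwords in a set keeps each membership check constant-time, which matters once the documents grow large.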
## Directory Structure
- `documents`: Place your text documents in this directory for analysis.
- `stopwords.txt`: Customize the list of stopwords here.
## Output
All implementations will display the top 5 words across all documents, including each word's probability and count.
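The exact output layout is not documented here. One possible way to render such results, assuming they arrive as `(word, probability, count)` tuples, is sketched below; `format_top_words` is a hypothetical helper and the sample values are made up.

```python
def format_top_words(results):
    """Render (word, probability, count) tuples as aligned text lines."""
    lines = [f"{'Word':<15}{'Probability':>12}{'Count':>8}"]
    for word, probability, count in results:
        lines.append(f"{word:<15}{probability:>12.4f}{count:>8d}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample values, only to show the layout.
    sample = [("data", 0.05, 10), ("model", 0.04, 8)]
    print(format_top_words(sample))
```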
## Contributing
Contributions are welcome: add features, improve the existing code, or provide implementations in other programming languages.
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.