https://github.com/singhxtushar/nextwordai
Last synced: 8 months ago
- Host: GitHub
- URL: https://github.com/singhxtushar/nextwordai
- Owner: SINGHxTUSHAR
- License: mit
- Created: 2025-01-30T05:26:06.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2025-01-30T06:08:31.000Z (8 months ago)
- Last Synced: 2025-01-30T06:24:52.053Z (8 months ago)
- Language: Jupyter Notebook
- Size: 0 Bytes
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# NextWordAI
#### Project Overview:
This project develops a deep learning model that predicts the next word in a sequence of words using Long Short-Term Memory (LSTM) networks. LSTMs suit sequence prediction tasks because they can capture long-term dependencies in data. The model is trained on the text of Shakespeare's "Hamlet," a rich and complex dataset that challenges the model's predictive capabilities.

By learning the patterns and dependencies within the text, the model can support applications such as text auto-completion and interactive chatbots. A web application exposes the model so that users can interact with it in real time.
## Workflow Description
The project follows a structured workflow consisting of six key steps:

1- `Data Collection`: We use the text of Shakespeare's "Hamlet" as our dataset. This rich, complex text provides a good challenge for our model.
2- `Data Preprocessing`: The text data is tokenized, converted into sequences, and padded to ensure uniform input lengths. The sequences are then split into training and testing sets.
3- `Model Building`: An LSTM model is constructed with an embedding layer, two LSTM layers, and a dense output layer with a softmax activation function to predict the probability of the next word.
4- `Model Training`: The model is trained using the prepared sequences, with early stopping implemented to prevent overfitting. Early stopping monitors the validation loss and stops training when the loss stops improving.
5- `Model Evaluation`: The model is evaluated using a set of example sentences to test its ability to predict the next word accurately.
6- `Deployment`: A Streamlit web application is developed to allow users to input a sequence of words and get the predicted next word in real-time.
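
The preprocessing, model-building, and training steps above can be sketched as follows. This is a minimal illustration, not the code from the repository's notebook: every function name, the regular-expression tokenizer, and the hyperparameters (`embed_dim`, `units`, `patience`, the 20% validation split) are assumptions chosen for the example. The preprocessing is written in plain Python so it runs without extra dependencies; the two model functions assume TensorFlow/Keras and NumPy are installed and import them only when called.

```python
import re


def build_vocab(text):
    """Map each distinct lowercase word to an integer id; id 0 is reserved for padding."""
    vocab = {}
    for word in re.findall(r"[a-z']+", text.lower()):
        vocab.setdefault(word, len(vocab) + 1)
    return vocab


def make_sequences(text, vocab):
    """Turn each line into its growing prefixes: [w1, w2], [w1, w2, w3], ..."""
    sequences = []
    for line in text.splitlines():
        ids = [vocab[w] for w in re.findall(r"[a-z']+", line.lower())]
        for end in range(2, len(ids) + 1):
            sequences.append(ids[:end])
    return sequences


def pad_left(sequences, max_len):
    """Left-pad with zeros so that every sequence has length max_len."""
    return [[0] * (max_len - len(s)) + s for s in sequences]


def split_inputs_labels(padded):
    """The last token of each sequence is the label; the rest is the model input."""
    return [s[:-1] for s in padded], [s[-1] for s in padded]


def build_model(vocab_size, max_len, embed_dim=100, units=150):
    """Embedding -> two LSTM layers -> softmax over the vocabulary (assumed sizes)."""
    from tensorflow.keras.layers import Dense, Embedding, Input, LSTM
    from tensorflow.keras.models import Sequential

    model = Sequential([
        Input(shape=(max_len - 1,)),
        Embedding(vocab_size, embed_dim),
        LSTM(units, return_sequences=True),  # pass the full sequence to the next LSTM
        LSTM(units),
        Dense(vocab_size, activation="softmax"),
    ])
    model.compile(loss="sparse_categorical_crossentropy",
                  optimizer="adam", metrics=["accuracy"])
    return model


def train(model, inputs, labels, epochs=50):
    """Fit with early stopping on validation loss (patience value is an assumption)."""
    import numpy as np
    from tensorflow.keras.callbacks import EarlyStopping

    stop = EarlyStopping(monitor="val_loss", patience=3, restore_best_weights=True)
    return model.fit(np.array(inputs), np.array(labels), epochs=epochs,
                     validation_split=0.2, callbacks=[stop])


if __name__ == "__main__":
    sample = "to be or not to be\nthat is the question"
    vocab = build_vocab(sample)
    seqs = make_sequences(sample, vocab)
    max_len = max(len(s) for s in seqs)
    inputs, labels = split_inputs_labels(pad_left(seqs, max_len))
    print(len(vocab), "words,", len(seqs), "training sequences of length", max_len)
```

With the full text of "Hamlet" in place of `sample`, a model for this data would be created with `build_model(len(vocab) + 1, max_len)`; the `+ 1` accounts for the padding id 0.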
## Requirements💻 :
Ensure you have the following dependencies installed:
- Python (version 3.11.x || 3.12.x)
- IDE: VS Code or Google Colab
- Virtual environment (venv)
- Other dependencies (refer to requirement.txt)

You can install the required Python packages using:
```bash
pip install -r requirement.txt
```

## Setup 💿:
- Clone the repository:
```bash
git clone https://github.com/SINGHxTUSHAR/NextWordAI.git
cd NextWordAI
```
- Create a virtual environment (optional but recommended):
```bash
python -m venv venv
```
- Activate the virtual environment:
- On Windows:
```bash
venv\Scripts\activate
```
- On macOS/Linux:
```bash
source venv/bin/activate
```

## Contributing 📌:
If you'd like to contribute to this project, please follow the standard GitHub fork and pull request process. Contributions, issues, and feature requests are welcome!

## Suggestion 🚀:
If you have any suggestions for me related to this project, feel free to contact me at tusharsinghrawat.delhi@gmail.com or LinkedIn.

## License 📝:
This project is licensed under the MIT License - see the LICENSE file for details.