https://github.com/prince2004patel/next_word_prediction
This project implements a Next Word Prediction model using LSTM with an Attention mechanism.
- Host: GitHub
- URL: https://github.com/prince2004patel/next_word_prediction
- Owner: prince2004patel
- Created: 2025-02-18T07:52:42.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-02-18T08:10:16.000Z (over 1 year ago)
- Last Synced: 2025-05-18T16:50:15.610Z (about 1 year ago)
- Topics: keras, next-word-prediction, python, streamlit, tensorflow
- Language: Jupyter Notebook
- Homepage: https://next-word-prediction-by-prince.streamlit.app/
- Size: 12.5 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Next Word Prediction Using LSTM + Attention
## Overview
This project implements a **Next Word Prediction** model using an **LSTM (Long Short-Term Memory) network with an Attention mechanism**. The model is trained on the text of **Shakespeare's Hamlet** and reaches **70% accuracy** on test data when predicting the next word in a sequence.
## Live Demo
[Open the live demo on Streamlit](https://next-word-prediction-by-prince.streamlit.app/)
## Project Workflow
### 1. Data Collection
- Collected the raw text of **Shakespeare's Hamlet** as the dataset.
### 2. Data Preprocessing
- **Tokenization:** Used `Tokenizer` from Keras to convert text into sequences.
- **Lowercasing:** Converted all text to lowercase for consistency.
- **Vocabulary Size Check:** Identified a total of **4,818 unique words**.
- **Input Sequence Creation:**
  - Generated input sequences by taking a sliding window of words.
  - Example: the window "to be or not to" slides one word forward to "be or not to [next word]", where [next word] is the word the model has to predict.
- **Finding Maximum Sequence Length:** Computed the longest input sequence length.
- **Padding Sequences:**
  - Used `pad_sequences()` to apply **pre-padding** (padding at the beginning) to standardize input size.
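
The preprocessing steps above can be sketched without any dependencies. This is only an illustration under stated assumptions: the project itself uses Keras' `Tokenizer` and `pad_sequences()`, while the helpers below (`build_word_index`, `make_prefix_sequences`, `pre_pad`) are hypothetical stand-ins. The sketch assumes that training sequences are built as growing prefixes of each line, which is one common approach and may differ from the notebook. Unlike `Tokenizer`, it does not strip punctuation and it numbers words by first appearance rather than by frequency.

```python
# Dependency-free sketch of tokenization, sequence creation and pre-padding.
# All helper names are hypothetical; the project uses Keras utilities instead.

def build_word_index(text):
    """Map each lowercase word to an integer id starting at 1 (0 is kept for padding)."""
    word_index = {}
    for word in text.lower().split():
        if word not in word_index:
            word_index[word] = len(word_index) + 1
    return word_index


def make_prefix_sequences(line, word_index):
    """Return every prefix of the line that contains at least two tokens."""
    tokens = [word_index[w] for w in line.lower().split()]
    return [tokens[: i + 1] for i in range(1, len(tokens))]


def pre_pad(sequences, max_len, pad_value=0):
    """Left-pad each sequence with pad_value so that all have length max_len."""
    return [[pad_value] * (max_len - len(s)) + s for s in sequences]


text = "To be or not to be"
word_index = build_word_index(text)           # {"to": 1, "be": 2, "or": 3, "not": 4}
sequences = make_prefix_sequences(text, word_index)
max_len = max(len(s) for s in sequences)      # longest sequence length
padded = pre_pad(sequences, max_len)

for row in padded:
    print(row)
```

In each padded row, the last id is the word to predict and the ids before it are the context.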
### 3. Feature Engineering
- **Divided the dataset into X (input sequences) and y (target words).**
- **Performed a train-test split** to prepare the dataset for model training.
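
A minimal sketch of this step, assuming each padded sequence ends with its target word. The helper names (`split_features_target`, `one_hot`, `simple_train_test_split`), the 20% test fraction and the seed are illustrative assumptions, not values taken from the project. Targets are one-hot encoded here because categorical cross-entropy expects one-hot labels.

```python
import random


def split_features_target(padded_sequences):
    """Use all ids but the last as input (X) and the last id as target (y)."""
    X = [seq[:-1] for seq in padded_sequences]
    y = [seq[-1] for seq in padded_sequences]
    return X, y


def one_hot(index, num_classes):
    """Return a list of zeros with a single 1 at position index."""
    vec = [0] * num_classes
    vec[index] = 1
    return vec


def simple_train_test_split(X, y, test_fraction=0.2, seed=42):
    """Shuffle the sample indices and hold out a fraction of them for testing."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(len(indices) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


# Toy padded sequences (ids 1..4 are words, 0 is padding).
padded = [
    [0, 0, 0, 0, 1, 2],
    [0, 0, 0, 1, 2, 3],
    [0, 0, 1, 2, 3, 4],
    [0, 1, 2, 3, 4, 1],
    [1, 2, 3, 4, 1, 2],
]
X, y = split_features_target(padded)
y_one_hot = [one_hot(t, num_classes=5) for t in y]  # 4 words + padding id 0
X_train, X_test, y_train, y_test = simple_train_test_split(X, y_one_hot)
print(len(X_train), len(X_test))
```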
### 4. Model Development
- Built an **LSTM-based neural network with an Attention mechanism**.
- Model architecture:
  - **Embedding Layer** to convert words into vector representations.
  - **LSTM Layer** to capture sequential dependencies.
  - **Attention Layer** to focus on important words in the sequence.
  - **Dense Output Layer** with a softmax activation for predicting the next word.
- **Compiled the model using categorical cross-entropy loss** and the Adam optimizer.
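
To illustrate what "focusing on important words" means, here is a small sketch of attention pooling over per-word hidden states. The README does not say which attention variant the model uses, so the dot-product scoring with a single `query` vector is an assumption, and the numbers are toy values rather than real LSTM outputs.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_pool(hidden_states, query):
    """Score each time step against the query, then return the weighted sum."""
    weights = softmax([dot(query, h) for h in hidden_states])
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return context, weights


# One toy hidden state per input word (assumed values).
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
context, weights = attention_pool(hidden_states, query)
print(weights)   # the third time step receives the largest weight
print(context)
```

The resulting context vector is what a dense softmax layer would then map to a probability for every word in the vocabulary.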
### 5. Model Training
- Trained the model on the dataset.
- Achieved **70% accuracy** on test data.
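
For reference, accuracy here means the share of test samples where the most probable predicted word equals the true next word. The sketch below shows that computation on made-up probabilities; the numbers are not the project's results, and `top1_accuracy` is a hypothetical helper.

```python
def argmax(values):
    """Return the index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


def top1_accuracy(predicted_probs, true_ids):
    """Fraction of samples whose highest-probability class matches the label."""
    correct = sum(
        1 for probs, target in zip(predicted_probs, true_ids) if argmax(probs) == target
    )
    return correct / len(true_ids)


# Toy softmax outputs over a 3-word vocabulary (illustrative values only).
predicted_probs = [
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
    [0.2, 0.2, 0.6],
    [0.5, 0.4, 0.1],
]
true_ids = [1, 0, 2, 1]
accuracy = top1_accuracy(predicted_probs, true_ids)
print(accuracy)  # 3 of 4 predictions match
```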
### 6. Prediction Function
- Created a **helper function** for predicting the next word:
  - Takes input text.
  - Tokenizes and pads it to match the model's input shape.
  - Predicts the most likely next word.
- Successfully tested predictions on different inputs.
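
The helper described above might look roughly like the following sketch. The function signature, the `input_len` parameter and the `toy_predict_proba` stand-in for the trained model are all assumptions made for illustration; the actual helper in the repository may differ. Words missing from the vocabulary are simply skipped.

```python
def predict_next_word(text, word_index, predict_proba, input_len):
    """Tokenize the text, pre-pad it to input_len and return the most likely next word."""
    tokens = [word_index[w] for w in text.lower().split() if w in word_index]
    tokens = tokens[-input_len:]                      # keep only the most recent words
    padded = [0] * (input_len - len(tokens)) + tokens  # pre-padding with zeros
    probs = predict_proba(padded)
    best_id = max(range(len(probs)), key=lambda i: probs[i])
    index_word = {i: w for w, i in word_index.items()}
    return index_word.get(best_id)                    # None if the padding id wins


def toy_predict_proba(padded_ids):
    """Hypothetical stand-in for the trained model: always favors id 2 ("be")."""
    return [0.0, 0.1, 0.8, 0.05, 0.05]


word_index = {"to": 1, "be": 2, "or": 3, "not": 4}
next_word = predict_next_word("To be or not to", word_index, toy_predict_proba, input_len=5)
print(next_word)  # prints "be"
```

Passing the model in as a callable keeps the helper easy to test with a stub, as done here.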
### 7. Streamlit Web Application
- Built a **Streamlit-based UI** to interact with the model.
- Users can enter a sentence, and the model predicts the next word.
## How to Run the Project
1. Install dependencies:
```bash
pip install -r requirements.txt
```
2. Run the Streamlit app:
```bash
streamlit run app.py
```
3. Open the **localhost URL** to interact with the app.
## Technologies Used
- **Python**
- **TensorFlow & Keras** (Deep Learning)
- **Streamlit** (Web Interface)
- **NumPy, Pandas & Pickle** (Data Handling)
## Future Improvements
- Train on a **larger dataset** for improved generalization.
- Experiment with **transformer-based models (e.g., GPT, BERT)** for better predictions.
- Optimize the **attention mechanism** to enhance word predictions.