https://github.com/prince2004patel/next_word_prediction

This project implements a Next Word Prediction model using LSTM with an Attention mechanism.
keras next-word-prediction python streamlit tensorflow

Host: GitHub
URL: https://github.com/prince2004patel/next_word_prediction
Owner: prince2004patel
Created: 2025-02-18T07:52:42.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-02-18T08:10:16.000Z (over 1 year ago)
Last Synced: 2025-05-18T16:50:15.610Z (about 1 year ago)
Topics: keras, next-word-prediction, python, streamlit, tensorflow
Language: Jupyter Notebook
Homepage: https://next-word-prediction-by-prince.streamlit.app/
Size: 12.5 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# Next Word Prediction Using LSTM + Attention

## Overview
This project implements a **Next Word Prediction** model using an **LSTM (Long Short-Term Memory) network with an Attention mechanism**. The model is trained on the text of **Shakespeare's Hamlet** and reaches **70% accuracy** on held-out test data when predicting the next word in a sequence.

## Live Demo

[![Streamlit App](https://img.shields.io/badge/Streamlit-App-blue)](https://next-word-prediction-by-prince.streamlit.app/)

## Project Workflow

### 1. Data Collection
- Collected **Shakespeare's Hamlet** dataset as raw text.

### 2. Data Preprocessing
- **Tokenization:** Used `Tokenizer` from Keras to convert text into sequences.
- **Lowercasing:** Converted all text to lowercase for consistency.
- **Vocabulary Size Check:** Identified a total of **4,818 unique words**.
- **Input Sequence Creation:**
  - Generated input sequences by sliding a window of words over the text.
  - Example: the window "to be or not to" is followed by the window "be or not to <next word>", where `<next word>` is the word the model has to predict.
- **Finding Maximum Sequence Length:** Computed the length of the longest input sequence.
- **Padding Sequences:**
  - Used `pad_sequences()` to apply **pre-padding** (padding at the beginning) so that every input has the same length.
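
The preprocessing steps above can be sketched without any dependencies. This is only an illustration, not the project's code: the project uses Keras' `Tokenizer` and `pad_sequences()`, while the helper names below (`build_word_index`, `make_ngram_sequences`, `pre_pad`) are hypothetical. It also assumes that the window grows one word at a time within each line, which the README does not spell out, and it assigns ids in order of first appearance rather than by frequency.

```python
# Dependency-free sketch of tokenization, sequence creation, and pre-padding.
# Assumption: sequences are the growing prefixes of each line of text.

def build_word_index(text):
    """Map each unique lowercase word to an integer id, starting at 1 (0 is kept for padding)."""
    word_index = {}
    for word in text.lower().split():
        if word not in word_index:
            word_index[word] = len(word_index) + 1
    return word_index


def make_ngram_sequences(text, word_index):
    """For every line, emit each prefix of two or more words as one training sequence."""
    sequences = []
    for line in text.lower().split("\n"):
        ids = [word_index[w] for w in line.split()]
        for end in range(2, len(ids) + 1):
            sequences.append(ids[:end])
    return sequences


def pre_pad(sequences, max_len):
    """Left-pad every sequence with zeros so that all have length max_len."""
    return [[0] * (max_len - len(s)) + s for s in sequences]


sample_text = "To be or not to be\nThat is the question"  # toy stand-in for the Hamlet text
word_index = build_word_index(sample_text)
sequences = make_ngram_sequences(sample_text, word_index)
max_len = max(len(s) for s in sequences)  # longest input sequence
padded = pre_pad(sequences, max_len)
```

With the toy text, the first sequence is the two-word prefix "to be", which pre-padding turns into four zeros followed by the two word ids.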

### 3. Feature Engineering
- **Divided dataset into X (input sequences) and y (target words).**
- **Performed train-test split** to prepare the dataset for model training.
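
A minimal sketch of this step, assuming the last token of each padded sequence is the target word and everything before it is the input. The 20% test fraction, the fixed seed, and the helper names are assumptions for illustration; the README does not state the split ratio or the function used.

```python
import random


def split_inputs_targets(padded_sequences):
    """The last token of each sequence is the target; the tokens before it are the input."""
    X = [seq[:-1] for seq in padded_sequences]
    y = [seq[-1] for seq in padded_sequences]
    return X, y


def one_hot(index, num_classes):
    """One-hot encode a word id, the target format categorical cross-entropy expects."""
    vec = [0] * num_classes
    vec[index] = 1
    return vec


def train_test_split(X, y, test_fraction=0.2, seed=42):
    """Shuffle example indices with a fixed seed, then hold out the last test_fraction."""
    order = list(range(len(X)))
    random.Random(seed).shuffle(order)
    n_test = max(1, int(len(order) * test_fraction))
    train_idx, test_idx = order[:-n_test], order[-n_test:]

    def pick(data, idx):
        return [data[i] for i in idx]

    return pick(X, train_idx), pick(X, test_idx), pick(y, train_idx), pick(y, test_idx)


# Toy pre-padded sequences (0 is the padding id); word ids run from 1 to 4.
padded_sequences = [[0, 0, 1, 2], [0, 1, 2, 3], [1, 2, 3, 4], [0, 0, 2, 3], [0, 2, 3, 4]]
X, y = split_inputs_targets(padded_sequences)
y_one_hot = [one_hot(target, num_classes=5) for target in y]  # 4 words plus padding id 0
X_train, X_test, y_train, y_test = train_test_split(X, y_one_hot)
```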

### 4. Model Development
- Built an **LSTM-based neural network with an Attention mechanism**.
- Model architecture:
  - **Embedding Layer** to convert words into vector representations.
  - **LSTM Layer** to capture sequential dependencies.
  - **Attention Layer** to focus on important words in the sequence.
  - **Dense Output Layer** with a softmax activation for predicting the next word.
- **Compiled the model using categorical cross-entropy loss** and the Adam optimizer.
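
To make the role of the attention layer concrete, here is a small dependency-free sketch of dot-product attention over per-timestep LSTM outputs. The README does not say which attention variant the project uses, so the dot-product scoring, the choice of the final hidden state as the query, and the toy numbers are all assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_pool(hidden_states, query):
    """Weight each timestep by its similarity to the query and return the weighted sum."""
    scores = [dot(h, query) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return context, weights


# Hypothetical LSTM outputs for a 3-word input, with 2 units per timestep.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = hidden_states[-1]  # assumption: the final state queries all timesteps
context, weights = attention_pool(hidden_states, query)
```

The timestep most similar to the query receives the largest weight, so the context vector passed to the dense softmax layer leans toward the words the model treats as most relevant. In Keras, `tf.keras.layers.Attention` provides a dot-product attention layer along these lines.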

### 5. Model Training
- Trained the model on the dataset.
- Achieved **70% accuracy** on test data.

### 6. Prediction Function
- Created a **helper function** for predicting the next word:
  - Takes input text.
  - Tokenizes and pads it to match the model's input shape.
  - Predicts the most likely next word.
- Successfully tested predictions on different inputs.
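
The helper described above might look roughly like the sketch below. It is not the project's function: `predict_next_word`, its parameters, and the stand-in `fake_predict_proba` are hypothetical, and a plain callable replaces the trained Keras model so the example runs on its own. Words missing from the vocabulary are skipped, which mirrors how Keras' `Tokenizer` behaves when no `oov_token` is set.

```python
def predict_next_word(predict_proba, word_index, text, max_len):
    """Return the most likely next word, or None if the best id maps to no word."""
    ids = [word_index[w] for w in text.lower().split() if w in word_index]
    ids = ids[-(max_len - 1):]                      # keep only the most recent words
    padded = [0] * (max_len - 1 - len(ids)) + ids   # pre-pad to the model's input length
    probabilities = predict_proba(padded)
    best_id = max(range(len(probabilities)), key=probabilities.__getitem__)
    index_word = {idx: word for word, idx in word_index.items()}
    return index_word.get(best_id)


word_index = {"to": 1, "be": 2, "or": 3, "not": 4}


def fake_predict_proba(padded_ids):
    """Stand-in for the trained model: puts all probability on the id after the last input id."""
    probs = [0.0] * (len(word_index) + 1)
    probs[padded_ids[-1] % len(word_index) + 1] = 1.0
    return probs


print(predict_next_word(fake_predict_proba, word_index, "To be or", max_len=6))  # prints "not"
```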

### 7. Streamlit Web Application
- Built a **Streamlit-based UI** to interact with the model.
- Users can enter a sentence, and the model predicts the next word.

## How to Run the Project
1. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```
2. Run the Streamlit app:
   ```bash
   streamlit run app.py
   ```
3. Open the **localhost URL** printed in the terminal to interact with the app.

## Technologies Used
- **Python**
- **TensorFlow & Keras** (Deep Learning)
- **Streamlit** (Web Interface)
- **NumPy, Pandas & Pickle** (Data Handling)

## Future Improvements
- Train on a **larger dataset** for improved generalization.
- Experiment with **transformer-based models (e.g., GPT, BERT)** for better predictions.
- Optimize the **attention mechanism** to enhance word predictions.
