# LLM Text Generator on the Persian Wikipedia Dataset

Training a neural network model to predict and generate text sequences.

Repository: https://github.com/reyhaneh-saffar/llm-text-generator-on-the-persian-wikipedia-dataset
This project trains a neural network to predict and generate text sequences. Using a character-level representation of a Persian Wikipedia corpus and a sequential (GRU-based) model, it demonstrates that the network can learn patterns from the text and generate coherent continuations.

### Dataset Preparation
- **Preprocessing**:
  - The raw text corpus was cleaned by removing unnecessary line breaks, and a manageable sample was extracted for training.
  - All unique characters (224 in total) were identified and mapped to numerical indices, allowing the text to be converted into numerical form.
- **Dataset Structuring**:
  - The encoded text was batched to suit the neural network's input requirements, enabling efficient training on character-level sequences (see the sketch below).
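A minimal sketch of this preparation step, assuming a TensorFlow `tf.data` pipeline; the file name `corpus.txt`, the sample size, and the sequence length are illustrative placeholders, not values taken from the project:

```python
import tensorflow as tf

# Load the corpus, drop line breaks, and keep a manageable sample
# (file name and sample size are placeholders).
text = open("corpus.txt", encoding="utf-8").read().replace("\n", " ")[:500_000]

# Map each unique character to an integer index (the project reports 224).
vocab = sorted(set(text))
char2idx = {ch: i for i, ch in enumerate(vocab)}
idx2char = list(vocab)

encoded = [char2idx[ch] for ch in text]

# Slice the encoded text into fixed-length chunks, shift each chunk by one
# character to form (input, target) pairs, and batch for training.
seq_length = 100
chunks = tf.data.Dataset.from_tensor_slices(encoded).batch(
    seq_length + 1, drop_remainder=True)

def split_input_target(chunk):
    return chunk[:-1], chunk[1:]

dataset = (chunks.map(split_input_target)
                 .shuffle(10_000)
                 .batch(64, drop_remainder=True))
```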

### Neural Network Model
- **Architecture**: the model consists of three layers:
  - **Embedding Layer**: converts character indices into dense vector representations.
  - **GRU Layer**: captures sequential dependencies in the character sequences.
  - **Dense Layer**: produces output predictions (logits) for the next character in the sequence.
- **Training**:
  - The model was trained over multiple epochs, optimizing a loss function suited to categorical targets.
  - A callback saved the model's weights at the end of each epoch, so progress was preserved.
  - Loss values decreased steadily during training, reflecting the model's increasing accuracy at predicting the target sequences (a sketch of the setup follows this list).
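A sketch of this architecture and training setup, assuming `tf.keras` and reusing the `dataset` from the preparation sketch above; the layer sizes, optimizer, and epoch count are assumptions rather than the project's actual values:

```python
import tensorflow as tf

vocab_size = 224      # unique characters reported above
embedding_dim = 256   # assumed
rnn_units = 1024      # assumed

def build_model():
    return tf.keras.Sequential([
        # Dense vector representation for each character index.
        tf.keras.layers.Embedding(vocab_size, embedding_dim),
        # GRU captures sequential dependencies across the sequence.
        tf.keras.layers.GRU(rnn_units, return_sequences=True),
        # Logits over the vocabulary for the next character at each position.
        tf.keras.layers.Dense(vocab_size),
    ])

model = build_model()

# Sparse categorical cross-entropy suits integer-encoded character targets.
model.compile(optimizer="adam",
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))

# Save the weights after every epoch so training progress is preserved.
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    filepath="checkpoints/ckpt_{epoch}.weights.h5", save_weights_only=True)

model.fit(dataset, epochs=10, callbacks=[checkpoint])
```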

### Text Generation
- **Implementation**:
  - A new model with the same architecture was created and loaded with the saved weights, enabling text generation from the learned patterns.
  - A generation function was implemented to:
    - take a starting text input, and
    - generate a specified number of characters by predicting one character at a time (see the sketch below).
- **Output**:
  - The generated sequences are plausible continuations of the given input, demonstrating that the model has internalized patterns in the dataset.
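A sketch of the generation loop, reusing `build_model`, `char2idx`, and `idx2char` from the sketches above; the checkpoint path, sampling temperature, and seed string are assumptions:

```python
import tensorflow as tf

# Rebuild the architecture and load the saved weights.
model = build_model()                    # same architecture as training
model.build(input_shape=(None, None))    # create variables before loading
model.load_weights("checkpoints/ckpt_10.weights.h5")

def generate_text(start_string, num_generate=300, temperature=1.0):
    # Encode the seed (its characters must appear in the training vocabulary),
    # then repeatedly append the sampled next character.
    input_ids = [char2idx[ch] for ch in start_string]
    generated = []
    for _ in range(num_generate):
        logits = model(tf.expand_dims(input_ids, 0))  # (1, T, vocab_size)
        logits = logits[0, -1] / temperature          # logits for next char
        next_id = int(tf.random.categorical(logits[None, :], num_samples=1))
        input_ids.append(next_id)
        generated.append(idx2char[next_id])
    return start_string + "".join(generated)

print(generate_text("ایران ", num_generate=200))
```

Feeding the full running sequence through the model at each step keeps the sketch simple; a stateful GRU that carries its hidden state between steps would avoid the repeated forward passes.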

## Results and Insights
- **Training Performance**:
  - Loss values decreased consistently over training epochs, indicating successful learning of text patterns.
- **Text Generation**:
  - The model generated coherent and contextually relevant sequences, showcasing its practical application for text prediction and generation.