Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/reyhaneh-saffar/llm-text-generator-on-the-persian-wikipedia-dataset
Training a neural network model to predict and generate text sequences
- Host: GitHub
- URL: https://github.com/reyhaneh-saffar/llm-text-generator-on-the-persian-wikipedia-dataset
- Owner: reyhaneh-saffar
- Created: 2025-01-11T00:23:56.000Z (20 days ago)
- Default Branch: main
- Last Pushed: 2025-01-11T00:27:59.000Z (20 days ago)
- Last Synced: 2025-01-11T01:26:00.478Z (20 days ago)
- Language: Jupyter Notebook
- Size: 0 Bytes
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
This project trains a neural network model to predict and generate text sequences from a Persian Wikipedia text dataset. Using a character-level representation and sequential modeling, the model learns patterns in the text and produces coherent continuations of a given prompt.
### Dataset Preparation
- **Preprocessing**:
- A large text dataset was processed by removing unnecessary line breaks and extracting a manageable sample.
- Unique characters (224 in total) were identified and mapped to numerical indices, allowing the text to be converted into a numerical representation.
- **Dataset Structuring**:
- The dataset was batched to suit the neural network's requirements, enabling efficient training on character-level sequences (a minimal code sketch follows this list).
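
The preprocessing steps above correspond to a short, standard data pipeline. The sketch below assumes TensorFlow/Keras (the repository is a Jupyter notebook and the README does not name the framework); the file name, sequence length, and batch size are illustrative assumptions, not values taken from the project.

```python
import numpy as np
import tensorflow as tf

# Assumed file name; the notebook's actual data-loading step is not shown in the README.
text = open("persian_wikipedia_sample.txt", encoding="utf-8").read()
text = " ".join(text.split())          # drop unnecessary line breaks, as described above

# Map each unique character to an integer index (the README reports 224 unique characters).
vocab = sorted(set(text))
char2idx = {ch: i for i, ch in enumerate(vocab)}
idx2char = np.array(vocab)
text_as_int = np.array([char2idx[ch] for ch in text])

# Slice the integer stream into fixed-length examples and batch them for training.
seq_length = 100                       # assumed sequence length
batch_size = 64                        # assumed batch size
char_ds = tf.data.Dataset.from_tensor_slices(text_as_int)
sequences = char_ds.batch(seq_length + 1, drop_remainder=True)

def split_input_target(chunk):
    # Input is the sequence; the target is the same sequence shifted by one character.
    return chunk[:-1], chunk[1:]

dataset = (sequences.map(split_input_target)
           .shuffle(10_000)
           .batch(batch_size, drop_remainder=True))
```
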
### Neural Network Model
- **Architecture**:
- The model consisted of:
- **Embedding Layer**: Converts characters into dense vector representations.
- **GRU Layer**: Captures sequential dependencies in character sequences.
- **Dense Layer**: Generates output predictions for the next character in the sequence.
- **Training**:
- The model was trained over multiple epochs, optimizing a loss function suited for categorical data.
- A callback saved the model's weights at each epoch, ensuring progress was preserved.
- Training resulted in a steady reduction of loss values, reflecting the model's increasing accuracy in predicting target sequences (a training sketch follows this list).
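
Continuing the sketch above (it reuses `vocab` and `dataset`), the following shows an Embedding → GRU → Dense model with a per-epoch weight-saving callback, as described in the lists above. The embedding size, number of GRU units, optimizer, epoch count, and the specific loss (sparse categorical cross-entropy) are assumptions; the README only states that the loss suits categorical data.

```python
vocab_size = len(vocab)                # 224 unique characters, per the preprocessing above
embedding_dim = 256                    # assumed
rnn_units = 1024                       # assumed

def build_model():
    # Embedding -> GRU -> Dense, mirroring the architecture list above.
    return tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, embedding_dim),
        tf.keras.layers.GRU(rnn_units, return_sequences=True),
        tf.keras.layers.Dense(vocab_size),   # logits over the next character
    ])

model = build_model()

# A loss for integer-encoded character targets; the README only says "suited for categorical data".
loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(optimizer="adam", loss=loss)

# Save the model's weights after every epoch, as described above.
checkpoint_cb = tf.keras.callbacks.ModelCheckpoint(
    filepath="ckpt_{epoch}.weights.h5",
    save_weights_only=True)

history = model.fit(dataset, epochs=10, callbacks=[checkpoint_cb])   # epoch count assumed
```
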
### Text Generation
- **Implementation**:
- A new model was created with the same architecture and loaded with the saved weights, enabling text generation based on learned patterns.
- A function was implemented to:
- Take a starting text input.
- Generate a specified number of characters by predicting one character at a time.
- **Output**:
- Generated text sequences showcased the model's ability to produce plausible continuations of a given input, demonstrating that it learned meaningful patterns from the dataset (the generation loop is sketched below).
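
A sketch of the character-by-character generation loop described above, under the same assumptions as the earlier snippets (it reuses `char2idx`, `idx2char`, and `build_model`). The checkpoint file name and the seed string are illustrative; feeding the whole growing sequence back into the model each step is a simple choice, not necessarily the project's exact implementation.

```python
def generate_text(model, start_string, num_generate=300, temperature=1.0):
    # Encode the seed text as character indices, with a leading batch dimension.
    # The seed must only contain characters that appeared in the training text.
    input_ids = tf.expand_dims([char2idx[ch] for ch in start_string], 0)
    generated = []

    for _ in range(num_generate):
        logits = model(input_ids)                       # (1, seq_len, vocab_size)
        logits = logits[:, -1, :] / temperature         # distribution over the next character
        next_id = int(tf.random.categorical(logits, num_samples=1)[0, 0])
        generated.append(idx2char[next_id])
        # Feed the extended sequence back in, one predicted character at a time.
        input_ids = tf.concat([input_ids, [[next_id]]], axis=-1)

    return start_string + "".join(generated)

# Rebuild the same architecture, load the saved weights, and generate from a seed string.
gen_model = build_model()
gen_model(tf.zeros([1, 1], dtype=tf.int32))             # run once so the weights are created
gen_model.load_weights("ckpt_10.weights.h5")            # assumed name of the final checkpoint
print(generate_text(gen_model, start_string="ویکی‌پدیا"))
```
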
## Results and Insights
- **Training Performance**:
- Loss values decreased consistently over training epochs, indicating successful learning of text patterns.
- **Text Generation**:
- The model generated coherent and contextually relevant sequences, showcasing its practical application for text prediction and generation.