https://github.com/uziellujan/dl-textgen-textclass

Deep learning project on practical implementation of text generation and text classification pipelines with PyTorch and Hugging Face using RNNs, LSTMs, GRUs, and Transformers.

Host: GitHub
URL: https://github.com/uziellujan/dl-textgen-textclass
Owner: UzielLujan
Created: 2025-09-27T16:27:04.000Z (9 months ago)
Default Branch: main
Last Pushed: 2025-10-05T00:35:14.000Z (9 months ago)
Last Synced: 2025-10-05T02:41:32.051Z (9 months ago)
Topics: deep-learning, gru, huggingface, language-models, lstm, nlp-machine-learning, pytorch, rnn, text-classification, text-generation, transformers
Language: Python
Homepage:
Size: 289 KB
Stars: 3
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# DL-TextGen-TextClass

Deep learning project on practical implementation of text generation and text classification pipelines with PyTorch and Hugging Face using RNNs, LSTMs, GRUs, and Transformers.

This repository contains **Assignment 2** (Tarea 2) of the master's-level course *Deep Learning para Procesamiento de Texto e Imágenes* (Deep Learning for Text and Image Processing).

## 📂 Project structure

```bash
DL-TextGen-TextClass/
├── PartA/                  # Song lyrics generation
│   ├── data/
│   ├── src/
│   ├── models/
│   ├── results/
│   ├── logs/
│   ├── jobs/
│   └── README_A.md
│
├── PartB/                  # Tourism review classification
│   ├── data/
│   ├── src/
│   ├── models/
│   ├── results/
│   ├── logs/
│   ├── jobs/
│   └── README_B.md (pending)
│
├── requirements.txt
├── environment.yml
└── README.md
```

- `PartA/README_A.md`: instructions specific to text generation.
- `PartB/README_B.md`: instructions specific to classification.

## How to run
The project is set up to run both locally and on the **Lab-SB (CIMAT)** cluster.
Example (Part A, character-level RNN):
```bash
cd PartA
sbatch jobs/run_char.sh char_rnn_1
```
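
To make "character-level" concrete, the following is a minimal sketch of how text is typically prepared for a character-level language model: each character becomes an integer id, and each training target is the input sequence shifted by one character. It is not taken from the code in `PartA/src`; the function names, the toy string, and `seq_len` are all hypothetical.

```python
# Hypothetical sketch: not the repository's actual preprocessing code.

def build_vocab(text):
    """Map each distinct character to an integer id (sorted for determinism)."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos


def make_sequences(text, stoi, seq_len):
    """Slice text into (input, target) id pairs; the target is the input shifted by one."""
    ids = [stoi[ch] for ch in text]
    pairs = []
    for start in range(0, len(ids) - seq_len):
        x = ids[start:start + seq_len]
        y = ids[start + 1:start + seq_len + 1]
        pairs.append((x, y))
    return pairs


lyrics = "la la land"  # toy stand-in for a lyrics corpus
stoi, itos = build_vocab(lyrics)
pairs = make_sequences(lyrics, stoi, seq_len=4)

# Show the first training pair as text: input window and its shifted target.
x, y = pairs[0]
print("".join(itos[i] for i in x), "->", "".join(itos[i] for i in y))
```

A recurrent model (RNN, LSTM, or GRU) trained on such pairs learns to predict the next character at every position of the input window.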

## 📦 Dependencies

- Python 3.11
- PyTorch + CUDA
- Transformers
- Datasets
- Accelerate
- Scikit-learn
- Matplotlib, Pandas, Numpy

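Install with (the project tree lists `environment.yml` at the repository root, so adjust the path if the file is not under `jobs/`):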
```bash
conda env create -f jobs/environment.yml
```
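
After creating the environment, a quick check that the listed packages are importable can save a failed cluster job. The sketch below uses only the standard library (`importlib.metadata`) and reports a missing package instead of raising. The distribution names in `REQUIRED` are assumptions: they are the usual PyPI names for the dependencies listed above, not names read from this project's `environment.yml`.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed PyPI distribution names for the dependencies listed in this README.
REQUIRED = ["torch", "transformers", "datasets", "accelerate",
            "scikit-learn", "matplotlib", "pandas", "numpy"]


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


report = installed_versions(REQUIRED)
for name, ver in report.items():
    print(f"{name:<14} {ver or 'MISSING'}")
```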

## Expected results

- Perplexity (PPL) for generation.
- Accuracy and F1-score for classification.
- Plots of training curves, confusion matrices, and examples of generated lyrics.
- Comparison tables of architectures and compute times.
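
The metrics above have short definitions, sketched here in plain Python for reference. Perplexity is the exponential of the mean per-token negative log-likelihood (in nats), and F1 is computed per class from true positives, false positives, and false negatives. Macro averaging is an assumption, since the averaging scheme is not stated here, and the toy inputs are invented for illustration; they are not results from this project.

```python
import math


def perplexity(token_nlls):
    """Perplexity = exp(mean negative log-likelihood per token, in nats)."""
    return math.exp(sum(token_nlls) / len(token_nlls))


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (macro averaging is an assumption here)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy values, invented for illustration only.
nlls = [math.log(4)] * 5   # a model assigning probability 1/4 to every token
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

print(perplexity(nlls))
print(accuracy(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

In practice, scikit-learn (listed in the dependencies) provides the same classification metrics through `sklearn.metrics.accuracy_score` and `sklearn.metrics.f1_score(y_true, y_pred, average="macro")`.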

## Author
Uziel Isaí Lujan López — M.Sc. in Statistical Computing at CIMAT

'uziel.lujan@cimat.mx'

[LinkedIn](https://www.linkedin.com/in/uziel-lujan/) | [GitHub](https://github.com/UzielLujan)
