# 🌌 LLM Alignment Template - Your Template for Aligning Language Models

[![Build Status](https://github.com/astorfi/LLM-Alignment-Project-Template/actions/workflows/ci.yml/badge.svg)](https://github.com/astorfi/LLM-Alignment-Project-Template/actions)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Contributions Welcome](https://img.shields.io/badge/contributions-welcome-brightgreen)](https://github.com/astorfi/LLM-Alignment-Project-Template/issues)
[![Python Version](https://img.shields.io/badge/python-3.8%2B-blue)](https://www.python.org/)

## 📌 Introduction

*Figure 1: see [arXiv:2308.05374](https://arxiv.org/abs/2308.05374).*

**LLM Alignment Template** is not just a comprehensive tool for aligning large language models (LLMs), but also serves as a **powerful template** for building your own LLM alignment application. Inspired by project templates like **PyTorch Project Template**, this repository is designed to provide a full stack of functionality, acting as a starting point to customize and extend for your own LLM alignment needs. Whether you are a researcher, developer, or data scientist, this template provides a solid foundation for efficiently creating and deploying LLMs tailored to align with human values and objectives.

## 🚀 Overview

**LLM Alignment Template** provides a full stack of functionality, including training, fine-tuning, deploying, and monitoring LLMs using Reinforcement Learning from Human Feedback (RLHF). This project also integrates evaluation metrics to ensure ethical and effective use of language models. The interface offers a user-friendly experience for managing alignment, visualizing training metrics, and deploying at scale.

## ✨ Features

- **๐ŸŒ Interactive Web Interface**: A user-friendly interface for interacting with the LLM, training models, and viewing alignment metrics.
- **๐Ÿง  Training with RLHF**: Reinforcement Learning from Human Feedback to ensure model alignment with human preferences.
- **๐Ÿ› ๏ธ Data Augmentation & Preprocessing**: Advanced preprocessing, tokenization, and data augmentation with back-translation and paraphrasing.
- **๐Ÿ”„ Transfer Learning**: Utilize pre-trained models like BERT for improved performance on specific tasks.
- **๐Ÿ“ฆ Scalable Deployment**: Docker and Kubernetes-based deployment with Horizontal Pod Autoscaling (HPA).
- **๐Ÿ” Model Explainability**: SHAP-based dashboards for understanding model decisions.
- **๐Ÿ“Š User Feedback Loop**: Collection of user ratings for fine-tuning models continuously.

## 📂 Table of Contents

- [Introduction](#-introduction)
- [Overview](#-overview)
- [Features](#-features)
- [Project Structure](#-project-structure)
- [Setup](#️-setup)
- [Prerequisites](#prerequisites)
- [Installation](#installation)
- [Running Locally](#running-locally)
- [Deployment](#-deployment)
- [Kubernetes Deployment](#kubernetes-deployment)
- [Canary Deployment](#canary-deployment)
- [Monitoring and Logging](#monitoring-and-logging)
- [Training and Evaluation](#-training-and-evaluation)
- [Testing](#-testing)
- [Future Work](#-future-work)
- [Contributing](#-contributing)
- [License](#-license)
- [Contact](#-contact)

## 📂 Project Structure

- **`app/`**: Contains API and UI code.
  - `auth.py`, `feedback.py`, `ui.py`: API endpoints for user interaction, feedback collection, and general interface management.
  - **Static Files**: JavaScript (`app.js`, `chart.js`), CSS (`styles.css`), and Swagger API documentation (`swagger.json`).
  - **Templates**: HTML templates (`chat.html`, `feedback.html`, `index.html`) for UI rendering.

- **`src/`**: Core logic and utilities for preprocessing and training.
  - **Preprocessing** (`preprocessing/`):
    - `preprocess_data.py`: Combines original and augmented datasets and applies text cleaning.
    - `tokenization.py`: Handles tokenization (see the sketch at the end of this section).
  - **Training** (`training/`):
    - `fine_tuning.py`, `transfer_learning.py`, `retrain_model.py`: Scripts for training and retraining models.
    - `rlhf.py`, `reward_model.py`: Scripts for reward model training using RLHF.
  - **Utilities** (`utils/`): Common utilities (`config.py`, `logging.py`, `validation.py`).

- **`dashboards/`**: Performance and explainability dashboards for monitoring and model insights.
  - `performance_dashboard.py`: Displays training metrics, validation loss, and accuracy.
  - `explainability_dashboard.py`: Visualizes SHAP values to provide insight into model decisions.

- **`tests/`**: Unit, integration, and end-to-end tests.
  - `test_api.py`, `test_preprocessing.py`, `test_training.py`: Unit and integration tests.
  - **End-to-End Tests** (`e2e/`): Cypress-based UI tests (`ui_tests.spec.js`).
  - **Load Testing** (`load_testing/`): Uses Locust (`locustfile.py`) for load testing.

- **`deployment/`**: Configuration files for deployment and monitoring.
  - **Kubernetes Configurations** (`kubernetes/`): Deployment and Ingress configurations for scaling and canary releases.
  - **Monitoring** (`monitoring/`): Prometheus (`prometheus.yml`) and Grafana (`grafana_dashboard.json`) for performance and system health monitoring.
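
The preprocessing utilities above follow a simple clean-then-tokenize flow. Below is a minimal sketch of that flow, assuming a Hugging Face tokenizer; the cleaning rules and the `bert-base-uncased` checkpoint are illustrative assumptions, not the exact implementation in `src/preprocessing/`.

```python
# A minimal sketch of the preprocessing flow: basic text cleaning followed by
# tokenization. The real logic lives in src/preprocessing/preprocess_data.py
# and tokenization.py; the cleaning rules and tokenizer checkpoint here are
# illustrative assumptions.
import re
from transformers import AutoTokenizer

def clean_text(text: str) -> str:
    """Lowercase, strip URLs, and collapse repeated whitespace."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)
    return re.sub(r"\s+", " ", text).strip()

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

samples = ["Visit https://example.com   for the FULL alignment guide."]
cleaned = [clean_text(s) for s in samples]
encoded = tokenizer(cleaned, padding=True, truncation=True, max_length=128)

print(cleaned[0])                    # cleaned text
print(encoded["input_ids"][0][:10])  # first token ids of the encoded sample
```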

## โš™๏ธ Setup

### Prerequisites

- ๐Ÿ Python 3.8+
- ๐Ÿณ Docker & Docker Compose
- โ˜ธ๏ธ Kubernetes (Minikube or a cloud provider)
- ๐ŸŸข Node.js (for front-end dependencies)

### 📦 Installation

1. **Clone the Repository**:

   ```bash
   git clone https://github.com/yourusername/LLM-Alignment-Template.git
   cd LLM-Alignment-Template
   ```

2. **Install Dependencies**:
   - Python dependencies:

     ```bash
     pip install -r requirements.txt
     ```

   - Node.js dependencies (optional, for UI improvements):

     ```bash
     cd app/static
     npm install
     ```

### ๐Ÿƒ Running Locally

1. **Build Docker Images**:

   ```bash
   docker-compose up --build
   ```

2. **Access the Application**:
   - Open a browser and visit `http://localhost:5000`.

## 🚢 Deployment

### โ˜ธ๏ธ Kubernetes Deployment

- **Deploy to Kubernetes**:
  - Apply the deployment and service configurations:

    ```bash
    kubectl apply -f deployment/kubernetes/deployment.yml
    kubectl apply -f deployment/kubernetes/service.yml
    ```

- **Horizontal Pod Autoscaler**:

  ```bash
  kubectl apply -f deployment/kubernetes/hpa.yml
  ```

### 🌟 Canary Deployment

- Canary deployments are configured using `deployment/kubernetes/canary_deployment.yml` to roll out new versions safely.

### 📈 Monitoring and Logging

- **Prometheus and Grafana**:
  - Apply the Prometheus and Grafana configurations in `deployment/monitoring/` to enable monitoring dashboards.
- **📋 Centralized Logging**: The **ELK Stack** is configured with Docker using `docker-compose.logging.yml` for centralized logs.

## 🧠 Training and Evaluation

### 🔄 Transfer Learning

The training module (`src/training/transfer_learning.py`) uses pre-trained models like **BERT** to adapt to custom tasks, providing a significant performance boost.
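
A minimal sketch of this kind of transfer learning is shown below, assuming the Hugging Face `transformers` and `torch` libraries; the toy texts, labels, and hyperparameters are illustrative assumptions rather than the contents of `transfer_learning.py`.

```python
# A minimal sketch of transfer learning with a pre-trained BERT encoder.
# The actual logic lives in src/training/transfer_learning.py; the dataset,
# label set, and hyperparameters below are illustrative assumptions.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # hypothetical binary "aligned" label
)

texts = ["The assistant refused the harmful request.", "The assistant complied blindly."]
labels = torch.tensor([1, 0])  # toy labels: 1 = aligned, 0 = misaligned

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
optimizer.zero_grad()
outputs = model(**batch, labels=labels)  # cross-entropy loss computed internally
outputs.loss.backward()
optimizer.step()
print(f"step loss: {outputs.loss.item():.4f}")
```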

### 📊 Data Augmentation

The `data_augmentation.py` script (`src/data/`) applies augmentation techniques like back-translation and paraphrasing to improve data quality.
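
The sketch below illustrates back-translation with MarianMT models, assuming the `transformers` library; the English-French checkpoints and the `back_translate` helper are assumptions for illustration, not necessarily what `data_augmentation.py` uses.

```python
# A minimal back-translation sketch (English -> French -> English) using MarianMT.
# The project's own implementation is in src/data/data_augmentation.py; the model
# checkpoints and helper below are assumptions for illustration.
from transformers import MarianMTModel, MarianTokenizer

def translate(texts, model_name):
    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(**batch)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)

def back_translate(texts):
    """Round-trip texts through French to create paraphrased variants."""
    french = translate(texts, "Helsinki-NLP/opus-mt-en-fr")
    return translate(french, "Helsinki-NLP/opus-mt-fr-en")

print(back_translate(["The model should follow human instructions safely."]))
```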

### 🧠 Reinforcement Learning from Human Feedback (RLHF)

- **Reward Model Training**: Uses the `rlhf.py` and `reward_model.py` scripts to fine-tune models based on human feedback (a minimal sketch of the pairwise objective follows below).
- **Feedback Collection**: Users rate responses via the feedback form (`feedback.html`), and the model retrains with `retrain_model.py`.
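
The core of reward-model training is a pairwise preference objective: the reward assigned to the human-preferred response should exceed the reward assigned to the rejected one. The sketch below shows that objective in isolation, assuming PyTorch; the `RewardHead` module and the random stand-in embeddings are illustrative assumptions, not the code in `reward_model.py`.

```python
# A minimal sketch of the pairwise reward-model objective used in RLHF.
# reward_model.py / rlhf.py implement the full pipeline; the toy features and
# network below are illustrative assumptions, not the project's actual code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardHead(nn.Module):
    """Maps a response embedding to a scalar reward."""
    def __init__(self, dim=768):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, x):
        return self.net(x).squeeze(-1)

reward = RewardHead()
optimizer = torch.optim.AdamW(reward.parameters(), lr=1e-4)

# Stand-ins for encoder embeddings of (chosen, rejected) response pairs.
chosen_emb = torch.randn(8, 768)
rejected_emb = torch.randn(8, 768)

# Bradley-Terry style loss: -log sigmoid(r_chosen - r_rejected).
optimizer.zero_grad()
loss = -F.logsigmoid(reward(chosen_emb) - reward(rejected_emb)).mean()
loss.backward()
optimizer.step()
print(f"pairwise preference loss: {loss.item():.4f}")
```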

### 🔍 Explainability Dashboard

The `explainability_dashboard.py` script uses **SHAP** values to help users understand why a model made specific predictions.
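
A minimal sketch of this kind of SHAP explanation is shown below, assuming the `shap` and `transformers` libraries; the off-the-shelf sentiment pipeline stands in for the project's own model, and the exact wiring in `explainability_dashboard.py` may differ.

```python
# A minimal sketch of SHAP-based explanation for a text classifier, along the
# lines of what explainability_dashboard.py visualizes. The sentiment pipeline
# is a stand-in assumption; swap in the project's aligned model.
import shap
from transformers import pipeline

classifier = pipeline("text-classification", top_k=None)  # return scores for all labels

explainer = shap.Explainer(classifier)  # SHAP wraps text pipelines with a text masker
shap_values = explainer(["The assistant gave a helpful and harmless answer."])

shap.plots.text(shap_values)  # per-token contributions toward each predicted label
```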

## 🧪 Testing

- **✅ Unit Tests**: Located in `tests/`, covering API, preprocessing, and training functionality.
- **🖥️ End-to-End Tests**: Uses **Cypress** to test UI interactions.
- **📊 Load Testing**: Implemented with **Locust** (`tests/load_testing/locustfile.py`) to ensure stability under load; a minimal example follows below.
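
A minimal Locust user class in the spirit of that load test might look like the following; the `/chat` endpoint and JSON payload are assumptions for illustration.

```python
# A minimal Locust sketch in the spirit of tests/load_testing/locustfile.py.
# The /chat endpoint and payload shape are assumptions, not the app's real API.
from locust import HttpUser, between, task

class ChatUser(HttpUser):
    """Simulated user sending prompts to the running app."""
    wait_time = between(1, 3)  # wait 1-3 seconds between tasks

    @task
    def send_prompt(self):
        # Hypothetical chat endpoint; adjust to match the app's actual routes.
        self.client.post("/chat", json={"prompt": "How do I align an LLM?"})
```

With the app running locally, a command such as `locust -f locustfile.py --host http://localhost:5000` would drive simulated traffic against it.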

## 🔮 Future Work

- **🔑 User Roles and Permissions**: Add a role-based access control system.
- **📉 Advanced Monitoring**: Enhance Prometheus alerting for anomaly detection.
- **🚀 Public Demo Deployment**: Deploy a public version on Heroku or AWS for showcasing.

## ๐Ÿค Contributing

Contributions are welcome! Please submit pull requests or issues for improvements or new features.

## 📜 License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for more information.

## 📬 Contact

- **📧 Email**: [amirsina.torfi@gmail.com](mailto:amirsina.torfi@gmail.com)
- **🌐 Website**: [Author Website](https://astorfi.github.io)

## Main Collaborators

| [Amirsina Torfi](https://github.com/astorfi) | [Hossein Rajoli](https://github.com/HRajoliN) |
| --- | --- |