[![Multi-Modality](agorabanner.png)](https://discord.com/servers/agora-999382051935506503)

# Multi-Model Trainers
Over the past year, Agora has implemented thousands of models, each with slightly different variations and architectures. This raised a question: how can we train multiple models at once, evaluate them as they train, and give more GPU memory to the models that are learning fastest with the lowest loss? We'll attempt many configurations in the future, but for now this is super experimental! If you want to hack on this, join us at Agora and let's accelerate!
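
The snippet below is a minimal, hypothetical sketch of that core idea: given per-model losses, shift more of a fixed memory budget toward the models that are currently doing best (lowest loss). It is an illustration only, not this package's internal implementation; the function name and the inverse-loss weighting are assumptions.

```python
# Hypothetical illustration of loss-proportional memory allocation.
# This is NOT the package's internal implementation.
def allocate_memory_by_loss(losses, total_memory_bytes, eps=1e-5):
    """Give each model a share of the memory budget that is inversely
    proportional to its current loss (lower loss -> larger share)."""
    weights = [1.0 / (loss + eps) for loss in losses]
    total = sum(weights)
    return [int(total_memory_bytes * w / total) for w in weights]

# Example: three models with losses 0.1, 0.4, 0.9 and a 4 GB budget.
print(allocate_memory_by_loss([0.1, 0.4, 0.9], 4 * 1024**3))
```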

## Install
```bash
$ pip install multi-model-trainers
```

## Usage

```python
import torch
import torch.nn as nn
from loguru import logger

from multi_model_trainers.main import MultiModelMemoryTrainer

# Example usage
if __name__ == "__main__":
    # Create some dummy models
    models = [
        nn.Sequential(nn.Linear(10, 50), nn.ReLU(), nn.Linear(50, 1))
        for _ in range(3)
    ]
    initial_allocation = [1 / 3, 1 / 3, 1 / 3]
    total_memory = 4 * 1024 * 1024 * 1024  # 4 GB

    gpu_allocator = MultiModelMemoryTrainer(
        models, initial_allocation, total_memory
    )

    # Simulate a few training steps
    for step in range(5):
        logger.info(f"Training step {step}")

        # Generate dummy data
        train_data = {
            "inputs": torch.rand(32, 10),
            "targets": torch.rand(32, 1),
        }

        losses = gpu_allocator.train_step(train_data)

        # Update learning rates based on losses (this is a simplistic approach)
        learning_rates = [1 / (loss + 1e-5) for loss in losses]
        gpu_allocator.update_learning_rates(learning_rates)

        # Reallocate GPU memory
        gpu_allocator.reallocate_gpu_memory()

    # Validation step
    val_data = {
        "inputs": torch.rand(64, 10),
        "targets": torch.rand(64, 1),
    }
    val_losses = gpu_allocator.validate(val_data)

    logger.info("Training complete")
```
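
The inverse-loss learning-rate update in the example is only one possible heuristic. A hypothetical variant (not part of this package) that normalizes the rates so they always sum to a fixed budget might look like this:

```python
# Hypothetical variant of the example's learning-rate heuristic: normalize the
# inverse losses so the resulting rates sum to a fixed budget.
def normalized_learning_rates(losses, budget=1e-3, eps=1e-5):
    inverse = [1 / (loss + eps) for loss in losses]
    total = sum(inverse)
    return [budget * w / total for w in inverse]

# Models with lower loss receive a larger share of the budget.
print(normalized_learning_rates([0.2, 0.5, 1.0]))
```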

## License
MIT