https://github.com/teilomillet/operatio
https://github.com/teilomillet/operatio
llm merging mojo task-arithmetic
Last synced: 5 months ago
JSON representation
- Host: GitHub
- URL: https://github.com/teilomillet/operatio
- Owner: teilomillet
- License: apache-2.0
- Created: 2024-10-15T07:44:22.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-10-16T13:58:55.000Z (over 1 year ago)
- Last Synced: 2024-12-30T20:51:54.519Z (over 1 year ago)
- Topics: llm, merging, mojo, task-arithmetic
- Language: Mojo
- Homepage:
- Size: 2.45 MB
- Stars: 3
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Operatio

Operatio is a tool for manipulating language models using task vectors. It's designed to help me learned about the underlying of the task arithmetic and about the mojo lang, it's also a tool to explore large language models through weight manipulation. I might try to implement the papers in the pdf/.
## What are Task Vectors?
Task vectors are the difference between the weights of a base model and a fine-tuned model. They capture the "knowledge" or "skills" that a model has acquired for a specific task during fine-tuning. By manipulating these task vectors, we can potentially transfer (or suppress) skills between models or create models with combined capabilities.
## What Operatio Does
Operatio provides a suite of operations for working with language models and task vectors:
1. **Load Models**: Efficiently load and save model weights.
2. **Extract Difference**: Compute the task vector by finding the difference between a base model and a fine-tuned model.
3. **Transform Model**: Apply a task vector to a base model, potentially transferring skills or knowledge.
4. **Full Pipeline**: Perform all of the above operations in sequence.
## Installation
I used `magic` from modular so I recommand using it too. After having clone this github, simply use `magic install` then `magic shell` and you can use the following commands.
## Usage
Operatio has a command-line tool with several operations:
```bash
mojo run main.mojo [] [--output_dir ] [--scaling_coef ]
```
## Operations:
- load: Load and save model weights
- extract: Compute the task vector
- transform: Apply a task vector to a model
- full: Run the complete pipeline
## Arguments:
- ``: The operation to perform (load, extract, transform, or full)
- ``: Path or identifier for the base model
- ``: Path or identifier for the fine-tuned model (required for extract and full operations)
- `--output_dir`: Directory to save output files (default: "models")
- `--scaling_coef`: Scaling coefficient for the task vector (default: 0.5)
## Examples:
Load models:
```bash
mojo run main.mojo load path/to/base/model path/to/finetuned/model
```
Extract task vector:
```bash
mojo run main.mojo extract path/to/base/model path/to/finetuned/model
```
Transform a model:
```bash
mojo run main.mojo transform path/to/base/model --scaling_coef 0.7
```
Run full pipeline:
```bash
mojo run main.mojo full path/to/base/model path/to/finetuned/model --scaling_coef 0.7
```
## License
This project is licensed under the Apache License, Version 2.0. See the LICENSE file for details.