https://github.com/projectcontinuum/continuum-feature-ai

AI and ML features for continuum


# Continuum Feature AI


AI/ML nodes for fine-tuning LLMs inside your Project Continuum workflows


Kotlin · Python · LoRA · JDK 21

---

## 🌐 Part of Project Continuum

This is the **AI/ML feature repository** for [Project Continuum](https://github.com/projectcontinuum/Continuum), a distributed, crash-proof workflow execution platform. It provides nodes for training and fine-tuning large language models directly inside your visual workflows.

---

## 🔥 What Is This

A standalone Gradle project containing AI/ML workflow nodes. It currently features the **LLM Trainer (Unsloth)** node, which fine-tunes large language models using LoRA (Low-Rank Adaptation) with Unsloth acceleration, right inside your workflow graph.

Ships as a Spring Boot worker with an auto-managed Python virtual environment for ML execution.

---

## 🧪 Included Nodes

### LLM Trainer (Unsloth)

Fine-tune Large Language Models using LoRA with Unsloth acceleration.

| | |
|---|---|
| **Input** | Parquet table with instruction + response columns |
| **Output** | Model info: path to LoRA adapter weights, base model, training config |
| **Category** | Machine Learning, LLM Training |
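
As a rough illustration of the input shape, the sketch below joins one row with instruction and response columns into a single training string. The column names `instruction` and `response`, the `format_example` helper, and the prompt template are all assumptions made for illustration; the template the node actually uses is not documented here.

```python
# Hypothetical sketch: NOT the node's actual prompt template.
from typing import Mapping


def format_example(
    row: Mapping[str, str],
    input_column: str = "instruction",  # assumed column name
    output_column: str = "response",    # assumed column name
    system_prompt: str = "",
) -> str:
    """Join one table row into a single training string."""
    parts = []
    if system_prompt:
        parts.append(f"### System:\n{system_prompt.strip()}")
    parts.append(f"### Instruction:\n{row[input_column].strip()}")
    parts.append(f"### Response:\n{row[output_column].strip()}")
    return "\n\n".join(parts)


# One illustrative row, as it might appear in the input table.
rows = [
    {"instruction": "Translate 'hello' to French.", "response": "bonjour"},
]
texts = [format_example(r, system_prompt="You are a helpful assistant.") for r in rows]
print(texts[0])
```

The input and output column names are parameters here because the node lets you configure both, along with an optional system prompt.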

**Supported Base Models:**

| Provider | Models |
|----------|--------|
| **Unsloth** (fastest) | Phi-4, Phi-4-mini-instruct, Mistral 7B, Llama 3/3.1/3.2, Gemma 2, Qwen 2.5 |
| **Microsoft** | Phi-2, Phi-3-mini-4k-instruct |
| **Meta** | Llama-2-7b, Llama-2-7b-chat |
| **Google** | Gemma 2B, Gemma 7B |
| **Qwen** | Qwen2-7B, Qwen2-7B-Instruct |
| **TII** | Falcon 7B, Falcon 7B-instruct |
| **Custom** | Any HuggingFace causal language model |

**Configurable Parameters:**

| Group | Parameters |
|-------|-----------|
| **Model** | Base model (HuggingFace ID), HuggingFace token for gated models |
| **Data** | Input column, output column, system prompt |
| **Training** | Epochs, batch size, learning rate, max sequence length, warmup steps, weight decay, gradient accumulation |
| **LoRA** | Rank (r), alpha, dropout |
| **Advanced** | 4-bit quantization, random seed, save steps, logging steps, Parquet batch size |
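
To give a feel for how these parameters interact, here is a back-of-the-envelope sketch of the arithmetic behind rank, alpha, gradient accumulation, and 4-bit quantization. It assumes the standard LoRA formulation (two low-rank matrices per adapted weight, scaled by alpha / r); the helper names and the example sizes are illustrative and not taken from this repository.

```python
# Illustrative arithmetic only; helper names and sizes are assumptions.


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """LoRA adds B (d_out x r) and A (r x d_in) next to a frozen d_out x d_in weight."""
    return rank * (d_out + d_in)


def lora_scaling(alpha: float, rank: int) -> float:
    """In standard LoRA the update B @ A is multiplied by alpha / r."""
    return alpha / rank


def effective_batch_size(batch_size: int, grad_accum_steps: int, num_devices: int = 1) -> int:
    """Number of samples contributing to each optimizer step."""
    return batch_size * grad_accum_steps * num_devices


def weight_memory_gib(n_params: int, bits_per_param: int) -> float:
    """Raw weight storage only; ignores quantization metadata, activations, optimizer state."""
    return n_params * bits_per_param / 8 / 2**30


full = 4096 * 4096                                    # one frozen projection matrix
added = lora_trainable_params(4096, 4096, rank=16)    # parameters LoRA trains instead
print(f"trainable fraction for this matrix: {added / full:.4%}")
print(f"LoRA scaling (alpha=32, r=16): {lora_scaling(32, 16)}")
print(f"effective batch size (2 x 4): {effective_batch_size(2, 4)}")
print(f"7B weights at 4-bit: {weight_memory_gib(7_000_000_000, 4):.2f} GiB")
```

Raising the rank grows the adapter linearly, while gradient accumulation raises the effective batch size without raising per-step memory.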

**Key Features:**
- Unsloth acceleration on Linux + CUDA (2x faster, 60% less memory)
- Falls back to standard HuggingFace transformers on other platforms
- 4-bit quantization for reduced memory usage
- Real-time training progress streaming via Kafka
- Auto-managed Python virtual environment
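
The fallback rule in the list above can be pictured as a small decision function. This is a hypothetical sketch, not the worker's real check: `choose_backend` and `cuda_probably_available` are made-up names, and looking for `nvidia-smi` on the `PATH` is only a cheap heuristic standing in for a proper CUDA probe.

```python
# Hypothetical sketch of the fallback rule; the worker's real check may differ.
import platform
import shutil
from typing import Optional


def cuda_probably_available() -> bool:
    """Cheap heuristic: NVIDIA driver installs usually put nvidia-smi on the PATH."""
    return shutil.which("nvidia-smi") is not None


def choose_backend(system: Optional[str] = None, has_cuda: Optional[bool] = None) -> str:
    """Return "unsloth" on Linux with CUDA, otherwise "transformers"."""
    if system is None:
        system = platform.system()
    if has_cuda is None:
        has_cuda = cuda_probably_available()
    return "unsloth" if system == "Linux" and has_cuda else "transformers"


print(choose_backend())
```

Passing `system` and `has_cuda` explicitly makes the rule easy to test without a GPU.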

---

## 🐍 Python Environment

The Unsloth node executes training via a Python virtual environment that is **automatically created at startup** if missing.

| Setting | Default |
|---------|---------|
| `com.continuum.feature.ai.unsloth-trainer.venv-path` | `~/.continuum/unsloth-env` |
| `com.continuum.feature.ai.unsloth-trainer.cache-storage-path` | `./.continuum-cache/workflow-data` |

**Required Python packages** (auto-installed): pyarrow, pandas, datasets, torch, transformers, peft, trl, accelerate, hf_transfer, sentencepiece, protobuf, bitsandbytes, unsloth (Linux + CUDA only).
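
The "create it if it is missing" behaviour can be sketched with Python's standard `venv` module. This is an illustration only: the repository handles this in Kotlin (`PythonEnvironmentManager.kt`), and the `ensure_venv` and `venv_python` helpers below are hypothetical names, not part of the project.

```python
# Sketch of "create the venv if it is missing" using only the standard library.
import tempfile
import venv
from pathlib import Path


def ensure_venv(path: Path, with_pip: bool = True) -> bool:
    """Create a virtual environment at `path` unless one already exists.

    Returns True if a new environment was created, False if one was reused.
    """
    path = path.expanduser()
    if (path / "pyvenv.cfg").is_file():  # marker file written by the venv module
        return False
    venv.EnvBuilder(with_pip=with_pip).create(path)
    return True


def venv_python(path: Path) -> Path:
    """Locate the interpreter inside the environment (Windows vs POSIX layout)."""
    path = path.expanduser()
    windows = path / "Scripts" / "python.exe"
    return windows if windows.exists() else path / "bin" / "python"


# Demo in a throwaway directory so nothing under ~/.continuum is touched.
with tempfile.TemporaryDirectory() as tmp:
    env = Path(tmp) / "demo-env"
    print(ensure_venv(env, with_pip=False))  # first call creates the environment
    print(ensure_venv(env, with_pip=False))  # second call finds pyvenv.cfg and reuses it
    print(venv_python(env).name)
```

In a real setup the packages listed above would then be installed by running the environment's interpreter with `-m pip install ...`.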

---

## 📦 Dependencies

Shared libraries from [Continuum](https://github.com/projectcontinuum/Continuum) via GitHub Packages:

| Dependency | Purpose |
|-----------|---------|
| `continuum-commons:0.0.1` | Base node model, data types, Parquet/S3 utilities |
| `continuum-worker-springboot-starter:0.0.1` | Worker framework: registers nodes with Temporal |

---

## 🚀 Quick Start

### Prerequisites

- **JDK 21**: [Eclipse Temurin](https://adoptium.net/) recommended
- **Python 3.10+**: for the Unsloth training environment
- **Docker & Docker Compose**: for local infrastructure
- **GitHub PAT** with `read:packages` scope
- **(Optional) CUDA GPU**: for Unsloth acceleration

Set environment variables:

```bash
export GITHUB_USERNAME=your-github-username
export GITHUB_TOKEN=ghp_your-personal-access-token
```
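
Before building, it can help to confirm that the prerequisites above are in place. The following is a minimal preflight sketch, not a script shipped with the repository; `missing_env` and `python_ok` are hypothetical helper names.

```python
# Hypothetical preflight check; not part of the repository.
import os
import sys
from typing import List, Mapping, Sequence

REQUIRED_ENV = ("GITHUB_USERNAME", "GITHUB_TOKEN")


def missing_env(env: Mapping[str, str], required: Sequence[str] = REQUIRED_ENV) -> List[str]:
    """Return the required variables that are unset or empty."""
    return [name for name in required if not env.get(name)]


def python_ok(version: Sequence[int] = sys.version_info, minimum: tuple = (3, 10)) -> bool:
    """True if the interpreter is at least the documented minimum version."""
    return tuple(version[:2]) >= minimum


missing = missing_env(os.environ)
if missing:
    print("Missing environment variables:", ", ".join(missing))
if not python_ok():
    print("Python 3.10+ is required for the training environment")
```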

### Run

```bash
# Start infrastructure (Temporal, Kafka, MinIO, API Server, Message Bridge)
cd docker && docker compose up -d

# Build
./gradlew build

# Start the AI worker (auto-creates Python venv on first run)
./gradlew :worker:bootRun
```

---

## 📁 Project Structure

```
continuum-feature-ai/
├── features/
│   └── continuum-feature-unsloth/       # Unsloth LLM trainer node
│       ├── build.gradle.kts             # Depends on continuum-commons
│       └── src/main/kotlin/.../
│           ├── AutoConfigure.kt         # Spring auto-configuration
│           ├── node/
│           │   └── UnslothTrainerNodeModel.kt
│           └── python/
│               └── PythonEnvironmentManager.kt
├── worker/                              # Spring Boot worker application
│   ├── build.gradle.kts                 # Depends on starter + unsloth feature
│   └── src/main/
│       ├── kotlin/.../App.kt
│       └── resources/application.yaml
├── docker/                              # Local development infrastructure
│   └── docker-compose.yml
├── settings.gradle.kts
├── gradle.properties
└── README.md
```

---

## 🗺️ Roadmap

- [x] LLM fine-tuning with Unsloth + LoRA
- [x] Auto-managed Python virtual environment
- [x] 4-bit quantization support
- [ ] Inference node: run inference against fine-tuned or base models
- [ ] Model evaluation node: automated benchmarking
- [ ] Multi-GPU training support
- [ ] More model architectures (vision, embedding)

---

## 🔗 Related Repositories

| Repository | Description |
|-----------|-------------|
| [Continuum](https://github.com/projectcontinuum/Continuum) | Core backend: API server, worker framework, shared libraries |
| [continuum-workbench](https://github.com/projectcontinuum/continuum-workbench) | Browser IDE: Eclipse Theia + React Flow workflow editor |
| [continuum-feature-base](https://github.com/projectcontinuum/continuum-feature-base) | Base analytics nodes: data transforms, REST, scripting, anomaly detection |
| **continuum-feature-ai** (this repo) | AI/ML nodes: LLM fine-tuning with Unsloth + LoRA |
| [continuum-feature-template](https://github.com/projectcontinuum/continuum-feature-template) | Template: scaffold your own custom worker with nodes |