https://github.com/sairam-s0/activelabelingsystem
LabelOps – A local-first AI-assisted data labeling system with continual learning and background model training.
https://github.com/sairam-s0/activelabelingsystem
ai computer-vision data-engineering labeling-tool learners-implementation local machine-learning
Last synced: 3 months ago
JSON representation
LabelOps – A local-first AI-assisted data labeling system with continual learning and background model training.
- Host: GitHub
- URL: https://github.com/sairam-s0/activelabelingsystem
- Owner: sairam-s0
- License: apache-2.0
- Created: 2025-12-30T10:19:11.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2026-04-02T10:44:47.000Z (3 months ago)
- Last Synced: 2026-04-03T00:56:09.515Z (3 months ago)
- Topics: ai, computer-vision, data-engineering, labeling-tool, learners-implementation, local, machine-learning
- Language: Python
- Homepage:
- Size: 52.2 MB
- Stars: 6
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
Awesome Lists containing this project
README
# ActiveLabelingSystem
Formerly known as **LabelOps**.
Local-first AI-assisted image labeling with active learning, manual box tools, background retraining, and dataset version snapshots.
## What is new
- Modular app structure (`src/app`, `src/core`, `src/features`) for cleaner maintenance.
- Active learning image prioritization when loading folders (unlabeled images first, sorted by uncertainty).
- Retraining policy engine with multi-signal checks (sample count, time, entropy shift, class balance, confidence drift).
- Dataset versioning from UI with metadata, hash integrity, and manifest tracking.
- Label format selection at folder load time: `COCO JSON` or `Plain JSON`.
- Safer persistence:
- autosave per folder (`labels_autosave.json`)
- internal state store (`.labels_internal.json`)
- atomic file writes for exports.
- Improved manual labeling UX:
- floating toolbox
- quick class switching (1-9)
- undo/delete shortcuts
- inline floating action toolbar.
- Better training controls in UI:
- force retrain
- live queue/training status
- shadow model promotion with validation warning flow.
## Features
### Active learning and triage
- Entropy-aware detection metadata is attached to predictions.
- Folder loading prioritizes unlabeled images for high-value review.
- Replay buffer preserves historical samples for continual learning.
### Retraining workflow
- Background training via Ray shadow trainer.
- Training trigger policy requires minimum sample count plus at least one urgency signal.
- Promotion flow supports validation and explicit override confirmation.
### Dataset versioning
- Create version snapshots from current labels.
- Stores YOLO-style labels, copied images, metadata, and a dataset hash.
- Keeps lineage and latest pointer in `src/datasets/manifest.json`.
### Label export
- `Plain JSON` output (`labels.json`) for direct app consumption.
- `COCO JSON` output (`labels_coco.json`) for downstream ML pipelines.
### Manual labeling mode
- Draw boxes directly on the canvas.
- Per-box class assignment with deterministic class colors.
- Save-and-next loop without leaving manual mode.
## Project structure
```text
ActiveLabelingSystem/
images/
src/
app/
window.py
dialogs.py
state.py
actions.py
core/
data_manager.py
entropy.py
sample_selector.py
retrain_policy.py
dataset_versioner.py
replay_buffer.py
shadow_trainer.py
training_orchestrator.py
model_manager.py
feedback_validator.py
features/
manual.py
shortcut_manager.py
shortcut_config.py
toolbar_manager.py
toolbar_widget.py
toolbar_styles.py
datasets/
models/
main.py
requirements.txt
run_tests.bat
README.md
GUID.md
```
## Installation
```bash
git clone https://github.com/sairam-s0/ActiveLabelingSystem.git
cd ActiveLabelingSystem
python -m venv .venv
# Windows
.venv\Scripts\activate
# Linux/macOS
# source .venv/bin/activate
pip install .
```
Optional: install CUDA-enabled PyTorch for GPU inference/training from https://pytorch.org/get-started/locally/.
Users can install it directly with:
```bash
pip install Active-Labeling-System
```
## Run the app
After install:
```bash
als
```
## Publish to PyPI (GitHub Actions)
This repo includes `.github/workflows/publish-pypi.yml`.
1. In PyPI, configure Trusted Publisher for this GitHub repo/workflow:
- Owner: `sairam-s0`
- Repository: `ActiveLabelingSystem`
- Workflow: `publish-pypi.yml`
2. Bump version in `pyproject.toml`.
3. Push a tag like `v0.1.1`.
4. GitHub Actions builds and publishes automatically to PyPI.
## How to use
### 1. Select image folder and output format
- Click `Select Folder`.
- Choose output format:
- `COCO JSON` -> writes `labels_coco.json`
- `Plain JSON` -> writes `labels.json`
- The app also keeps internal state in `.labels_internal.json` and autosave in `labels_autosave.json` inside the selected folder.


### 2. Select classes
- Click `Select Classes`.
- Pick one or more classes.
- Add custom classes from the same dialog when needed.

### 3. Start labeling
- Click `START`.
- Review detections and use bottom actions:
- `Accept (A)`
- `Reject (R)`
- `Skip (N)`
- `Manual (M)`

### 4. Manual mode (box drawing)

- Draw boxes by click-drag on canvas.
- Save boxes and move next with:
- `Space` or `Enter` -> save and next
- `Esc` -> exit manual mode
- `Ctrl+Z` -> undo last box
- `Delete` -> delete last box
- `1..9` -> switch class index

### 5. Monitor active learning and training
- Left panel shows:
- entropy of current image
- queue size
- training progress/status.
- Use `Force Retrain` if you want to bypass normal policy checks (still requires minimum samples).

### 6. Version and promote
- `Create Version` creates a dataset snapshot in `src/datasets/v_YYYYMMDD_HHMMSS/`.
- `List Versions` shows stored versions and metadata.
- `Promote Shadow` promotes trained candidate model to active model.
## Output files and artifacts
For each selected image folder:
- `labels.json` or `labels_coco.json` (selected format)
- `.labels_internal.json` (internal metadata store)
- `labels_autosave.json` (session recovery)
## Notes
- If Ray is unavailable, labeling still works; background training features are reduced.
- If class mapping is not available yet, trainer creation waits until first labels are saved.
- Restart app after model promotion for a clean reload of active weights.
## Contributing
Please read `CONTRIBUTING.md`.
## License
MIT. See `LICENSE`.