https://github.com/daizutabi/hydraflow
Integrate Hydra and MLflow to manage and track machine learning experiments
https://github.com/daizutabi/hydraflow
hydra mlflow
Last synced: 4 months ago
JSON representation
Integrate Hydra and MLflow to manage and track machine learning experiments
- Host: GitHub
- URL: https://github.com/daizutabi/hydraflow
- Owner: daizutabi
- License: mit
- Created: 2024-08-20T10:31:49.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-06-10T22:22:56.000Z (4 months ago)
- Last Synced: 2025-06-10T23:27:05.696Z (4 months ago)
- Topics: hydra, mlflow
- Language: Python
- Homepage: https://daizutabi.github.io/hydraflow/
- Size: 1.65 MB
- Stars: 5
- Watchers: 0
- Forks: 0
- Open Issues: 2
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# HydraFlow
[![PyPI Version][pypi-v-image]][pypi-v-link]
[![Build Status][GHAction-image]][GHAction-link]
[![Coverage Status][codecov-image]][codecov-link]
[![Documentation Status][docs-image]][docs-link]
[![Python Version][python-v-image]][python-v-link][pypi-v-image]: https://img.shields.io/pypi/v/hydraflow.svg
[pypi-v-link]: https://pypi.org/project/hydraflow/
[GHAction-image]: https://github.com/daizutabi/hydraflow/actions/workflows/ci.yaml/badge.svg?branch=main&event=push
[GHAction-link]: https://github.com/daizutabi/hydraflow/actions?query=event%3Apush+branch%3Amain
[codecov-image]: https://codecov.io/github/daizutabi/hydraflow/coverage.svg?branch=main
[codecov-link]: https://codecov.io/github/daizutabi/hydraflow?branch=main
[docs-image]: https://img.shields.io/badge/docs-latest-blue.svg
[docs-link]: https://daizutabi.github.io/hydraflow/
[python-v-image]: https://img.shields.io/pypi/pyversions/hydraflow.svg
[python-v-link]: https://pypi.org/project/hydraflow## Overview
HydraFlow seamlessly integrates [Hydra](https://hydra.cc/) and [MLflow](https://mlflow.org/) to streamline machine learning experiment workflows. By combining Hydra's powerful configuration management with MLflow's robust experiment tracking, HydraFlow provides a comprehensive solution for defining, executing, and analyzing machine learning experiments.
## Design Principles
HydraFlow is built on the following design principles:
1. **Type Safety** - Utilizing Python dataclasses for configuration type checking and IDE support
2. **Reproducibility** - Automatically tracking all experiment configurations for fully reproducible experiments
3. **Analysis Capabilities** - Providing powerful APIs for easily analyzing experiment results
4. **Workflow Integration** - Creating a cohesive workflow by integrating Hydra's configuration management with MLflow's experiment tracking## Key Features
- **Type-safe Configuration Management** - Define experiment parameters using Python dataclasses with full IDE support and validation
- **Seamless Hydra-MLflow Integration** - Automatically register configurations with Hydra and track experiments with MLflow
- **Advanced Parameter Sweeps** - Define complex parameter spaces using extended sweep syntax for numerical ranges, combinations, and SI prefixes
- **Workflow Automation** - Create reusable experiment workflows with YAML-based job definitions
- **Powerful Analysis Tools** - Filter, group, and analyze experiment results with type-aware APIs
- **Custom Implementation Support** - Extend experiment analysis with domain-specific functionality## Installation
```bash
pip install hydraflow
```**Requirements:** Python 3.13+
## Quick Example
```python
from dataclasses import dataclass
from mlflow.entities import Run
import hydraflow@dataclass
class Config:
width: int = 1024
height: int = 768@hydraflow.main(Config)
def app(run: Run, cfg: Config) -> None:
# Your experiment code here
print(f"Running with width={cfg.width}, height={cfg.height}")if __name__ == "__main__":
app()
```Execute a parameter sweep with:
```bash
python app.py -m width=800,1200 height=600,900
```## Core Components
HydraFlow consists of the following key components:
### Configuration Management
Define type-safe configurations using Python dataclasses:
```python
@dataclass
class Config:
learning_rate: float = 0.001
batch_size: int = 32
epochs: int = 10
```### Main Decorator
The `@hydraflow.main` decorator integrates Hydra and MLflow:
```python
@hydraflow.main(Config)
def train(run: Run, cfg: Config) -> None:
# Your experiment code
```### Workflow Automation
Define reusable experiment workflows in YAML:
```yaml
jobs:
train_models:
run: python train.py
sets:
- each: model=small,medium,large
all: learning_rate=0.001,0.01,0.1
```### Analysis Tools
Analyze experiment results with powerful APIs:
```python
from hydraflow import Run, iter_run_dirs# Load runs
runs = Run.load(iter_run_dirs("mlruns"))# Filter and analyze
best_runs = runs.filter(model_type="transformer").to_frame("learning_rate", "accuracy")
```## Documentation
For detailed documentation, visit our [documentation site](https://daizutabi.github.io/hydraflow/):
- [Getting Started](https://daizutabi.github.io/hydraflow/getting-started/) - Installation and core concepts
- [Practical Tutorials](https://daizutabi.github.io/hydraflow/practical-tutorials/) - Learn through hands-on examples
- [User Guide](https://daizutabi.github.io/hydraflow/part1-applications/) - Detailed documentation of HydraFlow's capabilities
- [API Reference](https://daizutabi.github.io/hydraflow/api/hydraflow/) - Complete API documentation## License
This project is licensed under the MIT License.