{"id":50777199,"url":"https://github.com/malchul/experiment_tracker","last_synced_at":"2026-06-12T01:00:36.546Z","repository":{"id":330881168,"uuid":"1113371247","full_name":"MalchuL/experiment_tracker","owner":"MalchuL","description":"Research-first machine learning experiment tracker for comparing model metrics, scalar curves, artifacts, and experiment lineage.","archived":false,"fork":false,"pushed_at":"2026-06-11T23:06:40.000Z","size":5958,"stargazers_count":5,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-11T23:13:37.907Z","etag":null,"topics":["artifact-tracking","experiment-tracking","machine-learning","metrics","ml-experiments","ml-research","mlops","mlops-environment","mlops-training","model-comparison","python-sdk","scalar-visualization","self-hosted"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/MalchuL.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2025-12-09T22:15:20.000Z","updated_at":"2026-06-11T23:03:59.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/MalchuL/experiment_tracker","commit_stats":null,"previous_names":["malchul/experiment_tracker"],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/MalchuL/experiment_tracker","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MalchuL%2Fexperiment_
tracker","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MalchuL%2Fexperiment_tracker/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MalchuL%2Fexperiment_tracker/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MalchuL%2Fexperiment_tracker/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/MalchuL","download_url":"https://codeload.github.com/MalchuL/experiment_tracker/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MalchuL%2Fexperiment_tracker/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34224103,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-11T02:00:06.485Z","response_time":57,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artifact-tracking","experiment-tracking","machine-learning","metrics","ml-experiments","ml-research","mlops","mlops-environment","mlops-training","model-comparison","python-sdk","scalar-visualization","self-hosted"],"created_at":"2026-06-12T01:00:19.227Z","updated_at":"2026-06-12T01:00:36.504Z","avatar_url":"https://github.com/MalchuL.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Experiment Tracker: Self-Hosted ML Experiment Analysis 
Workspace\n\n\n![Python](https://img.shields.io/badge/Python-3.10%2B-3776AB?logo=python\u0026logoColor=white)\n![FastAPI](https://img.shields.io/badge/FastAPI-backend-009688?logo=fastapi\u0026logoColor=white)\n![Next.js](https://img.shields.io/badge/Next.js-UI-000000?logo=nextdotjs\u0026logoColor=white)\n![PostgreSQL](https://img.shields.io/badge/PostgreSQL-relational%20state-4169E1?logo=postgresql\u0026logoColor=white)\n![ClickHouse](https://img.shields.io/badge/ClickHouse-scalars-FFCC01?logo=clickhouse\u0026logoColor=black)\n![MinIO](https://img.shields.io/badge/MinIO-object%20storage-C72E49?logo=minio\u0026logoColor=white)\n![S3 Compatible](https://img.shields.io/badge/S3-compatible%20blobs-569A31?logo=amazons3\u0026logoColor=white)\n![Docker](https://img.shields.io/badge/Docker-self--hosted-2496ED?logo=docker\u0026logoColor=white)\n![SDK](https://img.shields.io/badge/Python%20SDK-training%20logs-4B8BBE?logo=python\u0026logoColor=white)\n\nExperiment Tracker is an open-source, self-hosted ML/DL experiment tracker for research-heavy workflows. It focuses on experiment understanding: compare final metrics, inspect scalar curves, review step-aware artifacts, and navigate experiment lineage in one workspace.\n\nIt is intentionally smaller than a full MLOps platform. The goal is not remote execution, infrastructure orchestration, production serving, or a universal training launcher. 
The goal is a clear research workspace for ML engineers and data scientists who run many experiments and need to understand what changed, which run improved, and why.\n\n\u003e A self-hosted experiment tracker for research-heavy ML workflows: metrics-first comparison, readable scalar curves, step-aware artifacts, and experiment lineage without turning your setup into a full MLOps platform.\n\n## What It Is For\n\n- **Metrics-first model selection:** compare final metrics and labeled metric snapshots across many runs before drilling into details.\n- **Readable scalar analysis:** inspect training and validation curves across experiments with smoothing, compare hover, zooming, and backend downsampling.\n- **Step-aware artifact review:** keep generated images, predictions, text outputs, checkpoints, configs, and project files attached to experiment context.\n- **Experiment lineage:** track parent-child research branches, metric deltas, and how one run evolved from another.\n- **Self-hosted research history:** own experiment metadata, scalar series, artifacts, notes, and reports in your own stack.\n\n## What It Is Not\n\nExperiment Tracker is not a training orchestrator, deployment platform, model registry, hyperparameter sweep engine, GPU queue, or agent execution system. If you need a broad AI platform with pipelines, autoscaling infrastructure, registry workflows, automations, and deployment layers, tools like W\u0026B or ClearML cover a larger surface area.\n\nUse Experiment Tracker when you want a focused, self-hosted research workspace for understanding experiments rather than managing infrastructure.\n\n## Why Not Just TensorBoard?\n\nTensorBoard is excellent for local visualization. 
Experiment Tracker keeps TensorBoard-like logging ergonomics but adds project-level research context around those logs:\n\n- final metric comparison tables for choosing the best run;\n- scalar curves designed for comparing many experiments;\n- step-aware and named artifacts;\n- notes, reports, hypotheses, teams, and project metadata;\n- editable experiment lineage instead of only a flat list of runs.\n\n## Machine Learning Experiment Comparison\n\n\u003cimg src=\"mics/metrics_page_example.png\" alt=\"Machine learning experiment tracker metrics table for comparing model accuracy loss precision recall and mAP\" width=\"100%\"\u003e\n\n### Features for researchers\n\n- **Dense model-selection table:** compare final or labeled metric snapshots across experiments in a project-scoped grid.\n- **Research workflow controls:** filter runs, sort and resize columns, hide rows or metrics, export tables, highlight min/max values, and inspect selected experiment metadata in the side panel.\n- **Clear metric language:** use final metrics and metric snapshots for model selection; use scalar curves for training dynamics.\n\n## Scalar Metrics and Logged Artifacts\n\n\u003cimg src=\"mics/scalars_view_example.png\" alt=\"Machine learning scalar metrics dashboard with training curves validation loss and logged prediction image artifacts\" width=\"100%\"\u003e\n\n### Features for researchers\n\n- **Curves built for comparison:** visualize multi-run scalar curves with synchronized axes, smoothing, compare hover, nearest-point hover, resizable cards, saved views, and selective visibility for each metric stream.\n- **Readable curves at scale:** scalar queries are backed by ClickHouse and sampled per metric and per experiment, so charts stay usable when training logs get large.\n- **Artifacts in training context:** inspect images, predictions, generated samples, text outputs, and other logged objects beside scalar trends, grouped by type and name, with step-aware controls.\n\n## Experiment 
Lineage and Research History\n\n\u003cimg src=\"mics/dag_view_example.png\" alt=\"Experiment lineage graph for machine learning research showing parent child runs and metric deltas\" width=\"100%\"\u003e\n\n### Features for researchers\n\n- **Research tree, not just run list:** track parent-child relationships between runs and understand how baselines became follow-up experiments.\n- **Metric deltas along branches:** compare selected metrics against each run's parent directly in the lineage view.\n- **Editable lineage:** search, highlight, persist layout, and update parent links while keeping cycle checks in place.\n\n\n## File Comparison\n\n\u003cimg src=\"mics/diff_example.png\" alt=\"Machine learning files comparison view showing side by side diff of two files\" width=\"100%\"\u003e\n\n### Features for researchers\n\n- **Side-by-side diff:** compare two files side by side with diff highlighting.\n- **Inline highlighting:** highlight changed lines in the file.\n- **Experiment-to-experiment comparison:** compare two experiments side by side with diff highlighting.\n\n## Architecture Designed Around Experiment Data\n\nExperiment Tracker separates data by workload instead of forcing everything into one store:\n\n```mermaid\nflowchart LR\n  Web[\"Next.js web UI\"]\n  API[\"FastAPI backend\"]\n  PG[\"PostgreSQL\\nusers, teams, projects, experiments, RBAC\"]\n  CH[\"ClickHouse\\nscalar series and step artifact metadata\"]\n  S3[\"MinIO / S3-compatible storage\\ncontent-addressed blobs\"]\n  SDK[\"Python SDK / CLI\"]\n\n  SDK --\u003e API\n  Web --\u003e API\n  API --\u003e PG\n  API --\u003e CH\n  API --\u003e S3\n```\n\n- **PostgreSQL:** relational state such as users, teams, projects, experiments, permissions, notes, and reports.\n- **ClickHouse:** high-volume scalar time series and step-aware artifact metadata.\n- **S3-compatible object storage:** heavy blobs and content-addressed project artifacts.\n- **FastAPI backend:** orchestration layer between the UI, SDK, 
relational state, scalar storage, and object storage.\n\nThis makes the product lightweight from a workflow perspective while still matching the actual shape of ML experiment data.\n\n## Core Capabilities\n\n| Area | What it helps researchers do |\n|------|-------------------------------|\n| Experiment tracking | Record runs, status, tags, metadata, notes, and project context. |\n| Metrics comparison | Compare final scores and labeled metric snapshots across models in a dense table. |\n| Scalar visualization | Explore loss, accuracy, learning rate, validation metrics, and custom scalar curves with comparison-focused chart tools. |\n| Step-aware artifacts | Review images, predictions, generated samples, text outputs, and other objects at the training step where they were logged. |\n| Named artifacts | Store checkpoints, configs, final exports, and other stable experiment files. |\n| Project artifacts | Deduplicate shared project files by content hash for datasets, code snapshots, configs, and reusable assets. |\n| Research lineage | Keep parent-child run relationships and metric deltas connected to experiment history. |\n| Research organization | Keep hypotheses, reports, kanban items, notes, and SDK-driven training logs in one project workspace. |\n| Self-hosted stack | Run the UI, API, scalars service, object storage, PostgreSQL, ClickHouse, and MinIO/S3-compatible storage with Docker or local development tools. 
|\n\n## Positioning\n\nExperiment Tracker is best described as a **self-hosted ML experiment analysis workspace** or a **research-first experiment tracker for ML/DL workflows**.\n\n- Compared with **W\u0026B**, it is intentionally narrower: focused on metrics, curves, artifacts, and lineage rather than a broad system of record with sweeps, reports, automations, registry, and platform workflows.\n- Compared with **ClearML**, it does not try to be an end-to-end AI platform with infrastructure control, queues, pipelines, and deployment.\n- Compared with **TensorBoard**, it keeps familiar logging ideas while adding project-level comparison, experiment metadata, artifacts, notes, and lineage.\n\nThe sharpest summary:\n\n\u003e Experiment Tracker helps ML engineers understand experiment evolution, not just log runs: metrics-first comparison, readable scalar curves, step-aware artifacts, and lineage-aware run history in a self-hosted stack.\n\n## Quick Docker Install\n\nInstall Docker with the Compose plugin, then download the required deployment files:\n\n```bash\nmkdir -p experiment-tracker \u0026\u0026 cd experiment-tracker\ncurl -fsSLO https://raw.githubusercontent.com/MalchuL/experiment_tracker/main/docker-compose.yml\ncurl -fsSLO https://raw.githubusercontent.com/MalchuL/experiment_tracker/main/scripts/docker-up-public.sh\nchmod +x docker-up-public.sh\n```\n\nChoose one way to start the stack:\n\n```bash\ndocker compose up -d\n./docker-up-public.sh http://127.0.0.1:3000\n./docker-up-public.sh https://tracker.example.com https://api.example.com\nsudo PUBLIC_URL=http://192.168.1.247 WEB_PORT=3000 ./docker-up-public.sh\nsudo PUBLIC_URL=http://192.168.1.247 ./docker-up-public.sh\n\n```\n\nThe first command uses the default localhost configuration. The second configures a browser-reachable local URL. 
The third configures separate public UI and API URLs. The last two pass the public URL, and optionally the web port, as environment variables.\n\n**Now you can open the UI at http://127.0.0.1:3000 or https://tracker.example.com.**\n\nStop the stack without deleting stored data:\n\n```bash\ndocker compose down\n```\n\n## Python SDK\n\n### Install\n\n```\npip install \"experiment-tracker-sdk @ git+https://github.com/MalchuL/experiment_tracker.git@main#subdirectory=python/sdk\"\n```\n\nUsing uv:\n```\nuv pip install \"git+https://github.com/MalchuL/experiment_tracker.git@main#subdirectory=python/sdk\"\n```\n\n### Get API token\n\n1. Register a new user in the web UI at http://127.0.0.1:3000. You can use any email and password (they are stored only in the local database and are not used for anything else).\n2. Click in the top-right corner and select \"API Tokens\".\n3. Click \"Create Token\" (use all permissions for now).\n4. Enter a name for the token.\n5. Click \"Create\".\n6. Copy the token (it is shown only once). Alternatively, copy the whole command that initializes the SDK.\n7. (Optional) Run the copied command; if you use uv, prefix it with `uv run`, for example: `uv run experiment-tracker init --base-url \"http://127.0.0.1:8000\" --api-prefix \"/api\" --api-token \"pat_nOMwtEGLRZVFI_8IzQi6jmx3YDUGPJL73TgQmxMRBjc\"`\n\n### Configure\n\nThe SDK installs three equivalent console entry points:\n\n- `experiment-tracker` (full name)\n- `exp-tracker`\n- `exp-track`\n\nThey all invoke the same CLI; use whichever name you prefer. Examples below use\n`experiment-tracker`, but `exp-tracker` and `exp-track` work the same way.\n\nThe CLI is implemented with [Click](https://click.palletsprojects.com/).\n\nOptional environment defaults for interactive `experiment-tracker init` (when\nyou omit flags and press Enter at prompts) can be set with the `EXP_TRACKER_`\nprefix, for example `EXP_TRACKER_DEFAULT_BASE_URL` and\n`EXP_TRACKER_DEFAULT_API_PREFIX`. 
Values are read from the process environment\nand an optional `.env` file in the current working directory (see\n`experiment_tracker_sdk.settings`).\n\nSave the backend base URL and API token:\n\n**Use the backend URL here, not the UI URL. Example: http://127.0.0.1:8000**\n```\nuv run exp-tracker init --base-url http://127.0.0.1:8000 --api-token \u003cTOKEN\u003e\n```\n\nCheck connectivity and token validity (connectivity to the backend is checked first, then whether the token is valid):\n\n```\nuv run experiment-tracker ping\nuv run experiment-tracker whoami\n```\n\n### Run a training script\nThere is a mock training script in `examples/training/train.py`. It is a simple script that shows the logging capabilities of the SDK.\n```\ncd examples/training\nuv run python train.py --project-name \"SDK Training\" --team-name \"My First Team\" --experiment-name \"Experiment 0\"\n```\n\nFor **large artifact upload/download with tqdm progress** (files \u003e= 50 MiB), see `examples/verbose-artifact-transfer/`:\n```\ncd examples/verbose-artifact-transfer\nuv sync\nuv run python train.py --project-name \"SDK Verbose Artifacts\" --experiment-name \"Large transfer demo\"\n```\n\nIf you want to run an existing script without changing it and you have tensorboardX installed, you can use the following command:\n```\ncd examples/pytorch-mnist-tensorboardx\nuv run experiment-tracker run --project mnist --experiment \"Experiment 0\" train.py -- --epochs 100 --max-train-batches 50 --max-val-batches 50\n```\nThis command runs `train.py` with the arguments passed after the `--` token.\nIt fetches project \"mnist\" and experiment \"Experiment 0\", creating them if they don't exist.\nAfter that, it captures tensorboardX events and logs them to the backend.\n\n## Docker Installation and Deployment\n\nDocker installation, deployment, troubleshooting, and known issues: **[Docker Guide](DOCKER.md)**\n\n## Local Development\n\nFor manual local setup with Postgres, MinIO, ClickHouse, the Python 
services, and the Next.js frontend, see [LOCAL_RUN.md](LOCAL_RUN.md).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmalchul%2Fexperiment_tracker","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmalchul%2Fexperiment_tracker","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmalchul%2Fexperiment_tracker/lists"}