{"id":34039683,"url":"https://github.com/diningphil/mlwiz","last_synced_at":"2026-04-19T10:03:54.979Z","repository":{"id":256034999,"uuid":"828474782","full_name":"diningphil/mlwiz","owner":"diningphil","description":"Machine Learning Research Wizard","archived":false,"fork":false,"pushed_at":"2026-04-12T14:47:50.000Z","size":10826,"stargazers_count":13,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2026-04-12T16:26:20.638Z","etag":null,"topics":["evaluation-framework","experiments","machine-learning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/diningphil.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-07-14T09:07:24.000Z","updated_at":"2026-04-12T14:43:01.000Z","dependencies_parsed_at":"2024-11-21T16:35:31.418Z","dependency_job_id":"97f87fcf-8f4c-482f-876b-3268087b4d7d","html_url":"https://github.com/diningphil/mlwiz","commit_stats":null,"previous_names":["diningphil/mlwiz"],"tags_count":21,"template":false,"template_full_name":null,"purl":"pkg:github/diningphil/mlwiz","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/diningphil%2Fmlwiz","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/diningphil%2Fmlwiz/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/diningphil%2Fmlwiz/releases","manifests_url":"ht
tps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/diningphil%2Fmlwiz/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/diningphil","download_url":"https://codeload.github.com/diningphil/mlwiz/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/diningphil%2Fmlwiz/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32002361,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-18T20:23:30.271Z","status":"online","status_checked_at":"2026-04-19T02:00:07.110Z","response_time":55,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["evaluation-framework","experiments","machine-learning"],"created_at":"2025-12-13T21:45:42.300Z","updated_at":"2026-04-19T10:03:54.972Z","avatar_url":"https://github.com/diningphil.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/diningphil/mlwiz/main/docs/_static/mlwiz-logo2-horizontal.png\" width=\"360\" alt=\"MLWiz logo\"/\u003e\n\u003c/p\u003e\n\n# MLWiz\n_Machine Learning Research Wizard — reproducible experiments from YAML (model selection + risk assessment) for vectors, images, time-series, and 
graphs._\n\n[![PyPI](https://img.shields.io/pypi/v/mlwiz.svg)](https://pypi.org/project/mlwiz/)\n[![Python](https://img.shields.io/pypi/pyversions/mlwiz.svg)](https://pypi.org/project/mlwiz/)\n[![CI](https://github.com/diningphil/mlwiz/actions/workflows/python-test-and-coverage.yml/badge.svg)](https://github.com/diningphil/mlwiz/actions/workflows/python-test-and-coverage.yml)\n[![Docs](https://readthedocs.org/projects/mlwiz/badge/?version=stable)](https://mlwiz.readthedocs.io/en/stable/)\n[![Coverage](https://raw.githubusercontent.com/diningphil/mlwiz/main/.badges/coverage_badge.svg)](https://github.com/diningphil/mlwiz/actions/workflows/python-test-and-coverage.yml)\n[![Docstrings](https://raw.githubusercontent.com/diningphil/mlwiz/main/.badges/interrogate_badge.svg)](https://interrogate.readthedocs.io/en/latest/)\n[![License](https://img.shields.io/badge/License-BSD_3--Clause-gray.svg)](https://opensource.org/licenses/BSD-3-Clause)\n[![Stars](https://img.shields.io/github/stars/diningphil/mlwiz?style=flat\u0026logo=github)](https://github.com/diningphil/mlwiz/stargazers)\n\n## 🔗 Quick Links\n- 📘 Docs: https://mlwiz.readthedocs.io/en/stable/\n- 🧪 Tutorial (recommended): https://mlwiz.readthedocs.io/en/stable/tutorial.html\n- 📦 PyPI: https://pypi.org/project/mlwiz/\n- 📝 Changelog: `CHANGELOG.md`\n- 🤝 Contributing: `CONTRIBUTING.md`\n\n## ✨ What It Does\nMLWiz helps you run end-to-end research experiments with minimal boilerplate:\n\n- 🧱 Build/prepare datasets and generate splits (hold-out or nested CV)\n- 🎛️ Expand a hyperparameter search space (grid, random, or Bayesian search)\n- ⚡ Run model selection + risk assessment in parallel with Ray (CPU/GPU or cluster)\n- 📈 Log metrics, checkpoints, and TensorBoard traces in a consistent folder structure\n\nInspired by (and a generalized version of) [PyDGN](https://github.com/diningphil/PyDGN).\n\n## ✅ Key Features\n| Area | What you get |\n| --- | --- |\n| Research Oriented Framework | Anything is customizable, easy 
prototyping of models and setups |\n| Reproducibility | Ensure your results are reproducible across multiple runs |\n| Automatic Split Generation | Dataset preparation + `.splits` generation for hold-out / (nested) CV |\n| Automatic and Robust Evaluation | Nested model selection (inner folds) + risk assessment (outer folds) |\n| Parallelism | Ray-based execution across CPU/GPU (or a Ray cluster) |\n\n\n## 🚀 Getting Started\n\n### 📦 Installation\nMLWiz supports Python 3.10+.\n\n```bash\npip install mlwiz\n```\n\nTip: for GPU / graph workloads, install PyTorch and PyG following their official instructions first, then `pip install mlwiz`.\n\n### ⚡ Quickstart\n| Step | Command | Notes |\n| --- | --- | --- |\n| 1) Prepare dataset + splits | `mlwiz-data --config-file examples/DATA_CONFIGS/config_MNIST.yml` | Creates processed data + a `.splits` file |\n| 2) Run an experiment (grid search) | `mlwiz-exp --config-file examples/MODEL_CONFIGS/config_MLP.yml` | Add `--debug` to run sequentially and print logs |\n| 3) Inspect results | `cat RESULTS/mlp_MNIST/MODEL_ASSESSMENT/assessment_results.json` | Aggregated results live under `RESULTS/` |\n| 4) Visualize in TensorBoard | `tensorboard --logdir RESULTS/mlp_MNIST` | Per-run logs are written automatically |\n| 5) Stop a running experiment | Press `Ctrl-C` | |\n\n### 🧭 Navigating the CLI (non-debug mode)\nExample of the global view CLI:\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/diningphil/mlwiz/main/docs/_static/exp_gui.png\" width=\"760\" alt=\"MLWiz terminal progress UI\"/\u003e\n\u003c/p\u003e\n\nSpecific views can be accessed, e.g. 
to visualize a specific model run:\n\n```bash\n:\u003couter_fold\u003e \u003cinner_fold\u003e \u003cconfig_id\u003e \u003crun_id\u003e\n```\n\n…or, analogously, a risk assessment run:\n\n```bash\n:\u003couter_fold\u003e \u003crun_id\u003e\n```\n\nHere is what it looks like:\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/diningphil/mlwiz/main/docs/_static/run_view.png\" width=\"760\" alt=\"MLWiz terminal specific view\"/\u003e\n\u003c/p\u003e\n\nHandy commands:\n\n```bash\n:  # or :g or :global (back to global view)\n:r # or :refresh (refresh the screen)\n```\n\nYou can use **left-right arrows** to move across configurations, and **up-down arrows** to switch between model selection and risk assessment runs.\n\n## 🧩 Architecture (High-Level)\nMLWiz is built around two YAML files and a small set of composable components:\n\n```text\ndata.yml ──► mlwiz-data ──► processed dataset + .splits\nexp.yml  ──► mlwiz-exp  ──► Ray workers\n                      ├─ inner folds: model selection (best hyperparams)\n                      └─ outer folds: risk assessment (final scores)\n```\n\n- 🧰 **Data pipeline**: `mlwiz-data` instantiates your dataset class and writes a `.splits` file for hold-out / (nested) CV.\n- 🧪 **Search space**: `grid:` and `random:` sections expand into concrete hyperparameter configurations.\n- 🛰️ **Orchestration**: the evaluator schedules training runs with Ray across CPU/GPU (or a Ray cluster).\n- 🏗️ **Execution**: each run builds a model + training engine from dotted paths, then logs artifacts and returns structured results.\n\n## ⚙️ Configuration At A Glance\nMLWiz expects:\n\n- 🗂️ one YAML for **data + splits**\n- 🧾 one YAML for **experiment + search space**\n\nMinimal data config:\n\n```yaml\nsplitter:\n  splits_folder: DATA_SPLITS/\n  class_name: mlwiz.data.splitter.Splitter\n  args:\n    n_outer_folds: 3\n    n_inner_folds: 2\n    seed: 42\n\ndataset:\n  class_name: mlwiz.data.dataset.MNIST\n  args:\n    
storage_folder: DATA/\n```\n\nMinimal experiment config (grid search):\n\n```yaml\nstorage_folder: DATA\ndataset_class: mlwiz.data.dataset.MNIST\ndata_splits_file: DATA_SPLITS/MNIST/MNIST_outer3_inner2.splits\n\ndevice: cpu\nmax_cpus: 8\n\ndataset_getter: mlwiz.data.provider.DataProvider\ndata_loader:\n  class_name: torch.utils.data.DataLoader\n  args:\n    num_workers : 0\n    pin_memory: False\n\nresult_folder: RESULTS\nexp_name: mlp\nexperiment: mlwiz.experiment.Experiment\nmodel_selection_criteria:\n  - metric: main_score\n    direction: max\nevaluate_every: 1\nrisk_assessment_training_runs: 3\nmodel_selection_training_runs: 2\n\ngrid:\n  model: mlwiz.model.MLP\n  epochs: 400\n  batch_size: 512\n  dim_embedding: 5\n  mlwiz_tests: True  # patch: allow reshaping of MNIST dataset\n  optimizer:\n    - class_name: mlwiz.training.callback.optimizer.Optimizer\n      args:\n        optimizer_class_name: torch.optim.Adam\n        lr:\n          - 0.01\n          - 0.03\n        weight_decay: 0.\n  loss: mlwiz.training.callback.metric.MulticlassClassification\n  scorer: mlwiz.training.callback.metric.MulticlassAccuracy\n  engine:\n    class_name: mlwiz.training.engine.TrainingEngine\n    args:\n      mixed_precision: false\n      mixed_precision_dtype: torch.float16\n```\n\nWhen `mixed_precision: true` is used on CPU, requesting\n`mixed_precision_dtype: torch.float16` is automatically converted to\n`torch.bfloat16`.\n\n`higher_results_are_better` remains available as a legacy shortcut for\n`main_score`, but it cannot be set together with `model_selection_criteria`.\n\nSee `examples/` for complete configs (including random/Bayesian search, schedulers, early stopping, and more).\n\n### 🧩 Custom Code Via Dotted Paths\nPoint YAML entries to your own classes (in your project). 
`mlwiz-data` and `mlwiz-exp` add the current working directory to `sys.path`, so this works out of the box:\n\n```yaml\ngrid:\n  model: my_project.models.MyModel\n\ndataset:\n  class_name: my_project.data.MyDataset\n```\n\n## 📦 Outputs\nRuns are written under `RESULTS/`:\n\n| Output | Location |\n| --- | --- |\n| Aggregated outer-fold results | `RESULTS/\u003cexp_name\u003e_\u003cdataset\u003e/MODEL_ASSESSMENT/assessment_results.json` |\n| Per-fold summaries | `RESULTS/\u003cexp_name\u003e_\u003cdataset\u003e/MODEL_ASSESSMENT/OUTER_FOLD_k/outer_results.json` |\n| Model selection (inner folds + winner config) | `.../MODEL_SELECTION/...` |\n| Final retrains with selected hyperparams | `.../final_run*/` |\n\nEach training run also writes TensorBoard logs under `\u003crun_dir\u003e/tensorboard/`.\n\n## 🛠️ Utilities\n### 🗂️ Config Management (CLI)\nDuplicate a base experiment config across multiple datasets:\n\n```bash\nmlwiz-config-duplicator --base-exp-config base.yml --data-config-files data1.yml data2.yml\n```\n\n### 📊 Post-process Results (Python)\nFilter configurations from a `MODEL_SELECTION/` folder and convert them to a DataFrame:\n\n```python\nfrom mlwiz.evaluation.util import retrieve_experiments, filter_experiments, create_dataframe\n\nconfigs = retrieve_experiments(\n    \"RESULTS/mlp_MNIST/MODEL_ASSESSMENT/OUTER_FOLD_1/MODEL_SELECTION/\"\n)\nfiltered = filter_experiments(configs, logic=\"OR\", parameters={\"lr\": 0.001})\ndf = create_dataframe(\n    config_list=filtered,\n    key_mappings=[(\"lr\", float), (\"avg_validation_score\", float)],\n)\n```\n\nExport aggregated assessment results to LaTeX:\n\n```python\nfrom mlwiz.evaluation.util import create_latex_table_from_assessment_results\n\nexperiments = [\n    (\"RESULTS/mlp_MNIST\", \"MLP\", \"MNIST\"),\n    (\"RESULTS/dgn_PROTEINS\", \"DGN\", \"PROTEINS\"),\n]\n\nlatex_table = create_latex_table_from_assessment_results(\n    experiments,\n    metric_key=\"main_score\",\n    no_decimals=3,\n    
model_as_row=True,\n    use_single_outer_fold=False,\n)\nprint(latex_table)\n```\n\nCompare statistical significance between models (Welch t-test):\n\n```python\nfrom mlwiz.evaluation.util import statistical_significance\n\nreference = (\"RESULTS/mlp_MNIST\", \"MLP\", \"MNIST\")\ncompetitors = [\n    (\"RESULTS/baseline1_MNIST\", \"B1\", \"MNIST\"),\n    (\"RESULTS/baseline2_MNIST\", \"B2\", \"MNIST\"),\n]\n\ndf = statistical_significance(\n    highlighted_exp_metadata=reference,\n    other_exp_metadata=competitors,\n    metric_key=\"main_score\",\n    set_key=\"test\",\n    confidence_level=0.95,\n)\nprint(df)\n```\n\n### 🔍 Load a Trained Model (Notebook-friendly)\nLoad the best configuration for a fold, instantiate dataset/model, and restore a checkpoint:\n\n```python\nfrom mlwiz.evaluation.util import (\n    retrieve_best_configuration,\n    instantiate_dataset_from_config,\n    instantiate_model_from_config,\n    load_checkpoint,\n)\n\nconfig = retrieve_best_configuration(\n    \"RESULTS/mlp_MNIST/MODEL_ASSESSMENT/OUTER_FOLD_1/MODEL_SELECTION/\"\n)\ndataset = instantiate_dataset_from_config(config)\nmodel = instantiate_model_from_config(config, dataset)\nload_checkpoint(\n    \"RESULTS/mlp_MNIST/MODEL_ASSESSMENT/OUTER_FOLD_1/final_run1/best_checkpoint.pth\",\n    model,\n    device=\"cpu\",\n)\n```\n\nFor more post-processing helpers, see the tutorial: https://mlwiz.readthedocs.io/en/stable/tutorial.html\n\n## 🤝 Contributing\nSee `CONTRIBUTING.md`.\n\n## 📄 License\nBSD-3-Clause. See `LICENSE`.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdiningphil%2Fmlwiz","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdiningphil%2Fmlwiz","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdiningphil%2Fmlwiz/lists"}