{"id":46843115,"url":"https://github.com/reil-uconn/modular-ml","last_synced_at":"2026-03-10T14:01:00.278Z","repository":{"id":311931146,"uuid":"1040929979","full_name":"REIL-UConn/modular-ml","owner":"REIL-UConn","description":"Modular, fast, and reproducible ML experimentation built for R\u0026D.","archived":false,"fork":false,"pushed_at":"2026-02-24T15:45:20.000Z","size":12775,"stargazers_count":8,"open_issues_count":2,"forks_count":5,"subscribers_count":3,"default_branch":"main","last_synced_at":"2026-02-24T20:29:17.573Z","etag":null,"topics":["deep-learning","keras-tensorflow","machine-learning","python","pytorch","reproducible-research","research-and-development","scientific-machine-learning","scikit-learn"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/REIL-UConn.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":"ROADMAP.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-08-19T18:07:19.000Z","updated_at":"2026-02-24T15:46:56.000Z","dependencies_parsed_at":"2025-08-27T23:28:59.475Z","dependency_job_id":"66a207b3-8e7a-41d2-9526-11e442e61478","html_url":"https://github.com/REIL-UConn/modular-ml","commit_stats":null,"previous_names":["reil-uconn/modular-ml"],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/REIL-UConn/modular-ml","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/REIL-UConn%2Fmodular-ml","tags_url":"https://repos.ecosyste.ms/api/v
1/hosts/GitHub/repositories/REIL-UConn%2Fmodular-ml/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/REIL-UConn%2Fmodular-ml/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/REIL-UConn%2Fmodular-ml/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/REIL-UConn","download_url":"https://codeload.github.com/REIL-UConn/modular-ml/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/REIL-UConn%2Fmodular-ml/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30336058,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-10T12:41:07.687Z","status":"ssl_error","status_checked_at":"2026-03-10T12:41:06.728Z","response_time":106,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","keras-tensorflow","machine-learning","python","pytorch","reproducible-research","research-and-development","scientific-machine-learning","scikit-learn"],"created_at":"2026-03-10T14:00:25.554Z","updated_at":"2026-03-10T14:01:00.261Z","avatar_url":"https://github.com/REIL-UConn.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n\u003cdiv align=\"center\"\u003e\n\n[![ModularML Banner](docs/_static/logos/modularml_logo_banner.png)](https://github.com/REIL-UConn/modular-ml)\n\n**Modular, fast, and 
reproducible ML experimentation built for R\\\u0026D.**\n\n[![Python](https://img.shields.io/badge/Python-3.10%2B-blue.svg)](https://www.python.org/)\n[![PyPI](https://img.shields.io/pypi/v/modularml.svg)](https://pypi.org/project/modularml/)\n[![codecov](https://codecov.io/github/REIL-UConn/modular-ml/graph/badge.svg?token=Z063M1M6P3)](https://codecov.io/github/REIL-UConn/modular-ml)\n[![Docs](https://readthedocs.org/projects/modular-ml/badge/?version=latest)](https://modular-ml.readthedocs.io/en/latest/?badge=latest)\n[![License](https://img.shields.io/badge/License-Apache%202.0-orange.svg)](LICENSE)\n\n\u003c/div\u003e\n\n\nModularML is a flexible, backend-agnostic machine learning framework for designing, training, and evaluating machine learning pipelines, tailored specifically for research and scientific workflows.\nIt enables rapid experimentation with complex model architectures, supports domain-specific feature engineering, and provides full reproducibility through configuration-driven declaration.\n\n\u003e ModularML provides a plug-and-play ecosystem of interoperable components for data preprocessing, sampling, modeling, training, and evaluation — all wrapped in a unified experiment container.\n\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"docs/_static/figures/modularml_overview_diagram.png\" alt=\"ModularML Overview Diagram\" width=\"600\"/\u003e\n\u003c/p\u003e\n\u003cp align=\"center\"\u003e\u003cem\u003eFigure 1. Overview of the ModularML framework, highlighting the three core abstractions: feature set preprocessing and splitting, modular model graph construction, and staged training orchestration.\u003c/em\u003e\u003c/p\u003e\n\n\n\n## Key Concepts and Features\n\n### FeatureSet \u0026 FeatureSetView\n- **`FeatureSet`** is the primary user-facing container for structured data. 
It tracks features/targets/tags, reversible transforms, and named splits.\n- **`FeatureSetView`** gives a lightweight view into a FeatureSet (rows + selected columns) so you can feed exactly the slices required for a training phase.\n\n### Splitters \u0026 Samplers\n- Built-in **splitters** (e.g., random, rule-based) generate labeled splits from any FeatureSet.\n- **Samplers** consume FeatureSets or views and emit `BatchView`s in the shape required by the model. They support stratification, grouping, triplets/pairs, and custom roles so you can express experiment-specific batching without re-implementing the training loop.\n\n### Models \u0026 Wrappers\n- Use your own **PyTorch or TensorFlow models**, select from pre-existing templates, or wrap third-party estimators. ModularML provides backend wrappers (Torch, TensorFlow, scikit-learn) so any supported model exposes a consistent forward API and reports its backend.\n\n### ModelGraph and Node-based Connectivity\n- **`ModelNode`** attaches a wrapped model to an upstream FeatureSet or node and handles building, freezing, and optimizer wiring.\n- **`MergeNode`** (e.g., `ConcatNode`) combines outputs from multiple nodes when you need multi-branch architectures.\n- **`ModelGraph`** is the DAG that ties everything together. It resolves head/tail nodes, executes topological forward/backward passes, mixes backends, and lets you switch between stage-wise and global training with a single call.\n\n### AppliedLoss\n- **`AppliedLoss`** instances bind user-defined loss functions to nodes within the ModelGraph. 
They carry labels, weights, and node scopes so multi-objective training is easy to configure from a phase or experiment.\n\n### Experiment Phases\n- **`TrainPhase`** runs iterative training with your sampler schedule, losses, callbacks, and optimizer configuration.\n- **`FitPhase`** (single-pass) is ideal for algorithms that expect a one-shot `.fit()` (e.g., scikit-learn estimators) after upstream neural components are frozen.\n- **`EvalPhase`** executes forward passes and records losses/metrics on held-out splits without touching gradients.\n\n### Experiment Class\n- The **`Experiment`** binds FeatureSets, ModelGraph, and all phases. It owns execution order, logging, callbacks, and results objects so every run is reproducible. Execution strategies (e.g., cross validation) simply wrap an Experiment to replay the same plan across folds.\n\n### Serialization\n- A core focus of ModularML is reproducibility. To that end, all major classes (FeatureSets, ModelGraph, phases, experiments, losses, samplers, optimizers, callbacks) implement configuration/state serialization.\n- All model definitions, training/sampling logic, evaluation, etc. are structured under a single Experiment object, which can be exported and shared as a single `.mml` file.\n\n### Callbacks \u0026 Checkpointing\n- Built-in **callbacks** (EarlyStopping, Evaluation + metrics, custom progress hooks) plug directly into Train/Fit/Eval phases, allowing for fully flexible workflows while retaining a structured experiment API.\n- **Checkpointing** can be attached at any major experiment or training execution step to persist model weights, optimizer states, FeatureSet transforms, and sampler cursors, making restarts seamless.\n\n\n\n## Getting Started\n\nRequires Python \u003e= 3.10\n\n### Installation\nInstall from PyPI:\n```bash\npip install modularml\n```\n\nTo install the latest development version:\n```bash\npip install git+https://github.com/REIL-UConn/modular-ml.git\n```\n\n\n## Explore More\n- 
**[Explanation](https://modular-ml.readthedocs.io/en/latest/explanation/index.html)** – Conceptual material that explains why ModularML is structured the way it is.\n- **[How-To](https://modular-ml.readthedocs.io/en/latest/how_to/index.html)** – Deep dive on core components of the ModularML framework.\n- **[Tutorials](https://modular-ml.readthedocs.io/en/latest/tutorials/index.html)** – Explore complete walkthroughs of solving common machine learning tasks with ModularML.\n- **[API Reference](https://modular-ml.readthedocs.io/en/latest/reference/index.html)** – API reference, component explanations, configuration guides, and tutorials.\n- **[Discussions](https://github.com/REIL-UConn/modular-ml/discussions)** – Join the community, ask questions, suggest features, or share use cases.\n\n---\n\n\n\u003c!-- ## Cite ModularML\n\nIf you use ModularML in your research, please cite the following:\n\n```bibtex\n@misc{nowacki2025modularml,\n  author       = {The ModularML Team},\n  title        = {ModularML: Modular, fast, and reproducible ML experimentation built for R\u0026D.\n  },\n  year         = {2025},\n  note         = {https://github.com/REIL-UConn/modular-ml},\n} --\u003e\n\u003c!--\n## The Team\nModularML was initiated in 2025 by Ben Nowacki as part of graduate research at the University of Connecticut.\n\nThe project is community-driven and welcomes contributors interested in building modular, reproducible ML workflows for science and engineering. --\u003e\n\n## License\n**[Apache 2.0](https://github.com/REIL-UConn/modular-ml/license)**\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Freil-uconn%2Fmodular-ml","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Freil-uconn%2Fmodular-ml","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Freil-uconn%2Fmodular-ml/lists"}