{"id":13740785,"url":"https://github.com/sematic-ai/sematic","last_synced_at":"2025-05-14T09:07:32.254Z","repository":{"id":41344042,"uuid":"483425714","full_name":"sematic-ai/sematic","owner":"sematic-ai","description":"An open-source ML pipeline development platform","archived":false,"fork":false,"pushed_at":"2025-01-09T17:56:08.000Z","size":21208,"stargazers_count":990,"open_issues_count":131,"forks_count":63,"subscribers_count":12,"default_branch":"main","last_synced_at":"2025-05-11T20:43:00.410Z","etag":null,"topics":["ai","data-science","machine-learning","ml","ml-ops","ml-pipeline","ml-pipelines","mlops","pipeline","python","python3"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sematic-ai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"docs/code-of-conduct.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-04-19T22:16:33.000Z","updated_at":"2025-05-01T02:34:43.000Z","dependencies_parsed_at":"2024-03-08T21:23:01.843Z","dependency_job_id":"03020fe9-f566-4c34-ac88-d286c3a73437","html_url":"https://github.com/sematic-ai/sematic","commit_stats":null,"previous_names":[],"tags_count":61,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sematic-ai%2Fsematic","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sematic-ai%2Fsematic/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sematic-ai%2Fsematic/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sematic
-ai%2Fsematic/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sematic-ai","download_url":"https://codeload.github.com/sematic-ai/sematic/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254110374,"owners_count":22016391,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","data-science","machine-learning","ml","ml-ops","ml-pipeline","ml-pipelines","mlops","pipeline","python","python3"],"created_at":"2024-08-03T04:00:52.171Z","updated_at":"2025-05-14T09:07:32.198Z","avatar_url":"https://github.com/sematic-ai.png","language":"Python","funding_links":[],"categories":["Model Training and Orchestration","🚀 MLOps"],"sub_categories":["Tools"],"readme":"![Sematic Logo](https://raw.githubusercontent.com/sematic-ai/sematic/main/docs/images/Logo_README.png)\n\n\u003ch2 align=\"center\"\u003eThe open-source Continuous Machine Learning Platform\u003c/h2\u003e\n\n\u003ch3 align=\"center\"\u003eBuild ML pipelines with only Python, run on your laptop, or in the cloud.\u003c/h3\u003e\n\n![PyPI](https://img.shields.io/pypi/v/sematic/0.41.0?style=for-the-badge)\n[![CircleCI](https://img.shields.io/circleci/build/github/sematic-ai/sematic/main?label=CircleCI\u0026style=for-the-badge\u0026token=60d1953bfee5b6bf8201f8e84a10eaa5bf5622fe)](https://app.circleci.com/pipelines/github/sematic-ai/sematic?branch=main\u0026filter=all)\n![PyPI - License](https://img.shields.io/pypi/l/sematic?style=for-the-badge)\n[![Python 
3.9](https://img.shields.io/badge/Python-3.9-blue?style=for-the-badge\u0026logo=none)](https://python.org)\n[![Python 3.10](https://img.shields.io/badge/Python-3.10-blue?style=for-the-badge\u0026logo=none)](https://python.org)\n[![Python 3.11](https://img.shields.io/badge/Python-3.11-blue?style=for-the-badge\u0026logo=none)](https://python.org)\n[![Python 3.12](https://img.shields.io/badge/Python-3.12-blue?style=for-the-badge\u0026logo=none)](https://python.org)\n[![Python 3.13](https://img.shields.io/badge/Python-3.13-blue?style=for-the-badge\u0026logo=none)](https://python.org)\n![Discord](https://img.shields.io/discord/983789877927747714?label=DISCORD\u0026style=for-the-badge)\n[![Made By Sematic](https://img.shields.io/badge/Made_by-Sematic_🦊-E19632?style=for-the-badge\u0026logo=none)](https://sematic.dev)\n![PyPI - Downloads](https://img.shields.io/pypi/dm/sematic?style=for-the-badge)\n\n![Sematic Screenshot](https://raw.githubusercontent.com/sematic-ai/sematic/main/docs/images/Screenshot_README_2.png)\n\n[Sematic](https://sematic.dev) is an open-source ML development platform. It\nlets ML Engineers and Data Scientists write arbitrarily complex end-to-end\npipelines with simple Python and execute them on their local machine, in a cloud\nVM, or on a Kubernetes cluster to leverage cloud resources.\n\nSematic is based on learnings gathered at top self-driving car companies. It\nenables chaining data processing jobs (e.g. Apache Spark) with model training\n(e.g. 
PyTorch, TensorFlow), or any other arbitrary Python business logic into\ntype-safe, traceable, reproducible end-to-end pipelines that can be monitored\nand visualized in a modern web dashboard.\n\nRead our [documentation](https://docs.sematic.dev) and join our [Discord\nchannel](https://discord.gg/4KZJ6kYVax).\n\n## Why Sematic\n\n- **Easy onboarding** – no deployment or infrastructure needed to get started,\n  simply install Sematic locally and start exploring.\n- **Local-to-cloud parity** – run the same code on your laptop and on your\n  Kubernetes cluster.\n- **End-to-end traceability** – all pipeline artifacts are persisted, tracked,\n  and visualizable in a web dashboard.\n- **Access heterogeneous compute** – customize required resources for each\n  pipeline step to optimize your performance and cloud footprint (CPUs, memory,\n  GPUs, Spark cluster, etc.).\n- **Reproducibility** – rerun your pipelines from the UI with guaranteed\n  reproducibility of results.\n\n## Getting Started\n\nTo get started locally, simply install Sematic in your Python environment:\n\n```shell\n$ pip install sematic\n```\n\nStart the local web dashboard:\n\n```shell\n$ sematic start\n```\n\nRun an example pipeline:\n\n```shell\n$ sematic run examples/mnist/pytorch\n```\n\nCreate a new boilerplate project:\n\n```shell\n$ sematic new my_new_project\n```\n\nOr from an existing example:\n\n```shell\n$ sematic new my_new_project --from examples/mnist/pytorch\n```\n\nThen run it with:\n\n```shell\n$ python3 -m my_new_project\n```\n\nTo deploy Sematic to Kubernetes and leverage cloud resources, see our\n[documentation](https://docs.sematic.dev).\n\n## Features\n\n- **Lightweight Python SDK** – define arbitrarily complex end-to-end pipelines\n- **Pipeline nesting** – arbitrarily nest pipelines into larger pipelines\n- **Dynamic graphs** – Python-defined graphs allow for iterations, conditional\n  branching, etc.\n- **Lineage tracking** – all inputs and outputs of all steps are persisted 
and\n  tracked\n- **Runtime type-checking** – fail early with runtime type checking\n- **Web dashboard** – monitor, track, and visualize pipelines in a modern web UI\n- **Artifact visualization** – visualize all inputs and outputs of all steps in\n  the web dashboard\n- **Local execution** – run pipelines on your local machine without any\n  deployment necessary\n- **Cloud orchestration** – run pipelines on Kubernetes to access GPUs and other\n  cloud resources\n- **Heterogeneous compute resources** – run different steps on different\n  machines (CPUs, memory, GPUs, Spark, etc.)\n- **Helm chart deployment** – install Sematic on your Kubernetes cluster\n- **Pipeline reruns** – rerun pipelines from the UI from an arbitrary point in\n  the graph\n- **Step caching** – cache expensive pipeline steps for faster iteration\n- **Step retry** – recover from transient failures with step retries\n- **Metadata and collaboration** – tags, source code visualization, docstrings,\n  notes, etc.\n- **Numerous integrations** – see below\n\n## Integrations\n\n- **Apache Spark** – on-demand in-cluster Spark cluster\n- **Ray** – on-demand in-cluster Ray resources\n- **Snowflake** – easily query your data warehouse (other warehouses supported\n  too)\n- **Plotly, Matplotlib** – visualize plot artifacts in the web dashboard\n- **Pandas** – visualize dataframe artifacts in the dashboard\n- **Grafana** – embed Grafana panels in the web dashboard\n- **Bazel** – integrate with your Bazel build system\n- **Helm chart** – deploy to Kubernetes with our Helm chart\n- **Git** – track git information in the web dashboard\n\n## Community and resources\n\nLearn more about Sematic and get in touch with the following resources:\n\n- [Sematic landing page](https://sematic.dev)\n- [Documentation](https://docs.sematic.dev)\n- [Discord channel](https://discord.gg/4KZJ6kYVax)\n- [YouTube channel](https://www.youtube.com/@sematic-ai)\n- [Our Blog](https://sematic.dev/blog)\n\n## Contribute!\n\nTo 
contribute to Sematic, check out [open issues tagged \"good first\nissue\"](https://github.com/sematic-ai/sematic/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22),\nand get in touch with us on [Discord](https://discord.gg/4KZJ6kYVax).\nYou can find instructions on how to get your development environment set up\nin our [developer docs](./developer-docs/README.md). If you'd like to add\nan example, you may also find\n[this guide](https://docs.sematic.dev/project/contributor-guide/contribute-example)\nhelpful.\n\n![scarf pixel](https://static.scarf.sh/a.png?x-pxid=80c3593f-25a0-4b06-90a1-0b670a6567d4)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsematic-ai%2Fsematic","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsematic-ai%2Fsematic","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsematic-ai%2Fsematic/lists"}