{"id":13633369,"url":"https://github.com/RunLLM/aqueduct","last_synced_at":"2025-04-18T10:34:44.136Z","repository":{"id":37056659,"uuid":"496844646","full_name":"RunLLM/aqueduct","owner":"RunLLM","description":"Aqueduct is no longer being maintained. Aqueduct allows you to run LLM and ML workloads on any cloud infrastructure.","archived":false,"fork":false,"pushed_at":"2023-06-07T19:24:59.000Z","size":23117,"stargazers_count":521,"open_issues_count":11,"forks_count":19,"subscribers_count":8,"default_branch":"main","last_synced_at":"2025-04-14T04:55:10.248Z","etag":null,"topics":["ai","data","data-science","kubernetes","llm","llms","machine-learning","ml","ml-infrastructure","ml-monitoring","mlops","orchestration","python","python3"],"latest_commit_sha":null,"homepage":"https://aqueducthq.com","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/RunLLM.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2022-05-27T03:07:09.000Z","updated_at":"2025-03-13T05:25:15.000Z","dependencies_parsed_at":"2022-07-09T19:46:08.667Z","dependency_job_id":"7e84372b-0da2-40c5-a668-ede0e68c5643","html_url":"https://github.com/RunLLM/aqueduct","commit_stats":null,"previous_names":["aqueducthq/aqueduct"],"tags_count":48,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RunLLM%2Faqueduct","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RunLLM%2Faqueduct/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RunLLM%2Faqueduct/releases","manifests_url":"https:/
/repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RunLLM%2Faqueduct/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/RunLLM","download_url":"https://codeload.github.com/RunLLM/aqueduct/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249479058,"owners_count":21279187,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","data","data-science","kubernetes","llm","llms","machine-learning","ml","ml-infrastructure","ml-monitoring","mlops","orchestration","python","python3"],"created_at":"2024-08-01T23:00:35.241Z","updated_at":"2025-04-18T10:34:42.075Z","avatar_url":"https://github.com/RunLLM.png","language":"Go","funding_links":[],"categories":["Large Scale Deployment"],"sub_categories":["Workflow"],"readme":"\n\n\u003cdiv align=\"center\"\u003e\n  \u003ca href=\"https://aqueducthq.com\"\u003e\n    \u003cimg src=\"https://aqueduct-public-assets-bucket.s3.us-east-2.amazonaws.com/webapp/logos/aqueduct-logo-two-tone/1x/aqueduct-logo-two-tone-1x.png\" width=\"40%\" /\u003e\n  \u003c/a\u003e\n  \n  \u003ch2 style=\"border: 0px white;\"\u003eRun LLMs and ML on any cloud infrastructure\u003c/h2\u003e\n\n### 📢 [Slack](https://slack.aqueducthq.com)\u0026nbsp;\u0026nbsp;|\u0026nbsp;\u0026nbsp;🗺️ [Roadmap](https://roadmap.aqueducthq.com)\u0026nbsp;\u0026nbsp;|\u0026nbsp;\u0026nbsp;🐞 [Report a bug](https://github.com/aqueducthq/aqueduct/issues/new?assignees=\u0026labels=bug\u0026template=bug_report.md\u0026title=%5BBUG%5D)\u0026nbsp;\u0026nbsp;|\u0026nbsp;\u0026nbsp;✍️ 
[Blog](https://blog.aqueducthq.com)\n\n  \n[![Start Sandbox](https://img.shields.io/static/v1?label=%20\u0026logo=github\u0026message=Start%20Sandbox\u0026color=black)](https://github.com/codespaces/new?hide_repo_select=true\u0026ref=main\u0026repo=496844646)\n[![Downloads](https://pepy.tech/badge/aqueduct-ml/month)](https://pypi.org/project/aqueduct-ml/)\n[![Slack](https://img.shields.io/static/v1.svg?label=chat\u0026message=on%20slack\u0026color=27b1ff\u0026style=flat)](https://join.slack.com/t/aqueductusers/shared_invite/zt-11hby91cx-cpmgfK0qfXqEYXv25hqD6A)\n[![GitHub license](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://github.com/aqueducthq/aqueduct/blob/master/LICENSE)\n[![PyPI version](https://badge.fury.io/py/aqueduct-ml.svg)](https://pypi.org/project/aqueduct-ml/)\n[![Tests](https://github.com/aqueducthq/aqueduct/actions/workflows/integration-tests.yml/badge.svg)](https://github.com/aqueducthq/aqueduct/actions/workflows/integration-tests.yml)\n\u003c/div\u003e\n\n**Aqueduct is an MLOps framework that allows you to define and deploy machine learning and LLM workloads on any cloud infrastructure. [Check out our quickstart guide! →](https://docs.aqueducthq.com/quickstart-guide)**\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://user-images.githubusercontent.com/867892/230214641-b0aec53b-4988-4581-84ed-134f97ed9276.png\" width=\"80%\" /\u003e\n\u003c/p\u003e\n\nAqueduct is an open-source MLOps framework that allows you to write code in vanilla Python, run that code on any cloud infrastructure you'd like to use, and gain visibility into the execution and performance of your models and predictions. **[See what infrastructure Aqueduct works with. →](https://aqueducthq.com/integrations/)**\n\nHere's how you can get started: \n\n```bash\npip3 install aqueduct-ml\naqueduct start\n```\n\n### How it works\n\nAqueduct's Python native API allows you to define ML tasks in regular Python code. 
You can connect Aqueduct to your existing cloud infrastructure ([docs](https://docs.aqueducthq.com/integrations)), and Aqueduct will seamlessly move your code from your laptop to the cloud or between different cloud infrastructure layers. \n\n\u003c!--- TODO(vikram): Modify this once we add support for switching into/out of Databricks in a single workflow. ---\u003e\nFor example, we can define a pipeline that featurizes raw logs with an existing LLM and trains a model on Kubernetes using a GPU in a few lines of Python: \n\n```python\n# Use an existing LLM.\nvicuna = aq.llm_op('vicuna_7b', engine='eks-us-east-2')\nfeatures = vicuna(\n    raw_logs,\n    { \n        \"prompt\": \n        \"Turn this log entry into a CSV: {text}\" \n    }\n)\n\n# Or write a custom op on your favorite infrastructure!\n@op(\n  engine='kubernetes',\n  # Get a GPU.\n  resources={'gpu_resource_name': 'nvidia.com/gpu'}\n)\ndef train(featurized_logs):\n  return model.train(featurized_logs) # Train your model on the op's input.\n\ntrain(features)\n```\n\nOnce you publish this workflow to Aqueduct, you can see it in the UI: \n\n![image](https://github.com/aqueducthq/aqueduct/assets/867892/d0561772-8799-4046-92ae-3c975d70e47d)\n\nTo see how to build your first workflow, check out our **[quickstart guide! →](https://docs.aqueducthq.com/quickstart-guide)**\n\n## Why Aqueduct?\n\nMLOps has become a [tangled mess of siloed infrastructure](https://aqueducthq.com/post/the-mlops-knot/). Most teams need to set up and operate many different cloud infrastructure tools to run ML effectively, but these tools have disparate APIs and interoperate poorly.\n\nAqueduct provides a single interface for running machine learning tasks on your existing cloud infrastructure (Kubernetes, Spark, Lambda, etc.). 
From the same Python API, you can run code across any or all of these systems seamlessly and gain visibility into how your code is performing.\n\n* **Python-native pipeline API**: Aqueduct’s API allows you to define your workflows in vanilla Python, so you can get code into production quickly and effectively. No more DSLs or YAML configs to worry about.\n* **Integrated with your infrastructure**: Workflows defined in Aqueduct can run on any cloud infrastructure you use, like Kubernetes, Spark, Airflow, or AWS Lambda. You can get all the benefits of Aqueduct without having to rip and replace your existing tooling.\n* **Centralized visibility into code, data, \u0026 metadata**: Once your workflows are in production, you need to know what’s running, whether it’s working, and when it breaks. Aqueduct gives you visibility into what code, data, metrics, and metadata are generated by each workflow run, so you can have confidence that your pipelines work as expected, and know immediately when they don’t.\n* **Runs securely in your cloud**: Aqueduct is fully open-source and runs in any Unix environment. It runs entirely in your cloud and on your infrastructure, so you can be confident that your data and code are secure.\n\n## Overview \u0026 Examples\n\nThe core abstraction in Aqueduct is a [Workflow](https://docs.aqueducthq.com/workflows), which is a sequence of [Artifacts](https://docs.aqueducthq.com/artifacts) (data) that are transformed by [Operators](https://docs.aqueducthq.com/operators) (compute). \nThe input Artifact(s) for a Workflow are typically loaded from a database, and the output Artifact(s) are typically persisted back to a database. 
\nEach Workflow can either be run on a fixed schedule or triggered on-demand.\n\nTo see Aqueduct in action on some real-world machine learning workflows, check out some of our examples:\n\n* [Churn Ensemble](https://github.com/aqueducthq/aqueduct/blob/main/examples/churn_prediction/Customer%20Churn%20Prediction.ipynb)\n* [Sentiment Analysis](https://github.com/aqueducthq/aqueduct/blob/main/examples/sentiment-analysis/Sentiment%20Model.ipynb)\n* [Impute Missing Wine Data](https://github.com/aqueducthq/aqueduct/blob/main/examples/wine-ratings-prediction/Predict%20Missing%20Wine%20Ratings.ipynb)\n* ... [and more](https://github.com/aqueducthq/aqueduct/tree/main/examples)!\n\n## What's next?\n\nCheck out our [documentation](https://docs.aqueducthq.com/), where you'll find:\n* a [Quickstart Guide](https://docs.aqueducthq.com/quickstart-guide)\n* [example workflows](https://docs.aqueducthq.com/example-workflows)\n* and more details on [creating workflows](https://docs.aqueducthq.com/workflows)\n\nIf you have questions or comments or would like to learn more about what we're\nbuilding, please [reach out](mailto:hello@aqueducthq.com), [join our Slack\nchannel](https://join.slack.com/t/aqueductusers/shared_invite/zt-11hby91cx-cpmgfK0qfXqEYXv25hqD6A), or [start a conversation on GitHub](https://github.com/aqueducthq/aqueduct/issues/new).\nWe'd love to hear from you!\n\nIf you're interested in contributing, please check out our [roadmap](https://roadmap.aqueducthq.com) and join the development channel in [our community Slack](https://slack.aqueducthq.com).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FRunLLM%2Faqueduct","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FRunLLM%2Faqueduct","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FRunLLM%2Faqueduct/lists"}