{"id":28763629,"url":"https://github.com/flink-extended/ai-flow","last_synced_at":"2025-06-17T09:10:02.138Z","repository":{"id":37402626,"uuid":"417022077","full_name":"flink-extended/ai-flow","owner":"flink-extended","description":"AI Flow is an open source framework that bridges big data and artificial intelligence. ","archived":false,"fork":false,"pushed_at":"2022-10-09T06:01:06.000Z","size":57760,"stargazers_count":176,"open_issues_count":12,"forks_count":36,"subscribers_count":11,"default_branch":"master","last_synced_at":"2025-06-15T18:19:13.962Z","etag":null,"topics":["machine-learning-workflow","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/flink-extended.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-10-14T06:57:21.000Z","updated_at":"2025-06-09T06:41:19.000Z","dependencies_parsed_at":"2022-08-08T20:15:34.995Z","dependency_job_id":null,"html_url":"https://github.com/flink-extended/ai-flow","commit_stats":null,"previous_names":[],"tags_count":7,"template":false,"template_full_name":null,"purl":"pkg:github/flink-extended/ai-flow","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flink-extended%2Fai-flow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flink-extended%2Fai-flow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flink-extended%2Fai-flow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flink-extended%2Fai-flow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/flink-ex
tended","download_url":"https://codeload.github.com/flink-extended/ai-flow/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flink-extended%2Fai-flow/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":260326793,"owners_count":22992388,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["machine-learning-workflow","python"],"created_at":"2025-06-17T09:10:00.562Z","updated_at":"2025-06-17T09:10:02.102Z","avatar_url":"https://github.com/flink-extended.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# AIFlow\n\n[![CI](https://github.com/flink-extended/ai-flow/actions/workflows/flink_ai_flow_ci.yml/badge.svg)](https://github.com/flink-extended/ai-flow/actions/workflows/flink_ai_flow_ci.yml)\n[![codecov](https://codecov.io/gh/flink-extended/ai-flow/branch/master/graph/badge.svg?token=ISWZNXUYO5)](https://codecov.io/gh/flink-extended/ai-flow)\n\n## Introduction\n\nAIFlow is an event-based workflow orchestration platform that allows users to\nprogrammatically author and schedule workflows with a mixture of streaming and\nbatch tasks.\n\nMost existing workflow orchestration platforms (e.g. Apache Airflow, Kubeflow)\nschedule task executions based on the status changes of upstream task\nexecutions. While this approach works well for batch tasks that are guaranteed\nto end, it does not work well for streaming tasks, which might run for an\ninfinite amount of time without status changes. 
AIFlow is proposed to facilitate\nthe orchestration of workflows involving streaming tasks.\n\nFor example, users might want to run a Flink streaming job continuously to\nassemble training data, and start a machine learning training job every time the\nFlink job has processed all upstream data for the past hour. In order to\nschedule this workflow using a non-event-based workflow orchestration platform,\nusers need to schedule the training job periodically based on wall-clock time. If\nthere is a traffic spike or an upstream job failure, then the Flink job might not\nhave processed the expected amount of upstream data by the time the TensorFlow\njob starts. The training job must then either keep waiting, fail fast, or\nprocess partial data, none of which is ideal. In comparison, AIFlow provides\nAPIs for the Flink job to emit an event every time its event-based watermark\nincrements by an hour, which triggers the execution of the user-specified training\njob, without suffering from the issues described above.\n\nLearn more about AIFlow at https://ai-flow.readthedocs.io\n\n## Features\n\n1. Event-driven: AIFlow schedules workflows and jobs based on events. This is more efficient than status-driven scheduling and makes it possible to schedule workflows that contain streaming jobs.\n\n2. Extensible: Users can easily define their own operators and executors to submit various types of tasks to different platforms.\n\n3. 
Exactly-once: AIFlow provides an event processing mechanism with exactly-once semantics, which means that your tasks will never be missed or repeated even if a failover occurs.\n\n## Articles on AIFlow\n\n- [Bilibili's practice of unified batch and stream scheduling with AIFlow + Flink (in Chinese)](https://mp.weixin.qq.com/s/XC7BjfrbOrtFmumwhmctbA)\n\n\n## Contributing\n\nWe welcome contributions to AIFlow in any form, whether reporting problems, proposing features, or contributing code changes.\nYou can report problems or request features in [GitHub Issues](https://github.com/flink-extended/ai-flow/issues).\nIf you want to contribute code changes, please check out the [contributing documentation](./CONTRIBUTING.md).\n\n\n## Contact Us\n\nFor more information, we recommend joining the **AIFlow Community Group** on [Google Groups](https://groups.google.com) to contact us: **aiflow@googlegroups.com**.\n\nYou can also join the group on [DingTalk](https://www.dingtalk.com). The DingTalk group number is `35876083`; you can also join by scanning the QR code below:\n\n![](https://raw.githubusercontent.com/wiki/alibaba/flink-ai-extended/images/dingtalk_qr_code.png)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fflink-extended%2Fai-flow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fflink-extended%2Fai-flow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fflink-extended%2Fai-flow/lists"}