{"id":13780152,"url":"https://github.com/raystack/dagger","last_synced_at":"2025-04-06T09:10:44.282Z","repository":{"id":36970798,"uuid":"350432682","full_name":"raystack/dagger","owner":"raystack","description":"Dagger is an easy-to-use, configuration over code, cloud-native framework built on top of Apache Flink for stateful processing of real-time streaming data. ","archived":false,"fork":false,"pushed_at":"2023-08-29T09:06:55.000Z","size":12570,"stargazers_count":256,"open_issues_count":22,"forks_count":40,"subscribers_count":16,"default_branch":"main","last_synced_at":"2024-05-18T22:04:42.637Z","etag":null,"topics":["apache-flink","apache-kafka","dataops","framework","influxdb","prometheus","real-time-analytics","real-time-processing","stream-processing"],"latest_commit_sha":null,"homepage":"https://raystack.github.io/dagger/","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/raystack.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2021-03-22T17:33:10.000Z","updated_at":"2024-05-16T15:03:41.000Z","dependencies_parsed_at":"2024-01-07T00:08:39.659Z","dependency_job_id":null,"html_url":"https://github.com/raystack/dagger","commit_stats":null,"previous_names":["raystack/dagger","odpf/dagger"],"tags_count":26,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/raystack%2Fdagger","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/raystack%2Fdagger/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/raystack%2Fdagger/releases","manifests_ur
l":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/raystack%2Fdagger/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/raystack","download_url":"https://codeload.github.com/raystack/dagger/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247423513,"owners_count":20936626,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["apache-flink","apache-kafka","dataops","framework","influxdb","prometheus","real-time-analytics","real-time-processing","stream-processing"],"created_at":"2024-08-03T18:01:12.809Z","updated_at":"2025-04-06T09:10:44.266Z","avatar_url":"https://github.com/raystack.png","language":"Java","readme":"# Dagger\n\n![build workflow](https://github.com/raystack/dagger/actions/workflows/build.yml/badge.svg)\n![package workflow](https://github.com/raystack/dagger/actions/workflows/package.yml/badge.svg)\n[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg?logo=apache)](LICENSE)\n[![Version](https://img.shields.io/github/v/release/raystack/dagger?logo=semantic-release)](https://github.com/raystack/dagger/releases/latest)\n\nDagger or Data Aggregator is an easy-to-use, configuration over code, cloud-native framework built on top of Apache Flink\nfor stateful processing of data. With Dagger, you don't need to write custom applications or complicated code to process\ndata as a stream. 
Instead, you can write SQL queries and UDFs to do the processing and analysis on streaming data.\n\n![](docs/static/img/overview/dagger_overview.png)\n\n## Key Features\n\nReasons to use Dagger:\n\n- **Processing:** Dagger can transform, aggregate, join and enrich streaming data, both real-time and historical.\n- **Scale:** Dagger scales in an instant, both vertically and horizontally for high-performance streaming sink and zero data drops.\n- **Extensibility:** Add your own sink to Dagger with a clearly defined interface or choose from the ones already provided. Use Kafka and/or Parquet Files as stream sources.\n- **Flexibility:** Add custom business logic in the form of plugins \\(UDFs, Transformers, Preprocessors and Post Processors\\) independent of the core logic.\n- **Metrics:** Always know what’s going on with your deployment with built-in [monitoring](https://raystack.github.io/dagger/docs/reference/metrics) of throughput, response times, errors and more.\n\n## What problems does Dagger solve?\n\n- Map reduce -\u003e [SQL](https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/table/sql.html)\n- Enrichment -\u003e [Post Processors](https://raystack.github.io/dagger/docs/advance/post_processor)\n- Aggregation -\u003e [SQL](https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/table/sql.html), [UDFs](https://raystack.github.io/dagger/docs/guides/use_udf)\n- Masking -\u003e [Hash Transformer](https://raystack.github.io/dagger/docs/reference/transformers#HashTransformer)\n- Deduplication -\u003e [Deduplication Transformer](https://raystack.github.io/dagger/docs/reference/transformers#DeDuplicationTransformer)\n- Realtime long window processing -\u003e [Longbow](https://raystack.github.io/dagger/docs/advance/longbow)\n\nTo learn more, see the detailed [documentation](https://raystack.github.io/dagger/).\n\n## Usage\n\nExplore the following resources to get started with Dagger:\n\n- [Guides](https://raystack.github.io/dagger/docs/guides/overview) 
provides guidance on [creating Dagger](https://raystack.github.io/dagger/docs/guides/create_dagger) with different sinks.\n- [Concepts](https://raystack.github.io/dagger/docs/concepts/overview) describes all important Dagger concepts.\n- [Advance](https://raystack.github.io/dagger/docs/advance/overview) contains details regarding advanced features of Dagger.\n- [Reference](https://raystack.github.io/dagger/docs/reference/overview) contains details about configurations, metrics and other aspects of Dagger.\n- [Contribute](https://raystack.github.io/dagger/docs/contribute/contribution) contains resources for anyone who wants to contribute to Dagger.\n- [Usecase](https://raystack.github.io/dagger/docs/usecase/overview) describes example use cases that can be solved with Dagger.\n- [Examples](https://raystack.github.io/dagger/docs/examples/overview) contains tutorials to try out some of Dagger's features with real-world use cases.\n\n## Running locally\n\nFollow the [Dagger Quickstart Guide](https://raystack.github.io/dagger/docs/guides/quickstart) to set up a locally running Dagger that consumes from Kafka, or to set up Docker Compose for Dagger.\n\n**Note:** Sample configuration for running a basic Dagger can be found [here](https://raystack.github.io/dagger/docs/guides/create_dagger#common-configurations). 
For detailed configurations, see the [configuration reference](https://raystack.github.io/dagger/docs/reference/configuration).\n\nFind more detailed steps on local setup [here](https://raystack.github.io/dagger/docs/guides/create_dagger).\n\n## Running on cluster\n\nSee the [deployment guide](https://raystack.github.io/dagger/docs/guides/deployment) for details regarding Dagger deployment.\n\n## Running tests\n\n```sh\n# Run unit tests\n$ ./gradlew clean test\n\n# Run code quality checks\n$ ./gradlew checkstyleMain checkstyleTest\n\n# Clean the build\n$ ./gradlew clean\n```\n\n## Contribute\n\nDevelopment of Dagger happens in the open on GitHub, and we are grateful to the community for contributing bug fixes and improvements. Read below to learn how you can take part in improving Dagger.\n\nRead our [contributing guide](https://raystack.github.io/dagger/docs/contribute/contribution) to learn about our development process, how to propose bug fixes and improvements, and how to build and test your changes to Dagger.\n\nTo help you get your feet wet and become familiar with our contribution process, we have a list of [good first issues](https://github.com/raystack/dagger/labels/good%20first%20issue) that contain bugs with a relatively limited scope. This is a great place to get started.\n\n## Credits\n\nThis project exists thanks to all the [contributors](https://github.com/raystack/dagger/graphs/contributors).\n\n## License\n\nDagger is [Apache 2.0](LICENSE) licensed.\n","funding_links":[],"categories":["Tools","云原生"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fraystack%2Fdagger","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fraystack%2Fdagger","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fraystack%2Fdagger/lists"}