{"id":13414599,"url":"https://github.com/apache/flink-cdc","last_synced_at":"2025-05-12T13:17:45.424Z","repository":{"id":36956963,"uuid":"282994686","full_name":"apache/flink-cdc","owner":"apache","description":"Flink CDC is a streaming data integration tool","archived":false,"fork":false,"pushed_at":"2025-04-29T11:47:03.000Z","size":42912,"stargazers_count":6051,"open_issues_count":78,"forks_count":2012,"subscribers_count":133,"default_branch":"master","last_synced_at":"2025-05-12T13:16:52.947Z","etag":null,"topics":["batch","cdc","change-data-capture","data-integration","data-pipeline","distributed","elt","etl","flink","kafka","mysql","paimon","postgresql","real-time","schema-evolution"],"latest_commit_sha":null,"homepage":"https://nightlies.apache.org/flink/flink-cdc-docs-stable","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/apache.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2020-07-27T19:23:33.000Z","updated_at":"2025-05-08T03:52:38.000Z","dependencies_parsed_at":"2023-09-26T10:00:41.204Z","dependency_job_id":"350b8fc3-feb8-4215-b88f-4aaac2eb69d6","html_url":"https://github.com/apache/flink-cdc","commit_stats":{"total_commits":1100,"total_committers":180,"mean_commits":6.111111111111111,"dds":0.8963636363636364,"last_synced_commit":"a5b666a3254b87b44b9a3843a4d001793e86552c"},"previous_names":["apache/flink-cdc","ververica/flink-cdc-connectors"],"tags_count":38,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apac
he%2Fflink-cdc","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fflink-cdc/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fflink-cdc/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fflink-cdc/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/apache","download_url":"https://codeload.github.com/apache/flink-cdc/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253745196,"owners_count":21957319,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["batch","cdc","change-data-capture","data-integration","data-pipeline","distributed","elt","etl","flink","kafka","mysql","paimon","postgresql","real-time","schema-evolution"],"created_at":"2024-07-30T21:00:30.062Z","updated_at":"2025-05-12T13:17:45.318Z","avatar_url":"https://github.com/apache.png","language":"Java","readme":"\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://nightlies.apache.org/flink/flink-cdc-docs-stable/\"\u003e\u003cimg src=\"docs/static/fig/flinkcdc-logo.png\" alt=\"Flink CDC\" style=\"width: 375px;\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\u003cp align=\"center\"\u003e\n\u003ca href=\"https://github.com/apache/flink-cdc/\" target=\"_blank\"\u003e\n    \u003cimg src=\"https://img.shields.io/github/stars/apache/flink-cdc?style=social\u0026label=Star\u0026maxAge=2592000\" alt=\"Test\"\u003e\n\u003c/a\u003e\n\u003ca href=\"https://github.com/apache/flink-cdc/releases\" target=\"_blank\"\u003e\n    \u003cimg 
src=\"https://img.shields.io/github/v/release/apache/flink-cdc?color=yellow\" alt=\"Release\"\u003e\n\u003c/a\u003e\n\u003ca href=\"https://github.com/apache/flink-cdc/actions/workflows/flink_cdc_ci.yml\" target=\"_blank\"\u003e\n    \u003cimg src=\"https://img.shields.io/github/actions/workflow/status/apache/flink-cdc/flink_cdc_ci.yml?branch=master\" alt=\"Build\"\u003e\n\u003c/a\u003e\n\u003ca href=\"https://github.com/apache/flink-cdc/actions/workflows/flink_cdc_ci_nightly.yml\" target=\"_blank\"\u003e\n    \u003cimg src=\"https://img.shields.io/github/actions/workflow/status/apache/flink-cdc/flink_cdc_ci_nightly.yml?branch=master\u0026label=nightly\" alt=\"Nightly Build\"\u003e\n\u003c/a\u003e\n\u003ca href=\"https://github.com/apache/flink-cdc/tree/master/LICENSE\" target=\"_blank\"\u003e\n    \u003cimg src=\"https://img.shields.io/static/v1?label=license\u0026message=Apache License 2.0\u0026color=white\" alt=\"License\"\u003e\n\u003c/a\u003e\n\u003c/p\u003e\n\n\nFlink CDC is a distributed data integration tool for real-time data and batch data. Flink CDC brings simplicity \nand elegance to data integration by using YAML to describe data movement and transformation in a \n[Data Pipeline](docs/content/docs/core-concept/data-pipeline.md).\n\n\nFlink CDC prioritizes efficient end-to-end data integration and offers enhanced functionality such as \nfull database synchronization, sharding table synchronization, schema evolution, and data transformation.\n\n![Flink CDC framework design](docs/static/fig/architecture.png)\n\n### Quickstart Guide\n\nFlink CDC provides a CdcUp CLI utility to start a playground environment and run Flink CDC jobs.\nYou will need a working Docker and Docker Compose environment to use it.\n\n1. Run `git clone https://github.com/apache/flink-cdc.git --depth=1` to retrieve a copy of the Flink CDC source code.\n2. Run `cd tools/cdcup/ \u0026\u0026 ./cdcup.sh init` to use the CdcUp tool to start a playground environment.\n3. 
Run `./cdcup.sh up` to boot up the Docker containers, and wait for them to be ready.\n4. Run `./cdcup.sh mysql` to open a MySQL session, and create at least one table.\n\n```sql\n-- initialize db and table\nCREATE DATABASE cdc_playground;\nUSE cdc_playground;\nCREATE TABLE test_table (id INT PRIMARY KEY, name VARCHAR(32));\n\n-- insert test data\nINSERT INTO test_table VALUES (1, 'alice'), (2, 'bob'), (3, 'cicada'), (4, 'derrida');\n\n-- verify that the rows were inserted\nSELECT * FROM test_table;\n```\n\n5. Run `./cdcup.sh pipeline pipeline-definition.yaml` to submit the pipeline job. You may also edit the pipeline definition file for further configuration.\n6. Run `./cdcup.sh flink` to access the Flink Web UI.\n\n### Getting Started\n\n1. Prepare an [Apache Flink](https://nightlies.apache.org/flink/flink-docs-master/docs/try-flink/local_installation/#starting-and-stopping-a-local-cluster) cluster and set the `FLINK_HOME` environment variable.\n2. [Download](https://github.com/apache/flink-cdc/releases) the Flink CDC tarball, extract it, and put the pipeline connector JARs in the Flink `lib` directory.\n\n\u003e If you're using macOS or Linux, you may use `brew install apache-flink-cdc` to install Flink CDC and compatible connectors quickly.\n\n3. 
Create a **YAML** file to describe the data source and data sink. The following example synchronizes all tables in the MySQL `app_db` database to Doris:\n  ```yaml\n   source:\n     type: mysql\n     hostname: localhost\n     port: 3306\n     username: root\n     password: 123456\n     tables: app_db.\\.*\n\n   sink:\n     type: doris\n     fenodes: 127.0.0.1:8030\n     username: root\n     password: \"\"\n\n   transform:\n     - source-table: adb.web_order01\n       projection: \\*, format('%S', product_name) as product_name\n       filter: addone(id) \u003e 10 AND order_id \u003e 100\n       description: project fields and filter\n     - source-table: adb.web_order02\n       projection: \\*, format('%S', product_name) as product_name\n       filter: addone(id) \u003e 20 AND order_id \u003e 200\n       description: project fields and filter\n\n   route:\n     - source-table: app_db.orders\n       sink-table: ods_db.ods_orders\n     - source-table: app_db.shipments\n       sink-table: ods_db.ods_shipments\n     - source-table: app_db.products\n       sink-table: ods_db.ods_products\n\n   pipeline:\n     name: Sync MySQL Database to Doris\n     parallelism: 2\n     user-defined-function:\n       - name: addone\n         classpath: com.example.functions.AddOneFunctionClass\n       - name: format\n         classpath: com.example.functions.FormatFunctionClass\n  ```\n4. Submit the pipeline job using the `flink-cdc.sh` script.\n ```shell\n  bash bin/flink-cdc.sh /path/mysql-to-doris.yaml\n ```\n5. View the job execution status through the Flink Web UI or in the downstream database.\n\nTry it out yourself with our more detailed [tutorial](docs/content/docs/get-started/quickstart/mysql-to-doris.md). 
\nYou can also see the [connector overview](docs/content/docs/connectors/pipeline-connectors/overview.md) for a comprehensive catalog of the\nconnectors currently provided and their detailed configuration options.\n\n### Join the Community\n\nThere are many ways to participate in the Apache Flink CDC community. The\n[mailing lists](https://flink.apache.org/what-is-flink/community/#mailing-lists) are the primary place where all Flink\ncommitters are present. For user support and questions, use the user mailing list. If you've found a problem with Flink CDC,\nplease create a [Flink Jira](https://issues.apache.org/jira/projects/FLINK/summary) issue and tag it with the `Flink CDC` tag.   \nBugs and feature requests can be discussed either on the dev mailing list or on Jira.\n\n\n\n### Contributing\n\nContributions to Flink CDC are welcome. Please see our [Developer Guide](docs/content/docs/developer-guide/contribute-to-flink-cdc.md)\nand [APIs Guide](docs/content/docs/developer-guide/understand-flink-cdc-api.md).\n\n\n\n### License\n\n[Apache 2.0 License](LICENSE).\n\n\n\n### Special Thanks\n\nThe Flink CDC community welcomes everyone who is willing to contribute, whether it's through submitting bug reports,\nenhancing the documentation, or submitting code contributions for bug fixes, test additions, or new feature development.     \nThanks to all contributors for their enthusiastic contributions.\n\n\u003ca href=\"https://github.com/apache/flink-cdc/graphs/contributors\"\u003e\n  \u003cimg src=\"https://contrib.rocks/image?repo=apache/flink-cdc\"/\u003e\n\u003c/a\u003e\n","funding_links":[],"categories":["Ingestion","Java","大数据"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapache%2Fflink-cdc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fapache%2Fflink-cdc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapache%2Fflink-cdc/lists"}