{"id":20162761,"url":"https://github.com/airscholar/changecapture-e2e","last_synced_at":"2025-10-29T20:41:51.281Z","repository":{"id":234048211,"uuid":"724037612","full_name":"airscholar/changecapture-e2e","owner":"airscholar","description":"This project shows how to capture changes from postgres database and stream them into kafka","archived":false,"fork":false,"pushed_at":"2024-05-17T03:40:33.000Z","size":599,"stargazers_count":36,"open_issues_count":4,"forks_count":20,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-03-24T02:22:01.738Z","etag":null,"topics":["apache-spark","cdc","debezium","docker","kafka","postgres","zookeeper"],"latest_commit_sha":null,"homepage":"https://youtu.be/IocW3KnMFyI","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/airscholar.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2023-11-27T09:15:09.000Z","updated_at":"2025-03-16T19:57:20.000Z","dependencies_parsed_at":"2024-04-18T02:57:13.294Z","dependency_job_id":"0b12fcf3-4341-4db4-82d1-9d296a998dab","html_url":"https://github.com/airscholar/changecapture-e2e","commit_stats":null,"previous_names":["airscholar/changecapture-e2e"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/airscholar%2Fchangecapture-e2e","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/airscholar%2Fchangecapture-e2e/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/airscholar%2Fchangecapture-e2e/releases","manifests_url":"https://repos.ecosyste.ms/
api/v1/hosts/GitHub/repositories/airscholar%2Fchangecapture-e2e/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/airscholar","download_url":"https://codeload.github.com/airscholar/changecapture-e2e/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248137985,"owners_count":21053772,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["apache-spark","cdc","debezium","docker","kafka","postgres","zookeeper"],"created_at":"2024-11-14T00:26:44.464Z","updated_at":"2025-10-29T20:41:46.248Z","avatar_url":"https://github.com/airscholar.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# CDC with Debezium, Kafka, Postgres, Docker \n\n## Overview\n\nThis Python script is designed to generate simulated financial transactions and insert them into a PostgreSQL database. It's particularly useful for setting up a test environment for Change Data Capture (CDC) with Debezium. The script uses the `faker` library to create realistic, yet fictitious, transaction data and inserts it into a PostgreSQL table.\n\n## System Architecture\n![system architecture.png](system%20architecture.png)\n\n## Prerequisites\n\nBefore running this script, ensure you have the following installed:\n- Python 3.9+\n- `psycopg2` library for Python\n- `faker` library for Python\n- PostgreSQL server running locally or accessible remotely\n- Docker and Docker Compose installed on your machine.\n- Basic understanding of Docker, Kafka, and Postgres.\n\n## Installation\n\n1. 
**Install Required Python Libraries:**\n\n   You can install the required libraries using pip:\n\n   ```bash\n   pip install psycopg2-binary faker\n   ```\n\n## Services in the Compose File\n\n- **Zookeeper:** A centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.\n- **Kafka Broker:** A distributed streaming platform that is used here for handling real-time data feeds.\n- **Confluent Control Center:** A web-based tool for managing and monitoring Apache Kafka.\n- **Debezium:** An open-source distributed platform for change data capture.\n- **Debezium UI:** A user interface for managing and monitoring Debezium connectors.\n- **Postgres:** An open-source relational database.\n\n## Getting Started\n\n1. **Clone the Repository:**\n   Ensure you have this Docker Compose file on your local system. If it's part of a repository, clone the repository to your local machine.\n\n2. **Navigate to the Directory:**\n   Open a terminal and navigate to the directory containing the Docker Compose file.\n\n3. **Run Docker Compose:**\n   Execute the following command to start all services defined in the Docker Compose file:\n\n   ```bash\n   docker-compose up -d\n   ```\n\n   This command will download the necessary Docker images, create containers, and start the services in detached mode.\n\n4. **Verify the Services:**\n   Check if all the services are up and running:\n\n   ```bash\n   docker-compose ps\n   ```\n\n   You should see all services listed as 'running'.\n\n5. **Accessing the Services:**\n   - Confluent Control Center is accessible at `http://localhost:9021`.\n   - Debezium UI is accessible at `http://localhost:8080`.\n   - Postgres is accessible on the default port `5432`.\n\n6. **Shutting Down:**\n   To stop the services and remove their containers and networks, run the following (volumes are kept unless you add the `-v` flag):\n\n   ```bash\n   docker-compose down\n   ```\n\n## Customization\nYou can modify the Docker Compose file to suit your needs. 
For example, you might want to persist data in Postgres by adding a volume for the Postgres service.\n\n## Note\nThis setup is intended for development and testing purposes. For production environments, consider additional factors like security, scalability, and data persistence.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fairscholar%2Fchangecapture-e2e","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fairscholar%2Fchangecapture-e2e","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fairscholar%2Fchangecapture-e2e/lists"}