{"id":17124350,"url":"https://github.com/wittline/optimizing-public-transportation","last_synced_at":"2025-03-24T03:26:34.775Z","repository":{"id":111710361,"uuid":"382135450","full_name":"Wittline/optimizing-public-transportation","owner":"Wittline","description":"Streaming event pipeline around Apache Kafka and its ecosystem. Using public data from the Chicago Transit Authority we will construct an event pipeline around Kafka that allows us to simulate and display the status of train lines in real time.","archived":false,"fork":false,"pushed_at":"2021-07-12T03:43:44.000Z","size":371,"stargazers_count":3,"open_issues_count":0,"forks_count":1,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-01-29T09:39:29.233Z","etag":null,"topics":["kafka","stream-processing","streaming","udacity-nanodegree"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Wittline.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-07-01T19:23:41.000Z","updated_at":"2024-10-05T05:59:20.000Z","dependencies_parsed_at":"2023-04-17T21:14:20.226Z","dependency_job_id":null,"html_url":"https://github.com/Wittline/optimizing-public-transportation","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Wittline%2Foptimizing-public-transportation","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Wittline%2Foptimizing-public-transportation/tags","releases_url":"
https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Wittline%2Foptimizing-public-transportation/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Wittline%2Foptimizing-public-transportation/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Wittline","download_url":"https://codeload.github.com/Wittline/optimizing-public-transportation/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245203080,"owners_count":20577099,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["kafka","stream-processing","streaming","udacity-nanodegree"],"created_at":"2024-10-14T18:42:25.795Z","updated_at":"2025-03-24T03:26:34.745Z","avatar_url":"https://github.com/Wittline.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Monitoring the status of public transportation with Apache Kafka\nWe will build a streaming event pipeline around Kafka and its ecosystem that allows us to simulate and display the status of train lines in real time, using public data from the Chicago Transit Authority.\n\n# Data Source\nPublic data from the \u003ca href=\"https://www.transitchicago.com/data/\"\u003eChicago Transit Authority\u003c/a\u003e\n\n# Architecture\n\n![image](https://user-images.githubusercontent.com/8701464/124824036-78558d00-df37-11eb-8db2-809633a05bd4.png)\n\n# How to run the project with Docker\n\n- Install \u003ca href=\"https://docs.docker.com/docker-for-windows/install/\"\u003eDocker Desktop on Windows\u003c/a\u003e, it will install 
**docker compose** as well, which lets you run multi-container applications\n- Install \u003ca href=\"https://www.stanleyulili.com/git/how-to-install-git-bash-on-windows/\"\u003egit-bash for Windows\u003c/a\u003e. Once installed, open **git bash** and download this repository; this will download the **docker-compose.yaml** file and the other files needed.\n\n## Dependencies\n\n- Kafka\n- Zookeeper\n- Schema Registry\n- REST Proxy\n- Kafka Connect\n- KSQL\n- Kafka Connect UI\n- Kafka Topics UI\n- Schema Registry UI\n- Postgres\n\nThe docker-compose file does not run your code; it only starts the dependencies listed above. To start docker-compose, navigate to the starter directory containing docker-compose.yaml and run the following commands using git bash:\n\n```\n$\u003e cd starter\n$\u003e docker-compose up\n\nStarting zookeeper          ... done\nStarting kafka0             ... done\nStarting schema-registry    ... done\nStarting rest-proxy         ... done\nStarting connect            ... done\nStarting ksql               ... done\nStarting connect-ui         ... done\nStarting topics-ui          ... done\nStarting schema-registry-ui ... done\nStarting postgres           ... done\n```\n\nYou will see a large amount of text printed to your terminal, and it will continue to scroll. This is normal! 
This means your dependencies are up and running.\n\nTo check the status of your environment, you may run the following command at any time from a separate terminal instance:\n\n```\n$\u003e docker-compose ps\n\n            Name                          Command              State                     Ports\n-----------------------------------------------------------------------------------------------------------------\nstarter_connect-ui_1           /run.sh                         Up      8000/tcp, 0.0.0.0:8084-\u003e8084/tcp\nstarter_connect_1              /etc/confluent/docker/run       Up      0.0.0.0:8083-\u003e8083/tcp, 9092/tcp\nstarter_kafka0_1               /etc/confluent/docker/run       Up      0.0.0.0:9092-\u003e9092/tcp\nstarter_ksql_1                 /etc/confluent/docker/run       Up      0.0.0.0:8088-\u003e8088/tcp\nstarter_postgres_1             docker-entrypoint.sh postgres   Up      0.0.0.0:5432-\u003e5432/tcp\nstarter_rest-proxy_1           /etc/confluent/docker/run       Up      0.0.0.0:8082-\u003e8082/tcp\nstarter_schema-registry-ui_1   /run.sh                         Up      8000/tcp, 0.0.0.0:8086-\u003e8086/tcp\nstarter_schema-registry_1      /etc/confluent/docker/run       Up      0.0.0.0:8081-\u003e8081/tcp\nstarter_topics-ui_1            /run.sh                         Up      8000/tcp, 0.0.0.0:8085-\u003e8085/tcp\nstarter_zookeeper_1            /etc/confluent/docker/run       Up      0.0.0.0:2181-\u003e2181/tcp, 2888/tcp, 3888/tcp\n\n```\n\n## Connecting to Services in Docker Compose\n\nNow that your project’s dependencies are running in Docker Compose, we’re ready to get our project up and running. Windows Users Only: You must first install librdkafka-dev in your WSL Linux. 
\n\nRun the following command in your Ubuntu terminal:\n\n```\nsudo apt-get install librdkafka-dev -y\n```\n\n\n## Stopping Docker Compose and Cleaning Up\n\nWhen you are ready to stop Docker Compose you can run the following command:\n\n```\n$\u003e docker-compose stop\nStopping starter_postgres_1           ... done\nStopping starter_schema-registry-ui_1 ... done\nStopping starter_topics-ui_1          ... done\nStopping starter_connect-ui_1         ... done\nStopping starter_ksql_1               ... done\nStopping starter_connect_1            ... done\nStopping starter_rest-proxy_1         ... done\nStopping starter_schema-registry_1    ... done\nStopping starter_kafka0_1             ... done\nStopping starter_zookeeper_1          ... done\n```\n\n\nIf you would like to clean up the containers to reclaim disk space, as well as the volumes containing your data:\n\n```\n$\u003e docker-compose rm -v\nGoing to remove starter_postgres_1, starter_schema-registry-ui_1, starter_topics-ui_1, starter_connect-ui_1, starter_ksql_1, starter_connect_1, starter_rest-proxy_1, starter_schema-registry_1, starter_kafka0_1, starter_zookeeper_1\nAre you sure? [yN] y\nRemoving starter_postgres_1           ... done\nRemoving starter_schema-registry-ui_1 ... done\nRemoving starter_topics-ui_1          ... done\nRemoving starter_connect-ui_1         ... done\nRemoving starter_ksql_1               ... done\nRemoving starter_connect_1            ... done\nRemoving starter_rest-proxy_1         ... done\nRemoving starter_schema-registry_1    ... done\nRemoving starter_kafka0_1             ... done\nRemoving starter_zookeeper_1          ... done\n```\n# Running the producer\n\n```\ncd producers\nvirtualenv venv\n. venv/bin/activate\npip install -r requirements.txt\npython simulation.py\n```\n# Running the Faust Stream Processing Application\n```\ncd consumers\nvirtualenv venv\n. 
venv/bin/activate\npip install -r requirements.txt\nfaust -A faust_stream worker -l info\n```\n\n# Running the KSQL Creation Script\n```\ncd consumers\nvirtualenv venv\n. venv/bin/activate\npip install -r requirements.txt\npython ksql.py\n```\n\n# Running the consumer\n\n```\ncd consumers\nvirtualenv venv\n. venv/bin/activate\npip install -r requirements.txt\npython server.py\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwittline%2Foptimizing-public-transportation","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwittline%2Foptimizing-public-transportation","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwittline%2Foptimizing-public-transportation/lists"}