Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/janaom/streaming-pipeline-on-gcp
gcp pubsub streaming-pipeline
Last synced: 1 day ago
- Host: GitHub
- URL: https://github.com/janaom/streaming-pipeline-on-gcp
- Owner: janaom
- Created: 2023-10-22T16:23:19.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-11-27T20:02:36.000Z (about 1 year ago)
- Last Synced: 2024-11-19T18:51:25.240Z (2 months ago)
- Topics: gcp, pubsub, streaming-pipeline
- Language: Python
- Homepage:
- Size: 31.3 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
## PubSub Streaming Architecture
![image](https://github.com/janaom/Streaming_Pipeline_on_GCP/assets/83917694/889166ce-862d-42a7-927e-66d3d4eaf2ed)

## publish.py
The publish.py script publishes messages to a Google Cloud Pub/Sub topic:
- It imports necessary modules, sets up project and topic information, and authenticates using a service account.
- It reads input data from a file and publishes each line of the file as a message to the Pub/Sub topic.
- The publishing process is controlled by a loop, and a 1-second delay is added between each message publication (see the sketch below).
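
A minimal sketch of how publish.py might be structured, assuming hypothetical names for the project, topic, service-account key, and input file (none of these values come from the repository):

```python
import time

from google.cloud import pubsub_v1
from google.oauth2 import service_account

# Hypothetical values - replace with your own project, topic, key file, and input file.
project_id = "my-gcp-project"
topic_id = "my-input-topic"
service_account_file = "service-account.json"
input_file = "input_data.txt"

# Authenticate with a service account key and create a publisher client.
credentials = service_account.Credentials.from_service_account_file(service_account_file)
publisher = pubsub_v1.PublisherClient(credentials=credentials)
topic_path = publisher.topic_path(project_id, topic_id)

# Read the input file and publish each line as a separate Pub/Sub message.
with open(input_file, "r") as f:
    for line in f:
        data = line.strip().encode("utf-8")  # Pub/Sub message payloads must be bytes
        future = publisher.publish(topic_path, data)
        print(f"Published message ID: {future.result()}")
        time.sleep(1)  # 1-second delay between messages
```

Calling `future.result()` blocks until Pub/Sub confirms the publish, which keeps the 1-second pacing predictable.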
## process.py
The process.py script uses Apache Beam to process data from a Google Cloud Pub/Sub subscription and write the processed data to another Pub/Sub topic. The Beam pipeline reads data from Pub/Sub via input_subscription and writes it to output_topic:
- It imports necessary modules from Apache Beam, sets up pipeline options and authentication using a service account.
- It defines a pipeline that reads data from a Pub/Sub subscription, applies any necessary transformations, and writes the processed data to a Pub/Sub topic.
- The pipeline is executed, and the code waits until the pipeline execution finishes (see the sketch below).
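
A minimal sketch of the kind of Beam pipeline process.py describes, assuming hypothetical resource names and a simple uppercase transform as a stand-in for whatever processing the script actually applies:

```python
import os

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions, StandardOptions

# Hypothetical values - replace with your own key file, subscription, and topic.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "service-account.json"
input_subscription = "projects/my-gcp-project/subscriptions/my-input-subscription"
output_topic = "projects/my-gcp-project/topics/my-output-topic"

# Reading from Pub/Sub requires a streaming pipeline.
options = PipelineOptions()
options.view_as(StandardOptions).streaming = True

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        # Read raw message bytes from the input subscription.
        | "Read from Pub/Sub" >> beam.io.ReadFromPubSub(subscription=input_subscription)
        # Placeholder transformation: decode, uppercase, and re-encode each message.
        | "Transform" >> beam.Map(lambda msg: msg.decode("utf-8").upper().encode("utf-8"))
        # Write the processed messages to the output topic.
        | "Write to Pub/Sub" >> beam.io.WriteToPubSub(topic=output_topic)
    )
```

The `with` block waits for the pipeline to finish, matching the behaviour described above; on a streaming runner this effectively keeps the job running until it is cancelled.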
## subscribe.py
The subscribe.py script creates a Google Cloud Pub/Sub subscriber that continuously listens for messages from a specified subscription and processes them as they arrive:
- It imports necessary modules, sets up authentication using a service account, and specifies the subscription to listen to.
- It defines a callback function that is called whenever a message is received. In this case, the function simply prints the received message.
- The subscriber is set up to continuously listen for messages from the subscription using the callback function.
- The script enters an infinite loop with a 60-second delay between iterations to keep the subscriber listening for new messages (see the sketch below).
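
A minimal sketch of subscribe.py's behaviour, assuming hypothetical project and subscription names and a callback that simply prints and acknowledges each message:

```python
import time

from google.cloud import pubsub_v1
from google.oauth2 import service_account

# Hypothetical values - replace with your own project, subscription, and key file.
project_id = "my-gcp-project"
subscription_id = "my-output-subscription"
service_account_file = "service-account.json"

# Authenticate with a service account key and create a subscriber client.
credentials = service_account.Credentials.from_service_account_file(service_account_file)
subscriber = pubsub_v1.SubscriberClient(credentials=credentials)
subscription_path = subscriber.subscription_path(project_id, subscription_id)

def callback(message):
    # Print the received message and acknowledge it so it is not redelivered.
    print(f"Received: {message.data.decode('utf-8')}")
    message.ack()

# Start pulling messages in the background; the callback runs for every message.
streaming_pull_future = subscriber.subscribe(subscription_path, callback=callback)
print(f"Listening for messages on {subscription_path}...")

# Keep the main thread alive so the background subscriber keeps running.
while True:
    time.sleep(60)
```

The subscriber client pulls messages on background threads, so the main thread only needs to stay alive, which is what the loop with the 60-second sleep does.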
## Summary
In summary, the project involves publishing messages to a Pub/Sub topic, processing those messages with Apache Beam, and subscribing to the output topic to consume and act on the processed messages.