{"id":28506443,"url":"https://github.com/temporalio/data-pipeline-project-python","last_synced_at":"2025-09-02T13:40:34.261Z","repository":{"id":234245069,"uuid":"636456997","full_name":"temporalio/data-pipeline-project-python","owner":"temporalio","description":null,"archived":false,"fork":false,"pushed_at":"2025-07-02T22:21:14.000Z","size":678,"stargazers_count":6,"open_issues_count":3,"forks_count":3,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-07-02T23:27:22.788Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/temporalio.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-05-04T22:25:38.000Z","updated_at":"2025-07-02T22:21:18.000Z","dependencies_parsed_at":null,"dependency_job_id":"283a556c-2ebf-4c53-88c2-8634aea2c795","html_url":"https://github.com/temporalio/data-pipeline-project-python","commit_stats":null,"previous_names":["temporalio/data-pipeline-project-python"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/temporalio/data-pipeline-project-python","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/temporalio%2Fdata-pipeline-project-python","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/temporalio%2Fdata-pipeline-project-python/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/temporalio%2Fdata-pipeline-project-python/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/repositories/temporalio%2Fdata-pipeline-project-python/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/temporalio","download_url":"https://codeload.github.com/temporalio/data-pipeline-project-python/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/temporalio%2Fdata-pipeline-project-python/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":273293612,"owners_count":25079892,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-02T02:00:09.530Z","response_time":77,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-06-08T20:05:44.002Z","updated_at":"2025-09-02T13:40:34.248Z","avatar_url":"https://github.com/temporalio.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Build a data pipeline Workflow with Temporal and Python\n\nFor the complete tutorial, see [Build a data pipeline Workflow with Temporal and Python](https://learn.temporal.io/tutorials/python/data-pipelines/).\n\nTemporal makes writing data pipelines easy with Workflows and Activities.\n\nYou can create a source, process the step or steps, and output the flow of information to a destination with just code. 
This means all of your developer best practices can be implemented, tested, and run as needed.\n\nThe data that enters a Workflow is handled by Activities, while the Workflow orchestrates the execution of those steps.\nYou can ensure that Temporal handles all actions and executes each one observably once, all in Python code.\n\nIn this tutorial, you'll learn to build a data pipeline that gets the top 10 Hacker News stories and processes the items based on the story ID.\nIf the API endpoint is down, the default behavior of the Retry Policy is to retry indefinitely.\n\nYou'll then implement a Schedule to run Workflows on an interval, automating Workflow Executions.\n\n## Step 0: Prerequisites\n\n* Python \u003e= 3.7\n* [Poetry](https://python-poetry.org)\n* [Local Temporal server running](https://docs.temporal.io/application-development/foundations#run-a-development-cluster)\n\nWith this repository cloned, run the following at the root of the directory:\n\n```command\npoetry install\n```\n\n## Start the Workflow\n\nStart and run the Workflow with the following commands:\n\n```command\n# terminal one\npoetry run python run_worker.py\n# terminal two\npoetry run python run_workflow.py\n```\n\nTerminate the Workflow with the following command:\n\n```command\n# terminal three\ntemporal workflow terminate --workflow-id temporal-community-workflow\n```\n\n## Results\n\nYou'll see output similar to the following in your terminal:\n\n```command\nTop 10 stories on Temporal Community:\n                                               Title                                                URL  Views\n0  Unable to run the temporal examples against th...  https://community.temporal.io/t/unable-to-run-...   2370\n1                   Welcome to community.temporal.io  https://community.temporal.io/t/welcome-to-com...    915\n2  Testing an activity implementation when using ...  https://community.temporal.io/t/testing-an-act...    
633\n3      Separate Workers for Workflows and Activities  https://community.temporal.io/t/separate-worke...    423\n4  What is the hardware prerequisite for installi...  https://community.temporal.io/t/what-is-the-ha...    169\n5  Worker TLS errors - \"first record does not loo...  https://community.temporal.io/t/worker-tls-err...    148\n6  Temporal in lieu of a queuing solution, say SQ...  https://community.temporal.io/t/temporal-in-li...    146\n7  Implement Finite State Machine Transitioning i...  https://community.temporal.io/t/implement-fini...     97\n8         Cassandra history_node table keeps growing  https://community.temporal.io/t/cassandra-hist...     91\n9  Getting error TransportError (InvalidCertifica...  https://community.temporal.io/t/getting-error-...     74\n```\n\nYou'll see an output similar to the following in the Temporal Web UI:\n\n![Temporal Web UI](./images/temporal-web-ui.png)","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftemporalio%2Fdata-pipeline-project-python","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftemporalio%2Fdata-pipeline-project-python","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftemporalio%2Fdata-pipeline-project-python/lists"}