{"id":28504404,"url":"https://github.com/icij/datashare-python","last_synced_at":"2026-05-22T15:10:02.829Z","repository":{"id":268996233,"uuid":"896133379","full_name":"ICIJ/datashare-python","owner":"ICIJ","description":null,"archived":false,"fork":false,"pushed_at":"2026-05-11T13:30:35.000Z","size":4861,"stargazers_count":5,"open_issues_count":5,"forks_count":0,"subscribers_count":5,"default_branch":"main","last_synced_at":"2026-05-11T14:38:22.143Z","etag":null,"topics":["artificial-intelligence","datashare","distributed-systems","investigative-journalism","machine-learning","task"],"latest_commit_sha":null,"homepage":"https://icij.github.io/datashare-python/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ICIJ.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-11-29T16:03:12.000Z","updated_at":"2026-05-11T13:35:47.000Z","dependencies_parsed_at":"2026-03-04T14:04:12.500Z","dependency_job_id":null,"html_url":"https://github.com/ICIJ/datashare-python","commit_stats":null,"previous_names":["icij/datashare-python"],"tags_count":173,"template":true,"template_full_name":null,"purl":"pkg:github/ICIJ/datashare-python","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ICIJ%2Fdatashare-python","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ICIJ%2Fdatashare-python/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ICIJ%2Fdatashare-python/releases
","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ICIJ%2Fdatashare-python/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ICIJ","download_url":"https://codeload.github.com/ICIJ/datashare-python/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ICIJ%2Fdatashare-python/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32973426,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-13T06:31:55.726Z","status":"ssl_error","status_checked_at":"2026-05-13T06:31:51.336Z","response_time":115,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","datashare","distributed-systems","investigative-journalism","machine-learning","task"],"created_at":"2025-06-08T18:05:26.739Z","updated_at":"2026-05-13T08:10:26.871Z","avatar_url":"https://github.com/ICIJ.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv style=\"background-image: linear-gradient(45deg, #193d87, #fa4070);\"\u003e\n  \u003cbr/\u003e\n  \u003cp align=\"center\"\u003e\n    \u003ca href=\"https://datashare.icij.org/\"\u003e\n      \u003cimg align=\"center\" src=\"docs/assets/datashare-logo.svg\" alt=\"Datashare\" style=\"max-width: 60%\"\u003e\n    \u003c/a\u003e\n  \u003c/p\u003e\n  \u003cp align=\"center\"\u003e\n    
\u003cem\u003eBetter analysis in all of its forms\u003c/em\u003e  \n  \u003c/p\u003e\n  \u003cbr/\u003e\n\u003c/div\u003e\n\u003cbr/\u003e\n\n---\n\n# Python workers for Temporal in Datashare\n\nThis project serves as a repository of Temporal workers and workflows written in Python\n(useful for machine learning) for use with [Datashare](https://icij.gitbook.io/datashare). Install with:\n\n```\nmake install\n```\n\n## File patterns\n\nTo create new workers, you can follow `asr_worker` with the file/directory structure:\n```\nactivities.py --\u003e Workflow activities\nconstants.py  --\u003e Worker/workflow constants\nmodels.py     --\u003e Workflow and activity inputs/outputs and other data classes\nworker.py     --\u003e Worker definition\nworkflow.py   --\u003e Workflow definition\n```\n\n## Docker\n\nUse `docker-compose` to run the dev server on `localhost`, which will start `elasticsearch`\n(port `9200`), `postgres` (`5432`), and `redis` (`6379`) services, as well as the `Temporal`\nserver and UI (`7233` and `8233`), and `datashare` (`8080`). Note that container build and\nstartup times can be long if workers and workflows rely on large models, so allocate memory\nto Docker accordingly.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ficij%2Fdatashare-python","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ficij%2Fdatashare-python","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ficij%2Fdatashare-python/lists"}