https://github.com/icij/datashare-python
- Host: GitHub
- URL: https://github.com/icij/datashare-python
- Owner: ICIJ
- Created: 2024-11-29T16:03:12.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2026-05-11T13:30:35.000Z (about 1 month ago)
- Last Synced: 2026-05-11T14:38:22.143Z (about 1 month ago)
- Topics: artificial-intelligence, datashare, distributed-systems, investigative-journalism, machine-learning, task
- Language: Python
- Homepage: https://icij.github.io/datashare-python/
- Size: 4.64 MB
- Stars: 5
- Watchers: 5
- Forks: 0
- Open Issues: 5
Metadata Files:
- Readme: README.md
README
---
# Python workers for Temporal in Datashare
This repository collects Temporal workers and workflows written in Python
(useful for machine learning tasks) for use with [Datashare](https://icij.gitbook.io/datashare). Install with
```
make install
```
## File patterns
To create a new worker, follow the layout of `asr_worker`, which uses this file/directory structure:
```
activities.py --> Workflow activities
constants.py --> Worker/workflow constants
models.py --> Workflow and activity inputs/outputs and other data classes
worker.py --> Worker definition
workflow.py --> Workflow definition
```
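As a rough illustration of the layout above, here is a minimal sketch that scaffolds an empty worker package with those five files. The helper `scaffold_worker` and the example name `my_worker` are hypothetical and not part of this repository. The stubs contain only a docstring, so the actual Temporal definitions still need to be written by hand, using `asr_worker` as the reference.

```python
from pathlib import Path

# File names and roles copied from the list above.
# The helper below is a hypothetical convenience, not a tool shipped with this repo.
WORKER_FILES = {
    "activities.py": "Workflow activities",
    "constants.py": "Worker/workflow constants",
    "models.py": "Workflow and activity inputs/outputs and other data classes",
    "worker.py": "Worker definition",
    "workflow.py": "Workflow definition",
}


def scaffold_worker(root: Path, name: str) -> list[Path]:
    """Create `root/name` with one docstring-only stub per entry in WORKER_FILES.

    Files that already exist are left untouched.
    Returns the paths that were actually created.
    """
    package_dir = Path(root) / name
    package_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for filename, role in WORKER_FILES.items():
        path = package_dir / filename
        if path.exists():
            continue  # never overwrite work in progress
        path.write_text(f'"""{role}."""\n', encoding="utf-8")
        created.append(path)
    return created
```

Calling `scaffold_worker(Path("."), "my_worker")` would create `my_worker/` next to the existing workers. Running it a second time creates nothing, since existing files are skipped.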
## Docker
Use `docker-compose` to run the dev server on `localhost`. This starts the `elasticsearch`
(port `9200`), `postgres` (`5432`), and `redis` (`6379`) services, as well as the `Temporal`
server and UI (`7233` and `8233`), and `datashare` (`8080`). Container build and
startup times can be long if workers and workflows rely on large models, so allocate enough
memory to Docker.
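Because startup can take a while, it can help to check which services are already accepting connections. The following is a minimal sketch using only the standard library. The port numbers are the ones listed above, while the service labels (`temporal-ui` in particular) and the function names are assumptions made for illustration. An open TCP port only shows that something is listening, not that the service has finished initializing.

```python
import socket

# Ports as documented above for the dev stack; adjust if your compose file differs.
# The labels are illustrative and may not match the compose service names.
DEV_SERVICES = {
    "elasticsearch": 9200,
    "postgres": 5432,
    "redis": 6379,
    "temporal": 7233,
    "temporal-ui": 8233,
    "datashare": 8080,
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


def check_services(host="localhost", services=None, timeout=1.0):
    """Map each service label to whether its port currently accepts connections."""
    if services is None:
        services = DEV_SERVICES
    return {name: is_port_open(host, port, timeout) for name, port in services.items()}


if __name__ == "__main__":
    for name, up in check_services().items():
        status = "up" if up else "not reachable"
        print(f"{name:<14} {DEV_SERVICES[name]:>5}  {status}")
```

Run the script after starting the stack and repeat it until every service reports `up`.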