{"id":27005681,"url":"https://github.com/relevanceai/ai-transform","last_synced_at":"2026-01-26T17:06:10.246Z","repository":{"id":61779054,"uuid":"542395749","full_name":"RelevanceAI/ai-transform","owner":"RelevanceAI","description":"Relevance AI Bulk Chain Workflow SDK","archived":false,"fork":false,"pushed_at":"2024-09-05T00:36:48.000Z","size":1556,"stargazers_count":3,"open_issues_count":1,"forks_count":0,"subscribers_count":6,"default_branch":"development","last_synced_at":"2024-09-07T07:34:44.365Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/RelevanceAI.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-09-28T03:47:22.000Z","updated_at":"2024-09-05T00:34:51.000Z","dependencies_parsed_at":"2024-09-06T07:34:57.870Z","dependency_job_id":"0c2cba95-e6ca-40a3-8b8a-e535c2398911","html_url":"https://github.com/RelevanceAI/ai-transform","commit_stats":null,"previous_names":[],"tags_count":152,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RelevanceAI%2Fai-transform","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RelevanceAI%2Fai-transform/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RelevanceAI%2Fai-transform/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RelevanceAI%2Fai-transform/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/RelevanceAI","d
ownload_url":"https://codeload.github.com/RelevanceAI/ai-transform/tar.gz/refs/heads/development","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247135126,"owners_count":20889421,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-04-04T07:17:05.396Z","updated_at":"2026-01-26T17:06:10.200Z","avatar_url":"https://github.com/RelevanceAI.png","language":"Python","readme":"# AI Transform\n\nBelow is a hierarchy diagram for all the moving parts of a workflow.\n\n![hierarchy](hierarchy.png \"Hierarchy\")\n\n## 🛠️ Installation\n\nFresh install\n\n```{bash}\npip install ai-transform\n```\n\nto upgrade to the latest version\n\n```{bash}\npip install --upgrade ai-transform\n```\n\n## 🏃Quickstart\n\nTo get started, please refer to the example scripts in `scripts/`\n\n```python\n\nimport random\nfrom ai_transform.api.client import Client\nfrom ai_transform.engine.stable_engine import StableEngine\nfrom ai_transform.workflow.helpers import decode_workflow_token\nfrom ai_transform.workflow import Workflow\nfrom ai_transform.operator.abstract_operator import AbstractOperator\nfrom ai_transform.utils.random import Document\n\nclass RandomOperator(AbstractOperator):\n    def __init__(self, upper_bound: int=10):\n        self.upper_bound = upper_bound\n\n    def transform(self, documents):\n        for d in documents:\n            d['random_number'] = random.randint(0, self.upper_bound)\n\n\nclient = Client()\nds = client.Dataset(\"sample_dataset\")\noperator = RandomOperator()\n\nengine = StableEngine(\n    dataset=ds,\n    
operator=operator,\n    chunksize=10,\n    filters=[],\n)\nworkflow = Workflow(engine)\n\nworkflow.run()\n```\n\n## Workflow IDs and Job IDs\n\nEach workflow has a Workflow ID, such as `sentiment`. For example,\n`sentiment.py` has the Workflow ID `sentiment`, and this is how the frontend triggers it.\nThe Workflow Name is the human-readable name of the workflow, such as \"Extract Sentiment\".\nEach instance of a workflow is a job, and each job has a `job_id` so we can track its status.\n\n## Engine Selection\n\n### StableEngine\n\nThis is the safest and most basic way to write a workflow. This engine will pull `chunksize`\ndocuments at a time, transform them according to the `transform` method of the respective operator,\nand then insert them. If `chunksize=None`, the engine will attempt to pull the entire dataset,\ntransform it in one go, and then reinsert all the documents at once. Batching is limited\nby the value provided to `chunksize`.\n\n### InMemoryEngine\n\nThis engine is intended for operations that are performed on the whole dataset at once.\nIts advantage over `StableEngine` with `chunksize=None` is that documents are pulled and\npushed in batches, while the operation itself is applied to the whole dataset in bulk. With `StableEngine`,\nthis would involve extremely large API calls for larger datasets.\n\n### Polling\n\nSometimes you will want to wait until the Relevance AI\nschema updates before proceeding to the next step. 
For more information, see the `workflow/helpers.py` file.\n\n```{python}\nfrom ai_transform.workflow.helpers import poll_until_health_updates_with_input_field\n\n# Wait until the Relevance AI schema updates before moving to the next step\npoll_until_health_updates_with_input_field(\n    dataset=dataset,\n    input_field=...,\n    output_field=...,\n    minimum_coverage=0.95,\n    sleep_timer=10\n)\n```\n\n\n### How to release\n\nTo cut a release, go to \"Releases\" and create a new version from the `main` branch.\n\n### Architecture Decisions\n\n#### Pydantic\n\nThere are a few reasons for choosing pydantic:\n- strong validation\n- clean OpenAPI output, which will allow us to generate workflow docs automatically for Workflow APIs in the future\n- it is used in the FastAPI stack, so workflows can also be made FastAPI-compatible in the future\n\n### For Developers\n\nWhen developing with Workflows Core, we follow these philosophies:\n\n- Support only one entrypoint where possible\n- Write readable comments for anything that others might not understand\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frelevanceai%2Fai-transform","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frelevanceai%2Fai-transform","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frelevanceai%2Fai-transform/lists"}