{"id":20199771,"url":"https://github.com/milenkovicm/torchfusion","last_synced_at":"2025-10-08T15:59:30.101Z","repository":{"id":224203353,"uuid":"762631309","full_name":"milenkovicm/torchfusion","owner":"milenkovicm","description":"Torchfusion is a very opinionated torch inference on datafusion.","archived":false,"fork":false,"pushed_at":"2025-04-24T01:32:20.000Z","size":96,"stargazers_count":5,"open_issues_count":2,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-10-08T15:59:29.583Z","etag":null,"topics":["batch-inference","datafusion","inference","machine-learning","pytorch","rust","sql","torch","userdefined-functions"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/milenkovicm.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-02-24T08:45:37.000Z","updated_at":"2025-08-24T16:27:21.000Z","dependencies_parsed_at":"2024-08-12T08:25:36.989Z","dependency_job_id":"7dcca6cc-401d-4a7b-8348-5ad26627c286","html_url":"https://github.com/milenkovicm/torchfusion","commit_stats":null,"previous_names":["milenkovicm/torchfusion"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/milenkovicm/torchfusion","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/milenkovicm%2Ftorchfusion","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/milenkovicm%2Ftorchfusion/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/milenkovicm%2Ftorchfusion/releases","mani
fests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/milenkovicm%2Ftorchfusion/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/milenkovicm","download_url":"https://codeload.github.com/milenkovicm/torchfusion/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/milenkovicm%2Ftorchfusion/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278972320,"owners_count":26078017,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-08T02:00:06.501Z","response_time":56,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["batch-inference","datafusion","inference","machine-learning","pytorch","rust","sql","torch","userdefined-functions"],"created_at":"2024-11-14T04:38:47.958Z","updated_at":"2025-10-08T15:59:30.084Z","avatar_url":"https://github.com/milenkovicm.png","language":"Rust","readme":"# TorchFusion\n\nTorchfusion is a very opinionated torch inference on datafusion, implemented to demonstrate the datafusion `FunctionFactory` functionality (pull request [arrow-datafusion/pull#9333](https://github.com/apache/arrow-datafusion/pull/9333)).\n\n\u003e [!NOTE]\n\u003e It has not been envisaged as an actively maintained library.\n\nOther projects utilizing `FunctionFactory`:\n\n- [LightGBM Inference on DataFusion](https://github.com/milenkovicm/lightfusion)\n- [DataFusion JVM 
User Defined Functions (UDF)](https://github.com/milenkovicm/adhesive)\n\n## How to use\n\nA torch model can be defined as an SQL UDF:\n\n```sql\nCREATE FUNCTION iris(FLOAT[])\nRETURNS FLOAT[]\nLANGUAGE TORCH\nAS '/models/iris.pt'\n```\n\nThe function parameter defines the type of the input array, and the return type defines the type of the returned array.\nThe `AS` clause points to the location of the scripted model file.\n\nAlternatively (not implemented in this example), the function could reference a model in an MLflow repository:\n\n```sql\nCREATE FUNCTION iris(FLOAT[])\nRETURNS FLOAT[]\nLANGUAGE TORCH\nAS 'models:/iris@champion'\n```\n\nThe overall flow is then:\n\n```rust\nlet ctx = torchfusion::configure_context();\n\nlet sql = r#\"\n    CREATE EXTERNAL TABLE iris STORED AS PARQUET LOCATION 'data/iris.snappy.parquet';\n\"#;\n\nctx.sql(sql).await?.show().await?;\n\n// ctx.sql(\"SET torchfusion.cuda_device = 0\").await?;\nctx.sql(\"SET torchfusion.device = cpu\").await?.show().await?;\n\n// definition of torch model to use\nlet sql = r#\"\n    CREATE FUNCTION iris(FLOAT[])\n    RETURNS FLOAT[]\n    LANGUAGE TORCH\n    AS 'model/iris.spt'\n\"#;\n\nctx.sql(sql).await?.show().await?;\n\nlet sql = r#\"\n    SELECT \n    sl, sw, pl, pw,\n    features, \n    argmax(iris(features)) as f_inferred, \n    argmax(iris([sl, sw, pl, pw])) as inferred, \n    label\n    FROM iris \n    LIMIT 50\n\"#;\n\nctx.sql(sql).await?.show().await?;\n```\n\n```txt\n+-----+-----+-----+-----+----------------------+------------+----------+-------+\n| sl  | sw  | pl  | pw  | features             | f_inferred | inferred | label |\n+-----+-----+-----+-----+----------------------+------------+----------+-------+\n| 4.4 | 3.0 | 1.3 | 0.2 | [4.4, 3.0, 1.3, 0.2] | 0          | 0        | 0     |\n| 5.5 | 4.2 | 1.4 | 0.2 | [5.5, 4.2, 1.4, 0.2] | 0          | 0        | 0     |\n| 5.7 | 2.9 | 4.2 | 1.3 | [5.7, 2.9, 4.2, 1.3] | 1          | 1        | 1     |\n| 5.8 | 2.7 | 3.9 | 1.2 | [5.8, 2.7, 3.9, 1.2] | 1         
 | 1        | 1     |\n| 5.9 | 3.0 | 4.2 | 1.5 | [5.9, 3.0, 4.2, 1.5] | 1          | 1        | 1     |\n| 5.9 | 3.0 | 5.1 | 1.8 | [5.9, 3.0, 5.1, 1.8] | 2          | 2        | 2     |\n| 6.1 | 2.8 | 4.0 | 1.3 | [6.1, 2.8, 4.0, 1.3] | 1          | 1        | 1     |\n| 6.1 | 2.8 | 4.7 | 1.2 | [6.1, 2.8, 4.7, 1.2] | 1          | 1        | 1     |\n| 6.2 | 2.8 | 4.8 | 1.8 | [6.2, 2.8, 4.8, 1.8] | 2          | 2        | 2     |\n| 6.4 | 2.7 | 5.3 | 1.9 | [6.4, 2.7, 5.3, 1.9] | 2          | 2        | 2     |\n| 6.4 | 3.2 | 4.5 | 1.5 | [6.4, 3.2, 4.5, 1.5] | 1          | 1        | 1     |\n+-----+-----+-----+-----+----------------------+------------+----------+-------+\n```\n\n## Available Configuration\n\n`FunctionFactory` exposes a set of configuration options, which can be retrieved by querying the system catalog:\n\n```sql\nSELECT * FROM information_schema.df_settings WHERE NAME LIKE 'torchfusion%'\n```\n\n```txt\n+--------------------------------+-------+--------------------------------------------------------------------------------------+\n| name                           | value | description                                                                          |\n+--------------------------------+-------+--------------------------------------------------------------------------------------+\n| torchfusion.device             | Cpu   | Device to run model on. Valid values 'cpu', 'cuda', 'mps', 'vulkan'. Default: 'cpu'  |\n| torchfusion.cuda_device        | 0     | Cuda device to use. Valid value positive integer. Default: 0                         |\n| torchfusion.batch_size         | 1     | Batch size to be used. Valid value positive non-zero integers. 
Default: 1            |\n+--------------------------------+-------+--------------------------------------------------------------------------------------+\n```\n\nAvailable configuration options can be changed:\n\n```sql\nSET torchfusion.device = cpu\n```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmilenkovicm%2Ftorchfusion","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmilenkovicm%2Ftorchfusion","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmilenkovicm%2Ftorchfusion/lists"}