{"id":27838675,"url":"https://github.com/cactus-compute/cactus","last_synced_at":"2025-12-26T19:25:47.429Z","repository":{"id":290256291,"uuid":"971447302","full_name":"cactus-compute/cactus","owner":"cactus-compute","description":"Framework for AI on mobile devices and wearables, hardware-aware C/CPP backend, with wrappers for Kotlin, Java, Swift, React, Flutter.","archived":false,"fork":false,"pushed_at":"2025-05-06T15:03:57.000Z","size":153020,"stargazers_count":131,"open_issues_count":15,"forks_count":24,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-05-07T02:07:17.878Z","etag":null,"topics":["android","dart","flutter","framework","ios","java","javascript","kotlin","library","llamacpp","llm","llm-inference","llms","objective-c","react-native","swift","transformer","transformers","typescript"],"latest_commit_sha":null,"homepage":"https://cactuscompute.com","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cactus-compute.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-04-23T14:33:43.000Z","updated_at":"2025-05-07T01:21:58.000Z","dependencies_parsed_at":"2025-04-27T21:40:24.888Z","dependency_job_id":null,"html_url":"https://github.com/cactus-compute/cactus","commit_stats":null,"previous_names":["cactus-compute/cactus"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cactus-compute%2Fcactus","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cactus-compute%2F
cactus/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cactus-compute%2Fcactus/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cactus-compute%2Fcactus/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cactus-compute","download_url":"https://codeload.github.com/cactus-compute/cactus/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252798854,"owners_count":21805888,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["android","dart","flutter","framework","ios","java","javascript","kotlin","library","llamacpp","llm","llm-inference","llms","objective-c","react-native","swift","transformer","transformers","typescript"],"created_at":"2025-05-03T00:02:10.141Z","updated_at":"2025-12-26T19:25:47.422Z","avatar_url":"https://github.com/cactus-compute.png","language":"C++","funding_links":[],"categories":["HarmonyOS","C++","其他_机器学习与深度学习"],"sub_categories":["Windows Manager"],"readme":"\u003cimg src=\"assets/banner.jpg\" alt=\"Logo\" style=\"border-radius: 30px; width: 100%;\"\u003e\n\nCross-platform \u0026 energy-efficient kernels, runtime and AI inference engine for mobile devices. 
\n\n## Cactus Graph \nCactus Graph is a general numerical computing framework for implementing \nany model, like PyTorch for mobile devices.\n\n```cpp\n#include \"cactus.h\"\n\nCactusGraph graph;\nauto a = graph.input({2, 3}, Precision::FP16);\nauto b = graph.input({3, 4}, Precision::INT8);\n\nauto x1 = graph.matmul(a, b, false);\nauto x2 = graph.transpose(x1);\nauto result = graph.matmul(b, x2, true);\n\nfloat a_data[6] = {1.1f, 2.3f, 3.4f, 4.2f, 5.7f, 6.8f};\nfloat b_data[12] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};\ngraph.set_input(a, a_data, Precision::FP16);\ngraph.set_input(b, b_data, Precision::INT8);\n\ngraph.execute();\nvoid* output_data = graph.get_output(result);\n\ngraph.hard_reset(); \n\n```\n\n## Cactus Engine\nCactus Engine is an AI inference engine with OpenAI-compatible APIs built on top of Cactus Graphs.\n\n```cpp\n#include \"cactus.h\"\n\ncactus_set_pro_key(\"\"); // email founders@cactuscompute.com for optional key\n\ncactus_model_t model = cactus_init(\"path/to/weight/folder\", 2048);\n\nconst char* messages = R\"([\n    {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n    {\"role\": \"user\", \"content\": \"My name is Henry Ndubuaku\"}\n])\";\n\nconst char* options = R\"({\n    \"max_tokens\": 50,\n    \"stop_sequences\": [\"\u003c|im_end|\u003e\"]\n})\";\n\nchar response[1024];\nint result = cactus_complete(model, messages, response, sizeof(response), options, nullptr, nullptr, nullptr);\n```\nExample response from Gemma3-270m-INT8\n```json\n{\n    \"success\": true,\n    \"response\": \"Hi there! 
I'm just a friendly assistant.\",\n    \"time_to_first_token_ms\": 45.23,\n    \"total_time_ms\": 163.67,\n    \"tokens_per_second\": 168.42,\n    \"prefill_tokens\": 28,\n    \"decode_tokens\": 50,\n    \"total_tokens\": 78\n}\n```\n\n## INT8 Performance\n\n- \u003csub\u003e**Models:** LFM2-VL-450m \u0026 Whisper-Small\u003c/sub\u003e\n- \u003csub\u003e**Decode** = toks/sec, **P/D** = prefill/decode, **VLM** = 256×256 image, **STT** = 30s audio\u003c/sub\u003e\n- \u003csub\u003e**Cactus Pro**: Uses NPU for realtime and large context (Apple for now), scores are marked with *\u003c/sub\u003e\n- \u003csub\u003e**INT4 coming**: 1.8x speed, 1.9x smaller files\u003c/sub\u003e\n\n| Device | Short Decode | 1k-P/D | 4k-P/D | 4k-P Pro | 4k-RAM | VLM-TTFT | VLM-Dec | VLM-RAM | STT-TTFT | STT-Dec | STT-RAM |\n|--------|--------|--------|--------|----------|--------|----------|---------|---------|----------|---------|---------|\n| Mac M4 Pro | 173 | 1574/115 | 1089/100 | - | 122MB | 0.4s/0.1s* | 168 | 112MB | 1.7s/0.2s* | 83 | 142MB |\n| Mac M3 Pro | 150 | 1540/109 | 890/93 | - | 121MB | 0.5s/0.1s* | 149 | 113MB | 2.9s/0.4s* | 78 | 140MB |\n| iPad/Mac M4 | 129 | 793/82 | 507/64 | - | 80MB | 0.5s/0.1s* | 113 | 145MB | 2.4s/0.3s* | 60 | 131MB |\n| iPad/Mac M3 | 112 | 786/78 | 446/60 | - | 81MB | 0.6s/0.1s* | 111 | 154MB | 4.2s/0.7s* | 58 | 142MB |\n| iPhone 17 Pro | 136 | 810/105 | 628/84 | - | - | 1.1s/0.1s* | 120 | - | 3.0s/0.6s* | - | - |\n| iPhone 16 Pro | 114 | 716/98 | 580/81 | - | - | 1.3s/0.2s* | 101 | - | 3.5s/0.7s* | 75 | - |\n| iPhone 15 Pro | 99 | 549/86 | 530/75 | - | - | 1.5s/0.3s* | 92 | - | 3.8s/0.8s* | 70 | - |\n| Galaxy S25 Ultra | 91 | 230/63 | 173/57 | - | 128MB | 1.4s | 58 | - | - | - | - |\n| Nothing 3 | 56 | 167/49 | 160/46 | - | - | 1.7s | 54 | - | 8.5s | 55 | - |\n| Nothing 3a | 31 | 114/26 | 108/24 | - | - | 2.4s | 29 | - | - | - | - |\n| Raspberry Pi 5 | 24 | 192/28 | - | - | - | 2.3s | 23 | - | 21s | 16 | - |\n\n\n## Supported models (INT8)\n\n| Model 
| Compressed Size | Completion | Tool Call | Vision | Embed | Speech | Pro |\n|-------|--------------------|-------------------|----------------|------|------|------|------|\n| google/gemma-3-270m-it | 172MB  | ✓ | ✗ | ✗ | ✗ | ✗ | Apple |\n| google/functiongemma-270m-it | 172MB  | ✓ | ✓ | ✗ | ✗ | ✗ | Apple |\n| openai/whisper-small | 282MB  | ✗ | ✗ | ✗ | ✓ | ✓ | Apple |\n| LiquidAI/LFM2-350M | 233MB  | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ |\n| HuggingFaceTB/SmolLM2-360m-Instruct | 227MB  | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |\n| LiquidAI/LFM2-VL-450M | 420MB  | ✓ | ✗ | ✓ | ✓ | ✗ | Apple |\n| Qwen/Qwen3-0.6B | 394MB  | ✓ | ✓ | ✗ | ✓ | ✗ | Apple |\n| Qwen/Qwen3-Embedding-0.6B | 394MB  | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ |\n| LiquidAI/LFM2-700M | 467MB  | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ |\n| nomic-ai/nomic-embed-text-v2-moe | 533MB  | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ |\n| google/gemma-3-1b-it | 642MB  | ✓ | ✗ | ✗ | ✗ | ✗ | Apple |\n| openai/whisper-medium | 646MB  | ✗ | ✗ | ✗ | ✓ | ✓ | Apple |\n| LiquidAI/LFM2-1.2B | 722MB  | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ |\n| LiquidAI/LFM2-1.2B-RAG | 722MB  | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ |\n| LiquidAI/LFM2-1.2B-Tool | 722MB  | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ |\n| LiquidAI/LFM2-VL-1.6B | 1440MB  | ✓ | ✗ | ✓ | ✓ | ✗ | Apple |\n| Qwen/Qwen3-1.7B | 1161MB  | ✓ | ✓ | ✗ | ✓ | ✗ | Apple |\n| HuggingFaceTB/SmolLM2-1.7B-Instruct | 1161MB  | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ |\n\n## Using this repo on Mac\n\n- Clone the repo and run `source ./setup`.\n- Setup is automatic, and usage instructions are printed afterward.\n- Run `cactus --help` to see guides anytime.\n- Remember to run `source ./setup` in any new terminal.\n\n## Using in your apps\n\n- [Kotlin Multiplatform SDK](https://github.com/cactus-compute/cactus-kotlin)\n- [Flutter SDK](https://github.com/cactus-compute/cactus-flutter)\n- [React Native SDK](https://github.com/cactus-compute/cactus-react-native)\n- [Swift SDK](https://github.com/mhayes853/swift-cactus)\n\n## Try demo apps\n\n- [iOS Demo](https://apps.apple.com/gb/app/cactus-chat/id6744444212)\n- [Android 
Demo](https://play.google.com/store/apps/details?id=com.rshemetsubuser.myapp)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcactus-compute%2Fcactus","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcactus-compute%2Fcactus","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcactus-compute%2Fcactus/lists"}