{"id":16941272,"url":"https://github.com/iceber/wasmcloud-ollama","last_synced_at":"2025-08-13T01:07:09.396Z","repository":{"id":218507307,"uuid":"744434799","full_name":"Iceber/wasmcloud-ollama","owner":"Iceber","description":null,"archived":false,"fork":false,"pushed_at":"2024-02-07T08:49:29.000Z","size":562,"stargazers_count":16,"open_issues_count":0,"forks_count":3,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-03-27T08:11:13.967Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Iceber.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-01-17T09:45:07.000Z","updated_at":"2025-02-02T15:37:08.000Z","dependencies_parsed_at":"2024-11-26T14:42:00.699Z","dependency_job_id":"15fc8183-09f0-4297-b2f9-015fc7bd6e3d","html_url":"https://github.com/Iceber/wasmcloud-ollama","commit_stats":null,"previous_names":["iceber/wasmcloud-ollama"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Iceber%2Fwasmcloud-ollama","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Iceber%2Fwasmcloud-ollama/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Iceber%2Fwasmcloud-ollama/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Iceber%2Fwasmcloud-ollama/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Iceber","download_url":"https://codeload.github.com/Iceber/wasmc
loud-ollama/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248750717,"owners_count":21155770,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-13T21:09:07.444Z","updated_at":"2025-04-13T17:20:53.498Z","avatar_url":"https://github.com/Iceber.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# [wasmCloud](https://github.com/wasmCloud/wasmCloud) \u0026 [Ollama](https://github.com/jmorganca/ollama)\n\n**wasmCloud-Ollama enables Wasm AI applications to be distributedly deployed and to make distributed calls to underlying AI inference capabilities.**\n\nThe [Ollama](https://github.com/jmorganca/ollama)-based implementation allows Wasm AI to have [very rich model choices](https://ollama.ai/library) and\nthe flexibility to manage these models using native Ollama commands.\n\n![wasmCloud-Ollama Arch](./arch.png)\n\n---\n\n**wasmCloud-Ollama is one of wasmCloud's solutions for AI scenarios.**\n\n**wasmCloud allows users to design and implement specific AI capabilities for Wasm AI applications through flexible interface definitions.**\n\nOther AI capability interfaces and implementations:\n* wasi-nn: https://github.com/Iceber/wasmcloud-wasi-nn\n\n## What is wasmCloud\nwasmCloud is a sandbox project of CNCF, which is mainly used for distributed deployment and management of Wasm, and breaks the function limitation of Wasm Runtime through the Actor-Provider way, allowing users to provide customized capabilities according to their business.\n\nFor more information 
about wasmCloud, please visit the [repository](https://github.com/wasmCloud/wasmCloud) and the [website](https://wasmcloud.com/).\n\n## wasmCloud-Ollama\nWe use Ollama as a more complete and convenient base for model loading and inference; it supports a rich set of models.\n\nwasmCloud-Ollama mainly provides the following components:\n* [ollama.wit](./wit/ollama.wit): Ollama's Wasm interface definition. Wasm AI applications can *import* this wit to use the inference capabilities provided by the underlying layer.\n* [wasmcloud_ollama_adaptor.wasm](./ollama-adaptor): enables Wasm AI applications using ollama.wit to run on wasmCloud.\n* [wasmCloud Ollama Provider](./provider): implements *ollama.wit* to provide inference capabilities to Wasm AI applications. Two implementations are provided:\n    * **Remote Provider**: implements the ollama.wit interface by accessing a deployed Ollama Server, for a lighter binary.\n    * **Standalone Provider**: runs the Ollama Server implementation itself, without the need to run an additional Ollama Server.\n\n* [Ollama OpenAI Server](./examples/openai-server): a Wasm application that implements an OpenAI API server through *ollama.wit*.\n\n## Quick - Running AI Large Language Models Locally\nLet's take a quick look at the distributed AI inference capabilities of wasmCloud using *./examples/openai-server*.\n\n![ollama openai server arch](./examples/openai-server/ollama_openai_server.png)\n\n### Deploying wasmCloud Locally\nDownload and install the wash command\n```bash\n# macOS\n$ brew install wasmcloud/wasmcloud/wash\n```\nInstallation commands for other systems: https://wasmcloud.com/docs/installation\n\nStarting wasmCloud\n```bash\n$ wash up -d --nats-websocket-port 4001 --rpc-timeout-ms=20000\n```\n\n#### Run wasmCloud UI (optional)\nTo view the wasmCloud resources easily, you can run the UI service\n```bash\n$ wash ui\n```\n\n### Build and Deploy Ollama OpenAI Server\n```\n$ make 
openai-server\n```\nThe Makefile encapsulates the building and composing of Ollama OpenAI Server. After executing the commands, you can directly see a Wasm component and Actor artifact.\n\n```\n$ ls -alh ./ollama_openai*\n-rw-r--r--  1 icebergu  staff   4.5M  2  1 17:42 ./ollama_openai_server.wasm\n-rw-r--r--  1 icebergu  staff   4.5M  2  1 17:42 ./ollama_openai_server_s.wasm\n```\nYou can see that the compiled Wasm component is only 4.5M,\nwhich is actually one of the advantages that Wasm brings to AI applications: it provides a lighter artifact while ensuring operational security.\n\nViewing information about the actor artifact.\n```\n$ wash claims inspect ./ollama_openai_server_s.wasm\n\n                        ollama-openai-server - Actor\n  Account         AB3OYLJCCEVR22KVJCJZBPZO74GLMXTERE2OEJAMUXSJWT4Y2QU6W5M5\n  Actor           MBKTHG75VW5O7P4OHZMLV7YUQSDDZLNMPWIVQGSEPWFIUOQMZEGSJOOK\n  Expires                                                            never\n  Can Be Used                                                  immediately\n  Version                                                        0.1.0 (1)\n  Call Alias                                                     (Not set)\n                                Capabilities\n  HTTP Server\n  Logging\n  ollama:llm\n                                    Tags\n  None\n```\nYou can see that the ollama-openai-server relies on three functions: **HTTP Server**, **Logging**, and **ollama:llm**, of which Logging will be provided by wasmCloud.\n\nNow, in order to ensure that the server can run properly, you also need to deploy the **HTTP Server Provider** and the **ollama:llm Provider** to provide the corresponding capabilities.\n\nBefore deploying the providers, we can deploy the ollama-openai-server in wasmCloud first.\n```bash\n$ wash start actor file://`pwd`/ollama_openai_server_s.wasm\n\nActor [MBKTHG75VW5O7P4OHZMLV7YUQSDDZLNMPWIVQGSEPWFIUOQMZEGSJOOK] (ref: 
[file:///Users/icebergu/workspace/wasm/wasmcloud/ai/wasmcloud-ollama/ollama_openai_server_s.wasm]) started on host [ND4SFXMOROAHNRZV54KCEVU3HQC2BDGRVC3M5FCVSHYS2C3RSK35EIJD]\n```\n\n### Building and Deploying Ollama Provider\nwasmCloud-Ollama provides two Ollama providers. To avoid deploying an Ollama server separately, we can deploy the Ollama standalone provider directly.\n\n\nBuild the Ollama standalone provider\n```\n$ make standalone-provider-par\n\n$ wash claims inspect ./standalone-provider.par\n\n\n                  Ollama LLM Standalone Provider - Capability Provider\n  Account                     AB3OYLJCCEVR22KVJCJZBPZO74GLMXTERE2OEJAMUXSJWT4Y2QU6W5M5\n  Service                     VBHDATHFOKV4EUZK62XTQMTG4AZ7HYQDMY7RXQXIRMDMUGCVWV2LYRJM\n  Capability Contract ID                                                    ollama:llm\n  Vendor                                                                        Iceber\n  Version                                                                         None\n  Revision                                                                        None\n                             Supported Architecture Targets\n  aarch64-macos\n```\n\nDeploy the standalone provider\n```\n$ echo \"{ \\\"work_path\\\": \\\"`pwd`\\\" }\" \u003e config.json\n$ wash start provider file://`pwd`/standalone-provider.par --config-json ./config.json\n\nProvider [VBHDATHFOKV4EUZK62XTQMTG4AZ7HYQDMY7RXQXIRMDMUGCVWV2LYRJM] (ref: [file:///Users/icebergu/workspace/wasm/wasmcloud/ai/wasmcloud-ollama/standalone-provider.par]) started on host [ND4SFXMOROAHNRZV54KCEVU3HQC2BDGRVC3M5FCVSHYS2C3RSK35EIJD]\n```\n**A Wasm AI application does not need to run on the same host as the Ollama provider. The current example runs in standalone mode; if there are multiple hosts, you can deploy on any of them.**\n\n#### Link to Ollama OpenAI Server\n```\n$ wash link put ollama-openai-server VBHDATHFOKV4EUZK62XTQMTG4AZ7HYQDMY7RXQXIRMDMUGCVWV2LYRJM ollama:llm\n```\n\n### Deploying HTTP 
Server Provider\n```\n$ wash start provider oci://wasmcloud.azurecr.io/httpserver:0.19.1\n\nProvider [VAG3QITQQ2ODAOWB5TTQSDJ53XK3SHBEIFNK4AYJ5RKAX2UNSCAPHA5M] (ref: [oci://wasmcloud.azurecr.io/httpserver:0.19.1]) started on host [ND4SFXMOROAHNRZV54KCEVU3HQC2BDGRVC3M5FCVSHYS2C3RSK35EIJD]\n```\n**Note**: Even if you set a large default RPC timeout when starting wasmCloud, httpserver:0.19.1 still limits the timeout to 10s.\nThis has been fixed by https://github.com/wasmCloud/wasmCloud/pull/1439 , so you can build the [httpserver provider](https://github.com/Iceber/wasmCloud/tree/fix_rpc_client_timeout/crates/providers/http-server) yourself to pick up the fix.\n\n#### Link to Ollama OpenAI Server\n\n```\n$ wash link put ollama-openai-server VAG3QITQQ2ODAOWB5TTQSDJ53XK3SHBEIFNK4AYJ5RKAX2UNSCAPHA5M wasmcloud:httpserver \"address=0.0.0.0:7000\"\n```\n\n#### Test connectivity\n```\n$ curl http://127.0.0.1:7000/echo\necho test\n\n$ curl http://127.0.0.1:7000/v1/models\n{\"object\":\"list\",\"data\":[]}\n```\n\n### Selecting and downloading models\nstandalone-provider is compatible with Ollama commands, so there is no need to download additional Ollama binaries.\n```\n$ ./build/standalone-provider --help\n```\n\nSelect an appropriate model from the [Model Library](https://ollama.ai/library), or import a local model via a [Model File](https://github.com/ollama/ollama/blob/main/docs/modelfile.md).\n\nLet's start with the most commonly used model, [Llama2](https://ollama.ai/library/llama2:latest).\n```\n$ ./build/standalone-provider pull llama2\n```\n\n### Accessing Ollama OpenAI Server with curl\n```\n$ curl http://127.0.0.1:7000/v1/models\n{\"object\":\"list\",\"data\":[{\"id\":\"llama2:latest\",\"created\":1705651410,\"object\":\"model\",\"owned_by\":\"Not specified\"},{\"id\":\"llama2-chinese:latest\",\"created\":1707284593,\"object\":\"model\",\"owned_by\":\"Not specified\"}]}\n\n$ curl -X POST http://127.0.0.1:7000/v1/chat/completions -H 'accept:application/json' -H 'Content-Type: application/json' -d 
'{\"model\":\"llama2\",\"messages\":[{\"role\":\"system\",\"content\":\"You are a helpful, respectful and honest assistant. Always answer as short as possible, while being safe.\"},{\"role\":\"user\",\"content\":\"hello\"}],\"temperature\":1}'\n{\n  \"id\": \"01b6ba99-4017-4249-98db-e363807e7aec\",\n  \"object\": \"chat.completion\",\n  \"created\": 1707290034,\n  \"model\": \"llama2\",\n  \"choices\": [\n    {\n      \"index\": 0,\n      \"message\": {\n        \"role\": \"assistant\",\n        \"content\": \"Hello! How can I assist you today?\"\n      },\n      \"finish_reason\": \"stop\"\n    }\n  ],\n  \"usage\": {\n    \"prompt_tokens\": 0,\n    \"completion_tokens\": 0,\n    \"total_tokens\": 0\n  }\n}\n```\n\n### Deploying Chat UI\nUse [chatbot-ui](https://github.com/second-state/chatbot-ui) as ui, or you can choose any other ui.\n```\n$ curl -LO https://github.com/second-state/chatbot-ui/releases/download/v0.1.0/chatbot-ui.tar.gz \u0026\u0026 tar xvf chatbot-ui.tar.gz\n```\n\nGet the ip of the host running the HTTP Server Provider via `ifconfig` or other commands, and generate the nginx configuration.\n```\n$ HOST_IP=\u003cYOUR IP ADDRESS\u003e\n$ echo \"\"\"\nserver {\n    listen       80;\n\n    location /v1/ {\n        proxy_pass   http://$HOST_IP:7000;\n    }\n}\n\"\"\" \u003e nginx.conf\n```\n\nRun nginx with docker\n```\n$ docker run --name chat-server -v `pwd`/chatbot-ui:/etc/nginx/html:ro -v `pwd`/nginx.conf:/etc/nginx/conf.d/default.conf:ro --rm -p 8080:80 nginx\n```\n\nUse a browser to access http://127.0.0.1:8080\n\n## TODO\n* Versioning the ollama wit and dependencies\n* Optimizing field types\n* Add more meaningful logs\n* Push ollama-openai-server and ollama providers to the image 
repository\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ficeber%2Fwasmcloud-ollama","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ficeber%2Fwasmcloud-ollama","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ficeber%2Fwasmcloud-ollama/lists"}