{"id":14240781,"url":"https://github.com/nagaraj-real/localaipilot-api","last_synced_at":"2025-06-12T15:39:18.613Z","repository":{"id":237347507,"uuid":"794347272","full_name":"nagaraj-real/localaipilot-api","owner":"nagaraj-real","description":"API container for Local AI Pilot extension","archived":false,"fork":false,"pushed_at":"2025-02-17T23:25:17.000Z","size":147,"stargazers_count":47,"open_issues_count":1,"forks_count":10,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-02-18T00:22:59.470Z","etag":null,"topics":["ai","chatgpt","codecompletion","copilot","deepseek-r1","ollama","openai","vscode-extension"],"latest_commit_sha":null,"homepage":"https://marketplace.visualstudio.com/items?itemName=nr-codetools.localaipilot","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nagaraj-real.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-05-01T00:19:52.000Z","updated_at":"2025-02-17T23:25:21.000Z","dependencies_parsed_at":"2025-02-18T00:21:55.316Z","dependency_job_id":"8f1d8503-6fbe-4bca-8a8f-78512b163219","html_url":"https://github.com/nagaraj-real/localaipilot-api","commit_stats":null,"previous_names":["nagaraj-real/localaipilot-api"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/nagaraj-real/localaipilot-api","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nagaraj-real%2Flocalaipilot-api","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nagaraj-real%2Flocalaipilot-api/tags","releases_url":"https://repos.ecosyste
.ms/api/v1/hosts/GitHub/repositories/nagaraj-real%2Flocalaipilot-api/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nagaraj-real%2Flocalaipilot-api/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nagaraj-real","download_url":"https://codeload.github.com/nagaraj-real/localaipilot-api/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nagaraj-real%2Flocalaipilot-api/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259495215,"owners_count":22866640,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","chatgpt","codecompletion","copilot","deepseek-r1","ollama","openai","vscode-extension"],"created_at":"2024-08-21T10:01:51.652Z","updated_at":"2025-06-12T15:39:18.593Z","avatar_url":"https://github.com/nagaraj-real.png","language":"Python","readme":"\n## Open in an Online IDE of your choice:\n\n[![Open in Gitpod](https://gitpod.io/button/open-in-gitpod.svg)](https://gitpod.io#https://github.com/nagaraj-real/localaipilot-api)\n[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/nagaraj-real/localaipilot-api?devcontainer_path=.devcontainer%2Fcodespaces%2Fdevcontainer.json)\n[![Open with CodeSandbox](https://assets.codesandbox.io/github/button-edit-lime.svg)](https://codesandbox.io/p/sandbox/github/nagaraj-real/localaipilot-api)\n[![Open in 
Codeanywhere](https://codeanywhere.com/img/open-in-codeanywhere-btn.svg)](https://app.codeanywhere.com/#https://github.com/nagaraj-real/localaipilot-api)\n\n## Standalone Mode\n\nIn standalone (non-container) mode, the extension connects directly to an Ollama instance.\n\n### 🚀 Quick Start\n\n#### 1. Install Ollama on your machine from [Ollama Website](https://ollama.com/download).\n\n#### 2. Pull local models\n\n- Chat Model\n\n  ```sh\n  ollama pull gemma:2b\n  ```\n\n- Code Model\n\n  ```sh\n  ollama pull codegemma:2b\n  ```\n\n#### 3. Set the **mode** to \"Standalone\" in the extension (**Settings \u003e Local AI Pilot \u003e Mode**).\n\n#### [Using different models](#choosing-local-models) for chat/code completion **[Optional]**\n\n- Configure the model used for chat in the extension (**Settings \u003e Local AI Pilot \u003e ollamaModel**).\n- Configure the model used for code completion in the extension (**Settings \u003e Local AI Pilot \u003e ollamaCodeModel**).\n\n---\n\n## Container Mode\n\nIn Container Mode, the LLM API container acts as a bridge between Ollama and the extension, enabling fine-grained customizations and advanced features such as Document Q\u0026A, chat history (caching), and remote models.\n\n### Prerequisites\n\n- Install [Docker](https://www.docker.com/) and [Docker Compose](https://docs.docker.com/compose/)\n\n- **[Optional]** GPU (NVIDIA) -\n  Download and install [NVIDIA® GPU drivers](https://www.nvidia.com/download/index.aspx?lang=en-us)\n\n  Check out [GPU support](#gpu-support-help) for more information.\n\n### 🚀 Quick Start\n\n#### 1. 
Start containers using docker compose\n\nDownload the Docker Compose file and start the services on demand.\n\n[docker-compose-cpu.yml](https://raw.githubusercontent.com/nagaraj-real/localaipilot-api/main/recipes/docker-compose-cpu.yml) | [docker-compose-gpu.yml](https://raw.githubusercontent.com/nagaraj-real/localaipilot-api/main/recipes/docker-compose-gpu.yml)\n\n```sh\ndocker compose -f docker-compose-cpu|gpu.yml up llmapi [ollama] [cache]\n```\n\n**Container Services**\n\n- **llmapi** : LLM API container service that connects the extension with Ollama. All configuration options are available through ENV variables.\n- **ollama [Optional]** : Turn on this service to run [Ollama as container](https://github.com/nagaraj-real/localaipilot-api#running-ollama-as-container).\n- **cache [Optional]** : Turn on this service for caching and searching [chat history](https://github.com/nagaraj-real/localaipilot-api?tab=readme-ov-file#1-chat-history).\n\n\u003e [!TIP]\n\u003e Start with the llmapi service. Add other services based on your needs.\n\nTo connect Docker Compose with Ollama running on localhost (via the [ollama app](https://github.com/nagaraj-real/localaipilot-api?tab=readme-ov-file#1-install-ollama-on-your-machine-from-ollama-website)):\n\n```sh\ndocker compose -f docker-compose-cpu|gpu.yml up llmapi\n\n# set the OLLAMA_HOST env variable to host.docker.internal to reach localhost\n```\n\n#### 2. Set the **mode** to \"Container\" in the extension (**Settings \u003e Local AI Pilot \u003e Mode**).\n\n---\n\n### 📘 Advanced Configuration (Container Mode)\n\n#### 1. Chat History\n\nChat history can be saved in Redis by turning on the cache service.\nBy default, chats are cached for 1 hour, which is configurable in the Docker Compose file.\nThis also enables searching previous chats via the extension by keyword or chat ID.\n\n```sh\ndocker compose -f docker-compose-cpu|gpu.yml up cache\n```\n\n#### 2. 
Document Q\u0026A (RAG Chat)\n\nStart a Q\u0026A chat using Retrieval-Augmented Generation (RAG) and embeddings.\nPull a local model to generate and query embeddings.\n\n- Embed Model\n\n  ```sh\n  ollama pull nomic-embed-text\n  ```\n\nUse the Docker Compose volume (_ragdir_) to bind the folder containing documents for Q\u0026A.\nThe embeddings are stored in the volume (_ragstorage_).\n\n#### 3. Using a different Ollama model\n\n- Pull your [preferred model](#choosing-local-models) from the [Ollama model library](https://ollama.com/library)\n\n  ```bash\n  ollama pull \u003cmodel-name\u003e\n  ollama pull \u003ccode-model-name\u003e\n  ollama pull \u003cembed-model-name\u003e\n  ```\n\n- Update the model name in the Docker Compose environment variables.\n\n  Note: Local model names are prefixed with \"local/\".\n\n  ```env\n  MODEL_NAME: local/\u003cmodel-name\u003e\n  CODE_MODEL_NAME: local/\u003ccode-model-name\u003e\n  EMBED_MODEL_NAME: local/\u003cembed-model-name\u003e\n  ```\n\n---\n\n#### 🌐 Remote models (Container Mode)\n\nRemote models require API keys, which can be configured in the Docker Compose file.\n\nModels from the Gemini, Cohere, OpenAI, and Anthropic LLM providers are supported.\n\nUpdate the model name and API key in the Docker Compose environment variables.\n\nStop the ollama service if it is running, as it is not used for remote inference.\n\n```bash\ndocker compose down ollama\n```\n\nModel names use the _{Provider}/{ModelName}_ format:\n\n- Gemini\n\n  Create API keys at https://aistudio.google.com/app/apikey\n\n  ```env\n   MODEL_NAME: gemini/gemini-pro\n   EMBED_MODEL_NAME: gemini/embedding-001\n   API_KEY: \u003cAPI_KEY\u003e\n   EMBED_API_KEY: \u003cAPI_KEY\u003e\n  ```\n\n- Cohere\n\n  Create API keys at https://dashboard.cohere.com/api-keys\n\n  ```env\n   MODEL_NAME: cohere/command\n   EMBED_MODEL_NAME: cohere/embed-english-v3.0\n   API_KEY: \u003cAPI_KEY\u003e\n   EMBED_API_KEY: \u003cAPI_KEY\u003e\n  ```\n\n- OpenAI\n\n  Create API keys at 
https://platform.openai.com/docs/quickstart/account-setup\n\n  ```env\n   MODEL_NAME: openai/gpt-4o\n   EMBED_MODEL_NAME: openai/text-embedding-3-large\n   API_KEY: \u003cAPI_KEY\u003e\n   EMBED_API_KEY: \u003cAPI_KEY\u003e\n  ```\n\n- Anthropic\n\n  Create API keys at https://www.anthropic.com/ and https://www.voyageai.com/\n\n  ```env\n   MODEL_NAME: anthropic/claude-3-opus-20240229\n   EMBED_MODEL_NAME: voyageai/voyage-2\n   API_KEY: \u003cAPI_KEY\u003e\n   EMBED_API_KEY: \u003cVOYAGE_API_KEY\u003e\n  ```\n\n- Mistral AI (codestral)\n\n  Create API keys at https://console.mistral.ai/codestral\n\n  ```env\n   MODEL_NAME: mistralai\n   CODE_MODEL_NAME: mistralai\n   API_KEY: \u003cAPI_KEY\u003e\n  ```\n\n---\n\n### Choosing Local Models\n\nModels with a larger number of parameters (7b, 70b) are generally more reliable and precise.\nThat said, small models like gemma:2b and phi3 can deliver surprisingly strong results.\nUltimately, the ideal local model depends on your system's resource capacity and the model's performance.\n\n\u003e [!WARNING]\n\u003e Heavier models require more processing power and memory.\n\n#### Chat Models\n\nYou can choose any instruct model for chat.\nFor better results, choose models that are trained for programming tasks.\n\n[gemma:2b](https://ollama.com/library/gemma:2b) | [phi3](https://ollama.com/library/phi3) | [llama3](https://ollama.com/library/llama3) | [deepseek-r1](https://ollama.com/library/deepseek-r1) |\n[gemma:7b](https://ollama.com/library/gemma:7b) | [codellama:7b](https://ollama.com/library/codellama:7b) | [qwen2:1.5b](https://ollama.com/library/qwen2:1.5b)\n\n#### Code Completion Models\n\nFor code completion, choose code models that support FIM (fill-in-the-middle).\n\n[codegemma:2b](https://ollama.com/library/codegemma:2b) | [codegemma:7b-code](https://ollama.com/library/codegemma:7b-code) | [codellama:code](https://ollama.com/library/codellama:code) 
|\n[codellama:7b-code](https://ollama.com/library/codellama:7b-code) | [deepseek-coder:6.7b-base](https://ollama.com/library/deepseek-coder:6.7b-base) | [granite-code:3b-base](https://ollama.com/library/granite-code:3b-base)\n\n\u003e [!IMPORTANT]  \n\u003e Instruct-based models are not supported for code completion.\n\n#### Embed Models\n\nChoose any [embed model](https://ollama.com/library?q=embed).\n\n---\n\n#### Running Ollama as container\n\n```sh\ndocker compose -f docker-compose-cpu|gpu.yml up ollama\n\n# set the OLLAMA_HOST env variable to \"ollama\"\n```\n\nOllama commands are now available via Docker:\n\n```sh\ndocker exec -it ollama-container ollama ls\n```\n\n---\n\n#### GPU support help\n\n- https://hub.docker.com/r/ollama/ollama\n- https://docs.docker.com/compose/gpu-support/\n- https://docs.docker.com/desktop/gpu/\n","funding_links":[],"categories":["Python"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnagaraj-real%2Flocalaipilot-api","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnagaraj-real%2Flocalaipilot-api","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnagaraj-real%2Flocalaipilot-api/lists"}