{"id":28412592,"url":"https://github.com/dineshsoudagar/local-llms-on-android","last_synced_at":"2026-04-26T06:04:21.789Z","repository":{"id":291848990,"uuid":"977593093","full_name":"dineshsoudagar/local-llms-on-android","owner":"dineshsoudagar","description":"Run large language models like Qwen and LLaMA locally on Android for offline, private, real-time question answering and chat - powered by ONNX Runtime.","archived":false,"fork":false,"pushed_at":"2026-04-10T11:57:00.000Z","size":24431,"stargazers_count":121,"open_issues_count":1,"forks_count":16,"subscribers_count":6,"default_branch":"main","last_synced_at":"2026-04-10T20:49:45.075Z","etag":null,"topics":["android","android-app","chatbot","huggingface-tokenizers","llama3","local-llm","local-llm-integration","mobile-ai","offline-inference","on-device-ai","onnx-runtime","qwen"],"latest_commit_sha":null,"homepage":"","language":"Kotlin","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dineshsoudagar.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-05-04T15:04:09.000Z","updated_at":"2026-04-10T03:00:42.000Z","dependencies_parsed_at":"2025-05-06T21:26:51.946Z","dependency_job_id":"a4a3ab9b-5937-4a4b-96c3-a0022404d394","html_url":"https://github.com/dineshsoudagar/local-llms-on-android","commit_stats":null,"previous_names":["dineshsoudagar/llm-english-german-small-translator","dineshsoudagar/local-llm-on-andriod-qwen-qa","dineshsoudagar/local-llm-android"],"tags_count":3,"templat
e":false,"template_full_name":null,"purl":"pkg:github/dineshsoudagar/local-llms-on-android","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dineshsoudagar%2Flocal-llms-on-android","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dineshsoudagar%2Flocal-llms-on-android/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dineshsoudagar%2Flocal-llms-on-android/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dineshsoudagar%2Flocal-llms-on-android/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dineshsoudagar","download_url":"https://codeload.github.com/dineshsoudagar/local-llms-on-android/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dineshsoudagar%2Flocal-llms-on-android/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31658964,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-10T17:19:37.612Z","status":"ssl_error","status_checked_at":"2026-04-10T17:19:13.364Z","response_time":98,"last_error":"SSL_read: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["android","android-app","chatbot","huggingface-tokenizers","llama3","local-llm","local-llm-integration","mobile-ai","offline-inference","on-device-ai","onnx-runtime","qwen"],"created_at":"2025-06-02T23:14:36.361Z","updated_at":"2026-04-26T06:04:21.784Z","avatar_url":"https://github.com/dineshsoudagar.png","language":"Kotlin","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🤖 Pocket LLM for Android (Offline, Private \u0026 Fast)\n\nAn Android application that brings local LLM chat, voice input, image input, OCR, and camera-based prompting to your phone.\n\nPocket LLM runs fully on device after model download. It supports ONNX-based Qwen models, LiteRT-based Qwen 3 and Gemma 4 models, streaming responses, persistent local chat history, markdown-rendered replies, downloadable models, in-app model switching, editable model instructions, and multiple image input workflows.\n\nThe app ships as a small base APK. 
Users download only the models they want, switch between them inside the app, and delete unused models later to save device storage.\n\n---\n\n[![Total APK downloads](https://img.shields.io/github/downloads/dineshsoudagar/local-llms-on-android/total?logo=github\u0026label=Total%20APK%20downloads)](https://github.com/dineshsoudagar/local-llms-on-android/releases)\n\n---\n\n## 🆕 New in v1.5.0\n\nPocket LLM now supports richer local input workflows beyond text chat.\n\n- 🎙️ Added voice input for faster prompting\n- 🖼️ Added image input with OCR, Gemma direct image input, and FastVLM image description support\n- 📷 Added camera capture with retake, crop, and photo review\n- 🗂️ Added a side panel for quick access to previous chats\n- 🗑️ Added easier chat deletion from the history panel\n- 💾 Added downloaded model deletion to free device storage\n- ⚙️ Added editable model instructions with presets and custom prompts\n- 🎨 Added dark mode, light mode, accent colors, and chat font-size control\n- 📋 Added copy button for assistant responses\n\n#### ➡️ [See all releases](https://github.com/dineshsoudagar/local-llms-on-android/releases)\n\n---\n\n### 🔗 Also Check Out\n\n**[local-document-intelligence](https://github.com/dineshsoudagar/local-document-intelligence)**  \nA privacy-first offline document intelligence system with persistent local RAG, hybrid retrieval, and source-grounded answers.\n\n---\n\n## ✨ Features\n\n- 📱 Fully on-device LLM chat for private offline use\n- 🎙️ Voice input for faster prompting\n- 🖼️ Image input with OCR, Gemma vision, and FastVLM support\n- 📷 Camera capture with retake, crop, and photo review\n- 💬 Persistent multi-turn chat with local history\n- 📦 Download, switch, and delete models inside the app\n- 🧠 Supports Qwen2.5, Qwen3, Qwen3 LiteRT, and Gemma 4 LiteRT models\n- ⚡ ONNX and LiteRT backend support\n- 🎛️ Editable model instructions with presets and custom prompts\n- 🎨 Light mode, dark mode, accent colors, and adjustable chat font size\n- 🔐 
Offline after model download, with no telemetry\n\n---\n\n## 📸 Inference Preview\n\n\u003ctable align=\"center\"\u003e\n  \u003ctr\u003e\n    \u003ctd align=\"center\"\u003e\n      \u003cimg src=\"data/Chat.gif\" alt=\"Model Output 1\" width=\"260\"/\u003e\u003cbr/\u003e\n      \u003csub\u003e\u003cb\u003eChat Inference\u003c/b\u003e\u003c/sub\u003e\n    \u003c/td\u003e\n    \u003ctd align=\"center\"\u003e\n      \u003cimg src=\"data/Image support.gif\" alt=\"Model Output 2\" width=\"260\"/\u003e\u003cbr/\u003e\n      \u003csub\u003e\u003cb\u003eImage Support\u003c/b\u003e\u003c/sub\u003e\n    \u003c/td\u003e\n    \u003ctd align=\"center\"\u003e\n      \u003cimg src=\"data/New ui.gif\" alt=\"Chat UI Preview\" width=\"260\"/\u003e\u003cbr/\u003e\n      \u003csub\u003e\u003cb\u003eNew UI\u003c/b\u003e\u003c/sub\u003e\n    \u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cem\u003eFigure: Pocket LLM showing offline chat, image input, and the updated Android UI.\u003c/em\u003e\n\u003c/p\u003e\n\n---\n\n## 📦 Download APK - v1.5.0\n\nThe app ships as a **single smaller base APK**.\n\n#### ➡️ [Download APK](https://github.com/dineshsoudagar/local-llms-on-android/releases/download/v1.5.0/pocket_llm_v1.5.0.apk)\n\nModels are **not bundled inside the APK**. 
After installation, choose and download the models you want directly on device.\n\nYou can download **multiple models**, switch between them inside the app, and delete unused downloaded models later to free storage.\n\n### Available chat models\n\n- **Gemma 4 E4B LiteRT** - Best for **flagship mobiles**\n- **Gemma 4 E2B LiteRT** - Best for **decent to mid-range mobiles**\n- **Qwen3 0.6B LiteRT** - Best for **low-end mobiles**\n- **Qwen3 0.6B Q4F16 ONNX** - Good for **low to mid-range mobiles**\n- **Qwen2.5 0.5B ONNX** - Best for **mid to high-end mobiles**, **full precision**\n\n### Image input support\n\n- **OCR mode** - Extract text from images\n- **Gemma vision mode** - Use Gemma direct image input on supported models\n- **FastVLM mode** - Use lightweight image description for non-Gemma models\n- **Camera capture** - Take a photo, retake, crop, review, and send it as input\n\n\u003e Note: internet is required only for downloading models. Chat, OCR, image input, camera workflows, and inference remain fully on-device after the required models are installed.\n\n---\n\n## 🧠 Backend Support\n\nThis app supports **ONNX-based Qwen models** and **LiteRT-based Qwen 3 and Gemma 4 models**.\n\n### Backend overview\n\n- **ONNX backend**: supports **Qwen2.5** and **Qwen3**\n- **LiteRT backend**: supports **Qwen3** and **Gemma 4**\n\n### Thinking Mode\n\n- **Qwen3** and **Gemma 4** support **Thinking Mode**\n- The toggle is shown only for models that support it\n\n---\n\n## 🚀 Why LiteRT\n\n**LiteRT** is a strong fit for fast local Android chat because:\n\n- It is designed for **high-performance on-device LLM deployment**\n- It supports **hardware acceleration**, including **GPU and NPU acceleration** on supported devices\n- It helps reduce startup and generation latency for local chat workloads\n- It expands the range of practical Android model builds beyond a single backend path\n- It fits well with a privacy-first app design focused on fully offline usage\n\n\u003e Note: 
model capability and performance still depend on the specific model build and the hardware of the target Android device.\n\n---\n\n## ⚙️ Requirements\n\n- [Android Studio](https://developer.android.com/studio)\n- A physical Android device for deployment and testing\n- 4 GB or more RAM for smaller models\n- More RAM is recommended for larger models such as **Gemma 4 E2B** and **Gemma 4 E4B**\n- A temporary internet connection for downloading models inside the app\n- Real hardware is preferred; emulators are mainly useful for UI checks\n\n---\n\n## 🚀 How to Build \u0026 Run\n\n1. Clone this repository.\n2. Install the latest **Android Studio**.\n3. Open the Android project folder in Android Studio:\n\n    ```text\n    pocket_llm_src/\n    ```\n4. Build and install the app on your Android device.\n5. Launch the app.\n6. On first launch, choose a model from the built-in model picker.\n7. Download the selected model directly inside the app.\n8. Start chatting locally on device.\n\n---\n\n## 📄 License Notice\n\n### Gemma 4\n\nGemma 4 is provided by Google under the **Apache License 2.0**. Google's Gemma documentation also states that Gemma models are provided with open weights and support responsible commercial use.\n\n- Gemma 4 license: https://ai.google.dev/gemma/apache_2\n- Gemma 4 overview: https://ai.google.dev/gemma/docs/core\n\n### Qwen models\n\nQwen model files follow the upstream Qwen license terms.  \nPlease review the original model license before redistribution or commercial use.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdineshsoudagar%2Flocal-llms-on-android","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdineshsoudagar%2Flocal-llms-on-android","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdineshsoudagar%2Flocal-llms-on-android/lists"}