{"id":13830852,"url":"https://github.com/maxbbraun/llama4micro","last_synced_at":"2025-07-09T12:34:21.334Z","repository":{"id":207299052,"uuid":"718884771","full_name":"maxbbraun/llama4micro","owner":"maxbbraun","description":"A \"large\" language model running on a microcontroller","archived":false,"fork":false,"pushed_at":"2023-12-09T02:01:15.000Z","size":162414,"stargazers_count":465,"open_issues_count":6,"forks_count":33,"subscribers_count":6,"default_branch":"main","last_synced_at":"2024-08-05T10:13:35.135Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/maxbbraun.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-11-15T01:38:55.000Z","updated_at":"2024-07-26T23:52:12.000Z","dependencies_parsed_at":"2024-05-28T09:53:28.412Z","dependency_job_id":"17914d6c-5205-4b9c-a916-9cf1b7dd88a5","html_url":"https://github.com/maxbbraun/llama4micro","commit_stats":null,"previous_names":["maxbbraun/llama4micro"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maxbbraun%2Fllama4micro","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maxbbraun%2Fllama4micro/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maxbbraun%2Fllama4micro/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maxbbraun%2Fllama4micro/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/maxbbraun
","download_url":"https://codeload.github.com/maxbbraun/llama4micro/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225553257,"owners_count":17487293,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-04T10:01:10.406Z","updated_at":"2024-11-20T12:30:34.511Z","avatar_url":"https://github.com/maxbbraun.png","language":"C++","funding_links":[],"categories":["C++"],"sub_categories":[],"readme":"# llama4micro 🦙🔬\n\nA \"large\" language model running on a microcontroller.\n\n![Example run](llama4micro.gif)\n\n## Background\n\nI was wondering if it's possible to fit a non-trivial language model on a microcontroller. Turns out the answer is some version of yes! (Later, things got a bit out of hand and now the prompt is based on objects detected by the camera.)\n\nThis project is using the [Coral Dev Board Micro](https://coral.ai/products/dev-board-micro) with its [FreeRTOS toolchain](https://coral.ai/docs/dev-board-micro/freertos/). The board has a number of neat [hardware features](https://coral.ai/docs/dev-board-micro/get-started/#the-hardware), but – most importantly for our purposes – it has 64MB of RAM. That's tiny for LLMs, which are typically measured in the GBs, but comparatively huge for a microcontroller.\n\nThe LLM implementation itself is an adaptation of [llama2.c](https://github.com/karpathy/llama2.c) and the [tinyllamas](https://huggingface.co/karpathy/tinyllamas/tree/main) checkpoints trained on the [TinyStories](https://huggingface.co/datasets/roneneldan/TinyStories) dataset. 
The quality of the smaller model versions isn't ideal, but good enough to generate somewhat coherent (and occasionally weird) stories.\n\n\u003e [!NOTE]\n\u003e Language model inference runs on the 800 MHz [Arm Cortex-M7](https://developer.arm.com/Processors/Cortex-M7) CPU core. Camera image classification uses the [Edge TPU](https://coral.ai/technology/) and a [compiled](https://coral.ai/docs/edgetpu/compiler/) [YOLOv5 model](https://github.com/ultralytics/yolov5). The board also has a second 400 MHz [Arm Cortex-M4](https://developer.arm.com/Processors/Cortex-M4) CPU core, which is currently unused.\n\n## Setup\n\nClone this repo with its submodules [`karpathy/llama2.c`](https://github.com/karpathy/llama2.c), [`google-coral/coralmicro`](https://github.com/google-coral/coralmicro), and [`ultralytics/yolov5`](https://github.com/ultralytics/yolov5).\n\n```bash\ngit clone --recurse-submodules https://github.com/maxbbraun/llama4micro.git\n\ncd llama4micro\n```\n\nThe pre-trained models are in the [`models/`](models/) directory. Refer to the [instructions](models/README.md) on how to download and convert them.\n\nBuild the image:\n\n```bash\nmkdir build\ncd build\n\ncmake ..\nmake -j\n```\n\nFlash the image:\n\n```bash\npython3 -m venv venv\n. venv/bin/activate\n\npip install -r ../coralmicro/scripts/requirements.txt\n\npython ../coralmicro/scripts/flashtool.py \\\n    --build_dir . \\\n    --elf_path llama4micro\n```\n\n## Usage\n\n1. The models load automatically when the board powers up.\n   - This takes ~7 seconds.\n   - The green light will turn on when ready.\n2. Point the camera at an object and press the button.\n   - The green light will turn off.\n   - The camera will take a picture and detect an object.\n3. The model now generates tokens starting with a prompt based on the object.\n   - The results are streamed to the serial port.\n   - This happens at a rate of ~2.5 tokens per second.\n4. 
Generation stops at the end token or after the maximum number of steps.\n   - The green light will turn on again.\n   - Goto 2.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmaxbbraun%2Fllama4micro","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmaxbbraun%2Fllama4micro","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmaxbbraun%2Fllama4micro/lists"}