https://github.com/adamydwang/mobilellama
a lightweight C++ LLaMA inference engine for mobile devices
- Host: GitHub
- URL: https://github.com/adamydwang/mobilellama
- Owner: adamydwang
- License: MIT
- Created: 2023-10-18T16:50:53.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-10-28T14:56:56.000Z (over 2 years ago)
- Last Synced: 2025-10-28T22:42:11.149Z (5 months ago)
- Topics: cpp, inference, llama, llm, openblas
- Language: C++
- Homepage:
- Size: 40 KB
- Stars: 15
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# MobileLLaMA
## A Lightweight C++ Implementation of LLaMA for Mobile Devices
# 1. Milestones
- [ ] Naive version: pure C++ implementation of matrix operations
- [x] C++ inference engine
- [ ] Model conversion tool: converts a PyTorch model to the MobileLLaMA format
- [ ] OpenBLAS version: speed up matrix operations with OpenBLAS
# 2. Build and Run
## How to build
```
$ git clone --recurse-submodules https://github.com/adamydwang/mobilellama.git
$ cd mobilellama
$ cd deps && bash build.sh  # build third-party dependencies
$ mkdir ../build && cd ../build && cmake .. && make
```
***Note:*** executables and libraries are written to the *bin* and *lib* directories.
## How to run the demo
***Caution: the model conversion tool is not ready yet, so the demo cannot run.***
```
$ cd bin
$ ./demo ${model_path} ${tokenizer_path}
```