https://github.com/infinitensor/infer.cc
- Host: GitHub
- URL: https://github.com/infinitensor/infer.cc
- Owner: InfiniTensor
- Created: 2024-09-12T08:23:36.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2025-05-26T10:58:35.000Z (about 1 year ago)
- Last Synced: 2025-07-08T07:55:40.747Z (12 months ago)
- Language: C++
- Size: 137 KB
- Stars: 4
- Watchers: 1
- Forks: 3
- Open Issues: 2
Metadata Files:
- Readme: README.md
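# InfiniRT, a Cross-Platform Unified Runtime C API, and InfiniInfer, a Tiny C++ LLM Inference Engine
## Usage
### Configure XMake
- Configure XMake and select the hardware platform by passing either `--nv-gpu=true` or `--ascend-npu=true`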
```shell
xmake f [--nv-gpu/--ascend-npu]=true -cv
```
- Build only the runtime library, without multi-device communication or the model inference engine (both are enabled by default)
```shell
xmake f --ccl=false --infer=false -cv
```
### Build and Deploy
- Set the `INFINI_ROOT` environment variable (recommended; the default install location is `$HOME/.infini`)
```shell
export INFINI_ROOT=$HOME/.infini
export LD_LIBRARY_PATH=$INFINI_ROOT/lib:$LD_LIBRARY_PATH
```
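
If these exports live in a shell profile that is sourced more than once, `LD_LIBRARY_PATH` can pick up duplicate entries. The following is a minimal sketch of an idempotent variant; the function name `infini_env` is hypothetical and not part of this repository, and it assumes only the `$INFINI_ROOT/lib` layout shown above.

```shell
# Hypothetical helper, not shipped with the project.
infini_env() {
    # Fall back to the default install location when INFINI_ROOT is unset.
    export INFINI_ROOT="${INFINI_ROOT:-$HOME/.infini}"
    # Prepend the library directory only if it is not already on the path.
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$INFINI_ROOT/lib:"*) ;;
        *) export LD_LIBRARY_PATH="$INFINI_ROOT/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
}

infini_env
```

Calling `infini_env` a second time leaves both variables unchanged.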
- Build and install
```shell
xmake && xmake install
```
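
After installing, it can be useful to confirm that something actually landed in the library directory before running anything that depends on it. The sketch below assumes only that libraries are installed under `$INFINI_ROOT/lib`, as the `LD_LIBRARY_PATH` setting implies; the function name `check_infini_install` is hypothetical, and no specific library file names are assumed.

```shell
# Hypothetical helper, not shipped with the project.
# Usage: check_infini_install [install-root]
check_infini_install() {
    # Use the argument if given, else INFINI_ROOT, else the default location.
    root="${1:-${INFINI_ROOT:-$HOME/.infini}}"
    # Succeed only if the lib directory exists and contains at least one entry.
    if [ -d "$root/lib" ] && [ -n "$(ls -A "$root/lib" 2>/dev/null)" ]; then
        echo "found installed files under $root/lib"
    else
        echo "nothing installed under $root/lib" >&2
        return 1
    fi
}

check_infini_install || echo "run 'xmake && xmake install' first"
```

The function returns a non-zero status when the directory is missing or empty, so it can guard later steps in a script.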
### Test the Inference Engine
- The runtime and the inference engine must be built and installed first
```shell
python test/model/test_llama.py --cuda path/to/model/dir/
```