https://github.com/eniompw/llama-cpp-gpu
Load larger models by offloading model layers to both GPU and CPU
- Host: GitHub
- URL: https://github.com/eniompw/llama-cpp-gpu
- Owner: eniompw
- License: MIT
- Created: 2023-06-23T07:05:53.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-07-28T14:00:05.000Z (over 2 years ago)
- Last Synced: 2024-10-18T23:15:32.186Z (about 1 year ago)
- Topics: colab, colab-notebook, gpu, gpu-acceleration, llama, llama-cpp, llamacpp
- Language: Jupyter Notebook
- Homepage:
- Size: 109 KB
- Stars: 3
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# LLaMA.cpp GPU
Offloads some of the model's layers to the GPU while keeping the rest on the CPU, allowing models larger than available VRAM to be loaded.
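A minimal sketch of the idea using llama-cpp-python (the model filename and layer count are illustrative, and the code assumes a cuBLAS-enabled build as described in the link below):

```python
from llama_cpp import Llama

# Assumes llama-cpp-python was built against cuBLAS, e.g. in a Colab cell:
#   !CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
#
# n_gpu_layers controls how many transformer layers are offloaded to the GPU;
# the rest stay on the CPU, so a 13B model can load with limited VRAM.
llm = Llama(
    model_path="Wizard-Vicuna-13B-Uncensored.ggmlv3.q4_0.bin",  # illustrative GGML file
    n_gpu_layers=32,  # tune to fit your GPU's VRAM; 0 = CPU only
    n_ctx=2048,       # context window
)
```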

* [cuBLAS](https://github.com/ggerganov/llama.cpp#cublas): build llama.cpp with cuBLAS support for GPU acceleration
* [Model](https://huggingface.co/TheBloke/Wizard-Vicuna-13B-Uncensored-GGML#how-to-run-in-llamacpp): Wizard-Vicuna-13B-Uncensored-GGML and its llama.cpp run instructions
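Once loaded, generation is a plain llama-cpp-python call; a short usage sketch (the prompt and sampling parameters are arbitrary):

```python
# `llm` is the Llama instance created above.
# The call returns an OpenAI-style completion dict.
output = llm(
    "Q: Name the planets in the solar system. A:",
    max_tokens=64,
    temperature=0.7,
    stop=["Q:"],
)
print(output["choices"][0]["text"])
```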