Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/rhohndorf/Auto-Llama-cpp
Uses Auto-GPT with Llama.cpp
- Host: GitHub
- URL: https://github.com/rhohndorf/Auto-Llama-cpp
- Owner: rhohndorf
- License: MIT
- Created: 2023-04-11T17:03:58.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-04-08T23:15:16.000Z (10 months ago)
- Last Synced: 2024-11-09T18:41:57.582Z (3 months ago)
- Language: Python
- Size: 579 KB
- Stars: 384
- Watchers: 20
- Forks: 68
- Open Issues: 5
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
Awesome Lists containing this project
- Awesome-Auto-GPT - Auto-Llama-cpp - Uses Auto-GPT with Llama.cpp. (🌟334) (GitHub projects)
README
> [!NOTE]
> If you are interested in locally running autonomous agents, also have a look at this [project](https://github.com/rhohndorf/beezle-bug) of mine. It's much cleaner, more stable, and under more active development.

# Auto-Llama-cpp: An Autonomous Llama Experiment
This is a fork of Auto-GPT with added support for running LLaMA models locally through llama.cpp.
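A minimal sketch of what loading and prompting a local model looks like through the llama-cpp-python bindings; the model path and parameter values here are illustrative assumptions, not the repository's actual configuration:

```python
# Minimal sketch: load a local 4-bit model via llama-cpp-python and
# request a JSON-only completion. Path and parameters are illustrative.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/vicuna-13b-4bit.bin",  # hypothetical path
    n_ctx=2048,  # context window; its small size is a pain point (see below)
)

output = llm(
    "You are an autonomous agent. Respond only with valid JSON.",
    max_tokens=256,
    stop=["### ASSISTANT"],  # see the Response Quality notes below
)
print(output["choices"][0]["text"])
```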
This is more of a proof of concept. It's sloooow, and most of the time you're either fighting the too-small context window or the model's answer is not valid JSON. But sometimes it works, and then it's really quite magical what even such a small model comes up with.
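When the JSON does come back malformed, a tolerant parser can salvage some runs. A minimal sketch of that idea (an illustration, not the parsing code Auto-GPT actually ships):

```python
# Sketch: strict JSON parse first, then fall back to extracting the
# first {...} block from the model's chatter. Illustrative only.
import json
import re

def parse_model_json(text: str):
    """Return a dict parsed from model output, or None on failure."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match:
        try:
            return json.loads(match.group(0))
        except json.JSONDecodeError:
            return None
    return None

print(parse_model_json('### ASSISTANT {"command": "browse", "args": {}}'))
```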
But obviously, don't expect GPT-4 brilliance here.

## Supported Models
---
Since this uses [llama.cpp](https://github.com/ggerganov/llama.cpp) under the hood, it should work with all models they support. As of writing, these are:
* LLaMA
* Alpaca
* GPT4All
* Chinese LLaMA / Alpaca
* Vigogne (French)
* Vicuna
* Koala

## Model Performance (the experience so far)
---

### Response Quality
So far I have tried:
* Vicuna-13B-4BIT
* LLaMA-13B-4BIT

Overall, the Vicuna model performed much better than the original LLaMA model, both in answering in the required JSON format and in how much sense the answers make. I just couldn't get it to stop starting every answer with `### ASSISTANT`; one possible workaround is sketched below.
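Assuming you post-process raw completions yourself, such a prefix can simply be stripped. This helper is hypothetical, not part of the repository; another option is passing the prefix as a stop sequence to llama.cpp so generation halts before emitting it:

```python
# Hypothetical helper: strip a known role prefix from a raw completion.
def strip_role_prefix(text: str, prefix: str = "### ASSISTANT") -> str:
    text = text.lstrip()
    if text.startswith(prefix):
        text = text[len(prefix):].lstrip(" :\n")
    return text

print(strip_role_prefix('### ASSISTANT: {"thoughts": "..."}'))
# -> {"thoughts": "..."}
```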
I am very curious to hear how well other models perform. The 7B models seemed to have problems grasping what's asked of them in the prompt, but I tried very little in this direction since the inference speed didn't seem much faster for me.

### Inference Speed
The biggest problem at the moment is indeed inference speed. Since the agent prompts itself a lot, a few seconds of inference that would be acceptable in a chatbot scenario add up to minutes and more.
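A back-of-the-envelope calculation shows how quickly this compounds (all numbers below are illustrative assumptions, not measurements from this repository):

```python
# Why self-prompting magnifies latency. Numbers are assumptions.
tokens_per_second = 3.0  # assumed CPU speed for a 13B 4-bit model
tokens_per_step = 700    # assumed prompt + completion per agent step
steps = 10               # assumed self-prompting iterations per task

minutes = steps * tokens_per_step / tokens_per_second / 60
print(f"{minutes:.1f} minutes for one task")  # -> 38.9 minutes
```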
Testing things like different prompts is a pain under these conditions.

## Discussion
Feel free to add your thoughts and experiences in the [discussion](https://github.com/rhohndorf/Auto-Llama-cpp/discussions) area. What models did you try? How well did they work out for you?

## Future Plans
---
1. Add GPU Support via GPTQ
2. Improve Prompts
3. Remove external API support (this is supposed to be a completely self-contained agent)
4. Add support for [Open Assistant](https://github.com/LAION-AI/Open-Assistant) models