Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/haschka/cli-rag
Command line tool to Interact with a llama.cpp server. Also implements a basic vector database with cosine similarity search.
- Host: GitHub
- URL: https://github.com/haschka/cli-rag
- Owner: haschka
- License: MIT
- Created: 2024-07-27T16:26:51.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2024-08-24T20:39:10.000Z (4 months ago)
- Last Synced: 2024-08-25T17:50:42.808Z (4 months ago)
- Topics: artificial-intelligence, cli, large-language-models, llama-cpp, llm, unix-shell
- Language: C
- Homepage:
- Size: 39.1 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
## Command line client for llama.cpp
### Video tutorial:
https://www.youtube.com/watch?v=rsBnAF5ZLK8
### Usage instructions:
1. Requirements:
The tool requires the libraries and headers (-dev packages)
of curl, json-c, and GNU readline. On Debian-based systems, i.e.
Debian, Ubuntu, etc., these can be installed with:
```
sudo apt-get install build-essential libjson-c-dev libcurl4-openssl-dev libreadline-dev
```
Further, you need a llama.cpp-compatible large language model and an embedding model.
For the following instructions we suggest:
https://huggingface.co/lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF
https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF
2. Build:
In most cases a simple `make all` should be enough.
If it does not work, edit the makefile to match your
system's library paths and CFLAGS.
3. Conversation Run:
3.1 Start a llama.cpp server:
```
llama.cpp/bin/llama-server -m Meta-Llama-3.1-8B-Instruct-Q6_K.gguf --host 127.0.0.1
```
3.2 Connect to your llama.cpp server with the client:
```
bin/rag-conversation 127.0.0.1 8080 -1
```
When you type your text, finish with `Ctrl-d`. This allows multiline input
on the terminal.
4. Run with RAG:
4.1 Start a llama.cpp server to generate embeddings:
```
llama.cpp/bin/llama-server -m nomic-embed-text-v1.5.f16.gguf --host localhost --port 8081 --embedding
```
4.2 Create a vector database from a text document:
```
bin/build-vector-db-from-server your-text.txt localhost 8081 2000 your-text.vdb
```
4.3 Run the client with vector database support and talk about your text:
```
bin/rag-with-vdb-cos-client localhost 8080 -1 your-text.vdb 3 localhost 8081
```