https://github.com/bot08/local-llm-discord-bot

A lightweight Discord bot interface for interacting with locally-hosted language models. Supports conversation history, streaming responses, and custom configurations.
https://github.com/bot08/local-llm-discord-bot

discord-bot discord-py llama llamacpp

Last synced: 3 months ago
JSON representation

A lightweight Discord bot interface for interacting with locally-hosted language models. Supports conversation history, streaming responses, and custom configurations.

Host: GitHub
URL: https://github.com/bot08/local-llm-discord-bot
Owner: bot08
License: gpl-3.0
Created: 2024-10-22T11:41:16.000Z (9 months ago)
Default Branch: main
Last Pushed: 2025-04-02T20:23:47.000Z (3 months ago)
Last Synced: 2025-04-02T21:29:06.864Z (3 months ago)
Topics: discord-bot, discord-py, llama, llamacpp
Language: Python
Homepage:
Size: 39.1 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

README

        # Local LLM Discord Bot Interface

A lightweight Discord bot interface for interacting with locally-hosted language models. Supports conversation history, streaming responses, and custom configurations.

---

## Core Features

- **DM-Only Interactions**: Restrict bot usage to private messages

- **Context-Aware Chat**: Maintains limited conversation history per user

- **Ping & Clear Commands**: `ping` displays bot latency, `clear` resets user history

- **Message Chunking**: Automatically splits long responses (>2000 chars)

- **Custom Username Injection**: Use [user] in SYSTEM_PROMPT to dynamically insert the current user's name

- **GPU Acceleration**: Configure offloading layers for performance

- **Streaming Mode**: Real-time token delivery with typing simulation

- **Custom Prompts**: Modify system behavior via `SYSTEM_PROMPT`

- **Thread Safety**: Prevents race conditions with user-level locks

---

## Setup Guide

1. **Install Requirements**:

    ```bash

    pip install discord.py llama-cpp-python python-dotenv

    ```

2. **Create `.env`**:

    ```env

    DISCORD_TOKEN=TOKEN

    MODEL_PATH=Llama-3.1-8B-Q4_K_L.gguf

    # Required parameters above. Optional below:

    COMMAND_PREFIX=!

    FULL_LOG=FALSE

    MODEL_N_CTX=1024

    MAX_TOKENS=256

    TOP_K=40

    TOP_P=0.95

    TEMPERATURE=0.7

    REPEAT_PENALTY=1.1

    GPU_LAYERS=7

    ONLY_DM=TRUE

    HISTORY_LIMIT=3

    STREAM_MODE=FALSE

    SYSTEM_PROMPT=You are a helpful assistant. Answer as concisely as possible.

    ```

3. **Run Bot**:

    ```bash

    python main.py

    ```

---

## Full .env Configuration

| Parameter             | Type     | Description                                      | Default               |

|-----------------------|----------|--------------------------------------------------|-----------------------|

| `DISCORD_TOKEN`       | String   | **Required** Discord bot token                  | -                     |

| `COMMAND_PREFIX`      | String   | Bot command prefix                              | `!`                   |

| `FULL_LOG`            | Boolean  | Enable verbose logging                          | `FALSE`               |

| `MODEL_PATH`          | String   | **Required** Path to GGUF model file            | -                     |

| `CHAT_FORMAT`         | String   | Chat formatting style for model                 | `None`                |

| `MODEL_N_CTX`         | Integer  | Context window size                             | `1024`                |

| `MAX_TOKENS`          | Integer  | Maximum tokens per response                     | `256`                 |

| `TOP_K`               | Integer  | Top-k sampling                                  | `40`                  |

| `TOP_P`               | Float    | Top-p sampling                                  | `0.95`                |

| `TEMPERATURE`         | Float    | Response randomness (0.1-2.0)                   | `0.7`                 |

| `REPEAT_PENALTY`      | Float    | Penalize repeated phrases                       | `1.1`                 |

| `GPU_LAYERS`          | Integer  | GPU offloading layers (0=CPU-only)              | `0`                   |

| `ONLY_DM`             | Boolean  | Bot responds only in DMs                        | `TRUE`                |

| `HISTORY_LIMIT`       | Integer  | Max stored message pairs (user+assistant)       | `3`                   |

| `STREAM_MODE`         | Boolean  | Enable real-time token streaming                | `FALSE`               |

| `SYSTEM_PROMPT`       | String   | Initial assistant behavior prompt               | `You are a helpful...`|

---

## TODO

- **Stream fix**:

  - Fix generation interruption caused by Discord API rate limits

  - Implement adaptive delay between token sends

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/bot08/local-llm-discord-bot

Awesome Lists containing this project

README