https://github.com/forcetrainer/discord-analyzer
Export Discord messages and analyze them for FAQs using semantic clustering
clustering discord discord-bot faq fastapi nlp python react semantic-search
Last synced: 5 months ago
- Host: GitHub
- URL: https://github.com/forcetrainer/discord-analyzer
- Owner: forcetrainer
- License: other
- Created: 2026-01-10T18:36:06.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2026-01-10T23:20:14.000Z (5 months ago)
- Last Synced: 2026-01-11T05:33:20.090Z (5 months ago)
- Topics: clustering, discord, discord-bot, faq, fastapi, nlp, python, react, semantic-search
- Language: Python
- Size: 424 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Discord Chat Analyzer
A web-based tool to export Discord messages and analyze them for frequently asked questions. It helps you identify common questions in your community and the responses they receive, so you can build better documentation.
> **Warning**
>
> This tool is designed for **personal use on servers you control**. Whoever holds the bot token has access to read messages from **all servers** the bot has been added to. Do not share your bot's invite link publicly or offer this as a service to others. If someone adds your bot to their server, you will be able to export their messages. Only use this tool on your own communities.
## Features
- Export messages from any Discord channel your bot has access to
- Automatic question detection using pattern matching
- Semantic clustering of similar questions using sentence embeddings
- Response extraction to show how questions were answered
- Web UI for managing exports and viewing results

*Clustered FAQ questions with community responses*
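The pattern-matching step listed above can be illustrated with a small heuristic. This is only a sketch: the project's actual rules are not documented here, and the `QUESTION_START` pattern and the `looks_like_question` helper are hypothetical names chosen for illustration.

```python
import re

# Hypothetical list of question openers; the project's real patterns may differ.
QUESTION_START = re.compile(
    r"^(who|what|when|where|why|how|which|can|could|does|do|did|"
    r"is|are|should|would|will)\b",
    re.IGNORECASE,
)


def looks_like_question(text: str) -> bool:
    """Return True if a message looks like a question.

    A message qualifies if it ends with a question mark or
    starts with a common interrogative or auxiliary word.
    """
    text = text.strip()
    if not text:
        return False
    if text.endswith("?"):
        return True
    return bool(QUESTION_START.match(text))
```

A heuristic like this is cheap enough to run over every exported message, so only likely questions need to go through the more expensive embedding step.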
## Prerequisites
- Python 3.11+
- Node.js 18+
- A Discord bot token
## Setup
### 1. Create a Discord Bot
1. Go to the [Discord Developer Portal](https://discord.com/developers/applications)
2. Click "New Application" and give it a name
3. Go to the "Bot" section and click "Add Bot"
4. Enable these Privileged Gateway Intents:
   - **Message Content Intent** (required to read message text)
5. Copy the bot token (you'll need this later)
6. Go to "OAuth2" > "URL Generator":
   - Under Scopes: select `bot`
   - Under Bot Permissions: select `View Channels` (General) and `Read Message History` (Text)
7. Use the generated URL to invite the bot to your server
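The invite URL from step 6 can also be built by hand, which helps when checking exactly which permissions a link grants. The sketch below assumes Discord's standard permission bit flags for `View Channels` and `Read Message History`. The `build_invite_url` helper is a hypothetical name, and `client_id` stands for your application's ID from the Developer Portal.

```python
from urllib.parse import urlencode

# Discord permission bit flags for the two permissions selected in step 6.
VIEW_CHANNEL = 1 << 10          # "View Channels"
READ_MESSAGE_HISTORY = 1 << 16  # "Read Message History"


def build_invite_url(client_id: str) -> str:
    """Build a bot invite URL that requests only read access."""
    params = {
        "client_id": client_id,
        "scope": "bot",
        "permissions": VIEW_CHANNEL | READ_MESSAGE_HISTORY,
    }
    return "https://discord.com/oauth2/authorize?" + urlencode(params)
```

Requesting only these two permissions keeps the bot read-only, in line with the warning above about the access a bot token grants.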
### 2. Backend Setup
```bash
cd backend
# Create virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Start the server
python3 run.py
```
The backend will start on http://localhost:8000
**Note:** The first time you run an analysis, it will download the sentence-transformers model (~90MB). This only happens once.
### 3. Frontend Setup
```bash
cd frontend
# Install dependencies
npm install
# Start the dev server
npm run dev
```
The frontend will start on http://localhost:5173
## Usage
### 1. Setup
Enter your Discord bot token on the Setup page.

### 2. Export
Select a server and channels, then start exporting messages.

### 3. Analysis
Once export completes, run semantic analysis to find FAQ clusters.

### 4. Results
View grouped questions and their community responses.

## Analysis Parameters
- **Min Cluster Size**: Minimum number of similar questions needed to form a cluster (default: 5)
- **Min Samples**: Core samples required for HDBSCAN clustering (default: 3)
Lower values produce more clusters with fewer questions each. Higher values produce fewer, larger clusters and leave more questions unclustered.
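To show where these two parameters enter the pipeline, here is a minimal sketch using the `sentence-transformers` and `hdbscan` packages. It is not the project's actual code: the function names are hypothetical, and the model name `all-MiniLM-L6-v2` is an assumption.

```python
from collections import defaultdict


def embed_and_cluster(questions, min_cluster_size=5, min_samples=3):
    """Embed questions and return one HDBSCAN label per question."""
    # Imported lazily so group_by_label works without these packages installed.
    import hdbscan
    from sentence_transformers import SentenceTransformer

    # Assumed model name; the project may use a different one.
    model = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = model.encode(questions)
    clusterer = hdbscan.HDBSCAN(
        min_cluster_size=min_cluster_size,
        min_samples=min_samples,
    )
    return clusterer.fit_predict(embeddings).tolist()


def group_by_label(questions, labels):
    """Group questions by cluster label, skipping noise (label -1)."""
    clusters = defaultdict(list)
    for question, label in zip(questions, labels):
        if label != -1:  # HDBSCAN marks unclustered points with -1
            clusters[label].append(question)
    return dict(clusters)
```

Questions labeled `-1` belong to no cluster. If too many end up there, lowering either parameter is the first thing to try.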
## Architecture
See [ARCHITECTURE.md](./ARCHITECTURE.md) for detailed technical documentation.
## Tech Stack
**Backend:**
- FastAPI (async Python web framework)
- discord.py (Discord API wrapper)
- sentence-transformers (semantic embeddings)
- HDBSCAN (density-based clustering)
- SQLite (local database)
**Frontend:**
- React + TypeScript
- Vite (build tool)
- TanStack Query (data fetching)
- Tailwind CSS (styling)
## Development
### Demo Mode
To explore the UI without a Discord bot:
```bash
cd backend
python3 seed_demo_data.py
DISCORD_ANALYZER_DEMO_MODE=true python3 run.py
```
This seeds the database with sample data and mocks the Discord connection.
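The inline `VAR=value command` form above is POSIX shell syntax and does not work in Windows `cmd` or PowerShell. A small launcher script is one portable alternative. This is a sketch under the assumption that it runs from the repository root. `demo_env` and `run_demo` are hypothetical helpers, not part of the project.

```python
import os
import subprocess
import sys


def demo_env(base=None):
    """Return a copy of the environment with demo mode switched on."""
    env = dict(os.environ if base is None else base)
    env["DISCORD_ANALYZER_DEMO_MODE"] = "true"
    return env


def run_demo(backend_dir="backend"):
    """Seed the sample data, then start the backend in demo mode."""
    # Assumes backend_dir contains seed_demo_data.py and run.py.
    subprocess.run(
        [sys.executable, "seed_demo_data.py"], cwd=backend_dir, check=True
    )
    subprocess.run(
        [sys.executable, "run.py"], cwd=backend_dir, env=demo_env(), check=True
    )
```

Calling `run_demo()` from a script in the repository root performs the same two steps as the shell commands.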
## License
This project is licensed under the [PolyForm Noncommercial License 1.0.0](./LICENSE). You are free to use, modify, and share this software for noncommercial purposes with attribution. Commercial use is not permitted.