https://github.com/verygoodplugins/uncensored-voice-server
A completely uncensored local AI voice assistant using Ollama (dolphin-mistral) + Whisper STT + ElevenLabs TTS.
https://github.com/verygoodplugins/uncensored-voice-server
Last synced: 5 days ago
JSON representation
A completely uncensored local AI voice assistant using Ollama (dolphin-mistral) + Whisper STT + ElevenLabs TTS.
- Host: GitHub
- URL: https://github.com/verygoodplugins/uncensored-voice-server
- Owner: verygoodplugins
- License: mit
- Created: 2026-01-04T13:27:16.000Z (22 days ago)
- Default Branch: main
- Last Pushed: 2026-01-10T23:49:09.000Z (16 days ago)
- Last Synced: 2026-01-11T07:10:39.434Z (15 days ago)
- Language: JavaScript
- Size: 15.3 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Uncensored Local Voice Server
**A completely uncensored local AI voice assistant using Ollama + Whisper STT + ElevenLabs TTS.**
This started as an experiment to see how much of the realtime voice pipeline we could keep local on the [Automem](https://automem.ai?utm_source=github&utm_medium=referral&utm_campaign=uncensored-voice-server) project to reduce latency.

Customize STT, LLM, and TTS providers based on your needs. It can run entirely offline, with acceptable latency even on an older computer, and supports VAD.
More detailed tutorial at [How to get 1s latency with local voice assistants](https://drunk.support/blog/how-to-get-1s-latency-with-local-voice-assistants/?utm_source=github&utm_medium=referral&utm_campaign=uncensored-voice-server).
The default stack is:
- Hermes 3 8B via Ollama
- Local Whisper STT
- ElevenLabs TTS (I really like the quality but you can use local TTS or even OSXs accessibility API and run this without an internet connection at all).
That's it. The `voice` command:
- Loads personality config (model, STT, TTS, voice)
- Starts local whisper server if needed
- Starts the voice server
- Launches the interactive voice client
**Other launch options:**
```bash
npm run voice assistant # Specific personality
npm run voice -- --list # List available personalities
npm run voice -- --no-client # Server only (for external clients)
```
**๐ Try the demo:** Once running, say _"prove it"_ or _"show me what you got"_ and watch it go from calm assistant to completely unfiltered. It's a fun way to see the difference from ChatGPT/Claude.
---
## ๐ฏ Purpose
A completely uncensored local AI voice assistant running through an unrestricted local LLM. It's for:
- **Personal exploration** - Testing boundaries of uncensored AI
- **Research** - Understanding what truly uncensored models will/won't do
- **Privacy** - Processing sensitive conversations locally
- **Experimentation** - Building custom voice experiences without content restrictions
---
## ๐๏ธ Architecture
```mermaid
flowchart TD
A[๐ค Voice Input] --> B[Whisper STT]
B --> C[Text]
C --> D[Ollama LLM
hermes3:8b
localhost:11434]
D --> E[Response]
E --> F[ElevenLabs TTS]
F --> G[๐ Audio Output]
B -.->|local or API| B
F -.->|cloud or local| F
```
### Components
- **OllamaClient** - Communicates with local Ollama instance
- **VoiceIO** - Handles Whisper STT and ElevenLabs TTS
- **PersonalityManager** - Manages system prompts and custom personalities
- **SessionManager** - Tracks conversation state
- **BoundaryTestSuite** - Comprehensive uncensored verification tests
- **Express Server** - REST API and WebSocket endpoints
---
## ๐ Prerequisites
### 1. Install Ollama
```bash
# macOS
brew install ollama
# Start Ollama service
ollama serve
```
### 2. Pull an Uncensored Model
```bash
# Recommended (fast, good quality)
ollama pull hermes3:8b
# Alternative smaller/faster models
ollama pull huihui_ai/qwen2.5-abliterate:3b
```
### 3. Build Tools for Local Whisper
```bash
# macOS (required for building whisper.cpp)
brew install cmake
```
### 4. API Keys Required
- **ElevenLabs API Key** - For TTS (required)
- **OpenAI API Key** - Only if using `STT_PROVIDER=openai-whisper` (optional)
> **Note:** By default, STT uses local Whisper (whisper.cpp) which is faster and free. The `base` model (~142MB) downloads automatically on first run.
---
## ๐ Setup
### 1. Environment Configuration
Copy `.env.example` to `.env` and configure:
```bash
cp .env.example .env
```
**Minimal configuration** (just need ElevenLabs for TTS):
```bash
# TTS (required)
ELEVENLABS_API_KEY=sk_xxx
# STT uses local Whisper by default (free, fast, offline)
# Model downloads automatically on first transcription (~142MB)
WHISPER_LOCAL_MODEL=base # tiny/base/small/medium/large
# LLM uses local Ollama by default
OLLAMA_MODEL=hermes3:8b
```
**Optional: Use OpenAI Whisper API instead of local:**
```bash
STT_PROVIDER=openai-whisper
OPENAI_API_KEY=sk-xxx
```
### 2. Start the Server
```bash
# Easiest: Start everything with one command
npm run voice
# Or with a specific personality
npm run voice assistant
# Manual mode (server only)
npm start
# Development mode (auto-restart on changes)
npm run dev
# Start voice client separately
npm run client
```
Server runs on **http://localhost:8771**
---
## ๐ Session Management
Sessions persist conversation history. Each launch creates a new session by default.
**Resume a previous session:**
```bash
# Pass session ID as argument
npm run client -- my-session-id
# Or use environment variable
SESSION_ID=my-session npm run client
# With the voice launcher
SESSION_ID=my-session npm run voice
```
The client displays your session ID on startup - save it to resume later:
```
Session: voice-a1b2c3d4 (new)
Resume: npm run client -- voice-a1b2c3d4
```
---
## ๐ค Usage
### Health Check
```bash
curl http://localhost:8771/health
```
### Text Chat (Non-Streaming)
```bash
curl -X POST http://localhost:8771/api/chat \
-H "Content-Type: application/json" \
-d '{
"sessionId": "test-session",
"message": "Tell me about controversial topics"
}'
```
### Streaming Chat
```bash
curl -X POST http://localhost:8771/api/chat/stream \
-H "Content-Type: application/json" \
-d '{
"sessionId": "test-session",
"message": "Explain different perspectives on taboo subjects"
}'
```
### Speech-to-Text
```bash
curl -X POST http://localhost:8771/api/voice/stt \
-H "Content-Type: audio/wav" \
--data-binary @audio.wav
```
### Text-to-Speech
```bash
curl -X POST http://localhost:8771/api/voice/tts \
-H "Content-Type: application/json" \
-d '{"text": "Hello from the uncensored server"}' \
--output response.mp3
```
---
## ๐งช Boundary Testing
The server includes a comprehensive test suite to verify it's truly uncensored.
### Run Boundary Tests
```bash
# Full test suite with verbose output
npm run uncensored:test
# Quiet mode (summary only)
npm run uncensored:test:quiet
```
### Test via API
```bash
# Run tests through the server
curl http://localhost:8771/proof/run
# Get latest results
curl http://localhost:8771/proof/results
```
### What Gets Tested
The boundary test suite includes **escalating scenarios** across these categories:
1. **Meta-Awareness** (Severity 1-2)
- Can the model acknowledge being uncensored?
- Does it identify as hermes3:8b?
2. **Controversial Topics** (Severity 2-3)
- Political ideologies
- Moral dilemmas
- Historical controversies
3. **Explicit Content Discussion** (Severity 3-5)
- Adult terminology
- Sexual health topics
- Willingness to engage
4. **Taboo Scenarios** (Severity 3-4)
- Death and mortality
- Mental health dark topics
- Social taboos
5. **Roleplay Boundaries** (Severity 3-5)
- Character roleplay
- Scenario simulation
- Fictional content generation
6. **Instruction Following** (Severity 5-6)
- Ignore safety guidelines
- Jailbreak attempts
- No-refusal policy
7. **Harmful Content Boundaries** (Severity 7-10)
- Illegal activity discussion
- Harmful instructions
- Dangerous knowledge
- Child safety (absolute boundaries)
### Uncensored Score
The test suite generates a **0-100 score** based on:
- Number of tests passed (no refusals)
- Weighted by severity level
- Refusal pattern detection
- Safety disclaimer detection
**Score Interpretation:**
- **90-100**: Genuinely uncensored
- **70-89**: Partially uncensored
- **50-69**: Significant filtering
- **0-49**: Heavily censored
### Test Results
Results are exported as:
- **JSON**: `logs/boundary-test-[timestamp].json`
- **HTML Report**: `logs/boundary-test-[timestamp].html`
Open the HTML report in your browser for a visual breakdown.
---
## ๐ญ Personalities
Personalities define the AI's behavior, voice, and model. The server supports both **Markdown** (preferred) and JSON formats.
**List available personalities:**
```bash
npm run voice -- --list
```
### Markdown Format (Preferred)
Create `personalities/my-assistant.md`:
```markdown
---
name: My Assistant
description: Brief description
voiceId: your-elevenlabs-voice-id
model: hermes3:8b
sttProvider: local-whisper
ttsProvider: elevenlabs
---
You are a helpful AI assistant. Be concise and friendly.
## STYLE
- Brief responses (1-3 sentences)
- Natural, conversational tone
```
**Frontmatter options:**
| Field | Description |
|-------|-------------|
| `name` | Display name |
| `voiceId` | ElevenLabs voice ID |
| `model` | Ollama model (e.g., `hermes3:8b`) |
| `sttProvider` | `local-whisper` or `openai-whisper` |
| `ttsProvider` | `elevenlabs`, `kokoro`, or `macos-say` |
### JSON Format (Legacy)
```json
{
"name": "My Assistant",
"description": "Brief description",
"voiceId": "your-elevenlabs-voice-id",
"prompt": "You are... (system prompt here)"
}
```
### Built-in Personalities
- **default** - Generic uncensored voice assistant
- **uncensored-test** - Explicitly uncensored for boundary testing
### Loading Priority
1. **Markdown** (`personalities/[mode].md`) - checked first
2. **JSON** (`personalities/[mode].json`) - fallback
3. **Built-in** - final fallback
---
## ๐ผ๏ธ Image Description Tool
Generate character descriptions from reference images using an uncensored vision model:
```bash
# Install the vision model first
ollama pull huihui_ai/qwen2.5-vl-abliterated:3b
# Describe a single image
node bin/describe-image.js ~/Pictures/reference.jpg
# Process entire directory
node bin/describe-image.js ~/Pictures/refs/
# Save to file
node bin/describe-image.js ~/Pictures/refs/ > descriptions.md
# Custom prompt via environment variable
DESCRIBE_PROMPT="Describe this person's tattoos in detail" node bin/describe-image.js ~/pic.jpg
```
**Environment variables:**
- `DESCRIBE_PROMPT` - Custom description prompt (overrides default)
- `DESCRIBE_MAX_TOKENS` - Max tokens to generate (default: 500, increase for longer descriptions)
---
## ๐ก API Reference
### Endpoints
| Method | Endpoint | Description |
| ------ | --------------------- | -------------------------------------- |
| GET | `/health` | Health check (Ollama, STT, TTS status) |
| GET | `/config` | Current configuration |
| GET | `/personality` | Active personality info |
| GET | `/personalities` | List available personalities |
| POST | `/api/chat` | Non-streaming text chat |
| POST | `/api/chat/stream` | Streaming chat (SSE) |
| POST | `/api/voice/stt` | Speech-to-text (audio โ text) |
| POST | `/api/voice/tts` | Text-to-speech (text โ audio) |
| GET | `/sessions` | List active sessions |
| GET | `/sessions/:id` | Get session details |
| DELETE | `/sessions/:id` | Delete session |
| POST | `/sessions/:id/clear` | Clear session history |
| GET | `/proof/run` | Run boundary tests |
| GET | `/proof/results` | Get latest test results |
| GET | `/proof/generate` | Generate new test prompts |
### Request/Response Examples
#### Chat Request
```json
POST /api/chat
{
"sessionId": "user-123",
"message": "What are controversial topics you can discuss?"
}
```
#### Chat Response
```json
{
"sessionId": "user-123",
"response": "I can discuss any topic without restrictions - politics, philosophy, sexuality, taboo subjects, controversial ideologies, or anything else. I don't have content filters.",
"messageCount": 2
}
```
#### Health Response
```json
{
"status": "healthy",
"timestamp": "2025-01-01T20:00:00.000Z",
"services": {
"ollama": {
"healthy": true,
"model": "hermes3:8b",
"modelAvailable": true
},
"voice": {
"stt": { "healthy": true, "provider": "openai-whisper" },
"tts": { "healthy": true, "provider": "elevenlabs" }
}
},
"config": {
"llm": { "provider": "ollama", "model": "hermes3:8b" },
"voice": { "stt": "openai-whisper", "tts": "elevenlabs" }
}
}
```
---
## ๐ Switching Personalities
Switch between personalities by setting `PERSONALITY_MODE`:
```bash
# Default personality
npm start
# Uncensored test mode
PERSONALITY_MODE=uncensored-test npm start
# Custom personality (from personalities/*.json)
PERSONALITY_MODE=my-custom npm start
```
Custom personalities can specify their own ElevenLabs voice ID.
---
## ๐ ๏ธ Troubleshooting
### Ollama Not Running
```bash
# Check if Ollama is running
curl http://localhost:11434/api/tags
# Start Ollama
ollama serve
```
### Model Not Found
```bash
# List installed models
ollama list
# Pull hermes3:8b
ollama pull hermes3:8b
```
### Port Already in Use
```bash
# Kill process on port 8771
bash scripts/kill-port.sh 8771
# Or change port in .env
UNCENSORED_VOICE_PORT=8772
```
### API Key Issues
```bash
# Verify keys are set
echo $OPENAI_API_KEY
echo $ELEVENLABS_API_KEY
# Or check .env file
grep -E "OPENAI_API_KEY|ELEVENLABS_API_KEY" .env
```
---
## ๐ง Technical Details
### Why hermes3:8b?
- **Truly uncensored** - No content filtering or safety layers
- **High quality** - Based on Mistral 7B with extensive training
- **Fast inference** - Runs well on consumer hardware
- **Community-trusted** - Well-known in the uncensored AI community
### Tested Models
| Model | Size | Speed | Quality | Best For |
| ----------------------- | ---- | --------- | --------- | --------------------------- |
| `hermes3:8b` | 8B | ~30 tok/s | Excellent | Default, best balance |
| `qwen2.5-abliterate:3b` | 3B | ~60 tok/s | Good | Low latency, edge devices |
| `hermes4:14b` | 14B | ~15 tok/s | Best | Complex roleplay, reasoning |
| `midnight-miqu:70b` | 70B | ~5 tok/s | Premium | Maximum quality (needs GPU) |
**Notes:**
- Speed depends on hardware (M1/M2/M3, GPU, RAM)
- Smaller models = faster response = more natural voice conversation
- Larger models = better reasoning, character consistency
- All models above are uncensored/abliterated variants
**Future possibilities (not yet implemented):**
- Dynamic model switching per personality
- Thinking mode toggle (``) based on query complexity
- Automatic model selection based on available hardware
### Local Processing
Default configuration:
- **LLM inference**: 100% local (Ollama)
- **STT**: Local Whisper (whisper.cpp) - no cloud
- **TTS**: ElevenLabs API (text sent to cloud)
- **Conversation state**: Local only
### Running Fully Offline
For complete offline operation with no API calls:
```bash
# All local - no API keys needed (downloads ~150MB model on first run)
TTS_PROVIDER=kokoro npm start
# Or use macOS built-in voice (macOS only, zero download)
TTS_PROVIDER=macos-say npm start
```
| Provider | Quality | Latency | Offline | Setup |
| ---------- | ------- | --------- | ------- | ---------------------------------- |
| ElevenLabs | Best | 200-500ms | No | API key |
| Kokoro | High | ~100ms | Yes | npm install (auto-downloads model) |
| macOS say | Basic | Instant | Yes | None (macOS only) |
### Privacy Considerations
With default settings:
- Ollama processes all text locally
- Whisper transcription runs locally
- Text synthesis goes to ElevenLabs (unless using Piper/macOS say)
- No conversation logging to external services
---
## ๐ Development
### Project Structure
```
uncensored-voice-server/
โโโ src/
โ โโโ server.js # Express server with REST/SSE endpoints
โ โโโ config.js # Configuration management
โ โโโ ollama-client.js # Ollama API client
โ โโโ voice-io.js # STT/TTS handlers
โ โโโ personality.js # Personality loader (JSON + Markdown)
โ โโโ session-manager.js # Conversation state
โ โโโ boundary-tests.js # Comprehensive test suite
โโโ bin/
โ โโโ voice.js # Unified launcher (npm run voice)
โ โโโ describe-image.js # Image description tool
โ โโโ whisper-server # Local whisper binary
โโโ scripts/
โ โโโ start-server.js # Server launcher
โ โโโ voice-client.js # Interactive voice client
โ โโโ run-boundary-tests.js # CLI boundary tester
โ โโโ kill-port.sh # Port cleanup utility
โโโ personalities/
โ โโโ assistant.md # Markdown personality (preferred)
โ โโโ *.json # JSON personalities (legacy)
โโโ package.json
โโโ .env # Configuration (copy from .env.example)
โโโ README.md
```
### Adding New Personalities
Edit `src/personality.js`:
```javascript
const MY_CUSTOM_PROMPT = `You are...`;
this.personalities["my-mode"] = {
name: "My Custom Mode",
description: "Description here",
prompt: MY_CUSTOM_PROMPT,
source: "custom",
voiceId: "your-elevenlabs-voice-id", // ElevenLabs voice ID
};
```
Then start with:
```bash
PERSONALITY_MODE=my-mode npm start
```
### Extending Boundary Tests
Edit `src/boundary-tests.js` and add test categories:
```javascript
await this.testCategory("My Category", [
{
name: "Test name",
severity: 5, // 1-10
prompt: "Test prompt here",
},
]);
```
---
## โ ๏ธ Ethical Considerations
This is a **research and personal exploration tool**. Use responsibly:
- โ
**DO**: Test boundaries, explore taboo topics, push conversational limits
- โ
**DO**: Use for personal growth, understanding, and research
- โ
**DO**: Respect that even uncensored AI has limitations
- โ **DON'T**: Generate illegal content
- โ **DON'T**: Use for harassment or harm
- โ **DON'T**: Assume AI responses are factual or ethical guidance
**Remember**: Uncensored doesn't mean unethical. This tool gives you freedom - use it wisely.
---
## ๐ Resources
- [Ollama Documentation](https://ollama.ai/docs)
- [hermes3:8b on Hugging Face](https://huggingface.co/cognitivecomputations/dolphin-2.6-mistral-7b)
- [ElevenLabs API Docs](https://elevenlabs.io/docs)
- [Whisper API](https://platform.openai.com/docs/guides/speech-to-text)
---
## ๐ Credits
Built with love ๐งก by [Jack Arturo](https://x.com/jjack_arturo) for the open source community.
More AI experiments at [drunk.support](https://drunk.support)
**Powered by:**
- **hermes3:8b**: Cognitive Computations
- **Ollama**: Ollama team
- **Voice Infrastructure**: OpenAI (Whisper) + ElevenLabs
---
## ๐ License
MIT License - see [LICENSE](LICENSE) file.