https://github.com/spences10/audiomind
An MP3 to AI Chat Assistant - A configurable AI chat assistant that can be customized for your content and use case. Transform audio content into interactive, searchable conversations.
deepgram libsql svelte svelte5 sveltekit turso voyageai
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/spences10/audiomind
- Owner: spences10
- Created: 2024-12-08T09:55:54.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2025-03-25T19:47:45.000Z (3 months ago)
- Last Synced: 2025-03-25T20:38:45.460Z (3 months ago)
- Topics: deepgram, libsql, svelte, svelte5, sveltekit, turso, voyageai
- Language: TypeScript
- Homepage:
- Size: 362 KB
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 12
Metadata Files:
- Readme: README.md
README
# AudioMind
> **Note**: This started as a personal/toy project I created to ask
> questions to podcasts I listen to. I've made it generic so others
> can use it with their own audio content. Feel free to adapt it for
> your needs!

An MP3 to AI Chat Assistant - A configurable AI chat assistant that
can be customized for your content and use case. Transform audio
content into interactive, searchable conversations.

## ⚠️ Important Security Notice
This is a demonstration project and **should not be deployed to
production** without implementing additional security measures:

- The application currently has no user authentication
- Lack of authentication could lead to:
  - Unauthorized access to your API endpoints
  - Potential abuse of your API rate limits
  - Excessive costs from unrestricted API usage
  - Malicious use of your services
- Consider implementing proper authentication, rate limiting, and
  usage monitoring before any production deployment

## Technologies Used
- **Frontend Framework**: SvelteKit 2.x with Svelte 5
- **UI/Styling**:
  - TailwindCSS for styling
  - DaisyUI for UI components
  - Typography plugin for content formatting
- **AI/Machine Learning**:
  - Anthropic Claude 3 for natural language processing
  - Deepgram for audio transcription
  - Voyage AI for text embeddings used in similarity search
- **Database**: Turso (LibSQL) for data storage
- **Development Tools**:
  - TypeScript for type safety
  - Vite for development and building
  - ESLint and Prettier for code quality
  - Vitest and Playwright for testing
- **Performance**:
  - LRU Cache for response caching
  - Server-sent events for real-time progress updates

## Setup
1. Clone the repository
2. Copy `.env.example` to `.env`:

   ```bash
   cp .env.example .env
   ```

3. Fill in your environment variables in `.env`
4. Install dependencies:

   ```bash
   npm install
   ```

5. Start the development server:

   ```bash
   npm run dev
   ```

## Configuration
### Environment Variables
Required environment variables in your `.env` file:
```env
# Required
ANTHROPIC_API_KEY=your_api_key_here
TURSO_URL=your_turso_url_here
TURSO_AUTH_TOKEN=your_turso_auth_token_here
DEEPGRAM_API_KEY=your_deepgram_key_here
VOYAGE_API_KEY=your_voyage_key_here
```

### Application Configuration
The application can be customized by modifying
`src/lib/config/app-config.ts`. This file contains non-sensitive
configuration options:

```typescript
{
  // Application name and description
  app_name: 'My Custom Assistant',
  app_description: 'Chat with your content using AI',

  // AI Configuration
  ai: {
    // Available models:
    // - claude-3-opus-20240229 (most capable)
    // - claude-3-sonnet-20240229 (balanced)
    // - claude-3-haiku-20240307 (fastest)
    model: 'claude-3-haiku-20240307',

    max_tokens: 1024,

    // Customize the AI's behaviour
    system_prompt: 'Your custom system prompt...',

    // Define different response styles
    style_instructions: {
      normal: {
        description: 'Balanced and clear responses',
        instruction: 'Your custom instruction...'
      },
      // Add more styles as needed
    }
  },

  // Set default response style
  default_response_style: 'normal'
}
```

### Response Styles
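The `style_instructions` map in the configuration pairs each style name with an instruction string, but this README does not show how that instruction reaches the model. One plausible approach, sketched here as an assumption and not as the project's actual implementation, is to append the selected style's instruction to the base `system_prompt`. The interfaces, the `example_config` values, and the `build_system_prompt` helper are all hypothetical names introduced for illustration.

```typescript
// Hypothetical types mirroring the shape of the config object shown in this README.
interface ResponseStyle {
  description: string;
  instruction: string;
}

interface AppConfig {
  app_name: string;
  app_description: string;
  ai: {
    model: string;
    max_tokens: number;
    system_prompt: string;
    style_instructions: Record<string, ResponseStyle>;
  };
  default_response_style: string;
}

// Illustrative values only; the prompt and instruction texts are made up.
const example_config: AppConfig = {
  app_name: 'My Custom Assistant',
  app_description: 'Chat with your content using AI',
  ai: {
    model: 'claude-3-haiku-20240307',
    max_tokens: 1024,
    system_prompt: 'You answer questions about the uploaded audio.',
    style_instructions: {
      normal: {
        description: 'Balanced and clear responses',
        instruction: 'Answer clearly.',
      },
      concise: {
        description: 'Brief and direct responses',
        instruction: 'Answer in two sentences or fewer.',
      },
    },
  },
  default_response_style: 'normal',
};

// Hypothetical helper: combine the base system prompt with the instruction
// of the requested style, falling back to the default style when the
// requested one is missing or unknown.
function build_system_prompt(config: AppConfig, style?: string): string {
  const styles = config.ai.style_instructions;
  const has_style =
    style !== undefined &&
    Object.prototype.hasOwnProperty.call(styles, style);
  const key = has_style ? (style as string) : config.default_response_style;
  const chosen = styles[key];
  if (!chosen) {
    // Neither the requested nor the default style exists: use the base prompt.
    return config.ai.system_prompt;
  }
  return `${config.ai.system_prompt}\n\n${chosen.instruction}`;
}
```

The resulting string could then be supplied as the system prompt of the chat request. Falling back to `default_response_style` keeps an unrecognised style name from producing an error in the middle of a conversation.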
The application supports multiple response styles that can be
configured:

- `normal`: Balanced and clear responses
- `concise`: Brief and direct responses
- `explanatory`: Detailed explanations
- `formal`: Professional and structured responses

You can modify existing styles or add new ones in the configuration
file.

## Features
- Audio file upload and processing
- Real-time transcription progress updates
- Vector-based semantic search
- Configurable AI response styles
- Interactive chat interface
- Server-side streaming responses
- Response caching for performance

## Security Notes
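Rate limiting is one of the measures this README recommends before any production deployment. As a rough illustration only, the sketch below shows a fixed-window limiter kept in memory. It is not part of the project: the class name, its parameters, and the idea of keying by a client identifier are assumptions, and an in-memory map only works for a single server process.

```typescript
// Hypothetical in-memory fixed-window rate limiter (illustration only).
interface WindowState {
  window_start: number; // timestamp (ms) when the current window began
  count: number; // requests seen in the current window
}

class FixedWindowRateLimiter {
  private readonly windows = new Map<string, WindowState>();
  private readonly max_requests: number;
  private readonly window_ms: number;

  constructor(max_requests: number, window_ms: number) {
    if (max_requests < 1 || window_ms <= 0) {
      throw new Error('max_requests must be >= 1 and window_ms must be > 0');
    }
    this.max_requests = max_requests;
    this.window_ms = window_ms;
  }

  // Returns true if the request is allowed, false if the client has
  // used up its quota for the current window. `now` is injectable so
  // the behaviour can be tested without waiting for real time to pass.
  allow(client_id: string, now: number = Date.now()): boolean {
    const state = this.windows.get(client_id);
    if (state === undefined || now - state.window_start >= this.window_ms) {
      // First request from this client, or the previous window expired.
      this.windows.set(client_id, { window_start: now, count: 1 });
      return true;
    }
    if (state.count < this.max_requests) {
      state.count += 1;
      return true;
    }
    return false;
  }
}
```

In a SvelteKit app, a limiter like this could be consulted from the `handle` hook in `src/hooks.server.ts`, keyed by the client address, with a `429` response returned when `allow` is false. A deployment running several instances would need a shared store instead of a local `Map`.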
- Never commit your `.env` file
- Keep all sensitive information (API keys, credentials) in
environment variables
- The `app-config.ts` file should only contain non-sensitive
application settings
- Implement proper authentication before deploying to production
- Consider adding rate limiting to protect your API usage
- Monitor API usage to prevent excessive costs

## Development
To start the development server:
```bash
npm run dev
```

Visit `http://localhost:5173` to see your application.
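If the application fails on its first API call, a missing entry in `.env` is a likely cause. The sketch below is a minimal startup check for the five variables listed under Environment Variables. It is an illustration, not code from the project: `find_missing_env_vars` and `assert_env` are hypothetical helpers, and the environment is passed in as a plain object so the check does not depend on any particular runtime.

```typescript
// Variable names taken from the Environment Variables section of this README.
const REQUIRED_ENV_VARS = [
  'ANTHROPIC_API_KEY',
  'TURSO_URL',
  'TURSO_AUTH_TOKEN',
  'DEEPGRAM_API_KEY',
  'VOYAGE_API_KEY',
] as const;

type Env = Record<string, string | undefined>;

// Hypothetical helper: list the required variables that are unset or blank.
function find_missing_env_vars(env: Env): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === '';
  });
}

// Hypothetical helper: throw one error naming every missing variable,
// so all gaps can be fixed in a single pass.
function assert_env(env: Env): void {
  const missing = find_missing_env_vars(env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
}
```

In server-side SvelteKit code, the object passed to `assert_env` could be the `env` export from `$env/dynamic/private`; in a plain Node script it could be `process.env`.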
## License
MIT