https://github.com/spences10/audiomind


# AudioMind

> **Note**: This started as a personal toy project for asking
> questions about the podcasts I listen to. I've made it generic so
> others can use it with their own audio content. Feel free to adapt
> it for your needs!

An MP3 to AI Chat Assistant - A configurable AI chat assistant that
can be customized for your content and use case. Transform audio
content into interactive, searchable conversations.

## ⚠️ Important Security Notice

This is a demonstration project and **should not be deployed to
production** without implementing additional security measures:

- The application currently has no user authentication
- Lack of authentication could lead to:
  - Unauthorized access to your API endpoints
  - Potential abuse of your API rate limits
  - Excessive costs from unrestricted API usage
  - Malicious use of your services
- Consider implementing proper authentication, rate limiting, and
  usage monitoring before any production deployment
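
As a rough illustration of the rate limiting mentioned above, here is a minimal sketch of a fixed-window, in-memory limiter. It is not part of AudioMind: the names `create_rate_limiter` and `is_allowed` are hypothetical, and an in-memory map only works for a single server process.

```typescript
// Hypothetical fixed-window rate limiter (illustration only, not AudioMind code).
type RateWindow = { count: number; reset_at: number };

function create_rate_limiter(max_requests: number, window_ms: number) {
  // One counter per caller key (for example an IP address).
  const windows = new Map<string, RateWindow>();

  // Returns true when the caller identified by `key` may proceed.
  return function is_allowed(key: string, now: number = Date.now()): boolean {
    const current = windows.get(key);

    // No window yet, or the previous window has expired: start a new one.
    if (!current || now >= current.reset_at) {
      windows.set(key, { count: 1, reset_at: now + window_ms });
      return true;
    }

    // Window still open: reject once the quota is used up.
    if (current.count >= max_requests) return false;

    current.count += 1;
    return true;
  };
}
```

In a SvelteKit app, a check like this would most likely sit in a `handle` hook in `src/hooks.server.ts`, keyed on something like `event.getClientAddress()`, and return a 429 response when `is_allowed` is false. That placement is an assumption, not something the project currently does.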

## Technologies Used

- **Frontend Framework**: SvelteKit 2.x with Svelte 5
- **UI/Styling**:
  - TailwindCSS for styling
  - DaisyUI for UI components
  - Typography plugin for content formatting
- **AI/Machine Learning**:
  - Anthropic Claude 3 for natural language processing
  - Deepgram for audio transcription
  - Voyage AI for text embeddings used in similarity search
- **Database**: Turso (LibSQL) for data storage
- **Development Tools**:
  - TypeScript for type safety
  - Vite for development and building
  - ESLint and Prettier for code quality
  - Vitest and Playwright for testing
- **Performance**:
  - LRU Cache for response caching
  - Server-sent events for real-time progress updates
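
To make the server-sent events item more concrete, the sketch below shows how a progress update can be serialized in the SSE wire format (an `event:` line, a `data:` line, then a blank line). The function names and the `{ done, total }` payload shape are assumptions for illustration; the repository's actual event names and payloads may differ.

```typescript
// Hypothetical helper: serialize one server-sent event.
// JSON.stringify produces a single line, so one `data:` field is enough.
function format_sse_event(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Hypothetical helper: build the full sequence of messages for a job
// that reports progress after each processed chunk.
function build_progress_events(total_chunks: number): string[] {
  const events: string[] = [];
  for (let done = 1; done <= total_chunks; done++) {
    events.push(format_sse_event('progress', { done, total: total_chunks }));
  }
  events.push(format_sse_event('complete', {}));
  return events;
}
```

On the server, strings like these would be written to a streamed response with the `Content-Type: text/event-stream` header, and the browser would read them with `EventSource`.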

## Setup

1. Clone the repository
2. Copy `.env.example` to `.env`:

```bash
cp .env.example .env
```

3. Fill in your environment variables in `.env`
4. Install dependencies:

```bash
npm install
```

5. Start the development server:

```bash
npm run dev
```

## Configuration

### Environment Variables

Required environment variables in your `.env` file:

```env
# Required
ANTHROPIC_API_KEY=your_api_key_here
TURSO_URL=your_turso_url_here
TURSO_AUTH_TOKEN=your_turso_auth_token_here
DEEPGRAM_API_KEY=your_deepgram_key_here
VOYAGE_API_KEY=your_voyage_key_here
```
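
Because every variable above is required, it can help to fail fast when one is missing. The following is a minimal sketch of such a check, not code from the repository; `find_missing_env_vars` and `assert_env` are hypothetical names.

```typescript
// Variable names taken from the `.env` example above.
const REQUIRED_ENV_VARS = [
  'ANTHROPIC_API_KEY',
  'TURSO_URL',
  'TURSO_AUTH_TOKEN',
  'DEEPGRAM_API_KEY',
  'VOYAGE_API_KEY',
] as const;

type Env = Record<string, string | undefined>;

// Returns the names of required variables that are missing or blank.
function find_missing_env_vars(env: Env): string[] {
  return REQUIRED_ENV_VARS.filter((name) => !env[name]?.trim());
}

// Throws a single error listing every missing variable.
function assert_env(env: Env): void {
  const missing = find_missing_env_vars(env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
}
```

One plausible way to use this is to call `assert_env` once at server startup, passing `process.env` or the `env` object from SvelteKit's `$env/dynamic/private` module.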

### Application Configuration

The application can be customized by modifying
`src/lib/config/app-config.ts`. This file contains non-sensitive
configuration options:

```typescript
{
  // Application name and description
  app_name: 'My Custom Assistant',
  app_description: 'Chat with your content using AI',

  // AI Configuration
  ai: {
    // Available models:
    // - claude-3-opus-20240229 (most capable)
    // - claude-3-sonnet-20240229 (balanced)
    // - claude-3-haiku-20240307 (fastest)
    model: 'claude-3-haiku-20240307',

    max_tokens: 1024,

    // Customize the AI's behaviour
    system_prompt: 'Your custom system prompt...',

    // Define different response styles
    style_instructions: {
      normal: {
        description: 'Balanced and clear responses',
        instruction: 'Your custom instruction...'
      },
      // Add more styles as needed
    }
  },

  // Set default response style
  default_response_style: 'normal'
}
```

### Response Styles

The application supports multiple response styles that can be
configured:

- `normal`: Balanced and clear responses
- `concise`: Brief and direct responses
- `explanatory`: Detailed explanations
- `formal`: Professional and structured responses

You can modify existing styles or add new ones in the configuration
file.
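
One way a selected style could feed into the prompt is sketched below. This is an assumption about how the pieces fit together, not the repository's implementation: `build_system_prompt` is a hypothetical helper, and the types only mirror the fields shown in the configuration example.

```typescript
type ResponseStyle = { description: string; instruction: string };

type AiConfig = {
  system_prompt: string;
  style_instructions: Record<string, ResponseStyle | undefined>;
};

// Hypothetical helper: append the chosen style's instruction to the base
// system prompt, falling back to the default style for unknown names.
function build_system_prompt(
  ai: AiConfig,
  default_style: string,
  requested_style?: string,
): string {
  const style =
    ai.style_instructions[requested_style ?? default_style] ??
    ai.style_instructions[default_style];

  // If even the default style is missing, use the base prompt alone.
  return style ? `${ai.system_prompt}\n\n${style.instruction}` : ai.system_prompt;
}
```

Falling back to the default rather than throwing keeps a mistyped style name from breaking a chat request, at the cost of hiding the typo.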

## Features

- Audio file upload and processing
- Real-time transcription progress updates
- Vector-based semantic search
- Configurable AI response styles
- Interactive chat interface
- Server-side streaming responses
- Response caching for performance
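
For readers unfamiliar with vector-based semantic search, the sketch below ranks transcript chunks by cosine similarity to a query embedding. It illustrates the general technique only: the `Chunk` type and function names are hypothetical, the embeddings are assumed to come from an embedding model, and the repository's actual search may run inside the database instead.

```typescript
// Hypothetical shape: a piece of transcript with its embedding vector.
type Chunk = { text: string; embedding: number[] };

// Cosine similarity: dot product divided by the product of vector lengths.
function cosine_similarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Embedding dimensions must match');
  }
  let dot = 0;
  let norm_a = 0;
  let norm_b = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    norm_a += a[i] * a[i];
    norm_b += b[i] * b[i];
  }
  const denominator = Math.sqrt(norm_a) * Math.sqrt(norm_b);
  // Treat a zero-length vector as having no similarity to anything.
  return denominator === 0 ? 0 : dot / denominator;
}

// Returns the `limit` chunks most similar to the query, best first.
function top_matches(query: number[], chunks: Chunk[], limit: number): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosine_similarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
    .map(({ chunk }) => chunk);
}
```

The top-ranked chunks would then be passed to the chat model as context for answering the user's question.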

## Security Notes

- Never commit your `.env` file
- Keep all sensitive information (API keys, credentials) in
  environment variables
- The `app-config.ts` file should only contain non-sensitive
  application settings
- Implement proper authentication before deploying to production
- Consider adding rate limiting to protect your API usage
- Monitor API usage to prevent excessive costs

## Development

To start the development server:

```bash
npm run dev
```

Visit `http://localhost:5173` to see your application.

## License

MIT