Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/developersdigest/ai-devices
AI Device Template Featuring Whisper, TTS, Groq, Llama3, OpenAI and more
- Host: GitHub
- URL: https://github.com/developersdigest/ai-devices
- Owner: developersdigest
- License: mit
- Created: 2024-04-20T11:36:12.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2024-07-22T07:14:13.000Z (6 months ago)
- Last Synced: 2025-01-13T09:05:13.198Z (9 days ago)
- Topics: function-calling, gpt-4-vision, groq, langchain, langsmith, llama3, llava, llm, openai, serper, tts, whisper
- Language: TypeScript
- Homepage: https://developersdigest.tech
- Size: 8.63 MB
- Stars: 284
- Watchers: 3
- Forks: 41
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
AI Device Template
Now supports gpt-4o and gemini-1.5-flash-latest for Vision Inference
YouTube Tutorial
This project is an AI-powered voice assistant that uses several AI models and services to respond to user queries. It supports voice input, transcription, text-to-speech, image processing, and function calling with conditionally rendered UI components. It was inspired by the recent trend of AI devices such as the Humane AI Pin and the Rabbit R1.
## Features
- **Voice input and transcription:** Using Whisper models from Groq or OpenAI
- **Text-to-speech output:** Using OpenAI's TTS models
- **Image processing:** Using OpenAI's GPT-4 Vision or Fal.ai's Llava-Next models
- **Function calling and conditionally rendered UI components:** Using OpenAI's GPT-3.5-Turbo model
- **Customizable UI settings:** Includes response times, settings toggle, text-to-speech toggle, internet results toggle, and photo upload toggle
- **(Optional) Rate limiting:** Using Upstash
- **(Optional) Tracing:** With Langchain's LangSmith for function execution

## Setup
### 1. Clone the repository
```bash
git clone https://github.com/developersdigest/ai-devices.git
```

### 2. Install dependencies
```bash
npm install
# or
bun install
```

### 3. Add API Keys
To use this AI-powered voice assistant, you need to provide the necessary API keys for the selected AI models and services.
### Required for core functionality
- **Groq API Key** for Llama + Whisper
- **OpenAI API Key** for TTS and Vision + Whisper
- **Serper API Key** for Internet Results

### Optional for advanced configuration
- **Langchain Tracing** for function execution tracing
- **Upstash Redis** for IP-based rate limiting
- **Spotify** for Spotify API interactions
- **Fal.ai (Llava Image Model)** as an alternative vision model to GPT-4-Vision

Replace 'API_KEY_GOES_HERE' with your actual API keys for each service.
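
As a rough illustration of the step above, the sketch below reports which required keys are still unset or still hold the placeholder. `GROQ_API_KEY` and `OPENAI_API_KEY` are the names used in the Vercel deploy link; `SERPER_API_KEY` is an assumed name, and `findMissingKeys` is a hypothetical helper that is not part of the repository.

```typescript
// Required keys checked by this sketch. SERPER_API_KEY is an assumed
// variable name; confirm it against the repository's env file.
const REQUIRED_KEYS = ["GROQ_API_KEY", "OPENAI_API_KEY", "SERPER_API_KEY"] as const;

type Env = Record<string, string | undefined>;

// Returns the required keys that are unset, empty, or still the placeholder.
function findMissingKeys(env: Env, placeholder = "API_KEY_GOES_HERE"): string[] {
  return REQUIRED_KEYS.filter((name) => {
    const value = env[name]?.trim();
    return !value || value === placeholder;
  });
}
```

In a Node.js environment, calling `findMissingKeys(process.env)` before starting the dev server would be one way to catch a forgotten key early.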
### 4. Start the development server
```bash
npm run dev
# or
bun dev
```

Access the application at `http://localhost:3000` or through the provided URL.
### 5. Deployment
[![Deploy with Vercel](https://vercel.com/button)](https://vercel.com/new/developersdigests-projects/clone?repository-url=https%3A%2F%2Fgithub.com%2Fdevelopersdigest%2Fai-devices&env=GROQ_API_KEY&env=OPENAI_API_KEY&project-name=ai-devices&repository-name=ai-devices)
## Configuration
Modify `app/config.tsx` to adjust settings and configurations for the AI-powered voice assistant. Here’s an overview of the available options:
```typescript
export const config = {
  // Inference settings
  inferenceModelProvider: 'groq', // 'groq' or 'openai'
  inferenceModel: 'llama3-8b-8192', // Groq: 'llama3-70b-8192' or 'llama3-8b-8192'; OpenAI: 'gpt-4-turbo', etc.

  // OPTIONAL: the settings below are additional options for the app to use
  // Whisper settings
  whisperModelProvider: 'openai', // 'groq' or 'openai'
  whisperModel: 'whisper-1', // Groq: 'whisper-large-v3'; OpenAI: 'whisper-1'

  // TTS settings
  ttsModelProvider: 'openai', // only openai supported for now
  ttsModel: 'tts-1', // only openai supported for now
  ttsvoice: 'alloy', // only openai supported for now: alloy, echo, fable, onyx, nova, or shimmer

  // OPTIONAL: Vision settings
  visionModelProvider: 'google', // 'openai' or 'fal.ai' or 'google'
  visionModel: 'gemini-1.5-flash-latest', // OpenAI: 'gpt-4o'; Fal.ai: 'llava-next'; Google: 'gemini-1.5-flash-latest'

  // Function calling + conditionally rendered UI
  functionCallingModelProvider: 'openai', // only 'openai' currently supported
  functionCallingModel: 'gpt-3.5-turbo', // OpenAI: 'gpt-3.5-turbo'

  // UI settings
  enableResponseTimes: false, // Display response times for each message
  enableSettingsUIToggle: true, // Display the settings UI toggle
  enableTextToSpeechUIToggle: true, // Display the text to speech UI toggle
  enableInternetResultsUIToggle: true, // Display the internet results UI toggle
  enableUsePhotUIToggle: true, // Display the use photo UI toggle
  enabledRabbitMode: true, // Enable the rabbit mode UI toggle
  enabledLudicrousMode: true, // Enable the ludicrous mode UI toggle
  useAttributionComponent: true, // Use the attribution component to display the attribution of the AI models/services used

  // Rate limiting settings
  useRateLimiting: false, // Use Upstash rate limiting to limit the number of requests per user

  // Tracing with Langchain
  useLangSmith: true, // Use LangSmith by Langchain to trace function execution; set to true to enable
};
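
// --- Illustrative sketch, not part of the original app/config.tsx ---
// One possible way to check that a provider/model pair matches the
// combinations named in the comments above. KNOWN_MODELS and isKnownPair
// are hypothetical names; the lists only mirror those comments and may
// not reflect everything each provider currently offers.
const KNOWN_MODELS: Record<string, Record<string, string[]>> = {
  whisper: { groq: ['whisper-large-v3'], openai: ['whisper-1'] },
  vision: {
    openai: ['gpt-4o'],
    'fal.ai': ['llava-next'],
    google: ['gemini-1.5-flash-latest'],
  },
};

// Returns true when `model` is listed for `provider` under `kind`.
function isKnownPair(kind: string, provider: string, model: string): boolean {
  const models = KNOWN_MODELS[kind]?.[provider];
  return models !== undefined && models.indexOf(model) !== -1;
}
// Possible use: isKnownPair('whisper', config.whisperModelProvider, config.whisperModel)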
```

## Contributing
Contributions are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request.
I'm the developer behind Developers Digest. If you find my work helpful or enjoy what I do, consider supporting me. Here are a few ways you can do that:
- **Patreon**: Support me on Patreon at [patreon.com/DevelopersDigest](https://www.patreon.com/DevelopersDigest)
- **Buy Me A Coffee**: You can buy me a coffee at [buymeacoffee.com/developersdigest](https://www.buymeacoffee.com/developersdigest)
- **Website**: Check out my website at [developersdigest.tech](https://developersdigest.tech)
- **GitHub**: Follow me on GitHub at [github.com/developersdigest](https://github.com/developersdigest)
- **Twitter**: Follow me on Twitter at [twitter.com/dev__digest](https://twitter.com/dev__digest)