{"id":14964695,"url":"https://github.com/developersdigest/ai-devices","last_synced_at":"2025-04-06T23:17:02.107Z","repository":{"id":234692295,"uuid":"789379581","full_name":"developersdigest/ai-devices","owner":"developersdigest","description":"AI Device Template Featuring Whisper, TTS, Groq, Llama3, OpenAI and more","archived":false,"fork":false,"pushed_at":"2024-07-22T07:14:13.000Z","size":9054,"stargazers_count":289,"open_issues_count":0,"forks_count":41,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-03-30T21:07:26.256Z","etag":null,"topics":["function-calling","gpt-4-vision","groq","langchain","langsmith","llama3","llava","llm","openai","serper","tts","whisper"],"latest_commit_sha":null,"homepage":"https://developersdigest.tech","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/developersdigest.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-04-20T11:36:12.000Z","updated_at":"2025-03-28T13:38:56.000Z","dependencies_parsed_at":null,"dependency_job_id":"4a7431f8-443a-423f-a331-87667c1ef96b","html_url":"https://github.com/developersdigest/ai-devices","commit_stats":{"total_commits":42,"total_committers":2,"mean_commits":21.0,"dds":"0.023809523809523836","last_synced_commit":"8fa8afa217f950b4a8f8046f2095f27cf05c2db5"},"previous_names":["developersdigest/ai-pin","developersdigest/ai-devices"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/developersdigest%2Fai-devices","tags_url":"https://repos.ec
osyste.ms/api/v1/hosts/GitHub/repositories/developersdigest%2Fai-devices/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/developersdigest%2Fai-devices/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/developersdigest%2Fai-devices/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/developersdigest","download_url":"https://codeload.github.com/developersdigest/ai-devices/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247563936,"owners_count":20958971,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["function-calling","gpt-4-vision","groq","langchain","langsmith","llama3","llava","llm","openai","serper","tts","whisper"],"created_at":"2024-09-24T13:33:39.518Z","updated_at":"2025-04-06T23:17:02.087Z","avatar_url":"https://github.com/developersdigest.png","language":"TypeScript","readme":"\u003ch1 align=\"center\"\u003eAI Device Template\u003c/h1\u003e\n\u003cdiv\u003e\n    \u003cdiv align=\"center\"\u003e\n        \u003ca href=\"https://twitter.com/dev__digest\"\u003e\n            \u003cimg src=\"https://img.shields.io/badge/X/Twitter-000000?style=for-the-badge\u0026logo=x\u0026logoColor=white\" /\u003e\n        \u003c/a\u003e\n        \u003ca href=\"https://www.youtube.com/@developersdigest\"\u003e\n            \u003cimg src=\"https://img.shields.io/badge/YouTube-FF0000?style=for-the-badge\u0026logo=youtube\u0026logoColor=white\" /\u003e\n        \u003c/a\u003e\n    \u003c/div\u003e\n\u003c/div\u003e\n\u003ch2 
style=\"display: flex; justify-content: center; align-items: left; width: 100%;\"\u003eNow supports gpt-4o and gemini-1.5-flash-latest for Vision Inference\u003c/h2\u003e\n\u003cdiv style=\"display: flex; justify-content: center; align-items: left; width: 100%;\" \u003e\n    \u003cdiv align=\"center\" style=\"width:100%\"\u003e \n        \u003ca href=\"https://www.youtube.com/@developersdigest\" \u003e\n            \u003cimg src=\"https://media.giphy.com/media/v1.Y2lkPTc5MGI3NjExMmV0bjdkYzNpcDNka3BoaTFoNDJ3MTl0c3dmN3pqZGdjanh6N3c2YSZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/UjkWj9Q6yInxQHp1tY/giphy.gif\" style=\"width: 100%; height: auto;\"/\u003e\n        \u003c/a\u003e\n        \u003ca href=\"https://twitter.com/dev__digest\" \u003e\n            \u003cimg src=\"https://media.giphy.com/media/v1.Y2lkPTc5MGI3NjExbzZtaTB5eHVzcjlnaHQ5b2c5OGJqeG9kcTk3N3V4eG5xY25mdHlpayZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/GIcUHNXLsv0pEHHQ2i/giphy.gif\" style=\"width: 100%; height: auto;\" /\u003e\n        \u003c/a\u003e\n    \u003c/div\u003e\n\u003c/div\u003e\n\n\u003ch2 align=\"center\"\u003eYouTube Tutorial\u003c/h2\u003e\n\n\u003cdiv style=\"display: flex; justify-content: center; align-items: left;\" \u003e\n    \u003ca href=\"https://youtu.be/CXDFGyO2FUI\"\u003e\n        \u003cimg src=\"https://img.youtube.com/vi/CXDFGyO2FUI/0.jpg\" alt=\"Tutorial 2\" style=\"width: 100%; height: auto;\"\u003e\n    \u003c/a\u003e\n\u003c/div\u003e\n\nThis project is an AI-powered voice assistant utilizing various AI models and services to provide intelligent responses to user queries. It supports voice input, transcription, text-to-speech, image processing, and function calling with conditionally rendered UI components. This was inspired by the recent trend of AI Devices such as the Humane AI Pin and the Rabbit R1. 
\n\n## Features\n\n- **Voice input and transcription:** Using Whisper models from Groq or OpenAI\n- **Text-to-speech output:** Using OpenAI's TTS models\n- **Image processing:** Using OpenAI's GPT-4 Vision or Fal.ai's Llava-Next models\n- **Function calling and conditionally rendered UI components:** Using OpenAI's GPT-3.5-Turbo model\n- **Customizable UI settings:** Includes response times, settings toggle, text-to-speech toggle, internet results toggle, and photo upload toggle\n- **(Optional) Rate limiting:** Using Upstash\n- **(Optional) Tracing:** With Langchain's LangSmith for function execution\n\n## Setup\n\n### 1. Clone the repository\n```bash\ngit clone https://github.com/developersdigest/ai-devices.git\n```\n\n### 2. Install dependencies\n```bash\nnpm install\n# or\nbun install\n```\n\n### 3. Add API keys\n\nTo use this AI-powered voice assistant, you need to provide API keys for the selected AI models and services.\n\n### Required for core functionality\n- **Groq API Key** for Llama + Whisper\n- **OpenAI API Key** for TTS, Vision, and Whisper\n- **Serper API Key** for internet results\n\n### Optional for advanced configuration\n- **Langchain Tracing** for function execution tracing\n- **Upstash Redis** for IP-based rate limiting\n- **Spotify** for Spotify API interactions\n- **Fal.AI (Llava Image Model)** An alternative vision model to GPT-4 Vision\n\nReplace 'API_KEY_GOES_HERE' with your actual API keys for each service.\n\n### 4. Start the development server\n```bash\nnpm run dev\n# or\nbun dev\n```\n\nAccess the application at `http://localhost:3000` or through the provided URL.\n\n### 5. 
Deployment\n\n[![Deploy with Vercel](https://vercel.com/button)](https://vercel.com/new/developersdigests-projects/clone?repository-url=https%3A%2F%2Fgithub.com%2Fdevelopersdigest%2Fai-devices\u0026env=GROQ_API_KEY\u0026env=OPENAI_API_KEY\u0026project-name=ai-devices\u0026repository-name=ai-devices)\n\n## Configuration\n\nModify `app/config.tsx` to adjust settings and configurations for the AI-powered voice assistant. Here’s an overview of the available options:\n\n```typescript\nexport const config = {\n    // Inference settings\n    inferenceModelProvider: 'groq', // 'groq' or 'openai'\n    inferenceModel: 'llama3-8b-8192', // Groq: 'llama3-70b-8192' or 'llama3-8b-8192'; OpenAI: 'gpt-4-turbo', etc.\n\n    // The settings below are optional\n\n    // Whisper settings\n    whisperModelProvider: 'openai', // 'groq' or 'openai'\n    whisperModel: 'whisper-1', // Groq: 'whisper-large-v3'; OpenAI: 'whisper-1'\n\n    // TTS settings\n    ttsModelProvider: 'openai', // only openai supported for now\n    ttsModel: 'tts-1', // only openai supported for now\n    ttsvoice: 'alloy', // only openai supported for now: 
[alloy, echo, fable, onyx, nova, and shimmer]\n\n    // OPTIONAL: Vision settings\n    visionModelProvider: 'google', // 'openai' or 'fal.ai' or 'google'\n    visionModel: 'gemini-1.5-flash-latest', // OpenAI: 'gpt-4o'; Fal.ai: 'llava-next'; Google: 'gemini-1.5-flash-latest'\n\n    // Function calling + conditionally rendered UI\n    functionCallingModelProvider: 'openai', // only 'openai' is currently supported\n    functionCallingModel: 'gpt-3.5-turbo', // OpenAI: 'gpt-3.5-turbo'\n\n    // UI settings\n    enableResponseTimes: false, // Display response times for each message\n    enableSettingsUIToggle: true, // Display the settings UI toggle\n    enableTextToSpeechUIToggle: true, // Display the text-to-speech UI toggle\n    enableInternetResultsUIToggle: true, // Display the internet results UI toggle\n    enableUsePhotUIToggle: true, // Display the use-photo UI toggle\n    enabledRabbitMode: true, // Enable the rabbit mode UI toggle\n    enabledLudicrousMode: true, // Enable the ludicrous mode UI toggle\n    useAttributionComponent: true, // Display attribution for the AI models/services used\n\n    // Rate limiting settings\n    useRateLimiting: false, // Use Upstash rate limiting to limit the number of requests per user\n\n    // Tracing with Langchain\n    useLangSmith: true, // Use LangSmith by Langchain to trace function execution; set to true to enable\n};\n```\n\n## Contributing\n\nContributions are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request.\n\nI'm the developer behind Developers Digest. If you find my work helpful or enjoy what I do, consider supporting me. 
Here are a few ways you can do that:\n\n- **Patreon**: Support me on Patreon at [patreon.com/DevelopersDigest](https://www.patreon.com/DevelopersDigest)\n- **Buy Me A Coffee**: You can buy me a coffee at [buymeacoffee.com/developersdigest](https://www.buymeacoffee.com/developersdigest)\n- **Website**: Check out my website at [developersdigest.tech](https://developersdigest.tech)\n- **GitHub**: Follow me on GitHub at [github.com/developersdigest](https://github.com/developersdigest)\n- **Twitter**: Follow me on Twitter at [twitter.com/dev__digest](https://twitter.com/dev__digest)","funding_links":["https://www.patreon.com/DevelopersDigest","https://www.buymeacoffee.com/developersdigest"],"categories":["TypeScript"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdevelopersdigest%2Fai-devices","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdevelopersdigest%2Fai-devices","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdevelopersdigest%2Fai-devices/lists"}