{"id":13341881,"url":"https://github.com/astrologos/py-speakeasy","last_synced_at":"2025-03-11T23:30:26.149Z","repository":{"id":171723572,"uuid":"648323145","full_name":"astrologos/py-speakeasy","owner":"astrologos","description":"Speakeasy GPT is a Jupyter notebook that utilizes several natural language processing utilities to provide a seamless and low-latency speech interface to ChatGPT and other large language models.  ","archived":false,"fork":false,"pushed_at":"2023-06-01T18:05:03.000Z","size":1141,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-10-24T10:07:56.223Z","etag":null,"topics":["automatic-speech-recognition","chat-gpt","coqui-ai","coqui-tts","elevenlabs-api","mimic","mycroftai","text-to-speech","whisper"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/astrologos.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-06-01T17:50:35.000Z","updated_at":"2024-01-27T00:49:52.000Z","dependencies_parsed_at":null,"dependency_job_id":"69cda799-1816-4cfb-aff1-75dfbe7be4dd","html_url":"https://github.com/astrologos/py-speakeasy","commit_stats":null,"previous_names":["astrologos/py-speakeasy"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/astrologos%2Fpy-speakeasy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/astrologos%2Fpy-speakeasy/tags","releases_url":"https://repos.ecos
yste.ms/api/v1/hosts/GitHub/repositories/astrologos%2Fpy-speakeasy/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/astrologos%2Fpy-speakeasy/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/astrologos","download_url":"https://codeload.github.com/astrologos/py-speakeasy/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243129493,"owners_count":20241025,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["automatic-speech-recognition","chat-gpt","coqui-ai","coqui-tts","elevenlabs-api","mimic","mycroftai","text-to-speech","whisper"],"created_at":"2024-07-29T19:26:40.656Z","updated_at":"2025-03-11T23:30:26.121Z","avatar_url":"https://github.com/astrologos.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Speakeasy GPT\n\nSpeakeasy GPT is a Jupyter notebook that utilizes several natural language processing utilities to provide a seamless and low-latency speech interface to ChatGPT and other large language models.\n\nVoice prompts are transcribed using OpenAI's Whisper model, run locally on CPU or GPU. The transcription is sent as a prompt to the OpenAI gpt-3.5-turbo API. 
The response is synthesized into speech by one of several text-to-speech (TTS) engines: ElevenLabs' API, Mimic 3, or Coqui TTS.\n\n## Installation and Dependencies\n\n- Mount Drive: Mount Google Drive to access the notebook files.\n- Installs: Install the required dependencies and packages, including espeak, ElevenLabs, Mimic 3 TTS, TTS, ffmpeg-python, pydub, and OpenAI.\n- Imports: Import the necessary libraries and modules for audio processing, natural language processing, and TTS.\n\n## Usage\n\n1. Mount Drive: Mount Google Drive to access the notebook files.\n2. Installation: Install the required dependencies and packages by running the provided installation commands.\n3. Imports: Import the necessary libraries and modules for audio processing, natural language processing, and TTS.\n4. Check CUDA: Check whether CUDA is available for GPU acceleration.\n5. Load Whisper: Load OpenAI's Whisper model for speech-to-text transcription.\n6. Load Mimic TTS: Load the Mimic 3 TTS engine for speech generation.\n7. Load Coqui TTS: Load the Coqui TTS engine for speech generation.\n8. Load ElevenLabs: Load the ElevenLabs API for speech generation.\n9. Load ChatGPT: Set up the OpenAI API for ChatGPT and define the initial system message.\n10. Record prompt from microphone: Record audio prompts from the microphone and save them as WAV files.\n11. Transcribe prompt audio to text prompt: Transcribe the audio prompts to text using the Whisper model.\n12. Prompt ChatGPT: Send the text prompts to the ChatGPT model to generate responses.\n13. Generate response audio: Generate speech audio for the ChatGPT responses using the selected TTS engine.\n14. Full loop - ElevenLabs TTS: Perform the full loop of transcription, ChatGPT prompt, and response audio generation using the ElevenLabs TTS engine.\n15. 
Full loop - Coqui TTS: Perform the full loop of transcription, ChatGPT prompt, and response audio generation using the Coqui TTS engine.\n16. Full loop - Mimic TTS: Perform the full loop of transcription, ChatGPT prompt, and response audio generation using the Mimic 3 TTS engine.\n\nNote: Modify the code and parameters as needed for your specific use case.\n\n## License\nThis code is licensed under the Creative Commons Attribution-NonCommercial (CC BY-NC) license, allowing for non-commercial use and modification with proper attribution.\nSee the license here:  https://creativecommons.org/licenses/by-nc/2.0/\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fastrologos%2Fpy-speakeasy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fastrologos%2Fpy-speakeasy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fastrologos%2Fpy-speakeasy/lists"}