{"id":27522957,"url":"https://github.com/gradio-app/fastrtc","last_synced_at":"2025-05-12T20:51:15.645Z","repository":{"id":257826345,"uuid":"863050120","full_name":"gradio-app/fastrtc","owner":"gradio-app","description":"The python library for real-time communication","archived":false,"fork":false,"pushed_at":"2025-04-23T16:53:13.000Z","size":4694,"stargazers_count":3692,"open_issues_count":40,"forks_count":319,"subscribers_count":30,"default_branch":"main","last_synced_at":"2025-04-23T17:32:38.735Z","etag":null,"topics":["artificial-intelligence","llm","python","real-time","speech-to-text","text-to-speech"],"latest_commit_sha":null,"homepage":"https://fastrtc.org/","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gradio-app.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-09-25T16:19:19.000Z","updated_at":"2025-04-23T17:20:55.000Z","dependencies_parsed_at":"2024-10-25T03:33:33.414Z","dependency_job_id":"1839b98e-8294-4411-a831-be3e06e84f72","html_url":"https://github.com/gradio-app/fastrtc","commit_stats":{"total_commits":41,"total_committers":1,"mean_commits":41.0,"dds":0.0,"last_synced_commit":"d7acdd7eb4c9d2c0a2ab60ff351c15911eae6217"},"previous_names":["freddyaboulton/gradio-webrtc","freddyaboulton/fastrtc","gradio-app/fastrtc"],"tags_count":11,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gradio-app%2Ffastrtc","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gradio-app%2Ffastrtc/tags","rele
ases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gradio-app%2Ffastrtc/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gradio-app%2Ffastrtc/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gradio-app","download_url":"https://codeload.github.com/gradio-app/fastrtc/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251561228,"owners_count":21609358,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","llm","python","real-time","speech-to-text","text-to-speech"],"created_at":"2025-04-18T11:01:50.313Z","updated_at":"2025-04-29T18:38:30.618Z","avatar_url":"https://github.com/gradio-app.png","language":"JavaScript","funding_links":[],"categories":["JavaScript","Frameworks \u0026 Platforms | 框架与平台"],"sub_categories":["Comprehensive Frameworks | 综合性框架"],"readme":"\u003cdiv style='text-align: center; margin-bottom: 1rem; display: flex; justify-content: center; align-items: center;'\u003e\n    \u003ch1 style='color: white; margin: 0;'\u003eFastRTC\u003c/h1\u003e\n    \u003cimg src='https://huggingface.co/datasets/freddyaboulton/bucket/resolve/main/fastrtc_logo_small.png'\n         alt=\"FastRTC Logo\" \n         style=\"margin-right: 10px;\"\u003e\n\u003c/div\u003e\n\n\u003cdiv style=\"display: flex; flex-direction: row; justify-content: center\"\u003e\n\u003cimg style=\"display: block; padding-right: 5px; height: 20px;\" alt=\"Static Badge\" src=\"https://img.shields.io/pypi/v/fastrtc\"\u003e \n\u003ca 
href=\"https://github.com/gradio-app/fastrtc\" target=\"_blank\"\u003e\u003cimg alt=\"Static Badge\" src=\"https://img.shields.io/badge/github-white?logo=github\u0026logoColor=black\"\u003e\u003c/a\u003e\n\u003c/div\u003e\n\n\u003ch3 style='text-align: center'\u003e\nThe Real-Time Communication Library for Python. \n\u003c/h3\u003e\n\nTurn any Python function into a real-time audio and video stream over WebRTC or WebSockets.\n\n## Installation\n\n```bash\npip install fastrtc\n```\n\nTo use built-in pause detection (see [ReplyOnPause](https://fastrtc.org/userguide/audio/#reply-on-pause)) and text to speech (see [Text To Speech](https://fastrtc.org/userguide/audio/#text-to-speech)), install the `vad` and `tts` extras:\n\n```bash\npip install \"fastrtc[vad, tts]\"\n```\n\n## Key Features\n\n- 🗣️ Automatic Voice Detection and Turn Taking built-in, so you only need to worry about the logic for responding to the user.\n- 💻 Automatic UI - Use the `.ui.launch()` method to launch the WebRTC-enabled built-in Gradio UI.\n- 🔌 Automatic WebRTC Support - Use the `.mount(app)` method to mount the stream on a FastAPI app and get a WebRTC endpoint for your own frontend! \n- ⚡️ WebSocket Support - Use the `.mount(app)` method to mount the stream on a FastAPI app and get a WebSocket endpoint for your own frontend! \n- 📞 Automatic Telephone Support - Use the `fastphone()` method of the stream to launch the application and get a free temporary phone number!\n- 🤖 Completely customizable backend - A `Stream` can easily be mounted on a FastAPI app so you can easily extend it to fit your production application. 
See the [Talk To Claude](https://huggingface.co/spaces/fastrtc/talk-to-claude) demo for an example on how to serve a custom JS frontend.\n\n## Docs\n\n[https://fastrtc.org](https://fastrtc.org)\n\n## Examples\nSee the [Cookbook](https://fastrtc.org/cookbook/) for examples of how to use the library.\n\n\u003ctable\u003e\n\u003ctr\u003e\n\u003ctd width=\"50%\"\u003e\n\u003ch3\u003e🗣️👀 Gemini Audio Video Chat\u003c/h3\u003e\n\u003cp\u003eStream BOTH your webcam video and audio feeds to Google Gemini. You can also upload images to augment your conversation!\u003c/p\u003e\n\u003cvideo width=\"100%\" src=\"https://github.com/user-attachments/assets/9636dc97-4fee-46bb-abb8-b92e69c08c71\" controls\u003e\u003c/video\u003e\n\u003cp\u003e\n\u003ca href=\"https://huggingface.co/spaces/freddyaboulton/gemini-audio-video-chat\"\u003eDemo\u003c/a\u003e |\n\u003ca href=\"https://huggingface.co/spaces/freddyaboulton/gemini-audio-video-chat/blob/main/app.py\"\u003eCode\u003c/a\u003e\n\u003c/p\u003e\n\u003c/td\u003e\n\u003ctd width=\"50%\"\u003e\n\u003ch3\u003e🗣️ Google Gemini Real Time Voice API\u003c/h3\u003e\n\u003cp\u003eTalk to Gemini in real time using Google's voice API.\u003c/p\u003e\n\u003cvideo width=\"100%\" src=\"https://github.com/user-attachments/assets/ea6d18cb-8589-422b-9bba-56332d9f61de\" controls\u003e\u003c/video\u003e\n\u003cp\u003e\n\u003ca href=\"https://huggingface.co/spaces/fastrtc/talk-to-gemini\"\u003eDemo\u003c/a\u003e |\n\u003ca href=\"https://huggingface.co/spaces/fastrtc/talk-to-gemini/blob/main/app.py\"\u003eCode\u003c/a\u003e\n\u003c/p\u003e\n\u003c/td\u003e\n\u003c/tr\u003e\n\n\u003ctr\u003e\n\u003ctd width=\"50%\"\u003e\n\u003ch3\u003e🗣️ OpenAI Real Time Voice API\u003c/h3\u003e\n\u003cp\u003eTalk to ChatGPT in real time using OpenAI's voice API.\u003c/p\u003e\n\u003cvideo width=\"100%\" src=\"https://github.com/user-attachments/assets/178bdadc-f17b-461a-8d26-e915c632ff80\" controls\u003e\u003c/video\u003e\n\u003cp\u003e\n\u003ca 
href=\"https://huggingface.co/spaces/fastrtc/talk-to-openai\"\u003eDemo\u003c/a\u003e |\n\u003ca href=\"https://huggingface.co/spaces/fastrtc/talk-to-openai/blob/main/app.py\"\u003eCode\u003c/a\u003e\n\u003c/p\u003e\n\u003c/td\u003e\n\u003ctd width=\"50%\"\u003e\n\u003ch3\u003e🤖 Hello Computer\u003c/h3\u003e\n\u003cp\u003eSay computer before asking your question!\u003c/p\u003e\n\u003cvideo width=\"100%\" src=\"https://github.com/user-attachments/assets/afb2a3ef-c1ab-4cfb-872d-578f895a10d5\" controls\u003e\u003c/video\u003e\n\u003cp\u003e\n\u003ca href=\"https://huggingface.co/spaces/fastrtc/hello-computer\"\u003eDemo\u003c/a\u003e |\n\u003ca href=\"https://huggingface.co/spaces/fastrtc/hello-computer/blob/main/app.py\"\u003eCode\u003c/a\u003e\n\u003c/p\u003e\n\u003c/td\u003e\n\u003c/tr\u003e\n\n\u003ctr\u003e\n\u003ctd width=\"50%\"\u003e\n\u003ch3\u003e🤖 Llama Code Editor\u003c/h3\u003e\n\u003cp\u003eCreate and edit HTML pages with just your voice! Powered by SambaNova systems.\u003c/p\u003e\n\u003cvideo width=\"100%\" src=\"https://github.com/user-attachments/assets/98523cf3-dac8-4127-9649-d91a997e3ef5\" controls\u003e\u003c/video\u003e\n\u003cp\u003e\n\u003ca href=\"https://huggingface.co/spaces/fastrtc/llama-code-editor\"\u003eDemo\u003c/a\u003e |\n\u003ca href=\"https://huggingface.co/spaces/fastrtc/llama-code-editor/blob/main/app.py\"\u003eCode\u003c/a\u003e\n\u003c/p\u003e\n\u003c/td\u003e\n\u003ctd width=\"50%\"\u003e\n\u003ch3\u003e🗣️ Talk to Claude\u003c/h3\u003e\n\u003cp\u003eUse the Anthropic and Play.Ht APIs to have an audio conversation with Claude.\u003c/p\u003e\n\u003cvideo width=\"100%\" src=\"https://github.com/user-attachments/assets/fb6ef07f-3ccd-444a-997b-9bc9bdc035d3\" controls\u003e\u003c/video\u003e\n\u003cp\u003e\n\u003ca href=\"https://huggingface.co/spaces/fastrtc/talk-to-claude\"\u003eDemo\u003c/a\u003e |\n\u003ca 
href=\"https://huggingface.co/spaces/fastrtc/talk-to-claude/blob/main/app.py\"\u003eCode\u003c/a\u003e\n\u003c/p\u003e\n\u003c/td\u003e\n\u003c/tr\u003e\n\n\u003ctr\u003e\n\u003ctd width=\"50%\"\u003e\n\u003ch3\u003e🎵 Whisper Transcription\u003c/h3\u003e\n\u003cp\u003eHave whisper transcribe your speech in real time!\u003c/p\u003e\n\u003cvideo width=\"100%\" src=\"https://github.com/user-attachments/assets/87603053-acdc-4c8a-810f-f618c49caafb\" controls\u003e\u003c/video\u003e\n\u003cp\u003e\n\u003ca href=\"https://huggingface.co/spaces/fastrtc/whisper-realtime\"\u003eDemo\u003c/a\u003e |\n\u003ca href=\"https://huggingface.co/spaces/fastrtc/whisper-realtime/blob/main/app.py\"\u003eCode\u003c/a\u003e\n\u003c/p\u003e\n\u003c/td\u003e\n\u003ctd width=\"50%\"\u003e\n\u003ch3\u003e📷 Yolov10 Object Detection\u003c/h3\u003e\n\u003cp\u003eRun the Yolov10 model on a user webcam stream in real time!\u003c/p\u003e\n\u003cvideo width=\"100%\" src=\"https://github.com/user-attachments/assets/f82feb74-a071-4e81-9110-a01989447ceb\" controls\u003e\u003c/video\u003e\n\u003cp\u003e\n\u003ca href=\"https://huggingface.co/spaces/fastrtc/object-detection\"\u003eDemo\u003c/a\u003e |\n\u003ca href=\"https://huggingface.co/spaces/fastrtc/object-detection/blob/main/app.py\"\u003eCode\u003c/a\u003e\n\u003c/p\u003e\n\u003c/td\u003e\n\u003c/tr\u003e\n\n\u003ctr\u003e\n\u003ctd width=\"50%\"\u003e\n\u003ch3\u003e🗣️ Kyutai Moshi\u003c/h3\u003e\n\u003cp\u003eKyutai's moshi is a novel speech-to-speech model for modeling human conversations.\u003c/p\u003e\n\u003cvideo width=\"100%\" src=\"https://github.com/user-attachments/assets/becc7a13-9e89-4a19-9df2-5fb1467a0137\" controls\u003e\u003c/video\u003e\n\u003cp\u003e\n\u003ca href=\"https://huggingface.co/spaces/freddyaboulton/talk-to-moshi\"\u003eDemo\u003c/a\u003e |\n\u003ca href=\"https://huggingface.co/spaces/freddyaboulton/talk-to-moshi/blob/main/app.py\"\u003eCode\u003c/a\u003e\n\u003c/p\u003e\n\u003c/td\u003e\n\u003ctd 
width=\"50%\"\u003e\n\u003ch3\u003e🗣️ Hello Llama: Stop Word Detection\u003c/h3\u003e\n\u003cp\u003eA code editor built with Llama 3.3 70b that is triggered by the phrase \"Hello Llama\". Build a Siri-like coding assistant in 100 lines of code!\u003c/p\u003e\n\u003cvideo width=\"100%\" src=\"https://github.com/user-attachments/assets/3e10cb15-ff1b-4b17-b141-ff0ad852e613\" controls\u003e\u003c/video\u003e\n\u003cp\u003e\n\u003ca href=\"https://huggingface.co/spaces/freddyaboulton/hey-llama-code-editor\"\u003eDemo\u003c/a\u003e |\n\u003ca href=\"https://huggingface.co/spaces/freddyaboulton/hey-llama-code-editor/blob/main/app.py\"\u003eCode\u003c/a\u003e\n\u003c/p\u003e\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n## Usage\n\nThis is a shortened version of the official [usage guide](https://freddyaboulton.github.io/gradio-webrtc/user-guide/). \n\n- `.ui.launch()`: Launch a built-in UI for easily testing and sharing your stream. Built with [Gradio](https://www.gradio.app/).\n- `.fastphone()`: Get a free temporary phone number to call into your stream. Hugging Face token required.\n- `.mount(app)`: Mount the stream on a [FastAPI](https://fastapi.tiangolo.com/) app. 
Perfect for integrating with your existing production system.\n\n\n## Quickstart\n\n### Echo Audio\n\n```python\nfrom fastrtc import Stream, ReplyOnPause\nimport numpy as np\n\ndef echo(audio: tuple[int, np.ndarray]):\n    # The function will be passed the audio until the user pauses\n    # Implement any iterator that yields audio\n    # See \"LLM Voice Chat\" for a more complete example\n    yield audio\n\nstream = Stream(\n    handler=ReplyOnPause(echo),\n    modality=\"audio\", \n    mode=\"send-receive\",\n)\n```\n\n### LLM Voice Chat\n\n```py\nfrom fastrtc import (\n    ReplyOnPause, AdditionalOutputs, Stream,\n    audio_to_bytes, aggregate_bytes_to_16bit\n)\nimport gradio as gr\nimport numpy as np\nfrom groq import Groq\nimport anthropic\nfrom elevenlabs import ElevenLabs\n\ngroq_client = Groq()\nclaude_client = anthropic.Anthropic()\ntts_client = ElevenLabs()\n\n\n# See \"Talk to Claude\" in Cookbook for an example of how to keep \n# track of the chat history.\ndef response(\n    audio: tuple[int, np.ndarray],\n):\n    prompt = groq_client.audio.transcriptions.create(\n        file=(\"audio-file.mp3\", audio_to_bytes(audio)),\n        model=\"whisper-large-v3-turbo\",\n        response_format=\"verbose_json\",\n    ).text\n    response = claude_client.messages.create(\n        model=\"claude-3-5-haiku-20241022\",\n        max_tokens=512,\n        messages=[{\"role\": \"user\", \"content\": prompt}],\n    )\n    response_text = \" \".join(\n        block.text\n        for block in response.content\n        if getattr(block, \"type\", None) == \"text\"\n    )\n    iterator = tts_client.text_to_speech.convert_as_stream(\n        text=response_text,\n        voice_id=\"JBFqnCBsd6RMkjVDRZzb\",\n        model_id=\"eleven_multilingual_v2\",\n        output_format=\"pcm_24000\",\n    )\n    for chunk in aggregate_bytes_to_16bit(iterator):\n        audio_array = np.frombuffer(chunk, dtype=np.int16).reshape(1, -1)\n        yield (24000, audio_array)\n\nstream = Stream(\n 
   modality=\"audio\",\n    mode=\"send-receive\",\n    handler=ReplyOnPause(response),\n)\n```\n\n### Webcam Stream\n\n```python\nfrom fastrtc import Stream\nimport numpy as np\n\n\ndef flip_vertically(image):\n    return np.flip(image, axis=0)\n\n\nstream = Stream(\n    handler=flip_vertically,\n    modality=\"video\",\n    mode=\"send-receive\",\n)\n```\n\n### Object Detection\n\n```python\nfrom fastrtc import Stream\nimport gradio as gr\nimport cv2\nfrom huggingface_hub import hf_hub_download\nfrom .inference import YOLOv10\n\nmodel_file = hf_hub_download(\n    repo_id=\"onnx-community/yolov10n\", filename=\"onnx/model.onnx\"\n)\n\n# git clone https://huggingface.co/spaces/fastrtc/object-detection\n# for YOLOv10 implementation\nmodel = YOLOv10(model_file)\n\ndef detection(image, conf_threshold=0.3):\n    image = cv2.resize(image, (model.input_width, model.input_height))\n    new_image = model.detect_objects(image, conf_threshold)\n    return cv2.resize(new_image, (500, 500))\n\nstream = Stream(\n    handler=detection,\n    modality=\"video\", \n    mode=\"send-receive\",\n    additional_inputs=[\n        gr.Slider(minimum=0, maximum=1, step=0.01, value=0.3)\n    ]\n)\n```\n\n## Running the Stream\n\nRun the stream in one of the following ways:\n\n### Gradio\n\n```py\nstream.ui.launch()\n```\n\n### Telephone (Audio Only)\n\n```py\nstream.fastphone()\n```\n\n### FastAPI\n\n```py\nfrom fastapi import FastAPI\nfrom fastapi.responses import HTMLResponse\n\napp = FastAPI()\nstream.mount(app)\n\n# Optional: Add routes\n@app.get(\"/\")\nasync def _():\n    return HTMLResponse(content=open(\"index.html\").read())\n\n# uvicorn app:app --host 0.0.0.0 --port 8000\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgradio-app%2Ffastrtc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgradio-app%2Ffastrtc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgradio-app%2Ffastrtc/lists"}