{"id":22344443,"url":"https://github.com/knuckles-team/audio-transcriber","last_synced_at":"2026-05-07T01:02:00.565Z","repository":{"id":88641712,"uuid":"582748036","full_name":"Knuckles-Team/audio-transcriber","owner":"Knuckles-Team","description":"Transcribe audio into text. Agentic AI supported through MCP Server.","archived":false,"fork":false,"pushed_at":"2026-04-20T07:43:10.000Z","size":664,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-04-20T09:35:21.891Z","etag":null,"topics":["a2a","a2a-server","ag-ui","mcp-server","openai","transcription","transformers","whisper"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Knuckles-Team.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2022-12-27T18:57:18.000Z","updated_at":"2026-04-20T07:43:15.000Z","dependencies_parsed_at":"2026-03-04T07:01:40.473Z","dependency_job_id":null,"html_url":"https://github.com/Knuckles-Team/audio-transcriber","commit_stats":null,"previous_names":[],"tags_count":82,"template":false,"template_full_name":null,"purl":"pkg:github/Knuckles-Team/audio-transcriber","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Knuckles-Team%2Faudio-transcriber","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Knuckles-Team%2Faudio-transcriber/tags","releases_url":"https://repos.ecosyste
.ms/api/v1/hosts/GitHub/repositories/Knuckles-Team%2Faudio-transcriber/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Knuckles-Team%2Faudio-transcriber/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Knuckles-Team","download_url":"https://codeload.github.com/Knuckles-Team/audio-transcriber/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Knuckles-Team%2Faudio-transcriber/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32718323,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-07T00:29:05.620Z","status":"ssl_error","status_checked_at":"2026-05-07T00:28:57.074Z","response_time":117,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["a2a","a2a-server","ag-ui","mcp-server","openai","transcription","transformers","whisper"],"created_at":"2024-12-04T09:11:37.377Z","updated_at":"2026-05-07T01:02:00.550Z","avatar_url":"https://github.com/Knuckles-Team.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Audio-Transcriber - A2A | AG-UI | MCP\n\n![PyPI - Version](https://img.shields.io/pypi/v/audio-transcriber)\n![MCP Server](https://badge.mcpx.dev?type=server 'MCP Server')\n![PyPI - 
Downloads](https://img.shields.io/pypi/dd/audio-transcriber)\n![GitHub Repo stars](https://img.shields.io/github/stars/Knuckles-Team/audio-transcriber)\n![GitHub forks](https://img.shields.io/github/forks/Knuckles-Team/audio-transcriber)\n![GitHub contributors](https://img.shields.io/github/contributors/Knuckles-Team/audio-transcriber)\n![PyPI - License](https://img.shields.io/pypi/l/audio-transcriber)\n![GitHub](https://img.shields.io/github/license/Knuckles-Team/audio-transcriber)\n\n![GitHub last commit (by committer)](https://img.shields.io/github/last-commit/Knuckles-Team/audio-transcriber)\n![GitHub pull requests](https://img.shields.io/github/issues-pr/Knuckles-Team/audio-transcriber)\n![GitHub closed pull requests](https://img.shields.io/github/issues-pr-closed/Knuckles-Team/audio-transcriber)\n![GitHub issues](https://img.shields.io/github/issues/Knuckles-Team/audio-transcriber)\n\n![GitHub top language](https://img.shields.io/github/languages/top/Knuckles-Team/audio-transcriber)\n![GitHub language count](https://img.shields.io/github/languages/count/Knuckles-Team/audio-transcriber)\n![GitHub repo size](https://img.shields.io/github/repo-size/Knuckles-Team/audio-transcriber)\n![GitHub repo file count (file type)](https://img.shields.io/github/directory-file-count/Knuckles-Team/audio-transcriber)\n![PyPI - Wheel](https://img.shields.io/pypi/wheel/audio-transcriber)\n![PyPI - Implementation](https://img.shields.io/pypi/implementation/audio-transcriber)\n\n*Version: 0.11.2*\n\n## Overview\n\nTranscribe .wav, .mp4, .mp3, and .flac files to text, or record your own audio!\n\nThis repository is actively maintained. Contributions are welcome!\n\nContribution Opportunities:\n- Support new models\n\nBuilt as a wrapper around [OpenAI Whisper](https://pypi.org/project/openai-whisper)\n\n## MCP\n\n### MCP Tools\n\n| Function Name      | Description                                                                 | Tag(s)             
|\n|:-------------------|:----------------------------------------------------------------------------|:-------------------|\n| `transcribe_audio` | Transcribes audio from a provided file or by recording from the microphone. | `audio_processing` |\n\n## A2A Agent\n\n### Architecture Summary\n\n```mermaid\n---\nconfig:\n  layout: dagre\n---\nflowchart TB\n subgraph subGraph0[\"Agent Capabilities\"]\n        C[\"Agent\"]\n        B[\"A2A Server - Uvicorn/FastAPI\"]\n        D[\"MCP Tools\"]\n        F[\"Agent Skills\"]\n  end\n    C --\u003e D \u0026 F\n    A[\"User Query\"] --\u003e B\n    B --\u003e C\n    D --\u003e E[\"Platform API\"]\n\n     C:::agent\n     B:::server\n     A:::server\n    classDef server fill:#f9f,stroke:#333\n    classDef agent fill:#bbf,stroke:#333,stroke-width:2px\n    style B stroke:#000000,fill:#FFD600\n    style D stroke:#000000,fill:#BBDEFB\n    style F fill:#BBDEFB\n    style A fill:#C8E6C9\n    style subGraph0 fill:#FFF9C4\n```\n\n### Component Interaction Diagram\n\n```mermaid\nsequenceDiagram\n    participant User\n    participant Server as A2A Server\n    participant Agent as Agent\n    participant Skill as Agent Skills\n    participant MCP as MCP Tools\n\n    User-\u003e\u003eServer: Send Query\n    Server-\u003e\u003eAgent: Invoke Agent\n    Agent-\u003e\u003eSkill: Analyze Skills Available\n    Skill-\u003e\u003eAgent: Provide Guidance on Next Steps\n    Agent-\u003e\u003eMCP: Invoke Tool\n    MCP--\u003e\u003eAgent: Tool Response Returned\n    Agent--\u003e\u003eAgent: Return Results Summarized\n    Agent--\u003e\u003eServer: Final Response\n    Server--\u003e\u003eUser: Output\n```\n\n## Usage\n\n### CLI\n\n| Short Flag | Long Flag        | Description                            |\n|------------|------------------|----------------------------------------|\n| -h         | --help           | See Usage                              |\n| -b         | --bitrate   | Bitrate to use during recording                               |\n| -c 
        | --channels  | Number of channels to use during recording                    |\n| -d         | --directory | Directory to save recording                                   |\n| -e         | --export    | Export txt, srt, and vtt files                                |\n| -f         | --file      | File to transcribe                                            |\n| -l         | --language  | Language to transcribe                                        |\n| -m         | --model     | Model to use: \u003ctiny, base, small, medium, large\u003e              |\n| -n         | --name      | Name of recording                                             |\n| -r         | --record    | Number of seconds to record from the microphone               |\n\n\n\n```bash\naudio-transcriber --file '~/Downloads/Federal_Reserve.mp4' --model 'large'\n```\n\n```bash\naudio-transcriber --record 60 --directory '~/Downloads/' --name 'my_recording.wav' --model 'tiny'\n```\n\n### MCP CLI\n\n| Short Flag | Long Flag                          | Description                                                                 |\n|------------|------------------------------------|-----------------------------------------------------------------------------|\n| -h         | --help                             | Display help information                                                    |\n| -t         | --transport                        | Transport method: 'stdio', 'http', or 'sse' [legacy] (default: stdio)       |\n| -s         | --host                             | Host address for HTTP transport (default: 0.0.0.0)                          |\n| -p         | --port                             | Port number for HTTP transport (default: 8000)                              |\n|            | --auth-type                        | Authentication type: 'none', 'static', 'jwt', 'oauth-proxy', 'oidc-proxy', 'remote-oauth' (default: none) |\n|            | --token-jwks-uri                   | JWKS URI for 
JWT verification                                              |\n|            | --token-issuer                     | Issuer for JWT verification                                                |\n|            | --token-audience                   | Audience for JWT verification                                              |\n|            | --oauth-upstream-auth-endpoint     | Upstream authorization endpoint for OAuth Proxy                             |\n|            | --oauth-upstream-token-endpoint    | Upstream token endpoint for OAuth Proxy                                    |\n|            | --oauth-upstream-client-id         | Upstream client ID for OAuth Proxy                                         |\n|            | --oauth-upstream-client-secret     | Upstream client secret for OAuth Proxy                                     |\n|            | --oauth-base-url                   | Base URL for OAuth Proxy                                                   |\n|            | --oidc-config-url                  | OIDC configuration URL                                                     |\n|            | --oidc-client-id                   | OIDC client ID                                                             |\n|            | --oidc-client-secret               | OIDC client secret                                                         |\n|            | --oidc-base-url                    | Base URL for OIDC Proxy                                                    |\n|            | --remote-auth-servers              | Comma-separated list of authorization servers for Remote OAuth             |\n|            | --remote-base-url                  | Base URL for Remote OAuth                                                  |\n|            | --allowed-client-redirect-uris     | Comma-separated list of allowed client redirect URIs                       |\n|            | --eunomia-type                     | Eunomia authorization type: 'none', 'embedded', 'remote' 
(default: none)   |\n|            | --eunomia-policy-file              | Policy file for embedded Eunomia (default: mcp_policies.json)              |\n|            | --eunomia-remote-url               | URL for remote Eunomia server                                              |\n\n### Using as an MCP Server\n\nThe MCP Server can be run in two modes: `stdio` (for local testing) or `http` (for networked access). To start the server, use the following commands:\n\n#### Run in stdio mode (default):\n```bash\naudio-transcriber-mcp\n```\n\n#### Run in HTTP mode:\n```bash\naudio-transcriber-mcp --transport \"http\"  --host \"0.0.0.0\"  --port \"8000\"\n```\n\n#### Model Information\n\n[Courtesy of and Credits to OpenAI: Whisper.ai](https://github.com/openai/whisper/blob/main/README.md)\n\n|  Size  | Parameters | English-only model | Multilingual model | Required VRAM | Relative speed |\n|:------:|:----------:|:------------------:|:------------------:|:-------------:|:--------------:|\n|  tiny  |    39 M    |     `tiny.en`      |       `tiny`       |     ~1 GB     |      ~32x      |\n|  base  |    74 M    |     `base.en`      |       `base`       |     ~1 GB     |      ~16x      |\n| small  |   244 M    |     `small.en`     |      `small`       |     ~2 GB     |      ~6x       |\n| medium |   769 M    |    `medium.en`     |      `medium`      |     ~5 GB     |      ~2x       |\n| large  |   1550 M   |        N/A         |      `large`       |    ~10 GB     |       1x       |\n\n\n\n### Deploy MCP Server as a Service\n\nThe audio-transcriber MCP server can be deployed using Docker, with configurable authentication, middleware, and Eunomia authorization.\n\n#### Using Docker Run\n\n```bash\ndocker pull knucklessg1/audio-transcriber:latest\n\ndocker run -d \\\n  --name audio-transcriber-mcp \\\n  -p 8004:8004 \\\n  -e HOST=0.0.0.0 \\\n  -e PORT=8004 \\\n  -e TRANSPORT=http \\\n  -e AUTH_TYPE=none \\\n  -e EUNOMIA_TYPE=none \\\n  knucklessg1/audio-transcriber:latest\n```\n\nFor 
advanced authentication (e.g., JWT, OAuth Proxy, OIDC Proxy, Remote OAuth) or Eunomia, add the relevant environment variables:\n\n```bash\ndocker run -d \\\n  --name audio-transcriber-mcp \\\n  -p 8004:8004 \\\n  -e HOST=0.0.0.0 \\\n  -e PORT=8004 \\\n  -e TRANSPORT=http \\\n  -e AUTH_TYPE=oidc-proxy \\\n  -e OIDC_CONFIG_URL=https://provider.com/.well-known/openid-configuration \\\n  -e OIDC_CLIENT_ID=your-client-id \\\n  -e OIDC_CLIENT_SECRET=your-client-secret \\\n  -e OIDC_BASE_URL=https://your-server.com \\\n  -e ALLOWED_CLIENT_REDIRECT_URIS=http://localhost:*,https://*.example.com/* \\\n  -e EUNOMIA_TYPE=embedded \\\n  -e EUNOMIA_POLICY_FILE=/app/mcp_policies.json \\\n  knucklessg1/audio-transcriber:latest\n```\n\n#### Using Docker Compose\n\nCreate a `docker-compose.yml` file:\n\n```yaml\nservices:\n  audio-transcriber-mcp:\n    image: knucklessg1/audio-transcriber:latest\n    environment:\n      - HOST=0.0.0.0\n      - PORT=8004\n      - TRANSPORT=http\n      - AUTH_TYPE=none\n      - EUNOMIA_TYPE=none\n    ports:\n      - 8004:8004\n```\n\nFor advanced setups with authentication and Eunomia:\n\n```yaml\nservices:\n  audio-transcriber-mcp:\n    image: knucklessg1/audio-transcriber:latest\n    environment:\n      - HOST=0.0.0.0\n      - PORT=8004\n      - TRANSPORT=http\n      - AUTH_TYPE=oidc-proxy\n      - OIDC_CONFIG_URL=https://provider.com/.well-known/openid-configuration\n      - OIDC_CLIENT_ID=your-client-id\n      - OIDC_CLIENT_SECRET=your-client-secret\n      - OIDC_BASE_URL=https://your-server.com\n      - ALLOWED_CLIENT_REDIRECT_URIS=http://localhost:*,https://*.example.com/*\n      - EUNOMIA_TYPE=embedded\n      - EUNOMIA_POLICY_FILE=/app/mcp_policies.json\n    ports:\n      - 8004:8004\n    volumes:\n      - ./mcp_policies.json:/app/mcp_policies.json\n```\n\nRun the service:\n\n```bash\ndocker-compose up -d\n```\n\n#### Configure `mcp.json` for AI Integration\n\nConfigure `mcp.json`\n```json\n{\n  \"mcpServers\": {\n    \"audio_transcriber\": {\n 
     \"command\": \"uv\",\n      \"args\": [\n        \"run\",\n        \"--with\",\n        \"audio-transcriber\",\n        \"audio-transcriber-mcp\"\n      ],\n      \"env\": {\n        \"WHISPER_MODEL\": \"medium\",\n        \"TRANSCRIBE_DIRECTORY\": \"~/Downloads\"\n      },\n      \"timeout\": 200000\n    }\n  }\n}\n```\n\nBoth `WHISPER_MODEL` and `TRANSCRIBE_DIRECTORY` are optional.\n\n### A2A CLI\n\n#### Endpoints\n\n- **Web UI**: `http://localhost:8000/` (if enabled)\n- **A2A**: `http://localhost:8000/a2a` (Discovery: `/a2a/.well-known/agent.json`)\n- **AG-UI**: `http://localhost:8000/ag-ui` (POST)\n\n| Short Flag | Long Flag         | Description                                                            |\n|------------|-------------------|------------------------------------------------------------------------|\n| -h         | --help            | Display help information                                               |\n|            | --host            | Host to bind the server to (default: 0.0.0.0)                          |\n|            | --port            | Port to bind the server to (default: 9000)                             |\n|            | --reload          | Enable auto-reload                                                     |\n|            | --provider        | LLM Provider: 'openai', 'anthropic', 'google', 'huggingface'           |\n|            | --model-id        | LLM Model ID (default: nvidia/nemotron-3-super)                        |\n|            | --base-url        | LLM Base URL (for OpenAI compatible providers)                         |\n|            | --api-key         | LLM API Key                                                            |\n|            | --mcp-url         | MCP Server URL (default: http://localhost:8000/mcp)                    |\n|            | --web             | Enable Pydantic AI Web UI (default: False, Env: ENABLE_WEB_UI)         |\n\n\n## Install Python Package\n\n```bash\npython -m pip install 
audio-transcriber\n```\n\nor\n\n```bash\nuv pip install --upgrade audio-transcriber\n```\n\n### Ubuntu Dependencies\n```bash\nsudo apt-get update\nsudo apt-get install libasound-dev portaudio19-dev libportaudio2 libportaudiocpp0 ffmpeg gcc -y\n```\n\n\n## Repository Owners\n\n\n\u003cimg width=\"100%\" height=\"180em\" src=\"https://github-readme-stats.vercel.app/api?username=Knucklessg1\u0026show_icons=true\u0026hide_border=true\u0026\u0026count_private=true\u0026include_all_commits=true\" /\u003e\n\n![GitHub followers](https://img.shields.io/github/followers/Knucklessg1)\n![GitHub User's stars](https://img.shields.io/github/stars/Knucklessg1)\n\n\n## MCP Configuration Examples\n\n### 1. Standard IO (stdio) Deployment\n\n```json\n{\n  \"mcpServers\": {\n    \"audio-transcriber\": {\n      \"command\": \"uv\",\n      \"args\": [\n        \"run\",\n        \"audio-transcriber-mcp\"\n      ],\n      \"env\": {\n        \"AGENT_DESCRIPTION\": \"\u003cYOUR_AGENT_DESCRIPTION\u003e\",\n        \"AGENT_SYSTEM_PROMPT\": \"\u003cYOUR_AGENT_SYSTEM_PROMPT\u003e\",\n        \"AUDIO_PROCESSINGTOOL\": \"True\",\n        \"DEFAULT_AGENT_NAME\": \"\u003cYOUR_DEFAULT_AGENT_NAME\u003e\",\n        \"TRANSCRIBE_DIRECTORY\": \"\u003cYOUR_TRANSCRIBE_DIRECTORY\u003e\",\n        \"WHISPER_MODEL\": \"\u003cYOUR_WHISPER_MODEL\u003e\"\n      }\n    }\n  }\n}\n```\n\n### 2. 
Streamable HTTP Deployment\n\n```json\n{\n  \"mcpServers\": {\n    \"audio-transcriber\": {\n      \"command\": \"uv\",\n      \"args\": [\n        \"run\",\n        \"audio-transcriber-mcp\",\n        \"--transport\",\n        \"http\",\n        \"--host\",\n        \"0.0.0.0\",\n        \"--port\",\n        \"8000\"\n      ],\n      \"env\": {\n        \"AGENT_DESCRIPTION\": \"\u003cYOUR_AGENT_DESCRIPTION\u003e\",\n        \"AGENT_SYSTEM_PROMPT\": \"\u003cYOUR_AGENT_SYSTEM_PROMPT\u003e\",\n        \"AUDIO_PROCESSINGTOOL\": \"True\",\n        \"DEFAULT_AGENT_NAME\": \"\u003cYOUR_DEFAULT_AGENT_NAME\u003e\",\n        \"TRANSCRIBE_DIRECTORY\": \"\u003cYOUR_TRANSCRIBE_DIRECTORY\u003e\",\n        \"WHISPER_MODEL\": \"\u003cYOUR_WHISPER_MODEL\u003e\"\n      }\n    }\n  }\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fknuckles-team%2Faudio-transcriber","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fknuckles-team%2Faudio-transcriber","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fknuckles-team%2Faudio-transcriber/lists"}