{"id":17756045,"url":"https://github.com/kristofferv98/whisper_turboapi","last_synced_at":"2025-05-12T00:41:36.689Z","repository":{"id":259471011,"uuid":"877576466","full_name":"kristofferv98/whisper_turboapi","owner":"kristofferv98","description":"An optimized FastAPI server for OpenAI's Whisper whisper-large-v3-turbo model using MLX optimization","archived":false,"fork":false,"pushed_at":"2024-10-24T15:06:27.000Z","size":386,"stargazers_count":9,"open_issues_count":0,"forks_count":2,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-12T00:41:32.575Z","etag":null,"topics":["ai","api","asynchronous","audio","audio-processing","fastapi","huggingface","machine-learning","macos","mlx","model-serving","nlp","openai","optimization","python","speech-to-text","synchronous","transcription","whisper","whisper-turbo"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kristofferv98.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-10-23T22:25:58.000Z","updated_at":"2025-05-10T07:29:44.000Z","dependencies_parsed_at":"2024-10-25T19:53:13.517Z","dependency_job_id":"bc2482b8-9700-4e9c-9aca-5934473c8d59","html_url":"https://github.com/kristofferv98/whisper_turboapi","commit_stats":null,"previous_names":["kristofferv98/whisper_turboapi"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kristofferv98%2Fwhisper_turboapi","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposi
tories/kristofferv98%2Fwhisper_turboapi/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kristofferv98%2Fwhisper_turboapi/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kristofferv98%2Fwhisper_turboapi/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kristofferv98","download_url":"https://codeload.github.com/kristofferv98/whisper_turboapi/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253655919,"owners_count":21943072,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","api","asynchronous","audio","audio-processing","fastapi","huggingface","machine-learning","macos","mlx","model-serving","nlp","openai","optimization","python","speech-to-text","synchronous","transcription","whisper","whisper-turbo"],"created_at":"2024-10-26T15:08:57.935Z","updated_at":"2025-05-12T00:41:36.657Z","avatar_url":"https://github.com/kristofferv98.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# WhisperTurboAPI\n\nAn optimized FastAPI server implementation of OpenAI's Whisper large-v3-turbo model using MLX optimization, designed for low-latency, asynchronous and synchronous audio transcription on macOS.\n\n## Features\n\n- 🎙️ **Fast Audio Transcription:** Leverage the turbocharged, MLX-optimized Whisper large-v3-turbo model for quick and accurate transcriptions.\n- 🌐 **RESTful API Access:** Easily integrate with any environment that supports HTTP requests.\n- ⚡ **Async/Sync Support:** 
Seamlessly handle both asynchronous and synchronous transcription requests.\n- 🔄 **Low Latency:** Optimized for minimal delay by preloading models and efficient processing.\n- 🔧 **High Throughput:** Capable of handling multiple concurrent transcription requests.\n- 🚀 **Easy Setup:** Simple setup process.\n\n## System Requirements\n\n- **Operating System:** macOS with Apple Silicon (recommended for optimal performance due to MLX optimizations)\n- **Python Version:** 3.12.3 (required for MLX optimization)\n- **Python Version Manager:** [pyenv](https://github.com/pyenv/pyenv) (recommended but optional)\n\n## Installation\n\n### 1. Clone the Repository\n\n```bash\ngit clone https://github.com/kristofferv98/whisper_turboapi.git\ncd whisper_turboapi\n```\n\n### 2. Run the Setup Script\n\nThe `setup.sh` script will handle the environment setup, including checking for the required Python version, creating a virtual environment, installing dependencies, and verifying the installation.\n\n```bash\n./setup.sh\n```\n\nOptional Flags:\n\n- `-y` or `--yes`: Run in headless mode (assume 'yes' to all prompts)\n- `-p` or `--python-version`: Specify the required Python version (default: 3.12.3)\n- `-f` or `--force`: Force actions like recreating the virtual environment\n- `-h` or `--help`: Display help message\n\nExample:\n\n```bash\n./setup.sh -y\n```\n\nThis will run the setup in headless mode, assuming 'yes' to all prompts.\n\n### 3. 
Start the Server\n\nAfter the setup completes, you can start the server using:\n\n```bash\n./start_server.sh\n```\n\nThe server supports several command-line options:\n\n```bash\n./start_server.sh --host=0.0.0.0  # Custom host (default: 0.0.0.0)\n./start_server.sh --port=8080     # Custom port (default: 8000)\n./start_server.sh --help          # Show help message\n```\n\nYou can combine options:\n```bash\n./start_server.sh --host=127.0.0.1 --port=8080\n```\n\nThe server will automatically:\n- Activate the virtual environment if not already active\n- Verify all required packages are installed\n- Start the FastAPI server with the specified host and port\n\n**Note**: The `start_server.sh` script ensures that the virtual environment is activated and that all required packages are installed before starting the server.\n\n## Usage Examples\n\n### Simple Python Client\n\nHere’s a basic synchronous client example using `requests`.\n\n```python\nimport requests\nimport os\n\ndef transcribe_audio(file_path, quick=True, any_lang=True, server_url=\"http://localhost:8000\"):\n    \"\"\"\n    Simplified function to transcribe an audio file using the WhisperTurboAPI server.\n\n    Args:\n        file_path (str): Path to the audio file.\n        quick (bool, optional): Whether to use quick mode. Default is True.\n        any_lang (bool, optional): Whether to allow any language detection. Default is True.\n        server_url (str, optional): URL of the transcription server.\n\n    Returns:\n        str: The transcribed text.\n    \"\"\"\n    if not os.path.exists(file_path):\n        raise FileNotFoundError(f\"Audio file not found: {file_path}\")\n        \n    if not file_path.lower().endswith(('.wav', '.mp3', '.m4a', '.flac')):\n        raise ValueError(\"Unsupported audio format. 
Use WAV, MP3, M4A, or FLAC\")\n        \n    with open(file_path, 'rb') as f:\n        # No explicit content type: a hard-coded 'audio/wav' would mislabel MP3, M4A, and FLAC uploads\n        files = {'file': (os.path.basename(file_path), f)}\n        params = {\n            'quick': str(quick).lower(),\n            'any_lang': str(any_lang).lower()\n        }\n        response = requests.post(f\"{server_url}/transcribe\", files=files, params=params)\n    if response.status_code == 200:\n        return response.json()['text']\n    else:\n        raise Exception(f\"Request failed: {response.status_code}, {response.text}\")\n\n# Example usage\nif __name__ == \"__main__\":\n    # Replace 'path/to/audio.wav' with the path to your audio file\n    audio_file = 'path/to/audio.wav'\n    try:\n        transcription = transcribe_audio(audio_file, quick=True, any_lang=False)\n        print(\"Transcription:\")\n        print(transcription)\n    except Exception as e:\n        print(f\"An error occurred: {e}\")\n```\n\n### Asynchronous Client\n\nAn asynchronous client example using `aiohttp`.\n\n```python\nimport aiohttp\nimport asyncio\nimport os\n\nasync def transcribe_async(file_path, quick=True, any_lang=True, server_url=\"http://localhost:8000\"):\n    \"\"\"\n    Asynchronous function to transcribe an audio file using the WhisperTurboAPI server.\n\n    Args:\n        file_path (str): Path to the audio file.\n        quick (bool, optional): Whether to use quick mode. Default is True.\n        any_lang (bool, optional): Whether to allow any language detection. Default is True.\n        server_url (str, optional): URL of the transcription server.\n\n    Returns:\n        str: The transcribed text.\n    \"\"\"\n    if not os.path.exists(file_path):\n        raise FileNotFoundError(f\"Audio file not found: {file_path}\")\n\n    if not file_path.lower().endswith(('.wav', '.mp3', '.m4a', '.flac')):\n        raise ValueError(\"Unsupported audio format. 
Use WAV, MP3, M4A, or FLAC\")\n\n    async with aiohttp.ClientSession() as session:\n        with open(file_path, 'rb') as f:\n            data = aiohttp.FormData()\n            data.add_field('file', f, filename=os.path.basename(file_path))\n            params = {\n                'quick': str(quick).lower(),\n                'any_lang': str(any_lang).lower()\n            }\n\n            async with session.post(f\"{server_url}/transcribe\", data=data, params=params) as response:\n                if response.status == 200:\n                    result = await response.json()\n                    return result['text']\n                else:\n                    error_detail = await response.text()\n                    raise Exception(f\"Error: {response.status} - {error_detail}\")\n\n# Example usage\nif __name__ == \"__main__\":\n    async def main():\n        audio_file = 'path/to/audio.wav'\n        try:\n            transcription = await transcribe_async(audio_file, quick=False, any_lang=True)\n            print(\"Transcription:\")\n            print(transcription)\n        except Exception as e:\n            print(f\"An error occurred: {e}\")\n\n    asyncio.run(main())\n```\n\n### CURL\n\nYou can use `curl` to test the API. 
Here's a working example:\n\n```bash\ncurl -X POST \\\n  -H \"Content-Type: multipart/form-data\" \\\n  -F \"file=@path/to/your/audio.wav\" \\\n  http://localhost:8000/transcribe\n```\n\nFor example, with a sample audio file:\n```bash\ncurl -X POST \\\n  -H \"Content-Type: multipart/form-data\" \\\n  -F \"file=@sample_audio.wav\" \\\n  http://localhost:8000/transcribe\n```\n\nThe response is returned as JSON:\n```json\n{\n    \"text\": \"Transcribed text here...\",\n    \"elapsed_time\": 1.44,\n    \"quick_mode\": true,\n    \"any_lang\": true\n}\n```\n\n## API Reference\n\n### POST /transcribe\n\nTranscribe an audio file.\n\n**Headers**:\n- `Content-Type: multipart/form-data` (required)\n\n**Parameters**:\n- `file` (form data, required): Audio file (WAV, MP3, M4A, FLAC)\n- `quick` (query, optional): Whether to use quick mode; the client examples send `true` or `false`\n- `any_lang` (query, optional): Whether to allow any language detection; the client examples send `true` or `false`\n\n**Example Request**:\n```bash\ncurl -X POST \\\n  -H \"Content-Type: multipart/form-data\" \\\n  -F \"file=@audio.wav\" \\\n  http://localhost:8000/transcribe\n```\n\n**Response**:\n```json\n{\n    \"text\": \"The transcribed text will appear here...\",\n    \"elapsed_time\": 1.44,\n    \"quick_mode\": true,\n    \"any_lang\": true\n}\n```\n\n**Status Codes**:\n- `200 OK`: Successful transcription\n- `400 Bad Request`: Invalid file format or missing file\n- `500 Internal Server Error`: Server processing error\n\n### GET /health\n\nCheck server status.\n\n**Response**:\n\n```json\n{\n    \"status\": \"healthy\",\n    \"version\": \"1.0.0\"\n}\n```\n\n## Demo\n\nFor testing, example clients are provided in:\n\n- `examples/demo.py`: Demonstrates both the synchronous and asynchronous clients\n- `examples/simple_demo.py`: Basic synchronous client\n\n## License\n\nMIT License\n\n## Acknowledgements\n\nBased on `whisper-turbo-mlx` by JosefAlbers, which provides a fast and lightweight implementation of the Whisper model using 
MLX.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkristofferv98%2Fwhisper_turboapi","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkristofferv98%2Fwhisper_turboapi","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkristofferv98%2Fwhisper_turboapi/lists"}