{"id":21343520,"url":"https://github.com/cloudmercato/whisper-benchmark","last_synced_at":"2025-10-28T10:08:38.413Z","repository":{"id":206521842,"uuid":"716886394","full_name":"cloudmercato/whisper-benchmark","owner":"cloudmercato","description":"A simple tool to evaluate performance of whisper models","archived":false,"fork":false,"pushed_at":"2023-11-11T15:23:47.000Z","size":15,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-08-18T07:02:43.906Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cloudmercato.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-11-10T04:35:42.000Z","updated_at":"2023-11-10T04:36:10.000Z","dependencies_parsed_at":"2025-01-22T15:46:55.840Z","dependency_job_id":"2d85762f-66c4-42b8-82d5-56e6f2e1d0f5","html_url":"https://github.com/cloudmercato/whisper-benchmark","commit_stats":null,"previous_names":["cloudmercato/whisper-benchmark"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/cloudmercato/whisper-benchmark","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cloudmercato%2Fwhisper-benchmark","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cloudmercato%2Fwhisper-benchmark/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cloudmercato%2Fwhisper-benchmark/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposi
tories/cloudmercato%2Fwhisper-benchmark/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cloudmercato","download_url":"https://codeload.github.com/cloudmercato/whisper-benchmark/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cloudmercato%2Fwhisper-benchmark/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":281418901,"owners_count":26497898,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-28T02:00:06.022Z","response_time":60,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-22T01:13:37.284Z","updated_at":"2025-10-28T10:08:38.381Z","avatar_url":"https://github.com/cloudmercato.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"Whisper Benchmark\n========================\n\n**Whisper-Benchmark** is a simple tool to evaluate performance of `Whisper \u003chttps://github.com/openai/whisper\u003e`_ models and configurations.\n\nInstall\n-------\n\n.. 
note::\n\n  This project currently has to be used with `Cloud Mercato's version \u003chttps://github.com/cloudmercato/whisper\u003e`_ of Whisper.\n  A `pull request \u003chttps://github.com/openai/whisper/pull/1787\u003e`_ addressing this is in progress.\n\nAfter installing Whisper: ::\n\n  pip install https://github.com/cloudmercato/whisper-benchmark/archive/refs/heads/main.zip\n\n\nUsage\n-----\n\nCommand line\n~~~~~~~~~~~~\n\nMost of the original Whisper options are available: ::\n\n  $ whisper-benchmark --help\n  usage: whisper-benchmark [-h]\n                           [--model-name {tiny,base,small,medium,large,large-v1,large-v2,large-v3,tiny.en,base.en,small.en,medium.en}]\n                           [--device DEVICE] [--task {transcribe,translate}] [--verbose VERBOSE]\n                           [--temperature TEMPERATURE] [--best_of BEST_OF] [--beam_size BEAM_SIZE]\n                           [--patience PATIENCE] [--length_penalty LENGTH_PENALTY]\n                           [--suppress_tokens SUPPRESS_TOKENS] [--initial_prompt INITIAL_PROMPT]\n                           [--condition_on_previous_text] [--fp16]\n                           [--temperature_increment_on_fallback TEMPERATURE_INCREMENT_ON_FALLBACK]\n                           [--compression_ratio_threshold COMPRESSION_RATIO_THRESHOLD]\n                           [--logprob_threshold LOGPROB_THRESHOLD]\n                           [--no_speech_threshold NO_SPEECH_THRESHOLD] [--threads THREADS]\n                           audio_id\n\n  positional arguments:\n    audio_id              audio file to transcribe\n\n  options:\n    -h, --help            show this help message and exit\n    --model-name {tiny,base,small,medium,large,large-v1,large-v2,large-v3,tiny.en,base.en,small.en,medium.en}\n                          name of the Whisper model to use (default: small)\n    --device DEVICE       device to use for PyTorch inference (default: cpu)\n    --task {transcribe,translate}\n                   
       whether to perform X-\u003eX speech recognition ('transcribe') or X-\u003eEnglish\n                          translation ('translate') (default: transcribe)\n    --verbose VERBOSE     0: Muted, 1: Info, 2: Verbose (default: 0)\n    --temperature TEMPERATURE\n                          temperature to use for sampling (default: 0)\n    --best_of BEST_OF     number of candidates when sampling with non-zero temperature (default: 5)\n    --beam_size BEAM_SIZE\n                          number of beams in beam search, only applicable when temperature is zero\n                          (default: 5)\n    --patience PATIENCE   optional patience value to use in beam decoding, as in\n                          https://arxiv.org/abs/2204.05424, the default (1.0) is equivalent to conventional\n                          beam search (default: None)\n    --length_penalty LENGTH_PENALTY\n                          optional token length penalty coefficient (alpha) as in\n                          https://arxiv.org/abs/1609.08144, uses simple length normalization by default\n                          (default: None)\n    --suppress_tokens SUPPRESS_TOKENS\n                          comma-separated list of token ids to suppress during sampling; '-1' will suppress\n                          most special characters except common punctuations (default: -1)\n    --initial_prompt INITIAL_PROMPT\n                          optional text to provide as a prompt for the first window. 
(default: None)\n    --condition_on_previous_text\n                          if True, provide the previous output of the model as a prompt for the next\n                          window; disabling may make the text inconsistent across windows, but the model\n                          becomes less prone to getting stuck in a failure loop (default: False)\n    --fp16                whether to perform inference in fp16; True by default (default: False)\n    --temperature_increment_on_fallback TEMPERATURE_INCREMENT_ON_FALLBACK\n                          temperature to increase when falling back when the decoding fails to meet either\n                          of the thresholds below (default: 0.2)\n    --compression_ratio_threshold COMPRESSION_RATIO_THRESHOLD\n                          if the gzip compression ratio is higher than this value, treat the decoding as\n                          failed (default: 2.4)\n    --logprob_threshold LOGPROB_THRESHOLD\n                          if the average log probability is lower than this value, treat the decoding as\n                          failed (default: -1.0)\n    --no_speech_threshold NO_SPEECH_THRESHOLD\n                          if the probability of the \u003c|nospeech|\u003e token is higher than this value AND the\n                          decoding has failed due to `logprob_threshold`, consider the segment as silence\n                          (default: 0.6)\n    --threads THREADS     number of threads used by torch for CPU inference; supercedes\n                          MKL_NUM_THREADS/OMP_NUM_THREADS (default: 0)\n\nTest example\n~~~~~~~~~~~~\n\nTranscribe an English male voice with the `tiny` model: ::\n\n  $ whisper-benchmark en-male-1 --model-name tiny\n  content_frames : 16297\n  dtype : torch.float32\n  language : en\n  start_time : 1699593688.3675494\n  end_time : 1699593693.0126545\n  elapsed : 4.6451051235198975\n  fps : 3508.4243664330047   \u003c-- This is the main value to pay attention to\n  device : 
cuda\n  audio_id : en-male-1\n  version : 0.0.1\n  torch_version : 2.0.1+cu117\n  cuda_version : 11.7\n  python_version : 3.10.12\n  whisper_version : 20231106\n  numba_version : 0.58.1\n  numpy_version : 1.26.1\n  threads : 1\n\nAudio source\n------------\n\nThe audio files are selected from `Wikimedia Commons \u003chttps://commons.wikimedia.org/wiki/Main_Page\u003e`_. Here's the list:\n\n- **en-male-1**: `\"The Call of South Africa\", read by Philip Burgers \u003chttps://commons.wikimedia.org/wiki/File:%22The_Call_of_South_Africa%22,_read_by_Philip_Burgers.flac\u003e`_\n- **en-male-2**: `Nanotechnology lead reading \u003chttps://commons.wikimedia.org/wiki/File:0_nanolead_q10.ogg\u003e`_\n- **en-male-3**: `Why There's A Cat Curfew in My House \u003chttps://commons.wikimedia.org/wiki/File:12_Why_There%27s_A_Cat_Curfew_in_My_House.oga\u003e`_\n- **en-female-1**: `Alessia Cara's voice, from Border Crossings on VOA at Jingle Ball 2016 \u003chttps://commons.wikimedia.org/wiki/File:Alessia_Cara%27s_voice,_from_Border_Crossings_on_VOA_at_Jingle_Ball_2016.mp3\u003e`_\n- **en-female-2**: `Jabberwocky \u003chttps://commons.wikimedia.org/wiki/File:Jabberwocky.ogg\u003e`_\n- **en-female-3**: `Joely Richardson on the Albert Memorial \u003chttps://commons.wikimedia.org/wiki/File:Joely_Richardson_on_the_Albert_Memorial.ogg\u003e`_\n\nFeel free to contribute by adding more audio files, especially for non-English languages.\n\nContribute\n----------\n\nThis project is created with ❤️ for free by `Cloud Mercato`_ under BSD License. Feel free to contribute by submitting a pull request or an issue.\n\n.. _`Cloud Mercato`: https://www.cloud-mercato.com/\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcloudmercato%2Fwhisper-benchmark","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcloudmercato%2Fwhisper-benchmark","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcloudmercato%2Fwhisper-benchmark/lists"}