{"id":32542837,"url":"https://github.com/cyberagentailab/mbr-for-asr","last_synced_at":"2025-10-28T16:53:04.722Z","repository":{"id":320304544,"uuid":"1080352916","full_name":"CyberAgentAILab/mbr-for-asr","owner":"CyberAgentAILab","description":"Code for Re-evaluating Minimum Bayes Risk Decoding for Automatic Speech Recognition","archived":false,"fork":false,"pushed_at":"2025-10-23T02:14:59.000Z","size":270,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-10-23T04:12:29.254Z","etag":null,"topics":["automatic-speech-recognition"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/2510.19471","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/CyberAgentAILab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-21T08:44:53.000Z","updated_at":"2025-10-23T02:15:03.000Z","dependencies_parsed_at":"2025-10-23T04:12:40.087Z","dependency_job_id":"492de1af-a0cf-4120-b551-452465d962c6","html_url":"https://github.com/CyberAgentAILab/mbr-for-asr","commit_stats":null,"previous_names":["cyberagentailab/mbr-for-asr"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/CyberAgentAILab/mbr-for-asr","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CyberAgentAILab%2Fmbr-for-asr","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CyberAgentAILab%2Fmbr-for-asr/tags","releases_
url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CyberAgentAILab%2Fmbr-for-asr/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CyberAgentAILab%2Fmbr-for-asr/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/CyberAgentAILab","download_url":"https://codeload.github.com/CyberAgentAILab/mbr-for-asr/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CyberAgentAILab%2Fmbr-for-asr/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":281476676,"owners_count":26508145,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-28T02:00:06.022Z","response_time":60,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["automatic-speech-recognition"],"created_at":"2025-10-28T16:53:03.099Z","updated_at":"2025-10-28T16:53:04.717Z","avatar_url":"https://github.com/CyberAgentAILab.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Minimum Bayes Risk Decoding for Automated Speech Recognition\n\nThis repository contains the experiment code for [Re-evaluating Minimum Bayes Risk Decoding for Automatic Speech Recognition](https://arxiv.org/abs/2510.19471).\n\n![Minimum Bayes Risk Decoding](mbr-asr.png)\n\n### Setup\n\nOur codebase is developed and tested on Ubuntu 22.04. 
\nIt is not rigorously tested on other platforms.\nA Dockerfile is provided for reproducing the experiments.\nThe following procedure builds the Docker image used for the experiments.\n\n```\ngit clone git@github.com:CyberAgentAILab/mbr-for-asr.git\ncd mbr-for-asr\n\ndocker build . -t mbrasr:latest\n```\n\nThe codebase can likely run on a native macOS environment by installing dependencies directly with astral-uv instead of using Docker, though this hasn't been rigorously tested. We officially support Ubuntu. PRs to improve macOS compatibility are welcome.\n\n### Experiment\n\nOnce the image is built, run the experiments inside the Docker container.\nSet the environment variable HF_READ_TOKEN to your [Hugging Face token](https://huggingface.co/docs/hub/en/security-tokens) to run the code.\n```\ndocker run -it -e HF_READ_TOKEN=${YOUR_HUGGINGFACE_TOKEN} mbrasr:latest\n```\n\nInside the Docker container, you can run experiments using the scripts in the experiments/ directory.\n\nThe following command (run inside the container) generates samples for MBR decoding.\n```\n./experiments/sample.sh -d {DOMAIN} -m {MODEL} -s {NSAMPLES}\n```\nBy default, it runs on the LibriSpeech domain using [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) with 4 samples so that it runs swiftly on CPU. 
For larger models, we recommend using a GPU.\n\nThe generated samples are stored in the samples/ directory.\n\nThen, the following command runs MBR decoding on the sampled outputs.\n```\n./experiments/run_mbr.sh -d {DOMAIN} -m {MODEL} -s {NSAMPLES} -v {EVALUATION_METRIC}\n```\nThe script also evaluates the decoded outputs in the same run.\nThis codebase supports various evaluation metrics, including WER, CER, BLEU, ROUGE, and METEOR.\nYou can also add your own metric by following the interface defined in [mbr/utility/utility_class.py](mbr/utility/utility_class.py).\n\nThe evaluation results are stored in the results/ directory.\n\n\n### Demo\n\nWe have a Gradio app to compare beam search and MBR decoding in the demo/ directory.\nBy default, it transcribes English speech in the audio. You can change the task, language, and the ASR model by editing the code in [demo/app.py](demo/app.py).\nThe app is useful for qualitatively comparing the two decoding algorithms.\nTo run the app, execute the following command.\n\n```\ncd demo\npip install -r requirements.txt\npython3 app.py\n```\n\n### LICENSE\n\nThe codebase is released under the [MIT License](LICENSE), except for the implementation of MetricX in [mbr/utility/metricx.py](mbr/utility/metricx.py), which is owned by Google and distributed under the Apache 2.0 license.\n\n### Reference\n\n[Yuu Jinnai. 2025. Re-evaluating Minimum Bayes Risk Decoding for Automatic Speech Recognition. arXiv preprint arXiv:2510.19471.](https://arxiv.org/abs/2510.19471)\n\n### Contact\n\nFor any questions, feel free to raise an issue or contact me at jinnai_yu@cyberagent.co.jp.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcyberagentailab%2Fmbr-for-asr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcyberagentailab%2Fmbr-for-asr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcyberagentailab%2Fmbr-for-asr/lists"}