{"id":13705431,"url":"https://github.com/mallorbc/whisper_mic","last_synced_at":"2025-05-16T13:04:26.608Z","repository":{"id":61188851,"uuid":"540469444","full_name":"mallorbc/whisper_mic","owner":"mallorbc","description":"Project that allows one to use a microphone with OpenAI whisper.","archived":false,"fork":false,"pushed_at":"2024-07-04T07:19:05.000Z","size":56,"stargazers_count":762,"open_issues_count":25,"forks_count":167,"subscribers_count":20,"default_branch":"main","last_synced_at":"2025-04-02T06:12:23.822Z","etag":null,"topics":["microphone","speech-recognition","speech-to-text","whisper","whisper-ai","whisper-api"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mallorbc.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-09-23T14:05:52.000Z","updated_at":"2025-03-29T01:26:28.000Z","dependencies_parsed_at":"2023-02-01T01:46:09.905Z","dependency_job_id":"0861bcf7-c9d6-43d0-a863-90a2d27b2c6a","html_url":"https://github.com/mallorbc/whisper_mic","commit_stats":{"total_commits":59,"total_committers":13,"mean_commits":4.538461538461538,"dds":0.576271186440678,"last_synced_commit":"8e700b75a152ee36db5fdd16af125cea8f9843a8"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mallorbc%2Fwhisper_mic","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mallorbc%2Fwhisper_mic/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositor
ies/mallorbc%2Fwhisper_mic/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mallorbc%2Fwhisper_mic/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mallorbc","download_url":"https://codeload.github.com/mallorbc/whisper_mic/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247999859,"owners_count":21031046,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["microphone","speech-recognition","speech-to-text","whisper","whisper-ai","whisper-api"],"created_at":"2024-08-02T22:00:40.902Z","updated_at":"2025-04-09T08:04:35.768Z","avatar_url":"https://github.com/mallorbc.png","language":"Python","funding_links":[],"categories":["Applications","Repositories","Python"],"sub_categories":[],"readme":"# Whisper Mic\r\nThis repo is based on the work done [here](https://github.com/openai/whisper) by OpenAI.  This repo allows you to use a microphone as a demo. This repo copies some of the README from the original project.\r\n\r\n## Video Tutorial\r\n\r\nThe latest video tutorial for this repo can be seen [here](https://youtu.be/S58MGCU7Wgg)\r\n\r\nAn older video tutorial for this repo can be seen [here](https://www.youtube.com/watch?v=nwPaRSlDSaY)\r\n\r\n### Professional Assistance\r\n\r\nIf you are in need of paid professional help, it is available through this [email](mailto:blakecmallory@gmail.com)\r\n\r\n## Setup\r\n\r\nNow a pip package!\r\n\r\n1. Create a venv of your choice.\r\n2. 
Run ```pip install whisper-mic```\r\n\r\n## Available models and languages\r\n\r\nThere are five model sizes, four with English-only versions, offering speed and accuracy tradeoffs. Below are the names of the available models and their approximate memory requirements and relative speed. \r\n\r\n\r\n|  Size  | Parameters | English-only model | Multilingual model | Required VRAM | Relative speed |\r\n|:------:|:----------:|:------------------:|:------------------:|:-------------:|:--------------:|\r\n|  tiny  |    39 M    |     `tiny.en`      |       `tiny`       |     ~1 GB     |      ~32x      |\r\n|  base  |    74 M    |     `base.en`      |       `base`       |     ~1 GB     |      ~16x      |\r\n| small  |   244 M    |     `small.en`     |      `small`       |     ~2 GB     |      ~6x       |\r\n| medium |   769 M    |    `medium.en`     |      `medium`      |     ~5 GB     |      ~2x       |\r\n| large  |   1550 M   |        N/A         |      `large`       |    ~10 GB     |       1x       |\r\n\r\nFor English-only applications, the `.en` models tend to perform better, especially for the `tiny.en` and `base.en` models. We observed that the difference becomes less significant for the `small.en` and `medium.en` models.\r\n\r\n## Microphone Demo\r\n\r\nYou can use the model with a microphone using the ```whisper_mic``` program.  Use ```-h``` to see flag options.\r\n\r\nSome of the more important flags are the ```--model``` and ```--english``` flags.\r\n\r\n## Transcribing To A File\r\n\r\nUsing the command: ```whisper_mic --loop --dictate``` will type the words you say on your active cursor.\r\n\r\n## Usage In Other Projects\r\n\r\nYou can use this code in other projects rather than just use it for a demo.  
You can do this with the ```listen``` method.\r\n\r\n```python\r\nfrom whisper_mic import WhisperMic\r\n\r\nmic = WhisperMic()\r\nresult = mic.listen()\r\nprint(result)\r\n```\r\n\r\nYou can see the possible arguments by looking at the ```cli.py``` file.\r\n\r\n## Troubleshooting\r\n\r\nIf you are having issues, try the following:\r\n```\r\nsudo apt install portaudio19-dev python3-pyaudio\r\n```\r\n\r\n## Contributing\r\n\r\nSome ideas that you can add are:\r\n1. Supporting different implementations of Whisper.\r\n2. Adding additional optional functionality.\r\n3. Adding tests.\r\n\r\n## License\r\n\r\nThe model weights of Whisper are released under the MIT License. See their repo for more information.\r\n\r\nThe code in this repo is under the MIT license.  See [LICENSE](LICENSE) for further details.\r\n\r\n## Thanks\r\nUntil recently, access to high-performing speech-to-text models was only available through paid services.  With this release, I am excited for the many applications that will come.\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmallorbc%2Fwhisper_mic","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmallorbc%2Fwhisper_mic","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmallorbc%2Fwhisper_mic/lists"}