{"id":19671424,"url":"https://github.com/lasithaamarasinghe/real-time-speech-recognition","last_synced_at":"2026-05-06T14:39:28.594Z","repository":{"id":244856145,"uuid":"812144296","full_name":"LasithaAmarasinghe/Real-Time-Speech-Recognition","owner":"LasithaAmarasinghe","description":"This project includes a system that can record live speech using your microphone and then transcribe it using speech recognition.","archived":false,"fork":false,"pushed_at":"2024-06-16T06:13:08.000Z","size":15,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-10T03:39:26.102Z","etag":null,"topics":["ipywidgets","jupyter-notebook","machine-learning","pyaudio","pydub","python3","pytorch","realtime-speech-recognition","speech-to-text","transformers","vosk"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LasithaAmarasinghe.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-06-08T04:38:59.000Z","updated_at":"2024-06-17T19:50:11.000Z","dependencies_parsed_at":"2024-06-17T22:55:49.355Z","dependency_job_id":"1b257425-465b-400c-905e-ae7e4db46eb9","html_url":"https://github.com/LasithaAmarasinghe/Real-Time-Speech-Recognition","commit_stats":null,"previous_names":["lasithaamarasinghe/real-time-speech-recognition"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LasithaAmarasinghe%2FReal-Time-Speech-Recognition","tags_url":"https:/
/repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LasithaAmarasinghe%2FReal-Time-Speech-Recognition/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LasithaAmarasinghe%2FReal-Time-Speech-Recognition/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LasithaAmarasinghe%2FReal-Time-Speech-Recognition/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/LasithaAmarasinghe","download_url":"https://codeload.github.com/LasithaAmarasinghe/Real-Time-Speech-Recognition/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240980888,"owners_count":19888344,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ipywidgets","jupyter-notebook","machine-learning","pyaudio","pydub","python3","pytorch","realtime-speech-recognition","speech-to-text","transformers","vosk"],"created_at":"2024-11-11T17:08:45.572Z","updated_at":"2026-05-06T14:39:23.573Z","avatar_url":"https://github.com/LasithaAmarasinghe.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Real-Time-Speech-Recognition\n\n![file](https://github.com/LasithaAmarasinghe/Real-Time-Speech-Recognition/assets/106037441/701d01e0-54aa-4156-8510-353ab5319441)\n\n## Overview\n\n* This project includes a system that can record live speech using the microphone and then transcribe it using speech recognition.  
\n* This can be used to automatically record and transcribe meetings, lectures, and other events.\n* This repository contains all the code and resources for this project.\n\n## Steps\n\n* Creating Jupyter widgets to record audio and stop recording\n* Using pyaudio to record microphone audio\n* Creating a speech recognition system using vosk\n* Adding punctuation to the text transcript using recasepunc\n\n## Code\n\nYou can find the code for this project here:\n* [microphone.ipynb](https://github.com/LasithaAmarasinghe/Real-Time-Speech-Recognition/blob/main/microphone.ipynb)\n\n## Technologies/Tools\n\n* Jupyter Notebook / JupyterLab\n* Python 3.10.12\n* PyTorch `pip install torch -f https://download.pytorch.org/whl/torch_stable.html`\n* Python packages\n    * vosk `pip install vosk`\n    * pydub `pip install pydub`\n    * transformers `pip install transformers`\n    * pyaudio `pip install pyaudio`\n    * ipywidgets `pip install ipywidgets`\n\n![Python](https://img.shields.io/badge/python-3670A0?logo=python\u0026logoColor=FFFF00)\n![Jupyter Notebook](https://img.shields.io/badge/jupyter-%23FA0F00.svg?logo=jupyter\u0026logoColor=white)\n![PyTorch](https://img.shields.io/badge/pytorch_-%20darkorange?logo=pytorch\u0026logoColor=white)\n![vosk](https://img.shields.io/badge/vosk_-%20dark%20green)\n![pydub](https://img.shields.io/badge/pydub_-%20purple)\n![transformers](https://img.shields.io/badge/transformers_-%20blue)\n![pyaudio](https://img.shields.io/badge/pyaudio_-%20orange)\n![ipywidgets](https://img.shields.io/badge/ipywidgets_-%20black)\n\n## Installation Guidelines\n\n### Vosk\n\nVosk needs a model file to run.  The model downloads automatically when you run this code:\n\n```\nfrom vosk import Model\nModel(model_name=\"vosk-model-small-en-us-0.15\")\n```\n\nThe full vosk model is large (1GB+).  
If you want to use it, specify `vosk-model-en-us-0.22` as the model name.\n\nIf the models don't download automatically, you can find them [here](https://alphacephei.com/vosk/models).\n\n### Punctuation\n\nBy default, vosk outputs text with no punctuation.  To add punctuation, you need a separate model.  To get it, follow these steps:\n\n* Download the model [here](https://alphacephei.com/vosk/models/vosk-recasepunc-en-0.22.zip) (caution: 1GB+ in size).\n* Extract the zip file into the same directory as your code.\n\n### PyAudio\n\nPyAudio can be tricky to install because it depends on system packages.  Check the [homepage](http://people.csail.mit.edu/hubert/pyaudio/) for specific instructions for each OS.\n\nYou also need to identify the right device to record from.  Run this code to find the index of your microphone:\n\n```\n# Find audio device index\nimport pyaudio\np = pyaudio.PyAudio()\nfor i in range(p.get_device_count()):\n    print(p.get_device_info_by_index(i))\n\np.terminate()\n```\n\n\n## Data\n\nAll audio comes from the microphone.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flasithaamarasinghe%2Freal-time-speech-recognition","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flasithaamarasinghe%2Freal-time-speech-recognition","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flasithaamarasinghe%2Freal-time-speech-recognition/lists"}