{"id":40814402,"url":"https://github.com/kavehtehrani/speech2text-extension","last_synced_at":"2026-05-01T06:31:37.304Z","repository":{"id":297397261,"uuid":"995352229","full_name":"kavehtehrani/speech2text-extension","owner":"kavehtehrani","description":"GNOME Speech2Text - shell extension for speech to text (dictation) using OpenAI's automated speech recognition Whisper","archived":false,"fork":false,"pushed_at":"2026-02-08T21:20:51.000Z","size":934,"stargazers_count":50,"open_issues_count":0,"forks_count":3,"subscribers_count":2,"default_branch":"main","last_synced_at":"2026-02-09T02:50:35.797Z","etag":null,"topics":["dictation","gnome","gnome-shell-extension","speech-to-text","ubuntu","whisper"],"latest_commit_sha":null,"homepage":"https://extensions.gnome.org/extension/8238/gnome-speech2text/","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kavehtehrani.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-06-03T10:55:45.000Z","updated_at":"2026-02-08T21:20:54.000Z","dependencies_parsed_at":"2025-06-05T09:21:26.867Z","dependency_job_id":"e9ad7ba7-bf50-4162-b41b-75583c6ed628","html_url":"https://github.com/kavehtehrani/speech2text-extension","commit_stats":null,"previous_names":["kavehtehrani/gnome-speech2text"],"tags_count":6,"template":false,"template_full_name":null,"purl":"pkg:github/kavehtehrani/speech2text-extension","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kave
htehrani%2Fspeech2text-extension","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kavehtehrani%2Fspeech2text-extension/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kavehtehrani%2Fspeech2text-extension/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kavehtehrani%2Fspeech2text-extension/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kavehtehrani","download_url":"https://codeload.github.com/kavehtehrani/speech2text-extension/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kavehtehrani%2Fspeech2text-extension/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32487300,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-30T13:12:12.517Z","status":"online","status_checked_at":"2026-05-01T02:00:05.856Z","response_time":64,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dictation","gnome","gnome-shell-extension","speech-to-text","ubuntu","whisper"],"created_at":"2026-01-21T21:37:37.050Z","updated_at":"2026-05-01T06:31:37.298Z","avatar_url":"https://github.com/kavehtehrani.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 
Speech2Text\n\n![GPLv3](https://img.shields.io/badge/License-GPLv3-yellow.svg)\n![Linux](https://img.shields.io/badge/Linux-FCC624?style=flat\u0026logo=linux\u0026logoColor=black)\n![GNOME](https://img.shields.io/badge/GNOME-4A90D9?style=flat\u0026logo=gnome\u0026logoColor=white)\n![JavaScript](https://img.shields.io/badge/JavaScript-F7DF1E?style=flat\u0026logo=javascript\u0026logoColor=black)\n![Python](https://img.shields.io/badge/Python-3776AB?style=flat\u0026logo=python\u0026logoColor=white)\n![D-Bus](https://img.shields.io/badge/D--Bus-000000?style=flat\u0026logo=dbus\u0026logoColor=white)\n![Whisper](https://img.shields.io/badge/Whisper-412991?style=flat\u0026logo=openai\u0026logoColor=white)\n[![Download from GNOME Extensions](https://img.shields.io/badge/Download%20from-GNOME%20Extensions-blue)](https://extensions.gnome.org/extension/8238/speech2text-extension/)\n\nA GNOME Shell extension that adds speech-to-text functionality\nusing OpenAI's automatic speech recognition model [Whisper](https://github.com/openai/whisper). Speak into your microphone and have your words transcribed, with the option to insert the text automatically at your cursor (X11 only).\n\n![recording-modal](./images/recording-modal.png)\n\n## Features\n\n- 🎤 **Speech Recognition** using OpenAI Whisper\n- 🖱️ **Click to Record** from the top panel microphone icon\n- ⌨️ **Keyboard Shortcut** support (default: Alt+Super+R)\n- 🌍 **Multi-language Support** (depending on Whisper model)\n- 🔒 **Privacy-First** - All processing happens locally\n- ⌨️ **Automatic Text Insertion** at cursor location (only on X11)\n- 🔄 **Non-blocking Mode** - Continue working while transcription processes in the background\n\n## Architecture\n\nThe extension consists of two components:\n\n1. **GNOME Extension** (lightweight UI) - Provides the panel button, keyboard shortcuts, and settings\n2. 
**D-Bus Service** (separate package) - Handles audio recording, speech transcription, and text insertion\n\n**Important for GNOME Extensions Store**: This extension follows GNOME's architectural guidelines by using a separate\nD-Bus service for speech processing. The extension itself is lightweight and communicates with the external service over\nD-Bus using the `org.gnome.Shell.Extensions.Speech2Text` interface. The service is **not bundled** with the extension\nand must be installed separately as a dependency. This extension requires the external background\nservice [speech2text-extension-service](https://pypi.org/project/speech2text-extension-service/) to be installed.\nSee [Service Installation](#Service-Installation) below.\n\n## Requirements\n\n### System Dependencies\n\n- **GNOME Shell 46 or later** (tested up to GNOME 49)\n- **Python 3.8–3.13** (Python 3.14+ not supported yet due to ML dependency compatibility)\n- **python3-venv** (for virtual environment creation)\n- **D-Bus Python library** is installed inside the service virtualenv (`dbus-next`; no system `python3-dbus` / `python3-gi` required)\n- **FFmpeg** (for audio recording)\n- **xdotool** (for text insertion on X11 only)\n- **Clipboard tools**: xclip/xsel (X11) or wl-clipboard (Wayland)\n\nIf you are missing any of the required dependencies the installation script will let you know.\n\n# Installation\n\n## 1- Extension Installation\n\n### GNOME Extensions Store (recommended)\n\n[![Download from GNOME Extensions](https://img.shields.io/badge/Download%20from-GNOME%20Extensions-blue)](https://extensions.gnome.org/extension/8238/speech2text-extension/)\n\n1. Visit [GNOME Extensions](https://extensions.gnome.org/extension/8238/speech2text-extension/) and click \"Install\"\n2. The extension will automatically detect required system packages and let you know what you will need to install\n3. Follow the setup dialog to install the required D-Bus service (automatically downloads from PyPI)\n4. 
Restart GNOME Shell to complete the installation\n\n### Manual Installation\n\nTo install manually, clone the repository and run `make install`:\n\n```bash\ngit clone https://github.com/kavehtehrani/speech2text-extension.git\ncd speech2text-extension\nmake install\n```\n\n#### IMPORTANT: Restart GNOME Shell After Installation\n\n**For X11 sessions:**\n\n1. Press `Alt+F2`\n2. Type `r`\n3. Press `Enter`\n\n**For Wayland sessions:**\n\n1. Log out of your current session\n2. Log back in\n\n## 2- Service Installation\n\nThe D-Bus service must be installed manually, per GNOME's guidelines. For most people, the `base` model with CPU processing is sufficient and the most compatible across platforms.\n\n```bash\ncurl -sSL https://raw.githubusercontent.com/kavehtehrani/speech2text-extension/refs/heads/main/service/install-service.sh | bash -s -- --pypi --non-interactive --service-version 1.2.0 --whisper-model base\n```\n\n#### Whisper model \u0026 CPU/GPU settings\n\nSpeech2Text uses OpenAI Whisper locally. You configure model/device by (re)installing the D-Bus service with the appropriate installer flags:\n\n- **Whisper model**: `tiny`, `base`, `small`, `medium`, `large`, and variants. See the [Whisper repository](https://github.com/openai/whisper) for more info.\n- **Device**:\n  - **CPU (default)**: recommended for most users; easier install and compatibility.\n  - **GPU**: attempts to use an accelerator backend via PyTorch. 
On Linux this usually means **NVIDIA CUDA**.\n    (Advanced users may be able to use other backends depending on their PyTorch build.)\n\nImportant: switching between CPU and GPU requires reinstalling the background service so that the correct ML dependencies are installed.\n\nFor instance, to run the Whisper `medium` model with GPU processing, install the service with:\n\n```bash\ncurl -sSL https://raw.githubusercontent.com/kavehtehrani/speech2text-extension/refs/heads/main/service/install-service.sh | bash -s -- --pypi --non-interactive --service-version 1.2.0 --gpu --whisper-model medium\n```\n\nNotes about installers and distributions:\n\n- This repository includes `service/install-service.sh`, a distro-agnostic service installer that only verifies system\n  dependencies and installs the Python D-Bus service into `~/.local/share/speech2text-extension-service`.\n- You must install system packages yourself using your distro’s package manager. The setup dialog will list any missing\n  packages.\n  - Note: the setup dialog’s **Automatic Install** uses `--pypi` (PyPI). If you are developing locally from a git clone,\n    use `./service/install-service.sh --local` instead.\n  - Note: the installer supports **GPU mode** via `--gpu`.\n\nThe service is available as a Python package on\nPyPI: [speech2text-extension-service](https://pypi.org/project/speech2text-extension-service/)\n\n### Upgrading from older versions (CUDA/NVIDIA pip packages cleanup)\n\nOlder versions of the service installer could pull GPU-related _pip packages_ (e.g. `nvidia-*`) into the service’s\nvirtual environment. New versions default to **CPU-only** PyTorch wheels unless you explicitly choose GPU mode.\n\nIf you are using **CPU mode** and want to remove legacy GPU-related pip packages, simply re-run the installer\n(from the setup dialog or manually). 
The installer rebuilds the service virtual environment from scratch, so it will\nremove any old GPU-related pip packages from the service venv automatically.\n\n## Usage\n\n### Quick Start\n\n1. **Click** the microphone icon in the top panel, or\n2. **Press** the keyboard shortcut (default: Alt+Super+R)\n3. **Speak** when the recording dialog appears\n4. **Review** the transcribed text in the preview dialog\n5. **Click Insert** to type the text, or **Copy** to clipboard\n\n#### Non-blocking Mode\n\nWith non-blocking transcription enabled:\n\n1. Record your speech as usual\n2. The modal closes immediately when recording stops\n3. A \"...\" appears next to the microphone icon while processing\n4. Click the notification when transcription is ready to review/copy\n\n## Troubleshooting\n\nIf the extension doesn't appear in GNOME Extensions:\n\nFirst make sure that (1) the extension is enabled in GNOME Extensions and (2) you have already restarted GNOME Shell. If the problem persists, troubleshoot with:\n\n```bash\n# View extension logs\njournalctl -f | grep -E \"(gnome-shell|speech2text-extension-service|speech2text|ffmpeg|org\\.gnome\\.Speech2Text|Whisper|transcrib)\"\n\n# Check installation status\nmake status\n\n# Verify schema compilation\nmake verify-schema\n```\n\nIf the D-Bus service isn't working:\n\n```bash\n# Check if service is running\ndbus-send --session --print-reply --dest=org.gnome.Shell.Extensions.Speech2Text /org/gnome/Shell/Extensions/Speech2Text org.gnome.Shell.Extensions.Speech2Text.GetServiceStatus\n\n# Start the service manually\n~/.local/share/speech2text-extension-service/speech2text-extension-service\n\n# Check D-Bus service file\nls ~/.local/share/dbus-1/services/org.gnome.Shell.Extensions.Speech2Text.service\n```\n\nYou can read more about the D-Bus service here: [D-Bus Service Documentation](./service/README.md).\n\n### GNOME Shell Crashes\n\nIf you experience GNOME Shell crashes when using the extension, use the crash analysis script:\n\n```bash\n# 
After a crash, run the debug script\n./debug-crash.sh\n```\n\nThis script will analyze system logs and generate a detailed crash report. Choose option 1 (last 30 minutes) after\nexperiencing a crash. The script will create a timestamped file with all relevant crash information.\n\n### Text Insertion Not Working\n\n1. **On X11**: Ensure xdotool is installed\n2. **On Wayland**: Text insertion is limited - use Copy to Clipboard instead\n3. Check that the target application accepts simulated keyboard input\n\n## Uninstallation\n\n### GNOME Extensions\n\nYou should be able to uninstall the extension directly using the GNOME Extensions tool.\n\n### Manual Uninstallation\n\n```bash\n# Remove everything (extension + service)\nmake clean\n```\n\n## Privacy \u0026 Security\n\n🔒 **100% Local Processing** - All speech recognition happens on your local machine. Nothing is ever sent to the cloud or\nexternal servers. The extension uses OpenAI's Whisper model locally, ensuring privacy of your voice data.\n\n## Development\n\n### Building from Source\n\n```bash\n# Complete development setup (install extension + service + compile schemas)\nmake setup\n\n# Check installation status\nmake status\n\n# Remove the installation (extension + D-Bus service)\nmake clean\n```\n\n## License\n\nThis project is licensed under the GPLv3 - see the LICENSE file for details.\n\n## Contributing\n\nContributions are welcome! 
Please feel free to submit a pull request or open issues.\n\n### Reporting Issues\n\nPlease include:\n\n- GNOME Shell version (`gnome-shell --version`)\n- Operating system and version (`lsb_release -a`)\n- Session type (`echo $XDG_SESSION_TYPE`)\n- Extension logs (`journalctl /usr/bin/gnome-shell | grep speech2text`)\n- Service logs (`journalctl --user -u speech2text-service`)\n- **For crashes**: Run `./debug-crash.sh` and include the generated report\n- Steps to reproduce the issue\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkavehtehrani%2Fspeech2text-extension","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkavehtehrani%2Fspeech2text-extension","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkavehtehrani%2Fspeech2text-extension/lists"}