{"id":28387644,"url":"https://github.com/calebe94/vocalize","last_synced_at":"2025-06-26T20:31:32.283Z","repository":{"id":241000885,"uuid":"762060673","full_name":"Calebe94/vocalize","owner":"Calebe94","description":"Vocalize is a shell script-based application that combines multiple tools such as whisper, tgpt, and whiptail to provide an efficient and user-friendly audio-to-text transcription process. ","archived":false,"fork":false,"pushed_at":"2024-10-15T15:42:06.000Z","size":305,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-31T00:52:09.454Z","etag":null,"topics":["ai","gpt","shell","shell-script","tgpt","whiptail","whisper-ai","whisper-cpp"],"latest_commit_sha":null,"homepage":"","language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Calebe94.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-02-23T02:15:19.000Z","updated_at":"2024-10-15T15:43:10.000Z","dependencies_parsed_at":"2024-05-21T22:03:20.196Z","dependency_job_id":null,"html_url":"https://github.com/Calebe94/vocalize","commit_stats":null,"previous_names":["calebe94/vocalize"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Calebe94/vocalize","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Calebe94%2Fvocalize","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Calebe94%2Fvocalize/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Calebe94%2Fvocalize/releases","m
anifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Calebe94%2Fvocalize/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Calebe94","download_url":"https://codeload.github.com/Calebe94/vocalize/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Calebe94%2Fvocalize/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":262139493,"owners_count":23265164,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","gpt","shell","shell-script","tgpt","whiptail","whisper-ai","whisper-cpp"],"created_at":"2025-05-30T18:38:16.893Z","updated_at":"2025-06-26T20:31:32.195Z","avatar_url":"https://github.com/Calebe94.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Vocalize\n\n**Vocalize** is an interactive tool designed to facilitate audio transcription and text refinement through an AI-driven workflow. It supports recording audio, transcribing it into text, and enriching the transcription with more concise and coherent summaries using artificial intelligence.\n\n![](res/vocalize.png)\n\n## Project Overview\n\nVocalize is a shell script-based application that combines multiple tools such as **whisper**, **tgpt**, and **whiptail** to provide an efficient and user-friendly audio-to-text transcription process. 
This tool is particularly useful for users who need to transcribe audio files into text, refine the transcribed text with the help of AI, and manage the transcriptions seamlessly within the terminal.\n\nThe tool offers two main modes of operation:\n1. **Interactive Mode**: This mode presents a menu-driven interface where users can record audio, transcribe it, refine it with AI, and manage saved files easily.\n2. **Minimal Mode**: This mode hides most of the UI elements, only showing the audio recording screen, allowing for a more focused experience.\n\n## How It Works\n\n1. **Audio Recording**: Users can record audio directly from their microphone using the `arecord` command. Once recorded, the audio is converted to the appropriate format using `ffmpeg` and saved in a temporary location.\n2. **Transcription**: Using **Whisper**, the recorded audio is transcribed into text in Portuguese. The transcription can be viewed, copied to the clipboard, or saved as a file.\n3. **Text Enrichment**: The transcribed text can be improved with the help of **tgpt**, which rewrites the transcription to be more concise and clear, adhering to the user-defined prompt for summarization.\n4. **Clipboard Integration**: Both the original transcription and the AI-enhanced text can be copied to the clipboard for easy pasting into other applications.\n5. **File Management**: The application allows saving the recorded audio and transcriptions, or removing them from the system when no longer needed. 
It also supports playback of the recorded audio files directly from the terminal using **mpv**.\n\n## Features\n\n- **Audio Recording**: Record audio directly from the microphone with a simplified interface.\n- **Transcription**: Convert audio to text using Whisper with support for Portuguese (Brazil).\n- **Text Enrichment**: Improve the quality of the transcribed text with an AI-based prompt.\n- **Clipboard Support**: Easily copy the transcribed or AI-enhanced text to the system clipboard.\n- **File Management**: Save, delete, or open transcriptions and audio recordings.\n- **Minimal Mode**: Run the application with only the audio recording interface visible for a distraction-free experience.\n- **Interactive Mode**: Use an easy-to-navigate terminal menu to access all features of the tool.\n\n## Requirements\n\n- **[Whisper.cpp](https://github.com/ggerganov/whisper.cpp)**: A speech-to-text engine for audio transcription.\n- **[tgpt](https://github.com/aandrew-me/tgpt)**: A command-line interface to interact with GPT-based AI models.\n- **whiptail**: A utility for creating dialog boxes in shell scripts.\n- **mpv**: A media player to play back recorded audio.\n- **arecord**: A command-line recorder (part of ALSA's `alsa-utils`) for capturing audio from the microphone.\n- **ffmpeg**: To convert audio files into the appropriate format for transcription.\n- **xclip**: A command-line tool for copying text to the X11 clipboard.\n\n## Installation\n\n### Prerequisites\n\nBefore installing **Vocalize**, ensure that you have the necessary dependencies installed on your system. 
For a Debian-based OS, you can install them using the following commands:\n\n```bash\nsudo apt update\nsudo apt install -y alsa-utils ffmpeg whiptail mpv xclip\n```\n\n- **arecord**: A command-line sound recorder for capturing audio from your microphone. On Debian it is provided by the `alsa-utils` package.\n- **ffmpeg**: A powerful tool to process audio and video files.\n- **whiptail**: A tool to create dialog boxes in shell scripts.\n- **mpv**: A media player used to play back the recorded audio.\n- **xclip**: A command-line tool for copying text to the X11 clipboard.\n\n#### tgpt\n\nTo install [tgpt](https://github.com/aandrew-me/tgpt), run the following command:\n\n``` bash\ncurl -sSL https://raw.githubusercontent.com/aandrew-me/tgpt/main/install | bash -s /usr/local/bin\n```\n\n#### whisper.cpp\n\nTo install [whisper.cpp](https://github.com/ggerganov/whisper.cpp), first clone the repo:\n\n```bash\ngit clone https://github.com/ggerganov/whisper.cpp.git\n```\n\nNavigate into the directory:\n\n```bash\ncd whisper.cpp\n```\n\nThen, download one of the Whisper models converted to the `ggml` format. The download script takes the model name without the `ggml-` prefix. Note that the `.en` models are English-only; since Vocalize transcribes Portuguese, a multilingual model such as `base` may be the better choice, depending on which model file the script expects. For example:\n\n```bash\nsh ./models/download-ggml-model.sh base.en\n```\n\nNow build the `main` example:\n\n```bash\n# build the main example\nmake\n```\n\nNow add the `whisper.cpp` directory to your `$PATH` environment variable:\n\n``` bash\nexport PATH=$PATH:/path/to/whisper.cpp\n```\n\nAnd then set the `WHISPER_MODELS_DIR` variable to the models path:\n\n``` bash\nexport WHISPER_MODELS_DIR=/path/to/whisper.cpp/models\n```\n\n### Installing Vocalize\n\n1. Clone the repository or download the source files for **Vocalize**.\n\n2. Navigate to the **Vocalize** directory and run the following command to install the tool:\n\n```bash\nsudo make install\n```\n\nThis will install **Vocalize** into `/usr/local/bin` by default. 
You can specify a different installation path by setting the `prefix` variable during installation, like this:\n\n```bash\nsudo make install prefix=/custom/path\n```\n\nAfter installation, the tool will be available to use from the terminal as `vocalize`.\n\n### Uninstallation\n\nTo remove **Vocalize** from your system, you can use the following command:\n\n```bash\nsudo make uninstall\n```\n\nThis will remove the `vocalize` executable from the installation directory.\n\n## Usage\n\n(To be added...)\n\n## License\n\nAll software is covered by the [GNU General Public License v3.0](https://www.gnu.org/licenses/gpl-3.0.en.html).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcalebe94%2Fvocalize","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcalebe94%2Fvocalize","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcalebe94%2Fvocalize/lists"}