{"id":16095967,"url":"https://github.com/henestrosadev/audiotext","last_synced_at":"2025-09-07T06:06:36.987Z","repository":{"id":77707307,"uuid":"595771315","full_name":"HenestrosaDev/audiotext","owner":"HenestrosaDev","description":"A desktop application that transcribes audio from files, microphone input or YouTube videos with the option to translate the content and create subtitles.","archived":false,"fork":false,"pushed_at":"2024-10-15T15:06:46.000Z","size":84429,"stargazers_count":192,"open_issues_count":8,"forks_count":17,"subscribers_count":10,"default_branch":"main","last_synced_at":"2025-03-28T10:09:41.113Z","etag":null,"topics":["audio-to-text","customtkinter","ffmpeg","python","speech-recognition","speech-to-text","speech-to-text-api","subtitles-generator","transcriber","video-to-text","whisperx"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/HenestrosaDev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":".github/CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":"docs/supported-files.png","governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"ko_fi":"henestrosadev"}},"created_at":"2023-01-31T19:23:31.000Z","updated_at":"2025-03-23T11:40:58.000Z","dependencies_parsed_at":"2023-10-03T16:13:11.741Z","dependency_job_id":"edd17508-0aef-44ae-82ef-b8759fe24490","html_url":"https://github.com/HenestrosaDev/audiotext","commit_stats":{"total_commits":455,"total_committers":5,"mean_commits":91.0,"dds":"0.14725274725274728","last_synced_commit":"8c729bc0d4ac0b6853b7b84190d032158e03cf2b"},"previous_names":["henestrosadev/audiotext"],"tags_count":14,"templ
ate":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HenestrosaDev%2Faudiotext","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HenestrosaDev%2Faudiotext/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HenestrosaDev%2Faudiotext/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HenestrosaDev%2Faudiotext/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/HenestrosaDev","download_url":"https://codeload.github.com/HenestrosaDev/audiotext/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247166168,"owners_count":20894654,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["audio-to-text","customtkinter","ffmpeg","python","speech-recognition","speech-to-text","speech-to-text-api","subtitles-generator","transcriber","video-to-text","whisperx"],"created_at":"2024-10-09T17:09:49.358Z","updated_at":"2025-04-04T11:14:06.935Z","avatar_url":"https://github.com/HenestrosaDev.png","language":"Python","funding_links":["https://ko-fi.com/henestrosadev"],"categories":[],"sub_categories":[],"readme":"\u003cdiv id=\"top\"\u003e\u003c/div\u003e\n\n\u003c!-- PROJECT SHIELDS --\u003e\n\u003c!--\n*** I am using markdown \"reference style\" links for readability.\n*** Reference links are enclosed in brackets [ ] instead of parentheses ( ).\n*** See the bottom of this document for the declaration of the reference variables\n*** for contributors-url, forks-url, etc. 
This is an optional, concise syntax you may use.\n*** https://www.markdownguide.org/basic-syntax/#reference-style-links\n--\u003e\n\n\u003c!-- PROJECT LOGO --\u003e\n\u003cdiv align=\"center\"\u003e\n  \u003cpicture\u003e\n    \u003csource\n      srcset=\"docs/light/icon.png\"\n      width=\"128\"\n      height=\"128\"\n      media=\"(prefers-color-scheme: light)\"\n    /\u003e\n    \u003csource\n      srcset=\"docs/dark/icon.png\"\n      width=\"128\"\n      height=\"128\"\n      media=\"(prefers-color-scheme: dark)\"\n    /\u003e\n    \u003cimg src=\"docs/light/icon.png\" alt=\"Logo\" width=\"128\" height=\"128\"\u003e\n  \u003c/picture\u003e\n  \u003ch1 align=\"center\"\u003eAudiotext\u003c/h1\u003e\n  \u003cp align=\"center\"\u003eA desktop application that transcribes audio from files, microphone input or YouTube videos with the option to translate the content and create subtitles.\u003c/p\u003e\n  \u003cp\u003e\n    \u003ca href=\"https://github.com/HenestrosaDev/audiotext/actions/workflows/code-quality.yml\"\u003e\n      \u003cimg\n        src=\"https://github.com/HenestrosaDev/audiotext/actions/workflows/code-quality.yml/badge.svg\"\n        alt=\"Code Quality badge status\"\n      /\u003e\n    \u003c/a\u003e\n    \u003cbr\u003e\n    \u003ca href=\"https://github.com/HenestrosaDev/audiotext/releases/latest\"\u003e\n      \u003cimg\n        src=\"https://img.shields.io/github/v/release/HenestrosaDev/audiotext\"\n        alt=\"Version\"\n      /\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://github.com/HenestrosaDev/audiotext/stargazers\"\u003e\n      \u003cimg\n        src=\"https://img.shields.io/github/stars/HenestrosaDev/audiotext\"\n        alt=\"GitHub Contributors\"\n      /\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://github.com/HenestrosaDev/audiotext/blob/main/LICENSE\"\u003e\n      \u003cimg\n        src=\"https://img.shields.io/badge/license-BSD--4--Clause-lightgray\"\n        alt=\"License\"\n      /\u003e\n    \u003c/a\u003e\n    
\u003cbr\u003e\n    \u003ca href=\"https://github.com/HenestrosaDev/audiotext/graphs/contributors\"\u003e\n      \u003cimg\n        src=\"https://img.shields.io/github/contributors/HenestrosaDev/audiotext\"\n        alt=\"GitHub Contributors\"\n      /\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://github.com/HenestrosaDev/audiotext/issues\"\u003e\n      \u003cimg\n        src=\"https://img.shields.io/github/issues/HenestrosaDev/audiotext\"\n        alt=\"Issues\"\n      /\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://github.com/HenestrosaDev/audiotext/pulls\"\u003e\n      \u003cimg\n        src=\"https://img.shields.io/github/issues-pr/HenestrosaDev/audiotext\"\n        alt=\"GitHub pull requests\"\n      /\u003e\n    \u003c/a\u003e\n  \u003c/p\u003e\n  \u003cp\u003e\n    \u003ca href=\"https://github.com/HenestrosaDev/audiotext/issues/new/choose\"\u003e\n      Report Bug\n    \u003c/a\u003e\n    ·\n    \u003ca href=\"https://github.com/HenestrosaDev/audiotext/issues/new/choose\"\u003e\n      Request Feature\n    \u003c/a\u003e\n    ·\n    \u003ca href=\"https://github.com/HenestrosaDev/audiotext/discussions\"\u003e\n      Ask Question\n    \u003c/a\u003e\n  \u003c/p\u003e\n\u003c/div\u003e\n\n\u003c!-- TABLE OF CONTENTS --\u003e\n\n## Table of Contents\n\n- [About the Project](#about-the-project)\n  - [Supported Languages](#supported-languages)\n  - [Supported File Types](#supported-file-types)\n  - [Project Structure](#project-structure)\n  - [Built With](#built-with)\n- [Getting Started](#getting-started)\n  - [Installation](#installation)\n  - [Setting Up the Project Locally](#setting-up-the-project-locally)\n  - [Notes](#notes)\n- [Usage](#usage)\n  - [Transcription Language](#transcription-language)\n  - [Transcription Method](#transcription-method)\n  - [Audio Source](#audio-source)\n  - [Save Transcription](#save-transcription)\n    - [Autosave](#autosave)\n    - [Overwrite Existing Files](#overwrite-existing-files)\n  - [Google 
Speech-To-Text API Options](#google-speech-to-text-api-options)\n    - [Google API Key](#google-api-key)\n  - [Whisper API Options](#whisper-api-options)\n    - [Whisper API Key](#whisper-api-key)\n    - [Response Format](#response-format)\n    - [Temperature](#temperature)\n    - [Timestamp Granularities](#timestamp-granularities)\n  - [WhisperX Options](#whisperx-options)\n    - [Output File Types](#output-file-types)\n    - [Translate to English](#translate-to-english)\n    - [Subtitle Options](#subtitle-options)\n      - [Highlight Words](#highlight-words)\n      - [Max. Line Width](#max-line-width)\n      - [Max. Line Count](#max-line-count)\n    - [Advanced Options](#advanced-options)\n      - [Model Size](#model-size)\n      - [Compute Type](#compute-type)\n      - [Batch Size](#batch-size)\n      - [Use CPU](#use-cpu)\n- [Troubleshooting](#troubleshooting)\n- [Roadmap](#roadmap)\n- [Authors](#authors)\n- [Contributing](#contributing)\n- [Acknowledgments](#acknowledgments)\n- [License](#license)\n- [Support](#support)\n\n\u003c!-- ABOUT THE PROJECT --\u003e\n\n## About the Project\n\n![Main](docs/main-system.png)\n\n**Audiotext** transcribes the audio from an audio file, video file, microphone input, directory, or YouTube video into any of the 99 different languages it supports. You can transcribe using the [**Google Speech-to-Text API**](https://cloud.google.com/speech-to-text), the [**Whisper API**](https://platform.openai.com/docs/guides/speech-to-text), or [**WhisperX**](https://github.com/m-bain/whisperX). The last two methods can even translate the transcription or generate subtitles!\n\nYou can also choose the theme you like best. 
It can be dark, light, or the one configured in the system.\n\n\u003cdetails\u003e\n  \u003csummary\u003eDark\u003c/summary\u003e\n  \u003cimg src=\"docs/dark/from-file.png\" alt=\"Dark theme\"\u003e\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003eLight\u003c/summary\u003e\n  \u003cimg src=\"docs/light/from-file.png\" alt=\"Light theme\"\u003e\n\u003c/details\u003e\n\n\u003c!-- SUPPORTED LANGUAGES --\u003e\n\n### Supported Languages\n\n\u003cdetails\u003e\n  \u003csummary\u003eClick here to display\u003c/summary\u003e\n\n  - Afrikaans\n  - Albanian\n  - Amharic\n  - Arabic\n  - Armenian\n  - Assamese\n  - Azerbaijani\n  - Bashkir\n  - Basque\n  - Belarusian\n  - Bengali\n  - Bosnian\n  - Breton\n  - Bulgarian\n  - Burmese\n  - Catalan\n  - Chinese\n  - Chinese (Yue)\n  - Croatian\n  - Czech\n  - Danish\n  - Dutch\n  - English\n  - Estonian\n  - Faroese\n  - Farsi\n  - Finnish\n  - French\n  - Galician\n  - Georgian\n  - German\n  - Greek\n  - Gujarati\n  - Haitian\n  - Hausa\n  - Hawaiian\n  - Hebrew\n  - Hindi\n  - Hungarian\n  - Icelandic\n  - Indonesian\n  - Italian\n  - Japanese\n  - Javanese\n  - Kannada\n  - Kazakh\n  - Khmer\n  - Korean\n  - Lao\n  - Latin\n  - Latvian\n  - Lingala\n  - Lithuanian\n  - Luxembourgish\n  - Macedonian\n  - Malagasy\n  - Malay\n  - Malayalam\n  - Maltese\n  - Maori\n  - Marathi\n  - Mongolian\n  - Nepali\n  - Norwegian\n  - Norwegian Nynorsk\n  - Occitan\n  - Pashto\n  - Polish\n  - Portuguese\n  - Punjabi\n  - Romanian\n  - Russian\n  - Sanskrit\n  - Serbian\n  - Shona\n  - Sindhi\n  - Sinhala\n  - Slovak\n  - Slovenian\n  - Somali\n  - Spanish\n  - Sundanese\n  - Swahili\n  - Swedish\n  - Tagalog\n  - Tajik\n  - Tamil\n  - Tatar\n  - Telugu\n  - Thai\n  - Tibetan\n  - Turkish\n  - Turkmen\n  - Ukrainian\n  - Urdu\n  - Uzbek\n  - Vietnamese\n  - Welsh\n  - Yiddish\n  - Yoruba\n\u003c/details\u003e\n\n### Supported File Types\n\n\u003cdetails\u003e\n  \u003csummary\u003eAudio file formats\u003c/summary\u003e\n\n 
 - `.aac`\n  - `.flac`\n  - `.mp3`\n  - `.mpeg`\n  - `.oga`\n  - `.ogg`\n  - `.opus`\n  - `.wav`\n  - `.wma`\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003eVideo file formats\u003c/summary\u003e\n\n  - `.3g2`\n  - `.3gp2`\n  - `.3gp`\n  - `.3gpp2`\n  - `.3gpp`\n  - `.asf`\n  - `.avi`\n  - `.f4a`\n  - `.f4b`\n  - `.f4v`\n  - `.flv`\n  - `.m4a`\n  - `.m4b`\n  - `.m4r`\n  - `.m4v`\n  - `.mkv`\n  - `.mov`\n  - `.mp4`\n  - `.ogv`\n  - `.ogx`\n  - `.webm`\n  - `.wmv`\n\u003c/details\u003e\n\n\u003c!-- PROJECT STRUCTURE --\u003e\n\n### Project Structure\n\n\u003cdetails\u003e\n  \u003csummary\u003eASCII folder structure\u003c/summary\u003e\n\n  ```\n  │   .gitignore\n  │   audiotext.spec\n  │   LICENSE\n  │   README.md\n  │   requirements.txt\n  │\n  ├───.github\n  │   │   CONTRIBUTING.md\n  │   │   FUNDING.yml\n  │   │\n  │   ├───ISSUE_TEMPLATE\n  │   │       bug_report_template.md\n  │   │       feature_request_template.md\n  │   │\n  │   └───PULL_REQUEST_TEMPLATE\n  │           pull_request_template.md\n  │\n  ├───docs/\n  │\n  ├───res\n  │   ├───img\n  │   │       icon.ico\n  │   │\n  │   └───locales\n  │       │   main_controller.pot\n  │       │   main_window.pot\n  │       │\n  │       ├───en\n  │       │   └───LC_MESSAGES\n  │       │           app.mo\n  │       │           app.po\n  │       │           main_controller.po\n  │       │           main_window.po\n  │       │\n  │       └───es\n  │           └───LC_MESSAGES\n  │                   app.mo\n  │                   app.po\n  │                   main_controller.po\n  │                   main_window.po\n  │\n  └───src\n      │   app.py\n      │\n      ├───controllers\n      │       __init__.py\n      │       main_controller.py\n      │\n      ├───handlers\n      │       file_handler.py\n      │       google_api_handler.py\n      │       openai_api_handler.py\n      │       whisperx_handler.py\n      │       youtube_handler.py\n      │\n      ├───interfaces\n      │       transcribable.py\n 
     │\n      ├───models\n      │   │   __init__.py\n      │   │   transcription.py\n      │   │\n      │   └───config\n      │           __init__.py\n      │           config_subtitles.py\n      │           config_system.py\n      │           config_transcription.py\n      │           config_whisper_api.py\n      │           config_whisperx.py\n      │\n      ├───utils\n      │       __init__.py\n      │       audio_utils.py\n      │       config_manager.py\n      │       constants.py\n      │       dict_utils.py\n      │       enums.py\n      │       env_keys.py\n      │       path_helper.py\n      │\n      └───views\n          │   __init__.py\n          │   main_window.py\n          │\n          └───custom_widgets\n                  __init__.py\n                  ctk_scrollable_dropdown/\n                  ctk_input_dialog.py\n  ```\n\u003c/details\u003e\n\n\u003c!-- BUILT WITH --\u003e\n\n### Built With\n\n- [CTkScrollableDropdown](https://github.com/Akascape/CTkScrollableDropdown) for the scrollable option menu to display the full list of supported languages.\n- [CustomTkinter](https://github.com/TomSchimansky/CustomTkinter) for the GUI.\n- [moviepy](https://pypi.org/project/moviepy/) for video processing, from which the program extracts the audio to be transcribed.\n- [OpenAI Python API library](https://pypi.org/project/openai/) for using the **Whisper API**.\n- [PyAudio](https://pypi.org/project/PyAudio/) for recording microphone audio.\n- [pydub](https://github.com/jiaaro/pydub) for audio processing.\n- [python-dotenv](https://pypi.org/project/python-dotenv/) for handling environment variables.\n- [PyTorch](https://github.com/pytorch/pytorch) for building and training neural networks.\n- [PyTorch-CUDA](https://pytorch.org/docs/stable/cuda.html) for enabling GPU support (CUDA) with PyTorch. 
CUDA is a parallel computing platform and application programming interface model created by NVIDIA.\n- [pytube](https://github.com/pytube/pytube) for audio download of YouTube videos.\n- [SpeechRecognition](https://pypi.org/project/SpeechRecognition/) for using the **Google Speech-To-Text API**.\n- [Torchaudio](https://pytorch.org/audio/stable/index.html) for audio processing tasks, including speech recognition and audio classification.\n- [WhisperX](https://github.com/m-bain/whisperX) for fast automatic speech recognition. This product includes software developed by Max Bain. Uses [faster-whisper](https://github.com/SYSTRAN/faster-whisper), which is a reimplementation of [OpenAI's Whisper](https://github.com/openai/whisper) model using [CTranslate2](https://github.com/OpenNMT/CTranslate2/).\n\n\u003cp align=\"right\"\u003e(\u003ca href=\"#top\"\u003eback to top\u003c/a\u003e)\u003c/p\u003e\n\n\u003c!-- GETTING STARTED --\u003e\n\n## Getting Started\n\n### Installation\n\n1. Install [FFmpeg](https://ffmpeg.org) to execute the program. Otherwise, it won't be able to process the audio files.\n\n    To check if you have it installed on your system, run `ffmpeg -version`. 
It should return something similar to this:\n    ```\n    ffmpeg version 5.1.2-essentials_build-www.gyan.dev Copyright (c) 2000-2022 the FFmpeg developers\n    built with gcc 12.1.0 (Rev2, Built by MSYS2 project)\n    configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-lzma --enable-zlib --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-sdl2 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-libaom --enable-libopenjpeg --enable-libvpx --enable-libass --enable-libfreetype --enable-libfribidi --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-d3d11va --enable-dxva2 --enable-libmfx --enable-libgme --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libtheora --enable-libvo-amrwbenc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-librubberband\n    libavutil      57. 28.100 / 57. 28.100\n    libavcodec     59. 37.100 / 59. 37.100\n    libavformat    59. 27.100 / 59. 27.100\n    libavdevice    59.  7.100 / 59.  7.100\n    libavfilter     8. 44.100 /  8. 44.100\n    libswscale      6.  7.100 /  6.  7.100\n    libswresample   4.  7.100 /  4.  7.100\n    ```\n\n    If the output is an error, it is because your system cannot find the `ffmpeg` system variable, which is probably because you don't have it installed on your system. 
To install `ffmpeg`, open a command prompt and run one of the following commands, depending on your operating system:\n    ```\n    # on Ubuntu or Debian\n    sudo apt update \u0026\u0026 sudo apt install ffmpeg\n\n    # on Arch Linux\n    sudo pacman -S ffmpeg\n\n    # on macOS using Homebrew (https://brew.sh/)\n    brew install ffmpeg\n\n    # on Windows using Chocolatey (https://chocolatey.org/)\n    choco install ffmpeg\n\n    # on Windows using Scoop (https://scoop.sh/)\n    scoop install ffmpeg\n    ```\n2. Go to [releases](https://github.com/HenestrosaDev/audiotext/releases) and download the latest release.\n3. Decompress the downloaded file.\n4. Open the `audiotext` folder and double-click the `Audiotext` executable file.\n\n### Setting Up the Project Locally\n\n1. Clone the repository by running `git clone https://github.com/HenestrosaDev/audiotext.git`.\n2. Change the current working directory to `audiotext` by running `cd audiotext`.\n3. (Optional but recommended) Create a Python virtual environment in the project root. If you're using `virtualenv`, you would run `virtualenv venv`.\n4. (Optional but recommended) Activate the virtual environment:\n   ```bash\n   # on Windows\n   . venv/Scripts/activate\n   # if you get the error `FullyQualifiedErrorId : UnauthorizedAccess`, run this:\n   Set-ExecutionPolicy Unrestricted -Scope Process\n   # and then . venv/Scripts/activate\n\n   # on macOS and Linux\n   source venv/bin/activate\n   ```\n5. Run `pip install -r requirements.txt` to install the dependencies.\n6. (Optional) If you intend to contribute to the project, run `pip install -r requirements-dev.txt` to install the development dependencies.\n7. (Optional) If you followed step 6, run `pre-commit install` to install the pre-commit hooks in your `.git/` directory.\n8. Copy the `.env.example` file and save the copy as `.env` in the project root.\n9. 
Run `python src/app.py` to start the program.\n\n### Notes\n\n- You cannot generate a single executable file for this project with PyInstaller because of the dependency on the CustomTkinter package (see the [CustomTkinter packaging notes](https://github.com/TomSchimansky/CustomTkinter/wiki/Packaging)).\n- For **Apple Silicon Mac** and **Ubuntu** users: An error occurs when trying to install the `pyaudio` package. [Here](https://stackoverflow.com/questions/73268630/error-could-not-build-wheels-for-pyaudio-which-is-required-to-install-pyprojec) is a Stack Overflow post explaining how to solve this issue.\n- I had to comment out the lines `pprint(response_text, indent=4)` in the `recognize_google` function from the `__init__.py` file of the `SpeechRecognition` package so that the program can run without opening a command line along with the GUI. Otherwise, the program would not be able to use the Google API transcription method because `pprint` throws an error if it cannot print to the CLI, preventing the code from generating the transcription. The same applies to the lines using the `logger` package in the `moviepy/audio/io/ffmpeg_audiowriter` file from the `moviepy` package. Line 169 was also changed from `logger=logger` to `logger=None` to avoid more errors related to opening the console.\n\n\u003cp align=\"right\"\u003e(\u003ca href=\"#top\"\u003eback to top\u003c/a\u003e)\u003c/p\u003e\n\n\u003c!-- USAGE --\u003e\n\n## Usage\n\nOnce you open the **Audiotext** executable file (explained in the [Getting Started](#getting-started) section), you'll see something like this:\n\n\u003cpicture\u003e\n  \u003csource\n    srcset=\"docs/light/main.png\"\n    media=\"(prefers-color-scheme: light)\"\n  /\u003e\n  \u003csource\n    srcset=\"docs/dark/main.png\"\n    media=\"(prefers-color-scheme: dark)\"\n  /\u003e\n  \u003cimg\n    src=\"docs/light/main.png\"\n    alt=\"Main\"\n  \u003e\n\u003c/picture\u003e\n\n### Transcription Language\n\nThe target language for the transcription. 
If you use the **Whisper API** or the **WhisperX** transcription method, you can set this to a language other than the one spoken in the audio to translate the transcription into the selected language.\n\nFor example, to translate English audio into French, you would set `Transcription language` to French, as shown in the video below:\n\n\u003c!-- english-to-french.mp4 --\u003e\nhttps://github.com/user-attachments/assets/e68d9b90-3978-4ffb-9b62-bd3d57a1a33d\n\nThis is an unofficial way to perform translations, so be sure to double-check the generated transcription for errors.\n\n### Transcription Method\n\nThere are three transcription methods available in **Audiotext**:\n\n- **Google Speech-To-Text API** (hereafter referred to as **Google API**): Requires an Internet connection. It doesn't punctuate sentences (the punctuation is produced by **Audiotext**), and the resulting transcriptions are of lower quality than those of the **Whisper API** or **WhisperX**, so they often require manual adjustment. In its free tier, usage is limited to 60 minutes per month, but this limit can be extended by adding an [API key](#google-api-key).\n\n- **Whisper API**: Requires an Internet connection. This method is intended for people whose machines are not powerful enough to run **WhisperX** smoothly. It has fewer options than **WhisperX**, but the quality of the transcriptions is similar to those generated by the `large-v2` model of **WhisperX**. However, you need to set an OpenAI API key to use this method. See the [Whisper API Key](#whisper-api-key) section for more information.\n\n- **WhisperX**: Selected by default. It doesn't require an Internet connection because the entire transcription process takes place locally on your computer. As a result, it's much more demanding of hardware resources than the two remote transcription methods. **WhisperX** can run on CPUs and CUDA GPUs, although it performs better on the latter. 
The quality of the transcription depends on the selected [model size](#model-size) and [computation type](#compute-type). In addition, **WhisperX** offers a wider range of features, including a more customizable subtitle generation process than the **Whisper API** and more output file types. It has no usage restrictions while remaining completely free.\n\n### Audio Source\n\nYou can transcribe from four different audio sources:\n\n- **File** (see image above): Click the file explorer icon to select the file you want to transcribe, or manually enter the path to the file in the `Path` input field. You can transcribe audio from both audio and video files.\n\n  Note that the file explorer has the `All supported files` option selected by default. To select only audio files or video files, click the combo box in the lower right corner of the file explorer to change the file type, as marked in red in the following image:\n\n  ![File explorer](docs/file-explorer.png)\n\n  ![Supported files](docs/supported-files.png)\n\n- **Directory**: Click the file explorer icon to select the directory containing the files you want to transcribe, or manually enter the path to the directory in the `Path` input field. 
Note that the `Autosave` option is checked and cannot be unchecked because each file's transcription will automatically be saved in the same path as the source file.\n\n  \u003cpicture\u003e\n    \u003csource\n      srcset=\"docs/light/from-directory.png\"\n      media=\"(prefers-color-scheme: light)\"\n    /\u003e\n    \u003csource\n      srcset=\"docs/dark/from-directory.png\"\n      media=\"(prefers-color-scheme: dark)\"\n    /\u003e\n    \u003cimg\n      src=\"docs/light/from-directory.png\"\n      alt=\"Main\"\n    \u003e\n  \u003c/picture\u003e\n\n  For example, let's use the following directory as a reference:\n\n  ```\n  └───files-to-transcribe\n      │   paranoid-android.mp3\n      │   the-past-recedes.flac\n      │\n      └───movies\n              mulholland-dr-2001.avi\n              seul-contre-tous-1998.mp4\n  ```\n\n  After transcribing the `files-to-transcribe` directory using **WhisperX**, with the `Overwrite existing files` option unchecked and the output file types `.vtt` and `.txt` selected, the folder structure will look like this:\n\n  ```\n  └───files-to-transcribe\n      │   paranoid-android.mp3\n      │   paranoid-android.txt\n      │   paranoid-android.vtt\n      │   the-past-recedes.flac\n      │   the-past-recedes.txt\n      │   the-past-recedes.vtt\n      │\n      └───movies\n              mulholland-dr-2001.avi\n              mulholland-dr-2001.txt\n              mulholland-dr-2001.vtt\n              seul-contre-tous-1998.mp4\n              seul-contre-tous-1998.txt\n              seul-contre-tous-1998.vtt\n  ```\n\n  If we transcribe the directory again with the **Google API** and the `Overwrite existing files` option unchecked, **Audiotext** won't process any files because there are already `.txt` files corresponding to all the files in the directory. However, if we added the file `endors-toi.wav` to the root of `files-to-transcribe`, it would be the only file that would be processed because it doesn't have a `.txt` attached to it. 
The same would happen in the **WhisperX** scenario, since `endors-toi.wav` has no transcription files generated.\n\n  Note that if we check the `Overwrite existing files` option, all files will be processed again and the existing transcription files will be overwritten.\n\n- **Microphone**: To start recording, click the `Start recording` button. The text of the button will change to `Stop recording` and its color will change to red. Click it to stop recording and generate the transcription.\n\n  Here is a video demonstrating this feature:\n\n  \u003c!-- english.mp4 --\u003e\n  https://github.com/user-attachments/assets/61f2173b-bcfb-4251-a910-0cf6b37598c6\n\n  Note that your operating system must recognize an input source; otherwise, an error message will appear in the text box indicating that no input source was detected.\n\n- **YouTube video**: Requires an Internet connection to get the audio of the video. To generate the transcription, enter the URL of the video in the `YouTube video URL` field and click the `Generate transcription` button when you are finished adjusting the settings.\n\n  \u003cpicture\u003e\n    \u003csource\n      srcset=\"docs/light/from-youtube.png\"\n      media=\"(prefers-color-scheme: light)\"\n    /\u003e\n    \u003csource\n      srcset=\"docs/dark/from-youtube.png\"\n      media=\"(prefers-color-scheme: dark)\"\n    /\u003e\n    \u003cimg\n      src=\"docs/light/from-youtube.png\"\n      alt=\"From YouTube\"\n    \u003e\n  \u003c/picture\u003e\n\n### Save Transcription\n\nWhen you click the `Save transcription` button, a file explorer will open where you can name the transcription file and select the path where you want to save it. Please note that any text entered or modified in the textbox **WILL NOT** be included in the saved transcription.\n\n#### Autosave\n\nUnchecked by default. 
If checked, the transcription will automatically be saved in the root of the folder where the file to transcribe is stored. If there are already existing files with the same name, they won't be overwritten. To do that, you'll need to check the `Overwrite existing files` option (see below).\n\nNote that if you create a transcription using the `Microphone` or `YouTube` audio sources with the `Autosave` action enabled, the transcription files will be saved in the root of the `audiotext-vX.X.X` directory.\n\n#### Overwrite Existing Files\n\nThis option can only be checked if the `Autosave` option is checked. If `Overwrite existing files` is checked, existing transcriptions in the root directory of the file to be transcribed will be overwritten when saving.\n\nFor example, let's use this directory as a reference:\n\n```\n└───audios\n        foo.mp3\n        foo.srt\n        foo.txt\n```\n\nIf we transcribe the audio file `foo.mp3` with the output file types `.json`, `.txt` and `.srt` and the `Autosave` and `Overwrite existing files` options checked, the files `foo.srt` and `foo.txt` will be overwritten and the file `foo.json` will be created.\n\nOn the other hand, if we transcribe the audio file `foo.mp3` with the same output file types, with the option `Autosave` checked but without the option `Overwrite existing files`, the file `foo.json` will still be created, but the files `foo.srt` and `foo.txt` will remain unchanged.\n\n### Google Speech-To-Text API Options\n\nThe `Google API options` frame appears if the selected transcription method is **Google API**. 
See the [Transcription Method](#transcription-method) section to know more about the **Google API**.\n\n\u003cp align=\"center\"\u003e\n  \u003cpicture\u003e\n    \u003csource\n      srcset=\"docs/light/google-api-options.png\"\n      media=\"(prefers-color-scheme: light)\"\n    /\u003e\n    \u003csource\n      srcset=\"docs/dark/google-api-options.png\"\n      media=\"(prefers-color-scheme: dark)\"\n    /\u003e\n    \u003cimg src=\"docs/light/google-api-options.png\" alt=\"google-api-options\"\u003e\n  \u003c/picture\u003e\n\u003c/p\u003e\n\n#### Google API Key\n\nSince the program uses the free **Google API** tier by default, which allows you to transcribe up to 60 minutes of audio per month for free, you may need to add an API key if you want to make extensive use of this feature. To do so, click the `Set API key` button. You'll be presented with a dialog box where you can enter your API key, which will **only** be used to make requests to the API.\n\n\u003cp align=\"center\"\u003e\n  \u003cpicture\u003e\n    \u003csource\n      srcset=\"docs/light/google-api-key-dialog.png\"\n      media=\"(prefers-color-scheme: light)\"\n    /\u003e\n    \u003csource\n      srcset=\"docs/dark/google-api-key-dialog.png\"\n      media=\"(prefers-color-scheme: dark)\"\n    /\u003e\n    \u003cimg src=\"docs/light/google-api-key-dialog.png\" alt=\"Google API key dialog\"\u003e\n  \u003c/picture\u003e\n\u003c/p\u003e\n\nRemember that **WhisperX** provides fast, unlimited audio transcription that supports translation and subtitle generation for free, unlike the **Google API**. Also note that Google charges for the use of the API key, for which **Audiotext** is not responsible.\n\n### Whisper API Options\n\nThe `Whisper API options` frame appears if the selected transcription method is **Whisper API**. 
See the [Transcription Method](#transcription-method) section to know more about the **Whisper API**.\n\n\u003cp align=\"center\"\u003e\n  \u003cpicture\u003e\n    \u003csource\n      srcset=\"docs/light/whisper-api-options.png\"\n      media=\"(prefers-color-scheme: light)\"\n    /\u003e\n    \u003csource\n      srcset=\"docs/dark/whisper-api-options.png\"\n      media=\"(prefers-color-scheme: dark)\"\n    /\u003e\n    \u003cimg src=\"docs/light/whisper-api-options.png\" alt=\"Whisper API options\"\u003e\n  \u003c/picture\u003e\n\u003c/p\u003e\n\n#### Whisper API Key\n\nAs noted in the [Transcription Method](#transcription-method) section, an [OpenAI API key](https://platform.openai.com/api-keys) is required to use this transcription method.\n\nTo add it, click the `Set OpenAI API key` button. You'll be presented with a dialog box where you can enter your API key, which will **only** be used to make requests to the API.\n\n\u003cp align=\"center\"\u003e\n  \u003cpicture\u003e\n    \u003csource\n      srcset=\"docs/light/open-ai-api-key-dialog.png\"\n      media=\"(prefers-color-scheme: light)\"\n    /\u003e\n    \u003csource\n      srcset=\"docs/dark/open-ai-api-key-dialog.png\"\n      media=\"(prefers-color-scheme: dark)\"\n    /\u003e\n    \u003cimg src=\"docs/light/open-ai-api-key-dialog.png\" alt=\"OpenAI API key dialog\"\u003e\n  \u003c/picture\u003e\n\u003c/p\u003e\n\nOpenAI charges for the use of the API key, for which **Audiotext** is not responsible. See the [Troubleshooting](#troubleshooting) section if you get error `429` on your first request with an API key.\n\n#### Response Format\n\nThe format of the transcript output, in one of these options:\n\n- `json`\n- `srt` (subtitle file type)\n- `text`\n- `verbose_json`\n- `vtt` (subtitle file type)\n\nDefaults to `text`.\n\n#### Temperature\n\nThe sampling temperature, between 0 and 1. 
Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use [log probability](https://en.wikipedia.org/wiki/Log_probability) to automatically increase the temperature until certain thresholds are hit.\n\nDefaults to 0.\n\n#### Timestamp Granularities\n\nThe timestamp granularities to populate for this transcription. `Response format` must be set to `verbose_json` to use timestamp granularities. Either or both of these options are supported: `word` and `segment`.\n\n**Note**: There is no additional latency for segment timestamps, but generating word timestamps incurs additional latency.\n\nDefaults to `segment`.\n\n### WhisperX Options\n\nThe **WhisperX** options appear when the selected transcription method is **WhisperX**. You can select the output file types of the transcription and whether to translate the transcription into English.\n\n\u003cp align=\"center\"\u003e\n  \u003cpicture\u003e\n    \u003csource\n      srcset=\"docs/light/whisperx-options.png\"\n      media=\"(prefers-color-scheme: light)\"\n    /\u003e\n    \u003csource\n      srcset=\"docs/dark/whisperx-options.png\"\n      media=\"(prefers-color-scheme: dark)\"\n    /\u003e\n    \u003cimg\n      src=\"docs/light/whisperx-options.png\"\n      alt=\"WhisperX options\"\n    \u003e\n  \u003c/picture\u003e\n\u003c/p\u003e\n\n#### Output File Types\n\nYou can select one or more of the following transcription output file types:\n\n- `.aud`\n- `.json`\n- `.srt` (subtitle file type)\n- `.tsv`\n- `.txt`\n- `.vtt` (subtitle file type)\n\nIf you select one of the two subtitle file types (`.vtt` and `.srt`), the `Subtitle options` frame will be displayed with more options (read more [here](#subtitle-options)).\n\n#### Translate to English\n\nTo translate the transcription to English, simply check the `Translate to English` checkbox before generating the transcription, as shown in the video below.\n\n\u003c!-- 
spanish-to-english.mp4 --\u003e\nhttps://github.com/user-attachments/assets/e614201c-25f2-4ec7-8478-3b63aade0c44\n\nIf you want to translate the audio to another language, check the [Transcription Language](#transcription-language) section.\n\n### Subtitle Options\n\nWhen you select the `.srt` and/or the `.vtt` output file type(s), the `Subtitle options` frame will be displayed. Note that the input options only apply to the `.srt` and `.vtt` files:\n\n\u003cp align=\"center\"\u003e\n  \u003cpicture\u003e\n    \u003csource\n      srcset=\"docs/light/subtitle-options.png\"\n      media=\"(prefers-color-scheme: light)\"\n    /\u003e\n    \u003csource\n      srcset=\"docs/dark/subtitle-options.png\"\n      media=\"(prefers-color-scheme: dark)\"\n    /\u003e\n    \u003cimg\n      src=\"docs/light/subtitle-options.png\"\n      alt=\"Subtitle options\"\n    \u003e\n  \u003c/picture\u003e\n\u003c/p\u003e\n\nTo get the subtitle file(s) after the audio is transcribed, you can either check the `Autosave` option before generating the transcription or click `Save transcription` and select the path where you want to save them as explained in the [Save Transcription](#save-transcription) section.\n\n#### Highlight Words\n\nUnderline each word as it's spoken in `.srt` and `.vtt` subtitle files. Not checked by default.\n\n#### Max. Line Count\n\nThe maximum number of lines in a segment. `2` by default.\n\n#### Max. Line Width\n\nThe maximum number of characters in a line before breaking the line. 
`42` by default.\n\n### Advanced Options\n\nWhen you click the `Show advanced options` button in the `WhisperX options` frame, the `Advanced options` frame appears, as shown in the figure below.\n\n\u003cp align=\"center\"\u003e\n  \u003cpicture\u003e\n    \u003csource\n      srcset=\"docs/light/whisperx-advanced-options.png\"\n      media=\"(prefers-color-scheme: light)\"\n    /\u003e\n    \u003csource\n      srcset=\"docs/dark/whisperx-advanced-options.png\"\n      media=\"(prefers-color-scheme: dark)\"\n    /\u003e\n    \u003cimg src=\"docs/light/whisperx-advanced-options.png\" alt=\"WhisperX advanced options\"\u003e\n  \u003c/picture\u003e\n\u003c/p\u003e\n\nIt's highly recommended that you don't change the default configuration unless you're having problems with **WhisperX** or you know exactly what you're doing, especially the `Compute type` and `Batch size` options. Change them at your own risk and be aware that you may experience problems, such as having to reboot your system if the GPU runs out of VRAM.\n\n#### Model Size\n\nThere are five main ASR (Automatic Speech Recognition) model sizes that offer tradeoffs between speed and accuracy. The larger the model size, the more VRAM it uses and the longer it takes to transcribe. Unfortunately, **WhisperX** hasn't provided specific performance data for each model, so the table below is based on the one detailed in [OpenAI's Whisper README](https://github.com/openai/whisper). 
According to **WhisperX**, the `large-v2` model requires \u003c8GB of GPU memory and batches inference for 70x real-time transcription (taken from the project's [README](https://github.com/m-bain/whisperX)).\n\n|  Model   | Parameters | Required VRAM  |\n|:--------:|:----------:|:--------------:|\n|  `tiny`  |    39 M    |     ~1 GB      |\n|  `base`  |    74 M    |     ~1 GB      |\n| `small`  |   244 M    |     ~2 GB      |\n| `medium` |   769 M    |     ~5 GB      |\n| `large`  |   1550 M   |     \u003c8 GB      |\n\n\u003e [!NOTE]\n\u003e`large` is divided into three versions: `large-v1`, `large-v2`, and `large-v3`. The default model size is `large-v2`, since `large-v3` has some bugs that weren't as common in `large-v2`, such as hallucination and repetition, especially for certain languages like Japanese. There are also more prevalent problems with missing punctuation and capitalization. See the announcements for the [`large-v2`](https://github.com/openai/whisper/discussions/661) and the [`large-v3`](https://github.com/openai/whisper/discussions/1762) models for more insight into their differences and the issues encountered with each.\n\nThe larger the model size, the lower the WER (Word Error Rate in %). The table below is taken from [this Medium article](https://blog.ml6.eu/fine-tuning-whisper-for-dutch-language-the-crucial-role-of-size-dd5a7012d45f), which analyzes the performance of pre-trained Whisper models on common Dutch speech.\n\n|  Model   |  WER  |\n|:--------:|:-----:|\n|   tiny   | 50.98 |\n|  small   | 17.90 |\n| large-v2 | 7.81  |\n\n#### Compute Type\n\nThis term refers to different data types used in computing, particularly in the context of numerical representation. It determines how numbers are stored and represented in a computer's memory. The higher the precision, the more resources will be needed and the better the transcription will be.\n\nThere are three possible values for **Audiotext**:\n- `int8`: Default if using CPU. 
It represents whole numbers without any fractional part. Its size is 8 bits (1 byte) and it can represent integer values from -128 to 127 (signed) or 0 to 255 (unsigned). It is used in scenarios where memory efficiency is critical, such as in quantized neural networks or edge devices with limited computational resources.\n- `float16`: Default if using CUDA GPU. It's a half precision type representing 16-bit floating point numbers. Its size is 16 bits (2 bytes). It has a smaller range and precision compared to `float32`. It's often used in applications where memory is a critical resource, such as in deep learning models running on GPUs or TPUs.\n- `float32`: Recommended for CUDA GPUs with more than 8 GB of VRAM. It's a single precision type representing 32-bit floating point numbers, which is a standard for representing real numbers in computers. Its size is 32 bits (4 bytes). It can represent a wide range of real numbers with a reasonable level of precision.\n\n#### Batch Size\n\nThis option determines how many audio segments are processed together in a single inference pass. It doesn't affect the quality of the transcription, only the generation speed (the smaller, the slower).\n\nFor simplicity, let's divide the possible batch size values into two groups:\n\n- **Small batch size (0\u003cx\u003c=8)**: Uses less memory, which can be important when working with limited resources, at the cost of slower transcription. `8` is the default value.\n- **Large batch size (\u003e8)**: Speeds up transcription, especially on hardware optimized for parallel processing such as GPUs, but uses more memory. The maximum recommended value is `16`.\n\n#### Use CPU\n\n**WhisperX** will use the CPU for transcription if checked. 
Checked by default if there is no CUDA GPU.\n\nAs noted in the [Compute Type](#compute-type) section, the default compute type value for the CPU is `int8`, since many CPUs don't support efficient `float16` or `float32` computation, which would result in an error. Change it at your own risk.\n\n## Troubleshooting\n\n### The program is unresponsive when using WhisperX\n\nThe first transcription created by **WhisperX** will take longer than subsequent ones. That's because **Audiotext** needs to load the model, which can take a few minutes, depending on the hardware the program is running on. It may appear to be unresponsive, but do not close it, as it will eventually return to a normal state.\n\nOnce the model is loaded, you'll notice a dramatic increase in the speed of subsequent transcriptions using this method.\n\n### I get the error `RuntimeError: CUDA Out of memory` when using WhisperX\n\nTry any of the following (2 and 3 can affect quality) (taken from [WhisperX README](https://github.com/m-bain/whisperX#technical-details-%EF%B8%8F)):\n1. Reduce batch size, e.g. `4`\n2. Use a smaller ASR model, e.g. `base`\n3. Use lighter compute type, e.g. `int8`\n\n### Is it possible to use less GPU/CPU memory requirements when using WhisperX?\n\nYou can follow the steps above. See the [Model Size](#model-size) section for how much memory you need for each model.\n\n### The program takes _too_ much time to generate a transcription\n\nTry using a smaller ASR model and/or a lighter compute type, as indicated in the point above. Keep in mind that the first **WhisperX** transcription will take some time to load the model. Also remember that the transcription process depends heavily on your system's hardware, so don't expect instant results on modest CPUs. 
Alternatively, you can use the **Whisper API** or **Google API** transcription methods, which are much less hardware intensive than **WhisperX** because the transcriptions are generated remotely, but you'll be dependent on the speed of your Internet connection.\n\n### When I try to generate a transcription using the Whisper API method, I get the error `429`\n\nYou'll be prompted with an error like this:\n\n```\nRateLimitError(\"Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}\")\n```\n\nThis is either because your account has run out of credits or because you need to fund your account before you can use the API for the first time (even if you have free credits available). To fix this, you need to purchase credits for your account (starting at $5) with a credit or debit card by going to the [Billing](https://platform.openai.com/settings/organization/billing/overview) section of your OpenAI account settings.\n\nAfter funds are added to your account, it may take up to 10 minutes for your account to become active.\n\nIf you are using an API key that was created before you funded your account for the first time, and the error still persists after about 10 minutes, you'll need to create a new API key and change it in **Audiotext** (see the [Whisper API Key](#whisper-api-key) section to change it).\n\n\u003cp align=\"right\"\u003e(\u003ca href=\"#top\"\u003eback to top\u003c/a\u003e)\u003c/p\u003e\n\n\u003c!-- ROADMAP --\u003e\n\n## Roadmap\n\nSee the [project backlog](https://github.com/users/HenestrosaDev/projects/1).\n\nYou can propose a new feature by creating a [discussion](https://github.com/HenestrosaDev/audiotext/discussions/new?category=ideas)!\n\n\u003c!-- AUTHORS --\u003e\n\n## Authors\n\n- HenestrosaDev 
\u003chenestrosadev@gmail.com\u003e (José Carlos López Henestrosa)\n\nSee also the list of [contributors](https://github.com/HenestrosaDev/audiotext/contributors) who participated in this project.\n\n\u003c!-- CONTRIBUTING --\u003e\n\n## Contributing\n\nContributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are **greatly appreciated**. Please read the [CONTRIBUTING.md](https://github.com/HenestrosaDev/audiotext/blob/main/.github/CONTRIBUTING.md) file, where you can find more detailed information about how to contribute to the project.\n\n\u003c!-- ACKNOWLEDGMENTS --\u003e\n\n## Acknowledgments\n\nI used the following resources to create this project:\n\n- [Extracting speech from video using Python](https://towardsdatascience.com/extracting-speech-from-video-using-python-f0ec7e312d38)\n- [How to translate Python applications with the GNU gettext module](https://phrase.com/blog/posts/translate-python-gnu-gettext/)\n- [Speech recognition on large audio files](https://www.geeksforgeeks.org/python-speech-recognition-on-large-audio-files/)\n\n\u003c!-- LICENSE --\u003e\n\n## License\n\nDistributed under the BSD-4-Clause license. See [`LICENSE`](https://github.com/HenestrosaDev/audiotext/blob/main/LICENSE) for more information.\n\n\u003c!-- SUPPORT --\u003e\n\n## Support\n\nWould you like to support the project? That's very kind of you! However, I would suggest that you consider supporting the packages that I've used to build this project first. 
If you still want to support this particular project, you can go to my Ko-Fi profile by clicking on the button down below!\n\n[![ko-fi](https://ko-fi.com/img/githubbutton_sm.svg)](https://ko-fi.com/henestrosadev)\n\n\u003cp align=\"right\"\u003e(\u003ca href=\"#top\"\u003eback to top\u003c/a\u003e)\u003c/p\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhenestrosadev%2Faudiotext","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhenestrosadev%2Faudiotext","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhenestrosadev%2Faudiotext/lists"}