{"id":19417787,"url":"https://github.com/bradsec/audio-utils-webui","last_synced_at":"2025-02-25T03:41:20.353Z","repository":{"id":196331309,"uuid":"695770978","full_name":"bradsec/audio-utils-webui","owner":"bradsec","description":"A Gradio WebUI for audio utilities. Demucs, Whisper, Audio Slicer","archived":false,"fork":false,"pushed_at":"2023-09-24T07:04:57.000Z","size":113,"stargazers_count":1,"open_issues_count":1,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-07T17:24:35.693Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bradsec.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE_audioslicer.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2023-09-24T07:04:10.000Z","updated_at":"2024-02-22T06:24:00.000Z","dependencies_parsed_at":"2023-09-24T14:50:44.681Z","dependency_job_id":null,"html_url":"https://github.com/bradsec/audio-utils-webui","commit_stats":null,"previous_names":["bradsec/audio-utils-webui"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bradsec%2Faudio-utils-webui","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bradsec%2Faudio-utils-webui/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bradsec%2Faudio-utils-webui/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bradsec%2Faudio-utils-webui/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bradsec","download_url":"https://codeload.github.com/bradsec/audio-
utils-webui/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240599184,"owners_count":19826959,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-10T13:11:22.915Z","updated_at":"2025-02-25T03:41:20.329Z","avatar_url":"https://github.com/bradsec.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# audio-utils-webui - A Gradio WebUI for audio utilities.\n\n![screenshot](screenshot.png)\n\n### Current audio utilities included in WebUI\n- [Demucs](https://github.com/facebookresearch/demucs)\n- [OpenAI Whisper](https://github.com/openai/whisper)\n- [Audio Slicer](https://github.com/openvpi/audio-slicer)\n\n## Features\n\n- Separate audio tracks into stems such as vocals, instrumental, drums, and bass using Demucs models.\n- Split audio files based on silence detection using Audio Slicer.\n- Convert audio speech to text using OpenAI's Whisper models.\n- Run locally on your own hardware.\n- Supports various audio types including MP3, WAV, and FLAC.\n- Provides a simple and intuitive web UI for easy use.\n\n### Possible future utilities\n- Other denoise, de-echo, and de-reverb utilities\n- WebUI for helpful FFmpeg commands, such as concatenating audio files\n\n## FFmpeg Requirement\n- **FFmpeg** must be installed for your OS. 
Refer to the [FFmpeg official site](https://ffmpeg.org/download.html).\n\n```terminal\n# Debian based Linux installation of FFmpeg is simple\nsudo apt -y install ffmpeg\n```\n```terminal\n# Windows install: download and extract the latest release.\n# Add the extracted directory to PATH so the ffmpeg binaries can be accessed by the application.\n# Direct link to download: https://www.gyan.dev/ffmpeg/builds/ffmpeg-git-full.7z\n\n# Windows adding to PATH\n# Open a cmd terminal as administrator and run the following command (example assumes binaries are extracted to c:\\ffmpeg)\nsetx /m PATH \"C:\\ffmpeg\\bin;%PATH%\"\n```\n```terminal\n# MacOS\n# Homebrew is the easiest option if it is installed; otherwise refer to the official FFmpeg install guides.\nbrew install ffmpeg\n```\n\n## Python Environment Setup and WebUI Installation\n\nBefore running the program, ensure you have the necessary Python packages installed. A virtual environment is recommended; the example below uses Conda.\n\n### Linux, Windows or MacOS\n```terminal\n# Using a miniconda virtual environment\n\nconda create -y -n demucs python=3.10.9\nconda activate demucs\n```\n```terminal\n# Install Python requirements with pip\npip install demucs gradio soundfile\n# Pip install latest version of Whisper\npip install git+https://github.com/openai/whisper.git\n```\nNext, clone the repository and navigate to the project directory:\n\n```terminal\ngit clone https://github.com/bradsec/audio-utils-webui.git\ncd audio-utils-webui\n```\n\n## Usage\n\nTo start the program, run the following command in the project directory:\n\n```terminal\npython webui.py\n```\n\nThis will launch the Gradio WebUI on `http://0.0.0.0:7860`. Open this URL in a web browser to access the user interface. If launching on the same PC you can use `http://localhost:7860`; otherwise, `0.0.0.0` allows it to be accessed from any computer on the local network using the assigned IP address of the machine hosting the WebUI, e.g. 
`http://192.168.0.1:7860`.\n\n**Note:** On first use, Demucs and Whisper will download the required checkpoints and models. Check the terminal output for the status of the download and processing.\n\n## Testing\n- September 2023 - Tested above install and webui usage on Debian 12 (Bookworm) and Microsoft Windows 11.\n\n## Limitations\n- There may be issues with very long audio files.\n\n## Troubleshooting\n\nCheck the terminal output for information. All commands and progress will be shown in the terminal. As mentioned, models need to download on first use; the number of files and their size may vary. Other messages, such as out of memory (GPU) errors, will also be shown in the terminal.\n\n## Licence Information / Credit\n- MIT Licence - [audio-utils-webui github.com/bradsec/audio-utils-webui](https://github.com/bradsec/audio-utils-webui)\n- Apache-2.0 License - [Gradio github.com/gradio-app/gradio](https://github.com/gradio-app/gradio)\n- MIT Licence - [Demucs - github.com/facebookresearch/demucs](https://github.com/facebookresearch/demucs)\n- MIT Licence - [Audio Slicer - github.com/openvpi/audio-slicer](https://github.com/openvpi/audio-slicer)\n- MIT Licence - [OpenAI Whisper - github.com/openai/whisper](https://github.com/openai/whisper)  \n- [Mixed Licence](https://github.com/FFmpeg/FFmpeg/blob/master/LICENSE.md) - [github.com/FFmpeg/FFmpeg](https://github.com/FFmpeg/FFmpeg)","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbradsec%2Faudio-utils-webui","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbradsec%2Faudio-utils-webui","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbradsec%2Faudio-utils-webui/lists"}