{"id":13910599,"url":"https://github.com/rakuri255/UltraSinger","last_synced_at":"2025-07-18T09:32:08.327Z","repository":{"id":74938787,"uuid":"594208922","full_name":"rakuri255/UltraSinger","owner":"rakuri255","description":"AI based tool to convert vocals lyrics and pitch from music to autogenerate Ultrastar Deluxe, Midi and notes. It automatic tapping, adding text, pitch vocals and creates karaoke files.","archived":false,"fork":false,"pushed_at":"2024-11-10T21:23:35.000Z","size":522,"stargazers_count":281,"open_issues_count":48,"forks_count":25,"subscribers_count":24,"default_branch":"main","last_synced_at":"2024-11-10T22:25:00.112Z","etag":null,"topics":["ai","audio","karaoke","lyrics","midi","music","pitch-detection","singing","ultrastar","vocal","voice"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rakuri255.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":"rakuri255","patreon":"Rakuri","open_collective":null,"ko_fi":null,"tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"otechie":null,"lfx_crowdfunding":null,"custom":["https://www.buymeacoffee.com/rakuri255"]}},"created_at":"2023-01-27T21:21:28.000Z","updated_at":"2024-11-10T21:23:38.000Z","dependencies_parsed_at":"2024-02-09T22:28:48.434Z","dependency_job_id":"d5c84563-8966-426a-a919-1e42e2c61a5b","html_url":"https://github.com/rakuri255/UltraSinger","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"htt
ps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rakuri255%2FUltraSinger","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rakuri255%2FUltraSinger/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rakuri255%2FUltraSinger/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rakuri255%2FUltraSinger/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rakuri255","download_url":"https://codeload.github.com/rakuri255/UltraSinger/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":226388652,"owners_count":17617313,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","audio","karaoke","lyrics","midi","music","pitch-detection","singing","ultrastar","vocal","voice"],"created_at":"2024-08-07T00:01:36.650Z","updated_at":"2025-07-18T09:32:08.317Z","avatar_url":"https://github.com/rakuri255.png","language":"Python","funding_links":["https://github.com/sponsors/rakuri255","https://patreon.com/Rakuri","https://www.buymeacoffee.com/rakuri255"],"categories":["Python"],"sub_categories":[],"readme":"[![Discord](https://img.shields.io/discord/1048892118732656731?logo=discord)](https://discord.gg/rYz9wsxYYK)\n[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/rakuri255/UltraSinger/blob/master/colab/UltraSinger.ipynb)\n![Status](https://img.shields.io/badge/status-development-yellow)\n\n![GitHub Workflow 
Status](https://img.shields.io/github/actions/workflow/status/rakuri255/UltraSinger/main.yml)\n[![GitHub](https://img.shields.io/github/license/rakuri255/UltraSinger)](https://github.com/rakuri255/UltraSinger/blob/main/LICENSE)\n[![CodeFactor](https://www.codefactor.io/repository/github/rakuri255/ultrasinger/badge)](https://www.codefactor.io/repository/github/rakuri255/ultrasinger)\n[![Check Requirements](https://github.com/rakuri255/UltraSinger/actions/workflows/main.yml/badge.svg)](https://github.com/rakuri255/UltraSinger/actions/workflows/main.yml)\n[![Pytest](https://github.com/rakuri255/UltraSinger/actions/workflows/pytest.yml/badge.svg)](https://github.com/rakuri255/UltraSinger/actions/workflows/pytest.yml)\n[![docker](https://github.com/rakuri255/UltraSinger/actions/workflows/docker.yml/badge.svg)](https://hub.docker.com/r/rakuri255/ultrasinger)\n\n\u003cp align=\"center\" dir=\"auto\"\u003e\n\u003cimg src=\"https://repository-images.githubusercontent.com/594208922/4befe3da-a448-4cbc-b6ef-93899119071b\" style=\"height: 300px;width: auto;\" alt=\"UltraSinger Logo\"\u003e\n\u003c/p\u003e\n\n# UltraSinger\n\n\u003e ⚠️ _This project is still under development!_\n\nUltraSinger is a tool to automatically create UltraStar.txt, midi and notes from music. \nIt automatically pitches UltraStar files, adding text and tapping to UltraStar files and creates separate UltraStar karaoke files.\nIt also can re-pitch current UltraStar files and calculates the possible in-game score.\n\nMultiple AI models are used to extract text from the voice and to determine the pitch.\n\nPlease mention UltraSinger in your UltraStar.txt file if you use it. It helps others find this tool, and it helps this tool get improved and maintained.\nYou should only use it on Creative Commons licensed songs.\n\n## ❤️ Support\nThere are many ways to support this project. 
Starring ⭐️ the repo is just one 🙏\n\nYou can also support this work on \u003ca href=\"https://github.com/sponsors/rakuri255\"\u003eGitHub sponsors\u003c/a\u003e or \u003ca href=\"https://patreon.com/Rakuri\"\u003ePatreon\u003c/a\u003e or \u003ca href=\"https://www.buymeacoffee.com/rakuri255\" target=\"_blank\"\u003eBuy Me a Coffee\u003c/a\u003e.\n\nThis will help me a lot to keep this project alive and improve it.\n\n\u003ca href=\"https://www.buymeacoffee.com/rakuri255\" target=\"_blank\"\u003e\u003cimg src=\"https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png\" alt=\"Buy Me a Coffee\" style=\"height: 60px !important;width: 217px !important;\" \u003e\u003c/a\u003e\n\u003ca href=\"https://patreon.com/Rakuri\"\u003e\u003cimg src=\"https://raw.githubusercontent.com/rakuri255/UltraSinger/main/assets/patreon.png\" alt=\"Become a Patron\" style=\"height: 60px !important;width: 217px !important;\"/\u003e \u003c/a\u003e\n\u003ca href=\"https://github.com/sponsors/rakuri255\"\u003e\u003cimg src=\"https://raw.githubusercontent.com/rakuri255/UltraSinger/main/assets/mona-heart-featured.webp\" alt=\"GitHub Sponsor\" style=\"height: 60px !important;width: auto;\"/\u003e \u003c/a\u003e\n\n## Table of Contents\n\n- [UltraSinger](#ultrasinger)\n  - [❤️ Support](#️-support)\n  - [Table of Contents](#table-of-contents)\n  - [💻 How to use this source code](#-how-to-use-this-source-code)\n    - [Installation](#installation)\n    - [Run](#run)\n  - [📖 How to use the App](#-how-to-use-the-app)\n    - [🎶 Input](#-input)\n      - [Audio (full automatic)](#audio-full-automatic)\n        - [Local file](#local-file)\n        - [Youtube](#youtube)\n      - [UltraStar (re-pitch)](#ultrastar-re-pitch)\n    - [🗣 Transcriber](#-transcriber)\n      - [Whisper](#whisper)\n        - [Whisper languages](#whisper-languages)\n      - [✍️ Hyphenation](#️-hyphenation)\n    - [👂 Pitcher](#-pitcher)\n    - [👄 Separation](#-separation)\n    - [Sheet Music](#sheet-music)\n    - [Format 
Version](#format-version)\n    - [🏆 Ultrastar Score Calculation](#-ultrastar-score-calculation)\n    - [📟 Use GPU](#-use-gpu)\n      - [Considerations for Windows users](#considerations-for-windows-users)\n      - [Crashes due to low VRAM](#crashes-due-to-low-vram)\n    - [📦 Containerized](#-containerized-docker-or-podman)\n\n## 💻 How to use this source code\n\n### Installation\n\n* Install Python 3.10 **(older and newer versions have breaking changes)**. [Download](https://www.python.org/downloads/)\n* Also download or install ffmpeg and add it to your PATH. [Download](https://www.ffmpeg.org/download.html)\n* Go to the `install` folder and run the install script for your OS.\n  * Choose `GPU` if you have an NVIDIA CUDA GPU.\n  * Choose `CPU` if you don't have an NVIDIA CUDA GPU.\n\n### Run\n\n* In the root folder, run `run_on_windows.bat` or `run_on_linux.sh` to start the app.\n* Now you can use the UltraSinger source code with `py UltraSinger.py [opt] [mode] [transcription] [pitcher] [extra]`. See [How to use the App](#-how-to-use-the-app) for more information.\n\n## 📖 How to use the App\n\n_Not all options work yet!_\n```commandline\n    UltraSinger.py [opt] [mode] [transcription] [pitcher] [extra]\n    \n    [opt]\n    -h      This help text.\n    -i      Ultrastar.txt\n            audio like .mp3, .wav, youtube link\n    -o      Output folder\n    \n    [mode]\n    ## if INPUT is audio ##\n    default (Full Automatic Mode) - Creates all, depending on command line options\n    --interactive - Interactive Mode. 
All options are asked at runtime for easier configuration\n    \n    # Single file creation selection is in progress, you currently get all files!\n    (-u      Create ultrastar txt file) # In Progress\n    (-m      Create midi file) # In Progress\n    (-s      Create sheet file) # In Progress\n    \n    ## if INPUT is ultrastar.txt ##\n    default  Creates all\n\n    [separation]\n    # Default is htdemucs\n    --demucs              Model name htdemucs|htdemucs_ft|htdemucs_6s|hdemucs_mmi|mdx|mdx_extra|mdx_q|mdx_extra_q \u003e\u003e ((default) is htdemucs)\n\n    [transcription]\n    # Default is whisper\n    --whisper               Multilingual model \u003e tiny|base|small|medium|large-v1|large-v2|large-v3  \u003e\u003e ((default) is large-v2)\n                            English-only model \u003e tiny.en|base.en|small.en|medium.en\n    --whisper_align_model   Use other languages model for Whisper provided from huggingface.co\n    --language              Override the language detected by whisper, does not affect transcription but steps after transcription\n    --whisper_batch_size    Reduce if low on GPU mem \u003e\u003e ((default) is 16)\n    --whisper_compute_type  Change to \"int8\" if low on GPU mem (may reduce accuracy) \u003e\u003e ((default) is \"float16\" for cuda devices, \"int8\" for cpu)\n    --keep_numbers          Numbers will be transcribed as numerics instead of as words \n    \n    [pitcher]\n    # Default is crepe\n    --crepe            tiny|full \u003e\u003e ((default) is full)\n    --crepe_step_size  unit is milliseconds \u003e\u003e ((default) is 10)\n    \n    [extra]\n    --disable_hyphenation   Disable word hyphenation. Hyphenation is enabled by default.\n    --disable_separation    Disable track separation. Track separation is enabled by default.\n    --disable_karaoke       Disable creation of karaoke style txt file. Karaoke is enabled by default.\n    --create_audio_chunks   Enable creation of audio chunks. 
Audio chunks are disabled by default.\n    --keep_cache            Keep cache folder after creation. Cache folder is removed by default.\n    --plot                  Enable creation of plots. Plots are disabled by default.\n    --format_version        0.3.0|1.0.0|1.1.0|1.2.0 \u003e\u003e ((default) is 1.2.0)\n    --musescore_path        path to MuseScore executable\n    --keep_numbers          Transcribe numbers as digits and not words\n    --ffmpeg                Path to ffmpeg and ffprobe executable\n    \n    [yt-dlp]\n    --cookiefile            File name where cookies should be read from\n\n    [device]\n    --force_cpu             Force all steps to be processed on CPU.\n    --force_whisper_cpu     Only whisper will be forced to cpu\n    --force_crepe_cpu       Only crepe will be forced to cpu\n```\n\nFor standard use, you only need to use [opt]. All other options are optional.\n\n### 🎶 Input\n\n### Mode\ndefault (Full Automatic Mode) - Creates all, depending on command line options\n--interactive - Interactive Mode. All options are asked at runtime for easier configuration\n```commandline\n--interactive\n```\n#### Audio (full automatic)\n\n##### Local file\n\n```commandline\n-i \"input/music.mp3\"\n```\n\n##### Youtube\n\n```commandline\n-i https://www.youtube.com/watch?v=YwNs1Z0qRY0\n```\n\nNote that if you run into a yt-dlp error such as `Sign in to confirm you’re not a bot. 
This helps protect our community` ([yt-dlp issue](https://github.com/yt-dlp/yt-dlp/issues/10128)) you can follow these steps:\n\n* generate a cookies.txt file with [yt-dlp](https://github.com/yt-dlp/yt-dlp/wiki/Installation) `yt-dlp --cookies cookies.txt --cookies-from-browser firefox`\n* then pass the cookies.txt to UltraSinger `--cookiefile cookies.txt`\n\n#### UltraStar (re-pitch)\n\nThis re-pitches the audio and creates a new txt file.\n\n```commandline\n-i \"input/ultrastar.txt\"\n```\n\n### 🗣 Transcriber\n\nKeep in mind that while a larger model is more accurate, it also takes longer to transcribe.\n\n#### Whisper\n\nFor a first test run, use the `tiny` model; for the best accuracy, use the `large-v2` model.\n\n```commandline\n-i XYZ --whisper large-v2\n```\n\n##### Whisper languages\n\nDefault language models are currently provided for `en, fr, de, es, it, ja, zh, nl, uk, pt`. \nIf your language is not in this list, you need to find a phoneme-based ASR model on the \n[🤗 huggingface model hub](https://huggingface.co). It will download automatically.\n\nExample for Romanian:\n```commandline\n-i XYZ --whisper_align_model \"gigant/romanian-wav2vec2\"\n```\n\n#### ✍️ Hyphenation\n\nHyphenation is on by default. You can disable it if it does not produce \nanything useful. Note that each word is simply split, \nwithout checking whether the separated part really \nstarts at that position in the audio. To disable:\n\n```commandline\n-i XYZ --disable_hyphenation\n```\n\n### 👂 Pitcher\n\nPitching is done with the `crepe` model.\nA bigger model is more accurate, but also takes longer to pitch.\nFor quick tests, use `tiny`.\nFor the best accuracy, use the `full` model.\n\n```commandline\n-i XYZ --crepe full\n```\n\n### 👄 Separation\n\nThe vocals are separated from the audio before they are passed to the models. 
If problems occur with this, \nyou can disable separation, in which case the original audio file is used instead.\n\n```commandline\n-i XYZ --disable_separation \n```\n\n### Sheet Music\n\nFor sheet music generation, `MuseScore` must be installed on your system.\nAlternatively, provide the path to the `MuseScore` executable.\n\n```commandline\n-i XYZ --musescore_path \"C:/Program Files/MuseScore 4/bin/MuseScore4.exe\"\n```\n\n### Format Version\n\nThis defines the format version of the UltraStar.txt file. For more info see the [Official UltraStar format specification](https://usdx.eu/format/).\n\nYou can choose between different format versions. The default is `1.2.0`.\n* `0.3.0` is the first format version. Use this if you have an old UltraStar program and problems with the newer format.\n* `1.0.0` should be supported by most UltraStar programs. Use this if you have problems with the newest format version.\n* `1.1.0` is the current format version.\n* `1.2.0` is the upcoming format version. It is not finished yet.\n* `2.0.0` is the next format version. It is not finished yet.\n\n```commandline\n-i XYZ --format_version 1.2.0\n```\n\n### 🏆 Ultrastar Score Calculation\n\nUltraSinger measures the score that the singer in the audio would receive. \nYou get two scores: simple and accurate. What is the difference? \nUltrastar does not care about pitch height (the octave). As long as the note matches within the range A-G, you get the point. \nThis makes sense for the game, because otherwise men don't get points for high female voices and women don't get points \nfor low male voices. The accurate score uses the real tone specified in the txt. I had txt files where the pitch was in a range not \nsingable by humans, but you could still reach the 10k points in the game. Accuracy is important here, because the\nMIDI and sheet music are created from these notes. And you also want to have accurate files.\n\n\n### 📟 Use GPU\n\nWith a GPU you can speed up the process. 
Also the quality of the transcription and pitching is better.\n\nYou need a CUDA device for this to work. Sorry, there is no CUDA device for macOS.\n\nIt is optional (but recommended) to install the CUDA driver for your GPU: see [driver](https://developer.nvidia.com/cuda-downloads).\nInstall torch with CUDA separately in your `venv`. See [torch+cuda](https://pytorch.org/get-started/locally/).\nAlso check your GPU's CUDA support. See [cuda support](https://gist.github.com/standaloneSA/99788f30466516dbcc00338b36ad5acf).\n\nCommand for `pip`:\n```\npip3 install torch==2.0.1+cu117 torchvision==0.15.2+cu117 torchaudio==2.0.2+cu117 --index-url https://download.pytorch.org/whl/cu117\n```\n\nIf you want to use `conda` instead, you need a [different installation command](https://pytorch.org/get-started/locally/).\n\n#### Considerations for Windows users\n\nThe pitch tracker used by UltraSinger (crepe) uses TensorFlow as its backend.\nTensorFlow dropped GPU support for Windows for versions \u003e2.10 as you can see in this [release note](https://github.com/tensorflow/tensorflow/releases/tag/v2.11.1) and their [installation instructions](https://www.tensorflow.org/install/pip#windows-native).\n\nFor now UltraSinger runs the latest version available that still supports GPUs on Windows.\n\nTo run later versions of TensorFlow on Windows while still taking advantage of GPU support, the suggested solution is to [run UltraSinger in a container](container/README.md).\n\n#### Crashes due to low VRAM\n\nIf something crashes because of low VRAM, use a smaller model.\nWhisper needs more than 8GB VRAM in the `large` model!\n\nYou can also force CPU usage with the extra option `--force_cpu`.\n\n### 📦 Containerized (Docker or Podman)\n\nSee 
[container/README.md](container/README.md)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frakuri255%2FUltraSinger","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frakuri255%2FUltraSinger","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frakuri255%2FUltraSinger/lists"}