{"id":15135988,"url":"https://github.com/tuanad121/python-world","last_synced_at":"2025-11-08T06:04:09.215Z","repository":{"id":34087150,"uuid":"128144676","full_name":"tuanad121/Python-WORLD","owner":"tuanad121","description":null,"archived":false,"fork":false,"pushed_at":"2023-12-20T20:23:14.000Z","size":15577,"stargazers_count":152,"open_issues_count":9,"forks_count":32,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-06-07T17:45:15.812Z","etag":null,"topics":["manifold-learning","manifold-vocoder","pitch","pycharm","python","vae","variational-autoencoder","world-vocoder"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tuanad121.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-04-05T01:24:18.000Z","updated_at":"2025-03-13T08:00:32.000Z","dependencies_parsed_at":"2024-09-21T10:00:38.722Z","dependency_job_id":"cb6367be-ae4f-4e9c-86bf-2d360aca43f1","html_url":"https://github.com/tuanad121/Python-WORLD","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/tuanad121/Python-WORLD","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tuanad121%2FPython-WORLD","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tuanad121%2FPython-WORLD/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tuanad121%2FPython-WORLD/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tuanad121%2FPyt
hon-WORLD/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tuanad121","download_url":"https://codeload.github.com/tuanad121/Python-WORLD/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tuanad121%2FPython-WORLD/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270005591,"owners_count":24510939,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-12T02:00:09.011Z","response_time":80,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["manifold-learning","manifold-vocoder","pitch","pycharm","python","vae","variational-autoencoder","world-vocoder"],"created_at":"2024-09-26T06:03:05.614Z","updated_at":"2025-11-08T06:04:04.184Z","avatar_url":"https://github.com/tuanad121.png","language":"Python","readme":"# PYTHON WORLD VOCODER: \n*************************************\n\nThis is a line-by-line implementation of WORLD vocoder (Matlab, C++) in python. 
It supports *Python 3.0* and later.\n\nFor technical details, please check the [website](http://www.kki.yamanashi.ac.jp/~mmorise/world/english/).\n\n# INSTALLATION\n*********************\n\nPython WORLD uses the following dependencies:\n\n* numpy, scipy\n* matplotlib\n* numba\n* simpleaudio (just for demonstration)\n\nInstall the Python dependencies with:\n\n```\npip install -r requirements.txt\n```\n\nAlternatively, import the project into [PyCharm](https://www.jetbrains.com/pycharm/) and open ```requirements.txt```; PyCharm will then offer to install the missing libraries for you.\n\n# EXAMPLE\n**************\n\nThe easiest way to run the examples is to import the ```Python-WORLD``` folder into PyCharm.\n\n```example/prodosy.py``` contains an example of analysis/modification/synthesis with the WORLD vocoder, including pitch, duration, and spectrum modification.\n\nFirst, we read an audio file:\n\n```python\nfrom scipy.io.wavfile import read as wavread\nfs, x_int16 = wavread(wav_path)\nx = x_int16 / (2 ** 15 - 1)  # normalize int16 samples to floats in [-1, 1]\n```\n\nThen, we create a vocoder and encode the audio file:\n\n```python\nfrom world import main\nvocoder = main.World()\n# analysis\ndat = vocoder.encode(fs, x, f0_method='harvest')\n```\n\nwhere ```fs``` is the sampling frequency and ```x``` is the speech signal.\n\n```dat``` is a dictionary containing the pitch, magnitude spectrum, and aperiodicity. 
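As a quick sanity check of the int16-to-float normalization used above (plain NumPy, independent of the vocoder itself), the full-scale int16 values map to -1.0 and 1.0:

```python
import numpy as np

# full-scale int16 samples: negative peak, silence, positive peak
x_int16 = np.array([-32767, 0, 32767], dtype=np.int16)
x = x_int16 / (2 ** 15 - 1)  # same normalization as in the example above
print(x)  # [-1.  0.  1.]
```

Note that ```np.int16``` can also hold -32768, which maps to slightly below -1.0 under this scheme.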
\n\nWe can scale the pitch:\n\n```python\ndat = vocoder.scale_pitch(dat, 1.5)\n```\n\nBe careful when scaling the pitch: there are upper and lower limits.\n\nWe can make the speech faster or slower:\n\n```python\ndat = vocoder.scale_duration(dat, 2)\n```\n\nIn ```test/speed.py```, we measure the analysis time.\n\nTo use d4c_requiem analysis and requiem_synthesis from WORLD version 0.2.2, set ```is_requiem=True```:\n\n```python\n# requiem analysis\ndat = vocoder.encode(fs, x, f0_method='harvest', is_requiem=True)\n```\n\nTo extract log-filterbanks, MCEP-40, and VAE-12 as described in the paper `Using a Manifold Vocoder for Spectral Voice and Style Conversion`, check ```test/spectralFeatures.py```. You need Keras 2.2.4 and TensorFlow 1.14.0 to extract VAE-12.\nCheck out the [speech samples](https://tuanad121.github.io/samples/2019-09-15-Manifold/).\n\n# NOTE:\n**********\n\n* The vocoder uses pitch-synchronous analysis: the size of each window is determined by the fundamental frequency ```F0```, and the window centers are equally spaced ```frame_period``` ms apart.\n\n* The Fourier transform size (```fft_size```) is determined automatically from the sampling frequency and the lowest analyzable F0, ```f0_floor```. \nTo specify your own ```fft_size```, set ```f0_floor = 3.0 * fs / fft_size```. \nNote that decreasing ```fft_size``` raises ```f0_floor```, and a high ```f0_floor``` may be unsuitable for analyzing low-pitched (e.g. male) voices.\n\n* The ```Harvest``` F0 analysis is the slowest; it is sped up with ```numba``` and Python ```multiprocessing```, so the more cores you have, the faster it runs. You can also plug in your own F0 analysis; we support three F0 estimators: ```DIO```, ```HARVEST```, and ```SWIPE'```.\n\n\n# CITATION:\n\nIf you find the code helpful and want to cite it, please use:\n\nDinh, T., Kain, A., \u0026 Tjaden, K. (2019). Using a manifold vocoder for spectral voice and style conversion. 
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2019-September, 1388-1392.\n\n\n# CONTACT US\n******************\n\n\nPost your questions, suggestions, and discussions to GitHub Issues.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftuanad121%2Fpython-world","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftuanad121%2Fpython-world","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftuanad121%2Fpython-world/lists"}