{"id":15640113,"url":"https://github.com/erogol/fftnet","last_synced_at":"2025-06-27T10:06:50.424Z","repository":{"id":145077485,"uuid":"138598438","full_name":"erogol/FFTNet","owner":"erogol","description":"FFTNet vocoder implementation","archived":false,"fork":false,"pushed_at":"2018-09-28T17:33:52.000Z","size":774,"stargazers_count":81,"open_issues_count":1,"forks_count":8,"subscribers_count":14,"default_branch":"master","last_synced_at":"2025-04-30T07:36:27.924Z","etag":null,"topics":["deep-learning","fftnet","pytorch","text2speech","vocoder"],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mpl-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/erogol.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-06-25T13:25:40.000Z","updated_at":"2024-01-04T16:24:08.000Z","dependencies_parsed_at":null,"dependency_job_id":"d090fecd-005d-488e-945a-f667ed4a629c","html_url":"https://github.com/erogol/FFTNet","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/erogol/FFTNet","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/erogol%2FFFTNet","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/erogol%2FFFTNet/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/erogol%2FFFTNet/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/erogol%2FFFTNet/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/erogol","download_url":"https:/
/codeload.github.com/erogol/FFTNet/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/erogol%2FFFTNet/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":262235782,"owners_count":23279566,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","fftnet","pytorch","text2speech","vocoder"],"created_at":"2024-10-03T11:31:11.506Z","updated_at":"2025-06-27T10:06:50.386Z","avatar_url":"https://github.com/erogol.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Unofficial implementation of the [FFTNet vocoder](http://gfx.cs.princeton.edu/pubs/Jin_2018_FAR/fftnet-jin2018.pdf) paper.\n\n- [x] implement the model.\n- [x] implement tests.\n- [x] overfit on a single batch (sanity check).\n- [x] linearize weights for eval time.\n- [x] measure the run-time on GPU and CPU (1 sec of audio takes ~47 secs). If anyone knows additional tricks from the paper, let me know. I have asked the authors, but nobody has replied so far.\n- [ ] train on LJSpeech spectrograms.\n- [ ] distill the model as in the Parallel WaveNet paper.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ferogol%2Ffftnet","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ferogol%2Ffftnet","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ferogol%2Ffftnet/lists"}