{"id":13532856,"url":"https://github.com/funcwj/setk","last_synced_at":"2025-04-01T21:31:18.593Z","repository":{"id":30052516,"uuid":"123781847","full_name":"funcwj/setk","owner":"funcwj","description":"Tools for Speech Enhancement integrated with Kaldi","archived":false,"fork":false,"pushed_at":"2023-07-06T22:59:55.000Z","size":38095,"stargazers_count":396,"open_issues_count":2,"forks_count":92,"subscribers_count":22,"default_branch":"master","last_synced_at":"2024-11-02T20:31:58.358Z","etag":null,"topics":["beamforming","kaldi","rir-generator","speech","speech-enhancement","speech-separation","time-frequency-masking"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/funcwj.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2018-03-04T11:24:40.000Z","updated_at":"2024-10-11T08:50:59.000Z","dependencies_parsed_at":"2022-08-07T15:00:43.952Z","dependency_job_id":"49e2fece-413b-4943-9e2c-35ef1f2969df","html_url":"https://github.com/funcwj/setk","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/funcwj%2Fsetk","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/funcwj%2Fsetk/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/funcwj%2Fsetk/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/funcwj%2Fsetk/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/funcwj","download_url":"https://codeload.github.com/funcwj/setk/tar
.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246713071,"owners_count":20821836,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["beamforming","kaldi","rir-generator","speech","speech-enhancement","speech-separation","time-frequency-masking"],"created_at":"2024-08-01T07:01:14.332Z","updated_at":"2025-04-01T21:31:13.583Z","avatar_url":"https://github.com/funcwj.png","language":"Python","funding_links":[],"categories":["Tools"],"sub_categories":["Coming soon...","BSS/ICA method"],"readme":"## SETK: Speech Enhancement Tools integrated with Kaldi\n\nHere are some speech enhancement/separation tools integrated with [Kaldi](https://github.com/kaldi-asr/kaldi). 
I use them for front-end data processing.\n\n### Python Scripts\n\n* Supervised (mask-based) adaptive beamformer (GEVD/MVDR/MCWF...)\n* Data conversion among MATLAB, NumPy and Kaldi\n* Data visualization (TF-mask, spatial/spectral features, beam pattern...)\n* Unified data and IO handlers for Kaldi's scripts, archives, wave and NumPy's ndarray...\n* Unsupervised mask estimation (CGMM/CACGMM)\n* Spatial/Spectral feature computation\n* DS (delay and sum) beamformer, SD (super-directive) beamformer\n* AuxIVA, WPE \u0026 WPD, FB (Fixed Beamformer)\n* Mask computation (iam, irm, ibm, psm, crm)\n* RIR simulation (1D/2D arrays)\n* Single channel speech separation (TF spectral masking)\n* Si-SDR/SDR/WER evaluation\n* Pywebrtc vad wrapper\n* Mask-based source localization\n* Noise suppression\n* Data simulation\n* ...\n\nPlease check out the following instructions for usage of the scripts.\n\n* [Adaptive Beamformer](doc/adaptive_beamformer)\n* [Fixed Beamformer](doc/fixed_beamformer)\n* [Sound Source Localization](doc/ssl)\n* [Spectral Feature](doc/spectral_feature)\n* [Spatial Feature](doc/spatial_feature)\n* [VAD](doc/vad)\n* [Noise Suppression](doc/ns)\n* [Steer Vector](doc/steer_vector)\n* [Room Impulse Response](doc/rir)\n* [Spatial Clustering](doc/spatial_clustering)\n* [WPE \u0026 WPD](doc/wpe)\n* [Time-frequency Mask](doc/tf_mask)\n* [Format Transform](doc/format_transform)\n* [Data Simulation](doc/data_simu)\n\n### Kaldi Commands\n\n* Compute time-frequency masks (ibm, irm, etc.)\n* Compute phase \u0026 magnitude spectrogram \u0026 complex STFT\n* Separate target component using input masks\n* Wave reconstruction from enhanced spectral features\n* Complex matrix/vector class\n* MVDR/GEVD beamformer (depends on T-F mask, not very stable)\n* Fixed beamformer\n* Compute angular spectrogram based on SRP-PHAT\n* RIR generator (with reference to [RIR-Generator](https://github.com/ehabets/RIR-Generator))\n\nTo build the sources, you need to compile 
[Kaldi](https://github.com/kaldi-asr/kaldi) with the `--shared` flag and patch `matrix/matrix-common.h` first:\n```c++\ntypedef enum {\n    kTrans          = 112,  // CblasTrans\n    kNoTrans        = 111,  // CblasNoTrans\n    kConjTrans      = 113,  // CblasConjTrans\n    kConjNoTrans    = 114   // CblasConjNoTrans\n} MatrixTransposeType;\n```\n\nThen run:\n```bash\nmkdir build\ncd build\nexport KALDI_ROOT=/path/to/kaldi/root\nexport OPENFST_ROOT=/path/to/openfst/root\n# on UNIX, Kaldi needs to be compiled with OpenBLAS\nexport OPENBLAS_ROOT=/path/to/openblas/root\ncmake ..\nmake -j\n```\n\n***I now mainly work on the [sptk](scripts) package; development based on Kaldi has stopped.***\n\nFor developers (who want to make commits or PRs), please remember to set up [pre-commit](https://pre-commit.com) for code style formatting.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffuncwj%2Fsetk","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffuncwj%2Fsetk","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffuncwj%2Fsetk/lists"}