{"id":23444055,"url":"https://github.com/sevagh/zen","last_synced_at":"2025-04-13T12:34:28.735Z","repository":{"id":54240590,"uuid":"308073169","full_name":"sevagh/Zen","owner":"sevagh","description":"optimized realtime harmonic/percussive source separation using the GPU (NVIDIA CUDA) and CPU (Intel IPP)","archived":false,"fork":false,"pushed_at":"2023-12-22T11:14:05.000Z","size":18675,"stargazers_count":22,"open_issues_count":0,"forks_count":3,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-04-05T01:01:53.024Z","etag":null,"topics":["audio","cuda","digital-signal-processing","dsp","real-time","source-separation","thrust"],"latest_commit_sha":null,"homepage":"","language":"Cuda","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sevagh.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-10-28T16:14:47.000Z","updated_at":"2024-12-26T06:27:47.000Z","dependencies_parsed_at":"2022-08-13T09:50:38.231Z","dependency_job_id":null,"html_url":"https://github.com/sevagh/Zen","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sevagh%2FZen","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sevagh%2FZen/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sevagh%2FZen/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sevagh%2FZen/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sevagh","download_url":"https://codeload.github.com/sevagh/Zen/tar.gz/refs/heads/master","host":{"name":"GitHub",
"url":"https://github.com","kind":"github","repositories_count":248714671,"owners_count":21149935,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["audio","cuda","digital-signal-processing","dsp","real-time","source-separation","thrust"],"created_at":"2024-12-23T18:26:17.701Z","updated_at":"2025-04-13T12:34:28.702Z","avatar_url":"https://github.com/sevagh.png","language":"Cuda","readme":"# Zen\n\nZen is a real-time capable, CUDA-accelerated harmonic/percussive source separation library, which implements:\n* Harmonic-percussive source separation using median filtering ([Fitzgerald 2010](http://dafx10.iem.at/papers/DerryFitzGerald_DAFx10_P15.pdf), [Drieger et al 2014](https://archives.ismir.net/ismir2014/paper/000127.pdf))\n* Steady-state/transient source separation using SSE (stochastic spectrum estimation) filtering ([Bayarres 2014](https://iie.fing.edu.uy/publicaciones/2014/Iri14/Iri14.pdf))\n\nNote that the Npp/Ipp FilterBox (moving average filter) functions in the SSE filtering case are not as well-behaved as the FilterMedian functions - use caution with the SSE implementation.\n\nZen was written from the ground up to support dual CPU/GPU implementations of algorithms by using policy-based template metaprogramming. For specialized subroutines (e.g. 
cuFFT, Npp/Ipp), there are abstraction wrappers.\n\n| Component | License | Description | Dependencies |\n|-----------|---------|-------------|--------------|\n| [libzen](./libzen/) | MIT | Core C++ library | IPP, CUDA Toolkit [+ gtest, benchmark for tests] |\n| [zen](./zen/) | MIT | Reference command-line tool | IPP, CUDA Toolkit, [libnyquist](https://github.com/ddiakopoulos/libnyquist), [clipp](https://github.com/muellan/clipp) |\n| [pitch-tracking demo](./demos/pitch-tracking) | MIT | Demo of real-time pitch tracking ([McLeod Pitch Method](http://www.cs.otago.ac.nz/tartini/papers/A_Smarter_Way_to_Find_Pitch.pdf)) with **harmonic separation pre-processing**. Includes an optimized implementation of MPM using IPP FFT | IPP, CUDA Toolkit, [libnyquist](https://github.com/ddiakopoulos/libnyquist) |\n| [beat-tracking demo](./demos/beat-tracking) | GPLv3 | Demo of real-time beat tracking ([BTrack](https://github.com/adamstark/BTrack)) with **percussive separation pre-processing**. Includes an optimized implementation of BTrack using IPP FFT. | IPP, CUDA Toolkit, [libnyquist](https://github.com/ddiakopoulos/libnyquist), [gcem](https://github.com/kthohr/gcem) |\n\n**NB** The algorithms are intended to be used with a causal real-time input stream. 
For simplicity, all real-time demo code uses offline wav files, but steps through them in hops to simulate real-time.\n\n## Block diagram\n\n![block1](./docs/block.png)\n\nZen is the fastest implementation of realtime median-filtering HPSS with a sliding causal STFT, first shown in https://github.com/sevagh/Real-Time-HPSS:\n\n\u003cimg src=\"./docs/rt_hpss_diagram.png\" width=678px\u003e\n\n## Example\n\nClick this to see an example on my Soundcloud page:\n\n[\u003cimg src=\"./docs/soundcloud.png\" width=600px\u003e](https://soundcloud.com/user-167126026/sets/harmonic-percussive-source-separation)\n\n## Performance\n\n1024-hop GPU HPR is the sweet spot of performance:\n\n![gpuvcpu](./docs/cpu_vs_gpu.png) ![gpuoverhead](./docs/gpu_overhead.png)\n\n## Quality of separation\n\nSee a newer project of mine, \u003chttps://github.com/sevagh/Music-Separation-TF\u003e, for some separation audio quality measurements. The new project further develops an idea for using the CQT instead of the STFT in the single-pass median-filtering HPSS algorithm for a higher quality separation.\n\nThis should be implementable in Zen, if a good CUDA NSGT or CQT library is found.\n\n## Origin\n\nThis is a follow-up to my project [Real-time Harmonic-Percussive Source Separation](https://github.com/sevagh/Real-Time-HPSS). In the previous project, I showed that Fitzgerald's 2010 algorithm for median-filtering harmonic-percussive source separation (and Driedger et al.'s subsequent 2014 modification) could be adapted to work in real-time. However, my simple MATLAB and Python implementations were too slow to be feasible (~5-10ms of processing per 10ms hop in a real-time stream).\n\nUsing CUDA and NPP to implement median-filtering-based HPR (harmonic-percussive-residual) separation, I got the computation time down to ~160us for a 10ms input buffer in this library, making it viable as an early stage in a real-time processing chain.\n\n## Usage\n\n### Build\n\nZen uses CMake (and is not simple to build). 
You need to adjust [CMakeLists.txt](./CMakeLists.txt) to specify locations for your custom GCC (for nvcc), CUDA toolkit libraries, and IPP libraries. I suggest using Ninja:\n\n```\n$ mkdir -p build \u0026\u0026 cd build \u0026\u0026 cmake .. -GNinja \u0026\u0026 ninja -j16\n```\n\n### libzen library examples\n\nThe [pitch-tracking main.cu](./demos/pitch-tracking/main.cu) and [beat-tracking main.cu](./demos/beat-tracking/main.cu) files show example usages of `HPRRealtime\u003cBackend::GPU\u003e` for creating real-time pure harmonic and pure percussive separations.\n\n### zen command-line tool usage\n\nThe [zen](./zen) command line tool implements all of the classes and algorithms of Zen:\n```\nusage:\n\n  zen offline -i, --input \u003cinfile\u003e [--hps [\u003chop-h\u003e] [\u003cbeta-h\u003e] [\u003chop-p\u003e] [\u003cbeta-p\u003e]] [-o,\n      --out-prefix \u003coutfile_prefix\u003e] [--cpu] [--sse] [--soft-mask] [--nocopybord]\n\n  zen fakert -i, --input \u003cinfile\u003e [--hps [\u003chop\u003e] [\u003cbeta\u003e]] [-o, --output \u003coutfile\u003e] [--cpu] [--sse]\n      [--soft-mask] [--nocopybord]\n\n  zen help | -h | --help\n  zen version | -v | --version\n```\n\nBy default, `beta` is the separation factor of Driedger et al.'s Harmonic-Percussive-Residual technique. If using `--soft-mask`, `beta` is the raised power of the Wiener soft mask. 
If using `--sse`, the parameter `beta` is ignored.\n\nExample of the iterative offline separation into 3 components, harmonic/percussive/residual:\n```\n$ ./zen offline --hps 4096 2.5 256 2.5 --input ../samples/mixed.wav --out-prefix offline-sep\nRunning zen-offline with the following params:\n        infile: ../samples/mixed.wav\n        outfile_prefix: offline-sep\n        do hps: yes\n                harmonic hop: 4096\n                harmonic beta: 2.5\n                percussive hop: 256\n                percussive beta: 2.5\n                mask: hard/binary\n                filter: median\n        compute: gpu (cuda/npp)\nAudio file info:\n        sample rate: 44100\n        len samples: 161571\n        frame size: 2\n        seconds: 3.66374\n        channels: 1\nProcessing input signal of size 161571 with HPR-I separation using harmonic params: 4096,2.5, percussive params: 256,2.5\nGPU/CUDA/thrust: 2-pass HPR-I-Offline took 487 ms\n$\n$ ls offline-sep*\noffline-sep_harm.wav  offline-sep_perc.wav  offline-sep_residual.wav\n```\n\nExample of fakert (aka \"fake-real-time\" using streaming wav files) separation into a single percussive component:\n```\n$ ./zen fakert --input ../samples/mixed.wav -o perc.wav --hps 256 2.5\nRunning zen-fakert with the following params:\n        infile: ../samples/mixed.wav\n        outfile: perc.wav\n        do hps: yes\n                hop: 256\n                beta: 2.5\n                mask: hard/binary\n                filter: median\n        compute: gpu (cuda/npp)\nAudio file info:\n        sample rate: 44100\n        len samples: 161571\n        frame size: 2\n        seconds: 3.66374\n        channels: 1\nSlicing buffer size 161571 into 631 chunks of size 256\nPRealtime GPU:  Δn = 256, Δt(ms) = 5.80499, average processing duration(us) = 173.99\n$\n$ ls perc.wav\nperc.wav\n```\n\n## Development\n\nI wrote Zen on Linux (Fedora 32) using GCC 8, CUDA Toolkit 10.2, and nvcc on an amd64 Ryzen host with an NVIDIA RTX 2070 
SUPER. All NVIDIA libraries were installed and managed using negativo17's Fedora nvidia repository.\n\nThere are unit tests in the libzen source tree. Memory and UB checks can be run during the test suite as follows. I favor asan over valgrind, but we need some special ASAN options to not clash with CUDA. I also try to use cuda-memcheck, but it slows execution down too much in some cases.\n\n```\n$ mkdir -p build \u0026\u0026 cd build \u0026\u0026 cmake .. -GNinja -DENABLE_UBSAN=ON -DENABLE_ASAN=ON\n$ ninja -j16\n$ export ASAN_OPTIONS=\"protect_shadow_gap=0:replace_intrin=0:detect_leaks=0\"\n$ ninja test\n```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsevagh%2Fzen","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsevagh%2Fzen","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsevagh%2Fzen/lists"}