{"id":13430291,"url":"https://github.com/xiph/rnnoise","last_synced_at":"2025-05-14T13:05:41.273Z","repository":{"id":37953057,"uuid":"99199151","full_name":"xiph/rnnoise","owner":"xiph","description":"Recurrent neural network for audio noise reduction","archived":false,"fork":false,"pushed_at":"2025-02-22T23:58:31.000Z","size":910,"stargazers_count":4495,"open_issues_count":193,"forks_count":936,"subscribers_count":149,"default_branch":"main","last_synced_at":"2025-04-10T03:47:22.816Z","etag":null,"topics":["audio","c","noise-reduction","rnn"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/xiph.png","metadata":{"files":{"readme":"README","changelog":null,"contributing":null,"funding":null,"license":"COPYING","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS","dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2017-08-03T06:34:37.000Z","updated_at":"2025-04-09T15:08:05.000Z","dependencies_parsed_at":"2022-07-14T05:20:29.201Z","dependency_job_id":"4315c0e4-dc4a-4cd4-bda2-bad60c1d7977","html_url":"https://github.com/xiph/rnnoise","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xiph%2Frnnoise","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xiph%2Frnnoise/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xiph%2Frnnoise/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xiph%2Frnnoise/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/xiph","download_url":"https://codeload.github.com/xip
h/rnnoise/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254149903,"owners_count":22022851,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["audio","c","noise-reduction","rnn"],"created_at":"2024-07-31T02:00:51.892Z","updated_at":"2025-05-14T13:05:41.205Z","avatar_url":"https://github.com/xiph.png","language":"C","funding_links":[],"categories":["C","Audio","Table of Contents","Speech Enhancement \u0026 Audio Processing","7. Audio enhancement and noise suppression"],"sub_categories":["VST Plugins","Noise Suppression \u0026 Enhancement","Voice-specific prompting and tools"],"readme":"RNNoise is a noise suppression library based on a recurrent neural network.\nA description of the algorithm is provided in the following paper:\n\nJ.-M. 
Valin, A Hybrid DSP/Deep Learning Approach to Real-Time Full-Band Speech\nEnhancement, Proceedings of IEEE Multimedia Signal Processing (MMSP) Workshop,\narXiv:1709.08243, 2018.\nhttps://arxiv.org/pdf/1709.08243.pdf\n\nAn interactive demo of version 0.1 is available at: https://jmvalin.ca/demo/rnnoise/\n\nTo compile, just type:\n% ./autogen.sh\n% ./configure\n% make\n\nOptionally:\n% make install\n\nIt is recommended to either set -march= in the CFLAGS to an architecture\nwith AVX2 support or to add --enable-x86-rtcd to the configure script\nso that AVX2 (or SSE4.1) can at least be used as an option.\nNote that the autogen.sh script will automatically download the model files\nfrom the Xiph.Org servers, since those are too large to put in Git.\n\nWhile it is meant to be used as a library, a simple command-line tool is\nprovided as an example. It operates on RAW 16-bit (machine endian) mono\nPCM files sampled at 48 kHz. It can be used as:\n\n% ./examples/rnnoise_demo \u003cnoisy speech\u003e \u003coutput denoised\u003e\n\nThe output is also a 16-bit raw PCM file.\nNOTE AGAIN, THE INPUT and OUTPUT ARE IN RAW FORMAT, NOT WAV.\n\nThe latest version of the source is available from\nhttps://gitlab.xiph.org/xiph/rnnoise .  The GitHub repository\nis a convenience copy.\n\n== Training ==\n\nThe models distributed with RNNoise are now trained using only the publicly\navailable datasets listed below and using the training procedure described\nhere. 
Exact results will still depend on the exact mix of data used,\non how long the training is performed and on the various random seeds involved.\n\nTo train an RNNoise model, you need both clean speech data and noise data.\nBoth need to be sampled at 48 kHz, in 16-bit PCM format (machine endian).\nClean speech data can be obtained from the datasets listed in the datasets.txt\nfile, or by downloading the already-concatenated version of those files from\nhttps://media.xiph.org/rnnoise/data/tts_speech_48k.sw\nFor noise data, we suggest the background_noise.sw and foreground_noise.sw\n(or later versions) noise files from https://media.xiph.org/rnnoise/data/\nThe foreground_noise.sw file contains noise signals that are meant to be added\nto the background noise (e.g. keyboard sounds). Optionally, the foreground noise\nfile can even be denoised with a traditional denoiser (e.g. libspeexdsp) to\nkeep only the transient components. For background noise, the data from the\noriginal RNNoise noise collection have now been sufficiently filtered to\nprovide good results -- either alone or in combination with the\nbackground_noise.sw file. The dataset can be downloaded (updated Jan 30th 2025)\nfrom: https://media.xiph.org/rnnoise/rnnoise_contributions.tar.gz\n\nThe first step is to take the speech and noise, and mix them in a variety of\nways to simulate real life conditions (including pauses, filtering and more).\nAssuming the files are called speech.pcm, background_noise.pcm and\nforeground_noise.pcm, start by generating the training feature data with:\n\n% ./dump_features speech.pcm background_noise.pcm foreground_noise.pcm features.f32 \u003ccount\u003e\nwhere \u003ccount\u003e is the number of sequences to process. The number of sequences\nshould be at least 10000, but the more the better (200000 or more is\nrecommended).\n\nOptionally, training can also simulate reverberation, in which case room impulse\nresponses (RIR) are also needed. 
Limited RIR data is available at:\nhttps://media.xiph.org/rnnoise/data/measured_rirs-v2.tar.gz\nThe format for those is raw 32-bit floating-point (files are little endian).\nAssuming a list of all the RIR files is contained in a rir_list.txt file,\nthe training feature data can be generated with:\n\n% ./dump_features -rir_list rir_list.txt speech.pcm background_noise.pcm foreground_noise.pcm features.f32 \u003ccount\u003e\n\nTo make the feature generation faster, you can use the script provided in\nscript/dump_features_parallel.sh (you will need to modify the script if you\nwant to add RIR augmentation).\n\nTo use it:\n% script/dump_features_parallel.sh ./dump_features speech.pcm background_noise.pcm foreground_noise.pcm features.f32 \u003ccount\u003e rir_list.txt\nwhich will run nb_processes processes, each for count sequences, and\nconcatenate the output to a single file.\n\nOnce the feature file is computed, you can start the training with:\n% python3 train_rnnoise.py features.f32 output_directory\n\nChoose a number of epochs (using --epochs) that leads to about 75000 weight\nupdates. The training will produce .pth files, e.g. rnnoise_50.pth .\nThe next step is to convert the model to C files using:\n\n% python3 dump_rnnoise_weights.py --quantize rnnoise_50.pth rnnoise_c\n\nwhich will produce the rnnoise_data.c and rnnoise_data.h files in the\nrnnoise_c directory.\n\nCopy these files to src/ and then build RNNoise using the instructions above.\n\nFor slightly better results, a trained model can be used to remove any noise\nfrom the \"clean\" training speech, before restarting the denoising process\nagain (no need to do that more than once).\n\n== Loadable Models ==\n\nThe model format has changed since v0.1.1. Models now use a binary\n\"machine endian\" format. To output a model in that format, build RNNoise\nwith that model and use the dump_weights_blob executable to output a\nweights_blob.bin binary file. 
That file can then be used with the\nrnnoise_model_from_file() API call. Note that the model object MUST NOT\nbe deleted while the RNNoise state is active and the file MUST NOT\nbe closed.\n\nTo avoid including the default model in the build (e.g. to reduce download\nsize) and rely only on model loading, add -DUSE_WEIGHTS_FILE to the CFLAGS.\nTo be able to load different models, the model size (and header file) needs\nto match the size used during build. Otherwise the model will not load.\nWe provide a \"little\" model with half as an alternative. To use the smaller\nmodel, rename rnnoise_data_little.c to rnnoise_data.c. It is possible\nto build both the regular and little binary weights and load any of them\nat run time since the little model has the same size as the regular one\n(except for the increased sparsity).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxiph%2Frnnoise","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fxiph%2Frnnoise","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxiph%2Frnnoise/lists"}