{"id":13397815,"url":"https://github.com/adblockradio/adblockradio","last_synced_at":"2025-09-27T10:31:33.658Z","repository":{"id":50362594,"uuid":"140282973","full_name":"adblockradio/adblockradio","owner":"adblockradio","description":"An adblocker for live radio streams and podcasts. Machine learning meets Shazam.","archived":true,"fork":false,"pushed_at":"2021-04-10T14:07:10.000Z","size":9143,"stargazers_count":1482,"open_issues_count":12,"forks_count":65,"subscribers_count":23,"default_branch":"master","last_synced_at":"2024-09-21T17:14:00.139Z","etag":null,"topics":["adblock","adblockradio","ai","keras","machine-learning","shazam","signal-processing","tensorflow"],"latest_commit_sha":null,"homepage":"https://www.adblockradio.com","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mpl-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/adblockradio.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-07-09T12:35:43.000Z","updated_at":"2024-09-19T08:38:48.000Z","dependencies_parsed_at":"2022-09-11T02:01:51.044Z","dependency_job_id":null,"html_url":"https://github.com/adblockradio/adblockradio","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/adblockradio%2Fadblockradio","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/adblockradio%2Fadblockradio/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/adblockradio%2Fadblockradio/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/adblockradio%2Fadblockradio/manifests","owner_url":"https://repos.eco
syste.ms/api/v1/hosts/GitHub/owners/adblockradio","download_url":"https://codeload.github.com/adblockradio/adblockradio/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":219871973,"owners_count":16554475,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["adblock","adblockradio","ai","keras","machine-learning","shazam","signal-processing","tensorflow"],"created_at":"2024-07-30T18:01:46.709Z","updated_at":"2025-09-27T10:31:27.815Z","avatar_url":"https://github.com/adblockradio.png","language":"JavaScript","funding_links":["https://liberapay.com/asto/donate"],"categories":["JavaScript","Ad Blockers","Git Blocklist Repos","Networking"],"sub_categories":["Notable Mentions","Browser and Email Blockers","Ad Blockers"],"readme":"# Adblock Radio (project archived)\n\n![Adblock Radio](https://www.adblockradio.com/assets/img/abr_buddha_v3_175.png)\n\nA library to block ads on live radio streams and podcasts. 
Machine learning meets Shazam.\n\nEngine of [AdblockRadio.com](https://www.adblockradio.com).\nDemo standalone player [available here](https://github.com/adblockradio/buffer-player).\n\nBuild status:\n[![CircleCI](https://circleci.com/gh/adblockradio/adblockradio.svg?style=svg)](https://circleci.com/gh/adblockradio/adblockradio)\n\nHelp the project grow:\n\u003ca href=\"https://liberapay.com/asto/donate\"\u003e\u003cimg alt=\"Donate using Liberapay\" src=\"https://liberapay.com/assets/widgets/donate.svg\"\u003e\u003c/a\u003e\n\n## Overview\nA technical discussion is available [here](https://www.adblockradio.com/blog/2018/11/15/designing-audio-ad-block-radio-podcast/).\n\nRadio streams are downloaded in `predictor.js` with the module [adblockradio/stream-tireless-baler](https://github.com/adblockradio/stream-tireless-baler). Podcasts are downloaded in `predictor-file.js`.\n\nIn both cases, audio is then decoded to single-channel, `22050 Hz` PCM with `ffmpeg`.\n\nChunks of about one second of PCM audio are piped into two sub-modules:\n- a time-frequency analyser (`predictor-ml/ml.js`), which analyses spectral content with a neural network.\n- a fingerprint matcher (`predictor-db/hotlist.js`), which searches for exact occurrences of known ads, music tracks or jingles.\n\nIn `post-processing.js`, results are gathered for each audio segment and cleaned.\n\nA Readable interface, `Analyser`, is exposed to the end user. It streams objects containing the audio itself and all analysis results.\n\nOn a regular laptop CPU and with the Python time-frequency analyser, computations run at 5-10x real-time speed for files and use 10-20% of the CPU for a live stream.\n\n## Getting started\n\n### Installation\n\n##### Mandatory prerequisites:\nYou need Node.js (\u003e= v10.12.x, but \u003c 11) and NPM. Download it [here](https://nodejs.org/en/download/). 
Pro-tip: to manage several node versions on your platform, use [NVM](https://github.com/creationix/nvm).\n\nOn Debian Stretch:\n```bash\napt-get install -y git ssh tar gzip ca-certificates build-essential sqlite3 ffmpeg\n```\nNote: works on Jessie, but installing ffmpeg is a bit painful. See [here](https://ffmpeg.org/download.html) and [there](https://superuser.com/questions/286675/how-to-install-ffmpeg-on-debian).\n\n##### Optional prerequisites:\nFor best performance (~2x speedup) you should choose to do part of the computations with Python. Additional prerequisites are the following: Python (tested with v2.7.9), [Keras](https://keras.io/#installation) (tested with v2.0.8) and [Tensorflow](https://www.tensorflow.org/install/) (tested with CPU v1.4.0 and GPU v1.3.0).\n\nOn Debian:\n```bash\napt-get install python-dev portaudio19-dev\npip install python_speech_features h5py numpy scipy keras tensorflow zerorpc sounddevice psutil\n```\nNote: if you do not have pip [follow these instructions to install it](https://pip.pypa.io/en/stable/installing/).\n\n##### Then install this module:\n```bash\ngit clone https://github.com/adblockradio/adblockradio.git\ncd adblockradio\nnpm install\n```\n\n### Testing\n\nValidate your installation with the test suite:\n\n```\nnpm test\n```\n\n### Command-line demo\n\nAt startup and periodically during runtime, filter configuration files are automatically updated from [adblockradio.com/models/](https://adblockradio.com/models/):\n- a compatible machine-learning model (`model.keras` or `model.json` + `group1-shard1of1`), for the time-frequency analyser.\n- a fingerprint database (`hotlist.sqlite`), for the fingerprint matcher.\n\n#### Live stream analysis\nRun the demo on French RTL live radio stream:\n```bash\nnode demo.js\n```\n\nHere is a sample output of the demo script, showing an ad detected:\n```\n{\n\t\"gain\": 74.63,\n\t\"ml\": {\n\t\t\"class\": \"0-ads\",\n\t\t\"softmaxraw\": 
[\n\t\t\t0.996,\n\t\t\t0.004,\n\t\t\t0\n\t\t],\n\t\t\"softmax\": [\n\t\t\t0.941,\n\t\t\t0.02,\n\t\t\t0.039\n\t\t],\n\t\t\"slotsFuture\": 4,\n\t\t\"slotsPast\": 5\n\t},\n\t\"hotlist\": {\n\t\t\"class\": \"9-unsure\",\n\t\t\"file\": null,\n\t\t\"matches\": 1,\n\t\t\"total\": 7\n\t},\n\t\"class\": \"0-ads\",\n\t\"metadata\": {\n\t\t\"artist\": \"Laurent Ruquier\",\n\t\t\"title\": \"L'été des Grosses Têtes\",\n\t\t\"cover\": \"https://cdn-media.rtl.fr/cache/wQofzw9SfgHNHF1rqJA3lQ/60v73-2/online/image/2014/0807/7773631957_laurent-ruquier.jpg\"\n\t},\n\t\"streamInfo\": {\n\t\t\"url\": \"http://streaming.radio.rtl.fr/rtl-1-44-128\",\n\t\t\"favicon\": \"https://cdn-static.rtl.fr/versions/www/6.0.637/img/apple-touch-icon.png\",\n\t\t\"homepage\": \"http://www.rtl.fr/\",\n\t\t\"audioExt\": \"mp3\"\n\t},\n\t\"predictorStartTime\": 1531150137583,\n\t\"playTime\": 1531150155250,\n\t\"tBuffer\": 15.98,\n\t\"audio\": ...\n}\n```\n\n#### Podcast analysis\nIt is also possible to analyse radio recordings.\nRun the demo on a recording of French RTL radio, including ads, talk and music:\n```bash\nnode demo-file.js\n```\n\nGradual outputs are similar to those of live stream analysis. An additional post-processing specific to recordings hides the uncertainties in predictions and shows big chunks for each class, with time stamps in milliseconds, making it ready for slicing.\n```\n[\n\t{\n\t\t\"class\": \"1-speech\",\n\t\t\"tStart\": 0,\n\t\t\"tEnd\": 58500\n\t},\n\t{\n\t\t\"class\": \"0-ads\",\n\t\t\"tStart\": 58500,\n\t\t\"tEnd\": 125500\n\t},\n\t{\n\t\t\"class\": \"1-speech\",\n\t\t\"tStart\": 125500,\n\t\t\"tEnd\": 218000\n\t},\n\t{\n\t\t\"class\": \"2-music\",\n\t\t\"tStart\": 218000,\n\t\t\"tEnd\": 250500\n\t},\n\t{\n\t\t\"class\": \"1-speech\",\n\t\t\"tStart\": 250500,\n\t\t\"tEnd\": 472949\n\t}\n]\n```\nNote that when analyzing audio files, you still need to provide the name of a radio stream, because the algorithm has to load acoustic parameters and DB of known samples. 
Analysis of podcasts not tied to a radio is not yet supported, but may be in the future.\n\n## Documentation\n\n### Usage\n\nBelow is a simple usage example. More thorough usage examples are available in the tests:\n- file/podcast analysis: `test/file.js`\n- live stream analysis: `test/online.js`\n- record a live stream, analyse it later: `test/offline.js`\n\n```javascript\nconst { Analyser } = require(\"adblockradio\");\n\nconst abr = new Analyser({\n\tcountry: \"France\",\n\tname: \"RTL\",\n\tconfig: {\n\t\t...\n\t}\n});\n\nabr.on(\"data\", function(obj) {\n\t...\n});\n```\n\nProperty|Description|Default\n--------|-----------|-------\n`country`|Country of the radio stream according to [radio-browser.info](http://www.radio-browser.info)|None\n`name`|Name of the radio stream according to [radio-browser.info](http://www.radio-browser.info)|None\n`file`|File to analyse (optional, analyse the live stream otherwise)|None\n\n### Methods\n\nAcoustic model and hotlist files are refreshed automatically on startup. If you plan to run the algorithm continuously for a long time, you can trigger manual updates. 
Note that these methods are only available in live stream analysis mode.\n\nMethod|Parameters|Description\n------|----------|-----------\n`refreshPredictorMl`|None|Manually refresh the ML model (live stream only)\n`refreshPredictorHotlist`|None|Manually refresh the hotlist DB (live stream only)\n`refreshMetadata`|None|Manually refresh the [metadata scraper](https://github.com/adblockradio/webradio-metadata) (live stream only)\n`stopDl`|None|Stop Adblock Radio (live stream only)\n\n\n### Optional configuration\nProperties marked with a `*` are meant to be used only with live radio stream analysis. They are ignored in file analysis.\n\n#### Scheduling\n\nProperty|Description|Default\n--------|-----------|-------\n`predInterval`|Send stream status to listener every N seconds|`1`\n`saveDuration*`|If enabled, save audio file and metadata every N `predInterval` times|`10`\n`modelUpdatesInterval`|If enabled, update model files every N minutes|`60`\n\n#### Switches\n\nProperty|Description|Periodicity|Default\n--------|-----------|-----------|-------\n`enablePredictorMl`|Perform machine learning inference|`predInterval`|`true`\n`JSPredictorMl`|Use tfjs instead of Python for ML inference (slower)| |`false`\n`enablePredictorHotlist`|Compute audio fingerprints and search them in a DB|`predInterval`|`true`\n`saveAudio*`|Save stream audio data in segments on hard drive|`saveDuration`|`true`\n`saveMetadata`|Save a JSON with predictions|`saveDuration`|`true`\n`fetchMetadata*`|Gather metadata from radio websites|`saveDuration`|`true`\n`modelUpdates`|Keep ML and hotlist files up to date|`modelUpdatesInterval`|`true`\n\n#### Paths\n\nProperty|Description|Default\n--------|-----------|-------\n`modelPath`|Directory where ML models and hotlist DBs are stored|`process.cwd() + '/model'`\n`modelFile`|Path of ML file relative to `modelPath`|`country + '_' + name + '/model.keras'`\n`hotlistFile`|Path of the hotlist DB relative to `modelPath`|`country + '_' + name + 
'/hotlist.sqlite'`\n`saveAudioPath*`|Root folder where audio and metadata are saved|`process.cwd() + '/records'`\n\n### Output\n\nReadable streams constructed with `Analyser` emit objects with the following properties. Some properties are only available when doing live radio analysis. They are marked with a `*`. Others, specific to file analysis, are marked with `**`.\n\n- `audio*`: Buffer containing a chunk of original (compressed) audio data.\n\n- `ml`: `null` if not available, otherwise an object containing the results of the time-frequency analyser.\n  * `softmaxraw`: an array of three numbers representing the [softmax](https://en.wikipedia.org/wiki/Softmax_function) between ads, speech and music.\n  * `softmax`: same as `softmaxraw`, but smoothed in time with `slotsFuture` data points in the future and `slotsPast` data points in the past. Smoothing weights are defined by `consts.MOV_AVG_WEIGHTS` in [`post-processing.js`](https://github.com/adblockradio/adblockradio/blob/master/post-processing.js).\n  * `class`: the classification according to `softmax`. Either `0-ads`, `1-speech`, `2-music` or `9-unsure`.\n\n- `hotlist`: `null` if not available, otherwise an object containing the results of the fingerprint matcher.\n  * `file`: if `class` is not `9-unsure`, the reference of the file recognized.\n  * `total`: number of fingerprints computed for the given audio segment.\n  * `matches`: number of matching fingerprints between the audio segment and the fingerprint database.\n  * `class`: either `0-ads`, `1-speech`, `2-music`, `3-jingles` or `9-unsure` if not enough matches have been found.\n\n- `class`: final prediction of the algorithm. Either `0-ads`, `1-speech`, `2-music`, `3-jingles` or `9-unsure`.\n\n- `metadata*`: live metadata, fetched and parsed by the module [adblockradio/webradio-metadata](https://github.com/adblockradio/webradio-metadata).\n\n- `streamInfo*`: static metadata about the stream. 
Contains stream `url`, `favicon`, `bitrate` in bytes / s, audio files extension `audioExt` (`mp3` or `aac`) and `homepage` URL.\n\n- `gain`: a [dB](https://en.wikipedia.org/wiki/Decibel) value representing the average volume of the stream. Useful if you wish to normalize the playback volume. Calculated by [`mlpredict.py`](https://github.com/adblockradio/adblockradio/blob/master/predictor-ml/mlpredict.py).\n\n- `tBuffer*`: seconds of audio buffer. Calculated by [adblockradio/stream-tireless-baler](https://github.com/adblockradio/stream-tireless-baler).\n\n- `predictorStartTime*`: timestamp of the algorithm startup. Useful to get the uptime.\n\n- `playTime*`: approximate timestamp of when the given audio is to be played. TODO check this.\n\n- `tStart**`: lower boundary of the time interval linked with the prediction (in milliseconds)\n\n- `tEnd**`: upper boundary of the time interval linked with the prediction (in milliseconds)\n\n## Supported radios\nThe list of supported radios is [available here](https://github.com/adblockradio/available-models).\n\n### Note to developers\nIntegrations of this module are welcome. Suggestions are available [here](https://www.adblockradio.com/blog/2018/11/15/designing-audio-ad-block-radio-podcast/#product-design).\n\nA standalone demo player for web browsers is [available here](https://github.com/adblockradio/buffer-player).\n\n## License\nSee LICENSE file.\n\nYour contribution to this project is welcome, but might be subject to a contributor's license agreement.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fadblockradio%2Fadblockradio","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fadblockradio%2Fadblockradio","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fadblockradio%2Fadblockradio/lists"}