https://github.com/sol1/blocklist-domains
A GitHub Action that builds a blocklist of domains in CDB format for dnsdist.
- Host: GitHub
- URL: https://github.com/sol1/blocklist-domains
- Owner: sol1
- License: mit
- Created: 2025-06-25T03:41:53.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2025-07-25T02:25:01.000Z (11 months ago)
- Last Synced: 2025-07-25T07:35:06.679Z (11 months ago)
- Language: Python
- Size: 23 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Sol1 Blocklist Aggregator
A fork of **[python-blocklist-aggregator](https://github.com/dmachard/python-blocklist-aggregator)** by dmachard.
This Python module aggregates several ads/tracking/malware lists and merges them into a single unified list with duplicates removed.
Use it to create your own list from several sources.
See the **[blocklist-domains](https://github.com/dmachard/blocklist-domains)** repository for an implementation.
Default sources are defined in the [configuration file](../main/blocklist_aggregator/blocklist.conf).
## Table of contents
* [Installation](#installation)
* [Get Started](#get-started)
* [Custom Configuration](#custom-configuration)
* [Fetch and save-it to files](#fetch-and-save-it-to-files)
## Installation
     
If you want to generate your own unified blocklist,
install this module with the pip command.
```bash
pip install blocklist_aggregator
```
## Get started
This basic example fetches a unified list of domains.
You can save it to a file or process it however you like.
```python
import blocklist_aggregator

# fetch and merge all configured sources
unified = blocklist_aggregator.fetch()

print(unified)
# [ "doubleclick.net", ..., "telemetry.dropbox.com" ]

print(len(unified))
# 152978
```
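Since the result is printed as an ordinary list of strings, saving it needs nothing beyond the standard library. The following is a minimal sketch, assuming `fetch()` returns a list of domain names; `write_domains` and the sample data are hypothetical and not part of the module.

```python
from pathlib import Path


def write_domains(domains, path):
    """Write one domain per line, sorted and de-duplicated; return the count."""
    unique = sorted(set(domains))
    Path(path).write_text("".join(f"{d}\n" for d in unique), encoding="utf-8")
    return len(unique)


# Stand-in data; in practice this would be the list returned by
# blocklist_aggregator.fetch().
sample = ["doubleclick.net", "telemetry.dropbox.com", "doubleclick.net"]
```

Calling `write_domains(sample, "unified_list.txt")` would write two lines and return `2`.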
## Custom configuration
See the default [configuration file](../main/blocklist_aggregator/blocklist.conf).
The configuration contains:
* the URLs of the ads/tracking/malware lists, each with the pattern (regex) to use
* the list of domains to exclude (whitelist)
* an additional list of domains to block (blacklist)
The configuration can be overridden at runtime:
```python
import blocklist_aggregator

# YAML snippet merged over the default configuration
cfg_yaml = "verbose: true"
unified = blocklist_aggregator.fetch(cfg_update=cfg_yaml)
```
or loaded from an external configuration file:
```python
import blocklist_aggregator

unified = blocklist_aggregator.fetch(cfg_filename="/home/custom-blocklist.conf")
```
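To make the three parts of the configuration concrete, the sketch below shows one way such an aggregation could work on text that has already been downloaded: a regex extracts domains from each source, a set removes duplicates, the whitelist is subtracted and the blacklist is added. The function name, the patterns and the sample data are all hypothetical; this is an illustration, not the module's actual implementation.

```python
import re


def aggregate(sources, whitelist=(), blacklist=()):
    """Merge domains from (text, pattern) pairs into one sorted list.

    Each pattern must contain one capture group matching the domain.
    """
    domains = set()
    for text, pattern in sources:
        for match in re.finditer(pattern, text, flags=re.MULTILINE):
            domains.add(match.group(1).lower())
    domains -= {d.lower() for d in whitelist}
    domains |= {d.lower() for d in blacklist}
    return sorted(domains)


# Hypothetical source contents: a hosts-style list and a plain list.
hosts_text = "0.0.0.0 ads.example.com\n0.0.0.0 tracker.example.net\n# comment\n"
plain_text = "ads.example.com\nmetrics.example.org\n"

sources = [
    (hosts_text, r"^0\.0\.0\.0\s+(\S+)$"),
    (plain_text, r"^([a-z0-9.-]+\.[a-z]{2,})$"),
]

unified = aggregate(
    sources,
    whitelist=["metrics.example.org"],
    blacklist=["malware.example.io"],
)
print(unified)
# ['ads.example.com', 'malware.example.io', 'tracker.example.net']
```

Here `ads.example.com` appears in both sources but only once in the result, the whitelisted domain is dropped, and the blacklisted one is added even though no source lists it.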
## Fetch and save-it to files
This module can export the list in several formats:
* text
* hosts
* map (for use with TinyCDB)
* CDB (key/value database)
```python
import blocklist_aggregator
# fetch domains
unified = blocklist_aggregator.fetch()
# save to a text file
blocklist_aggregator.save_raw(filename="/tmp/unified_list.txt")
# save to hosts file
blocklist_aggregator.save_hosts(filename="/tmp/unified_hosts.txt", ip="0.0.0.0")
# save to a text file in map format
blocklist_aggregator.save_map(filename="/tmp/unified_map.txt")
# save to CDB
blocklist_aggregator.save_cdb(filename="/tmp/unified_domains.cdb")
```
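For readers unfamiliar with the hosts format, each line maps an IP address to a hostname, so listing a domain next to `0.0.0.0` sends lookups for it to a non-routable address. A rough illustration of that layout follows; `to_hosts_lines` is a hypothetical helper, and the exact output of `save_hosts` (header, ordering) may differ.

```python
def to_hosts_lines(domains, ip="0.0.0.0"):
    """Return hosts-file lines of the form '<ip> <domain>'."""
    return [f"{ip} {domain}" for domain in domains]


for line in to_hosts_lines(["doubleclick.net", "telemetry.dropbox.com"]):
    print(line)
# 0.0.0.0 doubleclick.net
# 0.0.0.0 telemetry.dropbox.com
```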
## For developers
Run the unit tests:
```bash
python3 -m unittest discover tests/ -v
```