{"id":13582943,"url":"https://github.com/AdguardTeam/HostlistCompiler","last_synced_at":"2025-04-06T18:31:43.163Z","repository":{"id":37519936,"uuid":"246633813","full_name":"AdguardTeam/HostlistCompiler","owner":"AdguardTeam","description":"A simple tool that compiles hosts blocklists from multiple sources","archived":false,"fork":false,"pushed_at":"2024-04-08T09:19:27.000Z","size":159,"stargazers_count":133,"open_issues_count":15,"forks_count":18,"subscribers_count":19,"default_branch":"master","last_synced_at":"2024-04-14T05:58:00.207Z","etag":null,"topics":["adguard-home","filters","javascript","open-source"],"latest_commit_sha":null,"homepage":null,"language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AdguardTeam.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-03-11T17:15:24.000Z","updated_at":"2024-04-27T14:07:51.417Z","dependencies_parsed_at":"2024-04-27T14:07:31.158Z","dependency_job_id":"53b96e4f-e617-47e2-8dcc-e1c26945bb35","html_url":"https://github.com/AdguardTeam/HostlistCompiler","commit_stats":{"total_commits":58,"total_committers":8,"mean_commits":7.25,"dds":0.5172413793103448,"last_synced_commit":"ec9cd697a3157433d6c56e3d500eb256b868f346"},"previous_names":[],"tags_count":11,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AdguardTeam%2FHostlistCompiler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AdguardTeam%2FHostlistCompiler/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitH
ub/repositories/AdguardTeam%2FHostlistCompiler/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AdguardTeam%2FHostlistCompiler/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/AdguardTeam","download_url":"https://codeload.github.com/AdguardTeam/HostlistCompiler/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247530940,"owners_count":20953875,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["adguard-home","filters","javascript","open-source"],"created_at":"2024-08-01T15:03:08.829Z","updated_at":"2025-04-06T18:31:43.156Z","avatar_url":"https://github.com/AdguardTeam.png","language":"JavaScript","funding_links":[],"categories":["JavaScript"],"sub_categories":[],"readme":"# Hostlist compiler\n\n[![NPM](https://nodei.co/npm/@adguard/hostlist-compiler.png?compact=true)](https://www.npmjs.com/package/@adguard/hostlist-compiler/)\n\nThis is a simple tool that makes it easier to compile a [hosts blocklist](https://adguard-dns.io/kb/general/dns-filtering-syntax/) compatible with AdGuard Home or any other AdGuard product with **DNS filtering**.\n\n- [Usage](#usage)\n  - [Configuration](#configuration)\n  - [Command-line](#command-line)\n  - [API](#api)\n- [Transformations](#transformations)\n  - [RemoveComments](#remove-comments)\n  - [Compress](#compress)\n  - [RemoveModifiers](#remove-modifiers)\n  - [Validate](#validate)\n  - [ValidateAllowIp](#validate-allow-ip)\n  - [Deduplicate](#deduplicate)\n  - [InvertAllow](#invertallow)\n  - 
[RemoveEmptyLines](#removeemptylines)\n  - [TrimLines](#trimlines)\n  - [InsertFinalNewLine](#insertfinalnewline)\n  - [ConvertToAscii](#convert-to-ascii)\n- [How to build](#how-to-build)\n\n## \u003ca name=\"usage\"\u003e\u003c/a\u003e Usage\n\nFirst of all, install the `hostlist-compiler`:\n\n```bash\nnpm i -g @adguard/hostlist-compiler\n```\n\nAfter that you have two options.\n\n**Quick hosts conversion**\n\nConvert and compress a `/etc/hosts`-syntax blocklist to [AdGuard syntax](https://adguard-dns.io/kb/general/dns-filtering-syntax/).\n\n```\nhostlist-compiler -i hosts.txt -i hosts2.txt -o output.txt\n```\n\n**Build a configurable blocklist from multiple sources**\n\nPrepare the list configuration (read more about that [below](#configuration)) and run the compiler:\n\n```bash\nhostlist-compiler -c configuration.json -o output.txt\n```\n\n**All command line options**\n\n```\nUsage: hostlist-compiler [options]\n\nOptions:\n  --config, -c      Path to the compiler configuration file             [string]\n  --input, -i       URL (or path to a file) to convert to an AdGuard-syntax\n                    blocklist. Can be specified multiple times.          
[array]\n  --input-type, -t  Type of the input file (/etc/hosts, adguard)        [string]\n  --output, -o      Path to the output file                  [string] [required]\n  --verbose, -v     Run with verbose logging                           [boolean]\n  --version         Show version number                                [boolean]\n  -h, --help        Show help                                          [boolean]\n\nExamples:\n  hostlist-compiler -c config.json -o       compile a blocklist and write the\n  output.txt                                output to output.txt\n  hostlist-compiler -i                      compile a blocklist from the URL and\n  https://example.org/hosts.txt -o          write the output to output.txt\n  output.txt\n```\n\n### \u003ca name=\"configuration\"\u003e\u003c/a\u003e Configuration\n\nConfiguration defines your filter list sources, and the transformations that are applied to the sources.\n\nHere is an example of this configuration:\n\n```json\n{\n  \"name\": \"List name\",\n  \"description\": \"List description\",\n  \"homepage\": \"https://example.org/\",\n  \"license\": \"GPLv3\",\n  \"version\": \"1.0.0.0\",\n  \"sources\": [\n    {\n      \"name\": \"Local rules\",\n      \"source\": \"rules.txt\",\n      \"type\": \"adblock\",\n      \"transformations\": [\"RemoveComments\", \"Compress\"],\n      \"exclusions\": [\"excluded rule 1\"],\n      \"exclusions_sources\": [\"exclusions.txt\"],\n      \"inclusions\": [\"*\"],\n      \"inclusions_sources\": [\"inclusions.txt\"]\n    },\n    {\n      \"name\": \"Remote rules\",\n      \"source\": \"https://example.org/rules\",\n      \"type\": \"hosts\",\n      \"exclusions\": [\"excluded rule 1\"]\n    }\n  ],\n  \"transformations\": [\"Deduplicate\", \"Compress\"],\n  \"exclusions\": [\"excluded rule 1\", \"excluded rule 2\"],\n  \"exclusions_sources\": [\"global_exclusions.txt\"],\n  \"inclusions\": [\"*\"],\n  \"inclusions_sources\": [\"global_inclusions.txt\"]\n}\n```\n\n- `name` - 
(mandatory) the list name.\n- `description` - (optional) the list description.\n- `homepage` - (optional) URL to the list homepage.\n- `license` - (optional) Filter list license.\n- `version` - (optional) Filter list version.\n- `sources` - (mandatory) array of the list sources.\n  - `.source` - (mandatory) path or URL of the source. It can be a traditional filter list or a hosts file.\n  - `.name` - (optional) name of the source.\n  - `.type` - (optional) type of the source. It could be `adblock` for Adblock-style lists or `hosts` for /etc/hosts style lists. If not specified, `adblock` is assumed.\n  - `.transformations` - (optional) a list of transformations to apply to the source rules. By default, **no transformations** are applied. Learn more about possible transformations [here](#transformations).\n  - `.exclusions` - (optional) a list of rules (or wildcards) to exclude from the source.\n  - `.exclusions_sources` - (optional) a list of files with exclusions.\n  - `.inclusions` - (optional) a list of wildcards to include from the source. All rules that don't match these wildcards won't be included.\n  - `.inclusions_sources` - (optional) a list of files with inclusions.\n- `transformations` - (optional) a list of transformations to apply to the final list of rules. By default, **no transformations** are applied. Learn more about possible transformations [here](#transformations).\n- `exclusions` - (optional) a list of rules (or wildcards) to exclude from the source.\n- `exclusions_sources` - (optional) a list of files with exclusions.\n- `inclusions` - (optional) a list of wildcards to include from the source. 
All rules that don't match these wildcards won't be included.\n- `inclusions_sources` - (optional) a list of files with inclusions.\n\nHere is an example of a minimal configuration:\n\n```json\n{\n  \"name\": \"test list\",\n  \"sources\": [\n    {\n      \"source\": \"rules.txt\"\n    }\n  ]\n}\n```\n\n**Exclusion and inclusion rules**\n\nPlease note that an exclusion or inclusion rule may be a plain string, a wildcard, or a regular expression.\n\n- `plainstring` - matches every rule that contains `plainstring`.\n- `*.plainstring` - matches every rule that fits this wildcard.\n- `/regex/` - matches every rule that fits this regular expression. By default, regular expressions are case-insensitive.\n- `! comment` - comments are ignored.\n\n\u003e [!IMPORTANT]\n\u003e Ensure that rules in the exclusion list match the format of the rules in the filter list.\n\u003e To maintain a consistent format, add the `Compress` transformation to convert `/etc/hosts` rules to adblock syntax.\n\u003e This is especially useful if you have multiple lists in different formats.\n\nHere is an example:\n\nRules in HOSTS syntax: `/hosts.txt`\n\n```txt\n0.0.0.0 ads.example.com  \n0.0.0.0 tracking.example1.com  \n0.0.0.0 example.com\n```\n\nExclusion rules in adblock syntax: `/exclusions.txt`\n\n```txt\n||example.com^\n```\n\nConfiguration of the final list:\n\n```json\n{\n  \"name\": \"List name\",\n  \"description\": \"List description\",\n  \"sources\": [\n    {\n      \"name\": \"HOSTS rules\",\n      \"source\": \"hosts.txt\",\n      \"type\": \"hosts\",\n      \"transformations\": [\"Compress\"]\n    }\n  ],\n  \"transformations\": [\"Deduplicate\", \"Compress\"],\n  \"exclusions_sources\": [\"exclusions.txt\"]\n}\n```\n\nFinal filter output of `/hosts.txt` after applying the `Compress` transformation and exclusions:\n\n```txt\n||ads.example.com^  \n||tracking.example1.com^\n```\n\nThe last rule, now `||example.com^`, will correctly match the 
rule from the exclusion list and will be excluded.\n\n### \u003ca name=\"command-line\"\u003e\u003c/a\u003e Command-line\n\nCommand-line arguments.\n\n```\nUsage: hostlist-compiler [options]\n\nOptions:\n  --version      Show version number                                   [boolean]\n  --config, -c   Path to the compiler configuration file     [string] [required]\n  --output, -o   Path to the output file                     [string] [required]\n  --verbose, -v  Run with verbose logging                              [boolean]\n  -h, --help     Show help                                             [boolean]\n\nExamples:\n  hostlist-compiler -c config.json -o       compile a blocklist and write the\n  output.txt                                output to output.txt\n```\n\n### \u003ca name=\"api\"\u003e\u003c/a\u003e API\n\nInstall: `npm i @adguard/hostlist-compiler` or `yarn add @adguard/hostlist-compiler`\n\n#### JavaScript example:\n\n```javascript\nconst compile = require(\"@adguard/hostlist-compiler\");\nconst { writeFileSync } = require(\"fs\");\n\n;(async () =\u003e {\n    // Compile filters\n    const result = await compile({\n        name: 'Your Hostlist',\n        sources: [\n            {\n                type: 'adblock',\n                source: 'https://adguardteam.github.io/AdGuardSDNSFilter/Filters/filter.txt', // or local file\n                transformations: ['RemoveComments', 'Validate'],\n            },\n        ],\n        transformations: ['Deduplicate'],\n    });\n\n    // Write to file\n    writeFileSync('your-hostlist.txt', result.join('\\n'));\n})();\n```\n\n#### TypeScript example:\n\n```typescript\nimport compile from '@adguard/hostlist-compiler';\nimport { writeFileSync } from 'fs';\n\n;(async () =\u003e {\n    // Compile filters\n    const result = await compile({\n        name: 'Your Hostlist',\n        sources: [\n            {\n                type: 'adblock',\n                source: 'https://adguardteam.github.io/AdGuardSDNSFilter/Filters/filter.txt',\n                
transformations: ['RemoveComments', 'Validate'],\n            },\n        ],\n        transformations: ['Deduplicate'],\n    });\n\n    // Write to file\n    writeFileSync('your-hostlist.txt', result.join('\\n'));\n})();\n```\n\nor:\n\n```typescript\nimport HostlistCompiler, { IConfiguration as HostlistCompilerConfiguration } from '@adguard/hostlist-compiler';\nimport { writeFileSync } from 'fs';\n\n;(async () =\u003e {\n    // Configuration\n    const config: HostlistCompilerConfiguration = {\n        name: 'Your Hostlist',\n        sources: [\n            {\n                type: 'adblock',\n                source: 'https://adguardteam.github.io/AdGuardSDNSFilter/Filters/filter.txt',\n                transformations: ['RemoveComments', 'Validate'],\n            },\n        ],\n        transformations: ['Deduplicate'],\n    };\n\n    // Compile filters\n    const result = await HostlistCompiler(config);\n\n    // Write to file\n    writeFileSync('your-hostlist.txt', result.join('\\n'));\n})();\n```\n\n## \u003ca name=\"transformations\"\u003e\u003c/a\u003e Transformations\n\nHere is the full list of transformations that are available:\n\n1. `RemoveComments`\n1. `Compress`\n1. `RemoveModifiers`\n1. `Validate`\n1. `ValidateAllowIp`\n1. `Deduplicate`\n1. `InvertAllow`\n1. `RemoveEmptyLines`\n1. `TrimLines`\n1. `InsertFinalNewLine`\n\nPlease note that these transformations are always applied in the order specified here.\n\n### \u003ca name=\"remove-comments\"\u003e\u003c/a\u003e RemoveComments\n\nThis is a very simple transformation that removes comments (e.g. all rules starting with `!` or `#`).\n\n### \u003ca name=\"compress\"\u003e\u003c/a\u003e Compress\n\n\u003e [!IMPORTANT]\n\u003e This transformation converts `hosts` lists into `adblock` lists.\n\nHere's what it does:\n\n1. It converts all rules to adblock-style rules. For instance, `0.0.0.0 example.org` will be converted to `||example.org^`.\n2. 
It discards the rules that are now redundant because of other existing rules. For instance, `||example.org` blocks `example.org` and all its subdomains, therefore additional rules for the subdomains are now redundant.\n\n### \u003ca name=\"remove-modifiers\"\u003e\u003c/a\u003e RemoveModifiers\n\nBy default, [AdGuard Home](https://github.com/AdguardTeam/AdGuardHome) will ignore rules with unsupported modifiers, and all of the modifiers listed here are unsupported. However, the rules with these modifiers are likely to be okay for DNS-level blocking; that's why you might want to remove them when importing rules from a traditional filter list.\n\nHere is the list of modifiers that will be removed:\n\n- `$third-party` and `$3p` modifiers\n- `$document` and `$doc` modifiers\n- `$all` modifier\n- `$popup` modifier\n- `$network` modifier\n\n\u003e [!CAUTION]\n\u003e Blindly removing `$third-party` from traditional ad blocking rules leads to lots of false positives.\n\u003e This is exactly why there is an option to exclude rules - you may need to use it.\n\n### \u003ca name=\"validate\"\u003e\u003c/a\u003e Validate\n\nThis transformation is crucial if you're using a filter list for a traditional ad blocker as a source.\n\nIt removes dangerous or incompatible rules from the list.\n\nHere's what it does:\n\n- Discards domain-specific rules (e.g. `||example.org^$domain=example.com`). You don't want to have domain-specific rules working globally.\n- Discards rules with unsupported modifiers. [Click here](https://github.com/AdguardTeam/AdGuardHome/wiki/Hosts-Blocklists#-adblock-style-syntax) to learn more about which modifiers are supported.\n- Discards rules that are too short.\n- Discards IP addresses. 
If you need to keep IP addresses, use [ValidateAllowIp](#validate-allow-ip) instead.\n- Removes rules that block entire top-level domains (TLDs) like `||*.org^`, unless they have specific limiting modifiers such as `$denyallow`, `$badfilter`, or `$client`.\n  Examples:\n  - `||*.org^` - this rule will be removed\n  - `||*.org^$denyallow=example.com` - this rule will be kept because it has a limiting modifier\n\nIf there are comments preceding the invalid rule, they will be removed as well.\n\n### \u003ca name=\"validate-allow-ip\"\u003e\u003c/a\u003e ValidateAllowIp\n\nThis transformation behaves exactly like [Validate](#validate), but leaves the IP addresses in the lists.\n\n### \u003ca name=\"deduplicate\"\u003e\u003c/a\u003e Deduplicate\n\nThis transformation removes the duplicates from the specified source.\n\nThere are two important notes about this transformation:\n\n1. It keeps the original rules order.\n2. It ignores comments. However, if the comments precede the rule that is being removed, the comments will also be removed.\n\nFor instance:\n\n```\n! rule1 comment 1\nrule1\n! rule1 comment 2\nrule1\n```\n\nHere's what will be left after the transformation:\n\n```\n! rule1 comment 2\nrule1\n```\n\n### \u003ca name=\"invertallow\"\u003e\u003c/a\u003e InvertAllow\n\nThis transformation converts blocking rules to \"allow\" rules. Note that it does nothing to /etc/hosts rules (unless they were previously converted to adblock-style syntax by a different transformation, for example [Compress](#compress)).\n\nThere are two important notes about this transformation:\n\n1. It keeps the original rules order.\n2. It ignores comments, empty lines, /etc/hosts rules and existing \"allow\" rules.\n\n**Example:**\n\nOriginal list:\n\n```\n! comment 1\nrule1\n\n# comment 2\n192.168.11.11   test.local\n@@rule2\n```\n\nHere's what we will have after applying this transformation:\n\n```\n! 
comment 1\n@@rule1\n\n# comment 2\n192.168.11.11   test.local\n@@rule2\n```\n\n### \u003ca name=\"removeemptylines\"\u003e\u003c/a\u003e RemoveEmptyLines\n\nThis is a very simple transformation that removes empty lines.\n\n**Example:**\n\nOriginal list:\n\n```\nrule1\n\nrule2\n\n\nrule3\n```\n\nHere's what we will have after applying this transformation:\n\n```\nrule1\nrule2\nrule3\n```\n\n### \u003ca name=\"trimlines\"\u003e\u003c/a\u003e TrimLines\n\nThis is a very simple transformation that removes leading and trailing spaces/tabs.\n\n**Example:**\n\nOriginal list:\n\n```\nrule1\n   rule2\nrule3\n\t\trule4\n```\n\nHere's what we will have after applying this transformation:\n\n```\nrule1\nrule2\nrule3\nrule4\n```\n\n### \u003ca name=\"insertfinalnewline\"\u003e\u003c/a\u003e InsertFinalNewLine\n\nThis is a very simple transformation that inserts a final newline.\n\n**Example:**\n\nOriginal list:\n\n```\nrule1\nrule2\nrule3\n```\n\nHere's what we will have after applying this transformation:\n\n```\nrule1\nrule2\nrule3\n\n```\n\n`RemoveEmptyLines` doesn't delete this empty row due to the execution order.\n\n### \u003ca name=\"convert-to-ascii\"\u003e\u003c/a\u003e ConvertToAscii\n\nThis transformation converts all non-ASCII characters to their ASCII equivalents. 
It is always performed first.\n\n**Example:**\n\nOriginal list:\n\n```\n||*.рус^\n||*.कॉम^\n||*.セール^\n```\n\nHere's what we will have after applying this transformation:\n\n```\n||*.xn--p1acf^\n||*.xn--11b4c3d^\n||*.xn--1qqw23a^\n```\n\n## \u003ca name=\"how-to-build\"\u003e\u003c/a\u003e How to build\n\n- `yarn install` - installs dependencies\n- `yarn lint` - runs eslint\n- `yarn test` - runs tests\n- `node src/cli.js -c examples/sdn/configuration.json -o filter.txt` - runs compiler with the example configuration\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FAdguardTeam%2FHostlistCompiler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FAdguardTeam%2FHostlistCompiler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FAdguardTeam%2FHostlistCompiler/lists"}