{"id":21020581,"url":"https://github.com/nodiscc/hecat","last_synced_at":"2026-04-01T23:54:12.345Z","repository":{"id":37498553,"uuid":"380801190","full_name":"nodiscc/hecat","owner":"nodiscc","description":"Generic automation tool around data stored as plaintext YAML files","archived":false,"fork":false,"pushed_at":"2026-03-22T18:54:36.000Z","size":485,"stargazers_count":37,"open_issues_count":29,"forks_count":7,"subscribers_count":2,"default_branch":"master","last_synced_at":"2026-03-23T10:15:27.461Z","etag":null,"topics":["archiving","awesome-list","catalog","markdown","plaintext","shaarli","static-site-generator","yaml","yt-dlp"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nodiscc.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2021-06-27T17:33:35.000Z","updated_at":"2026-03-22T18:54:39.000Z","dependencies_parsed_at":"2024-04-12T23:03:29.899Z","dependency_job_id":"8310b2d7-ba3a-48cc-9170-bd0b69c245f6","html_url":"https://github.com/nodiscc/hecat","commit_stats":null,"previous_names":[],"tags_count":16,"template":false,"template_full_name":null,"purl":"pkg:github/nodiscc/hecat","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nodiscc%2Fhecat","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nodiscc%2Fhecat/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nodiscc%2Fhecat/rele
ases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nodiscc%2Fhecat/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nodiscc","download_url":"https://codeload.github.com/nodiscc/hecat/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nodiscc%2Fhecat/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31293130,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-01T21:15:39.731Z","status":"ssl_error","status_checked_at":"2026-04-01T21:15:34.046Z","response_time":53,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["archiving","awesome-list","catalog","markdown","plaintext","shaarli","static-site-generator","yaml","yt-dlp"],"created_at":"2024-11-19T10:42:11.843Z","updated_at":"2026-04-01T23:54:12.336Z","avatar_url":"https://github.com/nodiscc.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# hecat\n\nA generic automation tool around data stored as plaintext YAML files.\n\n[![CI](https://github.com/nodiscc/hecat/actions/workflows/ci.yml/badge.svg)](https://github.com/nodiscc/hecat/actions)\n\nThis program uses YAML files to store data about various kind of items (bookmarks, software projects, ...) 
and applies various processing tasks.\nFunctionality is implemented in separate modules.\n\n### Importers\n\nImport data from various input formats:\n\n- [importers/markdown_awesome](hecat/importers/markdown_awesome.py): import data from the [awesome-selfhosted](https://github.com/awesome-selfhosted/awesome-selfhosted) markdown format\n- [importers/shaarli_api](hecat/importers/shaarli_api.py): import data from a [Shaarli](https://github.com/shaarli/Shaarli) instance using the [API](https://shaarli.github.io/api-documentation/)\n\n[![](https://gitlab.com/nodiscc/toolbox/-/raw/master/DOC/SCREENSHOTS/tMAxhLw.png)](hecat/importers/markdown_awesome.py)\n\n\n### Processors\n\nPerform processing tasks on YAML data:\n\n- [processors/software_metadata](hecat/processors/software_metadata.py): enrich software project metadata from GitHub and GitLab APIs (stars, last commit date...)\n- [processors/awesome_lint](hecat/processors/awesome_lint.py): check data against [awesome-selfhosted](https://github.com/awesome-selfhosted/awesome-selfhosted) consistency/completeness guidelines\n- [processors/download_media](hecat/processors/download_media.py): download video/audio files using [yt-dlp](https://github.com/yt-dlp/yt-dlp) for bookmarks imported from Shaarli\n- [processors/url_check](hecat/processors/url_check.py): check data for dead links\n- [processors/archive_webpages](hecat/processors/archive_webpages.py): archive webpages locally\n\n[![](https://gitlab.com/nodiscc/toolbox/-/raw/master/DOC/SCREENSHOTS/Heg3Esg.png)](hecat/processors/url_check.py)\n[![](https://gitlab.com/nodiscc/toolbox/-/raw/master/DOC/SCREENSHOTS/RtiDE91.png)](hecat/processors/download_media.py)\n[![](https://gitlab.com/nodiscc/toolbox/-/raw/master/DOC/SCREENSHOTS/hecat-processor-github-metadata.png)](hecat/processors/software_metadata.py)\n\n### Exporters\n\nExport data to other formats:\n- [exporters/markdown_singlepage](hecat/exporters/markdown_singlepage.py): render data as a single markdown document\n- 
[exporters/markdown_multipage](hecat/exporters/markdown_multipage.py): render data as a multipage markdown site which can be used to generate a HTML site with Sphinx\n- [exporters/html_table](hecat/exporters/html_table.py): render data as single-page HTML table\n\n[![](https://gitlab.com/nodiscc/toolbox/-/raw/master/DOC/SCREENSHOTS/NvCOeiK.png)](hecat/exporters/markdown_singlepage.py)\n[![](https://gitlab.com/nodiscc/toolbox/-/raw/master/DOC/SCREENSHOTS/FFMPdaw.png)](hecat/exporters/html_table.py)\n[![](https://gitlab.com/nodiscc/toolbox/-/raw/master/DOC/SCREENSHOTS/hecat-exporter-markdown-multipage.png)](hecat/exporters/markdown_multipage.py)\n\n\n## Installation\n\n```bash\n# install requirements\nsudo apt install python3-venv python3-pip\n# create a python virtualenv\npython3 -m venv ~/.venv\n# activate the virtualenv\nsource ~/.venv/bin/activate\n# install the program\npip3 install git+https://gitlab.com/nodiscc/hecat.git\n```\n\nTo install from a local copy instead:\n\n```bash\n# grab a copy\ngit clone https://gitlab.com/nodiscc/hecat.git\n# install the python package\ncd hecat \u0026\u0026 python3 -m pip install .\n```\n\nTo install a specific [release](https://github.com/nodiscc/hecat/releases), adapt the `git clone` or `pip3 install` command:\n\n```bash\npip3 install git+https://gitlab.com/nodiscc/hecat.git@1.0.2\ngit clone -b 1.0.2 https://gitlab.com/nodiscc/hecat.git\n```\n\n## Usage\n\n```bash\n$ hecat --help\nusage: hecat [-h] [--config CONFIG_FILE] [--log-level {ERROR,WARNING,INFO,DEBUG}]\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --config CONFIG_FILE  configuration file (default .hecat.yml)\n  --log-level {ERROR,WARNING,INFO,DEBUG} log level (default INFO)\n  --log-file LOG_FILE   log file (default none)\n```\n\nIf no configuration file is specified, configuration is read from `.hecat.yml` in the current directory.\n\n\n## Configuration\n\nhecat executes all steps defined in the configuration file. 
For each step:\n\n```yaml\nsteps:\n  - name: example step # arbitrary name for this step\n    module: processor/example # the module to use, see list of modules above\n    module_options: # a dict of options specific to the module, see list of modules above\n      option1: True\n      option2: some_value\n```\n\n### Examples\n\n#### Awesome lists\n\nImport data from [awesome-selfhosted](https://github.com/awesome-selfhosted/awesome-selfhosted)'s markdown list format:\n\n```yaml\n# .hecat.import.yml\n# $ git clone https://github.com/awesome-selfhosted/awesome-selfhosted\n# $ git clone https://github.com/awesome-selfhosted/awesome-selfhosted-data\nsteps:\n  - name: import awesome-selfhosted README.md to YAML\n    module: importers/markdown_awesome\n    module_options:\n      source_file: awesome-selfhosted/README.md\n      output_directory: ./\n      output_licenses_file: licenses.yml # optional, default licenses.yml\n      overwrite_tags: False # optional, default False\n```\n\nCheck data against awesome-selfhosted formatting guidelines, export to single page markdown and static HTML site (see [awesome-selfhosted-data](https://github.com/awesome-selfhosted/awesome-selfhosted-data), its [`Makefile`](https://github.com/awesome-selfhosted/awesome-selfhosted-data/blob/master/Makefile) and [Github Actions workflows](https://github.com/awesome-selfhosted/awesome-selfhosted-data/tree/master/.github/workflows) for complete usage examples. 
See [awesome-selfhosted](https://github.com/awesome-selfhosted/awesome-selfhosted) and [awesome-selfhosted-html](https://github.com/nodiscc/awesome-selfhosted-html-preview/) for example output):\n\n```yaml\n# .hecat.export.yml\nsteps:\n  - name: check data against awesome-selfhosted guidelines\n    module: processors/awesome_lint\n    module_options:\n      source_directory: awesome-selfhosted-data\n      licenses_files:\n        - licenses.yml\n        - licenses-nonfree.yml\n\n  - name: export YAML data to single-page markdown\n    module: exporters/markdown_singlepage\n    module_options:\n      source_directory: awesome-selfhosted-data # source/YAML data directory\n      output_directory: awesome-selfhosted # output directory\n      output_file: README.md # output markdown file\n      markdown_header: markdown/header.md # (optional, default none) path to markdown file to use as header (relative to source_directory)\n      markdown_footer: markdown/footer.md # (optional, default none) path to markdown file to use as footer (relative to source_directory)\n      back_to_top_url: '#awesome-selfhosted' # (optional, default #) the URL/anchor to use in 'back to top' links\n      exclude_licenses: # (optional, default none) do not write software items with any of these licenses to the output file\n        - '⊘ Proprietary'\n        - 'BUSL-1.1'\n        - 'CC-BY-NC-4.0'\n        - 'CC-BY-NC-SA-3.0'\n        - 'CC-BY-ND-3.0'\n        - 'Commons-Clause'\n        - 'DPL'\n        - 'SSPL-1.0'\n        - 'DPL'\n        - 'Elastic-1.0'\n        - 'Elastic-2.0'\n\n  - name: export YAML data to single-page markdown (non-free.md)\n    module: exporters/markdown_singlepage\n    module_options:\n      source_directory: awesome-selfhosted-data\n      output_directory: awesome-selfhosted\n      output_file: non-free.md\n      markdown_header: markdown/non-free-header.md\n      licenses_file: licenses-nonfree.yml # (optional, default licenses.yml) YAML file to load licenses from\n  
    back_to_top_url: '##awesome-selfhosted---non-free-software'\n      render_empty_categories: False # (optional, default True) do not render categories which contain 0 items\n      render_category_headers: False # (optional, default True) do not render category headers (description, related categories, external links...)\n      include_licenses: # (optional, default none) only render items matching at least one of these licenses (cannot be used together with exclude_licenses) (by identifier)\n        - '⊘ Proprietary'\n        - 'BUSL-1.1'\n        - 'CC-BY-NC-4.0'\n        - 'CC-BY-NC-SA-3.0'\n        - 'CC-BY-ND-3.0'\n        - 'Commons-Clause'\n        - 'DPL'\n        - 'SSPL-1.0'\n        - 'DPL'\n        - 'Elastic-1.0'\n        - 'Elastic-2.0'\n\n  - name: export YAML data to multi-page markdown/HTML site\n    module: exporters/markdown_multipage\n    module_options:\n      source_directory: awesome-selfhosted-data # directory containing YAML data\n      output_directory: awesome-selfhosted-html # directory to write markdown pages to\n      exclude_licenses: # optional, default []\n        - '⊘ Proprietary'\n        - 'BUSL-1.1'\n        - 'CC-BY-NC-4.0'\n        - 'CC-BY-NC-SA-3.0'\n        - 'CC-BY-ND-3.0'\n        - 'Commons-Clause'\n        - 'DPL'\n        - 'SSPL-1.0'\n        - 'DPL'\n        - 'Elastic-1.0'\n        - 'Elastic-2.0'\n\n# $ sphinx-build -b html -c awesome-selfhosted-data/ awesome-selfhosted-html/md/ awesome-selfhosted-html/html/\n# $ rm -r tests/awesome-selfhosted-html/html/.buildinfo tests/awesome-selfhosted-html/html/objects.inv awesome-selfhosted-html/html/.doctrees\n```\n\n\u003cdetails\u003e\u003csummary\u003eExample automation using Github actions:\u003c/summary\u003e\n\n```yaml\n# .github/workflows/build.yml\njobs:\n  build-markdown:\n    runs-on: ubuntu-latest\n    steps:\n      - uses: actions/checkout@v3\n        with:\n          ref: ${{ github.ref }}\n      - run: python3 -m venv .venv \u0026\u0026 source 
.venv/bin/activate \u0026\u0026 pip3 install wheel \u0026\u0026 pip3 install --force git+https://github.com/nodiscc/hecat.git@1.2.0\n      - run: source .venv/bin/activate \u0026\u0026 hecat --config .hecat/awesome-lint.yml\n      - run: source .venv/bin/activate \u0026\u0026 hecat --config .hecat/export-markdown.yml\n\n  build-html:\n    runs-on: ubuntu-latest\n    steps:\n      - uses: actions/checkout@v3\n        with:\n          ref: ${{ github.ref }}\n      - run: python3 -m venv .venv \u0026\u0026 source .venv/bin/activate \u0026\u0026 pip3 install wheel \u0026\u0026 pip3 install --force git+https://github.com/nodiscc/hecat.git@1.2.0\n      - run: source .venv/bin/activate \u0026\u0026 hecat --config .hecat/awesome-lint.yml\n      - run: source .venv/bin/activate \u0026\u0026 hecat --config .hecat/export-html.yml\n```\n\u003c/details\u003e\n\nUpdate metadata before rebuilding HTML/markdown output:\n\n```yaml\n# .hecat.update_metadata.yml\nsteps:\n  - name: update projects metadata\n    module: processors/software_metadata\n    module_options:\n      source_directory: awesome-selfhosted-data # directory containing YAML data and software subdirectory\n      metadata_only_missing: True # (default False) only gather metadata for software entries in which one of stargazers_count,updated_at, archived, current_release, commit_history is missing\n      sleep_time: 5 # (default 5) sleep for this amount of time before each request to API\n      batch_size_github: 25 # (default 25) number of repositories to include in each batch request to GitHub API\n      batch_size_gitlab: 10 # (default 10) number of repositories to include in each batch request to GitLab API\n      commit_history_clean_months: 12 # (default 12) number of months of commit history to keep after cleanup (GitHub only)\n      commit_history_fetch_months: 3 # (default 3) number of months to fetch from GitHub API (GitHub only)\n```\n\n\u003cdetails\u003e\u003csummary\u003eExample automation using Github 
actions:\u003c/summary\u003e\n\n```yaml\n# .github/workflows/update-metadata.yml\nname: update metadata\non:\n  schedule:\n    - cron: '22 22 * * *'\n  workflow_dispatch:\n\nenv:\n  GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}\n\nconcurrency:\n  group: update-metadata-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  update-metadata:\n    if: github.repository == 'awesome-selfhosted/awesome-selfhosted-data'\n    runs-on: ubuntu-latest\n    steps:\n      - uses: actions/checkout@v3\n      - run: python3 -m venv .venv \u0026\u0026 source .venv/bin/activate \u0026\u0026 pip3 install wheel \u0026\u0026 pip3 install --force git+https://github.com/nodiscc/hecat.git@1.2.0\n      - run: source .venv/bin/activate \u0026\u0026 hecat --config .hecat/update-metadata.yml\n      - name: commit and push changes\n        run: |\n          git config user.name awesome-selfhosted-bot\n          git config user.email github-actions@github.com\n          git add software/ tags/ platforms/ licenses*.yml\n          git diff-index --quiet HEAD || git commit -m \"[bot] update projects metadata\"\n          git push\n  build:\n    if: github.repository == 'awesome-selfhosted/awesome-selfhosted-data'\n    needs: update-metadata\n    uses: ./.github/workflows/build.yml\n    secrets: inherit\n```\n\n\u003c/details\u003e\n\n\nCheck URLs for dead links:\n\n```yaml\n# .hecat.url_check.yml\nsteps:\n  - name: check URLs\n    module: processors/url_check\n    module_options:\n      source_directories:\n        - awesome-selfhosted-data/software\n        - awesome-selfhosted-data/tags\n      source_files:\n        - awesome-selfhosted-data/licenses.yml\n      errors_are_fatal: True\n      exclude_regex:\n        - '^https://github.com/[\\w\\.\\-]+/[\\w\\.\\-]+$' # don't check URLs that will be processed by the software_metadata module\n        - '^https://retrospring.net/$' # DDoS protection page, always returns 403\n        - '^https://www.taiga.io/$' # always returns 403 Request forbidden by 
administrative rules\n        - '^https://docs.paperless-ngx.com/$' # DDoS protection page, always returns 403\n        - '^https://demo.paperless-ngx.com/$' # DDoS protection page, always returns 403\n        - '^https://git.dotclear.org/dev/dotclear$' # DDoS protection page, always returns 403\n        - '^https://word-mastermind.glitch.me/$' # the demo instance takes a long time to spin up, times out with the default 10s timeout\n        - '^https://getgrist.com/$' # hecat/python-requests bug? 'Received response with content-encoding: gzip,br, but failed to decode it.'\n        - '^https://www.uvdesk.com/$' # DDoS protection page, always returns 403\n        - '^https://demo.uvdesk.com/$' # DDoS protection page, always returns 403\n        - '^https://notes.orga.cat/$' # DDoS protection page, always returns 403\n        - '^https://cytu.be$' # DDoS protection page, always returns 403\n        - '^https://demo.reservo.co/$' # hecat/python-requests bug? always returns 404 but the website works in a browser\n        - '^https://crates.io/crates/vigil-server$' # hecat/python-requests bug? 
always returns 404 but the website works in a browser\n        - '^https://nitter.net$' # always times out from github actions but the website works in a browser\n```\n\n\u003cdetails\u003e\u003csummary\u003eExample automation using Github actions:\u003c/summary\u003e\n\n```yaml\n# .github/workflows/url-check.yml\nname: dead links\n\non:\n  schedule:\n    - cron: '22 22 * * *'\n  workflow_dispatch:\n\nenv:\n  GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}\n\nconcurrency:\n  group: dead-links-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  check-dead-links:\n    if: github.repository == 'awesome-selfhosted/awesome-selfhosted-data'\n    runs-on: ubuntu-latest\n    steps:\n      - uses: actions/checkout@v3\n      - run: python3 -m venv .venv \u0026\u0026 source .venv/bin/activate \u0026\u0026 pip3 install wheel \u0026\u0026 pip3 install --force git+https://github.com/nodiscc/hecat.git@1.2.0\n      - run: source .venv/bin/activate \u0026\u0026 hecat --config .hecat/url-check.yml\n```\n\n\u003c/details\u003e\n\n#### Shaarli\n\nImport data from a Shaarli instance, download video/audio files identified by specific tags, check for dead links, export to single-page HTML page/table:\n\n```bash\n# hecat consumes output from https://github.com/shaarli/python-shaarli-client\n# install the python API client\npython3 -m venv .venv \u0026\u0026 source .venv/bin/activate \u0026\u0026 pip3 install shaarli-client\n# edit python-shaarli-client configuration file\nmkdir -p ~/.config/shaarli/ \u0026\u0026 nano ~/.config/shaarli/client.ini\n```\n```ini\n# ~/.config/shaarli/client.ini\n[shaarli]\nurl = https://links.example.org\nsecret = AAAbbbZZZvvvSSStttUUUvvVXYZ\n```\n```bash\n# download data from your shaarli instance\nshaarli --outfile /path/to/shaarli-export.json get-links --limit=all\n```\n```yaml\n# .hecat.yml\nsteps:\n  - name: import data from shaarli API JSON\n    module: importers/shaarli_api\n    module_options:\n      source_file: /path/to/shaarli-export.json\n
      output_file: shaarli.yml\n      skip_existing: True # (default True) skip importing items whose 'url:' already exists in the output file\n      clean_removed: False # (default False) remove items from the output file, whose 'url:' was not found in the input file\n      sort_by: created # (default 'created') key by which to sort the output list\n      sort_reverse: True # (default True) sort the output list in reverse order\n\n  - name: download video files\n    module: processors/download_media\n    module_options:\n      data_file: shaarli.yml # path to the YAML data file\n      only_tags: ['video'] # only download items tagged with any of these tags\n      exclude_tags: ['nodl'] # (default []), don't download items tagged with any of these tags\n      output_directory: '/path/to/video/directory' # path to the output directory for media files\n      download_playlists: False # (default False) download playlists\n      skip_when_filename_present: True # (default True) skip processing when item already has a 'video_filename/audio_filename': key\n      retry_items_with_error: True # (default True) retry downloading items for which an error was previously recorded\n      use_download_archive: True # (default True) use a yt-dlp archive file to record downloaded items, skip them if already downloaded\n\n  - name: download audio files\n    module: processors/download_media\n    module_options:\n      data_file: shaarli.yml\n      only_tags: ['music', 'musique']\n      exclude_tags: ['nodl']\n      output_directory: '/path/to/audio/directory'\n      only_audio: True # (default False) download the 'bestaudio' format instead of the default 'best'\n\n  - name: check URLs\n    module: processors/url_check\n    module_options:\n      source_files:\n        - shaarli.yml\n      check_keys:\n        - url\n      errors_are_fatal: True\n      exclude_regex:\n        - '^https://www.youtube.com/watch.*$' # don't check youtube video URLs, always 
returns HTTP 200 even for unavailable videos\n\n  - name: archive webpages for items tagged 'hecat' or 'doc'\n    module: processors/archive_webpages\n    module_options:\n      data_file: shaarli.yml\n      only_tags: ['hecat', 'doc']\n      exclude_tags: ['nodl']\n      exclude_regex:\n        - '^https://[a-z]+\\.wikipedia.org/wiki/.*$' # don't archive wikipedia pages, we have a local copy of wikipedia dumps from https://dumps.wikimedia.org/\n      output_directory: webpages\n      clean_removed: True\n      clean_excluded: True\n\n  - name: export shaarli data to HTML table\n    module: exporters/html_table\n    module_options:\n      source_file: shaarli.yml # file from which data will be loaded\n      output_file: index.html # (default index.html) output HTML table file\n      html_title: \"Shaarli export - shaarli.example.org\" # (default \"hecat HTML export\") output HTML title\n      description_format: paragraph # (details/paragraph, default details) wrap the description in a HTML details tag\n```\n\n[ffmpeg](https://ffmpeg.org/) must be installed for audio/video conversion support. 
[jdupes](https://github.com/jbruchon/jdupes), [soundalike](https://github.com/derat/soundalike) and [videoduplicatefinder](https://github.com/0x90d/videoduplicatefinder) may further help dealing with duplicate files and media.\n\n\n## Support\n\nPlease submit any questions to \u003chttps://gitlab.com/nodiscc/hecat/-/issues\u003e or \u003chttps://github.com/nodiscc/hecat/issues\u003e\n\n\n## Contributing\n\nBug reports, suggestions, code cleanup, documentation, tests, improvements, support for other input/output formats are welcome at \u003chttps://gitlab.com/nodiscc/hecat/-/merge_requests\u003e or \u003chttps://github.com/nodiscc/hecat/pulls\u003e\n\n\n## Testing\n\n```bash\n# install pyvenv, pip and make\n$ sudo apt install python3-pip python3-venv make\n# run tests using the Makefile\n$ make help \nUSAGE: make TARGET\nAvailable targets:\nhelp                generate list of targets with descriptions\nclean               clean files generated by make install/test_run\ninstall             install in a virtualenv\ntest                run tests\ntest_short          run tests except those that consume github API requests/long URL checks\ntest_pylint         run linter (non blocking)\nclone_awesome_selfhosted                clone awesome-selfhosted/awesome-selfhosted-data\ntest_import_awesome_selfhosted          test import from awesome-sefhosted\ntest_process_awesome_selfhosted         test all processing steps on awesome-selfhosted-data\ntest_url_check      test URL checker on awesome-sefhosted-data\ntest_update_software_metadata             test software metadata updater/processor on awesome-selfhosted-data\ntest_awesome_lint   test linter/compliance checker on awesome-sefhosted-data\ntest_export_awesome_selfhosted_md       test export to singlepage markdown from awesome-selfhosted-data\ntest_export_awesome_selfhosted_html     test export to singlepage HTML from awesome-selfhosted-data\ntest_import_shaarli test import from shaarli JSON\ntest_download_video test 
downloading videos from the shaarli import, test log file creation\ntest_download_audio test downloading audio files from the shaarli import\ntest_archive_webpages                   test webpage archiving\ntest_export_html_table                  test exporting shaarli data to HTML table\nscan_trivy          run trivy vulnerability scanner\n```\n\n## License\n\n[GNU GPLv3](LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnodiscc%2Fhecat","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnodiscc%2Fhecat","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnodiscc%2Fhecat/lists"}