{"id":16386279,"url":"https://github.com/tmcw/notfoundbot","last_synced_at":"2025-07-17T18:32:40.119Z","repository":{"id":40286384,"uuid":"260744558","full_name":"tmcw/notfoundbot","owner":"tmcw","description":"fix \u0026 archive outgoing links on your website","archived":false,"fork":false,"pushed_at":"2024-05-11T16:09:07.000Z","size":3038,"stargazers_count":115,"open_issues_count":9,"forks_count":7,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-06-30T14:48:55.199Z","etag":null,"topics":["actions","hugo","jekyll"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tmcw.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-05-02T17:59:32.000Z","updated_at":"2025-03-17T15:14:22.000Z","dependencies_parsed_at":"2024-06-18T17:09:50.541Z","dependency_job_id":"25c9f7ee-02b9-44db-a059-368ece82f34a","html_url":"https://github.com/tmcw/notfoundbot","commit_stats":{"total_commits":110,"total_committers":6,"mean_commits":"18.333333333333332","dds":0.08181818181818179,"last_synced_commit":"f91be597e4a7e2a182788b1c5ffdc460df3195ea"},"previous_names":[],"tags_count":12,"template":false,"template_full_name":null,"purl":"pkg:github/tmcw/notfoundbot","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tmcw%2Fnotfoundbot","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tmcw%2Fnotfoundbot/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tmcw%2Fnotfoundbot/releases","manifests_url":"https:
//repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tmcw%2Fnotfoundbot/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tmcw","download_url":"https://codeload.github.com/tmcw/notfoundbot/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tmcw%2Fnotfoundbot/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265645425,"owners_count":23804185,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["actions","hugo","jekyll"],"created_at":"2024-10-11T04:16:48.488Z","updated_at":"2025-07-17T18:32:40.100Z","avatar_url":"https://github.com/tmcw.png","language":"TypeScript","funding_links":[],"categories":["All Software"],"sub_categories":["Programming Tools"],"readme":"# notfoundbot\n\n[![Maintainability](https://api.codeclimate.com/v1/badges/1870414e70039aad07f3/maintainability)](https://codeclimate.com/github/tmcw/notfoundbot/maintainability) [![Test Coverage](https://api.codeclimate.com/v1/badges/1870414e70039aad07f3/test_coverage)](https://codeclimate.com/github/tmcw/notfoundbot/test_coverage)\n\nnotfoundbot is a GitHub Action that helps you automatically maintain the correctness of your\nwebsite's outgoing links. 
It finds links that need fixing and opens pull requests\nthat fix them.\n\nThis action is intended for websites and blogs powered by static site generators.\n\nnotfoundbot makes the following fixes:\n\n- Upgrades outgoing HTTP links to HTTPS\n- Replaces broken outgoing links with links to the [Wayback Machine](https://web.archive.org/)\n\nBy using post dates derived from filenames, notfoundbot searches for Wayback Machine archives\nof linked resources that are contemporary to the post itself: broken links in a 2011 blog post\nwill be linked to archives from around that era.\n\n## Example YAML\n\n```yaml\nname: notfoundbot\non:\n  schedule:\n    - cron: \"0 5 * * *\"\njobs:\n  check:\n    runs-on: ubuntu-latest\n    steps:\n      - uses: actions/checkout@v2\n      - name: Fix links\n        uses: tmcw/notfoundbot@v2.0.2\n        env:\n          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}\n```\n\nSome websites return different responses to notfoundbot than to a typical user. When false positives become a recurring issue, an exceptions list can be used. The exceptions list is a space-separated list of hosts that will always be treated as returning an OK status.\n\n```yaml\nname: notfoundbot\non:\n  schedule:\n    - cron: \"0 5 * * *\"\njobs:\n  check:\n    runs-on: ubuntu-latest\n    steps:\n      - uses: actions/checkout@v2\n      - name: Fix links\n        uses: tmcw/notfoundbot@v2.0.2\n        env:\n          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}\n          EXCEPTIONS: www.host.com thisisok.org\n```\n\nBy default, notfoundbot checks `.md` files in the `_posts` directory. 
You can change this directory:\n\n```yaml\n      - name: Fix links\n        uses: tmcw/notfoundbot@v2.0.2\n        env:\n          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}\n        with:\n          content-folder: custom-content\n```\n\nNotes:\n\n- I might forget to update the version on `notfoundbot` here - make sure that it's\n  the latest!\n- Check out [crontab.guru](https://crontab.guru/#5_*_*_*_*) to customize the\n  schedule line, which can run the task more or less often if you want.\n\n## Features\n\n- Post date detection: supports filename-based dates, YAML \u0026 TOML frontmatter\n- notfoundbot uses [magic-string](https://github.com/rich-harris/magic-string) to\n  selectively update links without affecting surrounding markup\n\n## Workflow\n\n- If there is an existing PR tagged `notfoundbot`, exit\n- Gather post files and parse them, and then for each unique outlink URL:\n    - If the URL is not HTTP or HTTPS, ignore it\n    - If the URL is relative, ignore it\n    - If the URL has been checked recently and is in the cache, ignore it\n    - If the URL is HTTP, check its HTTPS equivalent\n        - If the HTTPS equivalent exists, upgrade the link to HTTPS\n        - Otherwise, check the HTTP link\n            - If the HTTP link resolves, ignore it\n            - If the HTTP link fails, mark it as an error\n    - If the URL is HTTPS, check to see if it resolves\n        - If the link resolves, ignore it\n        - If the link fails, mark it as an error\n\nThen, for each link marked as an error:\n\n- Check the Internet Archive to find contemporary archives of each failed URL\n    - If an archive exists, replace the link\n    - Otherwise, ignore it\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftmcw%2Fnotfoundbot","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftmcw%2Fnotfoundbot","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftmcw%2Fnotfoundbot/lists"}