{"id":23773362,"url":"https://github.com/beenotung/scan-link","last_synced_at":"2025-08-14T08:05:44.784Z","repository":{"id":240382102,"uuid":"802483937","full_name":"beenotung/scan-link","owner":"beenotung","description":"Scan given website recursively and report 404 links","archived":false,"fork":false,"pushed_at":"2024-05-18T12:54:21.000Z","size":23,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-08-09T15:43:40.289Z","etag":null,"topics":["404-errors","broken-links","cli","csv-report","http-status","link-analyzer","link-checker","link-scanner","link-validator","npx","seo-tools","url-scanner","web-crawler","web-scrapping","web-tools","website-scanner"],"latest_commit_sha":null,"homepage":"https://www.npmjs.com/package/scan-link","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-2-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/beenotung.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-05-18T12:27:53.000Z","updated_at":"2024-05-23T01:06:25.000Z","dependencies_parsed_at":"2024-05-18T13:45:36.108Z","dependency_job_id":null,"html_url":"https://github.com/beenotung/scan-link","commit_stats":null,"previous_names":["beenotung/scan-link"],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/beenotung/scan-link","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/beenotung%2Fscan-link","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/beenotung%2Fscan-link/tags","releases_url":"https://repos.ecosyste.ms/
api/v1/hosts/GitHub/repositories/beenotung%2Fscan-link/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/beenotung%2Fscan-link/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/beenotung","download_url":"https://codeload.github.com/beenotung/scan-link/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/beenotung%2Fscan-link/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270385467,"owners_count":24574556,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-14T02:00:10.309Z","response_time":75,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["404-errors","broken-links","cli","csv-report","http-status","link-analyzer","link-checker","link-scanner","link-validator","npx","seo-tools","url-scanner","web-crawler","web-scrapping","web-tools","website-scanner"],"created_at":"2025-01-01T05:39:24.511Z","updated_at":"2025-08-14T08:05:44.727Z","avatar_url":"https://github.com/beenotung.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# scan-link\n\nScan given website recursively and report 404 links\n\n[![npm Package Version](https://img.shields.io/npm/v/scan-link)](https://www.npmjs.com/package/scan-link)\n\n## Features\n\n- Start scanning from a specified entry URL\n- Follow links within specified origins\n- Report 
links that lead to 404 pages\n- Export 404 error report as a CSV file\n\n## Remark\n\nThe links and page status codes are stored in the `db.sqlite3` file of the current working directory. You may run `mkdir` and `cd` to a specific directory to avoid storing it in the home directory.\n\n## Installation (optional)\n\nYou can install scan-link for version control, or execute it via `npx` without installation.\n\nTo install `scan-link`, use npm:\n\n```bash\nnpm install scan-link\n```\n\nYou may install it as a dev dependency or a global dependency based on your preference.\n\n## Usage\n\nYou can use `scan-link` from the command line via `npx`. The configuration can be provided via environment variables or interactively during execution.\n\nUsage with dev/global installation:\n\n```bash\nnpx scan-link [entryUrl]\n```\n\nUsage without installation:\n\n```bash\nnpx -y scan-link [entryUrl]\n```\n\nThe `entryUrl` can be specified as an argument, loaded from an environment variable, or answered in the interactive prompt.\n\n### Environment Variables\n\n- SITE_URL: The entry URL for the scan\n- ORIGINS: A comma-separated list of origins to limit the scan\n- REPORT_404_CSV_FILE: Path of the CSV file where the 404 error report will be saved\n\nExample content of `.env` file:\n\n```bash\nSITE_URL=https://example.com\nORIGINS=https://example.com,https://sub.example.com\nREPORT_404_CSV_FILE=report.csv\n```\n\n### Interactive Usage\n\nIf environment variables are not set, `scan-link` will prompt you for the necessary information.\n\n```bash\nnpx scan-link\n```\n\nYou will be prompted to set up the variables above.\n\n### Example Interactive Session\n\n```bash\n$ npx -y scan-link\nentryUrl: http://localhost:8200/\n\nPlease specified the origins of links to follow.\nMultiple origins can be delimited by comma (\",\").\norigins (default: \"http://localhost:8200\"):\norigins: [ 'http://localhost:8200' ]\n\npath of CSV file to be saved (default \"404.csv\"): report.csv\nscanned: 12 | pending: 85 | 
scanning: http://localhost:8200/about\n...\nscanned: 119 pages\n{\n  '404 link count': 1447,\n  'total link count': 5036,\n  'page count with 404 link': 11,\n  'total page count': 119\n}\nexported 404 pages to file: report.csv\n```\n\n## API\n\nFor advanced usage, you can import and use the `scanAndFollow()` function programmatically.\n\n```typescript\nexport function scanAndFollow(options: {\n  /** @example 'http://localhost:8200/' */\n  entryUrl: string\n\n  /** @default same as entryUrl */\n  origins?: string[]\n\n  /** @description report stats on 404 pages and links */\n  report_404_stats?: boolean\n\n  /** @description specified filename to report 404 links. Skip reporting if not specified. */\n  export_404_csv_file?: string\n\n  /**\n   * @description auto close browser after all scanning\n   * @default true\n   */\n  close_browser?: boolean\n}): Promise\u003cvoid\u003e\n\n/** @description called by `scanAndFollow()` if `options.report_404_stats` is true */\nexport function get404Report(options: { origin: string }): {\n  '404 link count': number\n  'total link count': number\n  'page count with 404 link': number\n  'total page count': number\n}\n\n/** @description called by `scanAndFollow()` if `options.export_404_csv_file` is specified */\nexport function export404Pages(options: {\n  csv_file: string\n  origin: string\n}): void\n\n/** @description close the lazy loaded browser instance if it's launched */\nexport function closeBrowser(): Promise\u003cvoid\u003e\n```\n\n## License\n\nThis project is licensed under [BSD-2-Clause](./LICENSE)\n\nThis is free, libre, and open-source software. 
It comes down to four essential freedoms [[ref]](https://seirdy.one/2021/01/27/whatsapp-and-the-domestication-of-users.html#fnref:2):\n\n- The freedom to run the program as you wish, for any purpose\n- The freedom to study how the program works, and change it so it does your computing as you wish\n- The freedom to redistribute copies so you can help others\n- The freedom to distribute copies of your modified versions to others\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbeenotung%2Fscan-link","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbeenotung%2Fscan-link","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbeenotung%2Fscan-link/lists"}