{"id":13828338,"url":"https://github.com/dantleech/fink","last_synced_at":"2025-04-04T16:16:55.777Z","repository":{"id":45711402,"uuid":"166602440","full_name":"dantleech/fink","owner":"dantleech","description":"PHP Link Checker","archived":false,"fork":false,"pushed_at":"2024-03-16T12:10:00.000Z","size":311,"stargazers_count":205,"open_issues_count":28,"forks_count":26,"subscribers_count":9,"default_branch":"master","last_synced_at":"2024-10-30T04:50:10.598Z","etag":null,"topics":["link-checker","php","spider"],"latest_commit_sha":null,"homepage":null,"language":"PHP","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dantleech.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-01-19T22:59:27.000Z","updated_at":"2024-10-24T09:07:12.000Z","dependencies_parsed_at":"2024-06-21T16:32:47.787Z","dependency_job_id":"77cf2f79-eeb3-45d1-8310-ef7b5b67976d","html_url":"https://github.com/dantleech/fink","commit_stats":{"total_commits":250,"total_committers":16,"mean_commits":15.625,"dds":"0.21199999999999997","last_synced_commit":"8cf53525be4244131b6fd7339a00dc8c0625fabb"},"previous_names":[],"tags_count":19,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dantleech%2Ffink","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dantleech%2Ffink/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dantleech%2Ffink/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dantleech%2Ffink/manif
ests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dantleech","download_url":"https://codeload.github.com/dantleech/fink/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247208190,"owners_count":20901570,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["link-checker","php","spider"],"created_at":"2024-08-04T09:02:41.991Z","updated_at":"2025-04-04T16:16:55.761Z","avatar_url":"https://github.com/dantleech.png","language":"PHP","funding_links":[],"categories":["PHP"],"sub_categories":[],"readme":"Fink\n====\n\n[![Build Status](https://travis-ci.org/dantleech/fink.svg?branch=master)](https://travis-ci.org/dantleech/fink)\n\nFink (pronounced \"Phpink\") is a command line tool, written in PHP, for checking HTTP links.\n\n- Check websites for broken links or error pages.\n- Asynchronous HTTP requests.\n\n![recording](https://user-images.githubusercontent.com/530801/55685040-e4f11400-5949-11e9-9f79-51c5c23a40c0.gif)\n\nInstallation\n------------\n\nInstall as a stand-alone tool or as a project dependency:\n\n### Installing as a project dependency\n\n```bash\n$ composer require dantleech/fink --dev\n```\n\n### Installing from a PHAR\n\nDownload the PHAR from the\n[Releases](https://github.com/dantleech/fink/releases) page.\n\n### Building your own PHAR with Box\n\nYou can build your own PHAR by cloning this repository and running:\n\n```bash\n$ ./vendor/bin/box compile\n```\n\nUsage\n-----\n\nRun the command with a single URL to start crawling:\n\n```\n$ ./vendor/bin/fink 
https://www.example.com\n```\n\nUse `--output=somefile` to log verbose information for each URL in JSON format, including:\n\n- `url`: The tested URL.\n- `status`: The HTTP status code.\n- `referrer`: The page which linked to the URL.\n- `referrer_title`: The value (e.g. link title) of the referring element.\n- `referrer_xpath`: The path to the node in the referring document.\n- `distance`: The number of links away from the start document.\n- `request_time`: Number of microseconds taken to make the request.\n- `timestamp`: The time that the request was made.\n- `exception`: Any runtime exception encountered (e.g. malformed URL, etc).\n\nArguments\n---------\n\n- `url` (multiple) Specify one or more base URLs to crawl (mandatory).\n\nOptions\n-------\n\n- `--client-max-body-size`: Max body size for HTTP client (in bytes).\n- `--client-max-header-size`: Max header size for HTTP client (in bytes).\n- `--client-redirects=5`: Set the maximum number of times the client should redirect (`0` to never redirect).\n- `--client-security-level=1`: Set the default SSL [security\n  level](https://www.openssl.org/docs/manmaster/man3/SSL_CTX_set_security_level.html)\n- `--client-timeout=15000`: Set the maximum amount of time (in milliseconds)\n  the client should wait for a response, defaults to 15,000 (15 seconds).\n- `--concurrency`: Number of simultaneous HTTP requests to use.\n- `--display-bufsize=10`: Set the number of URLs to consider when showing the\n  display.\n- `--display=+memory`: Set, add or remove elements of the runtime display\n  (prefix with `-` or `+` to modify the default set).\n- `--exclude-url=logout`: (multiple) Exclude URLs matching the given PCRE pattern.\n- `--header=\"Foo: Bar\"`: (multiple) Specify custom header(s).\n- `--help`: Display available options.\n- `--include-link=foobar.html`: Include given link as if it were linked from the\n  base URL.\n- `--insecure`: Do not verify SSL certificates.\n- `--load-cookies`: Load from a 
[cookies.txt](http://www.cookiecentral.com/faq/#3.5).\n- `--max-distance`: Maximum allowed distance from base URL (if not specified\n  then there is no limitation).\n- `--max-external-distance`: Limit the external (disjoint) distance from the\n  base URL.\n- `--no-dedupe`: Do _not_ filter duplicate URLs (can result in a\n  non-terminating process).\n- `--output=out.json`: Output JSON report for each URL to given file\n  (truncates existing content).\n- `--publisher=csv`: Set the publisher (defaults to `json`); can be either\n  `json` or `csv`.\n- `--rate`: Set a maximum number of requests to make in a second.\n- `--stdout`: Stream to STDOUT directly, disables display and any specified outfile.\n\nExamples\n--------\n\n### Crawl a single website\n\n```\n$ fink http://www.example.com --max-external-distance=0\n```\n\n### Crawl a single website and check the status of external links\n\n```\n$ fink http://www.example.com --max-external-distance=1\n```\n\n### Use `jq` to analyse results\n\n[jq](https://stedolan.github.io/jq/) is a tool which can be used to query and\nmanipulate JSON data.\n\n```\n$ fink http://www.example.com -x0 -oreport.json\n```\n\n```\n$ cat report.json| jq -c '. | select(.status==404) | {url: .url, referrer: .referrer}' | jq\n```\n\n### Crawl pages behind a login\n\n```\n# create a cookies file for later re-use (simulate a login in this case via HTTP-POST)\n$ curl -L --cookie-jar mycookies.txt -d username=myLogin -d password=MyP4ssw0rd https://www.example.org/my/login/url\n\n# re-use the cookies file with your fink crawl command\n$ fink https://www.example.org/myaccount --load-cookies=mycookies.txt\n```\n\nNote: it's not possible to create the cookie jar on computer A, store it and read it in again on e.g. 
a Linux server.\nYou need to create the cookie file from the very same IP address, because otherwise server-side session handling might not continue the HTTP session because of an IP mismatch.\n\nExit Codes\n----------\n\n- `0`: All URLs were successful.\n- `1`: Unexpected runtime error.\n- `2`: At least one URL failed to resolve successfully.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdantleech%2Ffink","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdantleech%2Ffink","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdantleech%2Ffink/lists"}