{"id":13425047,"url":"https://github.com/hakluke/hakrawler","last_synced_at":"2025-05-14T21:00:18.084Z","repository":{"id":37290148,"uuid":"228192593","full_name":"hakluke/hakrawler","owner":"hakluke","description":"Simple, fast web crawler designed for easy, quick discovery of endpoints and assets within a web application","archived":false,"fork":false,"pushed_at":"2024-12-21T20:40:03.000Z","size":1786,"stargazers_count":4700,"open_issues_count":8,"forks_count":518,"subscribers_count":62,"default_branch":"master","last_synced_at":"2025-05-07T20:29:03.563Z","etag":null,"topics":["bugbounty","crawling","hacking","osint","pentesting","recon","reconnaissance"],"latest_commit_sha":null,"homepage":"https://hakluke.com","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hakluke.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-12-15T13:54:43.000Z","updated_at":"2025-05-06T21:41:45.000Z","dependencies_parsed_at":"2024-01-18T13:34:32.046Z","dependency_job_id":"54b367c3-94c3-4479-9dd2-524ef844ee1a","html_url":"https://github.com/hakluke/hakrawler","commit_stats":{"total_commits":183,"total_committers":25,"mean_commits":7.32,"dds":"0.35519125683060104","last_synced_commit":"14e240b1e1758638a4e4152ccd339e3944841fe3"},"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hakluke%2Fhakrawler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hakluke%2Fhakrawler/tags","releases_url":"https://repos.ecosyste
.ms/api/v1/hosts/GitHub/repositories/hakluke%2Fhakrawler/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hakluke%2Fhakrawler/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hakluke","download_url":"https://codeload.github.com/hakluke/hakrawler/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254227603,"owners_count":22035667,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bugbounty","crawling","hacking","osint","pentesting","recon","reconnaissance"],"created_at":"2024-07-31T00:01:03.230Z","updated_at":"2025-05-14T21:00:18.002Z","avatar_url":"https://github.com/hakluke.png","language":"Go","readme":"# Hakrawler\n\nFast golang web crawler for gathering URLs and JavaScript file locations. It is basically a simple wrapper around the awesome Gocolly library.\n\n## Example usages\n\nSingle URL:\n\n```\necho https://google.com | hakrawler\n```\n\nMultiple URLs:\n\n```\ncat urls.txt | hakrawler\n```\n\nTimeout for each line of stdin after 5 seconds:\n\n```\ncat urls.txt | hakrawler -timeout 5\n```\n\nSend all requests through a proxy:\n\n```\ncat urls.txt | hakrawler -proxy http://localhost:8080\n```\n\nInclude subdomains:\n\n```\necho https://google.com | hakrawler -subs\n```\n\n\u003e Note: a common issue is that the tool returns no URLs. This usually happens when a domain is specified (https://example.com), but it redirects to a subdomain (https://www.example.com). The subdomain is not included in the scope, so no URLs are printed. 
In order to overcome this, either specify the final URL in the redirect chain or use the `-subs` option to include subdomains.\n\n## Example tool chain\n\nGet all subdomains of google, find the ones that respond to http(s), crawl them all.\n\n```\necho google.com | haktrails subdomains | httpx | hakrawler\n```\n\n## Installation\n\n### Normal Install\n\nFirst, you'll need to [install go](https://golang.org/doc/install).\n\nThen run this command to download + compile hakrawler:\n```\ngo install github.com/hakluke/hakrawler@latest\n```\n\nYou can now run `~/go/bin/hakrawler`. If you'd like to just run `hakrawler` without the full path, you'll need to `export PATH=\"$HOME/go/bin:$PATH\"` (note: a tilde is not expanded inside double quotes, so use `$HOME`). You can also add this line to your `~/.bashrc` file if you'd like this to persist.\n\n### Docker Install (from dockerhub)\n\n```\necho https://www.google.com | docker run --rm -i hakluke/hakrawler:v2 -subs\n```\n\n### Local Docker Install\n\nIt's much easier to use the dockerhub method above, but if you'd prefer to run it locally:\n\n```\ngit clone https://github.com/hakluke/hakrawler\ncd hakrawler\nsudo docker build -t hakluke/hakrawler .\nsudo docker run --rm -i hakluke/hakrawler --help\n```\n\n### Kali Linux: Using apt\n\nNote: This will install an older version of hakrawler without all the features, and it may be buggy. I recommend using one of the other methods.\n\n```sh\nsudo apt install hakrawler\n```\n\nThen, to run hakrawler:\n\n```\necho https://www.google.com | hakrawler -subs\n```\n\n## Command-line options\n```\nUsage of hakrawler:\n  -d int\n    \tDepth to crawl. (default 2)\n  -dr\n    \tDisable following HTTP redirects.\n  -h string\n    \tCustom headers separated by two semi-colons. E.g. -h \"Cookie: foo=bar;;Referer: http://example.com/\"\n  -i\tOnly crawl inside path\n  -insecure\n    \tDisable TLS verification.\n  -json\n    \tOutput as JSON.\n  -proxy string\n    \tProxy URL. E.g. 
-proxy http://127.0.0.1:8080\n  -s\tShow the source of URL based on where it was found. E.g. href, form, script, etc.\n  -size int\n    \tPage size limit, in KB. (default -1)\n  -subs\n    \tInclude subdomains for crawling.\n  -t int\n    \tNumber of threads to utilise. (default 8)\n  -timeout int\n    \tMaximum time to crawl each URL from stdin, in seconds. (default -1)\n  -u\tShow only unique URLs.\n  -w\tShow at which link the URL is found.\n```\n","funding_links":[],"categories":["Go","All","Content Discovery","Recon","Weapons","Tools","Go (531)","扫描器、资产收集、子域名","[](#table-of-contents) Table of contents","bugbounty","Exfiltration","信息搜集"],"sub_categories":["Crawlers","Content Discovery","Tools","Posts from Hacker101 members on how to get started hacking","网络服务_其他","[](#dorkspentestvulnerabilities)Dorks/Pentest/Vulnerabilities","Purple Team"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhakluke%2Fhakrawler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhakluke%2Fhakrawler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhakluke%2Fhakrawler/lists"}