{"id":20795333,"url":"https://github.com/ndoolan360/go-crawler","last_synced_at":"2026-04-21T15:36:07.956Z","repository":{"id":206210032,"uuid":"716093363","full_name":"NDoolan360/go-crawler","owner":"NDoolan360","description":"A simple web crawling program written in Go in an afternoon. 🕷️🕸️","archived":false,"fork":false,"pushed_at":"2023-11-09T04:00:28.000Z","size":7,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-12T01:34:33.846Z","etag":null,"topics":["afternoon-project","crawler","scraper"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/NDoolan360.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-11-08T12:40:08.000Z","updated_at":"2024-11-24T06:52:16.000Z","dependencies_parsed_at":"2023-11-08T13:51:32.477Z","dependency_job_id":"891bb2c9-5405-4f7c-a1be-4ea378ae0a0a","html_url":"https://github.com/NDoolan360/go-crawler","commit_stats":null,"previous_names":["ndoolan360/go-crawler"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/NDoolan360/go-crawler","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NDoolan360%2Fgo-crawler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NDoolan360%2Fgo-crawler/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NDoolan360%2Fgo-crawler/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NDoolan360%2Fgo-crawler/manifest
s","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/NDoolan360","download_url":"https://codeload.github.com/NDoolan360/go-crawler/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NDoolan360%2Fgo-crawler/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32098198,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-21T11:25:29.218Z","status":"ssl_error","status_checked_at":"2026-04-21T11:25:28.499Z","response_time":128,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["afternoon-project","crawler","scraper"],"created_at":"2024-11-17T16:21:12.346Z","updated_at":"2026-04-21T15:36:07.926Z","avatar_url":"https://github.com/NDoolan360.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Go Crawler\r\n\r\n**go-crawler** is a simple web crawling program written in Go in an afternoon. It starts from a given URL and crawls through web pages, collecting links up to a specified depth and printing what it has found to stdout. This tool could serve as a foundation for web scraping and web data collection applications.\r\n\r\n## Getting Started\r\n\r\nTo get started with the Go Crawler, follow these steps:\r\n\r\n1. 
Clone the repository:\r\n\r\n```bash\r\ngit clone https://github.com/NDoolan360/go-crawler.git\r\ncd go-crawler\r\n```\r\n\r\n2. Build the executable:\r\n\r\n```bash\r\ngo get go-crawler\r\ngo build\r\n```\r\n\r\n3. Run the program:\r\n\r\n```bash\r\n./go-crawler \u003cstartURL\u003e \u003cmaxDepth\u003e\r\n```\r\n\r\n- \\\u003cstartURL\u003e: The URL from which crawling begins.\r\n- \\\u003cmaxDepth\u003e (optional): The maximum depth for crawling. Defaults to 1 if not specified.\r\n\r\n## Usage\r\n\r\n1. The program takes at least one command-line argument, the starting URL for crawling. You can optionally provide a second argument for the maximum depth of the crawl.\r\n\r\n2. The crawler starts from the specified URL and collects links up to the specified depth.\r\n\r\n3. The crawled URLs and any errors encountered during the process are printed to the console.\r\n\r\n### Example\r\n\r\n```bash\r\n./go-crawler https://example.com 2\r\n```\r\n\r\nThis command starts crawling from \"https://example.com\" up to a depth of 2.\r\n\r\n## Features\r\n\r\n- Recursive web crawling starting from a given URL.\r\n- Specify the maximum depth for the crawl.\r\n- Handle and report HTTP errors.\r\n- Extract links from HTML content.\r\n\r\n## Dependencies\r\n\r\nThe program uses the following Go packages:\r\n\r\n- net/http: For making HTTP requests.\r\n- golang.org/x/net/html: For parsing HTML content and extracting links.\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fndoolan360%2Fgo-crawler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fndoolan360%2Fgo-crawler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fndoolan360%2Fgo-crawler/lists"}