{"id":13393663,"url":"https://github.com/gocolly/colly","last_synced_at":"2025-05-12T17:58:16.414Z","repository":{"id":37927892,"uuid":"105279544","full_name":"gocolly/colly","owner":"gocolly","description":"Elegant Scraper and Crawler Framework for Golang","archived":false,"fork":false,"pushed_at":"2025-04-26T09:16:29.000Z","size":8633,"stargazers_count":24107,"open_issues_count":196,"forks_count":1793,"subscribers_count":327,"default_branch":"master","last_synced_at":"2025-05-05T15:19:01.248Z","etag":null,"topics":["crawler","crawling","framework","go","golang","scraper","scraping","spider"],"latest_commit_sha":null,"homepage":"https://go-colly.org/","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gocolly.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2017-09-29T14:08:49.000Z","updated_at":"2025-05-05T11:35:10.000Z","dependencies_parsed_at":"2023-10-17T04:51:47.651Z","dependency_job_id":"5fe1dc04-045b-4f4f-a7bd-e2526e57210e","html_url":"https://github.com/gocolly/colly","commit_stats":{"total_commits":480,"total_committers":117,"mean_commits":4.102564102564102,"dds":0.5208333333333333,"last_synced_commit":"3c987f1982edbb5ba8876eef56dd35e1ff05932a"},"previous_names":["asciimoo/colly"],"tags_count":7,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gocolly%2Fcolly","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gocolly%2Fcolly/tags","releases_url":"https://repos.ecosyste.ms/
api/v1/hosts/GitHub/repositories/gocolly%2Fcolly/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gocolly%2Fcolly/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gocolly","download_url":"https://codeload.github.com/gocolly/colly/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253792839,"owners_count":21965248,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crawler","crawling","framework","go","golang","scraper","scraping","spider"],"created_at":"2024-07-30T17:00:58.265Z","updated_at":"2025-05-12T17:58:16.379Z","avatar_url":"https://github.com/gocolly.png","language":"Go","readme":"# Colly\n\nLightning Fast and Elegant Scraping Framework for Gophers\n\nColly provides a clean interface to write any kind of crawler/scraper/spider.\n\nWith Colly you can easily extract structured data from websites, which can be used for a wide range of applications, like data mining, data processing or archiving.\n\n[![GoDoc](https://godoc.org/github.com/gocolly/colly?status.svg)](https://pkg.go.dev/github.com/gocolly/colly/v2)\n[![Backers on Open Collective](https://opencollective.com/colly/backers/badge.svg)](#backers) [![Sponsors on Open Collective](https://opencollective.com/colly/sponsors/badge.svg)](#sponsors) [![build status](https://github.com/gocolly/colly/actions/workflows/ci.yml/badge.svg)](https://github.com/gocolly/colly/actions/workflows/ci.yml)\n[![report 
card](https://img.shields.io/badge/report%20card-a%2B-ff3333.svg?style=flat-square)](http://goreportcard.com/report/gocolly/colly)\n[![view examples](https://img.shields.io/badge/learn%20by-examples-0077b3.svg?style=flat-square)](https://github.com/gocolly/colly/tree/master/_examples)\n[![Code Coverage](https://img.shields.io/codecov/c/github/gocolly/colly/master.svg)](https://codecov.io/github/gocolly/colly?branch=master)\n[![FOSSA Status](https://app.fossa.io/api/projects/git%2Bgithub.com%2Fgocolly%2Fcolly.svg?type=shield)](https://app.fossa.io/projects/git%2Bgithub.com%2Fgocolly%2Fcolly?ref=badge_shield)\n[![Twitter URL](https://img.shields.io/badge/twitter-follow-green.svg)](https://twitter.com/gocolly)\n\n\n------\n\n\n\n## Features\n\n-   Clean API\n-   Fast (\u003e1k request/sec on a single core)\n-   Manages request delays and maximum concurrency per domain\n-   Automatic cookie and session handling\n-   Sync/async/parallel scraping\n-   Caching\n-   Automatic encoding of non-unicode responses\n-   Robots.txt support\n-   Distributed scraping\n-   Configuration via environment variables\n-   Extensions\n\n## Example\n\n```go\nfunc main() {\n\tc := colly.NewCollector()\n\n\t// Find and visit all links\n\tc.OnHTML(\"a[href]\", func(e *colly.HTMLElement) {\n\t\te.Request.Visit(e.Attr(\"href\"))\n\t})\n\n\tc.OnRequest(func(r *colly.Request) {\n\t\tfmt.Println(\"Visiting\", r.URL)\n\t})\n\n\tc.Visit(\"http://go-colly.org/\")\n}\n```\n\nSee [examples folder](https://github.com/gocolly/colly/tree/master/_examples) for more detailed examples.\n\n## Installation\n\nAdd colly to your `go.mod` file:\n\n```\nmodule github.com/x/y\n\ngo 1.14\n\nrequire (\n        github.com/gocolly/colly/v2 latest\n)\n```\n\n## Bugs\n\nBugs or suggestions? 
Visit the [issue tracker](https://github.com/gocolly/colly/issues) or join `#colly` on Freenode.\n\n## Other Projects Using Colly\n\nBelow is a list of public, open-source projects that use Colly:\n\n-   [greenpeace/check-my-pages](https://github.com/greenpeace/check-my-pages) Scraping script to test the Spanish Greenpeace web archive.\n-   [altsab/gowap](https://github.com/altsab/gowap) Wappalyzer implementation in Go.\n-   [jesuiscamille/goquotes](https://github.com/jesuiscamille/goquotes) A quotes scraper, making your day a little better!\n-   [jivesearch/jivesearch](https://github.com/jivesearch/jivesearch) A search engine that doesn't track you.\n-   [Leagify/colly-draft-prospects](https://github.com/Leagify/colly-draft-prospects) A scraper for future NFL Draft prospects.\n-   [lucasepe/go-ps4](https://github.com/lucasepe/go-ps4) Search the PlayStation Store for your favorite PS4 games using the command line.\n-   [yringler/inside-chassidus-scraper](https://github.com/yringler/inside-chassidus-scraper) Scrapes Rabbi Paltiel's website for lesson metadata.\n-   [gamedb/gamedb](https://github.com/gamedb/gamedb) A database of Steam games.\n-   [lawzava/scrape](https://github.com/lawzava/scrape) CLI for email scraping from any website.\n-   [eureka101v/WeiboSpiderGo](https://github.com/eureka101v/WeiboSpiderGo) A Sina Weibo (Chinese Twitter) scraper.\n-   [Go-phie/gophie](https://github.com/Go-phie/gophie) Search, download, and stream movies from your terminal.\n-   [imthaghost/goclone](https://github.com/imthaghost/goclone) Clone websites to your computer within seconds.\n-   [superiss/spidy](https://github.com/superiss/spidy) Crawl the web and collect expired domains.\n-   [docker-slim/docker-slim](https://github.com/docker-slim/docker-slim) Optimize your Docker containers to make them smaller and better.\n-   [seversky/gachifinder](https://github.com/seversky/gachifinder) An agent for asynchronous scraping, parsing, and writing to storage backends (Elasticsearch for now).\n-  
 [eval-exec/goodreads](https://github.com/eval-exec/goodreads) crawl all tags and all pages of quotes from goodreads.\n\nIf you are using Colly in a project please send a pull request to add it to the list.\n\n## Contributors\n\nThis project exists thanks to all the people who contribute. [[Contribute]](CONTRIBUTING.md).\n\u003ca href=\"https://github.com/gocolly/colly/graphs/contributors\"\u003e\u003cimg src=\"https://opencollective.com/colly/contributors.svg?width=890\" /\u003e\u003c/a\u003e\n\n## Backers\n\nThank you to all our backers! 🙏 [[Become a backer](https://opencollective.com/colly#backer)]\n\n\u003ca href=\"https://opencollective.com/colly#backers\" target=\"_blank\"\u003e\u003cimg src=\"https://opencollective.com/colly/backers.svg?width=890\"\u003e\u003c/a\u003e\n\n## Sponsors\n\nSupport this project by becoming a sponsor. Your logo will show up here with a link to your website. [[Become a sponsor](https://opencollective.com/colly#sponsor)]\n\n\u003ca href=\"https://opencollective.com/colly/sponsor/0/website\" target=\"_blank\"\u003e\u003cimg src=\"https://opencollective.com/colly/sponsor/0/avatar.svg\"\u003e\u003c/a\u003e\n\u003ca href=\"https://opencollective.com/colly/sponsor/1/website\" target=\"_blank\"\u003e\u003cimg src=\"https://opencollective.com/colly/sponsor/1/avatar.svg\"\u003e\u003c/a\u003e\n\u003ca href=\"https://opencollective.com/colly/sponsor/2/website\" target=\"_blank\"\u003e\u003cimg src=\"https://opencollective.com/colly/sponsor/2/avatar.svg\"\u003e\u003c/a\u003e\n\u003ca href=\"https://opencollective.com/colly/sponsor/3/website\" target=\"_blank\"\u003e\u003cimg src=\"https://opencollective.com/colly/sponsor/3/avatar.svg\"\u003e\u003c/a\u003e\n\u003ca href=\"https://opencollective.com/colly/sponsor/4/website\" target=\"_blank\"\u003e\u003cimg src=\"https://opencollective.com/colly/sponsor/4/avatar.svg\"\u003e\u003c/a\u003e\n\u003ca href=\"https://opencollective.com/colly/sponsor/5/website\" target=\"_blank\"\u003e\u003cimg 
src=\"https://opencollective.com/colly/sponsor/5/avatar.svg\"\u003e\u003c/a\u003e\n\u003ca href=\"https://opencollective.com/colly/sponsor/6/website\" target=\"_blank\"\u003e\u003cimg src=\"https://opencollective.com/colly/sponsor/6/avatar.svg\"\u003e\u003c/a\u003e\n\u003ca href=\"https://opencollective.com/colly/sponsor/7/website\" target=\"_blank\"\u003e\u003cimg src=\"https://opencollective.com/colly/sponsor/7/avatar.svg\"\u003e\u003c/a\u003e\n\u003ca href=\"https://opencollective.com/colly/sponsor/8/website\" target=\"_blank\"\u003e\u003cimg src=\"https://opencollective.com/colly/sponsor/8/avatar.svg\"\u003e\u003c/a\u003e\n\u003ca href=\"https://opencollective.com/colly/sponsor/9/website\" target=\"_blank\"\u003e\u003cimg src=\"https://opencollective.com/colly/sponsor/9/avatar.svg\"\u003e\u003c/a\u003e\n\n## License\n\n[![FOSSA Status](https://app.fossa.io/api/projects/git%2Bgithub.com%2Fgocolly%2Fcolly.svg?type=large)](https://app.fossa.io/projects/git%2Bgithub.com%2Fgocolly%2Fcolly?ref=badge_large)\n","funding_links":["https://opencollective.com/colly"],"categories":["Popular","Go","开源类库","Misc","🕷️ Web Scraping Frameworks","Applications","Open source library","Uncategorized","Library","spider","网络服务","Specific Formats","Core Libraries","Application Recommendation","Text Processing","Golang","Repositories","Bot Building"],"sub_categories":["爬虫","Go","Crawlers","Uncategorized","网络爬虫","🤖 Automation Tools","Scrapers","Scraper"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgocolly%2Fcolly","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgocolly%2Fcolly","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgocolly%2Fcolly/lists"}