{"id":20510309,"url":"https://github.com/frobware/grawler","last_synced_at":"2025-03-05T22:26:09.821Z","repository":{"id":57528710,"uuid":"91804088","full_name":"frobware/grawler","owner":"frobware","description":"Web Crawler","archived":false,"fork":false,"pushed_at":"2017-05-26T07:28:27.000Z","size":24,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-02-24T07:38:24.619Z","etag":null,"topics":["crawler","go"],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/frobware.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-05-19T12:35:38.000Z","updated_at":"2017-05-19T12:40:51.000Z","dependencies_parsed_at":"2022-09-10T20:01:17.965Z","dependency_job_id":null,"html_url":"https://github.com/frobware/grawler","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/frobware%2Fgrawler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/frobware%2Fgrawler/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/frobware%2Fgrawler/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/frobware%2Fgrawler/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/frobware","download_url":"https://codeload.github.com/frobware/grawler/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":242112691,"owners_count":20073651,"icon_ur
l":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crawler","go"],"created_at":"2024-11-15T20:28:56.497Z","updated_at":"2025-03-05T22:26:09.796Z","avatar_url":"https://github.com/frobware.png","language":"Go","readme":"[![Travis CI](https://travis-ci.org/frobware/grawler.svg?branch=master)](https://travis-ci.org/frobware/grawler)\n[![GoDoc](https://img.shields.io/badge/godoc-reference-blue.svg?style=flat-square)](https://godoc.org/github.com/frobware/grawler)\n[![Coverage Status](http://codecov.io/github/frobware/grawler/coverage.svg?branch=master)](http://codecov.io/github/frobware/grawler?branch=master)\n[![Report Card](https://goreportcard.com/badge/github.com/frobware/grawler)](https://goreportcard.com/report/github.com/frobware/grawler)\n\n# Web Crawler\n\nA web crawler library written in Go.\n\n## Installation\n\n\t$ go get -u golang.org/x/net/html\n\t$ go get -u github.com/frobware/grawler/...\n\nThe `sitemap` binary is an example of using the library.\n\nGiven a URL, it prints a basic sitemap for that URL's domain,\nlisting the links on each page together with a list of assets found\non each page. At the moment, only `img`, `script` and `link` elements\nare considered assets. `sitemap` also downloads only links from\nthe same domain. 
Downloads are concurrent by default, governed\nby the `-j \u003cN\u003e` argument.\n\n\t$ sitemap -j 42 http://gopl.io\n\n```json\n{\n  \"URL\": \"http://gopl.io\",\n  \"Links\": [\n\t\"http://www.informit.com/store/go-programming-language-9780134190440\",\n\t\"http://www.amazon.com/dp/0134190440\",\n\t\"http://www.barnesandnoble.com/w/1121601944\",\n\t\"http://gopl.io/ch1.pdf\",\n\t\"https://github.com/adonovan/gopl.io/\",\n\t\"http://gopl.io/reviews.html\",\n\t\"http://gopl.io/translations.html\",\n\t\"http://gopl.io/errata.html\",\n\t\"http://golang.org/s/oracle-user-manual\",\n\t\"http://golang.org/lib/godoc/analysis/help.html\",\n\t\"https://github.com/golang/tools/blob/master/refactor/eg/eg.go\",\n\t\"https://github.com/golang/tools/blob/master/refactor/rename/rename.go\",\n\t\"http://www.amazon.com/dp/0131103628?tracking_id=disfordig-20\",\n\t\"http://www.amazon.com/dp/020161586X?tracking_id=disfordig-20\"\n  ],\n  \"Assets\": [\n\t\"style.css\",\n\t\"cover.png\",\n\t\"buyfromamazon.png\",\n\t\"informit.png\",\n\t\"barnesnoble.png\"\n  ]\n},\n{\n  \"URL\": \"http://gopl.io/errata.html\",\n  \"Links\": [\n\t\"https://github.com/golang/proposal/blob/master/design/12416-cgo-pointers.md\"\n  ],\n  \"Assets\": [\n\t\"style.css\"\n  ]\n},\n{\n  \"URL\": \"http://gopl.io/reviews.html\",\n  \"Links\": 
[\n\t\"https://www.usenix.org/system/files/login/articles/login_dec15_17_books.pdf\",\n\t\"http://lpar.ath0.com/2015/12/03/review-go-programming-language-book\",\n\t\"http://www.computingreviews.com/index_dynamic.cfm?CFID=15675338\\u0026CFTOKEN=37047869\",\n\t\"http://www.infoq.com/articles/the-go-programming-language-book-review\",\n\t\"http://www.onebigfluke.com/2016/03/book-review-go-programming-language.html\",\n\t\"http://eli.thegreenplace.net/2016/book-review-the-go-programming-language-by-alan-donovan-and-brian-kernighan\",\n\t\"http://www.amazon.com/Programming-Language-Addison-Wesley-Professional-Computing/product-reviews/0134190440/ref=cm_cr_dp_see_all_summary\"\n  ],\n  \"Assets\": [\n\t\"style.css\",\n\t\"5stars.png\"\n  ]\n},\n{\n  \"URL\": \"http://gopl.io/translations.html\",\n  \"Links\": [\n\t\"http://www.acornpub.co.kr/book/go-programming\",\n\t\"http://www.williamspublishing.com/Books/978-5-8459-2051-5.html\",\n\t\"http://helion.pl/ksiazki/jezyk-go-poznaj-i-programuj-alan-a-a-donovan-brian-w-kernighan,jgopop.htm\",\n\t\"http://helion.pl/\",\n\t\"http://www.amazon.co.jp/exec/obidos/ASIN/4621300253\",\n\t\"http://www.maruzen.co.jp/corp/en/services/publishing.html\",\n\t\"http://novatec.com.br/\",\n\t\"http://www.gotop.com.tw/\",\n\t\"http://www.pearsonapac.com/\"\n  ],\n  \"Assets\": [\n\t\"style.css\"\n  ]\n},\n{\n  \"URL\": \"http://gopl.io/ch1.pdf\"\n}\n```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffrobware%2Fgrawler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffrobware%2Fgrawler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffrobware%2Fgrawler/lists"}