Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/ernesto-jimenez/crawler
Easily crawl websites in Go.
- Host: GitHub
- URL: https://github.com/ernesto-jimenez/crawler
- Owner: ernesto-jimenez
- License: MIT
- Created: 2017-01-26T21:20:50.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2018-10-05T20:49:19.000Z (about 6 years ago)
- Last Synced: 2024-06-20T06:26:27.585Z (5 months ago)
- Topics: crawler, golang
- Language: Go
- Homepage:
- Size: 27.3 KB
- Stars: 4
- Watchers: 2
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# crawler
[![GoDoc](https://godoc.org/github.com/ernesto-jimenez/crawler?status.svg)](https://godoc.org/github.com/ernesto-jimenez/crawler)
[![Build Status](https://travis-ci.org/ernesto-jimenez/crawler.svg?branch=master)](https://travis-ci.org/ernesto-jimenez/crawler)
[![Go Report Card](https://goreportcard.com/badge/ernesto-jimenez/crawler)](https://goreportcard.com/ernesto-jimenez/crawler)

A simple package to quickly build programs that require crawling websites.
```
go get github.com/ernesto-jimenez/crawler
```

## Usage
[embedmd]:# (example_crawler_test.go /func Example/ $)
```go
func Example() {
	startURL := "https://godoc.org"

	cr, err := crawler.New()
	if err != nil {
		panic(err)
	}

	// The callback runs once for each fetched URL.
	err = cr.Crawl(startURL, func(url string, res *crawler.Response, err error) error {
		if err != nil {
			// Report the failed fetch and keep crawling.
			fmt.Printf("error: %s\n", err.Error())
			return nil
		}
		fmt.Printf("%s - Links: %d Assets: %d\n", url, len(res.Links), len(res.Assets))
		// Returning ErrSkipURL here is why only the start URL appears in the output.
		return crawler.ErrSkipURL
	})
	if err != nil {
		panic(err)
	}

	// Output:
	// https://godoc.org/ - Links: 39 Assets: 5
}
```