# crawler

[![GoDoc](https://godoc.org/github.com/ernesto-jimenez/crawler?status.svg)](https://godoc.org/github.com/ernesto-jimenez/crawler)
[![Build Status](https://travis-ci.org/ernesto-jimenez/crawler.svg?branch=master)](https://travis-ci.org/ernesto-jimenez/crawler)
[![Go Report Card](https://goreportcard.com/badge/ernesto-jimenez/crawler)](https://goreportcard.com/ernesto-jimenez/crawler)

A simple package to quickly build programs that require crawling websites.

```
go get github.com/ernesto-jimenez/crawler
```

## Usage

[embedmd]:# (example_crawler_test.go /func Example/ $)
```go
func Example() {
	startURL := "https://godoc.org"

	cr, err := crawler.New()
	if err != nil {
		panic(err)
	}

	// The callback is invoked for each crawled URL with either a response or an error.
	err = cr.Crawl(startURL, func(url string, res *crawler.Response, err error) error {
		if err != nil {
			fmt.Printf("error: %s", err.Error())
			return nil
		}
		fmt.Printf("%s - Links: %d Assets: %d\n", url, len(res.Links), len(res.Assets))
		// Returning ErrSkipURL keeps the crawler from following this page's links,
		// so only the start URL shows up in the output below.
		return crawler.ErrSkipURL
	})
	if err != nil {
		panic(err)
	}
	// Output:
	// https://godoc.org/ - Links: 39 Assets: 5
}
```