https://github.com/tzxyz/webber
A lightweight crawler framework implemented in Go
- Host: GitHub
- URL: https://github.com/tzxyz/webber
- Owner: tzxyz
- License: MIT
- Created: 2018-03-08T07:53:21.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2019-06-29T15:21:35.000Z (almost 7 years ago)
- Last Synced: 2024-06-20T15:49:48.988Z (about 2 years ago)
- Topics: crawler, golang
- Language: Go
- Homepage:
- Size: 22.5 KB
- Stars: 6
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Webber
A lightweight crawler framework.
## Get Started
``` go
package main

import (
	"strings"

	"github.com/tzxyz/webber"
)

func main() {
	webber.New().
		Name("妹子图").
		StartUrls("http://www.meizitu.com/a/more_1.html").
		Processor(func(response *webber.Response) *webber.Result {
			// List page: push the detail-page links found in the listing.
			if strings.HasPrefix(response.GetUrl(), "http://www.meizitu.com/a/more_") {
				links := response.Html().Xpath("//h3[@class = 'tit']/a/@href")
				return webber.NewResult().PushUrls(links...)
			}
			// Detail page: collect the image URLs and the title as result items.
			return webber.NewResult().
				PushItem("images", response.Html().Xpath("//div[@id='picture']/p/img/@src")).
				PushItem("title", response.Html().Xpath("//div[@class='metaRight']/h2/a/text()"))
		}).Start()
}
```
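
The processor in the example decides what to do with a response by checking its URL prefix: list pages contribute more URLs, and every other page is treated as a detail page that contributes items. As a rough sketch, that dispatch can be tried out on its own with only the standard library. The names `pageKind`, `classify`, and `listPrefix`, and the sample detail URL, are hypothetical and used for illustration only. They are not part of webber's API.

``` go
package main

import (
	"fmt"
	"strings"
)

// listPrefix is the URL prefix that the Get Started example treats as a list page.
const listPrefix = "http://www.meizitu.com/a/more_"

// pageKind is a hypothetical label for illustration; webber defines no such type.
type pageKind string

const (
	listPage   pageKind = "list"
	detailPage pageKind = "detail"
)

// classify mirrors the strings.HasPrefix check in the processor:
// URLs under listPrefix are list pages, everything else is a detail page.
func classify(url string) pageKind {
	if strings.HasPrefix(url, listPrefix) {
		return listPage
	}
	return detailPage
}

func main() {
	urls := []string{
		"http://www.meizitu.com/a/more_1.html",
		"http://www.meizitu.com/a/1234.html", // made-up detail URL for the demo
	}
	for _, u := range urls {
		fmt.Printf("%s -> %s\n", u, classify(u))
	}
}
```

Keeping the prefix in one constant means the start URL and the list-page check cannot drift apart if the site's URL scheme changes.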
## LICENSE
[MIT](LICENSE)