https://github.com/ncodes/robotty
Robots.txt parser for Go language
- Host: GitHub
- URL: https://github.com/ncodes/robotty
- Owner: ncodes
- Created: 2014-07-03T18:46:07.000Z (almost 12 years ago)
- Default Branch: master
- Last Pushed: 2014-07-03T18:48:33.000Z (almost 12 years ago)
- Last Synced: 2024-06-20T23:54:29.276Z (almost 2 years ago)
- Language: Go
- Size: 1.75 MB
- Stars: 0
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: readme.md
README
Robotty is a robots.txt parser for the Go language
--------------------------------------------------
Usage:
Import the package:
```go
import "robot"
```
```go
// parse a robots.txt string
decision := robot.FromString("User-agent: *\nDisallow: /css/\nDisallow: /cgi-bin/")
decision.IsAllowed("http://site.com/css", "*") // returns false

// or parse a robots.txt http.Response
resp, err := http.Get("http://google.com.ng/robots.txt")
if err != nil {
	// handle the request error
}
decision, err = robot.FromResponse(resp)
if err != nil {
	// handle the parse error
}
decision.IsAllowed("http://google.com.ng/bleh/bleh", "*")
```
- This library follows the matching rules described in Google's robots.txt specification: https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt
- Decision.IsAllowed first looks for the group that matches the specified user agent, and falls back to the default group ("*") only if no group for that user agent is found.
- Within a group, only the first matching directive is applied.