https://github.com/m1/smap
smap is a site-mapping engine written in Go.
crawler go go-library go-package golang golang-library golang-package golang-tools sitemap sitemap-generator web-crawler web-crawling
- Host: GitHub
- URL: https://github.com/m1/smap
- Owner: m1
- Created: 2019-03-14T23:00:17.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2019-03-14T23:02:09.000Z (about 7 years ago)
- Last Synced: 2025-02-05T03:36:28.815Z (over 1 year ago)
- Topics: crawler, go, go-library, go-package, golang, golang-library, golang-package, golang-tools, sitemap, sitemap-generator, web-crawler, web-crawling
- Language: Go
- Size: 9.77 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# smap
[GoDoc](https://godoc.org/github.com/m1/smap)
[Build Status](https://travis-ci.org/m1/smap)
[Go Report Card](https://goreportcard.com/report/github.com/m1/smap)
[Latest Release](https://github.com/m1/smap/releases/latest)
[Coverage Status](https://coveralls.io/github/m1/smap)
## Installation
Use `go get` to install the latest version:
```text
go get github.com/m1/smap
```
Then import it into your project:
```go
import (
	"github.com/m1/smap"
)
```
## Usage
smap can be used as a library, for example:
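```go
// Also requires the standard "log" and "net/url" packages.
c, err := client.New(&client.Config{
	MaxWorkers:      50,
	IgnoreRobotsTxt: false,
	UserAgent:       "user-agent 1.1",
})
if err != nil {
	log.Fatal(err)
}
u, err := url.Parse("http://example.com")
if err != nil {
	log.Fatal(err)
}
// Crawl the site, then print each page's path and its link counts.
siteMap, err := c.Crawl(u)
if err != nil {
	log.Fatal(err)
}
for _, v := range siteMap {
	println(v.URL.Path, len(v.Links), len(v.LinkedFrom))
}
```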
## CLI usage
smap can also be used from the command line. Install it using: `go get github.com/m1/gospin/cmd/gospin`
To use:
```text
➜ ~ smap --help
smap is a site-mapping engine written in Go.

Usage:
  smap [url] [flags]

Flags:
  -h, --help                help for smap
      --json                json output
      --robots              Ignores robots.txt
  -u, --user-agent string   User agent to use for the crawler
  -v, --verbose             verbose printing
  -w, --workers int         How many workers to use (default 50)
```
For example:
```text
➜ smap go build && ./smap http://google.com --json --verbose --workers=50 --user-agent="test-test" | jq
{
  "/": {
    "path": "/",
    "redirects_to": null,
    "links": [
      "/advanced_search",
      "/language_tools",
      "/intl/en/ads/",
      "/services/",
      "/intl/en/policies/privacy/",
      "/intl/en/policies/terms/"
    ],
    "linked_from": [
      "/intl/en/ads/",
      "/services/",
      "/advanced_search",
      "/language_tools"
    ],
    "is_redirect": false
  }...
```
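The `--json` output can be consumed from another Go program. The following is a minimal sketch, not part of smap itself: the `page` type, `decodeSiteMap`, and `orphans` are hypothetical names, and only the JSON keys are taken from the example output. Since the example only shows `redirects_to` as `null`, the field is kept as raw JSON rather than guessing its type.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// page mirrors one entry of the JSON output. The Go names are
// hypothetical; only the JSON keys come from the example output.
type page struct {
	Path string `json:"path"`
	// The example only shows null here, so the value is kept raw.
	RedirectsTo json.RawMessage `json:"redirects_to"`
	Links       []string        `json:"links"`
	LinkedFrom  []string        `json:"linked_from"`
	IsRedirect  bool            `json:"is_redirect"`
}

// decodeSiteMap parses the JSON output into a map keyed by path.
func decodeSiteMap(data []byte) (map[string]page, error) {
	var pages map[string]page
	if err := json.Unmarshal(data, &pages); err != nil {
		return nil, err
	}
	return pages, nil
}

// orphans returns, in sorted order, the paths that no crawled page links to.
func orphans(pages map[string]page) []string {
	var out []string
	for path, p := range pages {
		if len(p.LinkedFrom) == 0 {
			out = append(out, path)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Hand-written sample in the same shape as the output; not real crawl data.
	sample := []byte(`{
		"/": {"path": "/", "redirects_to": null, "links": ["/services/"], "linked_from": ["/services/"], "is_redirect": false},
		"/services/": {"path": "/services/", "redirects_to": null, "links": ["/"], "linked_from": ["/"], "is_redirect": false},
		"/lonely": {"path": "/lonely", "redirects_to": null, "links": [], "linked_from": [], "is_redirect": false}
	}`)
	pages, err := decodeSiteMap(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(pages), orphans(pages)) // prints: 3 [/lonely]
}
```

In practice the bytes would come from the command's standard output (for example, read from `os.Stdin` when piping `smap ... --json` into the program) instead of an inline sample.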