https://github.com/jaytaylor/archive.org
Golang package for archived webpage search via archive.org. https://jaytaylor.com/archive.org
- Host: GitHub
- URL: https://github.com/jaytaylor/archive.org
- Owner: jaytaylor
- License: mit
- Created: 2018-06-03T20:21:22.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2018-06-04T00:12:21.000Z (almost 7 years ago)
- Last Synced: 2024-06-21T00:06:39.253Z (10 months ago)
- Topics: archival, archiveorg
- Language: Go
- Size: 17.6 KB
- Stars: 5
- Watchers: 4
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# archiveorg
[GoDoc](https://godoc.org/github.com/jaytaylor/archive.org)
[Build Status](https://travis-ci.org/jaytaylor/archiveorg)
[Go Report Card](https://goreportcard.com/report/github.com/jaytaylor/archive.org)
### About
archive.org is a golang package for archiving web pages via [archive.org](https://web.archive.org).
Please be mindful and responsible and go easy on them; we want archive.org to last forever!
Created by [Jay Taylor](https://jaytaylor.com/).
Also see: [archive.is golang package](https://jaytaylor.com/archive.is)
### TODO
* Finish migrating to archive.org API
* Consider unifying to single binary
* Add `id_`, `js_`, `cs_`, etc. info to golang pkg.

Related resources:
* http://ws-dl.blogspot.com/2013/07/2013-07-15-wayback-machine-upgrades.html
* https://en.wikipedia.org/wiki/Help:Using_the_Wayback_Machine
### Requirements
* Go version 1.9 or newer
### Installation
```bash
go get jaytaylor.com/archive.org/...
```
### Usage
#### Command-line programs
##### `archive.org `
Archive a fresh new copy of an HTML page
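For orientation: archiving a page through the Wayback Machine generally means requesting its public "Save Page Now" endpoint, `https://web.archive.org/save/<url>`. The sketch below only builds that request URL and sends nothing. `saveURL` and `saveEndpoint` are hypothetical names used for illustration, not part of this package, and the package's own capture logic may differ.

```go
package main

import (
	"fmt"
	"net/url"
)

// saveEndpoint is the Wayback Machine "Save Page Now" prefix.
const saveEndpoint = "https://web.archive.org/save/"

// saveURL is a hypothetical helper (not part of this package). It checks that
// target is an absolute http(s) URL and returns the address a capture request
// would be sent to. It performs no network I/O.
func saveURL(target string) (string, error) {
	u, err := url.Parse(target)
	if err != nil {
		return "", fmt.Errorf("parsing %q: %v", target, err)
	}
	if (u.Scheme != "http" && u.Scheme != "https") || u.Host == "" {
		return "", fmt.Errorf("%q is not an absolute http(s) URL", target)
	}
	return saveEndpoint + u.String(), nil
}

func main() {
	s, err := saveURL("https://jaytaylor.com/")
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```

Validating the target before building the request avoids sending malformed captures to archive.org, in keeping with the request above to go easy on the service.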
##### `archive.org-snapshots `
Search for existing page snapshots
#### Go package interfaces
##### Archive a Page
[capture.go](_examples/capture/capture.go):
```go
package main

import (
	"fmt"

	"jaytaylor.com/archive.org"
)

var captureURL = "https://jaytaylor.com/"

func main() {
	// Request a fresh capture of captureURL and get back the archive URL.
	archiveURL, err := archiveorg.Capture(captureURL, archiveorg.DefaultRequestTimeout)
	if err != nil {
		panic(err)
	}
	fmt.Printf("Successfully archived %v via archive.org: %v\n", captureURL, archiveURL)
}

// Output:
//
// Successfully archived https://jaytaylor.com/ via archive.org: https://archive.is/i2PiW
```
##### Search for Existing Snapshots
[search.go](_examples/search/search.go):
```go
package main

import (
	"fmt"

	"jaytaylor.com/archive.org"
)

func main() {
	u := "http://blog.sendhub.com/post/16800984141/switching-to-heroku-a-django-app-story"

	hits, err := archiveorg.Search(u, archiveorg.DefaultRequestTimeout)
	if err != nil {
		panic(fmt.Errorf("Search error: %s", err))
	}
	fmt.Printf("num: %v\n", len(hits))
	for _, hit := range hits {
		fmt.Printf("hit: %+v\n", hit)
	}
}

// Output:
//
// num: 3
// hit: {URL:https://web.archive.org/web/20160304012638/http://blog.sendhub.com/post/16800984141/switching-to-heroku-a-django-app-story Reason:webwidecrawlhackernews00000hackernews StatusCode:301 Timestamp:2016-03-04 01:26:38 +0000 UTC}
// hit: {URL:https://web.archive.org/web/20120202233158/http://blog.sendhub.com/post/16800984141/switching-to-heroku-a-django-app-story Reason:alexacrawls StatusCode:200 Timestamp:2012-02-02 23:31:58 +0000 UTC}
// hit: {URL:https://web.archive.org/web/20120202201233/http://blog.sendhub.com/post/16800984141/switching-to-heroku-a-django-app-story Reason:alexacrawls StatusCode:200 Timestamp:2012-02-02 20:12:33 +0000 UTC}
```
### Running the test suite
    go test ./...
### License
Permissive MIT license, see the [LICENSE](LICENSE) file for more information.