https://github.com/buren/site_mapper
Map all links on a given site
- Host: GitHub
- URL: https://github.com/buren/site_mapper
- Owner: buren
- License: mit
- Created: 2014-10-22T12:34:25.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2021-11-29T14:34:43.000Z (over 3 years ago)
- Last Synced: 2024-10-13T13:36:58.789Z (8 months ago)
- Topics: gem, ruby, sitemapper
- Language: Ruby
- Homepage: https://rubygems.org/gems/site_mapper
- Size: 38.1 KB
- Stars: 11
- Watchers: 3
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# SiteMapper
Map all links on a given site.
SiteMapper will try to respect `/robots.txt`.

Works great with [Wayback Archiver](https://github.com/buren/wayback_archiver), a gem that crawls your site and submits each URL to the [Internet Archive (Wayback Machine)](https://archive.org/web/).
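
To illustrate what respecting `/robots.txt` involves, here is a minimal, self-contained sketch. It is not SiteMapper's implementation (the gem's own logic lives in `lib/robots.rb`). The helper names `disallowed_prefixes` and `allowed?` are hypothetical, and the parser is deliberately simplified: it only reads `Disallow` rules from groups that name the `*` user agent alone, and ignores `Allow` rules and wildcards.

```ruby
# Hypothetical helper, not part of SiteMapper's API: returns the Disallow
# path prefixes that apply to all user agents ("*") in a robots.txt body.
def disallowed_prefixes(robots_txt)
  prefixes = []
  applies = false
  robots_txt.each_line do |line|
    line = line.sub(/#.*/, '').strip # drop comments and surrounding whitespace
    next if line.empty?

    field, value = line.split(':', 2).map(&:strip)
    next if value.nil? # skip lines without a "field: value" shape

    case field.downcase
    when 'user-agent'
      applies = (value == '*')
    when 'disallow'
      prefixes << value if applies && !value.empty?
    end
  end
  prefixes
end

# Hypothetical helper: a path is allowed when no disallowed prefix matches it.
def allowed?(path, prefixes)
  prefixes.none? { |prefix| path.start_with?(prefix) }
end

robots_txt = <<~ROBOTS
  User-agent: *
  Disallow: /admin
  Disallow: /tmp/  # scratch files

  User-agent: SomeBot
  Disallow: /
ROBOTS

prefixes = disallowed_prefixes(robots_txt)
puts allowed?('/about', prefixes)       # prints true
puts allowed?('/admin/login', prefixes) # prints false
```

A crawler would run a check like this before requesting each discovered URL and skip any path that is not allowed.
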
## Installation
Install the gem:

```bash
gem install site_mapper
```

## Usage
Command line usage:
```bash
# Crawl all links found on pages
# within the example.com domain
site_mapper example.com
```

Ruby usage:
```ruby
# Crawl all links found on pages
# within the example.com domain
require 'site_mapper'

SiteMapper.map('example.com') do |new_url|
  puts "New URL found: #{new_url}"
end

# Log to STDOUT
SiteMapper.map('example.com', logger: :system) do |new_url|
  puts "New URL found: #{new_url}"
end
```

## Docs
You can find the docs online on [RubyDoc](http://www.rubydoc.info/github/buren/site_mapper/master).
This gem is documented using `yard`. To generate the docs locally, run it from the root of this repository:

```bash
yard # Generates documentation in doc/
```

## Contributing
Contributions, feedback and suggestions are very welcome.
1. Fork it
2. Create your feature branch (`git checkout -b my-new-feature`)
3. Commit your changes (`git commit -am 'Add some feature'`)
4. Push to the branch (`git push origin my-new-feature`)
5. Create new Pull Request

## Notes
* Special thanks to the [robots](https://rubygems.org/gems/robots) gem, which provided the bulk of the code in `lib/robots.rb`
## Alternatives
There are a couple of __great__ alternatives that are more mature and have more features than this gem. Please feel free to check them out:
* [spidr](https://github.com/postmodern/spidr#readme)
* [anemone](https://github.com/chriskite/anemone#readme)

## License
[MIT License](LICENSE)