Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/lassebunk/human_power
Easy generation of robots.txt. Force the robots into submission!
- Host: GitHub
- URL: https://github.com/lassebunk/human_power
- Owner: lassebunk
- License: mit
- Created: 2013-12-07T17:36:52.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2021-09-07T17:53:39.000Z (almost 3 years ago)
- Last Synced: 2024-05-14T13:21:30.595Z (about 1 month ago)
- Language: Ruby
- Size: 367 KB
- Stars: 96
- Watchers: 4
- Forks: 8
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
Lists
- awesome-rails - Human Power - Generate robots.txt (SEO)
README
[![Build Status](https://secure.travis-ci.org/lassebunk/human_power.png)](http://travis-ci.org/lassebunk/human_power)
# Human Power
Human Power lets you write your robots.txt in plain Ruby and force the robots into submission!
![Bingbot getting taught](http://i.imgur.com/77WVSQH.jpg)
## Installation
Add this line to your *Gemfile*:

    gem 'human_power'

Then run:

    $ bundle

Or install it yourself:

    $ gem install human_power

If you are using Rails, you can generate a sample *config/robots.rb* configuration file and a route for `/robots.txt`:

    $ rails g human_power:install
It will allow crawlers access to the whole site by default.
Now you can restart your server and visit `/robots.txt` to see what's generated from the new configuration file.
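By convention, a robots.txt that allows crawlers access to everything looks like this (the generated file may differ slightly in formatting):

```
User-agent: *
Disallow:
```

An empty `Disallow:` directive means nothing is off-limits.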
## Usage
### Using in Ruby on Rails
In *config/robots.rb*:
```ruby
# Disallow everything in /admin for all user agents
disallow_tree admin_path

# Googlebot
googlebot do
  disallow reviews_path     # Disallow a path
  allow product_path("one") # Allow a path
  disallow new_product_path, new_category_path # Disallow multiple paths in one line
end

# Bingbot
bingbot do
  disallow :all # There you go, Bingbot! (Disallow everything)
end

# Identical settings for multiple user agents
user_agent [:bingbot, :googlebot] do
  disallow login_path
end

# Custom user agent
user_agent "My Custom User Agent String" do
  disallow some_path
end

# You have access to everything from your ApplicationController
if request.subdomains.first == "api"
  disallow :all
end

# Add one or more sitemaps
sitemap sitemap_url
sitemap one_url, two_url
```

Then visit `/robots.txt` in your browser.
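For illustration, assuming `reviews_path` is `/reviews`, `product_path("one")` is `/products/one`, and the new-record paths are `/products/new` and `/categories/new` (all assumed paths), the Googlebot block and sitemap lines above would render to something like:

```
User-agent: Googlebot
Disallow: /reviews
Allow: /products/one
Disallow: /products/new
Disallow: /categories/new

Sitemap: https://example.com/sitemap.xml
```

The exact formatting and ordering may differ in the generated file.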
## Crawlers
Please see [user_agents.yml](https://github.com/lassebunk/human_power/blob/master/user_agents.yml) for a list of 170+ built-in user agents/crawlers that you can use as shown above.
The list is from [UserAgentString.com](http://www.useragentstring.com/pages/Crawlerlist/).

### Bot detection
You can use the `HumanPower.is_bot?` method to check if a user agent is a known bot / crawler:
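As a rough idea of what such a check amounts to, here is a plain-Ruby sketch using a regex built from bot name tokens (the token list below is an illustrative subset, not the gem's actual data or implementation):

```ruby
# Build a case-insensitive regex from a few known bot tokens.
# Illustrative subset only -- human_power's real list lives in user_agents.yml.
BOT_TOKENS = %w[googlebot bingbot slurp duckduckbot].freeze
BOT_REGEX  = Regexp.new(BOT_TOKENS.join("|"), Regexp::IGNORECASE)

def bot?(user_agent)
  !!(BOT_REGEX =~ user_agent.to_s)
end

bot?("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)") # => true
bot?("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1944.0 Safari/537.36") # => false
```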
```ruby
# Googlebot
ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
HumanPower.is_bot?(ua) # => true

# Chrome
ua = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1944.0 Safari/537.36"
HumanPower.is_bot?(ua) # => false

# in Rails
HumanPower.is_bot?(request.user_agent) # => performs check on current user agent
```

### Regular expression
If you need to get a regular expression for bot detection, you can use:
```ruby
HumanPower.bot_regex # => regular expression that matches all known bots / crawlers
```

## Caveats
Human Power is great for adding rules to your robots.txt.
Note, however, that user agents are sorted alphabetically when the file is rendered.
This is fine for most use cases, but if you add more advanced rules that rely on user agent
order, be sure to check that the generated robots.txt meets your needs.
If you need more advanced features and rely heavily on ordering, please submit
an [issue](https://github.com/lassebunk/human_power/issues)
or [pull request](https://github.com/lassebunk/human_power/pulls). Thanks.

## Versioning
Follows [semantic versioning](http://semver.org/).
## Contributing
You are very welcome to contribute to this project if you have a feature that you think others can use. I'd appreciate it.
1. Fork the project
2. Create your feature branch (`git checkout -b my-new-feature`)
3. Commit your changes (`git commit -am 'Add feature'`)
4. Push to the branch (`git push origin my-new-feature`)
5. Create a new pull request