{"id":45338948,"url":"https://github.com/alaz/legitbot","last_synced_at":"2026-04-25T09:09:37.735Z","repository":{"id":14735600,"uuid":"76874370","full_name":"alaz/legitbot","owner":"alaz","description":"🤔 Is this Web request from a real search engine🕷 or from an impersonating agent 🕵️‍♀️?","archived":false,"fork":false,"pushed_at":"2026-04-04T07:07:50.000Z","size":314,"stargazers_count":28,"open_issues_count":2,"forks_count":11,"subscribers_count":2,"default_branch":"master","last_synced_at":"2026-04-04T09:15:16.329Z","etag":null,"topics":["bot","detect-crawlers","fake","googlebot","impersonation","protection","ruby","ruby-gem","search-engine","security"],"latest_commit_sha":null,"homepage":"","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/alaz.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2016-12-19T15:39:16.000Z","updated_at":"2026-04-04T07:07:53.000Z","dependencies_parsed_at":"2024-06-09T12:43:28.882Z","dependency_job_id":"86f0fdc7-70fc-4ce1-a5ba-3410658b7ead","html_url":"https://github.com/alaz/legitbot","commit_stats":{"total_commits":166,"total_committers":7,"mean_commits":"23.714285714285715","dds":"0.29518072289156627","last_synced_commit":"47e2c2b8e325d215016fc963bbedeff885c5dbff"},"previous_names":[],"tags_count":78,"template":false,"template_full_name":null,"purl":"pkg:github/alaz/legitbot","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alaz%2Flegitbot","tags_url":"
https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alaz%2Flegitbot/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alaz%2Flegitbot/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alaz%2Flegitbot/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/alaz","download_url":"https://codeload.github.com/alaz/legitbot/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alaz%2Flegitbot/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32256281,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-25T04:23:17.126Z","status":"ssl_error","status_checked_at":"2026-04-25T04:21:53.360Z","response_time":59,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bot","detect-crawlers","fake","googlebot","impersonation","protection","ruby","ruby-gem","search-engine","security"],"created_at":"2026-02-21T10:03:11.463Z","updated_at":"2026-04-25T09:09:37.697Z","avatar_url":"https://github.com/alaz.png","language":"Ruby","readme":"# Legitbot ![](https://github.com/alaz/legitbot/workflows/build/badge.svg) ![](https://badge.fury.io/rb/legitbot.svg)\n\nRuby gem to make sure that an IP really belongs to a bot, typically a search\nengine.\n\n## Usage\n\nSuppose you have a 
Web request and you would like to check that it is not disguised:\n\n```ruby\nbot = Legitbot.bot(userAgent, ip)\n```\n\n`bot` will be `nil` if no bot signature was found in the `User-Agent`.\nOtherwise, it will be an object with the following methods:\n\n```ruby\nbot.detected_as # =\u003e :google\nbot.valid? # =\u003e true\nbot.fake? # =\u003e false\n```\n\nSometimes you already know which search engine to expect. For example, you might\nbe using [rack-attack](https://github.com/kickstarter/rack-attack):\n\n```ruby\nRack::Attack.blocklist(\"fake Googlebot\") do |req|\n  req.user_agent =~ %r(Googlebot) \u0026\u0026 Legitbot::Google.fake?(req.ip)\nend\n```\n\nOr if you do not like all those ghoulish crawlers stealing your content,\nevaluating it and getting ready to invade your site with spammers, then block\nthem all:\n\n```ruby\nRack::Attack.blocklist 'fake search engines' do |request|\n  Legitbot.bot(request.user_agent, request.ip)\u0026.fake?\nend\n```\n\n## Versioning\n\n[Semantic versioning](https://semver.org/) with the following clarifications:\n\n- MINOR version is incremented when support for new bots is added.\n- PATCH version is incremented when validation logic for a bot changes (IP list\n  updated, for example).\n\n## Supported\n\n- [Ahrefs](https://ahrefs.com/robot)\n- [AmazonAdBot](https://adbot.amazon.com/)\n- [AmazonBot](https://developer.amazon.com/amazonbot)\n- [Applebot](https://support.apple.com/en-us/119829)\n- [Baidu spider](http://help.baidu.com/question?prod_en=master\u0026class=498\u0026id=1000973)\n- [Bingbot](https://blogs.bing.com/webmaster/2012/08/31/how-to-verify-that-bingbot-is-bingbot/)\n- [BLEXBot (WebMeUp)](http://webmeup-crawler.com/)\n- [DataForSEO](https://dataforseo.com/dataforseo-bot)\n- [DuckAssistBot](https://duckduckgo.com/duckduckgo-help-pages/results/duckassistbot)\n- [DuckDuckBot](https://duckduckgo.com/duckduckgo-help-pages/results/duckduckbot)\n- [Google crawlers](https://support.google.com/webmasters/answer/1061943)\n- 
[IAS](https://integralads.com/ias-privacy-data-management/policies/site-indexing-policy/)\n- [OpenAI GPTBot](https://platform.openai.com/docs/gptbot)\n- [Oracle Data Cloud Crawler](https://www.oracle.com/corporate/acquisitions/grapeshot/crawler.html)\n- [Marginalia](https://www.marginalia.nu/marginalia-search/for-webmasters/)\n- [Meta / Facebook Web crawlers](https://developers.facebook.com/docs/sharing/webmasters/web-crawlers/)\n- [Petal search engine](http://aspiegel.com/petalbot)\n- [Pinterest](https://help.pinterest.com/en/articles/about-pinterest-crawler-0)\n- [Twitterbot](https://developer.twitter.com/en/docs/tweets/optimize-with-cards/guides/getting-started),\n  the list of IPs is in the\n  [Troubleshooting page](https://developer.twitter.com/en/docs/tweets/optimize-with-cards/guides/troubleshooting-cards)\n- [Yandex robots](https://yandex.com/support/webmaster/robot-workings/check-yandex-robots.xml)\n\n## License\n\nApache 2.0\n\n## Other projects\n\n- Play Framework variant in Scala:\n  [play-legitbot](https://github.com/osinka/play-legitbot)\n- Article\n  [When (Fake) Googlebots Attack Your Rails App](http://jessewolgamott.com/blog/2015/11/17/when-fake-googlebots-attack-your-rails-app/)\n- [Voight-Kampff](https://github.com/biola/Voight-Kampff) is a Ruby gem that\n  detects bots by `User-Agent`\n- [crawler_detect](https://github.com/loadkpi/crawler_detect) is a Ruby gem and\n  Rack middleware to detect crawlers by a few different request headers, including\n  `User-Agent`\n- Project Honeypot's [http:BL](https://www.projecthoneypot.org/httpbl_api.php)\n  can not only classify an IP as a search engine, but also label it as suspicious\n  and report the number of days since the last activity. 
My implementation of\n  the protocol in Scala is [here](https://github.com/osinka/httpbl).\n- [CIDRAM](https://github.com/CIDRAM/CIDRAM) is a PHP routing manager with\n  built-in support to validate bots.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falaz%2Flegitbot","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Falaz%2Flegitbot","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falaz%2Flegitbot/lists"}