https://github.com/romis2012/is-bot
Detect bots/crawlers/spiders via user-agent string
- Host: GitHub
- URL: https://github.com/romis2012/is-bot
- Owner: romis2012
- License: apache-2.0
- Created: 2022-08-23T09:56:27.000Z (almost 3 years ago)
- Default Branch: master
- Last Pushed: 2024-09-07T05:49:54.000Z (9 months ago)
- Last Synced: 2024-11-17T13:56:25.931Z (7 months ago)
- Topics: bot-detection, bots, crawlers, python, user-agent, user-agent-parser, web-crawlers
- Language: Python
- Homepage:
- Size: 36.1 KB
- Stars: 9
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
README
## is-bot
[CI](https://github.com/romis2012/is-bot/actions/workflows/ci.yml)
[Coverage](https://codecov.io/gh/romis2012/is-bot)
[PyPI](https://pypi.python.org/pypi/is-bot)

Python package to detect bots/crawlers/spiders via user-agent string.
This is a port of the [isbot](https://github.com/omrilotan/isbot) JavaScript module.

## Requirements
- Python >= 3.7
- regex >= 2022.8.17

## Installation
```
pip install is-bot
```

## Usage
### Simple usage
```python
from is_bot import Bots

bots = Bots()
ua = 'Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Chrome/104.0.5112.79 Safari/537.36'
assert bots.is_bot(ua)

ua = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36'
assert not bots.is_bot(ua)
```
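
In a web application the same check is typically applied to the incoming `User-Agent` header. Below is a minimal, framework-agnostic sketch; the `should_track` helper and the plain header dict are illustrative only and not part of the is-bot API:

```python
from is_bot import Bots

bots = Bots()

def should_track(headers: dict) -> bool:
    """Return True only for requests that do not look like bot traffic."""
    ua = headers.get('User-Agent', '')
    return not bots.is_bot(ua)

# User-agent strings taken from the examples above
crawler = {'User-Agent': 'Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Chrome/104.0.5112.79 Safari/537.36'}
browser = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36'}

assert not should_track(crawler)
assert should_track(browser)
```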

### Add/remove parsing rules

```python
from is_bot import Bots

bots = Bots()
# Exclude Chrome-Lighthouse from default bot list
ua = 'Mozilla/5.0 (Linux; Android 7.0; Moto G (4)) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4695.0 Mobile Safari/537.36 Chrome-Lighthouse'
assert bots.is_bot(ua)
bots.exclude(['chrome-lighthouse'])
assert not bots.is_bot(ua)

# Add some browser to default bot list
ua = 'SomeAwesomeBrowser/10.0 (Linux; Android 7.0)'
assert not bots.is_bot(ua)
bots.extend(['SomeAwesomeBrowser'])
assert bots.is_bot(ua)
```

### Get additional parsing information
```python
from is_bot import Bots

bots = Bots()
ua = 'Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0 SearchRobot/1.0'
# view the respective match for bot user agent rule
print(bots.find(ua))
#> Search

# list all patterns that match the user agent string
print(bots.matches(ua))
#> ['(?
```
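
The `find`/`matches` pair also makes it easy to see which rules actually fire across a set of requests. A small sketch, assuming a list of user-agent strings collected elsewhere (the sample data below is made up):

```python
from collections import Counter

from is_bot import Bots

bots = Bots()

# Hypothetical sample of user-agent strings, e.g. pulled from an access log.
user_agents = [
    'Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Chrome/104.0.5112.79 Safari/537.36',
    'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36',
    'Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0 SearchRobot/1.0',
]

# Count how often each matching rule appears among bot requests.
rule_counts = Counter(bots.find(ua) for ua in user_agents if bots.is_bot(ua))
print(rule_counts)
```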