Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/dubniczky/bad-robot
This is a Python crawler that disregards robots.txt rules and downloads disallowed resources
crawler osint-python osint-tool python robots-txt
Last synced: 5 days ago
- Host: GitHub
- URL: https://github.com/dubniczky/bad-robot
- Owner: dubniczky
- License: mit
- Created: 2023-08-02T11:54:02.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-08-05T21:53:31.000Z (over 1 year ago)
- Last Synced: 2024-05-02T05:43:06.466Z (10 months ago)
- Topics: crawler, osint-python, osint-tool, python, robots-txt
- Language: Python
- Homepage:
- Size: 16.6 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme.md
- License: LICENSE
README
# Bad Robot
This is a Python crawler that disregards robots.txt rules and downloads the resources those rules disallow.
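
As a rough illustration of the idea (not the project's actual code), the sketch below reads the body of a robots.txt file and turns each `Disallow` entry into an absolute URL. The function name `disallowed_urls` and the sample rules are assumptions made for this example. It does no network access and treats wildcard patterns as literal paths.

```python
from urllib.parse import urljoin


def disallowed_urls(base_url: str, robots_txt: str) -> list[str]:
    """Return absolute URLs for every Disallow path in a robots.txt body.

    Hypothetical helper for illustration; not taken from badrobot.py.
    """
    urls = []
    for line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        field, _, value = line.partition(":")
        path = value.strip()
        # Keep only Disallow lines; an empty value disallows nothing.
        if field.strip().lower() != "disallow" or not path:
            continue
        url = urljoin(base_url, path)
        if url not in urls:  # preserve order, skip duplicates
            urls.append(url)
    return urls


# Made-up robots.txt content used only for this example.
sample = """\
User-agent: *
Disallow: /admin/
Disallow: /private/data.html  # internal
Disallow:
"""

print(disallowed_urls("https://www.example.com", sample))
# ['https://www.example.com/admin/', 'https://www.example.com/private/data.html']
```

A crawler built on this would then request each returned URL. That step is left out here so the sketch stays self-contained.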
## Usage
Install dependencies:
```bash
make install
```

Run:
```bash
python badrobot.py
```

Example:
```bash
python badrobot.py https://www.example.com
```

> Please note that the project uses Python 3, so on some systems you might need to use `python3` instead of `python`.