Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/p0dalirius/crawlersuseragents
Python script to check if there are any differences in the responses of an application when the request comes from a search engine's crawler.
- Host: GitHub
- URL: https://github.com/p0dalirius/crawlersuseragents
- Owner: p0dalirius
- Created: 2021-11-15T14:59:54.000Z (almost 3 years ago)
- Default Branch: master
- Last Pushed: 2023-10-01T18:33:36.000Z (about 1 year ago)
- Last Synced: 2024-05-01T17:26:38.217Z (6 months ago)
- Topics: bugbounty, crawler, crawlers, pentest, request, tool, user-agent, web
- Language: Python
- Homepage: https://podalirius.net/
- Size: 302 KB
- Stars: 19
- Watchers: 3
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
Awesome Lists containing this project
README
![](./.github/banner.png)
This Python script can be used to check if there are any differences in the responses of an application when the request comes from a search engine's crawler.
![](./.github/four_results.png)
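The core check can be sketched in a few lines of Python (a minimal illustration of the idea, not the script's actual code; `fetch_length` and `compare` are hypothetical helper names, and only a small subset of user-agent strings is shown):

```python
import urllib.request

# A few well-known crawler User-Agent strings (illustrative subset;
# the tool itself ships with around 30 of them).
CRAWLER_UAS = {
    "Googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Bingbot": "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
    "Browser": "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
}

def fetch_length(url: str, user_agent: str) -> int:
    """Fetch the URL with the given User-Agent and return the body length."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req) as resp:
        return len(resp.read())

def compare(url: str) -> dict:
    """Map each crawler name to its response length for side-by-side comparison."""
    return {name: fetch_length(url, ua) for name, ua in CRAWLER_UAS.items()}
```

A response length that differs for one crawler but not the others is a hint that the application serves crawler-specific content.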
## Features
- [x] 30 crawler user-agent strings.
- [x] Multithreading.
- [x] JSON export with `--json outputfile.json`.
- [x] Auto-detecting responses that stand out.

## Usage
```
$ ./crawlersuseragents.py -h
[~] Access web pages as web crawlers User-Agents, v1.1

usage: crawlersuseragents.py [-h] [-v] [-t THREADS] [-x PROXY] [-k] [-L] [-j JSONFILE] url
This Python script can be used to check if there are any differences in the responses of an application
when the request comes from a search engine's crawler.

positional arguments:
  url                   e.g. https://example.com:port/path

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         Verbose mode (default: False)
  -t THREADS, --threads THREADS
                        Number of threads (default: 5)
  -x PROXY, --proxy PROXY
                        Specify a proxy to use for requests (e.g., http://localhost:8080)
  -k, --insecure        Allow insecure server connections when using SSL (default: False)
  -L, --location        Follow redirects (default: False)
  -j JSONFILE, --jsonfile JSONFILE
                        Save results to the specified JSON file.
```
## Auto-detecting responses that stand out
Results are sorted by the uniqueness of their response length: results with a unique response length appear at the top, while results whose response length occurs multiple times sink to the bottom:
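This sorting can be reproduced with a short Python sketch (a hypothetical reimplementation of the idea, assuming each result is a `(user_agent, length)` pair; the tool's actual data structures may differ):

```python
from collections import Counter

def sort_by_uniqueness(results):
    """Sort (user_agent, length) pairs so rare response lengths come first.

    A length seen only once is likely an anomaly worth investigating,
    so it sorts to the top; common lengths sink to the bottom.
    """
    counts = Counter(length for _, length in results)
    # sorted() is stable, so ties keep their original relative order.
    return sorted(results, key=lambda r: counts[r[1]])

results = [("Googlebot", 5120), ("Bingbot", 4096), ("DuckDuckBot", 4096), ("Slurp", 4096)]
print(sort_by_uniqueness(results))
# → [('Googlebot', 5120), ('Bingbot', 4096), ('DuckDuckBot', 4096), ('Slurp', 4096)]
```

Googlebot's 5120-byte response is the only one with a unique length, so it surfaces first.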
| Two different result lengths | Four different result lengths |
|------------------------------|--------------------------------|
| ![](./.github/two_results.png) | ![](./.github/four_results.png) |

## Contributing
Pull requests are welcome. Feel free to open an issue if you want to add other features.
## References
- https://developers.google.com/search/docs/advanced/crawling/overview-google-crawlers
- https://www.bing.com/webmasters/help/which-crawlers-does-bing-use-8c184ec0