Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/rndinfosecguy/Scavenger
Crawler (Bot) searching for credential leaks on paste sites.
- Host: GitHub
- URL: https://github.com/rndinfosecguy/Scavenger
- Owner: rndinfosecguy
- License: MIT
- Created: 2018-03-10T18:10:03.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2022-03-31T16:12:50.000Z (over 2 years ago)
- Last Synced: 2024-08-01T03:27:43.134Z (4 months ago)
- Topics: bot, crawler, credentials, leaks, osint, paste, pastebin, python
- Language: Python
- Homepage:
- Size: 99.6 KB
- Stars: 604
- Watchers: 29
- Forks: 118
- Open Issues: 1
- Metadata Files:
  - Readme: README.md
  - License: LICENSE
Awesome Lists containing this project
- awesome-hacking-tools - Scavenger - Paste sites crawler (bot) looking for leaked credentials (Asset Discovery / Data Leaks)
- Awesome-Asset-Discovery - Scavenger
- awesome-hacking-lists - rndinfosecguy/Scavenger - Crawler (Bot) searching for credential leaks on paste sites. (Python)
README
# Scavenger - OSINT Bot - REWORKED
---
[bot in action](https://twitter.com/leak_scavenger)
---
[![rndinfosecguy's GitHub stats](https://github-readme-stats.vercel.app/api?username=rndinfosecguy)](https://github.com/anuraghazra/github-readme-stats)
---
## Intro
Just the code of my OSINT bot searching for sensitive data leaks on paste sites.

Search terms:
- credentials
- private RSA keys
- Wordpress configuration files
- MySQL connect strings
- onion links
- SQL dumps
- API keys
- complete emails

Search terms can be customized. You can learn more about it in the configuration section.
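The README does not show the detection code itself, but conceptually each paste is checked against an email:password pattern plus the configured search terms. The following is a minimal sketch of that idea, assuming a simple regex and a hard-coded term list; the function name and regex are illustrative, not the bot's actual implementation.

```python
import re

# Hypothetical example, not the bot's actual code: classify a paste body
# using an email:password pattern and the kinds of search terms listed above.
EMAIL_PASSWORD_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+:\S+")

SEARCH_TERMS = [
    "mysqli_connect(",
    "BEGIN RSA PRIVATE KEY",
    "The name of the database for WordPress",
    "apiKey:",
    ".onion",
]

def classify_paste(text: str) -> str:
    """Return a rough category for a paste body."""
    if EMAIL_PASSWORD_RE.search(text):
        return "credentials"
    if any(term in text for term in SEARCH_TERMS):
        return "other_sensitive"
    return "nothing_found"

if __name__ == "__main__":
    print(classify_paste("admin@example.com:hunter2"))  # -> credentials
```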
## Articles About Scavenger
- https://jakecreps.com/2019/05/08/osint-collection-tools-for-pastebin/
- https://jakecreps.com/2019/01/08/scavenger/
- https://youtu.be/VCwiZ2dh17Q?t=51 (the bot is mentioned here)

## Main Features
For pastebin.com the bot has two modes:
- looking for sensitive data in the archive via scraping
- looking for sensitive data by tracking users who publish leaks

Additional features:
- customizable search terms
- scan folders with text files for sensitive information

## Configuration
1. Delete the README.md files in every subfolder as they are only placeholders
2. The bot searches for email:password combinations and other kinds of sensitive data by default. If you want to add more search terms, edit the __configs/searchterms.txt__ file or use the -3 switch in the control script.
Default __configs/searchterms.txt__ configuration:
```console
mysqli_connect(
BEGIN RSA PRIVATE KEY
The name of the database for WordPress
apiKey:
Return-Path:
insert into
INSERT INTO
.onion
```
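How the bot consumes this file is not shown in the README; a plausible reading is one term per line, with blank lines ignored. A minimal, hypothetical loader could look like this (the path follows the README, everything else is an assumption):

```python
from pathlib import Path

# Hypothetical loader (not the project's actual code): read one search term
# per line from configs/searchterms.txt, ignoring blank lines and extra
# whitespace. email:password detection is assumed to be handled separately.
def load_search_terms(path: str = "configs/searchterms.txt") -> list[str]:
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]
```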
If you want to add other search terms, just add them to the file line by line.
Do you know a useful search term which is missing here? Tell me! :-)
3. For the user tracking module of pastebin.com you need to add the target users line by line to the __configs/users.txt__ file.

## Usage
Program help:
```console
$ python3 scavenger.py -h
  _________
 /   _____/ ____ _____   ___  __ ____   ____    ____   ___________
 \_____  \_/ ___\\__  \\  \/ // __ \ /    \  / ___\_/ __ \_  __ \
 /        \  \___ / __ \\   /\  ___/|   |  \/ /_/  >  ___/|  | \/
/_______  /\___  >____  /\_/  \___  >___|  /\___  / \___  >__|
        \/     \/     \/          \/     \//_____/      \/
                                                        Reworked

usage: scavenger.py [-h] [-0] [-1] [-2] [-3] [-4]
control script
optional arguments:
  -h, --help           show this help message and exit
  -0, --pbincom        Activate pastebin.com archive scraping module
  -1, --pbincomTrack   Activate pastebin.com user tracking module
  -2, --sensitivedata  Search a specific folder for sensitive data. This might
                       be useful if you want to analyze some pastes which
                       were not collected by the bot.
  -3, --editsearch     Edit search terms file for additional search terms
                       (email:password combinations will always be searched)
  -4, --editusers      Edit user file of the pastebin.com user track module

example usage: python3 scavenger.py -0 -1
```

Crawled pastes are stored at different locations depending on their status:
- Paste crawled but nothing was detected -> __data/raw_pastes__
- Paste crawled and an email:password combination was detected -> __data/raw_pastes__ and __data/files_with_passwords__
- Paste crawled and other sensitive data was detected -> __data/raw_pastes__ and __data/otherSensitivePastes__

Pastes get stored in data/raw_pastes until they reach a limit of 48000 files.
Once there are more than 48000 pastes, they get zipped and moved to the archive folder.
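The rotation just described could be implemented roughly as follows. This is a hedged sketch only; the zip file naming and the archive path are assumptions, not the project's actual code.

```python
import time
import zipfile
from pathlib import Path

# Sketch of the rotation described above: once data/raw_pastes holds more
# than 48000 files, pack them into a zip and move it to the archive folder.
RAW_DIR = Path("data/raw_pastes")
ARCHIVE_DIR = Path("archive")          # assumed location of the archive folder
LIMIT = 48000

def rotate_raw_pastes() -> None:
    files = [p for p in RAW_DIR.iterdir() if p.is_file()]
    if len(files) <= LIMIT:
        return
    ARCHIVE_DIR.mkdir(parents=True, exist_ok=True)
    zip_path = ARCHIVE_DIR / f"raw_pastes_{int(time.time())}.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in files:
            zf.write(f, arcname=f.name)
            f.unlink()  # remove the original once it has been archived
```

---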
Start the pastebin.com archive scraping module
```console
$ python3 scavenger.py -0
```
Start the pastebin.com user tracking module
```console
$ python3 scavenger.py -1
```
When starting one of these modules, a tmux session with the running module is created in the background.

List tmux sessions:
```console
$ tmux ls
pastebincomArchive: 1 windows (created Sun Apr 14 06:33:32 2021) [204x58]
pastebincomTrack: 1 windows (created Sun Apr 14 06:33:32 2021) [204x58]
```
Interact with a tmux session (example):

```console
$ tmux a -t pastebincomArchive
$ tmux a -t pastebincomTrack
```

To detach from a session, hit CTRL+b, then d.
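The README does not show how the control script creates these sessions, but the behaviour described above maps naturally onto `tmux new-session -d`. The snippet below is an illustrative sketch of that approach, reusing the session and script names shown in this README; the real control script may do this differently.

```python
import subprocess

# Illustrative only: start a module inside a detached tmux session, matching
# the behaviour described above. Session and script names follow the README.
def start_in_tmux(session_name: str, command: str) -> None:
    subprocess.run(
        ["tmux", "new-session", "-d", "-s", session_name, command],
        check=True,
    )

if __name__ == "__main__":
    start_in_tmux("pastebincomArchive", "python3 pbincomArchiveScrape.py")
    start_in_tmux("pastebincomTrack", "python3 pbincomTrackUser.py")
```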
---
If you want to start a module without using the control script, you can do this by calling it directly.
Pastebin.com archive scraper
```console
$ python3 pbincomArchiveScrape.py
```

Pastebin.com user tracker
```console
$ python3 pbincomTrackUser.py
```

Search a specific folder for sensitive data:
```console
$ python3 findSensitiveData.py TARGET_FOLDER
```
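For a rough idea of what such a folder scan involves, here is a hedged sketch that walks the target folder and applies the same kind of checks described earlier. The detection logic and output format are assumptions for illustration, not the actual __findSensitiveData.py__.

```python
import re
import sys
from pathlib import Path

# Sketch of a folder scan in the spirit of findSensitiveData.py: walk a
# target folder and report files that contain an email:password pair or one
# of the configured search terms. Illustrative only.
EMAIL_PASSWORD_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+:\S+")

def scan_folder(target: Path, terms: list[str]) -> None:
    for path in target.rglob("*"):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        if EMAIL_PASSWORD_RE.search(text):
            print(f"{path}: possible email:password combination")
        elif any(term in text for term in terms):
            print(f"{path}: matched a search term")

if __name__ == "__main__":
    scan_folder(Path(sys.argv[1]), ["BEGIN RSA PRIVATE KEY", ".onion"])
```

---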
## To Do
If you are missing anything and want me to add features or make changes, just let me know via Twitter or a GitHub issue :-)