https://github.com/eschultze/urlextractor
Information gathering & website reconnaissance | https://phishstats.info/
abuse domain incident-response information-extraction information-gathering malicious-domains malware osint phishing shodan virustotal whois
- Host: GitHub
- URL: https://github.com/eschultze/urlextractor
- Owner: eschultze
- License: mit
- Created: 2016-02-22T13:47:25.000Z (almost 9 years ago)
- Default Branch: master
- Last Pushed: 2019-05-27T19:44:14.000Z (over 5 years ago)
- Last Synced: 2024-08-09T13:17:36.181Z (6 months ago)
- Topics: abuse, domain, incident-response, information-extraction, information-gathering, malicious-domains, malware, osint, phishing, shodan, virustotal, whois
- Language: Shell
- Homepage:
- Size: 2.73 MB
- Stars: 445
- Watchers: 27
- Forks: 74
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
# URLextractor
Information gathering & website reconnaissance
------

**Usage:**

`./extractor http://www.hackthissite.org/`

![](https://github.com/eschultze/URLextractor/blob/master/examples/example1.png)
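To scan several sites in one go, the single-URL invocation above can be wrapped in a small loop. This is only a sketch: the `scan_targets` helper and the `targets.txt` file name are hypothetical and not part of URLextractor. The only thing taken from the project is the `./extractor <url>` call.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of URLextractor): run a command once per URL
# listed in a file, one URL per line.
# Usage: scan_targets <targets_file> <command> [args...]
scan_targets() {
  local targets_file=$1
  shift
  local url
  # The "|| [ -n ... ]" part also handles a last line without a trailing newline.
  while IFS= read -r url || [ -n "$url" ]; do
    # Skip blank lines and lines starting with '#'.
    case $url in
      ''|'#'*) continue ;;
    esac
    # Redirect stdin so the command cannot swallow the rest of the list.
    "$@" "$url" </dev/null
  done < "$targets_file"
}

# Example (assumes ./extractor is in the current directory and
# targets.txt is a file you created yourself):
# scan_targets targets.txt ./extractor
```

Passing the command as arguments keeps the helper independent of where the script lives, so it can be tried out with `echo` before pointing it at real targets.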
**Tips:**
* Colorex: adds color to the output. Install it with `pip install colorex` and use it like `./extractor http://www.hackthissite.org/ | colorex -g "INFO" -r "ALERT"`
* Tldextract: used by the dnsenumeration function. Install it with `pip install tldextract`

Features:
------
* IP and hosting info like city and country (using [FreegeoIP](http://freegeoip.net/))
* DNS servers (using [dig](http://packages.ubuntu.com/precise/dnsutils))
* ASN, Network range, ISP name (using [RISwhois](https://www.ripe.net/analyse/archived-projects/ris-tools-web-interfaces/riswhois))
* Load balancer test
* Whois for abuse mail (using [Spamcop](https://www.spamcop.net/))
* PAC (Proxy Auto Configuration) file
* Compares hashes to diff code
* robots.txt (recursively looking for hidden stuff)
* Source code (looking for passwords and users)
* External links (frames from other websites)
* Directory FUZZ (like Dirbuster and Wfuzz, using the [Dirbuster](https://www.owasp.org/index.php/Category:OWASP_DirBuster_Project) directory list)
* [URLvoid](http://www.urlvoid.com/) API - checks Google page rank, Alexa rank and possible blacklists
* Provides useful links at other websites to correlate with IP/ASN
* Option to open ALL results in browser at the end

Changelog to version 0.2.0:
------
* [Fix] Changed GeoIP from freegeoip to ip-api
* [Fix/Improvement] Remove duplicates from robots.txt
* [Improvement] Better whois abuse contacts (abuse.net)
* [Improvement] Top passwords collection added to sourcecode checking
* [New feature] First-run verification to install dependencies if needed
* [New feature] Log file
* [New feature] Check for hostname on log file
* [New feature] Check if hostname is listed on Spamhaus Domain Blacklist
* [New feature] Run a quick dnsenumeration with common server names

Changelog to version 0.1.9:
------
* Abuse mail using lynx instead of ~~curl~~
* Target server name parsing fixed
* More verbose about HTTP codes and directory discovery
* MD5 collection for IP fixed
* Links found now show unique URLs from array
* [New feature] **Google** results
* [New feature] **Bing** IP check for other hosts/vhosts
* [New feature] Opened ports from **Shodan**
* [New feature] **VirusTotal** information about IP
* [New feature] **Alexa Rank** information about $TARGET_HOST

Requirements:
------
Tested on Kali light mini AND OSX 10.11.3 with brew
```
sudo apt-get install bc curl dnsutils libxml2-utils whois md5sha1sum lynx openssl -y
```

**Configuration file:**
```
CURL_TIMEOUT=15 #timeout in --connect-timeout
CURL_UA=Mozilla #user-agent (keep it simple)
INTERNAL=NO #YES OR NO (show internal network info)
URLVOID_KEY=your_API_key #using API from http://www.urlvoid.com/
FUZZ_LIMIT=10 #how many lines it will read from fuzz file
OPEN_TARGET_URLS=NO #open found URLs at the end of script
OPEN_EXTERNAL_LINKS=NO #open external links (frames) at the end of script
FIRST_TIME=YES #on first run, check for dependencies
```

Todo list:
------
* [x] Upload to github :)
* [x] Check for installed packages
* [ ] Integration with other APIs
* [ ] Export to CSV
* [ ] Integration with CipherScan

## Stargazers over time
[![Stargazers over time](https://starchart.cc/eschultze/URLextractor.svg)](https://starchart.cc/eschultze/URLextractor)
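
**Configuration example:**

The settings listed under **Configuration file:** use plain `NAME=value #comment` lines, which is valid shell syntax, so a script could read them by sourcing the file. The sketch below shows one way this might look. It is an assumption, not a description of how `extractor` itself loads its settings: the `load_config` and `curl_args` functions are hypothetical, and the defaults simply repeat the values shown in this README. The curl options used (`--silent`, `--connect-timeout`, `--user-agent`) are standard curl flags.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: load settings from a config file, with defaults.
load_config() {
  local config_file=$1
  # Defaults copied from the README's configuration block.
  CURL_TIMEOUT=15
  CURL_UA=Mozilla
  FUZZ_LIMIT=10
  # Source the file only if it exists and is readable; values in the
  # file override the defaults set above.
  if [ -r "$config_file" ]; then
    . "$config_file"
  fi
}

# Print the curl arguments implied by the current settings, one per line.
curl_args() {
  printf '%s\n' --silent --connect-timeout "$CURL_TIMEOUT" --user-agent "$CURL_UA"
}

# Example (assumes a file named "config" next to the script):
# load_config ./config
# curl $(curl_args) http://www.hackthissite.org/
```

Because the file is sourced, it is executed as shell code, so it should only ever come from a trusted location.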