https://github.com/the404hacking/urlextractor
Information Gathering & WebSite ReConnaissance.
https://github.com/the404hacking/urlextractor
bash information-gathering information-gathering-tools kali kali-linux linux reconnaissance scan scaner scanner script sh site the404hacking urlextractor web website
Last synced: 26 days ago
JSON representation
Information Gathering & WebSite ReConnaissance.
- Host: GitHub
- URL: https://github.com/the404hacking/urlextractor
- Owner: The404Hacking
- License: other
- Created: 2018-02-06T17:32:14.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2018-02-08T21:06:38.000Z (over 7 years ago)
- Last Synced: 2025-04-01T14:58:54.894Z (2 months ago)
- Topics: bash, information-gathering, information-gathering-tools, kali, kali-linux, linux, reconnaissance, scan, scaner, scanner, script, sh, site, the404hacking, urlextractor, web, website
- Language: Shell
- Size: 455 KB
- Stars: 21
- Watchers: 0
- Forks: 7
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# URLExtractor
Information gathering & website reconnaissance
## Screenshot
------
**Usage:**
`./extractor.sh http://www.target.org/`Features:
------* IP and hosting info like city and country (using [FreegeoIP](http://freegeoip.net/))
* DNS servers (using [dig](http://packages.ubuntu.com/precise/dnsutils))
* ASN, Network range, ISP name (using [RISwhois](https://www.ripe.net/analyse/archived-projects/ris-tools-web-interfaces/riswhois))
* Load balancer test
* Whois for abuse mail (using [Spamcop](https://www.spamcop.net/))
* PAC (Proxy Auto Configuration) file
* Compares hashes to diff code
* robots.txt (recursively looking for hidden stuff)
* Source code (looking for passwords and users)
* External links (frames from other websites)
* Directory FUZZ (like Dirbuster and Wfuzz - using [Dirbuster](https://www.owasp.org/index.php/Category:OWASP_DirBuster_Project)) directory list)
* [URLvoid](http://www.urlvoid.com/) API - checks Google page rank, Alexa rank and possible blacklists
* Provides useful links at other websites to correlate with IP/ASN
* Option to open ALL results in browser at the endChangelog to version 0.1.9:
------* Abuse mail using lynx istead of ~~curl~~
* Target server name parsing fixed
* More verbose about HTTP codes and directory discovery
* MD5 collection for IP fixed
* Links found now show unique URLs from array
* [New feature] **Google** results
* [New feature] **Bing** IP check for other hosts/vhosts
* [New feature] Opened ports from **Shodan**
* [New feature] **VirusTotal** information about IP
* [New feature] **Alexa Rank** information about $TARGET_HOSTRequirements:
------Tested on Kali light mini AND OSX 10.11.3 with brew
```
sudo apt-get install bc curl dnsutils libxml2-utils whois md5sha1sum lynx -y
```**Configuration file:**
```
CURL_TIMEOUT=15 #timeout in --connect-timeout
CURL_UA=Mozilla #user-agent (keep it simple)
INTERNAL=NO #YES OR NO (show internal network info)
URLVOID_KEY=your_API_key #using API from http://www.urlvoid.com/
FUZZ_LIMIT=10 #how many lines it will read from fuzz file
OPEN_TARGET_URLS=NO #open found URLs at the end of script
OPEN_EXTERNAL_LINKS=NO #open external links (frames) at the end of script
```Todo list:
------* [x] Upload to github :)
* [ ] Integration with other APIs
* [ ] Add host regex validation
* [ ] Use GNU parallel to fuzz URLs
* [ ] Export to CSV
* [ ] Possible migration to python
* [ ] Integration with JoomScan/WPScan/CMSmap
* [ ] Integration with CipherScan
* [ ] Check for installed packages## Download and Clone
> Download: Click [Here](https://github.com/The404Hacking/URLExtractor/archive/master.zip) (URLExtractor-master.zip)> Clone: git clone [https://github.com/The404Hacking/URLExtractor.git](https://github.com/The404Hacking/URLExtractor.git)
## The404Hacking | Digital Security ReSearch Group
[The404Hacking](https://T.me/The404Hacking)## Follow us !
[The404Hacking](https://T.me/The404Hacking) - [The404Cracking](https://T.me/The404Cracking)[Instagram](https://instagram.com/The404Hacking) - [GitHub](https://github.com/The404Hacking)
[YouTube](http://yon.ir/youtube404) - [Aparat](http://www.aparat.com/The404Hacking)
[Weblog](http://the404hacking.blogsky.com) - [Email](mailto:[email protected])