https://github.com/cantcode023/dorktuah
Dork across search engines.
- Host: GitHub
- URL: https://github.com/cantcode023/dorktuah
- Owner: CantCode023
- License: mit
- Created: 2025-01-07T16:59:02.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2025-01-14T07:54:39.000Z (over 1 year ago)
- Last Synced: 2025-08-08T17:51:08.267Z (11 months ago)
- Topics: bing, dorking, google, googledorks, hawktuah, open-source, osint, osint-python, python
- Language: Python
- Homepage:
- Size: 1.83 MB
- Stars: 3
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md

# About
Dorktuah is a powerful Python tool designed for advanced Google dorking and web scraping through proxy rotation. It leverages multiple search engines while maintaining anonymity through an extensive proxy system. The project aims to provide researchers, security professionals, and developers with a reliable tool for gathering information while avoiding rate limiting and IP blocks.
Key features:
- Automated proxy rotation system
- Support for multiple proxy types (HTTP, SOCKS4, SOCKS5)
- Built-in proxy scraper with 100+ sources
- Proxy health checking and validation
- Clean and structured search results
- Rate limit avoidance through proxy rotation
- Custom proxy list support
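As a rough illustration of how proxy rotation with a custom proxy list might work, here is a minimal sketch. It is not the project's actual implementation: the names `load_proxies` and `RotatingPool`, and the file format of one `scheme://host:port` entry per line, are assumptions made for this example.

```python
from itertools import cycle
from pathlib import Path
from typing import Dict, List, Literal

ProxyType = Literal["socks4", "socks5", "http", "all"]


def load_proxies(path: str, proxy_type: ProxyType = "all") -> List[str]:
    """Read proxy URLs from a text file, optionally keeping a single scheme.

    Assumed format (not confirmed by the project): one
    ``scheme://host:port`` entry per line; blank lines are skipped.
    """
    proxies = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue
        # Compare schemes case-insensitively, so "SOCKS5://" matches "socks5".
        scheme = line.split("://", 1)[0].lower()
        if proxy_type == "all" or scheme == proxy_type:
            proxies.append(line)
    return proxies


class RotatingPool:
    """Hypothetical pool that hands out proxies in round-robin order."""

    def __init__(self, proxies: List[str]) -> None:
        if not proxies:
            raise ValueError("proxy list is empty")
        self._cycle = cycle(proxies)

    def next_mapping(self) -> Dict[str, str]:
        """Return the next proxy as a mapping suitable for ``proxies=`` in requests."""
        url = next(self._cycle)
        return {"http": url, "https": url}
```

Each request would then pass `proxies=pool.next_mapping()` to `requests.get`, so consecutive requests leave through different proxies. Note that `requests` only handles `socks4://` and `socks5://` URLs when it is installed with the `requests[socks]` extra.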
The name "Dorktuah" combines "dork" (as in Google dorking) with "tuah" (luck or fortune in Malay), signifying a fortunate, successful dorking tool. It is also a nod to "hawktuah", because this project is hawktuah.
> [!IMPORTANT]
> Remember to use this tool responsibly and in accordance with the target website's terms of service and applicable laws.
---
# Installation
To use Dorktuah, follow these steps:
1. Clone the repository:
```bash
git clone https://github.com/CantCode023/dorktuah.git
```
2. Install required dependencies:
```bash
pip install -r dorktuah/requirements.txt
```
3. Run the CLI and you're done!
```bash
python dorktuah
```

---
# TODOLIST:
- [x] set up project structure
- [x] write proxy rotation implementation
  - proxy_pool as function
  - `requests.get(proxies=ProxyPool())` for easier rotation
  - `ProxyPool()`
    - `type: Literal["socks4", "socks5", "http", "all"] = "all"`
    - get proxies from proxies.txt
- [x] write a proxy checker to make sure the returned proxy is alive
- [x] convert ProxyPool to a class so arguments can be initialized once and reused across methods
- [x] write engine implementation
  - use BeautifulSoup
  - `Engine()`
    - proxy_pool implementation
- [x] write etools scraping implementation
- [x] do pagination to retrieve **every** result
  - load_more_results, pagination, get_source, search
    - combine those 4
    - step 1 (search): open etools and search for the query
  - WE ALSO WANT TO GIVE THE USER THE ABILITY TO GET THE NEXT RESULTS
    - `def has_more_results()`
    - if `has_more_results()`, show "press Enter to go next" in the CLI
    - if there are no more results, don't show it
    - if Enter is pressed, call `load_more_results()`
    - get the source and return
- [x] implement proxy pool in engine.py
- [x] make it into a CLI using colorama and rich _maybe_?
  - [x] make header "dorktuah"
  - [x] make subheader "Dork across search engines."
  - [x] put credentials (author, github, discord)
  - ~~make textbox to ask for query using rich _maybe_~~
- [x] add config in CLI (write config.json file)
  - [x] add proxy support
    - [x] enable proxy (y/n)
    - [x] use custom proxy (y/n)
    - [x] proxy type (socks4/socks5/http/all)
    - [x] proxy path
      - [x] add path checking to check if it exists
  - [x] source_limit (1-100)
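
The pagination flow described in the list above (run a search, then keep offering the next page while `has_more_results()` is true) could look roughly like the sketch below. Only the method names `search`, `has_more_results` and `load_more_results` come from the notes; `FakeEngine` and `browse` are hypothetical stand-ins, and the real engine scrapes pages rather than serving canned lists.

```python
from typing import Callable, List


class FakeEngine:
    """Stand-in for the real engine; serves canned pages instead of scraping."""

    def __init__(self, pages: List[List[str]]) -> None:
        self._pages = pages
        self._index = 0

    def search(self, query: str) -> List[str]:
        """Start a new search and return the first page of results."""
        self._index = 0
        return self._pages[0] if self._pages else []

    def has_more_results(self) -> bool:
        """True while at least one more page can be loaded."""
        return self._index + 1 < len(self._pages)

    def load_more_results(self) -> List[str]:
        """Advance to the next page and return its results."""
        self._index += 1
        return self._pages[self._index]


def browse(
    engine: FakeEngine,
    query: str,
    show: Callable[[str], object] = print,
    wait: Callable[[str], object] = input,
) -> int:
    """Show results page by page and return how many were shown.

    The Enter prompt appears only while the engine reports more results.
    """
    results = engine.search(query)
    shown = 0
    while True:
        for url in results:
            show(url)
        shown += len(results)
        if not engine.has_more_results():
            break
        wait("Press Enter to load more results...")
        results = engine.load_more_results()
    return shown
```

Passing `show` and `wait` as parameters keeps the loop testable without a terminal; in a CLI the defaults `print` and `input` would be used.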
---
# FUTURE TODOLIST:
- [x] add proxy scraping and proxy checking to ProxyPool to get the newest proxies
- [x] make proxy checker faster
  - [x] do proxy checking with asynchronous functions
- [x] refactor to follow the SOLID principles
- [ ] fix error

      Exception in callback _ProactorBasePipeTransport._call_connection_lost(None)
      handle:
      Traceback (most recent call last):
        File "C:\Users\cantc\AppData\Local\Programs\Python\Python312\Lib\asyncio\events.py", line 88, in _run
          self._context.run(self._callback, *self._args)
        File "C:\Users\cantc\AppData\Local\Programs\Python\Python312\Lib\asyncio\proactor_events.py", line 165, in _call_connection_lost
          self._sock.shutdown(socket.SHUT_RDWR)
      ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host
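
A commonly used workaround for this kind of traceback is to run asyncio on the selector event loop instead of the proactor loop on Windows, since the failing call lives in the proactor transport. The sketch below shows that switch together with a small connection probe that closes its socket cleanly. It is untested against Dorktuah itself, the helper names `prefer_selector_loop_on_windows` and `is_port_open` are hypothetical, and it assumes the project does not need asyncio subprocess support, which the selector loop lacks on Windows.

```python
import asyncio
import sys


def prefer_selector_loop_on_windows() -> bool:
    """Switch asyncio to the selector event loop on Windows.

    Returns True if the policy was changed, False on other platforms.
    Call this before ``asyncio.run``.
    """
    if sys.platform == "win32":
        # Available on the Python 3.12 shown in the traceback; the selector
        # loop does not go through proactor_events.py at all.
        asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
        return True
    return False


async def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Hypothetical liveness probe: True if a TCP connection can be opened."""
    try:
        _reader, writer = await asyncio.wait_for(
            asyncio.open_connection(host, port), timeout
        )
    except (OSError, asyncio.TimeoutError):
        # Refused, reset, unreachable, or timed out: treat as not alive.
        return False
    writer.close()
    try:
        # Wait for the transport to finish closing before the loop shuts down.
        await writer.wait_closed()
    except OSError:
        # A reset by the remote host during close is harmless here.
        pass
    return True
```

A script would call `prefer_selector_loop_on_windows()` once at startup and then run its checks with `asyncio.run(...)` as before.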