https://github.com/mogelpeter/proxy-scraper
This script is designed to download and verify HTTP/s and SOCKS5 proxies from public databases and files.
http http-proxy https-proxy proxies proxies-scraper proxy proxy-checker proxy-scraper python python3 socks5 socks5-proxy
Last synced: 29 days ago
- Host: GitHub
- URL: https://github.com/mogelpeter/proxy-scraper
- Owner: mogelpeter
- Created: 2024-07-12T11:02:30.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2024-08-01T23:31:53.000Z (5 months ago)
- Last Synced: 2024-08-02T01:03:50.098Z (5 months ago)
- Topics: http, http-proxy, https-proxy, proxies, proxies-scraper, proxy, proxy-checker, proxy-scraper, python, python3, socks5, socks5-proxy
- Language: Python
- Homepage:
- Size: 45.9 KB
- Stars: 4
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Proxy Scraper and Checker
![Stable](https://img.shields.io/badge/status-stable-brightgreen)
![Discord](https://dcbadge.limes.pink/api/shield/741265873779818566?compact=true)
## Script Description
This script is designed to download and verify HTTP/s and SOCKS5 proxies from public databases and files. It offers the following key features:
- **Configurable Threading**: Adjust the number of threads based on your system's capability using a `usage_level` setting from 1 to 3.
- **Scraping Proxies**: Automatically scrape HTTP/s and SOCKS5 proxies from various online sources.
- **Checking Proxies**: Validate the functionality of the scraped proxies to ensure they are operational.
- **System Monitoring**: Display the script's CPU and RAM usage in the console title for real-time performance monitoring.
### Usage
1. **Installation**:
- Clone the repository or download the .zip file.
- Navigate to the project directory.
2. **Running the Script**:
- Execute the script using:
```bash
start.bat
```
or
```bash
python main.py
```
3. **Configuration**:
- The script uses a `config.json` file to manage settings.
- Adjust the `usage_level`, and specify the list of URLs for HTTP/s and SOCKS5 proxies.
4. **Educational & Research Purposes Only**:
- This script is intended for educational and research purposes only. Use it responsibly and in accordance with applicable laws.
### Requirements
- Python 3.8+
- All necessary packages are automatically installed when the script is run.
### Example `config.json`
```json
{
  "usage_level": 2,
  "http_links": [
    "https://api.proxyscrape.com/?request=getproxies&proxytype=https&timeout=10000&country=all&ssl=all&anonymity=all",
    "https://api.proxyscrape.com/v2/?request=getproxies&protocol=http&timeout=10000&country=all&ssl=all&anonymity=all"
  ],
  "socks5_links": [
    "https://raw.githubusercontent.com/B4RC0DE-TM/proxy-list/main/SOCKS5.txt",
    "https://raw.githubusercontent.com/saschazesiger/Free-Proxies/master/proxies/socks5.txt"
  ]
}
```
By following this documentation, you should be able to set up, run, and understand the Proxy Scraper and Checker script with ease.
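The example `config.json` above could be consumed along the following lines. This is only a sketch: the helper names `parse_config`, `load_config`, and `thread_count`, and the thread counts in `THREADS_PER_LEVEL`, are assumptions for illustration and are not taken from the script.

```python
import json

# Hypothetical mapping from usage_level to a thread count.
# The real script's values are not documented here.
THREADS_PER_LEVEL = {1: 50, 2: 100, 3: 200}


def parse_config(text):
    """Parse config JSON text and validate the three documented keys."""
    config = json.loads(text)
    level = config.get("usage_level")
    if not isinstance(level, int) or level not in THREADS_PER_LEVEL:
        raise ValueError("usage_level must be 1, 2, or 3")
    for key in ("http_links", "socks5_links"):
        if not isinstance(config.get(key), list):
            raise ValueError(f"{key} must be a list of URLs")
    return config


def load_config(path="config.json"):
    """Read and validate the config file at the given path."""
    with open(path, encoding="utf-8") as handle:
        return parse_config(handle.read())


def thread_count(config):
    """Translate the validated usage_level into a worker-thread count."""
    return THREADS_PER_LEVEL[config["usage_level"]]
```

Validating the file up front means a typo in `usage_level` fails immediately rather than midway through a scraping run.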
## Important Information!
For educational & research purposes only!
## Detailed Documentation
### Functions
#### `generate_random_folder_name(length=32)`
Generates a random folder name with the specified length.
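One plausible way to write this helper is shown below. The choice of ASCII letters and digits as the alphabet is an assumption; the script may draw from a different character set.

```python
import random
import string


def generate_random_folder_name(length=32):
    """Return a random name of the requested length.

    Assumption: the name is built from ASCII letters and digits.
    """
    alphabet = string.ascii_letters + string.digits
    return "".join(random.choices(alphabet, k=length))
```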
#### `remove_old_folders(base_folder=".")`
Removes old folders with 32-character names from the base folder.
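A minimal sketch of this cleanup, assuming that "old" folders are identified purely by having a 32-character alphanumeric name. The alphanumeric check and the returned list of removed names are additions for illustration.

```python
import os
import shutil


def remove_old_folders(base_folder="."):
    """Delete sub-folders of base_folder whose names are 32 characters long.

    Assumption: only alphanumeric names count, which matches the random
    folder names sketched elsewhere, not necessarily the real script.
    """
    removed = []
    for entry in os.listdir(base_folder):
        path = os.path.join(base_folder, entry)
        if os.path.isdir(path) and len(entry) == 32 and entry.isalnum():
            shutil.rmtree(path)
            removed.append(entry)
    return removed
```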
#### `get_time_rn()`
Returns the current time formatted as HH:MM:SS.
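With the standard library this can be a one-liner; the sketch below assumes local time and a 24-hour clock.

```python
from datetime import datetime


def get_time_rn():
    """Return the current local time as HH:MM:SS (24-hour clock assumed)."""
    return datetime.now().strftime("%H:%M:%S")
```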
#### `get_usage_level_str(level)`
Converts the usage level integer to a string representation.
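A sketch of such a conversion follows. The labels "Low", "Medium", and "High", and the "Unknown" fallback, are assumed for illustration; the script's actual strings are not documented here.

```python
def get_usage_level_str(level):
    """Map a usage level (1 to 3) to a display label.

    The label text is hypothetical.
    """
    labels = {1: "Low", 2: "Medium", 3: "High"}
    return labels.get(level, "Unknown")
```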
#### `update_title(http_selected, socks5_selected, usage_level)`
Updates the console title with current CPU, RAM usage, and validation counts.
#### `center_text(text, width)`
Centers the text within the given width.
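For multi-line ASCII art, centering is usually applied line by line. The sketch below assumes that behavior and relies on `str.center`, which pads with spaces on both sides.

```python
def center_text(text, width):
    """Center each line of text within the given width."""
    return "\n".join(line.center(width) for line in text.splitlines())
```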
#### `ui()`
Clears the console and displays the main UI with ASCII art.
#### `scrape_proxy_links(link, proxy_type)`
Scrapes proxies from the given link, retrying up to 3 times on failure.
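The retry behavior might look like the sketch below. It is not the script's code: `fetch_text`, `PROXY_PATTERN`, and the `fetch`, `retries`, and `delay` parameters are assumptions, and the standard library's `urllib` stands in for whatever HTTP client the script actually uses. `proxy_type` is kept only to match the documented signature.

```python
import re
import time
import urllib.request

# Assumed format: IPv4 address, a colon, then a port.
PROXY_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}:\d{1,5}\b")


def fetch_text(link, timeout=10):
    """Download a proxy list and return it as text (hypothetical helper)."""
    with urllib.request.urlopen(link, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def scrape_proxy_links(link, proxy_type, fetch=fetch_text, retries=3, delay=1.0):
    """Return the host:port entries found at link, or [] after all retries fail."""
    for attempt in range(retries):
        try:
            return PROXY_PATTERN.findall(fetch(link))
        except OSError:  # urllib.error.URLError is a subclass of OSError
            if attempt < retries - 1:
                time.sleep(delay)
    return []
```

Passing the fetcher as a parameter keeps the retry logic testable without network access.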
#### `check_proxy_link(link)`
Checks if a proxy link is accessible.
#### `clean_proxy_links()`
Cleans the proxy links by removing non-accessible ones.
#### `scrape_proxies(proxy_list, proxy_type, file_name)`
Scrapes proxies from the provided list of links and saves them to a file.
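A sketch of the collect-and-save step is shown below. The extra `scrape` parameter is a hypothetical callable added so the example is self-contained, and removing duplicates before writing is an assumption rather than documented behavior.

```python
def scrape_proxies(proxy_list, proxy_type, file_name, scrape):
    """Collect proxies from every link and write the unique ones to file_name.

    `scrape` is a hypothetical callable (link, proxy_type) -> list of
    "host:port" strings; the real script calls its own scraper instead.
    """
    collected = []
    for link in proxy_list:
        collected.extend(scrape(link, proxy_type))
    # dict.fromkeys removes duplicates while preserving first-seen order.
    unique = list(dict.fromkeys(p.strip() for p in collected if p.strip()))
    with open(file_name, "w", encoding="utf-8") as handle:
        handle.write("\n".join(unique))
    return unique
```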
#### `check_proxy_http(proxy)`
Checks the validity of an HTTP/s proxy by making a request to httpbin.org.
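A check of this kind can be sketched with the standard library alone. The endpoint `http://httpbin.org/ip`, the timeout, and the use of `urllib` are assumptions; the script may use a different client or URL on httpbin.org.

```python
import http.client
import urllib.request


def check_proxy_http(proxy, url="http://httpbin.org/ip", timeout=5):
    """Return True if a request routed through proxy ("host:port") succeeds."""
    handler = urllib.request.ProxyHandler(
        {"http": f"http://{proxy}", "https": f"http://{proxy}"}
    )
    opener = urllib.request.build_opener(handler)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException, ValueError):
        # Refused connections, timeouts, and malformed replies all count as dead.
        return False
```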
#### `check_proxy_socks5(proxy)`
Checks the validity of a SOCKS5 proxy by connecting to google.com.
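To show what such a check involves at the protocol level, the sketch below performs a bare SOCKS5 handshake (RFC 1928) with raw sockets and asks the proxy to connect to google.com. This is an illustration only: the script may well rely on a SOCKS library instead, and the target port, timeout, and `_recv_exact` helper are assumptions.

```python
import socket
import struct


def _recv_exact(sock, count):
    """Read exactly count bytes, or fewer if the peer closes early."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            break
        data += chunk
    return data


def check_proxy_socks5(proxy, target=("google.com", 80), timeout=5):
    """Return True if the SOCKS5 proxy ("host:port") accepts a CONNECT to target."""
    host, _, port = proxy.rpartition(":")
    try:
        with socket.create_connection((host, int(port)), timeout=timeout) as sock:
            # Greeting: version 5, one auth method offered, 0x00 = no authentication.
            sock.sendall(b"\x05\x01\x00")
            if _recv_exact(sock, 2) != b"\x05\x00":
                return False
            # CONNECT request with a domain-name address (ATYP 0x03).
            name = target[0].encode("idna")
            request = (b"\x05\x01\x00\x03" + bytes([len(name)]) + name
                       + struct.pack(">H", target[1]))
            sock.sendall(request)
            # Reply starts with VER and REP; REP 0x00 means the connect succeeded.
            return _recv_exact(sock, 2) == b"\x05\x00"
    except (OSError, ValueError):
        return False
```

Only proxies that allow unauthenticated access pass this check, which is the usual case for public proxy lists.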
#### `check_http_proxies(proxies)`
Checks a list of HTTP/s proxies for validity.
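Checking many proxies is I/O-bound, so a thread pool is a natural fit. The sketch below is a generic version under stated assumptions: the name `check_proxies`, the injected `check` callable, and the `max_workers` default are illustrative, with the worker count presumably derived from `usage_level` in the real script.

```python
from concurrent.futures import ThreadPoolExecutor


def check_proxies(proxies, check, max_workers=50):
    """Run check(proxy) concurrently and return the proxies that passed.

    `check` is any callable returning True for a working proxy.
    """
    proxies = list(proxies)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with proxies.
        results = list(pool.map(check, proxies))
    return [proxy for proxy, ok in zip(proxies, results) if ok]
```

The same helper would serve for SOCKS5 proxies by passing a different `check` callable.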
#### `check_socks5_proxies(proxies)`
Checks a list of SOCKS5 proxies for validity.
#### `signal_handler(sig, frame)`
Handles SIGINT signal (Ctrl+C) to exit gracefully.
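A handler of this shape is typically registered once at startup. In the sketch below the printed message and the `install_signal_handler` wrapper are assumptions; only the `(sig, frame)` signature comes from the documentation above.

```python
import signal
import sys


def signal_handler(sig, frame):
    """Exit cleanly when the user presses Ctrl+C."""
    print("\nInterrupted, exiting.")
    sys.exit(0)


def install_signal_handler():
    """Register the handler for SIGINT; must be called from the main thread."""
    signal.signal(signal.SIGINT, signal_handler)
```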
#### `set_process_priority()`
Sets the process priority to high for better performance.
#### `loading_animation()`
Displays a loading animation while verifying proxy links.
#### `clear_console()`
Clears the console screen.
#### `continuously_update_title()`
Continuously updates the console title with current status.