{"id":21343017,"url":"https://github.com/thelime1/validity","last_synced_at":"2026-01-18T23:03:49.247Z","repository":{"id":175078106,"uuid":"652329707","full_name":"TheLime1/Validity","owner":"TheLime1","description":"list of only alive proxies (IP:port) for testing \u0026 scraping — updated and validated every 12hrs","archived":false,"fork":false,"pushed_at":"2025-11-08T18:56:00.000Z","size":59797,"stargazers_count":16,"open_issues_count":1,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-11-08T20:27:41.172Z","etag":null,"topics":["online-proxy","proxy","proxy-checker","proxy-list"],"latest_commit_sha":null,"homepage":"https://raw.githubusercontent.com/TheLime1/Validity/refs/heads/main/data/http.txt","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/TheLime1.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2023-06-11T20:01:44.000Z","updated_at":"2025-11-08T18:56:07.000Z","dependencies_parsed_at":"2026-01-05T06:08:20.632Z","dependency_job_id":null,"html_url":"https://github.com/TheLime1/Validity","commit_stats":null,"previous_names":["thelime1/online-proxy-list","thelime1/validity"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/TheLime1/Validity","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TheLime1%2FValidity","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TheLime1%2FValidity/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TheLime1%2FValidity/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TheLime1%2FValidity/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/TheLime1","download_url":"https://codeload.github.com/TheLime1/Validity/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TheLime1%2FValidity/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28553055,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-18T20:59:07.572Z","status":"ssl_error","status_checked_at":"2026-01-18T20:59:02.799Z","response_time":98,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["online-proxy","proxy","proxy-checker","proxy-list"],"created_at":"2024-11-22T01:11:41.461Z","updated_at":"2026-01-18T23:03:49.240Z","avatar_url":"https://github.com/TheLime1.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Validity\n\n**Note: This repository was archived by the owner on Dec 22, 2023. It is now back alive!**\n\nValidity is now a dedicated proxy validator tool that checks and exports valid proxies from public sources.\n\n## Features\n\n- Validates HTTP and SOCKS5 proxies from multiple public sources\n- Outputs clean proxy lists in separate files\n- Automatic duplicate removal using efficient set-based deduplication\n- Regular validation checks with multi-threading for speed\n- Maintains maximum 1000 proxies per type\n- Daily script designed to keep only alive proxies\n\n## Usage\n\n### 🤖 Automated (Recommended)\n\nThe repository includes **two GitHub Actions** for different use cases:\n\n#### **🕐 Full Validation (Every 12 Hours)**\n- **Schedule**: 6:00 AM and 6:00 PM UTC (automatic)\n- **Duration**: 30 minutes per run with automatic shutdown\n- **Scope**: Complete validation of all sources and proxy types\n- **Auto-commit**: Results automatically committed to repository\n\n#### **⚡ Quick Test (Manual)**\n- **Trigger**: Manual only (Actions tab → \"Quick Proxy Test\")\n- **Duration**: 3 minutes (customizable: 1-10 minutes)\n- **Scope**: Limited validation for immediate results\n- **Use case**: Quick proxy refresh, testing, or immediate needs\n\n**Manual Trigger Options:**\n\n1. **Quick Test (3 minutes):**\n   - Go to **Actions** tab → **\"Quick Proxy Test (3 minutes)\"**\n   - Click **\"Run workflow\"**\n   - Optionally customize duration (1-10 minutes)\n   - Choose proxy types: HTTP, SOCKS5, or both\n\n2. **Full Validation:**\n   - Go to **Actions** tab → **\"Automated Proxy Validation\"**\n   - Click **\"Run workflow\"**\n   - Optionally customize duration (default: 30 minutes)\n\n### 🔧 Manual Setup\n\n1. Install dependencies:\n\n```bash\npip install -r requirements.txt\n```\n\n2. Generate random headers (optional but recommended):\n\n```bash\npython generate_headers.py\n```\n\n3. Run the proxy scraper:\n\n```bash\npython proxy_scraper.py\n```\n\n**⚠️ Important:** The script **ALWAYS** performs `git pull` at startup to ensure you have the latest changes before running.\n\n**Available Options:**\n\n| Parameter       | Type | Default | Description                                                            |\n| --------------- | ---- | ------- | ---------------------------------------------------------------------- |\n| `--push`        | flag | False   | Automatically git add and push changes when program finishes or Ctrl+C |\n| `--timeout`     | int  | 3       | Timeout in seconds for proxy validation                                |\n| `--max-workers` | int  | auto    | Maximum number of worker threads (auto-calculated based on CPU)        |\n| `--batch-size`  | int  | 50      | Number of proxies to take from each source per batch                   |\n\n**Examples:**\n\n```bash\n# Basic usage\npython proxy_scraper.py\n\n# With automatic git push\npython proxy_scraper.py --push\n\n# Custom settings with git push\npython proxy_scraper.py --push --timeout 5 --max-workers 100 --batch-size 25\n```\n\n4. Analyze source quality (after running scraper):\n\n```bash\npython analyze_proxy_quality.py --days 7 --save --performance\n```\n\nThe scraper will:\n\n- **🔄 Pull latest changes** from remote repository (mandatory first step)\n- **Load dead proxies database** and clean entries older than 30 days\n- **Validate existing proxies** in data folder first\n- **Remove dead proxies** from data files and add them to dead_proxies.txt\n- **Fetch new proxies** from sources concurrently\n- **Skip proxies** already in dead_proxies database\n- **Validate new proxies** using random headers\n- **Log detailed validation** results for quality analysis\n- **Save up to 1000 alive proxies** per type with periodic auto-save\n\n## Quality Analysis\n\nThe quality analyzer (`analyze_proxy_quality.py`) provides detailed statistics about each proxy source with comprehensive analysis options.\n\n### Parameters \u0026 Usage\n\n#### Basic Usage\n```bash\npython analyze_proxy_quality.py\n```\n\n#### All Available Parameters\n\n| Parameter         | Type   | Default                         | Description                                        |\n| ----------------- | ------ | ------------------------------- | -------------------------------------------------- |\n| `--days`          | int    | 7                               | Number of days to analyze (1-365)                  |\n| `--save`          | flag   | False                           | Save quality report to CSV file                    |\n| `--performance`   | flag   | False                           | Show top 10 fastest performing proxies             |\n| `--worst-sources` | flag   | False                           | Show detailed worst sources analysis by proxy type |\n| `--log-file`      | string | `data/proxy_validation_log.csv` | Path to proxy validation log file                  |\n\n#### Detailed Examples\n\n**1. Basic Quality Report (Last 7 Days)**\n```bash\npython analyze_proxy_quality.py\n```\nShows source rankings, alive/dead percentages, response times, and worst sources by type.\n\n**2. Extended Analysis Period**\n```bash\npython analyze_proxy_quality.py --days 30\n```\nAnalyze proxy performance over the last 30 days for trend analysis.\n\n**3. Quick Daily Check**\n```bash\npython analyze_proxy_quality.py --days 1\n```\nView today's proxy validation results only.\n\n**4. Performance Analysis**\n```bash\npython analyze_proxy_quality.py --performance\n```\nShows the 10 fastest responding proxies with their response times and sources.\n\n**5. Worst Sources Analysis**\n```bash\npython analyze_proxy_quality.py --worst-sources\n```\nDetailed analysis of the 5 worst performing sources for each proxy type (HTTP, SOCKS4, SOCKS5) with performance warnings:\n- 🚨 **CRITICAL**: \u003c10% success rate (remove immediately)  \n- ⚠️ **WARNING**: \u003c20% success rate (consider replacement)\n\n**6. Comprehensive Analysis with Export**\n```bash\npython analyze_proxy_quality.py --days 14 --save --performance --worst-sources\n```\nComplete analysis with:\n- 14-day data analysis\n- CSV export to `data/source_quality_report.csv`\n- Top performing proxies list\n- Detailed worst sources breakdown\n\n**7. Custom Log File Analysis**\n```bash\npython analyze_proxy_quality.py --log-file \"custom/path/logs.csv\" --days 7\n```\nAnalyze a different validation log file.\n\n### Output Features\n\n#### Main Quality Report Includes:\n- **Source Rankings**: Sorted by quality score (alive percentage)\n- **Detailed Metrics**: Total tested, alive/dead counts, response times\n- **Proxy Types**: Which types each source provides\n- **Overall Statistics**: Aggregate performance across all sources\n- **Worst Sources by Type**: Bottom 5 sources for each proxy type\n- **Smart Recommendations**: Data-driven suggestions for source management\n\n#### CSV Export Format\nWhen using `--save`, generates `data/source_quality_report.csv` with:\n```csv\nsource_url,total_tested,alive_count,dead_count,alive_percent,quality_score,analysis_date\n```\n\n#### Performance Analysis Shows:\n```\n🚀 TOP PERFORMING PROXIES (Last 7 days):\n#1  192.168.1.100:8080  |  245ms | http   | https://source1.com\n#2  10.0.0.50:1080      |  289ms | socks5 | https://source2.com\n```\n\n#### Worst Sources Analysis Example:\n```\n📍 HTTP PROXIES - Bottom 5 Sources:\n#1 https://bad-source.com\n   📊 Total Tested: 1,000\n   ✅ Alive: 12\n   💯 Success Rate: 1.2%\n   🚨 CRITICAL: Consider removing this source immediately\n```\n\n### Use Cases\n\n**Daily Monitoring**\n```bash\npython analyze_proxy_quality.py --days 1 --performance\n```\n\n**Weekly Review**\n```bash\npython analyze_proxy_quality.py --save --worst-sources\n```\n\n**Monthly Source Audit**\n```bash\npython analyze_proxy_quality.py --days 30 --save --performance --worst-sources\n```\n\n**Source Quality Investigation**\n```bash\npython analyze_proxy_quality.py --days 7 --worst-sources\n```\n\nView analysis for different time periods:\n```bash\npython analyze_proxy_quality.py --days 1   # Last 24 hours\npython analyze_proxy_quality.py --days 30  # Last month\n```\n\n## Output\n\nThe validated proxies are saved in the `data/` folder:\n\n- `data/http.txt` - Valid HTTP proxies\n- `data/socks5.txt` - Valid SOCKS5 proxies\n\n## Validation Logs\n\nProxy validation logs are stored in CSV format in the `data/` folder:\n\n- `data/proxy_validation_log.csv` - Main validation log\n- `data/proxy_validation_log_N.csv` - Rotated log files (when main file exceeds 95MB)\n\n**Log Rotation \u0026 Cleanup:**\n- CSV files automatically rotate when reaching 95MB to stay under GitHub's 100MB limit\n- Log entries older than 30 days are automatically removed during startup\n- The analyzer automatically reads from all rotated log files\n\n## Sources\n\nAll proxy sources are publicly available and listed in `sources.csv` for transparency.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthelime1%2Fvalidity","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fthelime1%2Fvalidity","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthelime1%2Fvalidity/lists"}