{"id":47603835,"url":"https://github.com/suundumused/proxy-scraping","last_synced_at":"2026-04-01T19:01:06.306Z","repository":{"id":220904841,"uuid":"752891538","full_name":"Suundumused/proxy-scraping","owner":"Suundumused","description":"Project to receive, validate and store a list of free proxies.","archived":false,"fork":false,"pushed_at":"2025-12-22T03:31:46.000Z","size":31,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-12-22T08:00:54.928Z","etag":null,"topics":["annonymous","anonymity","anonymization","anonymizer","anonymous-proxy","ip","proxy","proxy-checker","proxy-configuration","proxy-list","proxy-pattern","proxy-rotation","proxy-scraper","proxy-server","proxychains","proxypool","python","python-script","python3"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Suundumused.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-02-05T03:30:10.000Z","updated_at":"2025-12-22T03:31:50.000Z","dependencies_parsed_at":"2024-03-03T15:26:02.859Z","dependency_job_id":"735c3193-3dec-4d0b-96b1-e041c75b0fea","html_url":"https://github.com/Suundumused/proxy-scraping","commit_stats":null,"previous_names":["suundumused/proxy-scrape"],"tags_count":0,"template":true,"template_full_name":null,"purl":"pkg:github/Suundumused/proxy-scraping","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/r
epositories/Suundumused%2Fproxy-scraping","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Suundumused%2Fproxy-scraping/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Suundumused%2Fproxy-scraping/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Suundumused%2Fproxy-scraping/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Suundumused","download_url":"https://codeload.github.com/Suundumused/proxy-scraping/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Suundumused%2Fproxy-scraping/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31291005,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-01T13:12:26.723Z","status":"ssl_error","status_checked_at":"2026-04-01T13:12:25.102Z","response_time":53,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["annonymous","anonymity","anonymization","anonymizer","anonymous-proxy","ip","proxy","proxy-checker","proxy-configuration","proxy-list","proxy-pattern","proxy-rotation","proxy-scraper","proxy-server","proxychains","proxypool","python","python-script","python3"],"created_at":"2026-04-01T19:00:34.770Z","updated_at":"2026-04-01T19:01:06.253Z","avatar_url":"https://github.com/Suundumused.png","language":"Python","readme":"# Proxy 
Scraping\n**Project to receive, test, validate and store a list of free proxies.**\n\n## Installation\n    pip install -r ./requirements.txt\n\n## Requirements\n - argparse\n - urllib3\n - requests[socks]\n\n## Usage\nIn `proxy_validator.py`, switch between the providers available in `\\ip_checker_provider_modules`. These modules test the proxy and return the new masked IP address. Each provider script always returns its result as a string, in the format the program expects.\n\n    from ip_checker_provider_modules.ipify import get_public_ip\n\n----\n\nIn `\\proxy_list_api_modules` you can add or edit the scripts for the free proxy list provider schemas. The configuration for each provider module is in the `schemas\\proxy_providers_config` folder. Every module must return a list of dictionaries in the same format, compatible with the rest of the program. For example:\n\n    {\n        \"data\": [\n            {\n                \"_id\": \"xxxx\",\n                \"ip\": \"xxx.xxx.xxx.xxx\",\n                \"city\": \"Busan\",\n                \"country\": \"KR\",\n                \"lastChecked\": 1766169816,\n                \"latency\": 219.011,\n                \"port\": \"9400\",\n                \"protocols\": [\n                    \"socks4\"\n                ]\n            },\n            {\n                \"_id\": \"xxxx\",\n                \"ip\": \"xxx.xxx.xxx.xxx\",\n                \"city\": \"Khon Kaen\",\n                \"country\": \"TH\",\n                \"lastChecked\": 1766169816,\n                \"latency\": 236.013,\n                \"port\": \"8080\",\n                \"protocols\": [\n                    \"socks4\"\n                ]\n            },\n            ...\n        ]\n    }\n\nThe instance initially receives the following arguments:\n\n - `-c` Path to the certificate file, e.g. `certificate.pem`. 
This can also be set to 'True' or 'False' to use a generic certificate or to disable certificate checking.\n - `-t` Time interval for testing each proxy server.\n\n## Overall Arguments\n - `-a` Name of the API provider for the free proxy list. This must be one of the options available in `\\proxy_list_api_modules`.\n - `-i` API used to obtain the public IP. It must be one of the options available in `\\ip_checker_provider_modules`.\n - `-l` Limit of tested and valid proxies per protocol.\n - `-o` Output folder that will contain the JSON file with the tested proxy list.\n\n## Some Functions\n\n    retrieve_free_proxy_list(args.link, protocol)\n    \n - Retrieves the list of proxy servers from the API URL for all selected protocols, as JSON.\n\n---\n    write_valid_list(content, protocol, args.output_folder, args.limit)\n\n - Tests and validates each proxy (via `test_servers(...)`) and saves the ip:port and protocol to a JSON file.\n ---\n\n    test_servers(protocol, row, self.sess, self.certificate, self.old_ip)\n     \n - Individual function that tests the connection to the server and validates IP filtering.\n\n## JSON Structure\n\n    {\n        \"protocolsCount\": {\n            \"socks5\": 1,\n            \"socks4\": 1\n        },\n        \"proxies\": [\n            {\n                \"ip\": \"xxx.xxx.xxx.xxx\",\n                \"port\": \"20000\",\n                \"country\": \"RU\",\n                \"latency\": 44.981,\n                \"protocol\": \"socks5\"\n            },\n            {\n                \"ip\": \"xxx.xxx.xxx.xxx\",\n                \"port\": \"60111\",\n                \"country\": \"FR\",\n                \"latency\": 9.506,\n                \"protocol\": \"socks4\"\n            }\n        ]\n    }\n\n\n## Custom arg Classes\n\n    str_bool_switcher_type(arg)\n\n - Used by the `--certificate` (`-c`) argument; dynamically switches between string and bool.\n - str: The value is the path to the request certificate folder.\n - bool, True: Integrated certificate.\n - bool, 
False: No check.","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsuundumused%2Fproxy-scraping","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsuundumused%2Fproxy-scraping","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsuundumused%2Fproxy-scraping/lists"}