{"id":21810447,"url":"https://github.com/suundumused/proxy-scrape","last_synced_at":"2025-03-21T08:45:35.323Z","repository":{"id":220904841,"uuid":"752891538","full_name":"Suundumused/proxy-scrape","owner":"Suundumused","description":"Project to receive, validate and store a list of free proxies.","archived":false,"fork":false,"pushed_at":"2024-04-26T00:01:44.000Z","size":12,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-26T05:25:00.673Z","etag":null,"topics":["annonymous","anonymity","anonymization","anonymizer","anonymous-proxy","proxy","proxy-checker","proxy-list","proxy-pattern","proxy-scraper","proxypool","python","python-script","python3"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"cc0-1.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Suundumused.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2024-02-05T03:30:10.000Z","updated_at":"2024-04-26T00:01:47.000Z","dependencies_parsed_at":"2024-03-03T15:26:02.859Z","dependency_job_id":"735c3193-3dec-4d0b-96b1-e041c75b0fea","html_url":"https://github.com/Suundumused/proxy-scrape","commit_stats":null,"previous_names":["suundumused/proxy-scrape"],"tags_count":0,"template":true,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Suundumused%2Fproxy-scrape","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Suundumused%2Fproxy-scrape/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Suundumused%2Fproxy-scrape/releases",
"manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Suundumused%2Fproxy-scrape/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Suundumused","download_url":"https://codeload.github.com/Suundumused/proxy-scrape/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244767542,"owners_count":20507110,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["annonymous","anonymity","anonymization","anonymizer","anonymous-proxy","proxy","proxy-checker","proxy-list","proxy-pattern","proxy-scraper","proxypool","python","python-script","python3"],"created_at":"2024-11-27T13:35:41.279Z","updated_at":"2025-03-21T08:45:35.303Z","avatar_url":"https://github.com/Suundumused.png","language":"Python","readme":"﻿# Proxy Scrape\nProject to receive, validate and store a list of free proxies.\n\n# Installation\n - `pip install -r .../Requirements.txt`\n\n# Requirements\n - argparse\n - urllib3 \n - requests\n - jinja2 \n - requests[socks]\n - --\n - proxies_validator.py\n - custom_arg_types.py\n\n## Usage \nThese initial lines are required in proxy_scrape.py for basic functionality.\n\n    import argparse\n    import csv\n    import json\n    import requests\n    import os\n    import time\n    \n    from jinja2 import Template\n    from proxies_validator import test_servers\n    from custom_arg_types import str_bool_switcher_type, tuple_type\n----\nThe Instance initially receives the arguments:\n\n - `-pem` Certificate file path file.pem\n - `-to` Time interval for testing each proxy 
server.\n---\n\n    client = ProxyReceiver(args.certificate, args.time_out)\n\n## Overall arguments\n\n - `-p` Tuple of protocols to search proxy servers for, e.g. `--protocols 'https', 'socks5'`.\n - `-l` Maximum number of tested, valid proxies to keep per protocol.\n - `-url` API URL that returns the list of IP:PORT proxy servers. The {{protocol_value}} placeholder is mandatory and must follow the `protocol=` parameter (or whichever parameter the API uses to select the protocol).\n - `-out` Output folder for the CSV file containing the tested proxy list.\n\n## Functions\n\n    content = client.retrieve_free_proxy_list(args.link, protocol)\n    \n\n - Retrieves the list of proxy servers for the selected protocol from the API URL, as a string.\n\n---\n\n    client.write_valid_list(content, protocol, args.output_folder, args.limit)\n\n - Tests and validates each proxy (via `test_servers(...)`) and saves its IP, port and protocol to a CSV file.\n ---\n\n    test_servers(protocol, row, self.sess, self.certificate, self.old_ip)\n     \n - Standalone function that tests the connection to the server and validates IP filtering.\n\n\n## CSV structure\n\n|    url          |port                          |protocol                         |\n|----------------|-------------------------------|-----------------------------|\n|123.456.78.90   |1234                           |socks5                       |\n|098.765.43.21   |4321                           |https                        |\n.....\n\n\n## Custom arg classes\n\n    str_bool_switcher_type(arg)\n\n - Used by the `--certificate` (`-pem`) argument; dynamically switches between string and bool.\n - str: the path to the request certificate folder.\n - bool, True: use the integrated certificate.\n - bool, False: no check.\n\nUsage example: \n\n    self.rex = self.sess.get('https://....', timeout=10, verify=self.certificate)\n\n\n---\n    tuple_type(arg)\n\n - Used by the `--protocols` (`-p`) argument; receives a list of protocols, e.g. `--protocols 'http', 'socks4'`.\n\nUsage 
example: \n\n    for protocol in args.protocols:\n        print(f'---\\nProtocol selected: {protocol}')\n\n        content = client.retrieve_free_proxy_list(args.link, protocol)\n        client.write_valid_list(content, protocol, args.output_folder, args.limit)\n\n---\n\n    proxies = {'http': f'{protocol}://{url}',\n               'https': f'{protocol}://{url}'}\n---\n\n    resp = self.sess.get('https://....', timeout=5, proxies=proxies, verify=self.certificate)\n\n## 💖 Support Me\n\nIf you find my work valuable and want to support me, consider making a donation. Your contribution goes a long way in helping me continue my open-source contributions and creating awesome content!\n\n[![Buy me a coffee](https://img.shields.io/badge/Buy%20me%20a%20coffee-Donate-blue.svg)](https://www.paypal.com/donate/?hosted_button_id=A2S5G97QM7XCJ)\n[![PayPal](https://img.shields.io/badge/PayPal-Donate-blue.svg)](https://www.paypal.com/donate/?hosted_button_id=A2S5G97QM7XCJ)\n","funding_links":["https://www.paypal.com/donate/?hosted_button_id=A2S5G97QM7XCJ"],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsuundumused%2Fproxy-scrape","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsuundumused%2Fproxy-scrape","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsuundumused%2Fproxy-scrape/lists"}