https://github.com/xnffdd/proxypool
A self-hosted pool of free proxy IPs.
- Host: GitHub
- URL: https://github.com/xnffdd/proxypool
- Owner: xnffdd
- Created: 2017-12-18T08:01:47.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2019-03-03T13:58:47.000Z (almost 6 years ago)
- Last Synced: 2024-08-03T17:12:27.627Z (5 months ago)
- Topics: ip, pool, proxy, python, spider
- Language: Python
- Homepage:
- Size: 1.36 MB
- Stars: 76
- Watchers: 3
- Forks: 34
- Open Issues: 5
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-network-stuff - **63** stars
README
# proxypool
A self-hosted pool of free proxy IPs.
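## Features
- Automatically crawls free proxy IPs published on the public internet (currently supports 西刺代理 (Xici), 快代理 (Kuaidaili), and IP181)
- Periodically verifies that each proxy IP still works
- Exposes an HTTP API for fetching usable IPs
## Architecture
![Architecture](https://raw.githubusercontent.com/lsdir/proxypool/master/image/architecture.png)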
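The periodic verification step listed under Features could look roughly like the sketch below. This is not the project's code: `probe`, `validate`, the test URL, and the timeout are all hypothetical names and values chosen for illustration, and only the Python standard library is used. A scheduler would call `validate` at a fixed interval and write the results back to the database.

```python
import time
import urllib.request


def probe(host, port, test_url="http://example.com", timeout=5.0):
    """Fetch test_url through the proxy and return the elapsed seconds.

    test_url and timeout are illustrative assumptions, not project settings.
    Any failure (refused connection, timeout, bad reply) raises an exception.
    """
    handler = urllib.request.ProxyHandler(
        {"http": "http://{}:{}".format(host, port)}
    )
    opener = urllib.request.build_opener(handler)
    start = time.monotonic()
    with opener.open(test_url, timeout=timeout) as response:
        response.read(1)  # one byte is enough to show the proxy relays data
    return time.monotonic() - start


def validate(proxies, probe_fn=probe):
    """Split proxy records into (alive, dead).

    Each alive record is a copy of the input with a response_time field
    (seconds, rounded to two decimals) added.
    """
    alive, dead = [], []
    for record in proxies:
        try:
            elapsed = probe_fn(record["host"], record["port"])
        except Exception:
            dead.append(record)
        else:
            alive.append(dict(record, response_time=round(elapsed, 2)))
    return alive, dead
```

Passing the probe as a parameter keeps the bookkeeping testable without network access; a real deployment would likely also record the check time and compare export addresses to classify anonymity.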
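## Source layout
- /db database access
- /schedule periodic background tasks
- /spider crawlers
- /util shared utilities
- /web web service
- /log log file directory
- config.py global configuration
- main.py entry point
## Deployment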
- Download the whole project
- Install Python 3
- Install the Python packages
```
pip install -r requirements.txt
```
- Install MySQL
- Create the tables with db/proxy.sql
- Edit the configuration file config.py
- Run main.py
```
python main.py
```
## HTTP API
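Both endpoints in this section accept the same four optional query parameters. As a minimal sketch, the hypothetical helper `build_url` below assembles a request URL from them using only the standard library. The base URL is taken from the README examples; the helper name and the choice to reject unknown filters are assumptions, not part of the project.

```python
from urllib.parse import urlencode

# Host and port as used in the README examples; adjust to your deployment.
BASE_URL = "http://localhost:9999"

# The four filters documented for /get and /get_all.
ALLOWED = ("check_in_hour", "response_time_in_second", "protocol", "anonymity")


def build_url(endpoint, **filters):
    """Return the URL for `endpoint` ("get" or "get_all") with the given filters."""
    unknown = set(filters) - set(ALLOWED)
    if unknown:
        raise ValueError("unsupported filter(s): " + ", ".join(sorted(unknown)))
    # Leave out filters set to None so the server applies its documented defaults.
    params = {key: filters[key] for key in ALLOWED if filters.get(key) is not None}
    query = urlencode(params)
    return "{}/{}{}".format(BASE_URL, endpoint, "?" + query if query else "")
```

For example, `build_url("get_all", protocol="https")` yields a URL that can be passed directly to any HTTP client.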
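### 1. Get a single available IP
##### Basic information
URL|http://localhost:9999/get
:---|:---
HTTP method|GET
Returns|JSON
##### Request parameters (urlParam)
Name|Type|Required|Location|Description|Default
---|---|---|---|---|---
check_in_hour|float|No|urlParam|Only return proxies last verified within this many hours|24
response_time_in_second|float|No|urlParam|Only return proxies whose response time is within this many seconds|null
protocol|string|No|urlParam|Proxy protocol: http or https|null
anonymity|string|No|urlParam|Proxy anonymity level: transparent, anonymous, or high_anonymous|null
##### Request example (Python)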
```
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import requests

# Ask for one high-anonymity proxy that responds within 1.5 seconds.
url = "http://localhost:9999/get"
querystring = {"anonymity": "high_anonymous", "response_time_in_second": "1.5"}
response = requests.request("GET", url, params=querystring)
print(response.json())
```
##### Sample JSON response
```
{
    "ret": 0,
    "data": {
        "anonymity": "high_anonymous",
        "check_time": "2017-12-20 13:55:17",
        "country": "CN",
        "export_address": [
            "120.25.253.234"
        ],
        "from": "快代理",
        "grab_time": "2017-12-20 13:54:55",
        "host": "120.25.253.234",
        "port": "8118",
        "protocol": "http",
        "response_time": 1.45
    }
}
```
### 2. Get all available IPs
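##### Basic information
URL|http://localhost:9999/get_all
:---|:---
HTTP method|GET
Returns|JSON
##### Request parameters (urlParam)
Name|Type|Required|Location|Description|Default
---|---|---|---|---|---
check_in_hour|float|No|urlParam|Only return proxies last verified within this many hours|24
response_time_in_second|float|No|urlParam|Only return proxies whose response time is within this many seconds|null
protocol|string|No|urlParam|Proxy protocol: http or https|null
anonymity|string|No|urlParam|Proxy anonymity level: transparent, anonymous, or high_anonymous|null
##### Request example (Python)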
```
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import requests

# Ask for every high-anonymity HTTPS proxy that responds within 1.5 seconds.
url = "http://localhost:9999/get_all"
querystring = {"anonymity": "high_anonymous", "response_time_in_second": "1.5", "protocol": "https"}
response = requests.request("GET", url, params=querystring)
print(response.json())
```
##### Sample JSON response
```
{
    "ret": 0,
    "data": [
        {
            "anonymity": "high_anonymous",
            "check_time": "2017-12-20 14:10:25",
            "country": "CN",
            "export_address": [
                "118.114.77.47"
            ],
            "from": "西刺代理",
            "grab_time": "2017-12-20 14:09:36",
            "host": "118.114.77.47",
            "port": "8080",
            "protocol": "https",
            "response_time": 1.41
        },
        {
            "anonymity": "high_anonymous",
            "check_time": "2017-12-20 13:09:40",
            "country": "CN",
            "export_address": [
                "119.29.178.21"
            ],
            "from": "西刺代理",
            "grab_time": "2017-12-14 16:17:52",
            "host": "119.29.178.21",
            "port": "8118",
            "protocol": "https",
            "response_time": 1.11
        }
    ]
}
```
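Once a response has been parsed, each record in `data` can be turned into the `proxies` mapping that the `requests` library accepts. The sketch below is illustrative only: `fastest` and `to_requests_proxies` are hypothetical helper names, and it assumes that the `protocol` field names the target URL scheme the proxy can carry while the proxy itself is reached over plain HTTP.

```python
def fastest(records):
    """Return the record with the lowest response_time, or None if empty."""
    if not records:
        return None
    return min(records, key=lambda record: record["response_time"])


def to_requests_proxies(record):
    """Build a requests-style proxies mapping from one proxy record.

    Assumption: the proxy is contacted over plain HTTP, and `protocol`
    tells which kind of target URL (http or https) it can relay.
    """
    address = "http://{}:{}".format(record["host"], record["port"])
    return {record["protocol"]: address}


# Two records shaped like the /get_all sample response (fields trimmed).
sample = [
    {"host": "118.114.77.47", "port": "8080", "protocol": "https", "response_time": 1.41},
    {"host": "119.29.178.21", "port": "8118", "protocol": "https", "response_time": 1.11},
]
proxies = to_requests_proxies(fastest(sample))
```

The resulting mapping can then be passed as `requests.get(target_url, proxies=proxies)`. Free proxies go stale quickly, so callers would typically wrap that request in a retry loop that falls back to the next record.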