https://github.com/yyhsong/ipyspider

Python web crawling and information extraction

beautifulsoup lxml pyquery python-spider requests scrapy urllib

# iPySpider
Web crawling and information extraction with Python 3.x

The website is the API.

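## HTTP request libraries
- urllib: part of the Python standard library
- requests: a higher-level HTTP library built on urllib3 (a third-party package, distinct from the standard-library urllib)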
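
As a minimal sketch of the request step, the function below fetches a page with the standard-library `urllib` only. The `fetch_html` name and the `USER_AGENT` value are hypothetical choices made for this illustration and are not part of the project.

```python
import urllib.request

# Hypothetical User-Agent string; some sites reject urllib's default one.
USER_AGENT = "ipyspider-example/0.1"


def fetch_html(url, timeout=10.0):
    """Fetch a URL and return the response body decoded as text."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Use the charset declared in the Content-Type header, else assume UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling `fetch_html("https://example.com/")` would return the page source as a string. With requests installed, `requests.get(url, timeout=10).text` covers roughly the same ground in one line.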

## Document parsing and information extraction libraries
- lxml
- pyquery
- beautifulsoup
- re
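
To illustrate the extraction step, here is a small sketch that pulls link targets out of an HTML string with `re`. The pattern and the `extract_links` helper are assumptions made for this example. A regular expression is only adequate for simple, predictable markup, and a real parser such as lxml or beautifulsoup is usually the safer choice for arbitrary pages.

```python
import re
from typing import List

# Matches href="..." or href='...' inside an <a> tag.
# This is a simplification, not a full HTML parser.
LINK_RE = re.compile(r'<a\s[^>]*?href=(["\'])(.*?)\1', re.IGNORECASE | re.DOTALL)


def extract_links(html: str) -> List[str]:
    """Return the href values of all <a> tags found in the given HTML."""
    return [match.group(2) for match in LINK_RE.finditer(html)]
```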

## Web crawling framework
- scrapy

## Scheduling recurring crawl jobs
- APScheduler