Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/desion/tidy_page
An HTML parser. Given an HTML document, it extracts the main body text and the title from the page. It is intended for web page parsing and content extraction.
html parser python2 spider
- Host: GitHub
- URL: https://github.com/desion/tidy_page
- Owner: desion
- License: MIT
- Created: 2017-02-20T09:21:25.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2017-05-31T06:08:13.000Z (over 7 years ago)
- Last Synced: 2024-08-30T22:40:45.600Z (28 days ago)
- Topics: html, parser, python2, spider
- Language: Python
- Size: 28.3 KB
- Stars: 5
- Watchers: 3
- Forks: 4
- Open Issues: 1
Metadata Files:
- Readme: README.rst
- License: LICENSE
README
# tidy_page
tidy_page is an HTML parser. Given an HTML document, it extracts the main body text and the title from the page. It is intended for web page parsing and content extraction.
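
The README does not show how tidy_page is called, so the following is only a minimal sketch of the general idea (pulling a title and visible text out of an HTML document), written against Python 3's standard-library `html.parser`. It is not tidy_page's API: the names `TitleAndTextExtractor` and `extract_title_and_text` are hypothetical, and the sketch simply collects all visible text rather than applying any heuristic for locating the main content block.

```python
from html.parser import HTMLParser


class TitleAndTextExtractor(HTMLParser):
    """Hypothetical helper (not part of tidy_page): collects the <title>
    text and all visible text, skipping <script> and <style> contents."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.title_parts = []
        self.text_parts = []
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)
        elif self._skip_depth == 0 and data.strip():
            # Keep only non-empty text outside script/style elements.
            self.text_parts.append(data.strip())


def extract_title_and_text(html):
    """Return (title, text) for an HTML string; text is newline-joined."""
    parser = TitleAndTextExtractor()
    parser.feed(html)
    parser.close()
    title = "".join(parser.title_parts).strip()
    text = "\n".join(parser.text_parts)
    return title, text
```

A call such as `extract_title_and_text(page_source)` returns a `(title, text)` tuple. A real content extractor would usually go further, for example by scoring blocks on text density to drop navigation and footers, which this sketch does not attempt.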