https://github.com/teamhg-memex/extract-html-diff
extract difference between two html pages
- Host: GitHub
- URL: https://github.com/teamhg-memex/extract-html-diff
- Owner: TeamHG-Memex
- License: mit
- Created: 2017-02-08T13:53:19.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2018-05-29T21:28:49.000Z (over 6 years ago)
- Last Synced: 2024-04-25T00:56:29.799Z (7 months ago)
- Topics: diff, html
- Language: HTML
- Size: 87.9 KB
- Stars: 32
- Watchers: 13
- Forks: 5
- Open Issues: 3
Metadata Files:
- Readme: README.rst
- Changelog: CHANGES.rst
- License: LICENSE.txt
README
extract-html-diff: extract difference between two html pages
============================================================

.. image:: https://img.shields.io/pypi/v/extract-html-diff.svg
   :target: https://pypi.python.org/pypi/extract-html-diff
   :alt: PyPI Version

.. image:: https://img.shields.io/travis/TeamHG-Memex/extract-html-diff/master.svg
   :target: http://travis-ci.org/TeamHG-Memex/extract-html-diff
   :alt: Build Status

.. image:: http://codecov.io/github/TeamHG-Memex/extract-html-diff/coverage.svg?branch=master
   :target: http://codecov.io/github/TeamHG-Memex/extract-html-diff?branch=master
   :alt: Code Coverage

This package extracts the difference between two HTML pages:
given pages A and B, it will try to extract the parts of A that are changed in B.
It uses ``lxml.html.diff`` under the hood, but provides only the changed parts as HTML.

It currently requires Python 3.

License is MIT.
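
For context, here is a minimal sketch of what ``lxml.html.diff`` does on its own, assuming ``lxml`` is installed. It calls ``htmldiff`` directly and is not taken from this package's source; the sample markup and variable names are illustrative only. ``htmldiff`` returns the whole merged document with insertions and deletions marked up, whereas this package returns only the changed parts.

```python
from lxml.html.diff import htmldiff

# Illustrative inputs (not from the package's documentation).
old_html = "<p>Hello world</p>"
new_html = "<p>Hello there</p>"

# htmldiff returns a string containing the merged markup, with added
# text wrapped in <ins> tags and removed text wrapped in <del> tags.
annotated = htmldiff(old_html, new_html)
print(annotated)
```

The unchanged word "Hello" is still present in this output, which is the part ``extract-html-diff`` is meant to leave out.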
Installation
------------

You can install the package from PyPI::

    pip install extract-html-diff
Usage
-----

You can extract the diff as text::

    import extract_html_diff
    html = '<div><h1>My site</h1><div>My content</div></div>'
    other_html = '<div><h1>My site</h1><div>Other content</div></div>'
    extract_html_diff.as_string(html, other_html)

This will give you::

    '<div><div>My content</div></div>'

You can also get the diff as a tree (an ``lxml.html.HtmlElement``) if
you plan to do additional transformations or change serialization::

    extract_html_diff.as_tree(html, other_html)

You can pass input html as ``str`` or ``bytes``
(it will be parsed with ``lxml.html.fromstring`` in this case), or as an already parsed
``lxml.html.HtmlElement``.

----

.. image:: https://hyperiongray.s3.amazonaws.com/define-hg.svg
   :target: https://www.hyperiongray.com/?pk_campaign=github&pk_kwd=extract-html-diff
   :alt: define hyperiongray