https://github.com/imgurbot12/pyxml
Pure python3 alternative to stdlib xml.etree with HTML support
- Host: GitHub
- URL: https://github.com/imgurbot12/pyxml
- Owner: imgurbot12
- License: mit
- Created: 2023-04-01T23:53:37.000Z (about 3 years ago)
- Default Branch: master
- Last Pushed: 2025-01-07T22:47:56.000Z (over 1 year ago)
- Last Synced: 2025-01-07T23:31:18.676Z (over 1 year ago)
- Topics: html-parser, parser, python, python3, xml, xml-parser
- Language: Python
- Homepage:
- Size: 97.7 KB
- Stars: 1
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Pyxml
------
Pure Python 3 alternative to the stdlib `xml.etree`, with HTML support
### Install
```
pip install pyxml3
```
### Advantages
1. The default parser ignores XML entity declarations, which avoids
most, if not all, entity-related XML vulnerabilities, such as the
[Billion Laughs attack](https://en.wikipedia.org/wiki/Billion_laughs_attack).
2. Our XPath implementation is much more complete than that of `xml.etree`
and even lxml. Additional functions and features make it easier to
parse complex data structures in a single line.
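
To illustrate the class of problem the first point refers to, here is a minimal sketch that uses only the standard library, not pyxml. It assumes a stock CPython `xml.etree.ElementTree` backed by expat, which expands entities declared in an inline DTD. The three-level payload is a hypothetical, deliberately tiny stand-in for a real Billion Laughs document.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature payload: each entity expands to three copies
# of the previous one, so the text grows geometrically per level.
payload = (
    '<?xml version="1.0"?>'
    '<!DOCTYPE lolz ['
    '<!ENTITY a "ha">'
    '<!ENTITY b "&a;&a;&a;">'
    '<!ENTITY c "&b;&b;&b;">'
    ']>'
    '<lolz>&c;</lolz>'
)

# The stdlib parser resolves the nested entities while parsing.
root = ET.fromstring(payload)
expanded = root.text

print(len(expanded))  # 18 characters produced from a 2-character entity
```

Each additional level multiplies the output again, so a real attack document reaches gigabytes from a few hundred bytes of input. A parser that ignores entity declarations never performs this expansion.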
### Examples
###### Standard Usage:
```python
import pyxml
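
# parse an XML document from bytes
etree = pyxml.fromstring(b'<div><p>Hello World!</p></div>')
for element in etree.iter():
    print(element)

# parse an XML document from an open binary file
with open('example.xml', 'rb') as f:
    etree = pyxml.fromstring(f)
    print(etree)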
```
###### Monkey Patch:
```python
import pyxml
pyxml.compat.monkey_patch()
from xml.etree import ElementTree as ET
etree = ET.fromstring('<div><p>Hello World!</p></div>')
for element in etree.iter():
    print(element)

# extended XPath functions are available through the patched stdlib module
print(etree.find('//p[starts-with(lower-case(text()), "hello")]'))
```
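
For comparison, here is a rough sketch of how the unpatched standard library handles a similar query. It uses only `xml.etree.ElementTree` without pyxml, and assumes current CPython behavior, where the built-in XPath subset rejects function-call predicates with a `SyntaxError`. The sample markup is hypothetical.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring('<div><p>Hello World!</p></div>')

# A plain descendant search is within the stdlib XPath subset.
match = root.find('.//p')

# A predicate that calls an XPath function is not, so the lookup fails
# before any element is examined.
try:
    root.find('.//p[starts-with(text(), "Hello")]')
    rejected = False
except SyntaxError:
    rejected = True

print(match.text, rejected)
```

The path is written as `.//p` because the stdlib also refuses absolute paths such as `//p` when searching from an element.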
###### HTML:
```python
import pyxml.html

# parse an HTML fragment
etree = pyxml.html.fromstring('<div><p>Hello World!</p></div>')
for element in etree.iter():
    print(element)

# find the first <p> element whose text is not empty
print(etree.find('//p[notempty(text())]'))
```