https://github.com/martinblech/xmltodict
Python module that makes working with XML feel like you are working with JSON
- Host: GitHub
- URL: https://github.com/martinblech/xmltodict
- Owner: martinblech
- License: mit
- Created: 2012-04-17T14:38:21.000Z (over 13 years ago)
- Default Branch: master
- Last Pushed: 2024-10-16T06:08:59.000Z (about 1 year ago)
- Last Synced: 2025-05-07T06:28:21.891Z (7 months ago)
- Language: Python
- Homepage:
- Size: 249 KB
- Stars: 5,608
- Watchers: 106
- Forks: 460
- Open Issues: 92
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
Awesome Lists containing this project
- awesome-web-scraping - xmltodict - XML to Python dict converter (Specialized Tools / HTML/XML Processing)
- fucking-awesome-python-cn - xmltodict
- awesome-python - xmltodict - Working with XML feel like you are working with JSON. (HTML Manipulation)
- best-of-python - GitHub
- my-awesome-github-stars - martinblech/xmltodict - Python module that makes working with XML feel like you are working with JSON (Python)
- awesome-python-resources - GitHub - 27% open · ⏱️ 08.05.2022): (HTML Processing)
- awesome-list - xmltodict - Python module that makes working with XML feel like you are working with JSON. (Data Format & I/O / For Python)
- awesome-python-zh - xmltodict - Working with XML feels like you are working with JSON. (HTML Manipulation)
- awesome-python-again -
- starred-awesome - xmltodict - Python module that makes working with XML feel like you are working with JSON (Python)
- awesome-data-analysis - Xmltodict - Converts XML to Python dictionaries. (📦 Additional Python Libraries / Documentation & File Processing)
- python-awesome - xmltodict - Working with XML feel like you are working with JSON. (HTML Manipulation)
- awesome-python-machine-learning-resources - GitHub - 27% open · ⏱️ 08.05.2022): (Data Reading, Writing, and Extraction)
- links - xmltodict - Extract data from XML just like JSON. (Python / Python libraries)
- awesome-python - xmltodict - Python module that makes working with XML feel like you are working with JSON ` 📝 a year ago ` (HTML Manipulation [🔝](#readme))
- fucking-awesome-python - xmltodict - Working with XML feel like you are working with JSON. (HTML Manipulation)
- fucking-awesome-python - :octocat: xmltodict - :star: 5173 :fork_and_knife: 469 - Working with XML feel like you are working with JSON. (HTML Manipulation)
- awesome-etl - xmltodict - Makes working with XML as easy as working with JSON. Also allows streaming so you don't run out of memory on large XML files. Great for simple operations on small XML files. (Python / Libraries)
- awesome-python-cn - xmltodict
- fucking_awesome_python - xmltodict - Working with XML feel like you are working with JSON. (HTML Manipulation)
- Awesome-Python - xmltodict - Working with XML feel like you are working with JSON. (HTML Manipulation)
- Python-Awesome - xmltodict - Working with XML feel like you are working with JSON. (HTML Manipulation)
- awesome-python - xmltodict - Python module that makes working with XML feel like you are working with JSON (Awesome Python / HTML Manipulation)
- git-github.com-vinta-awesome-python - xmltodict - Working with XML feel like you are working with JSON. (HTML Manipulation)
- awesome-devasc - xmltodict
- awesome-python - xmltodict - A Python library that makes working with XML as easy as JSON. (Libraries / JSON and Data Parsing)
README
# xmltodict
`xmltodict` is a Python module that makes working with XML feel like you are working with [JSON](http://docs.python.org/library/json.html), as in this ["spec"](http://www.xml.com/pub/a/2006/05/31/converting-between-xml-and-json.html):
```python
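>>> import json
>>> import xmltodict
>>> print(json.dumps(xmltodict.parse("""
...  <mydocument has="an attribute">
...    <and>
...      <many>elements</many>
...      <many>more elements</many>
...    </and>
...    <plus a="complex">
...      element as well
...    </plus>
...  </mydocument>
...  """), indent=4))
{
    "mydocument": {
        "@has": "an attribute",
        "and": {
            "many": [
                "elements",
                "more elements"
            ]
        },
        "plus": {
            "@a": "complex",
            "#text": "element as well"
        }
    }
}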
```
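Because the result is an ordinary nested mapping, values can be read with plain key and index lookups. The following is a minimal sketch of that access pattern; the variable names (`doc`, `root`, `items`, and so on) are illustrative and not part of the library.

```python
import xmltodict

doc = xmltodict.parse("""
<mydocument has="an attribute">
  <and>
    <many>elements</many>
    <many>more elements</many>
  </and>
  <plus a="complex">
    element as well
  </plus>
</mydocument>
""")

root = doc["mydocument"]

# Attributes are stored under keys prefixed with "@".
attribute = root["@has"]

# Repeated child elements are collected into a list.
items = root["and"]["many"]

# An element with both attributes and text keeps its text under "#text".
plus_attr = root["plus"]["@a"]
plus_text = root["plus"]["#text"]

print(attribute, items, plus_attr, plus_text)
```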
## Namespace support
By default, `xmltodict` does no XML namespace processing (it just treats namespace declarations as regular node attributes), but passing `process_namespaces=True` will make it expand namespaces for you:
```python
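>>> xml = """
... <root xmlns="http://defaultns.com/"
...       xmlns:a="http://a.com/"
...       xmlns:b="http://b.com/">
...   <x>1</x>
...   <a:y>2</a:y>
...   <b:z>3</b:z>
... </root>
... """
>>> xmltodict.parse(xml, process_namespaces=True) == {
...     'http://defaultns.com/:root': {
...         'http://defaultns.com/:x': '1',
...         'http://a.com/:y': '2',
...         'http://b.com/:z': '3',
...     }
... }
True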
```
It also lets you collapse certain namespaces to shorthand prefixes, or skip them altogether:
```python
>>> namespaces = {
...     'http://defaultns.com/': None, # skip this namespace
...     'http://a.com/': 'ns_a', # collapse "http://a.com/" -> "ns_a"
... }
>>> xmltodict.parse(xml, process_namespaces=True, namespaces=namespaces) == {
...     'root': {
...         'x': '1',
...         'ns_a:y': '2',
...         'http://b.com/:z': '3',
...     },
... }
True
```
## Streaming mode
`xmltodict` is very fast ([Expat](http://docs.python.org/library/pyexpat.html)-based) and has a streaming mode with a small memory footprint, suitable for big XML dumps like [Discogs](http://discogs.com/data/) or [Wikipedia](http://dumps.wikimedia.org/):
```python
>>> from gzip import GzipFile
>>>
>>> def handle_artist(_, artist):
...     print(artist['name'])
...     return True
>>>
>>> xmltodict.parse(GzipFile('discogs_artists.xml.gz'),
...     item_depth=2, item_callback=handle_artist)
A Perfect Circle
Fantômas
King Crimson
Chris Potter
...
```
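To try streaming mode without downloading a dump, the same callback pattern can be run against a small in-memory document. This is a sketch under stated assumptions: the `<artists>`/`<artist>` layout is a made-up stand-in for the real Discogs schema, and collecting names into a list replaces the `print` call so the result can be inspected.

```python
import xmltodict

# Hypothetical miniature of an artists dump; the real schema may differ.
xml = """
<artists>
  <artist><name>A Perfect Circle</name></artist>
  <artist><name>King Crimson</name></artist>
</artists>
"""

names = []

def handle_artist(path, artist):
    # `path` is a list of (tag, attributes) pairs leading to the current item.
    # `artist` is the dict built for one element at depth 2.
    names.append(artist["name"])
    # A true return value tells the parser to keep going.
    return True

xmltodict.parse(xml, item_depth=2, item_callback=handle_artist)
print(names)
```

Each `<artist>` element is handed to the callback as soon as it closes, so the items do not have to be held in memory together.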
It can also be used from the command line to pipe objects to a script like this:
```python
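import sys, marshal
while True:
    try:
        # marshal data is binary, so read from the underlying byte stream
        _, article = marshal.load(sys.stdin.buffer)
    except EOFError:
        break  # no more objects on stdin
    print(article['title'])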
```
```sh
$ bunzip2 -c enwiki-pages-articles.xml.bz2 | xmltodict.py 2 | myscript.py
AccessibleComputing
Anarchism
AfghanistanHistory
AfghanistanGeography
AfghanistanPeople
AfghanistanCommunications
Autism
...
```
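The pipeline above passes a sequence of marshalled `(path, item)` tuples between processes. The sketch below mimics that hand-off inside one process with an in-memory buffer, using only the standard library. The helper name `read_items` and the sample paths are illustrative assumptions, not part of `xmltodict`.

```python
import io
import marshal

def read_items(stream):
    """Yield (path, item) tuples until the marshal stream is exhausted."""
    while True:
        try:
            yield marshal.load(stream)
        except EOFError:
            return

# Producer side: write one marshalled tuple per item, back to back.
buf = io.BytesIO()
for title in ("AccessibleComputing", "Anarchism"):
    path = [("mediawiki", None), ("page", None)]  # made-up path for illustration
    marshal.dump((path, {"title": title}), buf)

# Consumer side: rewind and read the tuples back one at a time.
buf.seek(0)
titles = [item["title"] for _, item in read_items(buf)]
print(titles)
```

In a real consumer script, `sys.stdin.buffer` would take the place of `buf`.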
Or just cache the dicts so you don't have to parse that big XML file again. You do this only once:
```sh
$ bunzip2 -c enwiki-pages-articles.xml.bz2 | xmltodict.py 2 | gzip > enwiki.dicts.gz
```
And you reuse the dicts with every script that needs them:
```sh
$ gunzip -c enwiki.dicts.gz | script1.py
$ gunzip -c enwiki.dicts.gz | script2.py
...
```
## Roundtripping
You can also convert in the other direction, using the `unparse()` function:
```python
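>>> mydict = {
...     'response': {
...         'status': 'good',
...         'last_updated': '2014-02-16T23:10:12Z',
...     }
... }
>>> print(xmltodict.unparse(mydict, pretty=True))
<?xml version="1.0" encoding="utf-8"?>
<response>
	<status>good</status>
	<last_updated>2014-02-16T23:10:12Z</last_updated>
</response>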
```
A node's text value is given under the `cdata_key` key of its Python dict, and a node's attributes are given as keys prefixed with `attr_prefix`. By default, `attr_prefix` is `@` and `cdata_key` is `#text`.
```python
>>> import xmltodict
>>>
>>> mydict = {
...     'text': {
...         '@color':'red',
...         '@stroke':'2',
...         '#text':'This is a test'
...     }
... }
>>> print(xmltodict.unparse(mydict, pretty=True))
<?xml version="1.0" encoding="utf-8"?>
<text color="red" stroke="2">This is a test</text>
```
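A quick way to see roundtripping in action is to feed the output of `unparse()` back into `parse()`. The sketch below also passes `attr_prefix` and `cdata_key` to `parse()` to show how non-default markers change the keys. The replacement values `"!"` and `"value"` are arbitrary choices for illustration.

```python
import xmltodict

original = {
    "text": {
        "@color": "red",
        "@stroke": "2",
        "#text": "This is a test",
    }
}

# dict -> XML -> dict with the default "@" and "#text" markers
xml = xmltodict.unparse(original)
restored = xmltodict.parse(xml)

# The same XML parsed with custom markers for attributes and text
custom = xmltodict.parse(xml, attr_prefix="!", cdata_key="value")

print(restored == original)
print(custom)
```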
Lists that are specified under a key in a dictionary use the key as a tag for each item. But if a list does not have a parent key, for example if a list exists inside another list, it has no tag to use and its items are converted to a string, as shown in the example below. To give tags to nested lists, use the `expand_iter` keyword argument to provide a tag, as demonstrated below. Note that using `expand_iter` will break roundtripping.
```python
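>>> mydict = {
...     "line": {
...         "points": [
...             [1, 5],
...             [2, 6],
...         ]
...     }
... }
>>> print(xmltodict.unparse(mydict, pretty=True))
<?xml version="1.0" encoding="utf-8"?>
<line>
	<points>[1, 5]</points>
	<points>[2, 6]</points>
</line>
>>> print(xmltodict.unparse(mydict, pretty=True, expand_iter="coord"))
<?xml version="1.0" encoding="utf-8"?>
<line>
	<points>
		<coord>1</coord>
		<coord>5</coord>
	</points>
	<points>
		<coord>2</coord>
		<coord>6</coord>
	</points>
</line>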
```
## Ok, how do I get it?
### Using pypi
You just need to
```sh
$ pip install xmltodict
```
### Using conda
For installing `xmltodict` using Anaconda/Miniconda (*conda*) from the
[conda-forge channel][#xmltodict-conda] all you need to do is:
[#xmltodict-conda]: https://anaconda.org/conda-forge/xmltodict
```sh
$ conda install -c conda-forge xmltodict
```
### RPM-based distro (Fedora, RHEL, …)
There is an [official Fedora package for xmltodict](https://apps.fedoraproject.org/packages/python-xmltodict).
```sh
$ sudo yum install python-xmltodict
```
### Arch Linux
There is an [official Arch Linux package for xmltodict](https://www.archlinux.org/packages/community/any/python-xmltodict/).
```sh
$ sudo pacman -S python-xmltodict
```
### Debian-based distro (Debian, Ubuntu, …)
There is an [official Debian package for xmltodict](https://tracker.debian.org/pkg/python-xmltodict).
```sh
$ sudo apt install python-xmltodict
```
### FreeBSD
There is an [official FreeBSD port for xmltodict](https://svnweb.freebsd.org/ports/head/devel/py-xmltodict/).
```sh
$ pkg install py36-xmltodict
```
### openSUSE/SLE (SLE 15, Leap 15, Tumbleweed)
There is an [official openSUSE package for xmltodict](https://software.opensuse.org/package/python-xmltodict).
```sh
# Python2
$ zypper in python2-xmltodict
# Python3
$ zypper in python3-xmltodict
```