https://github.com/tilde-lab/quantum_esperanto

Very fast parser for the XML logs produced with the VASP, Vienna Ab initio Simulation Package
https://github.com/tilde-lab/quantum_esperanto

ab-initio dft material-design materials materials-informatics materials-science vasp xml-files

Host: GitHub
URL: https://github.com/tilde-lab/quantum_esperanto
Owner: tilde-lab
License: mit
Created: 2017-08-01T18:09:12.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2024-02-28T15:16:51.000Z (10 months ago)
Last Synced: 2024-11-10T05:11:58.963Z (about 2 months ago)
Topics: ab-initio, dft, material-design, materials, materials-informatics, materials-science, vasp, xml-files
Language: Cython
Homepage:
Size: 723 KB
Stars: 6
Watchers: 3
Forks: 1
Open Issues: 8
Metadata Files:
- Readme: README.md
- License: LICENSE
- Citation: CITATION.cff

# Quantum Esperanto

[![DOI](https://zenodo.org/badge/99029873.svg)](https://doi.org/10.5281/zenodo.7693601)
[![PyPI](https://img.shields.io/pypi/v/quantum_esperanto.svg?style=flat)](https://pypi.org/project/quantum_esperanto)
[![Build Status](https://travis-ci.org/tilde-lab/quantum_esperanto.svg?branch=master)](https://travis-ci.org/tilde-lab/quantum_esperanto)

*Quantum Esperanto* is a fast parser, written in Cython, for the XML files output by DFT codes such as VASP.
It builds on lxml, a Python wrapper around the `libxml2` library, and on lxml's Cython interface.
XML files are parsed into a Python dictionary that mirrors the structure of the file. Parsing is up to 10 times faster than
with the parser used by the pymatgen project.

## Installation

The development versions of the `libxml2` and `libxslt` libraries must be installed on the system. Check with the command:

```
$ xslt-config
```

A C compiler such as `gcc` is also required. The recommended way to install Quantum Esperanto is with `pip` from PyPI:

```
$ pip install quantum_esperanto
```

To get the latest version of the package, install it from the source code on GitHub:

```
$ git clone https://github.com/tilde-lab/quantum_esperanto
$ cd quantum_esperanto
$ pip install .
```

The Python prerequisites for the package are `numpy` and `lxml`; `pip` should install them automatically.
The package can also be installed in development mode, which additionally installs `Cython` and the `nose` test framework.
To do so, run the following commands after cloning the repository:

```
$ cd quantum_esperanto
$ pip install -e .[dev]
```

After installation, run the test suite to check that everything was set up correctly. This can be
done with the following command in the `quantum_esperanto` directory:

```
$ python setup.py test
```

If everything is OK, you're all set to start using the package.

## Usage

Using the parser takes two steps: instantiate it, then call its `parse_file`
method, which returns a dictionary of the parsed values:

```
from quantum_esperanto.vasp import VaspParser
# Create the parser with its default settings
parser = VaspParser()
# Parse the file into a nested dictionary
d = parser.parse_file('vasprun.xml')
```

The possible arguments for the parser are:

**recover**
(boolean, default: *True*) allows parsing of broken XML. This is especially useful for unfinished
calculations. Note that parsing stops at the first XML error, so the returned dictionary contains only the values parsed
up to that point. Whenever recovery is needed, a warning is printed to stderr.

**whitelist**
(list, default: *None*) a list of parent tag names; only these tags are parsed. If *None*, all tags are parsed.
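
As a minimal sketch, the two options could be combined in a small helper. It assumes that `recover` and `whitelist` are accepted as keyword arguments by the `VaspParser` constructor; the helper name `parse_vasprun` and its `tags` parameter are hypothetical and not part of the package.

```python
def parse_vasprun(path, tags=None, recover=True):
    """Parse a vasprun.xml file, optionally restricted to the given parent tags."""
    # Imported here so that defining the helper does not require the package.
    from quantum_esperanto.vasp import VaspParser

    # Assumption: `recover` and `whitelist` are constructor keyword arguments.
    parser = VaspParser(recover=recover, whitelist=tags)
    return parser.parse_file(path)
```

Passing a list of parent tag names as `tags` should then limit the returned dictionary to those parts of the file, while leaving `tags` as `None` parses everything.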

### Parsing result

The result of parsing is a dictionary that follows the structure of `vasprun.xml`. Each key is
either the value of the `name` attribute (for `i`, `v`, and `varray` tags), a `tag:name` construction (for other tags
that have a `name` attribute), or the bare tag (for tags without a `name` attribute). The values are either tag contents
converted to the type specified by the `type` tag attribute or, for varrays and sets, NumPy arrays. Fortran overflows
(denoted by `*****`) are converted to NaN for float values and to MAXINT for integer values.
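
The overflow rule can be illustrated with a short stand-alone sketch. This is not the package's Cython code: the function `convert_value` is hypothetical, and `sys.maxsize` merely stands in for MAXINT, whose actual value in the library may differ.

```python
import math
import sys


def convert_value(text, type_name="float"):
    """Convert the text of a field to a Python value (illustrative only)."""
    text = text.strip()
    # Fortran prints asterisks when a number does not fit its output format.
    overflow = text.startswith("*")
    if type_name == "int":
        # Assumption: sys.maxsize plays the role of MAXINT here.
        return sys.maxsize if overflow else int(text)
    return math.nan if overflow else float(text)


print(convert_value("11.77059895"))
print(convert_value("********"))
print(convert_value("*****", "int") == sys.maxsize)
```

Mapping overflows to NaN and a sentinel integer keeps the arrays numeric, so an overflowed field does not abort parsing of the rest of the file.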

### Example

```xml
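<structure name="primitive_cell">
 <crystal>
  <varray name="basis">
   <v> 1.43300000 1.43300000 1.43300000 </v>
   <v> 1.43300000 -1.43300000 -1.43300000 </v>
   <v> -1.43300000 1.43300000 -1.43300000 </v>
  </varray>
  <i name="volume"> 11.77059895 </i>
  <varray name="rec_basis">
   <v> 0.34891835 0.34891835 0.00000000 </v>
   <v> 0.34891835 -0.00000000 -0.34891835 </v>
   <v> -0.00000000 0.34891835 -0.34891835 </v>
  </varray>
 </crystal>
 <varray name="positions">
  <v> 0.00000000 0.00000000 0.00000000 </v>
 </varray>
</structure>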
```

The *resulting dictionary* reads (printed with *pprint*):

```
{'structure:primitive_cell': {'crystal': {'basis': array([[ 1.433,  1.433,  1.433],
       [ 1.433, -1.433, -1.433],
       [-1.433,  1.433, -1.433]]),
                                          'rec_basis': array([[ 0.34891835,  0.34891835,  0.        ],
       [ 0.34891835, -0.        , -0.34891835],
       [-0.        ,  0.34891835, -0.34891835]]),
                                          'volume': 11.77059895},
                              'positions': array([[ 0.,  0.,  0.]])}}
```
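
Values are reached by ordinary nested key lookups. The sketch below uses a hand-written stand-in for the dictionary above, with plain nested lists in place of the NumPy arrays so that it runs without dependencies, and checks that the stored volume agrees with the determinant of the basis vectors. The helper `det3` is introduced here for illustration only.

```python
# Stand-in for the parsed dictionary; nested lists replace the NumPy arrays.
d = {
    "structure:primitive_cell": {
        "crystal": {
            "basis": [
                [1.433, 1.433, 1.433],
                [1.433, -1.433, -1.433],
                [-1.433, 1.433, -1.433],
            ],
            "volume": 11.77059895,
        },
        "positions": [[0.0, 0.0, 0.0]],
    }
}


def det3(m):
    """Determinant of a 3x3 matrix given as nested sequences."""
    (a, b, c), (d_, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d_ * i - f * g) + c * (d_ * h - e * g)


cell = d["structure:primitive_cell"]
basis = cell["crystal"]["basis"]

# The cell volume is the absolute value of the determinant of the basis.
computed_volume = abs(det3(basis))

print(len(cell["positions"]))  # number of atoms in the cell
print(abs(computed_volume - cell["crystal"]["volume"]) < 1e-6)
```

The same lookups work on the real parser output, since NumPy arrays support the indexing and row unpacking used here.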

## License