https://github.com/dalito/ucumvert
Python parser for UCUM (Unified Code for Units of Measure) incl. converter to pint units
- Host: GitHub
- URL: https://github.com/dalito/ucumvert
- Owner: dalito
- License: mit
- Created: 2024-01-02T23:24:30.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2025-05-10T22:10:20.000Z (5 months ago)
- Last Synced: 2025-05-10T22:18:58.598Z (5 months ago)
- Topics: lark, pint, ucum, units-of-measure
- Language: Python
- Homepage:
- Size: 360 KB
- Stars: 9
- Watchers: 1
- Forks: 3
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
[CI](https://github.com/dalito/ucumvert/actions/workflows/ci.yml)
[PyPI](https://pypi.org/project/ucumvert)

# Easier access to UCUM from Python
> **Feedback welcome!**
> Currently only the conversion direction from UCUM to pint is supported.
> Please review the definitions before you trust them.
> While we have many tests in place and reviewed the mappings carefully, bugs may still be present.

[UCUM](https://ucum.org/) (Unified Code for Units of Measure) is a code system intended to cover all units of measure.
It provides a formalism to express units in an unambiguous way suitable for electronic communication.
Note that UCUM does not provide a canonical representation, e.g. `m/s` and `m.s-1` express the same unit in two ways.

**ucumvert** is a pip-installable Python package. Features:
- Parser for UCUM unit strings that implements the full grammar.
- Converter for creating [pint](https://pypi.org/project/pint/) units from UCUM unit strings.
- A pint unit definition file [pint_ucum_defs.txt](https://github.com/dalito/ucumvert/blob/main/src/ucumvert/pint_ucum_defs.txt) that extends pint's default units with UCUM units. All UCUM units from the new version 2.2 of the specification (June 2024) are included.

**ucumvert** generates the UCUM grammar by filling a template with unit codes, prefixes etc. from the official [ucum-essence.xml](https://github.com/ucum-org/ucum/blob/main/ucum-essence.xml) file (a copy is included in this repo).
So updating the parser for new UCUM releases is straightforward.
The parser is built with the great [lark](https://pypi.org/project/lark/) parser toolkit.
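
The template-filling idea can be pictured with a minimal sketch that uses only the standard library. Everything in it is an assumption made for illustration: the abridged XML stands in for `ucum-essence.xml` (element and attribute names are assumed to mirror the real file), and `GRAMMAR_TEMPLATE`, `quoted_alternatives` and `build_grammar` are hypothetical names, not part of ucumvert.

```python
import xml.etree.ElementTree as ET
from string import Template

# Heavily abridged stand-in for ucum-essence.xml (assumed structure).
ESSENCE = """
<root xmlns="http://unitsofmeasure.org/ucum-essence">
  <prefix Code="k"/>
  <prefix Code="m"/>
  <base-unit Code="m"/>
  <base-unit Code="s"/>
  <unit Code="cal_[15]"/>
</root>
"""

# Hypothetical grammar template; ucumvert's real template differs.
GRAMMAR_TEMPLATE = Template("PREFIX: $prefixes\nATOM: $atoms\n")

NS = {"u": "http://unitsofmeasure.org/ucum-essence"}


def quoted_alternatives(codes):
    """Join codes as quoted alternatives, longest code first."""
    ordered = sorted(set(codes), key=lambda code: (-len(code), code))
    return " | ".join(f'"{code}"' for code in ordered)


def build_grammar(xml_text):
    """Fill the template with the prefix and unit codes found in the XML."""
    root = ET.fromstring(xml_text.strip())
    prefixes = [el.get("Code") for el in root.findall("u:prefix", NS)]
    atoms = [
        el.get("Code")
        for tag in ("base-unit", "unit")
        for el in root.findall(f"u:{tag}", NS)
    ]
    return GRAMMAR_TEMPLATE.substitute(
        prefixes=quoted_alternatives(prefixes),
        atoms=quoted_alternatives(atoms),
    )


print(build_grammar(ESSENCE))
```

Because the codes are read from the XML file on every run, a new UCUM release only requires swapping the input file and regenerating the grammar.
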
The generated lark grammar file for case-sensitive UCUM codes is included in the repository, see [ucum_grammar.lark](https://github.com/dalito/ucumvert/blob/main/src/ucumvert/ucum_grammar.lark).

Some of the UCUM unit atoms are invalid unit names in pint, for example `cal_[15]`, `m[H2O]`, `10*`, `[in_i'H2O]`.
For all of them we define mappings to valid pint unit names in [ucum_pint.py](https://github.com/dalito/ucumvert/blob/main/src/ucumvert/ucum_pint.py), e.g. `{"cal_[15]": "cal_15"}`.

## Install

ucumvert is available as a Python package from [PyPI](https://pypi.org/project/ucumvert) and can be pip-installed in the usual way.
```bash
pip install ucumvert
```

To install the most recent code from git in developer mode, including creation of a virtual environment, use:

Linux
```bash
git clone https://github.com/dalito/ucumvert.git
cd ucumvert
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -e .[dev]
```

Windows
```bash
git clone https://github.com/dalito/ucumvert.git
cd ucumvert
py -m venv .venv
.venv\Scripts\activate.bat
py -m pip install --upgrade pip
pip install -e .[dev]
```

Optionally you can visualize the parse trees with [Graphviz](https://www.graphviz.org/) as shown below. It requires the additional package [pydot](https://pypi.org/project/pydot/); install by running `pip install pydot`.

## Demo
We provide a basic command line interface.
```cmd
(.venv) $ ucumvert
```

It has an interactive mode to test parsing UCUM codes:
```cmd
(.venv) $ ucumvert -i
Enter UCUM unit code to parse, or 'q' to quit.
> m/s2.kg
Created visualization of parse tree (parse_tree.png).
main_term
term
term
simple_unit m
/
annotatable
simple_unit s
2
.
simple_unit
k
g
--> Pint
> q
```

So the intermediate result is a tree, which is then traversed to convert its elements to pint quantities (or pint-compatible strings).
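
The traversal step can be illustrated with a small, self-contained sketch. It is not ucumvert's implementation: the nested-tuple tree, the `PREFIXES` and `ATOMS` tables, and the function `to_pint_string` are hypothetical stand-ins, and only the `cal_[15]` to `cal_15` mapping is taken from the text above.

```python
# Hypothetical, simplified tree for "m/s2.kg" as nested tuples.
# UCUM evaluates "." and "/" left to right, so this is (m / s2) . kg.
TREE = (
    "term",
    ("term", ("unit", "m"), "/", ("power", ("unit", "s"), 2)),
    ".",
    ("unit", "k", "g"),  # prefix "k" followed by atom "g"
)

# Illustrative lookup tables (assumed subset, not ucumvert's mappings).
PREFIXES = {"k": "kilo", "m": "milli"}
ATOMS = {"m": "meter", "s": "second", "g": "gram", "cal_[15]": "cal_15"}


def to_pint_string(node):
    """Recursively convert a tree node to a pint-compatible string."""
    kind = node[0]
    if kind == "unit":
        # Optional prefix first, the unit atom always last.
        *prefix, atom = node[1:]
        return "".join(PREFIXES[p] for p in prefix) + ATOMS[atom]
    if kind == "power":
        return f"{to_pint_string(node[1])}**{node[2]}"
    if kind == "term":
        left, operator, right = node[1:]
        symbol = "*" if operator == "." else "/"
        return f"({to_pint_string(left)} {symbol} {to_pint_string(right)})"
    raise ValueError(f"unknown node type: {kind!r}")


print(to_pint_string(TREE))  # ((meter / second**2) * kilogram)
```

The explicit parentheses keep the left-to-right grouping of the UCUM expression visible in the output string.
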

The package includes a UCUM-aware pint `UnitRegistry` which loads all definitions for UCUM units on instantiation.
It comes with an additional method `from_ucum` to convert UCUM codes to pint.

```python
>>> from ucumvert import PintUcumRegistry
>>> ureg = PintUcumRegistry()
>>> ureg.from_ucum("m/s2.kg")
>>> ureg.from_ucum("m[H2O]{35Cel}")  # UCUM code with annotation
>>> _.to("mbar")
>>> ureg("degC")  # a standard pint unit
```

## Tests

The unit tests include parsing and converting all common UCUM unit codes from the official repo. Run the test suite with:
```bash
pytest
```

The common UCUM unit codes are available only in binary form (xlsx, docs, pdf).
This repository keeps a copy in TSV format, `ucum_examples.tsv`.
To (re)generate this TSV file from the official xlsx file in the [UCUM repository](https://github.com/ucum-org/ucum/tree/main/common-units), run:

```bash
pip install openpyxl
python src/ucumvert/vendor/get_ucum_example_as_tsv.py
```

## Useful links

- UCUM [online-validator](https://ucum.nlm.nih.gov/ucum-lhc/demo.html)
- Issue in pint that motivated this work: [To what extent is pint compatible with UCUM?](https://github.com/hgrecco/pint/issues/1769)

## License

The code in this repository is distributed under the MIT license, with the exception of the `ucum-*.*` files in the directory `src/ucumvert/vendor`,
which fall under the [UCUM Copyright Notice and License](https://github.com/ucum-org/ucum/blob/main/LICENSE.md) (Version 1.0).
According to §1.3 of that license, we do not consider **ucumvert** a "Derivative Work" of UCUM because **ucumvert** only *"interoperates with an unmodified instance of the Work"*.