Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/daniel-sc/xml_normalize
Normalizes xml files. Options include sorting siblings based on provided attribute, remove nodes, normalize whitespace/trim and pretty print.
normalization sorting xml
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/daniel-sc/xml_normalize
- Owner: daniel-sc
- Created: 2021-03-12T09:22:25.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2022-11-10T10:33:24.000Z (over 2 years ago)
- Last Synced: 2024-11-01T06:42:11.260Z (3 months ago)
- Topics: normalization, sorting, xml
- Language: TypeScript
- Homepage:
- Size: 1.19 MB
- Stars: 4
- Watchers: 2
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
[![npm](https://img.shields.io/npm/v/xml_normalize)](https://www.npmjs.com/package/xml_normalize)
[![Coverage Status](https://coveralls.io/repos/github/daniel-sc/xml_normalize/badge.svg?branch=main)](https://coveralls.io/github/daniel-sc/xml_normalize?branch=main)
[![Language grade: JavaScript](https://img.shields.io/lgtm/grade/javascript/g/daniel-sc/xml_normalize.svg?logo=lgtm&logoWidth=18)](https://lgtm.com/projects/g/daniel-sc/xml_normalize/context:javascript)

# XML Normalize
This program allows normalizing arbitrary xml files.
Normalization can be configured:

* sort sibling elements based on some attribute value
* remove unwanted nodes
* trim texts
* normalize whitespaces/line breaks

This can be used as a post-/pre-processing step to keep diffs small for generated xml files.
## Usage
Either install via `npm i -g xml_normalize` or run directly with `npx xml_normalize`.
```text
Usage: npx xml_normalize [options]

Options:
  -i, --input-file            input file
  -o, --output-file           output file - if not provided result is printed to stdout
  -r, --remove-path           simple XPath(s) to remove elements - e.g. "/html/head[1]/script"
  -s, --sort-path             simple XPath that references an attribute to sort - e.g. "/html/head[1]/script/@src"
  --no-pretty                 Disable pretty format output
  --no-trim                   Disable trimming of whitespace at the beginning and end of text nodes (trims only pure text nodes)
  --no-attribute-trim         Disable trimming whitespace at the beginning and end of attribute values
  -tf, --trim-force           Trim the whitespace at the beginning and end of text nodes (trims as well text adjacent to nested nodes)
  -n, --normalize-whitespace  Normalize whitespaces inside text nodes and attribute values
  -d, --debug                 enable debug output
  -h, --help                  display help for command
```

## Options and Examples
### Sorting
Sorts siblings with the same tag name at a specific path
lexicographically, based on the value of a specific attribute.

Example:
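```xml
<!-- id values are illustrative -->
<root>
  <node>
    <child id="z">should be last</child>
    <child id="a">should be first</child>
  </node>
  <node>
    <child id="z">should be last</child>
    <child id="a">should be first</child>
  </node>
</root>
```
`npx xml_normalize -s /root/node/child/@id` will create:
```xml
<root>
  <node>
    <child id="a">should be first</child>
    <child id="z">should be last</child>
  </node>
  <node>
    <child id="a">should be first</child>
    <child id="z">should be last</child>
  </node>
</root>
```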
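
To illustrate the idea behind sorting, here is a minimal TypeScript sketch that reorders same-named siblings by an attribute value. It is not the tool's actual implementation: the `Elem` type, the `sortSiblings` function and the sample ids `z` and `a` are hypothetical. It also assumes a plain code-unit string comparison and that children with other tag names keep their positions.

```typescript
// Hypothetical in-memory element type, not the tool's internal model.
interface Elem {
  name: string;
  attrs: Record<string, string>;
  children: Elem[];
  text?: string;
}

// Sort the children named `tag` by attribute `attr`.
// Assumption: children with other names stay where they are.
function sortSiblings(parent: Elem, tag: string, attr: string): void {
  const sorted = parent.children
    .filter((child) => child.name === tag)
    .sort((a, b) => {
      const x = a.attrs[attr] ?? "";
      const y = b.attrs[attr] ?? "";
      return x < y ? -1 : x > y ? 1 : 0; // plain code-unit comparison
    });
  // Refill the positions of the matching children in sorted order.
  let next = 0;
  parent.children = parent.children.map((child) =>
    child.name === tag ? sorted[next++] : child
  );
}

const node: Elem = {
  name: "node",
  attrs: {},
  children: [
    { name: "child", attrs: { id: "z" }, children: [], text: "should be last" },
    { name: "child", attrs: { id: "a" }, children: [], text: "should be first" },
  ],
};
sortSiblings(node, "child", "id");
console.log(node.children.map((c) => c.text));
```

A real implementation would apply this to every parent matched by the sort path, which is why both `node` elements end up sorted in the example.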
### Removing
Removes nodes at a specific path.
Example:
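```xml
<root>
  <node>
    <child>should be removed</child>
    <child>should be removed</child>
  </node>
  <node>
    <child>should stay</child>
    <child>should stay</child>
  </node>
</root>
```
`npx xml_normalize -r /root/node[1]/child` will create:
```xml
<root>
  <node/>
  <node>
    <child>should stay</child>
    <child>should stay</child>
  </node>
</root>
```
`npx xml_normalize -r /root/node/child` will instead create:
```xml
<root>
  <node/>
  <node/>
</root>
```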
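
The difference between `/root/node[1]/child` and `/root/node/child` can be sketched in TypeScript as follows. This is only an illustration under assumptions, not the tool's code: `Elem`, `Step` and `removeMatches` are hypothetical names, the steps are given relative to the root element (so the leading `/root` is omitted), and an index is taken to count same-named siblings starting at 1.

```typescript
// Hypothetical in-memory element type, not the tool's internal model.
interface Elem {
  name: string;
  children: Elem[];
  text?: string;
}

// One path step; `index` is 1-based, as in XPath.
interface Step {
  name: string;
  index?: number;
}

// Remove every element reached by following `steps` below `parent`.
function removeMatches(parent: Elem, steps: Step[]): void {
  const [step, ...rest] = steps;
  if (step === undefined) return;
  // Same-named children in document order; an index selects one of them.
  const sameName = parent.children.filter((c) => c.name === step.name);
  const selected =
    step.index === undefined
      ? sameName
      : sameName.slice(step.index - 1, step.index);
  if (rest.length === 0) {
    // Last step: drop the selected children.
    parent.children = parent.children.filter((c) => !selected.includes(c));
  } else {
    // Otherwise descend into each selected child.
    selected.forEach((c) => removeMatches(c, rest));
  }
}

const root: Elem = {
  name: "root",
  children: [
    {
      name: "node",
      children: [
        { name: "child", children: [], text: "should be removed" },
        { name: "child", children: [], text: "should be removed" },
      ],
    },
    {
      name: "node",
      children: [
        { name: "child", children: [], text: "should stay" },
        { name: "child", children: [], text: "should stay" },
      ],
    },
  ],
};

// Corresponds to the path /root/node[1]/child
removeMatches(root, [{ name: "node", index: 1 }, { name: "child" }]);
console.log(root.children.map((n) => n.children.length));
```

Dropping the `index` from the first step would select both `node` elements, so all four `child` elements would be removed.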
### Normalize whitespace
This option replaces any run of consecutive whitespace characters (spaces, tabs, line breaks) with a single space in text nodes (the option summary above also lists attribute values).
Example:
```xml
some xml
has messed up
formatting
some more mess
```
`npx xml_normalize --normalize-whitespace` will create:
```xml
some xml has messed up formatting
some more mess
```
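
The collapsing step itself can be sketched in a few lines of TypeScript. This is an assumption about how such a normalization could look, not the tool's actual code: `normalizeWhitespace` is a hypothetical helper, and `\s` also matches some characters (such as non-breaking spaces) that the tool may treat differently.

```typescript
// Collapse every run of whitespace (spaces, tabs, line breaks) into one space.
// Trimming at the start and end is a separate step and is not done here.
function normalizeWhitespace(text: string): string {
  return text.replace(/\s+/g, " ");
}

const messy = "some    xml\n\thas messed up\n  formatting";
console.log(normalizeWhitespace(messy)); // "some xml has messed up formatting"
```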
### Paths for sorting and removing
Paths are a simple subset of XPaths.
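
The commands in the sections above use paths such as `/root/node[1]/child` and `/root/node/child/@id`. As a rough sketch of how such a path could be split into steps, consider the TypeScript below. It is not the tool's parser: `Step`, `ParsedPath` and `parsePath` are hypothetical names, and predicates are deliberately left out.

```typescript
// One path step; `index` is 1-based, as in XPath.
interface Step {
  name: string;
  index?: number;
}

interface ParsedPath {
  steps: Step[];
  attribute?: string; // set when the path ends in /@name
}

// Parse an absolute path with optional [INDEX] per step and an
// optional trailing attribute reference. Predicates are not handled.
function parsePath(path: string): ParsedPath {
  if (!path.startsWith("/")) {
    throw new Error("only absolute paths are supported");
  }
  const parts = path.slice(1).split("/");
  let attribute: string | undefined;
  if (parts[parts.length - 1].startsWith("@")) {
    attribute = parts.pop()!.slice(1);
  }
  const steps = parts.map((part): Step => {
    // NAME or NAME[INDEX]; NAME may also be the wildcard *.
    const match = /^([^\[\]@]+)(?:\[(\d+)\])?$/.exec(part);
    if (!match) throw new Error(`unsupported step: ${part}`);
    return match[2] === undefined
      ? { name: match[1] }
      : { name: match[1], index: Number(match[2]) };
  });
  return { steps, attribute };
}

const parsed = parsePath("/root/node[1]/child/@id");
console.log(parsed.steps, parsed.attribute);
```

Keeping the attribute separate from the element steps mirrors the way an attribute reference is only meaningful at the end of a path.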
```
/ROOT/NODE_NAME[INDEX]/ANOTHER_NODE
```

Supported:

* Only absolute paths
* Index access (note that XPath indices are 1-based!)
* Simple predicates using the following functions (parameters can be strings in double quotes or XPaths):
  * `starts-with(str,prefix)`
  * `contains(str,contained)`
* Node wildcard, e.g. `/root/*` to select all nodes in `root` of any type
* Attribute reference in the last node, e.g. `/root/node/@id`

## What is this good for?
This helps to bring xml into a standardized form,
so that changes can easily be spotted in a diff tool or a git pull request.

For example, you could run it as a post-processing/pre-commit script when re-generating XLIFF translation files
(or when getting them back from your beloved translator in a messed-up form).

## Contribute
PRs always welcome :-)