https://github.com/mauriceconrad/xml-parser

A Node.js XML DOM, Parser & Stringifier.

# XML Parser, Stringifier and DOM

Parse XML, HTML and more with a very tolerant XML parser and convert it into a DOM.

The three components (parser, stringifier and DOM) are implemented as separate modules.

|Component|Size|
|---|---|
|Parser|4.7 KB|
|Stringifier|1.3 KB|
|DOM|3.1 KB|

## Install
```bash
npm install xml-parse
```
## Require
```javascript
const xml = require("xml-parse");
```

## Parser

Parsing is very simple.

Just call the ```parse``` method of the *xml-parse* module.

```javascript
const xml = require("xml-parse");

// Valid XML string
var parsedXML = xml.parse('<?xml version="1.0" encoding="UTF-8"?>' +
'<root>Root Element</root>');
console.log(parsedXML);

// Invalid XML string
var parsedInvalidXML = xml.parse('' +
'' +
'' +
'');
console.log(parsedInvalidXML);
```

### Parsed Object Structure

The result of ```parse``` is an array of node objects. For the valid XML string from the example above, it looks like this:

```javascript
[
  {
    type: 'element',
    tagName: '?xml',
    attributes: {
      version: '1.0',
      encoding: 'UTF-8'
    },
    childNodes: [],
    innerXML: '>',
    closing: false,
    closingChar: '?'
  },
  {
    type: 'element',
    tagName: 'root',
    attributes: {},
    childNodes: [
      {
        type: 'text',
        text: 'Root Element'
      }
    ],
    innerXML: 'Root Element',
    closing: true,
    closingChar: null
  }
]
```
**The root object is always an array, because the parser also accepts invalid XML with more than one root element.**
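
Because the result is always an array, code that expects a single root element has to pick it out explicitly. The following is a minimal sketch that works on a hand-written sample in the documented shape. ```findRootElements``` is a hypothetical helper, not part of *xml-parse*, and it assumes that declaration-like nodes such as ```?xml``` can be recognised by a non-null ```closingChar```.

```javascript
// Hypothetical helper, not part of xml-parse: picks the "real" root elements
// out of the array returned by parse().
// Assumption: declaration-like nodes (e.g. '?xml') carry a non-null closingChar.
function findRootElements(parsed) {
  return parsed.filter(function (node) {
    return node.type === "element" && node.closingChar === null;
  });
}

// Hand-written sample in the shape of the parse() result shown above.
var parsed = [
  { type: "element", tagName: "?xml", attributes: { version: "1.0", encoding: "UTF-8" },
    childNodes: [], innerXML: ">", closing: false, closingChar: "?" },
  { type: "element", tagName: "root", attributes: {},
    childNodes: [{ type: "text", text: "Root Element" }],
    innerXML: "Root Element", closing: true, closingChar: null }
];

var roots = findRootElements(parsed);
if (roots.length !== 1) {
  console.warn("Expected exactly one root element, found " + roots.length);
}
console.log(roots[0].tagName); // root
```

Checking ```roots.length``` is a cheap way to detect input with several root elements before processing it further.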

#### Object Nodes

There are two kinds of object nodes: *element* and *text*.
Every object node has the property ```type```.
The other keys depend on this *type*.

##### 'Element' Object Node

```javascript
{
  type: [String], // "element"
  tagName: [String], // The tag name of the element
  attributes: [Object], // Object containing the attributes as properties
  childNodes: [Array], // Array of child nodes (object nodes of type "element" or "text")
  innerXML: [String], // The inner XML of the tag
  closing: [Boolean], // Whether the tag is closed in the usual way (</tagName>)
  closingChar: [String] || null // If it is not closed in the usual way, the character used to close it ("!" or "?")
}
```

##### 'Text' Object Node
```javascript
{
  type: [String], // "text"
  text: [String] // Text contents of the text node
}
```
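
Since every node is either an element or a text node, a tree can be processed with a short recursive function that branches on ```type```. The sketch below collects all text below a list of nodes. ```collectText``` is a hypothetical helper, not part of *xml-parse*, and the sample nodes are written by hand and trimmed to the keys the function reads.

```javascript
// Hypothetical helper, not part of xml-parse: concatenates the text of all
// text nodes below the given nodes, in document order.
function collectText(nodes) {
  var out = "";
  nodes.forEach(function (node) {
    if (node.type === "text") {
      out += node.text;
    } else if (node.type === "element") {
      // Recurse into the children of element nodes.
      out += collectText(node.childNodes || []);
    }
  });
  return out;
}

// Hand-written sample tree (only the keys used above are filled in).
var sample = [
  { type: "element", tagName: "p", attributes: {}, childNodes: [
    { type: "text", text: "Hello " },
    { type: "element", tagName: "b", attributes: {}, childNodes: [
      { type: "text", text: "World" }
    ] }
  ] }
];

console.log(collectText(sample)); // Hello World
```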

## Stringifier

The stringifier is the simplest component. Just pass a parsed object structure and, optionally, the number of spaces to indent with.

```javascript
const xml = require("xml-parse");

var xmlDoc = [
  {
    type: 'element',
    tagName: '?xml',
    attributes: {
      version: '1.0',
      encoding: 'UTF-8'
    },
    childNodes: [],
    innerXML: '>',
    closing: false,
    closingChar: '?'
  },
  {
    type: 'element',
    tagName: 'root',
    attributes: {},
    childNodes: [
      {
        type: 'text',
        text: 'Root Element'
      }
    ],
    innerXML: 'Root Element',
    closing: true,
    closingChar: null
  }
];

var xmlStr = xml.stringify(xmlDoc, 2); // 2 spaces

console.log(xmlStr);
```

## DOM

The ```DOM``` constructor of the *xml-parse* module returns a Document Object Model with a few methods.
It is modeled on the official W3C DOM but is less complex than the original.

```javascript
const xml = require("xml-parse");

var xmlDoc = new xml.DOM(xml.parse('<?xml version="1.0" encoding="UTF-8"?>' +
'<root>Root Element</root>')); // Can also be a file path.

xmlDoc.document; // Document Object. (Root)
```

### 'Element' Object Node

```javascript
// An element (e.g. the 'document' object) has the following prototype methods and properties:

var objectNode = xmlDoc.document.childNodes[1]; // Just an example

// Logged to the console, an element object node looks like this:

objectNode = {
  type: 'element',
  tagName: 'tagName',
  attributes: [Object],
  childNodes: [Object],
  innerXML: 'innerXML',
  closing: true,
  closingChar: null,
  getElementsByTagName: [Function], // Returns all child nodes with a specific tagName
  getElementsByAttribute: [Function], // Returns all child nodes whose attribute has a specific value
  removeChild: [Function], // Removes a child node
  appendChild: [Function], // Appends a child node
  insertBefore: [Function], // Inserts a child node before another child node
  getElementsByCheckFunction: [Function], // Returns all child nodes accepted by a check function
  parentNode: [Circular] // Parent node
}
```

### Working with child nodes

With the ```appendChild``` and ```insertBefore``` methods of every element node, you can add a child node. There is no need for something like ```createElement```.

Because a child node is just an object literal with properties like ```type```, ```tagName```, ```attributes``` and more, you only have to pass such an object to the method.
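
Since child nodes are plain object literals, small factory functions can cut down on repetition when many nodes are built by hand. The following is a sketch only: ```textNode``` and ```elementNode``` are hypothetical helpers, not part of *xml-parse*, and they fill in just the keys used by the examples in this README.

```javascript
// Hypothetical convenience factories, not part of xml-parse.
function textNode(text) {
  return { type: "text", text: String(text) };
}

function elementNode(tagName, attributes, childNodes) {
  return {
    type: "element",
    tagName: tagName,
    attributes: attributes || {},  // default: no attributes
    childNodes: childNodes || []   // default: no children
  };
}

// Build an <item id="1"> element that contains a single text node.
var item = elementNode("item", { id: "1" }, [textNode("First item")]);
console.log(JSON.stringify(item, null, 2));
```

An object built this way has the same shape as the literals passed to ```appendChild``` and ```insertBefore``` in the examples of this section. Whether the library also expects ```innerXML```, ```closing``` or ```closingChar``` on inserted nodes is not stated here, so those keys are left out.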

#### appendChild

```javascript
element.appendChild(childNode); // 'childNode' is just an object node
```

##### Example

```javascript
const xml = require('xml-parse');

var xmlDoc = new xml.DOM(xml.parse('<?xml version="1.0" encoding="UTF-8"?>' +
'<root>Root Element</root>'));

var root = xmlDoc.document.getElementsByTagName("root")[0];

root.appendChild({
  type: "element",
  tagName: "appendedElement",
  childNodes: [
    {
      type: "text",
      text: "Hello World :) I'm appended!"
    }
  ]
});
```

#### insertBefore

```javascript
element.insertBefore(childNode, elementAfter); // 'childNode' is just an object literal, 'elementAfter' is an existing child node of 'element' before which 'childNode' is inserted
```

##### Example

```javascript
const xml = require('xml-parse');

var xmlDoc = new xml.DOM(xml.parse('<?xml version="1.0" encoding="UTF-8"?>' +
'<root>Root Element</root>'));

var root = xmlDoc.document.getElementsByTagName("root")[0];

root.insertBefore({
  type: "element",
  tagName: "insertedElement",
  childNodes: [
    {
      type: "text",
      text: "Hello World :) I'm inserted!"
    }
  ]
}, root.childNodes[0]);
```

#### removeChild

```javascript
element.removeChild(childNode); // 'childNode' is a child of the parent element ('element')
```

##### Example

```javascript
const xml = require('xml-parse');

var xmlDoc = new xml.DOM(xml.parse('<?xml version="1.0" encoding="UTF-8"?>' +
'<root>Root Element</root>'));

var root = xmlDoc.document.getElementsByTagName("root")[0];

root.removeChild(root.childNodes[0]);

```

### parentNode

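The ```parentNode``` property of an object node references its parent element. Because the parent in turn lists the node in its ```childNodes```, this is a circular reference and is logged as ```[Circular]```.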

```javascript
const xml = require('xml-parse');

var xmlDoc = new xml.DOM(xml.parse('<?xml version="1.0" encoding="UTF-8"?>' +
'<root>Root Element</root>'));

var root = xmlDoc.document.getElementsByTagName("root")[0];

console.log(root.childNodes[0].parentNode); // Logs the 'root' element
```
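
One practical consequence of the circular reference is that such a tree cannot be passed to ```JSON.stringify``` directly. The sketch below uses a hand-built stand-in for a DOM node rather than a real *xml-parse* DOM, and shows one possible workaround: a replacer function (here the hypothetical ```withoutParent```) that drops the ```parentNode``` key during serialisation.

```javascript
// Hand-built stand-in for a DOM node with a circular parentNode link.
var root = { type: "element", tagName: "root", attributes: {}, childNodes: [] };
var child = { type: "text", text: "Root Element", parentNode: root };
root.childNodes.push(child);

// JSON.stringify(root) would throw a TypeError here because of the cycle
// root -> childNodes[0] -> parentNode -> root.

// Replacer that omits every 'parentNode' key, which breaks the cycle.
function withoutParent(key, value) {
  return key === "parentNode" ? undefined : value;
}

var json = JSON.stringify(root, withoutParent);
console.log(json);
```

The parent links are lost in the JSON output, so they would have to be rebuilt by walking the tree after parsing the JSON back into objects.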

## Get child nodes

### getElementsByTagName

```javascript
element.getElementsByTagName("myTagName"); // Returns all elements whose tag name is 'myTagName'
```

### getElementsByAttribute

```javascript
element.getElementsByAttribute("myAttribute", "myAttributeValue"); // Returns all elements whose attribute 'myAttribute' is 'myAttributeValue'
```

### getElementsByCheckFunction

```javascript
// With this method you can define custom lookups.
element.getElementsByCheckFunction(function(element) {
  return element.type === "element" && element.childNodes.length === 30;
}); // Returns all elements that have exactly 30 child nodes
```
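
To illustrate the idea behind a check function, here is a stand-alone sketch that runs a predicate over plain object nodes. It is not the library's implementation: ```findAll```, ```hasTagName``` and ```hasAttribute``` are hypothetical names, and the sketch searches all descendants, which may differ from how *xml-parse* treats nested elements.

```javascript
// Hypothetical stand-alone sketch, not the library's implementation:
// returns every node (at any depth) for which check(node) is truthy.
function findAll(nodes, check) {
  var matches = [];
  nodes.forEach(function (node) {
    if (check(node)) {
      matches.push(node);
    }
    if (node.type === "element" && Array.isArray(node.childNodes)) {
      matches = matches.concat(findAll(node.childNodes, check));
    }
  });
  return matches;
}

// Predicate factories that mirror the tag-name and attribute lookups.
function hasTagName(tagName) {
  return function (node) {
    return node.type === "element" && node.tagName === tagName;
  };
}

function hasAttribute(name, value) {
  return function (node) {
    return node.type === "element" && (node.attributes || {})[name] === value;
  };
}

// Hand-written sample tree.
var tree = [
  { type: "element", tagName: "list", attributes: {}, childNodes: [
    { type: "element", tagName: "item", attributes: { lang: "en" }, childNodes: [] },
    { type: "element", tagName: "item", attributes: { lang: "de" }, childNodes: [] }
  ] }
];

console.log(findAll(tree, hasTagName("item")).length);         // 2
console.log(findAll(tree, hasAttribute("lang", "de")).length); // 1
```

Expressing the tag-name and attribute lookups as predicates shows that both are special cases of the check-function approach.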