Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/mauriceconrad/xml-parser
A Node.js XML DOM, Parser & Stringifier.
https://github.com/mauriceconrad/xml-parser
crawler crawling dom html html-parser html-parsing xml xml-parser xml-parsing xml-schema
Last synced: 3 months ago
JSON representation
A Node.js XML DOM, Parser & Stringifier.
- Host: GitHub
- URL: https://github.com/mauriceconrad/xml-parser
- Owner: MauriceConrad
- Created: 2017-03-20T21:38:07.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2022-04-19T14:42:42.000Z (almost 3 years ago)
- Last Synced: 2024-10-22T03:56:18.505Z (3 months ago)
- Topics: crawler, crawling, dom, html, html-parser, html-parsing, xml, xml-parser, xml-parsing, xml-schema
- Language: JavaScript
- Homepage:
- Size: 18.6 KB
- Stars: 18
- Watchers: 5
- Forks: 8
- Open Issues: 6
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# XML Parser, Stringifier and DOM
Parse XML, HTML and more with a very tolerant XML parser and convert it into a DOM.
These three components are separated from each other as own modules.
|Component|Size|
|---|---|
|Parser|4.7 KB|
|Stringifier|1.3 KB|
|DOM|3.1 KB|## Install
```bash
npm install xml-parse
```
## Require
```javascript
const xml = require("xml-parse");
```## Parser
Parsing is very simple.
Just call the ```parse``` method of the *xml-parse* instance.
```javascript
const xml = require("xml-parse");// Valid XML string
var parsedXML = xml.parse('' +
'Root Element');
console.log(parsedXML);// Invalid XML string
var parsedInavlidXML = xml.parse('' +
'' +
'' +
'');
console.log(parsedInavlidXML);
```### Parsed Object Structure
The result of ```parse``` is an object that maybe looks like this:
(In this case we have the xml string of the given example)
```javascript
[
{
type: 'element',
tagName: '?xml',
attributes: {
version: '1.0',
encoding: 'UTF-8'
},
childNodes: [],
innerXML: '>',
closing: false,
closingChar: '?'
},
{
type: 'element',
tagName: 'root',
attributes: {},
childNodes: [
{
type: 'text',
text: 'Root Element'
}
],
innerXML: 'Root Element',
closing: true,
closingChar: null
}
]
```
**The root object is always an array because of the fact that it handles invalid xml with more than one root element.**#### Object Nodes
There are two kinds of objects. *element* and *text*.
An object has always the property ```type```.
The other keys depend from this *type*.##### 'Element' Object Node
```javascript
{
type: [String], // "element"
tagName: [String], // The tag name of the tag
attributes: [Object], // Object containing attributes as properties
childNodes: [Array], // Array containing child nodes as object nodes ("element" or "text")
innerXML: [String], // The inner XML of the tag
closing: [Boolean], // If the tag is closed typically ()
closingChar: [String] || null // If it is not closed typically, the char that is used to close it ("!" or "?")
}
```##### 'Text' Object Node
```javascript
{
type: [String], // "text"
text: [String] // Text contents of the text node
}
```## Stringifier
The stringifier is the simplest component. Just pass a parsed object structure.
```javascript
const xml = require("xml-parse");var xmlDoc = [
{
type: 'element',
tagName: '?xml',
attributes: {
version: '1.0',
encoding: 'UTF-8'
},
childNodes: [],
innerXML: '>',
closing: false,
closingChar: '?'
},
{
type: 'element',
tagName: 'root',
attributes: {},
childNodes: [
{
type: 'text',
text: 'Root Element'
}
],
innerXML: 'Root Element',
closing: true,
closingChar: null
}
]var xmlStr = xml.stringify(xmlDoc, 2); // 2 spaces
console.log(xmlStr);
```## DOM
The ```DOM``` method of *xml-parser* instance returns a Document-Object-Model with a few methods.
It is oriented on the official W3 DOM but not complex as the original.```javascript
const xml = require("xml-parse");var xmlDoc = new xml.DOM(xml.parse('' +
'Root Element')); // Can also be a file path.xmlDoc.document; // Document Object. (Root)
```### 'Element' Object Node
```javascript
// An element (e.g the 'document' object) has the following prototype methods and properties:var objectNode = document.childNodes[1]; // Just an example
// This is the return of a object node element
objectNode = {
type: 'element',
tagName: 'tagName',
attributes: [Object],
childNodes: [Object],
innerXML: 'innerXML',
closing: true,
closingChar: null,
getElementsByTagName: [Function], // Returns all child nodes with a specific tagName
getElementsByAttribute: [Function], // Returns all child nodes with a specific attribute value
removeChild: [Function], // Removes a child node
appendChild: [Function], // Appends a child node
insertBefore: [Function], // Inserts a child node
getElementsByCheckFunction: [Function], // Returns all child nodes that are validated by validation function
parentNode: [Circular] // Parent Node
}
```### Handling with child nodes
With ```appendChild``` or ```insertBefore``` methods of every object node, you are allowed to append a child node. You do not have to do something like ```createElement```.
Because a child node is just an object literal, with some properties like ```type```, ```tagName```, ```attributes```and more you just have to pass such an object to the function.
#### appendChild
```javascript
element.appendChild(childNode); // ChildNode is just a object node
```##### Example
```javascript
const xml = require('xml-parse');var xmlDoc = new xml.DOM(xml.parse('' +
'Root Element'));var root = xmlDoc.document.getElementsByTagName("root")[0];
root.appendChild({
type: "element",
tagName: "appendedElement",
childNodes: [
{
type: "text",
text: "Hello World :) I'm appended!"
}
]
});
```#### insertBefore
```javascript
element.insertBefore(childNode, elementAfter); // ChildNode is just an object literal, 'elementAfter' is just a child node of the parent element
```##### Example
```javascript
const xml = require('xml-parse');var xmlDoc = new xml.DOM(xml.parse('' +
'Root Element'));var root = xmlDoc.document.getElementsByTagName("root")[0];
root.insertBefore({
type: "element",
tagName: "insertedElement",
childNodes: [
{
type: "text",
text: "Hello World :) I'm appended!"
}
]
}, root.childNodes[0]);
```#### removeChild
```javascript
element.removeChild(childNode); // 'childNode' is just a children of the parent element ('element')
```##### Example
```javascript
const xml = require('xml-parse');var xmlDoc = new xml.DOM(xml.parse('' +
'Root Element'));var root = xmlDoc.document.getElementsByTagName("root")[0];
root.removeChild(root.childNodes[0]);
```
### parentNode
The ```parentNode``` of a object node represents its parent element. It's a ```[Circular]``` reference.
```javascript
const xml = require('xml-parse');var xmlDoc = new xml.DOM(xml.parse('' +
'Root Element'));var root = xmlDoc.document.getElementsByTagName("root")[0];
console.log(root.childNodes[0].parentNode); // Returns the 'root' element
```## Get child nodes
### getElementsByTagName
```javascript
element.getElementsByTagName("myTagName"); // Returns all elements whose tag name is 'myTagName'
```### getElementsByAttribute
```javascript
element.getElementsByAttribute("myAttribute", "myAttributeValue"); // Returns all elements whose attribute 'myAttribute' is 'myAttributeValue'
```### getElementsByCheckFunction
```javascript
// With this method you can set custom 'get' methods.
element.getElementsByCheckFunction(function(element) {
if (element.type === "element" && element.childNodes.length == 30) {
return true;
}
}); // Returns all elements that have exactly 30 childNodes
```