https://github.com/stefanspringer1/SwiftXML
A library written in Swift to process XML
- Host: GitHub
- URL: https://github.com/stefanspringer1/SwiftXML
- Owner: stefanspringer1
- License: apache-2.0
- Created: 2021-07-24T15:15:35.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2024-06-11T09:01:03.000Z (5 months ago)
- Last Synced: 2024-07-05T13:37:10.388Z (4 months ago)
- Language: Swift
- Size: 798 KB
- Stars: 11
- Watchers: 2
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
# SwiftXML
A library written in Swift to process XML.
This library is published under the Apache License v2.0 with Runtime Library Exception.
```swift
let transformation = XTransformation {

    XRule(forElements: "table") { table in
        table.insertNext {
            XElement("caption") {
                "Table: "
                table.children({ $0.name.contains("title") }).content
            }
        }
    }

    XRule(forElements: "tbody", "tfoot") { tablePart in
        for cell in tablePart.children("tr").children("th") {
            cell.name = "td"
        }
    }

}
```
---
**NOTE**
**This library is not in a “final” state yet** despite its high version number, i.e. there might still be bugs, major improvements might still be made, and breaking changes might happen without the major version being incremented. Additionally, more comments will be added to the code. Also, when such a final state is reached, the library might be further developed using a new repository URL (and the version number set back to a lower one). Further notice will be added here. See [there](https://stefanspringer.com) for contact information.
**We plan for a final release in early 2024.** (This library will then already be used in a production environment.) To all who have already been interested in this library, thank you for your patience!
**UPDATE 1 (May 2023):** We changed the API a little bit recently (no more public `XSpot`, but you can set `isolated` for `XText`) and fixed some problems and are currently working on adding more tests to this library and to the `SwiftXMLParser`.
**UPDATE 2 (July 2023):** In order to keep the XML tree small **we removed the ability to directly access the attributes of a certain name in a document,** and accordingly also to formulate rules for attributes (rules for attributes were rarely used in applications). Instead of directly accessing attributes of certain names, you will have to inspect the descendants of a document (if not catching according events during parsing), maybe saving the result. _An easier replacement for the lost functionality will be available when we add a validation tool:_ When using an appropriate schema you will then be able to look up which elements – according to the schema – could have a certain attribute set, and you can then access these elements directly.
**UPDATE 3 (July 2023):** Renamed `havingProperties` to `conformingTo`.
**UPDATE 4 (July 2023):** The namespace handling is now in a conclusive state, see the new section about limitations of the XML input and the changed section on how to handle XML namespaces.
**UPDATE 5 (July 2023):** In order to further streamline the library, the functionality for tracking changes (of attributes) was removed. In most cases when you have to track changes you need a better way of setting those attributes, so there was a burden whenever setting attributes, but without much use.
**UPDATE 6 (August 2023):** Renamed `conformingTo` to `when`.
**UPDATE 7 (August 2023):** In order to conform to some type checks in Swift 5.9, we have to demand macOS 13, iOS 16, tvOS 16, or watchOS 9 for Apple platforms.
**UPDATE 8 (August 2023):** Renamed `applying` to `with`.
**UPDATE 9 (September 2023):** Renamed `with` to `applying` again. Renamed `when` to `fullfilling`. Renamed `hasProperties` to `fullfills`. The implementations for single items are now done via protocols.
**UPDATE 10 (October 2023):** Instead of `element(ofName:)` use `element(_:)` to better match the other methods that take names.
**UPDATE 11 (October 2023):** Instead of `XProduction`, `XProductionTemplate` and `XActiveProduction` are now used, see the updated description below.
**UPDATE 11 (October 2023):** Dropping the “X” prefix for implementations of `XProductionTemplate` and `XActiveProduction`.
**UPDATE 12 (October 2023):** `XNode.write(toFile:)` is renamed to `XNode.write(toPath:)`, and `XNode.write(toFileHandle:)` is renamed to `XNode.write(toFile:)`.
**UPDATE 13 (December 2023):** `texts` is renamed to `immediateTexts` so as not to confuse it with `allTexts`, and `text` is renamed to `allTextsCollected`. `immediateTextsCollected` and the `allTextsReversed` variants are added.
**UPDATE 14 (December 2023):** The subscript notation with integer values for a sequence of XContent, XElement, or XText now starts counting at 1.
**UPDATE 15 (December 2023):** `immediateTextsCollected` is removed.
**UPDATE 16 (December 2023):** The method `child(...)` is renamed to `firstChild(...)`.
**UPDATE 17 (December 2023):** Added some tracing capabilities for complex transformations.
**UPDATE 18 (January 2024):** `XContentLike` is renamed to `XContentConvertible`. When using SwiftXML, a new type can conform to `XContentConvertible` and as such then can be inserted as XML. The `asContent` property is not necessary any more and is removed, and `... as XContentConvertible` (previously `... as XContentLike`) should also not be necessary any more.
**UPDATE 19 (March 2024):** `description` adds quotation marks for `XText`.
**UPDATE 20 (May 2024):** Renamed `allTextsCollected` to `allTextsCombined`.
---
## Related packages
### The `LoopsOnOptionals` package
For-in loops do not work on optionals in Swift, e.g. on the result of an optional chain. But when working with this XML library, being able to loop on optionals can be convenient at times. In order to be able to do so, include the very small `LoopsOnOptionals` package from https://github.com/stefanspringer1/LoopsOnOptionals.
When having the following extension to `XDocument`:
```swift
extension XDocument {
    var metaDataSection: XElement? { ... }
}
```
then with the `LoopsOnOptionals` package you can write:
```swift
for metaDataItem in myDocument.metaDataSection?.children("item") {
    ...
}
```
Of course, especially in this simple case you can express the same as follows, without using the `LoopsOnOptionals` package:
```swift
if let metaDataSection = myDocument.metaDataSection {
    for metaDataItem in metaDataSection.children("item") {
        ...
    }
}
```
But, even more so in more complex situations, the introduction of such an `if let` (or `case let`) expression makes the code harder to understand.
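How looping on optionals can be made possible may be easier to see in code. The following is a minimal, self-contained sketch of one way to let a for-in loop accept an optional sequence, namely a conditional conformance of `Optional` to `Sequence`. It is an illustration only and not necessarily how `LoopsOnOptionals` is implemented; the name `OptionalSequenceIterator` is made up for this sketch.

```swift
// Hypothetical iterator: yields nothing when there is no underlying iterator.
public struct OptionalSequenceIterator<Base: IteratorProtocol>: IteratorProtocol {
    var base: Base?

    public mutating func next() -> Base.Element? {
        base?.next()
    }
}

// An optional sequence behaves like an empty sequence when it is nil.
// (Newer compilers may warn about the retroactive conformance.)
extension Optional: Sequence where Wrapped: Sequence {
    public func makeIterator() -> OptionalSequenceIterator<Wrapped.Iterator> {
        switch self {
        case .some(let sequence):
            return OptionalSequenceIterator(base: sequence.makeIterator())
        case .none:
            return OptionalSequenceIterator(base: nil)
        }
    }
}

let present: [String]? = ["a", "b"]
let absent: [String]? = nil

// Both loops compile; the second one simply has nothing to iterate over.
var visited = [String]()
for item in present { visited.append(item) }
for item in absent { visited.append(item) }

print(visited)
```

With such a conformance in scope, a loop over `myDocument.metaDataSection?.children("item")` needs no explicit unwrapping, which is the convenience described above.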
### The `Workflow` package
When using SwiftXML in the context of the [SwiftWorkflow](https://github.com/stefanspringer1/SwiftWorkflow) framework, you might include the [WorkflowUtilitiesForSwiftXML](https://github.com/stefanspringer1/WorkflowUtilitiesForSwiftXML).
## Properties of the library
The library reads XML from a source into an XML document instance, and provides methods to transform (or manipulate) the document, and others to write the document to a file.
The library should be efficient and applications that use it should be very intelligible.
### Limitations of the XML input
- The encoding of the source must be UTF-8 (ASCII is considered as a subset of it). The parser checks for correct UTF-8 encoding and also checks (according to the data available to the currently used Swift implementation) if a found codepoint is a valid Unicode codepoint.
- For easier processing, declarations of namespace prefixes via `xmlns:...` attributes should only be at the root element.
### Manipulation of an XML document
Unlike in some other libraries for XML, the manipulation of the document as built in memory is “in place”, i.e. no new XML document is built. The goal is to be able to apply many isolated manipulations to an XML document efficiently. But it is always possible to clone a document easily with references to or from the old version.
The following features are important:
- All iteration over content in the document using the according library functions is lazy by default, i.e. the iteration only looks at one item at a time and does not (!) collect all items in advance.
- While lazily iterating over content in the document in this manner, the document tree can be changed without negatively affecting the iteration.
- Elements of a certain name can be efficiently found without having to traverse the whole tree. An according iteration proceeds in the order in which the elements have been added to the document. When iterating in this manner, newly added elements are then also processed as part of the same iteration.

The following code takes any `<item>` with an integer value of `multiply` larger than 1 and additionally inserts an item with a `multiply` number one less, while removing the `multiply` value on the existing item (the library will be explained in more detail in subsequent sections):
```swift
let document = try parseXML(fromText: """
<a><item multiply="3"/></a>
""")

for item in document.elements("item") {
    if let multiply = item["multiply"], let n = Int(multiply), n > 1 {
        item.insertPrevious {
            XElement("item", ["multiply": n > 2 ? String(n-1) : nil])
        }
        item["multiply"] = nil
    }
}

document.echo()
```
The output is `<a><item/><item/><item/></a>`.
Note that in this example – just to show you that it works – each new item is being inserted _before_ the current node but is then still being processed.
The elements returned by an iteration can even be removed without stopping the (lazy!) iteration:
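```swift
let document = try parseXML(fromText: """
<a><item id="1" remove="true"/><item id="2"/><item id="3" remove="true"/><item id="4"/></a>
""")

document.traverse { content in
    if let element = content as? XElement, element["remove"] == "true" {
        element.remove()
    }
}

document.echo()
```
The output is `<a><item id="2"/><item id="4"/></a>`.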
Of course, since those iterations are regular sequences, all according Swift library functions like `map` and `filter` can be used. But in many cases, it might be better to use conditions on the content iterators (see the section on finding related content with filters) or chaining of content iterators (see the section on chained iterators).
The user of the library can also provide sets of rules to be applied (see the code at the beginning and a full example in the section about rules). In such a rule, the user defines what to do with an element or attribute with a certain name. A set of rules can then be applied to a document, i.e. the rules are applied in the order of their definition. This is repeated, guaranteeing that a rule is only applied once to the same object (if not fully removed from the document and added again, see the section below on document membership), until no more application takes place. So elements can be added during the application of a rule and then later be processed by the same or another rule.
### Other properties
The library uses the [SwiftXMLParser](https://github.com/stefanspringer1/SwiftXMLParser) to parse XML which implements the according protocol from [SwiftXMLInterfaces](https://github.com/stefanspringer1/SwiftXMLInterfaces).
Depending on the configuration of the parse process, all parts of the XML source can be retained in the XML document, including all comments and parts of an internal subset e.g. all entity or element definitions. (Element definitions and attribute list definitions are, besides their reported element names, only retained as their original textual representation; they are not parsed into any other representation.)
In the current implementation, the XML library does not implement any validation, i.e. validation against a DTD or other XML schema, telling us e.g. if an element of a certain name can be contained in an element of another certain name. The user has to use other libraries (e.g. [Libxml2Validation](https://github.com/stefanspringer1/Libxml2Validation)) for such validation before reading or after writing the document. Besides validating the structure of an XML document, validation is also important for knowing if the occurrence of a whitespace text is significant (i.e. should be kept) or not. (E.g., whitespace text between elements representing paragraphs of a text document is usually considered insignificant.) To compensate for that last issue, the user of the library can provide a function that decides if an instance of whitespace text between elements should be kept or not. Also, possible default values of attributes have to be set by the user if desired once the document tree is built.
This library gives full control of how to handle entities. Named entity references can persist inside the document even if they are not defined. Named entity references are scored as internal or external entity references during parsing, the external entity references being those which are referenced by external entity definitions in the internal subset inside the document declaration of the document. Replacements of internal entity references by text can be done automatically according to the internal subset and/or controlled by the application.
Automated inclusion of the content of external parsed entities can be configured; the content might then be wrapped by elements with according information about the entities.
Elements or attributes with namespace prefixes are given the full name “prefix:unprefixed”. See the section on handling of namespaces for motivation and about how to handle namespaces.
For any error during parsing an error is thrown and no document is then provided.
An XML tree (e.g. a document) must not be examined or changed concurrently.
---
**NOTE**
The description of the library that follows might not include all types and methods. Please see the documentation produced by DocC or use autocompletion in an according integrated development environment (IDE).
---
## Reading XML
The following functions take a source and return an XML document instance (`XDocument`). The source can either be provided as a URL, a path to a file, a text, or binary data.
Reading from a URL which references a local file:
```swift
func parseXML(
    fromURL: URL,
    sourceInfo: String?,
    textAllowedInElementWithName: ((String) -> Bool)?,
    internalEntityAutoResolve: Bool,
    internalEntityResolver: InternalEntityResolver?,
    insertExternalParsedEntities: Bool,
    externalParsedEntitySystemResolver: ((String) -> URL?)?,
    externalParsedEntityGetter: ((String) -> Data?)?,
    externalWrapperElement: String?,
    keepComments: Bool,
    keepCDATASections: Bool,
    eventHandlers: [XEventHandler]?
) throws -> XDocument
```
And accordingly:
```swift
func parseXML(
    fromPath: String,
    ...
) throws -> XDocument
```
```swift
func parseXML(
    fromText: String,
    ...
) throws -> XDocument
```
```swift
func parseXML(
    fromData: Data,
    ...
) throws -> XDocument
```
If you want to be indifferent about which kind of source to process, use `XDocumentSource` for the source definition and use:
```swift
func parseXML(
    from: XDocumentSource,
    ...
) throws -> XDocument
```
The optional `textAllowedInElementWithName` method gets the name of the surrounding element when text is found inside an element and should tell whether text is allowed in the specific context. If not, the text is discarded if it is whitespace. If no text is allowed in the context but the text is not whitespace, an error is thrown. If you need a more specific context than the element name to decide if text is allowed, use an `XEventHandler` to track more specific context information.
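As a hedged illustration of the paragraph above, the following sketch passes a closure that allows text only inside `<p>` elements, so the whitespace directly inside the other elements should be discarded. The element names and the input are made up for this example; `parseXML(fromText:textAllowedInElementWithName:)` and `echo(pretty:)` are used as they are described in this document.

```swift
import SwiftXML

// Names of the elements that may contain text (an assumption for this example).
let elementsWithText: Set<String> = ["p"]

let document = try parseXML(
    fromText: """
        <section>
            <p>Hello</p>
        </section>
        """,
    textAllowedInElementWithName: { elementsWithText.contains($0) }
)

// The whitespace-only text directly inside <section> is not kept,
// whereas the text inside <p> is.
document.echo(pretty: true)
```

A set of names keeps the closure short; for decisions that depend on more than the element name, an `XEventHandler` is the documented alternative.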
All internal entity references in attribute values have to be replaced by text during parsing. In order to achieve this (in case that internal entity references occur at all in attribute values in the source), an `InternalEntityResolver` can be provided. An `InternalEntityResolver` has to implement the following method:
```swift
func resolve(
    entityWithName: String,
    forAttributeWithName: String?,
    atElementWithName: String?
) -> String?
```
This method is always called when a named entity reference is encountered (either in text or attribute) which is scored as an internal entity. It returns the textual replacement for the entity or `nil`. If the method returns `nil`, then the entity reference is not replaced by a text, but is kept. In the case of a named entity in an attribute value, an error is thrown when no replacement is given. The function arguments `forAttributeWithName` (name of the attribute) and `atElementWithName` (name of the element) have according values if and only if the entity is encountered inside an attribute value.
If `internalEntityAutoResolve` is set to `true`, the parser first tries to replace the internal entities by using the declarations in the internal subset of the document before calling an `InternalEntityResolver`.
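A resolver along these lines can be as small as a lookup table. The sketch below is self-contained: it declares a stand-in protocol `EntityResolving` (a made-up name) with the same method signature as shown above, so that it compiles without the library; in real code the type would instead conform to `InternalEntityResolver`. The type `TableResolver` and the entity names are likewise only illustrative.

```swift
// Stand-in for the library's `InternalEntityResolver` protocol (same method signature).
protocol EntityResolving {
    func resolve(
        entityWithName: String,
        forAttributeWithName: String?,
        atElementWithName: String?
    ) -> String?
}

// Hypothetical resolver that looks entity names up in a table.
struct TableResolver: EntityResolving {
    let replacements: [String: String]

    func resolve(
        entityWithName entityName: String,
        forAttributeWithName attributeName: String?,
        atElementWithName elementName: String?
    ) -> String? {
        // Returning nil keeps the entity reference in text; for an entity
        // inside an attribute value, the parser would then throw an error.
        replacements[entityName]
    }
}

let resolver = TableResolver(replacements: ["auml": "ä", "nbsp": "\u{00A0}"])

// An entity found in text: no attribute or element name is passed.
let known = resolver.resolve(entityWithName: "auml", forAttributeWithName: nil, atElementWithName: nil)

// An entity found in the attribute `title` of an element `a`, unknown to the table.
let unknown = resolver.resolve(entityWithName: "foo", forAttributeWithName: "title", atElementWithName: "a")
```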
The content of external parsed entities is not inserted by default, but it is if you set `insertExternalParsedEntities` to `true`. You can provide a method in the argument `externalParsedEntitySystemResolver` to resolve the system identifier of the external parsed entity to a URL. You can also provide a method in the argument `externalParsedEntityGetter` to get the data for the system identifier (if `externalParsedEntitySystemResolver` is provided, then `externalParsedEntitySystemResolver` first has to return `nil`). As a last resort, the system identifier is just added as a path component to the source URL (if it exists) and the parser tries to load the entity from there.
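The two closures might look as follows. This is only a sketch: the directory, the file names, and the in-memory table are invented for illustration, and only the closure types `(String) -> URL?` and `(String) -> Data?` are taken from the signature of `parseXML` above.

```swift
import Foundation

// Hypothetical directory in which external parsed entities are stored.
let entityDirectory = URL(fileURLWithPath: "/data/entities", isDirectory: true)

// Candidate for `externalParsedEntitySystemResolver`: maps ".xml" system
// identifiers to files in the directory and declines everything else.
let systemResolver: (String) -> URL? = { systemID in
    systemID.hasSuffix(".xml") ? entityDirectory.appendingPathComponent(systemID) : nil
}

// Candidate for `externalParsedEntityGetter`: serves some entities from memory.
let inMemoryEntities: [String: Data] = ["greeting.ent": Data("<p>Hello</p>".utf8)]
let entityGetter: (String) -> Data? = { systemID in
    inMemoryEntities[systemID]
}

// The getter is only relevant where the system resolver returns nil.
print(systemResolver("chapter1.xml")?.path ?? "not resolved")
print(entityGetter("greeting.ent")?.count ?? 0)
```

Returning `nil` from the first closure for everything it does not handle is what lets the second closure, and finally the fallback to the source URL, take over.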
When the content of an external parsed entity is inserted, you can declare an element name `externalWrapperElement`: the inserted content then gets wrapped into an element of that name with the information about the entity in the attributes `name`, `systemID`, and `path` (`path` being optional, as an external parsed entity might get resolved without an explicit path). (During later processing, you might want to change this representation, e.g. if the external parsed entity reference is the only content of an element, you might replace the wrapper by its content and set the according information as some attachments of the parent element, so validation of the document succeeds.)
One or more event handlers can be passed to a `parseXML` call; they implement `XEventHandler` from [SwiftXMLInterfaces](https://github.com/stefanspringer1/SwiftXMLInterfaces). This allows the user of the library to catch any event during parsing like entering or leaving an element. E.g., the resolving of an internal entity reference could depend on the location inside the document (and not only on the name of the element or attribute), so this information can be collected by such an event handler.
`keepComments` (default: `false`) decides if a comment should be preserved (as `XComment`), else they will be discarded without notice. `keepCDATASections` (default: `false`) decides if a CDATA section should be preserved (as `XCDATASection`), else all CDATA sections get resolved as text.
## Content of a document
An XML document (`XDocument`) can contain the following content:
- `XElement`: an element
- `XText`: a text
- `XInternalEntity`: an internal entity reference
- `XExternalEntity`: an external entity reference
- `XCDATASection`: a CDATA section
- `XProcessingInstruction`: a processing instruction
- `XComment`: a comment
- `XLiteral`: containing text that is meant to be serialized “as is”, i.e. no escaping e.g. of `<` and `&` is done, it could contain XML code that is to be serialized _literally,_ hence its name

`XLiteral` is never the result of parsing XML, but might get added by an application. Subsequent `XLiteral` content is (just like `XText`, see the section on handling of text) always automatically combined.
Those content items are of type `XContent`, whereas the more general type `XNode` might be content or an `XDocument`.
The following is read from the internal subset:
- `XInternalEntityDeclaration`: an internal entity declaration
- `XExternalEntityDeclaration`: an external entity declaration
- `XUnparsedEntityDeclaration`: a declaration of an unparsed external entity
- `XNotationDeclaration`: a notation declaration
- `XParameterEntityDeclaration`: a parameter entity declaration
- `XElementDeclaration`: an element declaration
- `XAttributeListDeclaration`: an attribute list declaration

They can be accessed via the property `declarationsInInternalSubset`.
A document gets the following additional properties from the XML source (some values might be `nil`):
- `encoding`: the encoding from the XML declaration
- `publicID`: the public identifier from the document type declaration
- `sourcePath`: the source to the XML document
- `standalone`: the standalone value from the XML declaration
- `systemID`: the system identifier from the document type declaration
- `xmlVersion`: the XML version from the XML declaration

When not set explicitly in the XML source, some of those values are set to a sensible value.
## Displaying XML
When printing content via `print(...)`, only a top-level representation like the start tag is printed and never the whole tree. When you would like to print the whole tree or document, use:
```swift
func echo(pretty: Bool, indentation: String, terminator: String)
```
`pretty` defaults to `false`; if it is set to `true`, linebreaks and spaces are added for pretty print. `indentation` defaults to two spaces, `terminator` defaults to `"\n"`, i.e. a linebreak is then printed after the output.
With more control:
```swift
func echo(usingProductionTemplate: XProductionTemplate, terminator: String)
```
Productions are explained in the next section.
When you want a serialization of a whole tree or document as text (`String`), use the following method:
```swift
func serialized(pretty: Bool) -> String
```
`pretty` again defaults to `false` and has the same effect.
With more control:
```swift
func serialized(usingProductionTemplate: XProductionTemplate) -> String
```
Do not use `serialized` to print a tree or document, use `echo` instead, because using `echo` is more efficient in this case.
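The difference between the two might be sketched as follows, using only the methods listed in this section with their documented defaults; the input is made up for this example.

```swift
import SwiftXML

let document = try parseXML(fromText: "<a><b>Hello</b></a>")

// Print the whole tree directly to standard output.
document.echo(pretty: true)

// Obtain the serialization as a String, e.g. to compare it in a test
// or to embed it into another text.
let text = document.serialized(pretty: false)
print("serialization has \(text.count) characters")
```

So `serialized` is the choice when the text itself is needed as a value, and `echo` when the tree should merely be displayed.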
## Writing XML
Any XML node (including an XML document) can be written, including the tree of nodes that is started by it, via the following methods.
```swift
func write(toURL: URL, usingProductionTemplate: XProductionTemplate) throws
```
```swift
func write(toPath: String, usingProductionTemplate: XProductionTemplate) throws
```
```swift
func write(toFile: FileHandle, usingProductionTemplate: XProductionTemplate) throws
```
```swift
func write(toWriter: Writer, usingProductionTemplate: XProductionTemplate) throws
```
You can also use the `WriteTarget` protocol to allow all the above possibilities:
```swift
func write(to writeTarget: WriteTarget, usingProductionTemplate: XProductionTemplate) throws
```
By the argument `usingProductionTemplate:` you can define a production, i.e. details of the serialization, e.g. whether linebreaks are inserted to make the result look pretty. Its value defaults to an instance of `DefaultProductionTemplate`, which gives a standard output.
The definition of such a production comes in two parts, a template that can be initialized with values for a further configuration of the serialization, and an active production which is to be applied to a certain target. This way the user has the ability to define completely what the serialization should look like, and then apply this definition to one or several serializations. In more detail:
An `XProductionTemplate` has a method `activeProduction(for writer: Writer) -> XActiveProduction` which by using the `writer` initializes an `XActiveProduction` where the according events trigger a writing to the `writer`. The configuration for such a production is to be provided via arguments to the initializer of the `XProductionTemplate`.
So an `XActiveProduction` defines how each part of the document is written, e.g. if `>` or `"` are written literally or as predefined XML entities in text sections. The production in the above function calls defaults to an instance of `DefaultProductionTemplate` which results in instances of `ActiveDefaultProduction`. `ActiveDefaultProduction` should be extended if only some details of how the document is written are to be changed. The productions `ActivePrettyPrintProduction` (which might be used by defining a `PrettyPrintProductionTemplate`) and `ActiveHTMLProduction` (which might be used by defining an `HTMLProductionTemplate`) already extend `ActiveDefaultProduction`, and they might be used to pretty-print XML or output HTML. (Note that `HTMLProductionTemplate` can be given a `NamespaceReference` to consider a possible namespace prefix for the HTML elements.) But you can also extend one of those classes yourself, e.g. you could override `func writeText(text: XText)` and `func writeAttributeValue(name: String, value: String, element: XElement)` to again write some characters as named entity references. Or you just provide an instance of `DefaultProduction` itself and change its `linebreak` property to define how line breaks should be written (e.g. Unix or Windows style). You might also want to consider `func sortAttributeNames(attributeNames: [String], element: XElement) -> [String]` to sort the attributes for output.
Example: write a linebreak before all elements:
```swift
class MyProduction: DefaultProduction {

    override func writeElementStartBeforeAttributes(element: XElement) throws {
        try write(linebreak)
        try super.writeElementStartBeforeAttributes(element: element)
    }

}

try document.write(toPath: "myFile.xml", usingProduction: MyProduction())
```
For generality, the following method is provided to apply any `XActiveProduction` to a node and its contained tree:
```swift
func applyProduction(activeProduction: XActiveProduction) throws
```
## Cloning and document versions
Any node (including an XML document) can be cloned, including the tree of nodes that is started by it, using the following method:
```swift
func clone() -> XNode
```
(The result will be more specific if the subject is known to be more specific.)
Any content and the document itself possesses the property `backLink` that can be used as a relation between a clone and the original node. If you create a clone by using the `clone()` method, the `backLink` value of a node in the clone points to the original node. So when working with a clone, you can easily look at the original nodes.
Note that the `backLink` reference references the original node weakly, i.e. if you do not save a reference to the original node or tree then the original node disappears and the `backLink` property will be `nil`.
If you would like to use cloning to just save a version of your document to a copy, use its following method:
```swift
func makeVersion()
```
In that case a clone of the document will be created, but with the `backLink` property of an original node pointing to the clone, and the `backLink` property of the clone will point to the old `backLink` value of the original node. I.e. if you apply `makeVersion()` several times, when following the `backLink` values starting from a node in your original document, you will go through all versions of this node, from the newer ones to the older ones. The `backLinks` property gives you exactly that chain of backlinks. Other than when using `clone()`, a strong reference to such a document version will be remembered by the document, so the nodes of the clone will be kept. Use `forgetVersions(keeping:Int)` on the document in order to stop this remembering, just keeping the last number of versions defined by the argument `keeping` (`keeping` defaults to 0). In the oldest version then still remembered or, if no remembered version is left, in the document itself all `backLink` values will then be set to `nil`.
The `finalBackLink` property follows the whole chain of `backLink` values and gives you the last value in this chain.
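The relation between a clone and its original might be used as in the following sketch. It assumes that cloning an `XDocument` yields an `XDocument` (see the remark on more specific results above) and that the elements of the clone can be found by name just like those of the original; the input is made up for this example.

```swift
import SwiftXML

let document = try parseXML(fromText: "<a><b id=\"1\"/></a>")

// Keep `document` alive: the back links reference the original nodes weakly.
let clone = document.clone()

for element in clone.elements("b") {
    // Change only the clone ...
    element["id"] = "2"
    // ... and look at the untouched original node via the back link.
    if let original = element.backLink {
        print("clone: \(element), original: \(original)")
    }
}
```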
Sometimes, only a “shallow” clone is needed, i.e. the node itself without the whole tree of nodes with the node as root. In this case, just use:
```swift
func shallowClone(forwardref: Bool) -> XNode
```
The `backLink` is then set just like when using `clone()`.
## Content properties
### Source range
If the parser (as is the case with the [SwiftXMLParser](https://github.com/stefanspringer1/SwiftXMLParser)) reports where a part of the document is located in the text (i.e. at what line and column it starts and at what line and column it ends), the property `sourceRange: XTextRange` (using `XTextRange` from [SwiftXMLInterfaces](https://github.com/stefanspringer1/SwiftXMLInterfaces)) returns it for the respective node:
Example:
```swift
let document = try parseXML(fromText: """
<a>
    <b>Hello</b>
</a>
""", textAllowedInElementWithName: { $0 == "b" })

for content in document.allContent {
    if let sourceRange = content.sourceRange {
        print("\(sourceRange): \(content)")
    }
    else {
        content.echo()
    }
}
```
Output:
```text
1:1 - 3:4: <a>
2:5 - 2:16: <b>
2:8 - 2:12: Hello
```
### Element names
Element names can be read and set by using the property `name` of an element. After setting a new name different from the existing one, the element is registered with the new name in the document, if it is part of a document. Setting the same name does not change anything (it is an efficient non-change).
### Text
For a text content (`XText`) its text can be read and set via its property `value`. So there is no need to replace an `XText` content by another to change text. Please also see the section below on handling of text.
### Changing and reading attributes
The attributes of an element can be read and set via the “index notation”. If an attribute is not set, `nil` is returned; conversely, setting an attribute to `nil` results in removing it. Setting an attribute with a new name or removing an attribute changes the registering of attributes in the document, if the element is part of a document. Setting a non-nil value of an attribute that already exists is an efficient non-change concerning the registering of attributes.
Example:
```swift
// setting the "id" attribute to "1":
myElement["id"] = "1"

// reading an attribute:
if let id = myElement["id"] {
    print("the ID is \(id)")
}
```
You can also get a sequence of attribute values (optional Strings) from a sequence of elements.
Example:
```swift
let document = try parseXML(fromText: """
<a><b id="1"/><b id="2"/><b id="3"/></a>
""")
print(document.children.children["id"].joined(separator: ", "))
```
Result:
```text
1, 2, 3
```
If you want to get an attribute value and at the same time remove the attribute, use the method `pullAttribute(...)` of the element.
To get the names of all attributes of an element, use:
```swift
var attributeNames: [String]
```
Note that you can also get a (lazy) sequence of the attribute values of a certain attribute name from a (lazy) sequence of elements by using the same index notation:
```swift
print(myElement.children("myChildName")["myAttributeName"].joined(separator: ", "))
```
### Attachments
All nodes can have “attachments”. Those are objects that can be attached via a textual key. Those attachments are not considered as belonging to the formal XML tree.
Those attachments are realized as a dictionary `attached` as a member of each node.
You can also set attachments immediately when creating an element or a document by using the argument `attached:` of the initializer. (Note that in this argument, some values might be `nil` for convenience.)
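A short sketch of how attachments might be used follows. It assumes that the values of the `attached` dictionary are of an `Any`-like type and that `attached:` can be the only argument after the element name; the keys and values are made up for this example.

```swift
import SwiftXML

// Set an attachment immediately when creating the element.
let paragraph = XElement("paragraph", attached: ["origin": "import"])

// Attachments live next to the XML tree, not in it.
paragraph.attached["checked"] = true

// Reading an attachment requires a cast to the expected type.
if let origin = paragraph.attached["origin"] as? String {
    print("paragraph comes from \(origin)")
}

// Only the element itself is serialized; attachments do not belong to the tree.
paragraph.echo()
```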
### XPath
Get the XPath of a node via:
```swift
var xPath: String
```
## Traversals
Traversing a tree depth-first starting from a node (including a document) can be done by the following methods:
```swift
func traverse(down: (XNode) throws -> (), up: ((XNode) throws -> ())? = nil) rethrows
```
```swift
func traverse(down: (XNode) async throws -> (), up: ((XNode) async throws -> ())? = nil) async rethrows
```
For a “branch”, i.e. a node that might contain other nodes (like an element, as opposed to e.g. text, which does not contain other nodes), when returning from the traversal of its content (also in the case of an empty branch) the closure given to the optional `up:` argument is called.
Example:
```swift
document.traverse { node in
    if let element = node as? XElement {
        print("entering element \(element.name)")
    }
} up: { node in
    if let element = node as? XElement {
        print("leaving element \(element.name)")
    }
}
```

Note that the root of the traversal is not to be removed during the traversal.

## Direct access to elements
As mentioned in the general description, the library lets you efficiently find elements of a certain name in a document without having to traverse the whole tree.

Finding the elements of a certain name:
```swift
func elements(_: String) -> XElementsOfSameNameSequence
```

Example:

```swift
for paragraph in myDocument.elements("paragraph") {
    if let id = paragraph["id"] {
        print("found paragraph with ID \"\(id)\"")
    }
}
```

To find the elements matching any of several names, pass several names to `elements(_:)`. Note that, just like for a single name, elements you add during the iteration will then also be considered.
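
A small sketch of the multi-name form, assuming `elements(_:)` accepts several names as described. The input document is made up for illustration, and the result is sorted because no particular iteration order across different names is assumed here:

```swift
import SwiftXML

// Hypothetical input document with two element names of interest.
let document = try parseXML(fromText: "<doc><table/><paragraph/><table/></doc>")

var foundNames = [String]()
for element in document.elements("paragraph", "table") {
    foundNames.append(element.name)
}

// Sorted, since the order across different names is not assumed here.
print(foundNames.sorted())
```
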
## Finding related content
Starting from some content, you might want to find related content, e.g. its children. The names chosen for the corresponding methods come from the idea that all content has a natural order, namely the order of a depth-first traversal, which is the same order in which the content of an XML document is stored in a text file. This order gives a meaning to method names such as `nextTouching`. Note that, unlike the iterations you get via `elements(_:)`, even nodes that stay in the same document can occur in such an iteration several times if moved accordingly during the iteration.

Sequences returned are always lazy sequences; iterating through them gives items of the obvious type. As mentioned in the general description of the library, manipulating the XML tree during such an iteration is allowed.

Finding the document the node is contained in:
```swift
var document: XDocument?
```

Finding the parent element:

```swift
var parent: XElement?
```

All its ancestor elements:

```swift
var ancestors: XElementSequence
```

Get the first content of a branch:

```swift
var firstContent: XContent?
```

Get the last content of a branch:

```swift
var lastContent: XContent?
```

If there is exactly one node contained, get it, else get `nil`:

```swift
var singleContent: XContent?
```

The direct content of a document or an element (“direct” means that their parent is this document or element):

```swift
var content: XContentSequence
```

The direct content that is an element, i.e. all the children:

```swift
var children: XElementSequence
```

The direct content that is text:

```swift
var immediateTexts: XTextSequence
```

For the `content`, `children`, and `immediateTexts` sequences, there also exist the sequences `contentReversed`, `childrenReversed`, and `immediateTextsReversed`, which iterate from the last corresponding item to the first.

All content in the tree of nodes that is started by the node itself, without the node itself, in the order of a depth-first traversal:

```swift
var allContent: XContentSequence
```

All content in the tree of nodes that is started by the node, starting with the node itself:

```swift
var allContentIncludingSelf: XContentSequence
```

All texts in the tree:

```swift
var allTexts: XTextSequence
```

The descendants, i.e. all content in the tree of nodes that is started by the node, without the node itself, that is an element:

```swift
var descendants: XElementSequence
```

If a node is an element, the element itself and the descendants, starting with the element itself:

```swift
var descendantsIncludingSelf: XElementSequence
```

All texts in the tree of nodes that is started by the node itself, without the node itself, in the order of a depth-first traversal:

```swift
var allTexts: XTextSequence
```

The same but only for the nodes contained as direct content:

```swift
var immediateTexts: XTextSequence
```

The (direct) content items of a branch (element or document) are “siblings” to each other.

The content item previous to the subject:
```swift
var previousTouching: XContent?
```

The content item next to the subject:

```swift
var nextTouching: XContent?
```

(Note that for autocompletion it might be better to start typing “touch...” instead of “prev...” or “next...”.)

You might also just be interested if a previous or next node exists:

```swift
var hasPrevious: Bool
var hasNext: Bool
```

The following very short names `previous` and `next` actually mean “the previous content” and “the next content”, respectively. These names are kept so short because they cover such a common use case.

All nodes previous to the node (i.e. the previous siblings) _on the same level,_ i.e. of the same parent, in the order from the node:

```swift
var previous: XContentSequence
```

Of those, the ones that are elements:

```swift
var previousElements: XElementSequence
```

Analogously, the content next to the node:

```swift
var next: XContentSequence
```

Of those, the ones that are elements:

```swift
var nextElements: XElementSequence
```

Example:

```swift
for descendant in myElement.descendants {
    print("the name of the descendant is \(descendant.name)")
}
```

Note that a sequence might be used several times:

```swift
let document = try parseXML(fromText: """
""")

let insideA = document.children.children
insideA.echo()
print("again:")
insideA.echo()
```

Output:

```text
again:
```
Once you have such a sequence, you can get the first item in the sequence via its property `first` (which is introduced by this package in addition to the already defined `first(where:)`).
The usual methods of sequences can be used. E.g., use `mySequence.dropFirst(n)` to drop the first `n` items of the sequence `mySequence`. To get the third item of the sequence, use `mySequence.dropFirst(2).first`.
Note that there is no property getting you the last item of those sequences, as it would be quite inefficient. Better use `contentReversed` or `childrenReversed` in combination with `first`.
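
As a brief sketch of that advice, the last child can be read as the first item of the reversed sequence. The element and attribute names below are made up for illustration:

```swift
import SwiftXML

// A small, hypothetical element with three children.
let list = XElement("list") {
    XElement("item", ["id": "1"])
    XElement("item", ["id": "2"])
    XElement("item", ["id": "3"])
}

// First child via `first`, last child via the reversed sequence and `first`.
let firstID = list.children.first?["id"]
let lastID = list.childrenReversed.first?["id"]
print(firstID ?? "-", lastID ?? "-")
```
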
Test if something exists in a sequence by using `exist`:
```swift
var exist: Bool
```

Note that after using `exist`, you can still iterate normally along the same sequence, without losing an item.

Test if nothing exists in a sequence by using `absent`:
```swift
var absent: Bool
```

If you test whether certain items exist, in many cases you will then also want to use those items. The property `existing` of a sequence of content or elements returns the sequence itself if items exist, and `nil` otherwise:

```swift
var existing: XContentSequence?
var existing: XElementSequence?
```

In the following example, a sequence is first tested for existing items and, if items exist, then used:

```swift
let document = try parseXML(fromText: """
""")

if let theBs = document.descendants("b").existing {
    theBs.echo()
}
```

Note that what you get by using `existing` still is a lazy sequence, i.e. if you change content between the `existing` test and using its result, then there might be no more items left to be found.

You may also ask for the previous or next content item in the tree, in the order of a depth-first traversal. E.g. if a node is the last node of a subtree starting at a certain element and the element has a next sibling, this next sibling is “the next node in the tree” for that last node of the subtree. Getting the next or previous node in the tree is very efficient, as the library keeps track of them anyway.

The next content item in the tree:
```swift
var nextInTreeTouching: XContent?
```

The previous content item in the tree:

```swift
var previousInTreeTouching: XContent?
```

Find all texts contained in the tree of a node and compose them into a single `String`:

```swift
var allTextsCombined: String
```

You may use these text-collecting properties even when you know that there is only one text to be “combined”; this case is efficiently implemented.

You might also turn a single content item or, more specifically, an element into an appropriate sequence using the following properties:

For any content:

```swift
var asSequence: XContentSequence
```

For an element:

```swift
var asElementSequence: XElementSequence
```

(These two properties are used in the tests of the library.)

## Finding related nodes with filters
All of the methods in the previous section that return a sequence also allow a condition as a first argument for filtering. We distinguish between the case of all items of the sequence fulfilling a condition, the case of all items while a condition is fulfilled, and the case of all items until a condition is fulfilled (excluding the found item where the condition is fulfilled):

```swift
func content((XContent) -> Bool) -> XContentSequence
func content(while: (XContent) -> Bool) -> XContentSequence
func content(until: (XContent) -> Bool) -> XContentSequence
func content(untilAndIncluding: (XContent) -> Bool) -> XContentSequence
```

The `untilAndIncluding` version also stops where the condition is fulfilled, but _includes_ the corresponding item.

Sequences of a more specific type are returned in sensible cases.
Example:
```swift
let document = try parseXML(fromText: """
""")

for descendant in document.descendants({ element in element["take"] == "true" }) {
    print(descendant)
}
```

Output:

```text
```

Note that the round parentheses “(...)” around the condition in the example are needed to distinguish it from the `while:` and `until:` versions. There is no `where:` argument label because, without it, the less common `while:` case (and to a lesser degree `until:`) is easier to distinguish visually, and the more common case is syntactically the shortest. This plays out well in actual code.
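
The four filter variants can be compared side by side in a short sketch. It assumes that `children` offers the same variants as shown above for `content` and that the closures receive elements. The element names are made up for illustration:

```swift
import SwiftXML

// Hypothetical element whose third child acts as a stop marker.
let list = XElement("list") {
    XElement("a")
    XElement("b")
    XElement("stop")
    XElement("c")
}

// All children fulfilling the condition (note the round parentheses).
let matching = Array(list.children({ $0.name != "stop" }).map { $0.name })

// Children as long as the condition holds.
let leading = Array(list.children(while: { $0.name != "stop" }).map { $0.name })

// Children up to the first match, without and with the matching item.
let before = Array(list.children(until: { $0.name == "stop" }).map { $0.name })
let upTo = Array(list.children(untilAndIncluding: { $0.name == "stop" }).map { $0.name })

print(matching, leading, before, upTo)
```
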
There also exists a shortcut for the common case of filtering elements according to a name:

```swift
for _ in document.descendants("paragraph") {
    print("found a paragraph!")
}
```

You can also use multiple names (e.g. `descendants("paragraph", "table")`). If no name is given, all elements are given in the result regardless of their name, e.g. `children()` means the same as `children`.

If you know that there is at most one child element with a certain name, use the following method (it returns the first child with this name if it exists):

```swift
func firstChild(_ name: String) -> XElement?
```

You might then also consider alternative names (giving you the first child where the name matches):

```swift
func firstChild(_ names: String...) -> XElement?
```

If you want to get the first ancestor with a certain name, use one of the following methods:

```swift
func ancestor(_ name: String) -> XElement?
func ancestor(_ names: String...) -> XElement?
```

## Chained iterators

Iterators can also be chained. The second iterator is executed on each of the nodes encountered by the first iterator. All this iteration is lazy, so the first iterator only searches for the next node once the second iterator is done with the current node found by the first iterator.

Example:
```swift
let document = try parseXML(fromText: """
""")

for element in document.descendants.descendants { print(element) }
```

Output:

```text
```
In those chains, operations that find a single node when applied to a single node (like `parent`) also work, and you can use e.g. `insertNext` (see the section on tree manipulations), or `with` (see the next section on constructing XML), or `echo()`.

When using a subscript with a `String`, you get a sequence of the corresponding attribute values (where set):

```swift
for childID in element.children["id"] {
    print("found child ID \(childID)")
}
```

Note that when using an `Int` as the subscript value for a sequence of content, you get the item at the corresponding position:

```swift
if let secondChild = element.children[2] {
    print("second child: \(secondChild)")
}
```

---

**NOTE**

If you use this subscript notation `[n]` for a sequence of `XContent`, `XElement`, or `XText`, then, despite the integer values, this is not (!) random access to the items (each time you use such a subscript, the sequence is followed until the corresponding item is found by counting), and the counting starts at 1 as in the XPath language, not at 0 as e.g. for Swift arrays.

You should see this integer subscript more as a subscript with names, the integer values being the names that the positions are given in the XML, where counting from 1 is common.

---
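
To make the 1-based counting concrete, here is a minimal sketch. The element and attribute names are made up for illustration:

```swift
import SwiftXML

// Hypothetical element with three children.
let list = XElement("list") {
    XElement("item", ["id": "first"])
    XElement("item", ["id": "second"])
    XElement("item", ["id": "third"])
}

// Positions are counted from 1, as in XPath.
let firstID = list.children[1]?["id"]
let thirdID = list.children[3]?["id"]
print(firstID ?? "-", thirdID ?? "-")
```

Since each subscript access walks the sequence again from its start, iterating with `for ... in` is the better choice when all items are needed.
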
## Constructing XML
### Constructing an empty element
When constructing an element (without content), the name is given as the first (nameless) argument and the attribute values are given as a (nameless) dictionary.

Example: constructing an empty “paragraph” element with attributes `id="1"` and `style="note"`:
```swift
let myElement = XElement("paragraph", ["id": "1", "style": "note"])
```

### About the insertion of content

We would first like to give some important hints before we explain the corresponding functionalities in detail.
Note that when inserting content into an element or document and that content already exists somewhere else, the inserted content is _moved_ from its original place, and not copied. If you would like to insert a copy, insert the result of the `clone()` method of the content.
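
The difference between moving and copying can be sketched as follows. The element names are made up for illustration, and only `clone()`, `parent`, and `isEmpty` as described in this document are used:

```swift
import SwiftXML

let item = XElement("item")

// `item` becomes content of `first`.
let first = XElement("first") { item }

// Inserting a clone leaves the original where it is.
let second = XElement("second") { item.clone() }
let staysInFirst = item.parent === first

// Inserting the original itself moves it out of `first`.
let third = XElement("third") { item }
let movedToThird = item.parent === third

print(staysInFirst, movedToThird, first.isEmpty, second.isEmpty)
```
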
Be “courageous” when formulating your code; more might work than you might have thought. Anticipating the explanations in the following sections, e.g. the following code examples _do_ work:

Moving the “a” children and the “b” children of an element to the beginning of the element:
```swift
element.addFirst {
    element.children("a")
    element.children("b")
}
```

As the content is first constructed and then inserted, there is no infinite loop here.

Note that in the result, the order of the content is just as defined inside the braces `{...}`, so in the example the resulting `element` contains first the “a” children and then the “b” children.

Wrap an element with another element:
```swift
element.replace {
    XElement("wrapper") {
        element
    }
}
```

The content that you define inside braces `{...}` is constructed from the inside to the outside. From the notes above you might then think that `element` in the example is no longer at its original place once the content of the “wrapper” element has been constructed, i.e. before the replacement could actually happen. Yes, this is true, but nevertheless the `replace` method still knows where to insert this “wrapper” element. The operation works as you would expect from a naïve perspective.

An instance of any type conforming to `XContentConvertible` (it has to implement its `collectXML(by:)` method) can be inserted as XML:
```swift
struct MyStruct: XContentConvertible {

    let text1: String
    let text2: String

    func collectXML(by xmlCollector: inout XMLCollector) {
        xmlCollector.collect(XElement("text1") { text1 })
        xmlCollector.collect(XElement("text2") { text2 })
    }

}

let myStruct1 = MyStruct(text1: "hello", text2: "world")
let myStruct2 = MyStruct(text1: "greeting", text2: "you")

let element = XElement("x") {
    myStruct1
    myStruct2
}

element.echo(pretty: true)
```

Result:

```xml
<x>
  <text1>hello</text1>
  <text2>world</text2>
  <text1>greeting</text1>
  <text2>you</text2>
</x>
```

For `XContentConvertible` there is also the `xml` property that returns a corresponding array of `XContent`.

### Defining content
When constructing an element, its contents are given in braces `{...}` (those braces are the `builder` argument of the initializer).

```swift
let myElement = XElement("div") {
    XElement("hr")
    XElement("paragraph") {
        "Hello World"
    }
    XElement("hr")
}
```

(The text `"Hello World"` could also be given as `XText("Hello World")`. The text will be converted into such an XML node automatically.)

The content might be given as an array or an appropriate sequence:
```swift
let myElement = XElement("div") {
    XElement("hr")
    myOtherElement.content
    XElement("hr")
}
```

When not defining content, using `map` might be a sensible option:

```swift
let element = XElement("z") {
    XElement("a") {
        XElement("a1")
        XElement("a2")
    }
    XElement("b") {
        XElement("b1")
        XElement("b2")
    }
}

for content in element.children.map({ $0.children.first }) { print(content?.name ?? "-") }
```

Output:

```text
a1
b1
```

The same applies to e.g. the `filter` method, which, besides letting the code look more complex when used instead of the filter options described above, is not a good option when defining content.

The content of elements that contain other elements while defining their content is built from the inside to the outside. Consider the following example:

```swift
let b = XElement("b")

let a = XElement("a") {
    b
    "Hello"
}

a.echo(pretty: true)

print("\n------\n")

b.replace {
    XElement("wrapper1") {
        b
        XElement("wrapper2") {
            b.next
        }
    }
}

a.echo(pretty: true)
```

First, the element “wrapper2” is built, and at that moment the sequence `b.next` contains the text `"Hello"`. So we will get as output:

```text
Hello

------

Hello
```

### Document membership in constructed elements

Elements that are part of a document (`XDocument`) are registered in the document. The reason is that this allows fast access to elements and attributes of a certain name via `elements(_:)` and the exact functioning of rules (see the section below on rules).
At the moment of constructing a new element with its content defined in `{...}` braces, the element is not part of any document. The nodes inserted into it leave the document tree, but they are not (!) unregistered from the document. I.e. the iteration `elements(_:)` will still find them, and the corresponding rules will apply to them. The reason for this behaviour is the common case of the new element getting inserted into the same document. If the content of the new element were first unregistered from the document and then reinserted into the same document, it would count as new elements, and the mentioned iterations might iterate over them again.

If you would like the content of a newly built element to get unregistered from the document, use its method `adjustDocument()`. This method propagates the current document of the element to its content. For a newly built element this document is `nil`, which unregisters a node from its document. You might also set the argument `adjustDocument` to `true` in the initializer of the element to automatically call `adjustDocument()` when the building of the new element is completed. This call or setting is only necessary at the top-level element; it is propagated through the whole tree.

Note that if you insert an element into another element that is part of a document, the new child gets registered in the document of its new parent if not already registered there (and unregistered from any different document where it was registered before).

Example: a newly constructed element gets added to a document:
```swift
let document = try parseXML(fromText: """
""")

for element in document.elements("b") {
    print("applying the rule to \(element)")
    if element["id"] == "2" {
        element.insertNext {
            XElement("c") {
                element.previous
            }
        }
    }
}

print("\n-----------------\n")

document.echo()
```

Output:

```text
applying the rule to
applying the rule to

-----------------

```

As you can see from the `print` commands in the last example, the element that was moved into the new `c` element does not lose its “connection” to the document (although it seems to get added to it again), so the iteration visits it only once.

## Tree manipulations
Besides changing the node properties, an XML tree can be changed by the following methods. Some of them return the subject itself as a discardable result. For the content specified in `{...}` (the builder) the order is preserved.
Add nodes at the end of the content of an element or a document respectively:
```swift
func add(builder: () -> [XContent])
```

Add nodes to the start of the content of an element or a document respectively:

```swift
func addFirst(builder: () -> [XContent])
```

Add nodes as the nodes previous to the node:

```swift
func insertPrevious(_ insertionMode: InsertionMode = .following, builder: () -> [XContent])
```

Add nodes as the nodes next to the node:

```swift
func insertNext(_ insertionMode: InsertionMode = .following, builder: () -> [XContent])
```

A more precise type is returned from `insertPrevious` and `insertNext` if the type of the subject is more precisely known.
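
The four insertion methods above can be tried together in a short sketch. The element names are made up for illustration, and `firstChild(_:)` from the section on filters is used to pick the reference node:

```swift
import SwiftXML

// Start with a single child and grow the content around it.
let list = XElement("list") {
    XElement("b")
}

list.add { XElement("d") }        // appended at the end
list.addFirst { XElement("a") }   // inserted at the start

// Insert directly before and after the "d" child.
list.firstChild("d")?.insertPrevious { XElement("c") }
list.firstChild("d")?.insertNext { XElement("e") }

let names = Array(list.children.map { $0.name })
print(names)
```
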
By using the next two methods, a node gets removed.
Remove the node from the tree structure and the document:
```swift
func remove()
```

You might also use the method `removed()` of a node to remove the node but also use the node.

Replace the node by other nodes:

```swift
func replace(_ insertionMode: InsertionMode = .following, builder: () -> [XContent])
```

Note that the content that replaces a node is allowed to contain the node itself.

Clear the contents of an element or a document respectively:

```swift
func clear()
```

Test if an element or a document is empty:

```swift
var isEmpty: Bool
```

Set the contents of an element or a document respectively:

```swift
func setContent(builder: () -> [XContent])
```

Example:

```swift
for table in myDocument.elements("table") {
    table.insertNext {
        XElement("legend") {
            "this is the table legend"
        }
        XElement("caption") {
            "this is the table caption"
        }
    }
}
```

Note that by default iterations continue with new nodes inserted by `insertPrevious` or `insertNext` also being considered. In the following cases, you have to add the `.skipping` directive to get the output as noted below (in the second case, you even get an infinite loop if you do not set `.skipping`):

```swift
let element = XElement("top") {
    XElement("a1") {
        XElement("a2")
    }
    XElement("b1") {
        XElement("b2")
    }
    XElement("c1") {
        XElement("c2")
    }
}

element.echo(pretty: true)

print("\n---- 1 ----\n")

for content in element.content {
    content.replace(.skipping) {
        content.content
    }
}

element.echo(pretty: true)

print("\n---- 2 ----\n")

for content in element.contentReversed {
    content.insertPrevious(.skipping) {
        XElement("I" + ((content as? XElement)?.name ?? "?"))
    }
}

element.echo(pretty: true)
```

Output:

```text
---- 1 ----
---- 2 ----
```
Note that there is no such mechanism for skipping inserted content when not using `insertPrevious`, `insertNext`, or `replace`, e.g. when using `add`. Consider the combination `descendants.add`: there is then no “natural” way to correct the traversal of the tree. (A more common use case would be something like `descendants("table").add { XElement("caption") }`, so this should not be a problem in common cases, but something you should be aware of.)

When using `insertNext`, `replace` etc. in chained iterators, the definition of the content in the braces `{...}` gets _executed_ for each item in the sequence. You might use the `collect` function instead to build content specifically for the current item. E.g. in the last example, you might use with the same result:

```swift
print("\n---- 1 ----\n")

element.content.replace { content in
    collect {
        content.content
    }
}

element.echo(pretty: true)

print("\n---- 2 ----\n")

element.contentReversed.insertPrevious { content in
    collect {
        XElement("I" + ((content as? XElement)?.name ?? "?"))
    }
}

element.echo(pretty: true)
```

You can also do without `collect`:

```swift
let e = XElement("a") {
    XElement("b")
    XElement("c")
}

for descendant in e.descendants({ $0.name != "added" }) {
    descendant.add { XElement("added") }
}

e.echo(pretty: true)
```

Output:

Note that a new `added` element is created each time. From what has already been said, it should be clear that this “duplication” does not work with existing content (unless you use `clone()` or `shallowClone()`):

```swift
let myElement = XElement("a") {
    XElement("to-add")
    XElement("b")
    XElement("c")
}

for descendant in myElement.descendants({ $0.name != "to-add" }) {
    descendant.add {
        myElement.descendants("to-add")
    }
}

myElement.echo(pretty: true)
```

Output:

As a general rule, when inserting a content, and that content is already part of another element or document, that content does not get duplicated, but removed from its original position.

Use `clone()` (or `shallowClone()`) when you actually want content to get duplicated, e.g. using `myElement.descendants("to-add").clone()` in the last example would then output:

By default, when you insert content, this new content is also followed (insertion mode `.following`), as this best reflects the dynamic nature of this library. If you do not want this, set `.skipping` as first argument of `insertPrevious` or `insertNext`. For example, consider the following code:

```swift
let myElement = XElement("top") {
    XElement("a")
}

for element in myElement.descendants {
    if element.name == "a" {
        element.insertNext() {
            XElement("b")
        }
    }
    else if element.name == "b" {
        element.insertNext {
            XElement("c")
        }
    }
}

myElement.echo(pretty: true)
```

Output:

```text
```
When the `b` element gets inserted, the traversal also follows this inserted content. When you would like to skip the inserted content, use `.skipping` as the first argument of `insertNext`:

```swift
...
        element.insertNext(.skipping) {
            XElement("b")
        }
...
```

Output:

```text
```
Similarly, if you replace a node, the content that gets inserted in place of the node is by default included in the iteration. Example: Assume you would like to replace every occurrence of some `bold` element by its content:

```swift
let document = try parseXML(fromText: """
Hello
""")
for bold in document.descendants("bold") { bold.replace { bold.content } }
document.echo()
```

The output is:

```text
Hello
```

## Handling of text

Subsequent text nodes (`XText`) are always automatically combined, and text nodes with empty text are automatically removed. The same treatment is applied to `XLiteral` nodes.
This can be very convenient when processing text, e.g. it is then very straightforward to apply regular expressions to the text in a document. But there might be some stumbling blocks involved here, when the different behaviour of text nodes and other nodes affects the result of your manipulations.
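
A small sketch of what this means in practice: several strings placed next to each other read back as one text. The element name is made up for illustration. Whether the three strings end up in a single `XText` node at exactly this point is assumed from the description above, so only the combined text is relied upon here:

```swift
import SwiftXML

// Three adjacent strings inside one element.
let paragraph = XElement("p") {
    "Hello"
    ", "
    "world"
}

// The combined text does not depend on how the text is split into nodes.
let combined = paragraph.allTextsCombined
print(combined)
```
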
You can avoid merging of a text `text` with other texts by setting its `isolated` property to `true` (you can also choose to set this value during initialization of an `XText`). Consider the following example where the occurrences of a search text get a greenish background. In this example, you do not want `part` to be added to `text` in the iteration:

```swift
let searchText = "world"

document.traverse { node in
    if let text = node as? XText {
        if text.value.contains(searchText) {
            text.isolated = true
            var addSearchText = false
            for part in text.value.components(separatedBy: searchText) {
                text.insertPrevious {
                    addSearchText ? XElement("span", ["style": "background:LightGreen"]) {
                        searchText
                    } : nil
                    part
                }
                addSearchText = true
            }
            text.remove()
            text.isolated = false
        }
    }
}

document.echo()
```

Output:

```text
Hello world, the world is nice.
```

Note that when e.g. inserting nodes, the `XText` nodes of them are then treated as being `isolated` while being moved.

A `String` can be used where an `XText` is required, e.g. you can write `"Hello" as XText`.

`XText`, as well as `XLiteral` and `XCDATASection`, conforms to the `XTextualContentRepresentation` protocol, i.e. they all have a `String` property of name `value` that can be read and set and which represents content as it would be written into the serialized document (with some character escapes necessary in the case of `XText` when it is being written). Note that `XComment` does not conform to the `XTextualContentRepresentation` protocol.
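
As a minimal sketch of reading and setting `value`, shown here for `XText` only:

```swift
import SwiftXML

let text = XText("Hello")

// `value` can be read and set.
text.value += " world"
print(text.value)
```
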
## Rules
When you only want to apply a few changes to a document, just go directly to the few corresponding elements and apply the changes you want. But if you would like to transform a whole document into “something else”, you need a better tool to organise your manipulations of the document: you need a “transformation”.

As mentioned in the general description, a set of rules `XRule` in the form of a transformation instance of type `XTransformation` can be used as follows.
In a rule, the user defines what to do with elements or attributes of certain names. The set of rules can then be applied to a document, i.e. the rules are applied in the order of their definition. This is repeated, guaranteeing that a rule is only applied once to the same object (if not removed from the document and added again), until no application takes place. So elements can be added during application of a rule and then later be processed by the same or another rule.

Example:
```swift
let document = try parseXML(fromText: """
""")

var count = 1

let transformation = XTransformation {

    XRule(forElements: "formula") { element in
        print("\n----- Rule for element \"formula\" -----\n")
        print(" \(element)")
        if count == 1 {
            count += 1
            print(" add image")
            element.insertPrevious {
                XElement("image", ["id": "\(count)"])
            }
        }
    }

    XRule(forElements: "image") { element in
        print("\n----- Rule for element \"image\" -----\n")
        print(" \(element)")
        if count == 2 {
            count += 1
            print(" add formula")
            element.insertPrevious {
                XElement("formula", ["id": "\(count)"])
            }
        }
    }

}

transformation.execute(inDocument: document)

print("\n----------------------------------------\n")

document.echo()
```

```text
----- Rule for element "formula" -----

 add image

----- Rule for element "image" -----

 add formula

----- Rule for element "formula" -----

----------------------------------------

```

As a side note, for such an `XTransformation` the lengths of the element names do not really matter: apart from the initialization of the transformation before the execution and from what happens inside the rules, the application of the rules is not less efficient if the element names are longer.

Instead of using a transformation with a very large number of rules, you should use several transformations, each dedicated to a separate “topic”. E.g. for some document format you might first transform the inline elements and then the block elements. Splitting a transformation into several transformations practically does not hurt performance.
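
Such a split might look like the following sketch. The input document, the element names, and the names of the two transformations are made up for illustration. Only `XTransformation`, `XRule(forElements:)`, and `execute(inDocument:)` as shown above are used:

```swift
import SwiftXML

// Hypothetical input with one block element and one inline element.
let document = try parseXML(fromText: "<doc><para>Some <b>bold</b> text.</para></doc>")

// First topic: inline elements.
let inlineTransformation = XTransformation {
    XRule(forElements: "b") { element in
        element.name = "strong"
    }
}

// Second topic: block elements.
let blockTransformation = XTransformation {
    XRule(forElements: "para") { element in
        element.name = "p"
    }
}

// Run the transformations one after the other on the same document.
inlineTransformation.execute(inDocument: document)
blockTransformation.execute(inDocument: document)

document.echo()
```
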
Note that the order of the rules matters: If you need to look up e.g. the parent of the element in a rule, it is important to know if this parent has already been changed by another rule, i.e. if a preceding rule has transformed this element. An example is given in the following section “Transformations with inverse order”. The usage of several transformations as described in the preceding paragraph might help here. Methods to work with better contextual information are described in the sections “Transformations with attachments for context information”, “Transformations with document versions”, and “Transformations with traversals” below.

Also note that using an `XTransformation` you can only transform a whole document. In the section “Transformations with traversals” below, another option is described for transforming any XML tree.
A transformation can be stopped by calling `stop()` on the transformation, although that only works indirectly:
```swift
var transformationAlias: XTransformation? = nil

let transformation = XTransformation {

    XRule(forElements: "a") { _ in
        transformationAlias?.stop()
    }

}

transformationAlias = transformation

transformation.execute(inDocument: myDocument)
```

## Transformations with inverse order

As noted in the last section, the order of rules is crucial in some transformations, e.g. if the original context is important.

The “inverse order” of rules goes from the inner elements to the outer elements, so that the context is still unchanged when a rule applies. Note the lookup of `element.parent?.name` to differentiate the color of the text:

```swift
let document = try parseXML(fromText: """
This is a hint.
This is a warning.
""", textAllowedInElementWithName: { $0 == "paragraph" })

let transformation = XTransformation {

    XRule(forElements: "paragraph") { element in
        let style: String? = if element.parent?.name == "warning" {
            "color:Red"
        } else {
            nil
        }
        element.replace {
            XElement("p", ["style": style]) {
                element.content
            }
        }
    }

    XRule(forElements: "hint", "warning") { element in
        element.replace {
            XElement("div") {
                XElement("p", ["style": "bold"]) {
                    element.name.uppercased()
                }
                element.content
            }
        }
    }
}

transformation.execute(inDocument: document)

document.echo(pretty: true)
```

Result:

```XML
HINT
This is a hint.
WARNING
This is a warning.
```
This method might not be fully applicable in some transformations.
## Transformations with attachments for context information
To have information about the context of transformed elements in the original document, attachments might be used. See how, in the following code, `attached: ["source": element.name]` is used in the construction of the `div` element, and how this information is then used in the rule for the `paragraph` element (the input document is the same as in the section “Transformations with inverse order” above; note that the inverse order described in that section is _not_ used here):
```swift
let transformation = XTransformation {

    XRule(forElements: "hint", "warning") { element in
        element.replace {
            XElement("div", attached: ["source": element.name]) {
                XElement("p", ["style": "bold"]) {
                    element.name.uppercased()
                }
                element.content
            }
        }
    }

    XRule(forElements: "paragraph") { element in
        let style: String? = if element.parent?.attached["source"] as? String == "warning" {
            "color:Red"
        } else {
            nil
        }
        element.replace {
            XElement("p", ["style": style]) {
                element.content
            }
        }
    }

}

transformation.execute(inDocument: document)
document.echo(pretty: true)
```
The result is the same as in the section “Transformations with inverse order” above.
## Transformations with document versions
As explained in the above section about rules, sometimes you need to know the original context of a transformed element. For this you can use document versions, as explained below.
Note that this method comes with a penalty regarding efficiency because a (temporary) clone needs to be created, but for very difficult transformations it might come in handy. The method might be used when you need to examine the original context in a complex way.
You first create a document version (this creates a clone such that your current document contains backlinks to the clone), and in certain rules you might then copy the backlink from the node to be replaced by using the `withBackLinkFrom:` argument in the creation of an element (the input document is the same as in the section “Transformations with inverse order” above):
```swift
let transformation = XTransformation {

    XRule(forElements: "hint", "warning") { element in
        element.replace {
            XElement("div", withBackLinkFrom: element) {
                XElement("p", ["style": "bold"]) {
                    element.name.uppercased()
                }
                element.content
            }
        }
    }

    XRule(forElements: "paragraph") { element in
        let style: String? = if element.parent?.backLink?.name == "warning" {
            "color:Red"
        } else {
            nil
        }
        element.replace {
            XElement("p", ["style": style]) {
                element.content
            }
        }
    }

}

// make a clone with inverse backlinks,
// pointing from the original document to the clone:
document.makeVersion()

transformation.execute(inDocument: document)

// remove the clone:
document.forgetLastVersion()

document.echo(pretty: true)
```
The result is the same as in the section “Transformations with inverse order” above.
## Transformations with traversals
There is another way to formulate transformations, which uses traversals and which can also be applied to parts of a document or to XML trees that are not part of a document.
As the XML tree can be changed during a traversal, you can traverse an XML tree and change it along the way, e.g. by formulating manipulations according to the name of the current element inside a `switch` statement.
If you then formulate manipulations during the down direction of the traversal, you know that parents or other ancestors of the current node have already been transformed. Conversely, if you formulate manipulations only inside the `up:` traversal part and never manipulate any ancestors of the current element, you know that the parent and other ancestors are still the original ones (the input document is the same as in the section “Transformations with inverse order” above):
```swift
for section in document.elements("section") {
    section.traverse { node in
        // -
    } up: { node in
        if let element = node as? XElement {
            guard node !== section else { return }
            switch element.name {
            case "paragraph":
                let style: String? = if element.parent?.name == "warning" {
                    "color:Red"
                } else {
                    nil
                }
                element.replace {
                    XElement("p", ["style": style]) {
                        element.content
                    }
                }
            case "hint", "warning":
                element.replace {
                    XElement("div") {
                        XElement("p", ["style": "bold"]) {
                            element.name.uppercased()
                        }
                        element.content
                    }
                }
            default:
                break
            }
        }
    }
}

document.echo(pretty: true)
```
As the root of the traversal must not be removed during the traversal, there is a corresponding `guard` statement.
The result is the same as in the section “Transformations with inverse order” above.
Note that when using traversals for transforming an XML tree, using several transformations instead of one does have a negative impact on efficiency.
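The ordering argument above can be illustrated without the library. The following is a minimal sketch in which the `Node` class and its `traverse(down:up:)` method are hypothetical stand-ins (they are not SwiftXML types); it only records the order in which the two closures are called:

```swift
// Hypothetical stand-in for an XML tree node; this is NOT SwiftXML's node type.
final class Node {
    let name: String
    var children: [Node]

    init(_ name: String, _ children: [Node] = []) {
        self.name = name
        self.children = children
    }

    // Depth-first traversal: `down` is called before the children are visited,
    // `up` after all of them have been visited.
    func traverse(down: (Node) -> Void, up: (Node) -> Void) {
        down(self)
        for child in children {
            child.traverse(down: down, up: up)
        }
        up(self)
    }
}

let tree = Node("section", [
    Node("hint", [Node("paragraph")]),
    Node("warning", [Node("paragraph")]),
])

var downOrder: [String] = []
var upOrder: [String] = []
tree.traverse(down: { downOrder.append($0.name) }, up: { upOrder.append($0.name) })

print(downOrder) // ["section", "hint", "paragraph", "warning", "paragraph"]
print(upOrder)   // ["paragraph", "hint", "paragraph", "warning", "section"]
```

In the `up` order every node appears only after all of its descendants, so a manipulation made there that leaves ancestors alone still sees the original parent.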
## Handling of namespaces
The library is very strong when it comes to tracking elements of a certain name and formulating according rules. Adding an additional layer by supporting namespaces directly at those points would make the implementation of the library more complicated and less efficient. Let us then see how one would handle XML documents that use namespaces.
First, you can always look up the namespace prefix settings (attributes `xmlns:...`) in your document. As mentioned in the section about limitations of the XML input, the annotations of namespace prefixes via `xmlns:...` attributes should only be at the root element of the XML source. There are then the following two helper methods to help you with the task of handling the namespaces:
Read the full prefix for a namespace URL string from the root element:
```swift
XDocument.fullPrefix(forNamespace:) -> String
```
“Full” means that a closing `:` is added automatically. If no prefix is defined, an empty string is returned.
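As an illustration of what “full” means here, the following sketch derives such a prefix from a plain dictionary of root attributes. The function `fullPrefixSketch(forNamespace:inRootAttributes:)` and the sample attributes are hypothetical and are not part of the library:

```swift
// Hypothetical helper, NOT the library's implementation: derive the "full" prefix
// for a namespace URL from the attributes of a root element.
func fullPrefixSketch(forNamespace namespace: String, inRootAttributes attributes: [String: String]) -> String {
    for (name, value) in attributes where value == namespace {
        if name.hasPrefix("xmlns:") {
            // "xmlns:math" becomes "math:"
            return String(name.dropFirst("xmlns:".count)) + ":"
        }
    }
    // No prefix is declared for this namespace.
    return ""
}

// Hypothetical attributes of a root element.
let rootAttributes = [
    "xmlns:math": "http://www.w3.org/1998/Math/MathML",
    "id": "doc1",
]

let mathPrefix = fullPrefixSketch(forNamespace: "http://www.w3.org/1998/Math/MathML", inRootAttributes: rootAttributes)
print("\(mathPrefix)mrow")  // math:mrow

let svgPrefix = fullPrefixSketch(forNamespace: "http://www.w3.org/2000/svg", inRootAttributes: rootAttributes)
print("\(svgPrefix)circle") // circle
```

Because an undeclared namespace yields an empty string, the prefix can be interpolated into an element name unconditionally.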
Get a map from the namespace URL strings to the full prefixes from the root element:
```swift
XDocument.fullPrefixesForNamespaces
```
When you then like to access or change elements in that namespace, add the according prefix dynamically in your code:
```swift
let fullMathMLPrefix = myDocument.fullPrefix(forNamespace: "http://www.w3.org/1998/Math/MathML")

let transformation = XTransformation {

    XRule(forElements: "\(fullMathMLPrefix)a") { a in
        ...
    }

    ...
```
If you would like to add a namespace declaration at the root element, use the following method:
```swift
XDocument.setNamespace(:withPossiblyFullPrefix:)
```
Here the prefix might be a “full” prefix, i.e. it could contain a closing `:`. An existing namespace declaration for the same namespace but with another prefix is not (!) removed.
Note that these three helper methods are also available for an element.
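The handling of a “possibly full” prefix can be pictured as follows. This is only a sketch on a plain dictionary; `namespaceAttributeName(forPossiblyFullPrefix:)` and the `declarations` dictionary are hypothetical names, and the library's actual implementation may differ:

```swift
// Hypothetical helper, NOT part of SwiftXML: accepts "math" as well as "math:".
func namespaceAttributeName(forPossiblyFullPrefix prefix: String) -> String {
    let barePrefix = prefix.hasSuffix(":") ? String(prefix.dropLast()) : prefix
    return "xmlns:\(barePrefix)"
}

// Hypothetical namespace declarations of a root element.
var declarations = ["xmlns:m": "http://www.w3.org/1998/Math/MathML"]

// Declaring the same namespace with another prefix adds a second declaration;
// the existing one ("xmlns:m") stays in place.
declarations[namespaceAttributeName(forPossiblyFullPrefix: "math:")] = "http://www.w3.org/1998/Math/MathML"

print(declarations.keys.sorted()) // ["xmlns:m", "xmlns:math"]
```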
### Using async/await
You can use `traverse` with closures using `await`. You can also use the `async` property from the [Swift Async Algorithms package](https://github.com/apple/swift-async-algorithms) (giving an `AsyncLazySequence`) to apply `map` etc. with closures using `await` (e.g. `element.children.async.map { await a.f($0) }`).
Currently the SwiftXML package defines a `forEachAsync` method for closure arguments using `await`, but this method might be removed in future versions of the package if the Swift Async Algorithms package defines it for `AsyncLazySequence`.
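To give an idea of what such a method does, here is a minimal sketch of a sequential “for each” that accepts an `async` closure. The name `forEachAsyncSketch` and the helper functions are hypothetical; this is not the package's implementation:

```swift
// Hypothetical sketch, NOT the package's implementation:
// call an async closure for each element, one after the other.
extension Sequence {
    func forEachAsyncSketch(_ operation: (Element) async throws -> Void) async rethrows {
        for element in self {
            try await operation(element)
        }
    }
}

// Stand-in for some asynchronous work, e.g. a lookup per element name.
func describe(_ name: String) async -> String {
    "<\(name)>"
}

// Collects the descriptions in the order of the input.
func collectDescriptions(of names: [String]) async -> [String] {
    var result: [String] = []
    await names.forEachAsyncSketch { name in
        let description = await describe(name)
        result.append(description)
    }
    return result
}
```

Calling `await collectDescriptions(of: ["a", "b"])` from an asynchronous context processes the names strictly in order, which is the main difference from spawning one task per element.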
### Convenience extensions
`XContent` has the following extensions that are very convenient when working with XML in a complex manner:
- `applying`: apply some changes to an instance and return the instance
- `pulling`: take the content and give something else back, e.g. “pulling” something out of it
- `fullfilling`: test a condition for an instance and return it if the condition is true, else return `nil`
- `fullfills`: test a condition on an instance and return its result

(`fullfilling` is, in principle, a variant of the `filter` method for just one item.)
It is difficult to show the convenience of those extensions with simple examples, where it is easy to formulate the code without them. But they come in handy if the situation gets more complex.
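The semantics of the four extensions can be pictured with a library-independent sketch. The protocol `ConvenienceSketch` and the class `Item` are hypothetical stand-ins; the library's own definitions may differ in detail:

```swift
// Hypothetical sketch, NOT the library's implementation; `Item` stands in for `XContent`.
protocol ConvenienceSketch: AnyObject {}

extension ConvenienceSketch {
    // Apply some changes to the instance and return the instance.
    func applying(_ change: (Self) -> Void) -> Self {
        change(self)
        return self
    }

    // Hand the instance to a closure and return whatever the closure produces.
    func pulling<T>(_ extract: (Self) -> T) -> T {
        extract(self)
    }

    // Return the instance if the condition holds, else nil.
    func fullfilling(_ condition: (Self) -> Bool) -> Self? {
        condition(self) ? self : nil
    }

    // Return the result of the condition.
    func fullfills(_ condition: (Self) -> Bool) -> Bool {
        condition(self)
    }
}

final class Item: ConvenienceSketch {
    var name: String
    var attributes: [String: String] = [:]
    init(_ name: String) { self.name = name }
}

let item = Item("a").applying { $0.attributes["moved"] = "yes" }
let upperName = item.pulling { $0.name.uppercased() }
let asB = item.fullfilling { $0.name == "b" }
let isMoved = item.fullfills { $0.attributes["moved"] == "yes" }

print(upperName)   // A
print(asB == nil)  // true
print(isMoved)     // true
```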
Example:
```swift
let element1 = XElement("a") {
    XElement("child-of-a") {
        XElement("more", ["special": "yes"])
    }
}

let element2 = XElement("b")

if let childOfA = element1.fullfilling({ $0.name == "a" })?.children.first,
   childOfA.children.first?.fullfills({ $0["special"] == "yes" && $0["moved"] != "yes" }) == true {
    element2.add {
        childOfA.applying { $0["moved"] = "yes" }
    }
}

element2.echo()
```
Result:
```text
<b><child-of-a moved="yes"><more special="yes"/></child-of-a></b>
```
`applying` is also predefined for a content sequence or an element sequence, where it is shorter than using the `map` method in the general case (where a `return` statement might have to be included), and you can directly use it to define content (without the `asContent` property described above):
```swift
let myElement = XElement("a") {
    XElement("b", ["inserted": "yes"]) {
        XElement("c", ["inserted": "yes"])
    }
}

print(Array(myElement.descendants.applying{ $0["inserted"] = "yes" }))
```
Result:
```text
[<b inserted="yes">, <c inserted="yes">]
```
## Tools
### `copyXStructure`
```swift
public func copyXStructure(from start: XContent, to end: XContent, upTo: XElement? = nil, correction: ((StructureCopyInfo) -> XContent)?) -> XContent?
```
Copies the structure from `start` to `end`, optionally up to the `upTo` value. `start` and `end` must have a common ancestor; `nil` is returned if there is none. The returned element is a clone of the `upTo` value if a) it is not `nil` and b) `upTo` is an ancestor of the common ancestor or the common ancestor itself. Otherwise it is a clone of the common ancestor (but generally with different content in both cases). The `correction` closure can make some corrections.
## Debugging
If one uses multiple instances of `XRule` bundled into an `XTransformation` to transform a whole document, it can be useful to know which actions belonging to which rules “touched” an element. In debug builds, the filenames and line numbers of all actions executed by a transformation are recorded in the `encounteredActionsAt` property.