https://github.com/dsillman2000/yaml-reference
YAML tagging system for reading and writing modular YAML files with Python
- Host: GitHub
- URL: https://github.com/dsillman2000/yaml-reference
- Owner: dsillman2000
- License: mit
- Created: 2025-05-25T14:37:49.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-06-01T15:17:12.000Z (about 1 year ago)
- Last Synced: 2025-10-29T06:44:49.911Z (8 months ago)
- Topics: python, ruamel-yaml, yaml
- Language: Python
- Homepage:
- Size: 43.9 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# yaml-reference
`yaml-reference` uses `ruamel.yaml` to support cross-file references and YAML composition in YAML files through the tags `!reference`, `!reference-all`, `!flatten`, and `!merge`.
Install the package from PyPI with:
```bash
# pip
pip install yaml-reference
# poetry
poetry add yaml-reference
# uv
uv add yaml-reference
```
## Spec

This Python library implements the YAML specification for cross-file references and YAML composition in YAML files using tags `!reference`, `!reference-all`, `!flatten`, and `!merge` as defined in the [yaml-reference-specs project](https://github.com/dsillman2000/yaml-reference-specs).
## Example
```yaml
# root.yaml
version: "3.1"
services:
  - !reference
    path: "services/website.yaml"
  - !reference
    path: "services/database.yaml"
networkConfigs:
  !reference-all
  glob: "networks/*.yaml"
tags: !flatten
  - !reference { path: "common/tags.yaml" }
  - "web"
  - "service"
config: !merge
  - !reference { path: "config/defaults.yaml" }
  - !reference { path: "config/overrides.yaml" }
```
Assuming `services/website.yaml` and `services/database.yaml` exist relative to the directory containing `root.yaml`, along with a `networks` directory of YAML files, the following Python code loads `root.yaml` and expands the referenced files:
```python
from yaml_reference import load_yaml_with_references
data = load_yaml_with_references("root.yaml")
print(data)
# {"networkConfigs": [{"network": "vpn","version": "1.1"},{"network": "nfs","version": "1.0"}],"services": ["website","database"],"version": "3.1"}
# With path restrictions for security
data = load_yaml_with_references("root.yaml", allow_paths=["/allowed/path"])
```
Note that the `load_yaml_with_references` function instantiates a `ruamel.yaml.YAML` loader class (`typ='safe'`) to perform the deserialization of the YAML files, and returns a Python dictionary with the recursively-expanded YAML data.
If you wish to resolve one "layer" of references without recursively exhausting the entire reference graph, the `parse_yaml_with_references` function can be used to obtain the original YAML document's contents with `!reference`/`!reference-all` tags as dedicated objects called `Reference` and `ReferenceAll`.
```python
from yaml_reference import parse_yaml_with_references
data = parse_yaml_with_references("root.yaml")
print(data["networkConfigs"])
# ReferenceAll(glob="networks/*.yaml", location="/path/to/root.yaml")
# With path restrictions for security
data = parse_yaml_with_references("root.yaml", allow_paths=["/allowed/path"])
```
### The `!merge` Tag
The `!merge` tag combines multiple YAML mappings (dictionaries) into a single mapping. This is useful for composing configuration from multiple sources or applying overrides. When you use `!merge`, you provide a sequence of mappings that will be merged together, with later mappings overriding keys from earlier ones.
```yaml
# Example: Merge default and override configurations
config: !merge
- {host: "localhost", port: 8080, debug: false}
- {port: 9000, debug: true} # Overrides port and debug from the first mapping
```
When loaded with `load_yaml_with_references`, this becomes `{"host": "localhost", "port": 9000, "debug": true}`. The `!merge` tag can also be nested and combined with `!reference` and `!flatten` tags for complex YAML composition scenarios.
Note that, if a nested sequence of mappings is provided to `!merge`, the sequence argument will be flattened first, and then the resulting mappings will be merged together. For example:
```yaml
config: !merge
- - a: 1
- b: 2
- c: 3
- - [{c: 5, a: 5}]
```
This is processed into `{"config": {"a": 5, "b": 2, "c": 5}}`, because the nested sequences of mappings are flattened into a single sequence of mappings before merging.
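The flatten-then-merge behavior described above can be sketched in plain Python. This is only an illustration, not the package's implementation: the helper names `flatten` and `merge_mappings` are hypothetical, and the sketch assumes a shallow merge in which a later mapping replaces whole values for any key it shares with an earlier one.

```python
def flatten(items):
    """Recursively flatten nested lists into a single flat list."""
    flat = []
    for item in items:
        if isinstance(item, list):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat


def merge_mappings(items):
    """Flatten `items`, then merge the mappings so later keys win.

    Assumption: the merge is shallow (dict.update semantics).
    """
    merged = {}
    for mapping in flatten(items):
        if not isinstance(mapping, dict):
            raise TypeError(f"expected a mapping, got {type(mapping).__name__}")
        merged.update(mapping)
    return merged


# A nested sequence of mappings, with one group nested two levels deep.
nested = [[{"a": 1}], {"b": 2}, {"c": 3}, [[{"c": 5, "a": 5}]]]
print(merge_mappings(nested))  # {'a': 5, 'b': 2, 'c': 5}
```

The last mapping overrides `a` and `c`, while `b` survives from an earlier mapping because nothing later redefines it.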
### Using Anchors with `!reference` and `!reference-all`
Both `!reference` and `!reference-all` tags support an optional `anchor` parameter that allows you to import only a specific anchored section from a file, rather than the entire file contents. This is useful when you want to extract a particular part of a larger YAML document.
```yaml
# main.yaml
database_config: !reference
  path: "config.yaml"
  anchor: db_settings
api_keys: !reference-all
  glob: "secrets/*.yaml"
  anchor: api_key
```
In this example, if `config.yaml` contains multiple anchored sections, only the one labeled with `&db_settings` will be imported. Similarly, `!reference-all` will extract the `&api_key` anchor from each file matching the glob pattern.
Here's a practical example:
```yaml
# config.yaml
app_name: MyApplication
db_settings: &db_settings
  host: localhost
  port: 5432
  database: myapp
cache_settings: &cache_settings
  ttl: 3600
```
```yaml
# main.yaml
config: !reference
  path: "config.yaml"
  anchor: db_settings
```
When loaded with `load_yaml_with_references("main.yaml")`, the result will be:
```python
{
    "config": {
        "host": "localhost",
        "port": 5432,
        "database": "myapp"
    }
}
```
Note that the `app_name` and `cache_settings` fields from `config.yaml` are not included in the result because only the anchored section was imported. If the specified anchor is not found in the referenced file, a `ValueError` will be raised.
### VSCode squigglies
To get rid of red squigglies in VSCode when using the `!reference`, `!reference-all`, `!flatten`, and `!merge` tags, you can add the following to your `settings.json` file:
```json
"yaml.customTags": [
    "!reference mapping",
    "!reference-all mapping",
    "!flatten sequence",
    "!merge sequence"
]
```
## CLI interface
This package ships a CLI that reads a YAML file containing `!reference` tags and prints its contents as pretty-printed JSON with the references expanded. This is useful for generating a single file for deployment or other purposes. Note that mapping keys are sorted alphabetically in the output. The CLI is also used to test this package against the contract defined by the `yaml-reference-specs` project.
```bash
$ yaml-reference-cli -h
usage: yaml-reference-cli [-h] [--allow ALLOW_PATHS] input_file

Compile a YAML file containing !reference tags into a new YAML file with resolved references. Expects a YAML file to be provided via the "input_file" argument.
Outputs JSON content to stdout.

positional arguments:
  input_file           Path to the input YAML file with references to resolve and print as JSON.

options:
  -h, --help           show this help message and exit
  --allow ALLOW_PATHS  Path to allow references from.

$ yaml-reference-cli root.yaml
{
  "networkConfigs": [
    {
      "network": "vpn",
      "version": "1.1"
    },
    {
      "network": "nfs",
      "version": "1.0"
    }
  ],
  "services": [
    "website",
    "database"
  ],
  "tags": [
    "common:aws",
    "common:http",
    "common:security",
    "common:waf",
    "web",
    "service"
  ],
  "version": "3.1"
}
```
To get YAML output instead of JSON, pipe the result through the `yq` CLI tool ([mikefarah/yq](https://github.com/mikefarah/yq)).
```bash
$ yaml-reference-cli root.yaml | yq -P
networkConfigs:
  - network: vpn
    version: 1.1
  - network: nfs
    version: 1.0
services:
  - website
  - database
tags:
  - common:aws
  - common:http
  - common:security
  - common:waf
  - web
  - service
version: 3.1
# Pipe it to a result file
$ yaml-reference-cli root.yaml | yq -P > .compiled/root.yaml
```
## Circular reference protection
As required by the yaml-reference-specs specification, this package includes circular reference detection to prevent infinite recursion. If a circular reference is detected (e.g., A references B, B references C, C references A), a `ValueError` will be raised with a descriptive error message. This protects against self-references and circular chains in both `!reference` and `!reference-all` tags.
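As a rough illustration of how such detection can work, the sketch below walks a reference graph depth-first and raises `ValueError` when a file reappears in the chain currently being resolved. It is not the package's actual implementation: the function name `check_no_cycles`, the dictionary-based graph, and the error message format are all assumptions made for this example.

```python
def check_no_cycles(references, start, _chain=()):
    """Raise ValueError if following references from `start` revisits a file.

    `references` maps each file name to the list of files it references
    (a hypothetical stand-in for the tags found while parsing each file).
    """
    if start in _chain:
        # Report only the portion of the chain that forms the loop.
        cycle = " -> ".join([*_chain[_chain.index(start):], start])
        raise ValueError(f"circular reference detected: {cycle}")
    for target in references.get(start, []):
        check_no_cycles(references, target, (*_chain, start))


# A references B, B references C, C references A.
cyclic = {"a.yaml": ["b.yaml"], "b.yaml": ["c.yaml"], "c.yaml": ["a.yaml"]}
try:
    check_no_cycles(cyclic, "a.yaml")
except ValueError as exc:
    print(exc)  # circular reference detected: a.yaml -> b.yaml -> c.yaml -> a.yaml
```

Tracking the current chain rather than a global "seen" set means that two files referencing the same third file is not reported as a cycle.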
## Security considerations
### Path restriction and `allow_paths`
By default, `!reference` and `!reference-all` tags can only reference files within the same directory as the source YAML file (or child subdirectories). To allow references to files in other disparate directory trees, you must explicitly specify allowed paths using the `allow_paths` parameter:
```python
from yaml_reference import load_yaml_with_references
# Allow references from specific directories only
data = load_yaml_with_references(
    "config.yml",
    allow_paths=["/allowed/path1", "/allowed/path2"],
)
```
In the CLI, use the `--allow` flag:
```bash
yaml-reference-cli input.yml --allow /allowed/path1 --allow /allowed/path2
```
Whether or not `allow_paths` is specified, references to files in the same directory as the source YAML file (or its subdirectories) are always allowed. "Back-navigating" out of the root directory (using `..` in local references from a root YAML file) is not allowed. This provides a secure baseline that prevents access to any location which has not been explicitly allowed.
### Absolute path restrictions
References using absolute paths (e.g., `/tmp/file.yml`) are explicitly rejected with a `ValueError`. All reference paths must be relative to the source file's directory. If you absolutely must reference an absolute path, relative paths to symlinks can be used. Note that their target directories must be explicitly allowed to avoid permission errors (see the above section about "Path restriction and `allow_paths`").
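The path rules in this section can be pictured with a small, purely lexical check. This is a sketch under stated assumptions rather than the package's real logic: `is_reference_allowed` is a hypothetical name, the check never touches the filesystem (so it does not model symlinks), and it assumes that a relative reference landing inside one of the `allow_paths` directories is accepted.

```python
import posixpath
from pathlib import PurePosixPath


def is_reference_allowed(source_file, ref_path, allow_paths=()):
    """Return True if `ref_path`, relative to `source_file`, stays in an allowed tree.

    Hypothetical helper: absolute reference paths raise ValueError, and the
    source file's own directory is always treated as allowed.
    """
    ref = PurePosixPath(ref_path)
    if ref.is_absolute():
        raise ValueError(f"absolute reference paths are rejected: {ref_path}")
    root = PurePosixPath(source_file).parent
    # Collapse ".." segments lexically; symlinks are ignored in this sketch.
    target = PurePosixPath(posixpath.normpath(str(root / ref)))
    allowed_roots = [root, *(PurePosixPath(p) for p in allow_paths)]
    return any(target == r or r in target.parents for r in allowed_roots)


print(is_reference_allowed("/project/root.yaml", "services/website.yaml"))  # True
print(is_reference_allowed("/project/root.yaml", "../other/file.yaml"))     # False
```

A production check would also need to resolve symlinks before comparing paths, which is why the symlink targets mentioned above must themselves be allowed.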
## Acknowledgements
Contributor(s):
- David Sillman
  - Personal website: https://www.dsillman.com
- Ryan Johnson