https://github.com/bug-ops/fast-yaml
Parse YAML at Rust speed. Full 1.2.2 spec, zero unsafe code, built-in linter, parallel processing. Native bindings for Python & Node.js.
- Host: GitHub
- URL: https://github.com/bug-ops/fast-yaml
- Owner: bug-ops
- Created: 2025-12-12T22:12:24.000Z
- Default Branch: main
- Last Pushed: 2025-12-22T15:23:18.000Z
- Last Synced: 2025-12-24T01:55:40.075Z
- Topics: high-performance, linter, napi-rs, nodejs, parallel-processing, parser, pyo3, python, rust, yaml, yaml-linter, yaml-parser
- Language: Rust
- Homepage:
- Size: 517 KB
- Stars: 4
- Watchers: 0
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
# fast-yaml
**High-performance YAML 1.2.2 parser for Python and Node.js, powered by Rust.**
Drop-in replacement for PyYAML and js-yaml. In the benchmarks below it matches or beats PyYAML's C extension on small and medium files, runs **1.3-3.7x faster** than pure-Python PyYAML, and **1.15-1.42x faster** than js-yaml. Full YAML 1.2.2 Core Schema compliance, comprehensive linting, and multi-threaded parallel processing.
> [!IMPORTANT]
> **YAML 1.2.2 Compliance** — Unlike PyYAML (YAML 1.1), `fast-yaml` follows the modern YAML 1.2.2 specification. This means `yes/no/on/off` are strings, not booleans.
## Installation
```bash
# Python
pip install fastyaml-rs
# Node.js
npm install fastyaml-rs
# CLI
cargo install fast-yaml-cli
```
> [!WARNING]
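> Requires Python 3.10+ (Python package) or Node.js 20+ (npm package). Rust 1.88+ is needed for the CLI (`cargo install`) and for building from source.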
### Build from source
```bash
git clone https://github.com/bug-ops/fast-yaml.git
cd fast-yaml
# Python
uv sync && uv run maturin develop
# Node.js
cd nodejs && npm install && npm run build
```
## Quick Start
### Python
```python
import fast_yaml
data = fast_yaml.safe_load("""
name: fast-yaml
features: [fast, safe, yaml-1.2.2]
""")
yaml_str = fast_yaml.safe_dump(data)
```
> [!TIP]
> Migrating from PyYAML? Just change your import: `import fast_yaml as yaml`
### Node.js
```typescript
import { safeLoad, safeDump } from 'fastyaml-rs';
const data = safeLoad(`name: fast-yaml`);
const yamlStr = safeDump(data);
```
### CLI
```bash
# Single file operations
fy parse config.yaml # Validate syntax
fy format -i config.yaml # Format in-place
fy convert json config.yaml # YAML → JSON
fy lint config.yaml # Lint with diagnostics
# Batch mode (directories, globs, multiple files)
fy format -i src/ # Format entire directory
fy format -i "**/*.yaml" # Format with glob pattern
fy format -i -j 8 project/ # Parallel processing (8 workers)
fy lint --exclude "tests/**" . # Lint all except tests
```
> [!TIP]
> Batch mode activates automatically for directories, globs, or multiple files. Supports parallel processing, include/exclude patterns, and respects `.gitignore`.
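
The include/exclude selection described in the tip can be pictured with a small standard-library sketch. `discover_yaml_files` is a hypothetical helper, not part of fast-yaml or the `fy` CLI, and it leaves `.gitignore` handling out entirely; it only illustrates how glob patterns might narrow a directory tree down to a batch of files.

```python
from fnmatch import fnmatch
from pathlib import Path


def discover_yaml_files(root, include=("*.yaml", "*.yml"), exclude=()):
    """Return YAML files under `root` as sorted, POSIX-style relative paths.

    `include` patterns are matched against the file name, `exclude`
    patterns against the path relative to `root`.
    Hypothetical helper: .gitignore rules are NOT applied here.
    """
    root = Path(root)
    found = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        if not any(fnmatch(path.name, pattern) for pattern in include):
            continue
        if any(fnmatch(rel, pattern) for pattern in exclude):
            continue
        found.append(rel)
    return sorted(found)
```

Calling `discover_yaml_files(".", exclude=("tests/**",))` selects roughly the same files as the `fy lint --exclude "tests/**" .` example, under the assumption that the CLI matches patterns against relative paths.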
## Features
- **High Performance** — Matches PyYAML C on small/medium files, 2-4x faster than pure Python
- **YAML 1.2.2** — Full Core Schema compliance
- **Drop-in API** — Compatible with PyYAML/js-yaml
- **Batch Processing** — Multi-file operations with parallel workers, glob patterns, .gitignore support
- **Linting** — Rich diagnostics with line/column tracking
- **Parallel** — Multi-threaded processing for large files
- **Safe** — Memory-safe Rust with minimal `unsafe` (FFI boundaries only, explicitly documented)
> [!TIP]
> Parallel processing provides 3-6x speedup on 4-8 core systems for multi-document files.
### Linting
```python
from fast_yaml._core.lint import lint
diagnostics = lint("key: value\nkey: duplicate")
for diag in diagnostics:
    print(f"{diag.severity}: {diag.message} at line {diag.span.start.line}")
```
### Parallel Processing
```python
from fast_yaml._core.parallel import parse_parallel, ParallelConfig
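# Example input: two documents separated by `---`
multi_doc_yaml = "a: 1\n---\nb: 2\n"
config = ParallelConfig(thread_count=4, max_input_size=100 * 1024 * 1024)  # 100 MiB input cap
docs = parse_parallel(multi_doc_yaml, config)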
```
## Performance
> [!NOTE]
> Three separate benchmark suites: **Python API** (vs PyYAML), **Node.js API** (vs js-yaml), and **CLI Batch Mode** (vs yamlfmt).
> [!NOTE]
> The timings include process startup (~15 ms for Python, ~20-25 ms for Node.js), which dominates the small-file results. In long-running processes, where startup is paid only once, the project estimates that relative speedups would be 2-4x higher.
> [!TIP]
> Batch mode is where fast-yaml excels with parallel processing. Use `-j` to specify worker count.
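
To see the effect of startup overhead, a parser can be timed inside an already running interpreter. The sketch below uses only the standard library; `best_time_ms` is a hypothetical helper, and `json.loads` stands in for the parser so the snippet runs without fast-yaml installed. Swapping in `fast_yaml.safe_load` and `yaml.safe_load` with a YAML payload would give an in-process comparison, which is not what the published tables measure.

```python
import json
import timeit


def best_time_ms(func, arg, repeat=5, number=200):
    """Best per-call wall time in milliseconds.

    Runs `func(arg)` `number` times per trial, repeats the trial `repeat`
    times, and reports the fastest trial divided by `number`.
    """
    trials = timeit.repeat(lambda: func(arg), repeat=repeat, number=number)
    return min(trials) / number * 1000.0


# Stand-in workload: json.loads replaces the YAML parser for illustration only.
payload = json.dumps({"name": "fast-yaml", "features": ["fast", "safe"]})

if __name__ == "__main__":
    print(f"json.loads: {best_time_ms(json.loads, payload):.4f} ms per call")
```

Taking the minimum over several trials reduces noise from other processes; interpreter startup never enters the measurement because the timer starts after imports have completed.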
### Python API vs PyYAML
**Parse (loading):**
| File Size | fast-yaml | PyYAML (C) | PyYAML (pure) | vs C | vs pure |
|-----------|-----------|------------|---------------|------|---------|
| Small (502B) | **15.5 ms** | 20.2 ms | 20.8 ms | **1.30x** | **1.34x** |
| Medium (44KB) | **26.3 ms** | 26.4 ms | 61.2 ms | **1.00x** | **2.33x** |
| Large (449KB) | 130.3 ms | **79.3 ms** | 429.6 ms | 0.61x | **3.30x** |
**Dump (serialization):**
| File Size | fast-yaml | PyYAML (C) | PyYAML (pure) | vs C | vs pure |
|-----------|-----------|------------|---------------|------|---------|
| Small (502B) | **15.7 ms** | 20.8 ms | 21.2 ms | **1.33x** | **1.35x** |
| Medium (44KB) | **31.6 ms** | 31.7 ms | 82.7 ms | **1.00x** | **2.62x** |
| Large (449KB) | 177.6 ms | **131.1 ms** | 653.8 ms | 0.74x | **3.68x** |
**Key findings:**
- **Small/Medium files**: fast-yaml matches or beats PyYAML C (1.0-1.3x speedup)
- **Pure Python**: fast-yaml consistently 1.3-3.7x faster across all sizes
- **Large files**: PyYAML's C extension is faster on a single large file (fast-yaml runs at 0.61-0.74x its speed); use fast-yaml's parallel mode for multi-document streams
Full benchmarks: [benches/comparison](benches/comparison/)
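
The "vs C" and "vs pure" columns are each competitor's time divided by fast-yaml's time, so a value above 1.00 means fast-yaml was faster. As a check, the parse-table ratios can be recomputed from the published timings alone (the dictionary layout and function name here are illustrative):

```python
# Parse timings in milliseconds, copied from the parse table above.
PARSE_MS = {
    "Small (502B)": {"fast-yaml": 15.5, "PyYAML (C)": 20.2, "PyYAML (pure)": 20.8},
    "Medium (44KB)": {"fast-yaml": 26.3, "PyYAML (C)": 26.4, "PyYAML (pure)": 61.2},
    "Large (449KB)": {"fast-yaml": 130.3, "PyYAML (C)": 79.3, "PyYAML (pure)": 429.6},
}


def speedups(row):
    """Ratio of each competitor's time to fast-yaml's time, rounded to 2 places."""
    base = row["fast-yaml"]
    return {name: round(ms / base, 2) for name, ms in row.items() if name != "fast-yaml"}


if __name__ == "__main__":
    for size, row in PARSE_MS.items():
        print(size, speedups(row))
```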
### Node.js API vs js-yaml (Apple M3 Pro, 12 cores)
**Parse (loading):**
| File Size | fast-yaml | js-yaml | Speedup |
|-----------|-----------|---------|---------|
| Small (502B) | **24.4 ms** | 28.1 ms | **1.15x** |
| Medium (44KB) | **26.2 ms** | 31.9 ms | **1.22x** |
| Large (449KB) | **40.4 ms** | 48.3 ms | **1.20x** |
**Dump (serialization):**
| File Size | fast-yaml | js-yaml | Speedup |
|-----------|-----------|---------|---------|
| Small (502B) | **24.1 ms** | 29.3 ms | **1.22x** |
| Medium (44KB) | **27.1 ms** | 34.9 ms | **1.29x** |
| Large (449KB) | **50.7 ms** | 72.1 ms | **1.42x** |
**Key findings:**
- **Consistent advantage**: fast-yaml 1.15-1.42x faster across all scenarios
- **Best performance**: Large file dump operations (1.42x speedup)
- **V8 JIT competitive**: js-yaml benefits from V8's TurboFan optimization, so the gap is smaller than the one against pure-Python PyYAML
- **Real-world servers**: In persistent processes, where startup overhead is paid once, the project estimates a 2-4x speedup (not measured in the tables above)
### CLI Single-File vs yamlfmt (Apple M3 Pro, 12 cores)
| File Size | fast-yaml | yamlfmt | Result |
|-----------|-----------|---------|--------|
| Small (502 bytes) | **1.7 ms** | 3.1 ms | **1.80x faster** ✓ |
| Medium (45 KB) | **2.5 ms** | 2.9 ms | **1.19x faster** ✓ |
| Large (460 KB) | 8.4 ms | **2.9 ms** | yamlfmt 2.88x faster |
### CLI Batch Mode vs yamlfmt
| Workload | fast-yaml (parallel) | yamlfmt (sequential) | Speedup |
|----------|---------------------|----------------------|---------|
| 50 files (26 KB) | **4.3 ms** | 10.3 ms | **2.40x faster** ✓ |
| 200 files (204 KB) | **8.0 ms** | 52.7 ms | **6.63x faster** ✓ |
| 500 files (1 MB) | **15.5 ms** | 244.7 ms | **15.77x faster** ⚡ |
| 1000 files (1 MB) | **23.4 ms** | 323.4 ms | **13.80x faster** ⚡ |
**Key takeaway:** Batch mode with parallel workers was 2.4x to 15.8x faster than yamlfmt in these runs, and at least 6x faster from 200 files upward, making it well suited to formatting entire codebases.
```bash
# Run benchmarks
bash benches/comparison/scripts/run_python_benchmark.sh # Python API
bash benches/comparison/scripts/run_nodejs_benchmark.sh # Node.js API
bash benches/comparison/scripts/run_batch_benchmark.sh # CLI batch mode
```
**Test environment:** macOS 14, Apple M3 Pro (12 cores), fast-yaml 0.4.1, PyYAML 6.0.3, js-yaml 4.1.1, Node.js 25.2.1, yamlfmt 0.21.0
## YAML 1.2.2 Differences
### Differences from PyYAML (YAML 1.1)
| Feature | PyYAML (YAML 1.1) | fast-yaml (YAML 1.2.2) |
|---------|-------------------|------------------------|
| `yes/no` | `True/False` | `"yes"/"no"` (strings) |
| `on/off` | `True/False` | `"on"/"off"` (strings) |
| `014` (leading zero) | `12` (octal) | `14` (decimal) |
| `0o14` (octal prefix) | `"0o14"` (string) | `12` |
```python
fast_yaml.safe_load("yes") # "yes" (string, not True!)
fast_yaml.safe_load("0o14") # 12 (octal)
fast_yaml.safe_load("014") # 14 (decimal, NOT octal!)
```
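
The results above follow from the Core Schema's tag resolution rules for plain (unquoted) scalars. The sketch below transcribes those rules from the YAML 1.2.2 specification into regular expressions. `resolve_plain_scalar` is an illustrative name; this is not fast-yaml's implementation, which builds on saphyr according to the technology stack listed later in this README.

```python
import math
import re

# Patterns from the YAML 1.2.2 Core Schema (tag resolution for plain scalars).
_NULL = re.compile(r"null|Null|NULL|~|")
_BOOL = re.compile(r"true|True|TRUE|false|False|FALSE")
_INT_DEC = re.compile(r"[-+]?[0-9]+")
_INT_OCT = re.compile(r"0o[0-7]+")
_INT_HEX = re.compile(r"0x[0-9a-fA-F]+")
_FLOAT = re.compile(r"[-+]?(?:\.[0-9]+|[0-9]+(?:\.[0-9]*)?)(?:[eE][-+]?[0-9]+)?")
_INF = re.compile(r"[-+]?\.(?:inf|Inf|INF)")
_NAN = re.compile(r"\.(?:nan|NaN|NAN)")


def resolve_plain_scalar(text):
    """Map an unquoted scalar to a Python value using Core Schema rules."""
    if _NULL.fullmatch(text):
        return None
    if _BOOL.fullmatch(text):
        return text.lower() == "true"
    if _INT_DEC.fullmatch(text):
        return int(text, 10)  # leading zeros are decimal: "014" -> 14
    if _INT_OCT.fullmatch(text):
        return int(text[2:], 8)
    if _INT_HEX.fullmatch(text):
        return int(text[2:], 16)
    if _FLOAT.fullmatch(text):
        return float(text)
    if _INF.fullmatch(text):
        return -math.inf if text.startswith("-") else math.inf
    if _NAN.fullmatch(text):
        return math.nan
    return text  # everything else, including yes/no/on/off, stays a string
```

Because `yes`, `no`, `on`, and `off` match none of the patterns, they fall through to the final `return text`, which is why they load as strings.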
## API Reference
### Loading YAML
```python
# Single document
data = fast_yaml.safe_load(yaml_string)
# Multiple documents
for doc in fast_yaml.safe_load_all(yaml_string):
    print(doc)
# PyYAML-compatible
data = fast_yaml.load(yaml_string, Loader=fast_yaml.SafeLoader)
```
### Dumping YAML
```python
yaml_str = fast_yaml.safe_dump(data)
# With options
yaml_str = fast_yaml.dump(
    data,
    indent=2,
    width=80,
    explicit_start=True,
    sort_keys=False,
)
# Multiple documents
yaml_str = fast_yaml.safe_dump_all([doc1, doc2, doc3])
```
### Type mappings
| YAML Type | Python Type |
|-----------|-------------|
| `null`, `~` | `None` |
| `true`, `false` | `bool` |
| `123`, `0x1F`, `0o17` | `int` |
| `1.23`, `.inf`, `.nan` | `float` |
| `"string"`, `'string'` | `str` |
| `[a, b, c]` | `list` |
| `{a: 1, b: 2}` | `dict` |
## Security
Input limits guard against denial-of-service from oversized or maliciously crafted input.
### Security limits
| Limit | Default | Configurable |
|-------|---------|--------------|
| Max input size | 100 MB | Yes (up to 1GB) |
| Max documents | 100,000 | Yes (up to 10M) |
| Max threads | 128 | Yes |
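
As a rough picture of how such limits act as a gate before any parsing happens, here is a minimal sketch. `Limits` and `check_input` are hypothetical names that mirror the defaults in the table; they are not fast-yaml's API, and 100 MB is taken to mean `100 * 1024 * 1024` bytes, matching the `max_input_size` value in the parallel-processing example.

```python
from dataclasses import dataclass

MIB = 1024 * 1024


@dataclass(frozen=True)
class Limits:
    """Hypothetical settings object; defaults mirror the table above."""

    max_input_size: int = 100 * MIB
    max_documents: int = 100_000
    max_threads: int = 128


def check_input(data, document_count, threads, limits=Limits()):
    """Raise ValueError on the first exceeded limit, otherwise return None."""
    if len(data) > limits.max_input_size:
        raise ValueError(f"input is {len(data)} bytes, limit is {limits.max_input_size}")
    if document_count > limits.max_documents:
        raise ValueError(f"{document_count} documents, limit is {limits.max_documents}")
    if threads > limits.max_threads:
        raise ValueError(f"{threads} threads requested, limit is {limits.max_threads}")
```

Rejecting input before parsing keeps the cost of a hostile payload bounded by a length check rather than by the work the parser would otherwise do.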
## Project
### Project structure
```
fast-yaml/
├── crates/
│ ├── fast-yaml-core/ # Core YAML parser/emitter
│ ├── fast-yaml-linter/ # Linting engine
│ ├── fast-yaml-parallel/ # Multi-threaded processing
│ └── fast-yaml-ffi/ # FFI utilities
├── python/ # PyO3 Python bindings
├── nodejs/ # NAPI-RS Node.js bindings
└── Cargo.toml # Workspace manifest
```
### Technology stack
| Component | Library |
|-----------|---------|
| YAML Parser | [saphyr](https://github.com/saphyr-rs/saphyr) |
| Python Bindings | [PyO3](https://pyo3.rs/) |
| Node.js Bindings | [NAPI-RS](https://napi.rs/) |
| Parallelism | [Rayon](https://github.com/rayon-rs/rayon) |
**Rust 2024 Edition** • **Python 3.10+** • **Node.js 20+**
## Contributing
Contributions welcome! All PRs must pass CI checks:
```bash
cargo +nightly fmt --all
cargo clippy --workspace --all-targets -- -D warnings
cargo nextest run --workspace
```
## FAQ
### Why not just use PyYAML?
PyYAML is excellent. Use fast-yaml when you need faster parsing than pure-Python PyYAML (1.3-3.7x in the benchmarks above), YAML 1.2.2 compliance, built-in linting, or parallel processing.
### Is this a drop-in replacement?
For `safe_*` functions, yes. Just change `import yaml` to `import fast_yaml as yaml`. Note that YAML 1.2.2 has different boolean/octal handling.
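
Before switching imports, it can help to find the scalars whose meaning changes. The sketch below is a deliberately naive, line-based scan using only the standard library; `find_changed_scalars` is a hypothetical helper, not part of fast-yaml, and it does not understand flow collections, block scalars, or multi-line values. It flags the unquoted `yes/no/on/off` forms that PyYAML loads as booleans, plus leading-zero integers that PyYAML reads as octal.

```python
import re

# Plain scalars PyYAML (YAML 1.1) loads as booleans; YAML 1.2.2 keeps them as strings.
_YAML11_ONLY_BOOL = re.compile(r"yes|Yes|YES|no|No|NO|on|On|ON|off|Off|OFF")
# Leading-zero integers: octal under YAML 1.1, decimal under YAML 1.2.2.
_LEADING_ZERO_INT = re.compile(r"[-+]?0[0-7]+")


def _scalar_candidate(line):
    """Naively extract the value from a `key: value` or `- value` line."""
    body = line.split(" #", 1)[0].strip()  # drop a trailing comment
    if body.startswith("- "):
        body = body[2:].strip()
    if ": " in body:
        body = body.split(": ", 1)[1].strip()
    return body


def find_changed_scalars(text):
    """Return (line_number, value, reason) for scalars whose meaning changes."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        value = _scalar_candidate(line)
        if _YAML11_ONLY_BOOL.fullmatch(value):
            findings.append((number, value, "bool in YAML 1.1, string in YAML 1.2.2"))
        elif _LEADING_ZERO_INT.fullmatch(value):
            findings.append((number, value, "octal in YAML 1.1, decimal in YAML 1.2.2"))
    return findings
```

Quoted values such as `"no"` are left alone, since they are strings under both versions.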
### When should I use parallel processing?
Use `parse_parallel()` for multi-document YAML files (separated by `---`) larger than 1MB with 4+ CPU cores. For single documents, use `safe_load()`.
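
That rule of thumb can be written as a small predicate. `should_use_parallel` is a hypothetical helper, not part of fast-yaml, and it assumes "1MB" means 1 MiB (1,048,576 bytes) and that the caller already knows how many documents the input contains.

```python
import os

ONE_MIB = 1024 * 1024  # assumption: the "1MB" threshold is read as 1 MiB


def should_use_parallel(size_bytes, document_count, cpu_count=None):
    """Apply the guideline: multi-document input, over 1 MiB, on 4+ cores."""
    cores = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return document_count > 1 and size_bytes > ONE_MIB and cores >= 4
```

When the predicate is false, `safe_load()` (or `safe_load_all()` for small multi-document input) remains the simpler choice.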
## License
Licensed under [MIT](LICENSE-MIT) or [Apache-2.0](LICENSE-APACHE) at your option.