{"id":33336665,"url":"https://github.com/s-celles/tokenorientedobjectnotation.jl","last_synced_at":"2025-11-23T07:00:44.150Z","repository":{"id":324566414,"uuid":"1097648799","full_name":"s-celles/TokenOrientedObjectNotation.jl","owner":"s-celles","description":"Token-Oriented Object Notation (TOON) encoder/decoder for Julia","archived":false,"fork":false,"pushed_at":"2025-11-18T19:27:13.000Z","size":684,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-11-18T21:08:30.669Z","etag":null,"topics":["aigenerated","decoder","deserialization","encoder","json","julia-language","serialization","work-in-progress"],"latest_commit_sha":null,"homepage":"https://s-celles.github.io/TokenOrientedObjectNotation.jl/dev/","language":"Julia","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/s-celles.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-11-16T15:26:44.000Z","updated_at":"2025-11-18T19:26:35.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/s-celles/TokenOrientedObjectNotation.jl","commit_stats":null,"previous_names":["s-celles/toon.jl","s-celles/tokenorientedobjectnotation.jl"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/s-celles/TokenOrientedObjectNotation.jl","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/s-celles%2FTokenOrientedObjectNotation.jl","tags_u
rl":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/s-celles%2FTokenOrientedObjectNotation.jl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/s-celles%2FTokenOrientedObjectNotation.jl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/s-celles%2FTokenOrientedObjectNotation.jl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/s-celles","download_url":"https://codeload.github.com/s-celles/TokenOrientedObjectNotation.jl/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/s-celles%2FTokenOrientedObjectNotation.jl/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":285742680,"owners_count":27224048,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-11-22T02:00:05.934Z","response_time":64,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aigenerated","decoder","deserialization","encoder","json","julia-language","serialization","work-in-progress"],"created_at":"2025-11-21T05:00:53.287Z","updated_at":"2025-11-23T07:00:44.144Z","avatar_url":"https://github.com/s-celles.png","language":"Julia","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 
TokenOrientedObjectNotation.jl\n\n[![CI](https://github.com/s-celles/TokenOrientedObjectNotation.jl/workflows/CI/badge.svg)](https://github.com/s-celles/TokenOrientedObjectNotation.jl/actions/workflows/CI.yml)\n[![Documentation](https://github.com/s-celles/TokenOrientedObjectNotation.jl/workflows/Documentation/badge.svg)](https://github.com/s-celles/TokenOrientedObjectNotation.jl/actions/workflows/Documentation.yml)\n[![codecov](https://codecov.io/gh/s-celles/TokenOrientedObjectNotation.jl/branch/main/graph/badge.svg)](https://codecov.io/gh/s-celles/TokenOrientedObjectNotation.jl)\n[![Aqua QA](https://raw.githubusercontent.com/JuliaTesting/Aqua.jl/master/badge.svg)](https://github.com/JuliaTesting/Aqua.jl)\n[![SPEC v2.0](https://img.shields.io/badge/spec-v2.0-lightgrey)](https://github.com/toon-format/spec/blob/main/SPEC.md)\n[![Compliance](https://img.shields.io/badge/compliance-100%25-brightgreen)](./COMPLIANCE_VALIDATION_REPORT.md)\n[![Tests](https://img.shields.io/badge/tests-1750%20passing-brightgreen)](./test/COMPLIANCE_TEST_COVERAGE.md)\n[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](./LICENSE)\n\nJulia implementation of **Token-Oriented Object Notation (TOON)**, a compact, human-readable serialization format optimized for LLM contexts.\n\n## What is TOON?\n\nTOON is a line-oriented, indentation-based text format that encodes the JSON data model with explicit structure and minimal quoting. 
It achieves 30-60% token reduction compared to JSON while maintaining readability and deterministic structure.\n\n**Key Features:**\n- Compact representation of tabular data\n- Minimal quoting requirements\n- Explicit array lengths for validation\n- Support for multiple delimiter types (comma, tab, pipe)\n- Strict mode for validation\n- 100% compatible with JSON data model\n\n## Installation\n\n```julia\nusing Pkg\nPkg.add(url=\"https://github.com/s-celles/TokenOrientedObjectNotation.jl\")\n```\n\nOr in the Julia REPL package mode:\n```julia-repl\npkg\u003e add https://github.com/s-celles/TokenOrientedObjectNotation.jl\n```\n\n## Quick Start\n\n### Encoding\n\n```julia\nusing TokenOrientedObjectNotation\n\n# Simple object\ndata = Dict(\"name\" =\u003e \"Alice\", \"age\" =\u003e 30)\ntoon_str = TokenOrientedObjectNotation.encode(data)\nprintln(toon_str)\n# name: Alice\n# age: 30\n\n# Array of objects (tabular format)\nusers = [\n    Dict(\"id\" =\u003e 1, \"name\" =\u003e \"Alice\", \"role\" =\u003e \"admin\"),\n    Dict(\"id\" =\u003e 2, \"name\" =\u003e \"Bob\", \"role\" =\u003e \"user\")\n]\ntoon_str = TokenOrientedObjectNotation.encode(Dict(\"users\" =\u003e users))\nprintln(toon_str)\n# users[2]{id,name,role}:\n#   1,Alice,admin\n#   2,Bob,user\n```\n\n### Decoding\n\n```julia\nusing TokenOrientedObjectNotation\n\n# Decode a simple object\ninput = \"name: Alice\\nage: 30\"\ndata = TokenOrientedObjectNotation.decode(input)\n# Dict(\"name\" =\u003e \"Alice\", \"age\" =\u003e 30)\n\n# Decode an array\ninput = \"[3]: 1,2,3\"\ndata = TokenOrientedObjectNotation.decode(input)\n# [1, 2, 3]\n```\n\n### Options\n\n```julia\nusing TokenOrientedObjectNotation\n\n# Encoding with custom options\noptions = TokenOrientedObjectNotation.EncodeOptions(\n    indent = 4,                    # Use 4 spaces per indentation level\n    delimiter = TokenOrientedObjectNotation.TAB,          # Use tab as delimiter\n    keyFolding = \"safe\",           # Enable key folding\n    
flattenDepth = 2               # Limit folding depth\n)\n\ndata = Dict(\"user\" =\u003e Dict(\"name\" =\u003e \"Alice\"))\ntoon_str = TokenOrientedObjectNotation.encode(data, options=options)\n\n# Decoding with custom options\noptions = TokenOrientedObjectNotation.DecodeOptions(\n    indent = 4,                    # Expect 4 spaces per level\n    strict = true,                 # Enable strict validation\n    expandPaths = \"safe\"           # Enable path expansion\n)\n\ndata = TokenOrientedObjectNotation.decode(toon_str, options=options)\n```\n\n## Examples\n\n### JSON vs TOON Comparison\n\n**JSON:**\n```json\n{\n  \"users\": [\n    { \"id\": 1, \"name\": \"Alice\", \"role\": \"admin\" },\n    { \"id\": 2, \"name\": \"Bob\", \"role\": \"user\" }\n  ],\n  \"count\": 2\n}\n```\n\n**TOON:**\n```\nusers[2]{id,name,role}:\n  1,Alice,admin\n  2,Bob,user\ncount: 2\n```\n\nToken savings: ~45% reduction\n\n### Complex Nested Structures\n\n```julia\nusing TokenOrientedObjectNotation\n\ndata = Dict(\n    \"server\" =\u003e Dict(\n        \"host\" =\u003e \"localhost\",\n        \"port\" =\u003e 8080,\n        \"tags\" =\u003e [\"web\", \"api\"]\n    ),\n    \"database\" =\u003e Dict(\n        \"type\" =\u003e \"postgresql\",\n        \"connections\" =\u003e 10\n    )\n)\n\nprintln(TokenOrientedObjectNotation.encode(data))\n# server:\n#   host: localhost\n#   port: 8080\n#   tags[2]: web,api\n# database:\n#   type: postgresql\n#   connections: 10\n```\n\n### Key Folding (Compact Nested Objects)\n\n```julia\nusing TokenOrientedObjectNotation\n\n# Deep nesting with key folding\ndata = Dict(\"api\" =\u003e Dict(\"v1\" =\u003e Dict(\"users\" =\u003e Dict(\"endpoint\" =\u003e \"/api/v1/users\"))))\n\n# Without key folding (default)\nprintln(TokenOrientedObjectNotation.encode(data))\n# api:\n#   v1:\n#     users:\n#       endpoint: /api/v1/users\n\n# With key folding\noptions = 
TokenOrientedObjectNotation.EncodeOptions(keyFolding=\"safe\")\nprintln(TokenOrientedObjectNotation.encode(data, options=options))\n# api.v1.users.endpoint: /api/v1/users\n```\n\n### Path Expansion (Round-trip with Key Folding)\n\n```julia\nusing TokenOrientedObjectNotation\n\n# Decode with path expansion\ninput = \"api.v1.users.endpoint: /api/v1/users\"\noptions = TokenOrientedObjectNotation.DecodeOptions(expandPaths=\"safe\")\ndata = TokenOrientedObjectNotation.decode(input, options=options)\n# Dict(\"api\" =\u003e Dict(\"v1\" =\u003e Dict(\"users\" =\u003e Dict(\"endpoint\" =\u003e \"/api/v1/users\"))))\n\n# Round-trip: folding + expansion\nencode_opts = TokenOrientedObjectNotation.EncodeOptions(keyFolding=\"safe\")\ndecode_opts = TokenOrientedObjectNotation.DecodeOptions(expandPaths=\"safe\")\noriginal = Dict(\"a\" =\u003e Dict(\"b\" =\u003e Dict(\"c\" =\u003e 42)))\nencoded = TokenOrientedObjectNotation.encode(original, options=encode_opts)  # \"a.b.c: 42\"\ndecoded = TokenOrientedObjectNotation.decode(encoded, options=decode_opts)   # Reconstructs original structure\n```\n\n### Different Delimiters\n\n```julia\nusing TokenOrientedObjectNotation\n\nusers = [\n    Dict(\"name\" =\u003e \"Alice\", \"role\" =\u003e \"admin\"),\n    Dict(\"name\" =\u003e \"Bob\", \"role\" =\u003e \"user\")\n]\n\n# Comma delimiter (default)\nprintln(TokenOrientedObjectNotation.encode(Dict(\"users\" =\u003e users)))\n# users[2]{name,role}:\n#   Alice,admin\n#   Bob,user\n\n# Tab delimiter\noptions = TokenOrientedObjectNotation.EncodeOptions(delimiter=TokenOrientedObjectNotation.TAB)\nprintln(TokenOrientedObjectNotation.encode(Dict(\"users\" =\u003e users), options=options))\n# users[2\t]{name\trole}:\n#   Alice\tadmin\n#   Bob\tuser\n\n# Pipe delimiter\noptions = TokenOrientedObjectNotation.EncodeOptions(delimiter=TokenOrientedObjectNotation.PIPE)\nprintln(TokenOrientedObjectNotation.encode(Dict(\"users\" =\u003e users), options=options))\n# users[2|]{name|role}:\n#   Alice|admin\n# 
  Bob|user\n```\n\n### Strict Mode Validation\n\n```julia\nusing TokenOrientedObjectNotation\n\n# Strict mode catches errors (default)\ninput = \"[3]: 1,2\"  # Declares 3 items but only has 2\ntry\n    TokenOrientedObjectNotation.decode(input)  # strict=true by default\ncatch e\n    println(e)  # \"Array length mismatch: expected 3, got 2\"\nend\n\n# Non-strict mode is lenient\noptions = TokenOrientedObjectNotation.DecodeOptions(strict=false)\nresult = TokenOrientedObjectNotation.decode(input, options=options)  # [1, 2] - accepts actual count\n```\n\n## API Reference\n\n### Main Functions\n\n#### `encode(value; options::EncodeOptions=EncodeOptions()) -\u003e String`\n\nEncode a Julia value to TOON format string.\n\n**Arguments:**\n- `value`: Any Julia value (will be normalized to JSON model)\n- `options`: Optional encoding configuration\n\n**Returns:** TOON formatted string\n\n#### `decode(input::String; options::DecodeOptions=DecodeOptions()) -\u003e JsonValue`\n\nDecode a TOON format string to a Julia value.\n\n**Arguments:**\n- `input`: TOON formatted string\n- `options`: Optional decoding configuration\n\n**Returns:** Parsed Julia value (Dict, Array, or primitive)\n\n### Types\n\n#### `EncodeOptions`\n\nConfiguration for encoding:\n- `indent::Int = 2`: Number of spaces per indentation level\n- `delimiter::Delimiter = \",\"`: Delimiter for arrays (`,`, `\\t`, or `|`)\n- `keyFolding::String = \"off\"`: Key folding mode (`\"off\"` or `\"safe\"`)\n- `flattenDepth::Int = typemax(Int)`: Maximum folding depth\n\n#### `DecodeOptions`\n\nConfiguration for decoding:\n- `indent::Int = 2`: Expected spaces per indentation level\n- `strict::Bool = true`: Enable strict validation\n- `expandPaths::String = \"off\"`: Path expansion mode (`\"off\"` or `\"safe\"`)\n\n## Specification Compliance\n\n**✅ FULLY COMPLIANT with TOON Specification v2.0**\n\nThis implementation has been validated against all normative requirements in the official [TOON Specification 
v2.0](https://github.com/toon-format/spec/blob/main/SPEC.md) with **1750 passing tests**.\n\n### Core Features\n- ✅ All primitive types (string, number, boolean, null)\n- ✅ Canonical number formatting (no exponents, no trailing zeros)\n- ✅ Objects with nested structures\n- ✅ Primitive arrays (inline format)\n- ✅ Tabular arrays (uniform objects with all delimiters)\n- ✅ Mixed/complex arrays (expanded list format)\n- ✅ Objects as list items with proper depth handling\n- ✅ Root form detection (array, primitive, object)\n\n### String Handling\n- ✅ Five valid escape sequences (\\\\, \\\", \\n, \\r, \\t)\n- ✅ Complete quoting rules (empty, whitespace, reserved literals, numeric-like, special chars)\n- ✅ Delimiter-aware quoting (document vs active delimiter)\n\n### Delimiters and Formatting\n- ✅ Multiple delimiters (comma, tab, pipe)\n- ✅ Proper delimiter scoping (document vs active)\n- ✅ Array header syntax with delimiter symbols\n- ✅ Consistent indentation and whitespace rules\n\n### Validation and Options\n- ✅ Strict mode validation (all §14 error conditions)\n- ✅ Array count and row width validation\n- ✅ Indentation validation (multiples, no tabs)\n- ✅ Configurable encoding/decoding options\n\n### Advanced Features (v2.0)\n- ✅ Key folding (safe mode with depth limits)\n- ✅ Path expansion (safe mode with conflict detection)\n- ✅ Round-trip compatibility between folding and expansion\n\n### Known Limitations\n- Number precision limited to Float64 (~15-17 decimal digits)\n- Very deeply nested structures (100+ levels) may impact performance\n- Julia `Dict` does not preserve insertion order (its iteration order is unspecified), so the key order of encoded output may differ from the order in which keys were written\n\n## Testing\n\nRun the comprehensive test suite (1750 tests):\n\n```julia\nusing Pkg\nPkg.test(\"TokenOrientedObjectNotation\")\n```\n\n### Test Coverage\n\nThe test suite includes:\n- **Requirements Testing:** All 15 normative requirements (900+ tests)\n- **Round-trip Testing:** Encode/decode preservation (69 tests)\n- **Determinism Testing:** Consistent 
output validation (24 tests)\n- **Edge Cases:** Empty values, deep nesting, large arrays (75 tests)\n- **Spec Examples:** All examples from the specification (79 tests)\n- **Error Conditions:** All §14 error scenarios (57 tests)\n- **Integration Tests:** Real-world usage patterns (546 tests)\n\nSee [COMPLIANCE_VALIDATION_REPORT.md](./COMPLIANCE_VALIDATION_REPORT.md) for detailed validation results.\n\n## Performance\n\nTOON achieves significant token reduction compared to JSON:\n\n- **Tabular data:** 40-60% reduction\n- **Nested objects:** 20-40% reduction\n- **Mixed structures:** 30-50% reduction\n\nExample token counts (using GPT-4 tokenizer):\n```julia\n# JSON: 156 tokens\n# TOON: 89 tokens (43% reduction)\nusers = [\n    Dict(\"id\" =\u003e 1, \"name\" =\u003e \"Alice\", \"email\" =\u003e \"alice@example.com\", \"active\" =\u003e true),\n    Dict(\"id\" =\u003e 2, \"name\" =\u003e \"Bob\", \"email\" =\u003e \"bob@example.com\", \"active\" =\u003e false)\n]\n```\n\n## Documentation\n\nComprehensive documentation is available in the `docs/` folder:\n\n- **Getting Started** - Installation and basic usage\n- **User Guide** - Detailed encoding and decoding guide\n- **Examples** - Real-world usage examples\n- **API Reference** - Complete API documentation\n- **Compliance** - Specification compliance details\n\n### Building Documentation\n\n```bash\njulia --project=docs -e 'using Pkg; Pkg.develop(PackageSpec(path=pwd())); Pkg.instantiate()'\njulia --project=docs docs/make.jl\n```\n\nThen open `docs/build/index.html` in your browser.\n\n## Contributing\n\nContributions are welcome! See [CONTRIBUTING.md](docs/src/contributing.md) for guidelines.\n\n### Development\n\n```bash\n# Clone the repository\ngit clone https://github.com/s-celles/TokenOrientedObjectNotation.jl.git\ncd TokenOrientedObjectNotation.jl\n\n# Run tests\njulia --project=. -e 'using Pkg; Pkg.test()'\n\n# Run specific test file\njulia --project=. 
test/test_encoder.jl\n```\n\n## License\n\n[MIT](./LICENSE) License © 2025 TOON Format Organization\n\n## Related Projects\n\n- [Official TOON Specification](https://github.com/toon-format/spec)\n- [TypeScript/JavaScript Implementation](https://github.com/toon-format/toon)\n- [Python Implementation](https://github.com/toon-format/toon-python)\n\n## Links\n\n- **Specification:** [SPEC.md](https://github.com/toon-format/spec/blob/main/SPEC.md)\n- **Test Fixtures:** [Spec test suite](https://github.com/toon-format/spec/tree/main/tests/fixtures)\n- **Benchmarks:** [Token efficiency results](https://github.com/toon-format/toon/tree/main/benchmarks)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fs-celles%2Ftokenorientedobjectnotation.jl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fs-celles%2Ftokenorientedobjectnotation.jl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fs-celles%2Ftokenorientedobjectnotation.jl/lists"}