{"id":47444114,"url":"https://github.com/JuliaCloud/LazyJSON.jl","last_synced_at":"2026-04-06T13:00:59.552Z","repository":{"id":45987507,"uuid":"121088778","full_name":"JuliaCloud/LazyJSON.jl","owner":"JuliaCloud","description":"LazyJSON is an interface for reading JSON data in Julia programs.","archived":false,"fork":false,"pushed_at":"2021-11-22T19:01:37.000Z","size":701,"stargazers_count":38,"open_issues_count":12,"forks_count":8,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-11-18T10:16:29.207Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Julia","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/JuliaCloud.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-02-11T05:33:34.000Z","updated_at":"2025-06-06T08:43:21.000Z","dependencies_parsed_at":"2022-09-17T11:20:49.876Z","dependency_job_id":null,"html_url":"https://github.com/JuliaCloud/LazyJSON.jl","commit_stats":null,"previous_names":[],"tags_count":12,"template":false,"template_full_name":null,"purl":"pkg:github/JuliaCloud/LazyJSON.jl","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JuliaCloud%2FLazyJSON.jl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JuliaCloud%2FLazyJSON.jl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JuliaCloud%2FLazyJSON.jl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JuliaCloud%2FLazyJSON.jl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/JuliaCloud","download_url":"https://codeload.github.com/JuliaCloud/LazyJSON.jl/tar.gz/refs/heads/
master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JuliaCloud%2FLazyJSON.jl/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31473271,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-06T08:36:52.050Z","status":"ssl_error","status_checked_at":"2026-04-06T08:36:51.267Z","response_time":112,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-03-23T06:00:59.902Z","updated_at":"2026-04-06T13:00:59.541Z","avatar_url":"https://github.com/JuliaCloud.png","language":"Julia","funding_links":[],"categories":["Data Forensics and Analysis"],"sub_categories":["Data Parsing"],"readme":"# LazyJSON.jl\n\n[![Build Status](https://travis-ci.org/JuliaCloud/LazyJSON.jl.svg)](https://travis-ci.org/JuliaCloud/LazyJSON.jl)\n\n\nLazyJSON is an interface for reading JSON data in Julia programs.\n\nIf you find it useful, or not, please report your experience in the [discourse thread](https://discourse.julialang.org/t/announce-a-different-way-to-read-json-data-lazyjson-jl/9046).\n\nLazyJSON provides direct access to values stored in a JSON text through standard Julia\ninterfaces: `Number`, `AbstractString`, `AbstractVector` and `AbstractDict`.\n\nThe function `LazyJSON.value` constructs an object representing the value(s) of a JSON text.\n\n```julia\nLazyJSON.value(jsontext::AbstractString) -\u003e Union{Bool,\n            
                                      LazyJSON.Number,\n                                                  LazyJSON.String,\n                                                  LazyJSON.Array,\n                                                  LazyJSON.Object,\n                                                  Nothing}\nLazyJSON.Number \u003c: Base.Number\nLazyJSON.String \u003c: AbstractString\nLazyJSON.Array  \u003c: AbstractVector{Any}\nLazyJSON.Object \u003c: AbstractDict{AbstractString,Any}\n```\n\ne.g.\n```julia\njulia\u003e j = LazyJSON.value(\"\"\"{\n           \"foo\": [1, 2, 3, \"four\"],\n           \"bar\": null\n       }\"\"\")\nLazyJSON.Object with 2 entries:\n  \"foo\" =\u003e Any[1, 2, 3, \"four\"]\n  \"bar\" =\u003e nothing\n\njulia\u003e j[\"foo\"]\n4-element LazyJSON.Array:\n 1\n 2\n 3\n  \"four\"\n\njulia\u003e j[\"foo\"][4]\n\"four\"\n\njulia\u003e typeof(j[\"bar\"])\nNothing\n```\n\nThe fields of JSON objects can also be accessed using `'.'` (`getproperty`)\nsyntax.\n\ne.g.\n```julia\njulia\u003e j = LazyJSON.value(\"\"\"{\n           \"foo\": [1, 2, 3, \"four\"],\n           \"bar\": null\n       }\"\"\"; getproperty=true)\njulia\u003e j.foo\n4-element LazyJSON.Array:\n 1\n 2\n 3\n  \"four\"\n```\n\nJSON Objects can be converted to `struct` types.\n\ne.g.\n```julia\njulia\u003e struct Point\n           x::Int\n           y::Int\n       end\n\njulia\u003e struct Line\n           a::Point\n           b::Point\n       end\n\njulia\u003e struct Arrow\n           label::String\n           segments::Vector{Line}\n           dashed::Bool\n       end\n\njulia\u003e convert(Arrow, LazyJSON.value(\"\"\"{\n           \"label\": \"Hello\",\n           \"segments\": [\n                {\"a\": {\"x\": 1, \"y\": 1}, \"b\": {\"x\": 2, \"y\": 2}},\n                {\"a\": {\"x\": 2, \"y\": 2}, \"b\": {\"x\": 3, \"y\": 3}}\n            ],\n            \"dashed\": false\n       }\"\"\"))\nArrow(\"Hello\", Line[Line(Point(1, 1), Point(2, 2)), Line(Point(2, 2), 
Point(3, 3))], false)\n```\n\n\n\n_For compatibility with other JSON interfaces that have a `parse` function,\n`LazyJSON.parse` is provided as an alias for `LazyJSON.value`._\n\ne.g.\n```julia\njulia\u003e j = LazyJSON.parse(\"\"\"{\n           \"foo\": [1, 2, 3, \"four\"],\n           \"bar\": null\n       }\"\"\")\n\njulia\u003e j[\"foo\"][4]\n\"four\"\n```\n\n# Laziness\n\nLazyJSON is lazy in the sense that it assumes that its input is well-formed JSON\ntext. It does not try to detect every type of JSON syntax error. If security is\na concern, JSON data of unknown provenance should probably be validated before\nuse.\n\nLazyJSON is also lazy in the sense that it does not process any part of the JSON\ntext until values are requested through the `AbstractVector` and `AbstractDict`\ninterfaces.\n\ni.e. `j = LazyJSON.value(jsontext)` does no parsing and immediately\nreturns a thin wrapper object.\n\n`j[\"foo\"]` calls `get(::AbstractDict, \"foo\")`, which parses just enough to find\nthe `\"foo\"` field.\n\n`j[\"foo\"][4]` calls `getindex(::AbstractArray, 4)`, which continues parsing up to\nthe fourth item in the array.\n\nThis results in much less memory allocation compared to non-lazy parsers:\n\nJSON.jl:\n```julia\nj = String(read(\"ec2-2016-11-15.normal.json\"))\njulia\u003e function f(json)\n           v = JSON.parse(json)\n           v[\"shapes\"][\"scope\"][\"enum\"][1]\n       end\n\njulia\u003e @time f(j)\n  0.066773 seconds (66.43 k allocations: 7.087 MiB)\n\"Availability Zone\"\n```\n\nLazyJSON.jl:\n```julia\njulia\u003e function f(json)\n           v = LazyJSON.parse(json)\n           v[\"shapes\"][\"scope\"][\"enum\"][1]\n       end\n\njulia\u003e @time f(j)\n  0.001392 seconds (12 allocations: 384 bytes)\n\"Availability Zone\"\n```\n\nLazyJSON's `AbstractString` and `Number` implementations are lazy too.\n\nThe text of a `LazyJSON.Number` is not parsed to `Int64` or `Float64` form\nuntil it is needed for a numeric operation. 
If the number is only used in a\ntextual context, it need never be parsed at all. e.g.\n\n```julia\nj = LazyJSON.value(jsontext)\nhtml = \"\"\"\u003cimg width=$(j[\"width\"]) height=$(j[\"height\"])\u003e\"\"\"\n```\n\nLikewise, the content of a `LazyJSON.String` is not interpreted until it is\naccessed. If a `LazyJSON.String` containing complex UTF-16 escape sequences is\ncompared to a UTF-8 `Base.String`, and the two strings differ in the first\nfew characters, then the comparison will terminate before any unescaping\nwork needs to be done.\n\n\n\n# LazyJSON Performance Considerations\n\n## LazyJSON.Array Performance\n\nThe `LazyJSON.Array` does not keep track of the indices of its items.\nEvery `array[i]` access scans the values in the array from the start until it reaches\nthe `i`th value. This is fast if you only need to access a single item,\neven near the end of the array, because the alternative of transforming the\n`LazyJSON.Array` into a `Base.Array` must scan the entire array and allocate\nnew memory for each item. It is also fast to access multiple items near the\nstart of the array. However, if you need random access to many items in a large\narray, it is better to convert it to a `Base.Array`.\n\ne.g.\n```julia\nv = LazyJSON.value(jsontext)[\"foo\"][\"bar\"][\"an_array\"]\nv = convert(Vector{Any}, v)\n```\n\nIf you need to access the items in the array sequentially, the iteration\ninterface is very efficient, but incrementing an index is very inefficient.\n`length(::LazyJSON.Array)` is also inefficient, in that it must scan the whole\narray.\n\ne.g.\n```julia\nv = LazyJSON.value(jsontext)[\"foo\"][\"bar\"][\"an_array\"]\nfor i in v  # ✅\n    println(i)\nend\n\nr = map(i -\u003e f(i), v)  # ✅\n\ni = 1\nwhile i \u003c= length(v)  # ❌ `length` scans the whole array\n    println(v[i])  # ❌ each `v[i]` rescans from the start\n    i += 1\nend\n```\n\n\n## LazyJSON.Object Performance\n\nThe performance considerations for `LazyJSON.Object` are similar to those\ndescribed above for `LazyJSON.Array`. The `LazyJSON.Object` does not keep a\nhash table of keys. 
Every `object[\"key\"]` access scans the keys in the\nobject until it finds a match. Accessing a key in an object with a small\nnumber of keys is efficient. Accessing a few keys in an object with many keys\nis efficient. However, if you need random access to many keys in a large object,\nit is better to convert it to a `Base.Dict`.\n\ne.g.\n```julia\nv = LazyJSON.value(jsontext)[\"foo\"][\"bar\"][\"an_object_with_many_keys\"]\nv = convert(Dict, v)\n```\n\n`length(::LazyJSON.Object)` is inefficient, in that it must scan the whole\nobject.\nIf you need to access the key-value pairs sequentially, the iteration\ninterface is very efficient.\n\ne.g.\n\n```julia\no = LazyJSON.value(jsontext)[\"foo\"][\"bar\"][\"an_object_with_many_keys\"]\nfor (k, v) in o  # ✅\n    println(k, v)\nend\n\nr = filter(((k, v),) -\u003e occursin(r\"\\.jpg$\", k), o)  # ✅\n\nfor k in long_list_of_keys\n    println(o[k])  # ❌ each lookup rescans the keys\nend\n\nd = convert(Dict, o)\nfor k in long_list_of_keys\n    println(d[k])  # ✅\nend\n```\n\n\n## LazyJSON.Number Performance\n\nWhenever a `LazyJSON.Number` is used in a numeric operation it must be parsed\nfrom its string form into an `Int` or a `Float64`. If you are only using each\nnumeric value once, there is no performance penalty, as the string is only\nparsed once. 
However, if you need to use the numeric value many times, it is\nbetter to convert it to a normal `Base` number type.\n\ne.g.\n```julia\ni = LazyJSON.value(jsontext)[\"foo\"]\nx = origin.x + i[\"width\"]   # ✅ used once in an addition operation\ny = origin.y + i[\"height\"]  # ✅\ndraw(i[\"data\"], x, y)\n\n\nlimit = LazyJSON.value(jsontext)[\"foo\"][\"limit\"]\ni = 0\nwhile i \u003c limit  # ❌ re-parsed every time the less-than operation is evaluated\n    i += 1\n    # ...\nend\nlimit = convert(Int, LazyJSON.value(jsontext)[\"foo\"][\"limit\"])  # ✅\n\n\nv = LazyJSON.value(jsontext)[\"foo\"][\"amounts\"]\ntotal = sum(v)  # ✅ iteration is efficient, each number is parsed once.\n\n\nstruct Foo\n    x::Int\n    y::Int\nend\ni = LazyJSON.value(jsontext)[\"foo\"]\nFoo(i[\"x\"], i[\"y\"])  # ✅ converted to `Int` on assignment to struct fields.\n\n\nv = LazyJSON.value(jsontext)[\"foo\"][\"values\"]\nints = convert(Vector{Int}, v)  # ✅ manual conversion when needed\n```\n\n\n# Implementation\n\nValues are represented by a reference to the JSON text `String`\nand the byte index of the value text. The `LazyJSON.value(jsontext)` function\nsimply returns a `LazyJSON.Value` object with `s = jsontext` and `i = 1`.\n\n```\n    String: {\"foo\": 1,    \"bar\": [1, 2, 3, \"four\"]}\n            ▲                    ▲      ▲  ▲\n            │                    │      │  │\n            ├─────────────────┐  │      │  │\n            │ LazyJSON.Array( s, i=9)   │  │   == Any[1, 2, 3, \"four\"]\n            │                           │  │\n            ├─────────────────┐  ┌──────┘  │\n            │ LazyJSON.Number(s, i=16)     │   == 3\n            │                              │\n            ├─────────────────┐  ┌─────────┘\n            │ LazyJSON.String(s, i=19)         == \"four\"\n            │\n            └─────────────────┬──┐\n              LazyJSON.Object(s, i=1)\n```\n\nLazyJSON does not parse and translate values into concrete Julia `Number`,\n`String`, `Array` or `Dict` objects. 
Instead, it provides interface methods that\nconform to the protocols of `Base.Number`, `AbstractString`, `AbstractVector`\nand `AbstractDict`.  These methods interpret the JSON text on the fly and parse\nonly as much as is needed to return the requested values.\n\n\n\n# Large JSON Texts\n\nLazyJSON can process JSON files that are too big to fit in available RAM\nby using the `mmap` interface.\n\ne.g.\n```julia\nusing Mmap\nf = open(\"huge_file_that_wont_fit_in_ram.json\", \"r\")\ns = String(Mmap.mmap(f))\nj = LazyJSON.value(s)\nv = j[\"foo\"][\"bar\"]\n```\nThe operating system will lazily load enough chunks of the file into RAM to\nreach field `\"bar\"` of object `\"foo\"`.\n\n\n\n# Benchmarks\n\nFor some workloads, laziness makes LazyJSON faster and less memory intensive\nthan JSON parsers that parse the entire JSON text and allocate a tree of\ncollection and value objects.\n\nThe `test/benchmark.jl` test uses a [1 MB AWS API definition JSON file](https://github.com/samoconnor/jsonhack/blob/master/test/ec2-2016-11-15.normal.json)\nto compare performance vs JSON.jl.  
When accessing a value close to the\nstart of the file, the lazy parser is orders of magnitude faster than JSON.jl;\nfor values near the end of the file, the lazy parser is about 6 times faster.\n(Each test case is run once for JIT warmup, then 190 times for measurement.)\n\n```\nJulia Version 0.7.0-DEV.3761\nJSON.jl master Tue Feb 6, 98727675b635c8428effa30a2287a9fe6370e664\n\nAccess value close to start:\nLazyJSON.jl:  0.000568 seconds (3.42 k allocations: 139.531 KiB)\nJSON.jl:      6.410700 seconds (13.28 M allocations: 1.337 GiB, 3.17% gc time)\n\n\nAccess 2 values close to end:\nLazyJSON.jl:  0.177059 seconds (7.79 k allocations: 347.344 KiB)\nJSON.jl:      6.417241 seconds (13.28 M allocations: 1.337 GiB, 3.18% gc time)\n```\n_Note, until recently JSON.jl was taking ~1 second for the tests above.\nIt seems that it may be hampered by the deprecation of `IOBuffer(maxsize::Integer)`._\n\n\nThe `test/benchmark_geo.jl` test uses a 1.2 MB GeoJSON file\nto compare performance vs JSON.jl. The first test extracts a country name\nnear the middle of the file. 
The second test checks that the country outline\npolygon is at the expected coordinates.\n\n```\nCountry name\nLazyJSON.jl:  0.004762 seconds (190 allocations: 5.938 KiB)\nJSON.jl:      1.063652 seconds (8.62 M allocations: 373.471 MiB, 11.19% gc time)\n\nMap data\nLazyJSON.jl:  0.011075 seconds (27.30 k allocations: 679.547 KiB)\nJSON.jl:      1.064750 seconds (8.62 M allocations: 373.541 MiB, 10.75% gc time)\n```\n\n\n# TODO:\n - The new, lazier parser loses some format validation; consider recovering the old\n   validation code from `src/OldLazyJSON.jl`\n\n\n# References\n\n - Another lazy JSON parser: https://github.com/doubledutch/LazyJSON\n - RFC 7159: https://tools.ietf.org/html/rfc7159\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FJuliaCloud%2FLazyJSON.jl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FJuliaCloud%2FLazyJSON.jl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FJuliaCloud%2FLazyJSON.jl/lists"}