https://github.com/tonsky/fast-edn
Drop-in replacement for clojure.edn that is 6 times faster
- Host: GitHub
- URL: https://github.com/tonsky/fast-edn
- Owner: tonsky
- License: mit
- Created: 2024-11-07T18:48:29.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2024-12-30T16:13:43.000Z (11 days ago)
- Last Synced: 2024-12-30T17:21:41.934Z (11 days ago)
- Topics: clojure, edn, parser
- Language: Clojure
- Size: 356 KB
- Stars: 78
- Watchers: 4
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Funding: .github/FUNDING.yml
- License: LICENSE
# Fast EDN parser
> EDN format is very similar to JSON, thus it should parse as fast as JSON.
Fast EDN is a drop-in replacement for `clojure.edn/read-string` that is roughly 6 times faster:
| Test file | clojure.edn | fast-edn.core | speed up, times |
| :--- | ---: | ---: | ---: |
| basic_10 | 0.504 | 0.290 | × 1.7 |
| basic_100 | 3.040 | 0.594 | × 5.1 |
| basic_1000 | 19.495 | 2.815 | × 6.9 |
| basic_10000 | 221.773 | 37.560 | × 5.9 |
| basic_100000 | 2138.255 | 370.045 | × 5.8 |
| ints_1400 | 431.432 | 33.164 | × 13.0 |
| keywords_10 | 3.961 | 0.625 | × 6.3 |
| keywords_100 | 34.980 | 4.769 | × 7.3 |
| keywords_1000 | 369.404 | 53.943 | × 6.8 |
| keywords_10000 | 4168.732 | 662.099 | × 6.3 |
| nested_100000 | 2585.372 | 503.644 | × 5.1 |
| strings_1000 | 651.043 | 40.455 | × 16.1 |
| strings_uni_250 | 641.900 | 108.341 | × 5.9 |

Fast EDN achieves JSON parsing speeds (JSON + keywordize keys vs EDN of the same size):
| Test file | cheshire | jsonista | charred | fast-edn |
| :--- | ---: | ---: | ---: | ---: |
| basic_10 | 0.588 | 0.137 | 0.328 | 0.290 |
| basic_100 | 1.043 | 0.594 | 0.721 | 0.594 |
| basic_1000 | 4.224 | 2.999 | 3.016 | 2.815 |
| basic_10000 | 37.793 | 34.374 | 32.623 | 37.560 |
| basic_100000 | 359.558 | 327.997 | 313.280 | 370.045 |

The speed of EDN parsing makes Transit obsolete on the JVM:
| Test file | clojure.edn | transit+msgpack | transit+json | fast-edn |
| :--- | ---: | ---: | ---: | ---: |
| basic_10 | 0.481 | 2.832 | 1.474 | 0.290 |
| basic_100 | 2.799 | 4.242 | 2.297 | 0.594 |
| basic_1000 | 17.548 | 14.738 | 6.583 | 2.815 |
| basic_10000 | 211.536 | 125.741 | 46.849 | 37.560 |
| basic_100000 | 2016.885 | 1167.972 | 447.013 | 370.045 |

All execution times above are in µs, measured on an M1 Pro with 16 GB RAM, single thread, JDK Zulu23.30+13-CA.
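For a rough sanity check of a single number without the benchmark scripts, a timing loop along these lines can be used. This is only a sketch: `avg-micros` and `sample` are hypothetical names introduced here, the input is synthetic rather than one of the test files above, and it times `clojure.edn` only. A proper harness such as criterium gives far more reliable numbers.

```clojure
(require '[clojure.edn :as cedn])

(defn avg-micros
  "Hypothetical helper: average wall-clock time of (f) in microseconds
   over n runs, after n warm-up runs so the JIT has a chance to compile f."
  [f n]
  (dotimes [_ n] (f))                          ; warm-up, result discarded
  (let [start (System/nanoTime)]
    (dotimes [_ n] (f))                        ; measured runs
    (/ (- (System/nanoTime) start) 1000.0 n))) ; ns -> µs, then per run

;; Synthetic input (an assumption, not one of the benchmark files):
;; a vector of 100 small maps printed as EDN.
(def sample
  (pr-str (vec (for [i (range 100)]
                 {:id i :name (str "item-" i)}))))

;; Average time, in µs, to parse `sample` with clojure.edn
(avg-micros #(cedn/read-string sample) 10000)
```

To time another parser, pass a function that calls it on the same string.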
To run benchmarks yourself:
```sh
./script/bench_json.sh
./script/bench_edn.sh
./script/bench_transit.sh
```

## Other benefits
Fast EDN has more consistent error reporting. Clojure:
```clojure
(clojure.edn/read-string "1a")
; => NumberFormatException: Invalid number: 1a

(clojure.edn/read-string "{:a 1 :b")
; => RuntimeException: EOF while reading

(clojure.edn/read-string "\"{:a 1 :b")
; => RuntimeException: EOF while reading string

(clojure.edn/read-string "\"\\u123\"")
; => IllegalArgumentException: Invalid character length: 3, should be: 4
```

Fast EDN includes location information in exceptions:
```clojure
(fast-edn.core/read-string "1a")
; => NumberFormatException: For input string: "1a", offset: 2, context:
; 1a
; ^

(fast-edn.core/read-string "{:a 1 :b")
; => RuntimeException: Map literal must contain an even number of forms: {:a 1, :b, offset: 8, context:
; {:a 1 :b
; ^

(fast-edn.core/read-string "\"{:a 1 :b")
; => RuntimeException: EOF while reading string: "{:a 1 :b, offset: 9, context:
; "{:a 1 :b
; ^

(fast-edn.core/read-string "\"\\u123\"")
; => RuntimeException: Unexpected digit: ", offset: 7, context:
; "\u123"
; ^
```

Optionally, you can include line number/column information at the cost of a little performance:
```clojure
(fast-edn.core/read-string {:count-lines true} "\"abc")
; => RuntimeException: EOF while reading string: "abc, line: 1, column: 5, offset: 4, context:
; "abc
; ^
```

## Using
Add this to `deps.edn`:
```clojure
io.github.tonsky/fast-edn {:mvn/version "1.1.0"}
```

`read-string` works exactly the same as in `clojure.edn`:
```clojure
(require '[fast-edn.core :as edn])

;; Read from string
(edn/read-string "{:a 1}")

;; Options go first, followed by the string to read
(edn/read-string
  {:eof     ::eof
   :readers {'inst #(edn/parse-timestamp edn/construct-instant %)}
   :default (fn [tag value]
              (clojure.core/tagged-literal tag value))}
  "{:a 1}")
```

Besides `read-string`, there is `fast-edn.core/read-once`, which reads from a `Reader`, `InputStream`, `File`, `byte[]`, `char[]` or `String`:
```clojure
(edn/read-once (io/file "data.edn"))
```

Note that `read-once` closes the Reader/InputStream you pass to it, so it’s not a direct analogue of `clojure.edn/read`.
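To make the "read once, then close" behaviour concrete, here is a sketch of the same semantics built on top of `clojure.edn`. It is not Fast EDN's implementation: `read-once-like` is a hypothetical helper introduced only for illustration, and it relies on `clojure.java.io/reader` to wrap the source.

```clojure
(require '[clojure.edn :as cedn]
         '[clojure.java.io :as io])

(defn read-once-like
  "Hypothetical helper, NOT Fast EDN's implementation: reads the first EDN
   form from source with clojure.edn, then closes the underlying reader."
  [source]
  (with-open [r (java.io.PushbackReader. (io/reader source))]
    (cedn/read r)))

(let [r (java.io.StringReader. "{:a 1} {:b 2}")]
  (read-once-like r)   ; returns {:a 1}; the second form is never reached
  (try
    (.read r)          ; r was closed by with-open, so this throws
    (catch java.io.IOException e
      (.getMessage e))))
```

Plain `clojure.edn/read`, by contrast, leaves the reader open and positioned after the form it just read.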
Consuming multiple sequential objects from the same Reader/InputStream is possible but looks slightly different. In Clojure:
```clojure
(let [r (java.io.PushbackReader. reader)]
  (take-while #(not= ::eof %)
    (repeatedly #(clojure.edn/read {:eof ::eof} r))))
```

In Fast EDN:
```clojure
(let [p (fast-edn.core/parser {:eof ::eof} reader)]
  (take-while #(not= ::eof %)
    (repeatedly #(fast-edn.core/read-next p))))
```

## Compatibility
Fast EDN is 100% compatible with clojure.edn. It will read everything that clojure.edn would.
Most cases that clojure.edn rejects, Fast EDN will reject too. There are some minor exceptions though: Fast EDN is a tiny bit more permissive than clojure.edn. We tried to follow intent and just simplify/streamline edge cases where it made sense.
In Fast EDN, the `N` suffix works with radix notation, and ratios can be specified with arbitrary integers:
```clojure
(clojure.edn/read-string "2r1111N")
; => NumberFormatException: For input string: "1111N" under radix 2

(fast-edn.core/read-string "2r1111N")
; => 15N

(clojure.edn/read-string "0xFF/0x02")
; => NumberFormatException: Invalid number: 0xFF/0x02

(fast-edn.core/read-string "0xFF/0x02")
; => 255/2
```

Symbols and keywords can contain slashes anywhere; the first slash is the namespace separator. Clojure allows them _almost_ anywhere, but the rules for when it doesn’t are _weird_:
```clojure
(clojure.edn/read-string ":ns/sym/")
; => RuntimeException: Invalid token: :ns/sym/

(fast-edn.core/read-string ":ns/sym/")
; => :ns/sym/
```

The same goes for keywords whose name starts with a digit. Clojure allows `:1a` but not `:ns/1a`, which seems like an oversight rather than a deliberate design decision:
```clojure
(clojure.edn/read-string ":ns/1a")
; => RuntimeException: Invalid token: :ns/1a

(fast-edn.core/read-string ":ns/1a")
; => :ns/1a
```

We also support vectors in metadata, since Clojure supports them and the EDN parser was probably just not updated in time:
```clojure
(clojure.edn/read-string "^[tag] {}")
; => IllegalArgumentException: Metadata must be Symbol,Keyword,String or Map

(fast-edn.core/read-string "^[tag] {}")
; => {:param-tags ['tag]} {}
```

According to [github.com/edn-format/edn](https://github.com/edn-format/edn), metadata should not be handled by EDN at all, but `clojure.edn` supports it and so do we.
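For reference, this is how the metadata forms that `clojure.edn` already accepts (keyword, map and symbol) end up on the value that was read. The sketch uses `clojure.edn` only; the `cedn` alias and the sample inputs are chosen here purely for illustration.

```clojure
(require '[clojure.edn :as cedn])

;; Keyword metadata expands to {<keyword> true}
(meta (cedn/read-string "^:private {}"))
;; => {:private true}

;; Map metadata is attached as-is
(meta (cedn/read-string "^{:doc \"config\"} [1 2 3]"))
;; => {:doc "config"}

;; Symbol metadata is stored under :tag
(meta (cedn/read-string "^String x"))
;; => {:tag String}
```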
## Test coverage
Fast EDN is extensively tested by the test suite from clojure.core, by our own generative test suite and by a set of hand-crafted test cases.
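The project's generative suite is not reproduced here. As a rough idea of what such a check can look like, the sketch below generates random EDN values, prints them, and verifies that a parser under test agrees with `clojure.edn` on every printed string. `rand-edn` and `agrees-with-clojure-edn?` are hypothetical names, and the generator is hand-rolled and far less thorough than a real generative test.

```clojure
(require '[clojure.edn :as cedn])

(defn rand-edn
  "Hypothetical generator: a random EDN value nested at most `depth` levels."
  [depth]
  (let [scalars [#(rand-int 1000000)
                 #(keyword (str "k" (rand-int 100)))
                 #(str "s" (rand-int 100))
                 #(rand-nth [true false nil])]
        colls   [#(vec (repeatedly (rand-int 4)
                                   (fn [] (rand-edn (dec depth)))))
                 #(into {} (repeatedly (rand-int 4)
                                       (fn [] [(keyword (str "k" (rand-int 100)))
                                               (rand-edn (dec depth))])))]]
    ;; Pick one generator thunk and call it; collections only while depth allows
    ((rand-nth (if (pos? depth) (into scalars colls) scalars)))))

(defn agrees-with-clojure-edn?
  "True if `parse` returns the same value as clojure.edn/read-string
   on n randomly generated EDN strings."
  [parse n]
  (every? (fn [s] (= (cedn/read-string s) (parse s)))
          (repeatedly n #(pr-str (rand-edn 3)))))

;; Trivially true for clojure.edn itself; pass the parser under test instead,
;; e.g. fast-edn.core/read-string when the library is on the classpath.
(agrees-with-clojure-edn? cedn/read-string 200)
```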
To run tests yourself:
```sh
./script/test.sh
```

## What’s the secret?
Fast EDN achieves its speed mainly by avoiding two things clojure.edn does:
- reading from Reader one char at a time,
- using regexps.

## Appreciation
- [charred](https://github.com/cnuernber/charred) for the starting point
- [clj-async-profiler](https://github.com/clojure-goes-fast/clj-async-profiler) and [criterium](https://github.com/hugoduncan/criterium/) for providing the tools

## License
Copyright © 2024 Nikita Prokopov
Licensed under [MIT](LICENSE).