https://github.com/pitr/jsontokenizer
Zero-allocation JSON tokenizer
- Host: GitHub
- URL: https://github.com/pitr/jsontokenizer
- Owner: pitr
- License: mit
- Created: 2021-06-21T00:01:05.000Z (almost 5 years ago)
- Default Branch: main
- Last Pushed: 2023-12-31T16:00:42.000Z (over 2 years ago)
- Last Synced: 2025-03-30T05:45:04.380Z (about 1 year ago)
- Topics: go, golang, json, tokenizer
- Language: Go
- Homepage:
- Size: 69.3 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
[Go Reference](https://pkg.go.dev/pitr.ca/jsontokenizer)
# JSON Tokenizer
Zero-allocation JSON tokenizer.
## Features
- Fast. Roughly 10x to 17x faster than `encoding/json.Decoder`, depending on buffer size (about 13x at the default size of 64). See benchmarks below.
- Similar API to `encoding/json.Decoder`.
- No reflection.
- No allocations, beyond a small buffer for reading.
- Can be reused with a call to `Reset`.
## Anti-Features
- Does **NOT** parse JSON. It does not verify that the token stream is structurally valid: `[}` will produce 2 tokens without errors.
- Needs an `io.Writer` to write numbers and strings into. Depending on the use case, this can be `os.Stdout`, `bytes.Buffer`, [ByteBuffer](https://github.com/valyala/bytebufferpool), etc.
- Does not unescape strings. `"he is 5'11\"."` will be written out exactly as it appears in the input.
- Does not parse numbers into floats/ints. Use `strconv.Atoi()` or `strconv.ParseFloat()` if needed.
- Not safe for concurrent use. Guard it with a `sync.Mutex` or the like to prevent simultaneous calls.
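
Because the tokenizer accepts input such as `[}`, callers that need basic structural checks have to add them on top. The following is a minimal sketch of one way to do that with a bracket stack. The `Tok` type and its constants are hypothetical stand-ins defined locally (they only mirror the token names used in the Quick Start), and the check covers bracket nesting only, not the full JSON grammar.

```go
package main

import (
	"errors"
	"fmt"
)

// Tok is a hypothetical stand-in for the tokenizer's token type.
// Only the four bracket tokens matter for a nesting check.
type Tok int

const (
	TokOther Tok = iota // any non-bracket token
	TokArrayOpen
	TokArrayClose
	TokObjectOpen
	TokObjectClose
)

var errMismatch = errors.New("mismatched bracket")

// nesting tracks currently open brackets on a stack.
type nesting struct {
	stack []Tok
}

// observe updates the stack for one token and returns an error when a
// closing bracket does not match the most recent opening bracket.
func (n *nesting) observe(t Tok) error {
	switch t {
	case TokArrayOpen, TokObjectOpen:
		n.stack = append(n.stack, t)
	case TokArrayClose, TokObjectClose:
		want := TokArrayOpen
		if t == TokObjectClose {
			want = TokObjectOpen
		}
		if len(n.stack) == 0 || n.stack[len(n.stack)-1] != want {
			return errMismatch
		}
		n.stack = n.stack[:len(n.stack)-1]
	}
	return nil
}

// balanced reports whether every opened bracket has been closed.
func (n *nesting) balanced() bool { return len(n.stack) == 0 }

func main() {
	var n nesting
	// Token sequence for the input "[}".
	for _, t := range []Tok{TokArrayOpen, TokObjectClose} {
		if err := n.observe(t); err != nil {
			fmt.Println("error:", err)
			return
		}
	}
	fmt.Println("balanced:", n.balanced())
}
```

In real use, `observe` would be called with each token returned by the tokenizer, and `balanced` checked at `io.EOF`. Growing the stack allocates, so a caller that wants to stay allocation-free could preallocate it with a fixed capacity and reuse it between documents.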
## Quick Start
```go
import (
	"io"

	json "pitr.ca/jsontokenizer"
)

func example(in io.Reader) error {
	tk := json.New(in)
	for {
		tok, err := tk.Token()
		if err == io.EOF {
			// End of input: all tokens have been consumed.
			return nil
		}
		if err != nil {
			return err
		}
		switch tok {
		case json.TokNull:
			println("got null")
		case json.TokTrue, json.TokFalse:
			println("got bool")
		case json.TokArrayOpen, json.TokArrayClose, json.TokObjectOpen, json.TokObjectClose, json.TokObjectColon, json.TokComma:
			println("got delimiter")
		case json.TokNumber:
			println("got number")
			// The number's bytes go to the given io.Writer; here they are discarded.
			_, err := tk.ReadNumber(io.Discard)
			if err != nil {
				return err
			}
		case json.TokString:
			println("got string")
			// The string's bytes go to the given io.Writer; here they are discarded.
			_, err := tk.ReadString(io.Discard)
			if err != nil {
				return err
			}
		}
	}
}
```
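
The Quick Start discards number and string bytes. When the values are needed, they can be written to a `bytes.Buffer` instead and converted afterwards. The sketch below uses only the standard library and shows that conversion step on bytes that are assumed to have been captured already. The helper names `parseNumber` and `unescapeString` are hypothetical, and since it is not stated here whether `ReadString` includes the surrounding quotes, `unescapeString` accepts both forms.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strconv"
)

// parseNumber converts raw JSON number bytes into a float64.
func parseNumber(raw []byte) (float64, error) {
	return strconv.ParseFloat(string(raw), 64)
}

// unescapeString decodes JSON escape sequences in raw string bytes.
// Unquoted string contents cannot start with a bare '"', so a leading
// quote is taken to mean the input is already quoted.
func unescapeString(raw []byte) (string, error) {
	quoted := raw
	if len(raw) == 0 || raw[0] != '"' {
		quoted = make([]byte, 0, len(raw)+2)
		quoted = append(quoted, '"')
		quoted = append(quoted, raw...)
		quoted = append(quoted, '"')
	}
	var s string
	err := json.Unmarshal(quoted, &s)
	return s, err
}

func main() {
	// buf stands in for the io.Writer passed to ReadNumber or ReadString.
	var buf bytes.Buffer

	buf.WriteString("1.5e3")
	n, err := parseNumber(buf.Bytes())
	fmt.Println(n, err)

	buf.Reset() // reuse the same buffer for the next value
	buf.WriteString(`he is 5'11\".`)
	s, err := unescapeString(buf.Bytes())
	fmt.Println(s, err)
}
```

Both helpers allocate, so they are best applied only to the values a program actually needs rather than to every token.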
## Benchmarks
Sizes are read-buffer sizes, which can be set with `NewWithSize`; the default is 64. The tokenizer is reused between benchmark iterations, but this does not affect performance.
`BenchmarkBuiltinDecoder` is `encoding/json.Decoder`.
```
BenchmarkTokenizer/size=8-8        1419     788208 ns/op          0 B/op         0 allocs/op
BenchmarkTokenizer/size=16-8       1668     688656 ns/op          0 B/op         0 allocs/op
BenchmarkTokenizer/size=32-8       1792     628601 ns/op          0 B/op         0 allocs/op
BenchmarkTokenizer/size=64-8       2040     571411 ns/op          0 B/op         0 allocs/op
BenchmarkTokenizer/size=128-8      2228     520646 ns/op          0 B/op         0 allocs/op
BenchmarkTokenizer/size=256-8      2392     482151 ns/op          0 B/op         0 allocs/op
BenchmarkTokenizer/size=512-8      2516     460283 ns/op          0 B/op         0 allocs/op
BenchmarkTokenizer/size=1024-8     2553     458148 ns/op          0 B/op         0 allocs/op
BenchmarkTokenizer/size=2048-8     2618     451937 ns/op          0 B/op         0 allocs/op
BenchmarkTokenizer/size=4096-8     2499     451601 ns/op          0 B/op         0 allocs/op
BenchmarkTokenizer/size=8192-8     2610     443493 ns/op          0 B/op         0 allocs/op
BenchmarkBuiltinDecoder-8           157    7607729 ns/op    1755495 B/op    107836 allocs/op
```