https://github.com/pitr/jsontokenizer

Zero-allocation JSON tokenizer

Host: GitHub
URL: https://github.com/pitr/jsontokenizer
Owner: pitr
License: mit
Created: 2021-06-21T00:01:05.000Z (about 5 years ago)
Default Branch: main
Last Pushed: 2023-12-31T16:00:42.000Z (over 2 years ago)
Last Synced: 2025-03-30T05:45:04.380Z (over 1 year ago)
Topics: go, golang, json, tokenizer
Language: Go
Homepage:
Size: 69.3 KB
Stars: 0
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

[![Go Reference](https://pkg.go.dev/badge/pitr.ca/jsontokenizer.svg)](https://pkg.go.dev/pitr.ca/jsontokenizer)

# JSON Tokenizer

Zero-allocation JSON tokenizer.

## Features

- Fast. Roughly 13x faster than `encoding/json.Decoder` at the default buffer size, and up to about 17x faster with larger buffers. See benchmarks below.

- Similar API to `encoding/json.Decoder`.

- No reflection.

- No allocations beyond a small read buffer.

- Can be reused with a call to `Reset`.

## Anti-Features

- Does **NOT** parse JSON. It does not verify that the token sequence forms a valid document: `[}` will produce 2 tokens without errors.

- Needs an `io.Writer` to write numbers and strings into. Depending on the use case, this can be `os.Stdout`, `bytes.Buffer`, [ByteBuffer](https://github.com/valyala/bytebufferpool), etc.

- Does not unescape strings. Escape sequences are passed through unchanged: `"he is 5'11\\"."` will be exactly that.

- Does not parse numbers into floats/ints. Use `strconv.Atoi()` or `strconv.ParseFloat()` if needed.

- Not safe for concurrent use. Guard it with a `sync.Mutex` or the like to prevent simultaneous calls.
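Since numbers and strings arrive as raw bytes, the caller does any conversion. The sketch below uses only the standard library and does not call the tokenizer itself: the two `bytes.Buffer` values stand in for buffers that `ReadNumber` and `ReadString` would have written into. Whether `ReadString` writes the surrounding quotes is an assumption left open here, so the hypothetical helper `unescapeString` accepts both forms. Note that decoding through `encoding/json` allocates, so it is best applied only to the values that are actually needed.

```go
package main

import (
	"bytes"
	encjson "encoding/json"
	"fmt"
	"strconv"
)

// parseNumber converts the raw text of a JSON number into a float64.
func parseNumber(raw []byte) (float64, error) {
	return strconv.ParseFloat(string(raw), 64)
}

// unescapeString decodes JSON escape sequences in raw string bytes.
// Input without surrounding quotes is wrapped in quotes first; string
// contents can never begin with an unescaped quote, so the check is safe.
func unescapeString(raw []byte) (string, error) {
	quoted := raw
	if len(raw) == 0 || raw[0] != '"' {
		quoted = make([]byte, 0, len(raw)+2)
		quoted = append(quoted, '"')
		quoted = append(quoted, raw...)
		quoted = append(quoted, '"')
	}
	var s string
	err := encjson.Unmarshal(quoted, &s)
	return s, err
}

func main() {
	// Stand-ins for buffers filled by ReadNumber and ReadString.
	var num, str bytes.Buffer
	num.WriteString("12.5")
	str.WriteString(`he is 5'11\".`)

	n, err := parseNumber(num.Bytes())
	fmt.Println(n, err)

	s, err := unescapeString(str.Bytes())
	fmt.Println(s, err)
}
```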

## Quick Start

```go
package example

import (
	"io"

	json "pitr.ca/jsontokenizer"
)

// example reads tokens from in until EOF and reports each token kind.
func example(in io.Reader) error {
	tk := json.New(in)
	for {
		tok, err := tk.Token()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
		switch tok {
		case json.TokNull:
			println("got null")
		case json.TokTrue, json.TokFalse:
			println("got bool")
		case json.TokArrayOpen, json.TokArrayClose, json.TokObjectOpen, json.TokObjectClose, json.TokObjectColon, json.TokComma:
			println("got delimiter")
		case json.TokNumber:
			println("got number")
			// The number's raw text goes to the given io.Writer; it is discarded here.
			if _, err := tk.ReadNumber(io.Discard); err != nil {
				return err
			}
		case json.TokString:
			println("got string")
			// The string's raw text goes to the given io.Writer; it is discarded here.
			if _, err := tk.ReadString(io.Discard); err != nil {
				return err
			}
		}
	}
}
```
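Because the tokenizer reports tokens without checking how they nest, a caller that needs balanced brackets has to track them itself. The following is a minimal sketch of one way to do that with a stack. The `Token` type and constants are local stand-ins that only mirror the names used above so the example compiles on its own, and `checkNesting` is a hypothetical helper, not part of the library. It checks bracket pairing only, not commas, colons, or key/value order.

```go
package main

import (
	"errors"
	"fmt"
)

// Token is a local stand-in for the tokenizer's token type.
type Token int

const (
	TokArrayOpen Token = iota
	TokArrayClose
	TokObjectOpen
	TokObjectClose
	TokOther // any token that does not affect nesting
)

var errUnbalanced = errors.New("unbalanced brackets")

// checkNesting verifies that every closing bracket matches the most
// recent unclosed opening bracket and that nothing is left open.
func checkNesting(toks []Token) error {
	var stack []Token
	for _, t := range toks {
		switch t {
		case TokArrayOpen, TokObjectOpen:
			stack = append(stack, t)
		case TokArrayClose, TokObjectClose:
			want := TokArrayOpen
			if t == TokObjectClose {
				want = TokObjectOpen
			}
			if len(stack) == 0 || stack[len(stack)-1] != want {
				return errUnbalanced
			}
			stack = stack[:len(stack)-1]
		}
	}
	if len(stack) != 0 {
		return errUnbalanced
	}
	return nil
}

func main() {
	// Token sequence for "[}": rejected.
	fmt.Println(checkNesting([]Token{TokArrayOpen, TokObjectClose}))
	// Token sequence for "[{}]": accepted.
	fmt.Println(checkNesting([]Token{TokArrayOpen, TokObjectOpen, TokObjectClose, TokArrayClose}))
}
```

In a real loop the same stack would be updated inside the `switch` on each token, so no token slice needs to be built.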

## Benchmarks

The `size` in each benchmark name is the read buffer size, which can be set with `NewWithSize`. The default is 64. The tokenizer is reused between benchmark iterations, but this does not affect performance.

`BenchmarkBuiltinDecoder` is `encoding/json.Decoder`.

```
BenchmarkTokenizer/size=8-8          	    1419	    788208 ns/op	       0 B/op	       0 allocs/op
BenchmarkTokenizer/size=16-8         	    1668	    688656 ns/op	       0 B/op	       0 allocs/op
BenchmarkTokenizer/size=32-8         	    1792	    628601 ns/op	       0 B/op	       0 allocs/op
BenchmarkTokenizer/size=64-8         	    2040	    571411 ns/op	       0 B/op	       0 allocs/op
BenchmarkTokenizer/size=128-8        	    2228	    520646 ns/op	       0 B/op	       0 allocs/op
BenchmarkTokenizer/size=256-8        	    2392	    482151 ns/op	       0 B/op	       0 allocs/op
BenchmarkTokenizer/size=512-8        	    2516	    460283 ns/op	       0 B/op	       0 allocs/op
BenchmarkTokenizer/size=1024-8       	    2553	    458148 ns/op	       0 B/op	       0 allocs/op
BenchmarkTokenizer/size=2048-8       	    2618	    451937 ns/op	       0 B/op	       0 allocs/op
BenchmarkTokenizer/size=4096-8       	    2499	    451601 ns/op	       0 B/op	       0 allocs/op
BenchmarkTokenizer/size=8192-8       	    2610	    443493 ns/op	       0 B/op	       0 allocs/op
BenchmarkBuiltinDecoder-8            	     157	   7607729 ns/op	 1755495 B/op	  107836 allocs/op
```
