{"id":13439301,"url":"https://github.com/pikkr/pikkr","last_synced_at":"2025-03-20T07:33:08.327Z","repository":{"id":40659928,"uuid":"101627798","full_name":"pikkr/pikkr","owner":"pikkr","description":"JSON parser which picks up values directly without performing tokenization in Rust","archived":false,"fork":false,"pushed_at":"2017-09-21T14:14:55.000Z","size":634,"stargazers_count":633,"open_issues_count":7,"forks_count":15,"subscribers_count":11,"default_branch":"master","last_synced_at":"2025-03-01T22:55:03.166Z","etag":null,"topics":["json","json-parser","pikkr","rust","simd"],"latest_commit_sha":null,"homepage":"https://crates.io/crates/pikkr","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pikkr.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-08-28T09:35:23.000Z","updated_at":"2025-02-06T21:52:38.000Z","dependencies_parsed_at":"2022-09-01T01:40:29.311Z","dependency_job_id":null,"html_url":"https://github.com/pikkr/pikkr","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pikkr%2Fpikkr","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pikkr%2Fpikkr/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pikkr%2Fpikkr/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pikkr%2Fpikkr/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pikkr","download_url":"https://codeload.github.com/pikkr/pikkr/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com",
"kind":"github","repositories_count":244571032,"owners_count":20474168,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["json","json-parser","pikkr","rust","simd"],"created_at":"2024-07-31T03:01:12.781Z","updated_at":"2025-03-20T07:33:03.316Z","avatar_url":"https://github.com/pikkr.png","language":"Rust","funding_links":[],"categories":["Libraries","Rust","库 Libraries","库"],"sub_categories":["Encoding","编码 Encoding","编码(Encoding)","加密 Encoding"],"readme":"# Pikkr\n\n[![Crates.io version shield](https://img.shields.io/crates/v/pikkr.svg)](https://crates.io/crates/pikkr)\n[![Build Status](https://travis-ci.org/pikkr/pikkr.svg?branch=master)](https://travis-ci.org/pikkr/pikkr)\n\nJSON parser which picks up values directly without performing tokenization in Rust\n\n## Abstract\n\nPikkr is a JSON parser which picks up values directly without performing tokenization in Rust. This JSON parser is implemented based on [Y. Li, N. R. Katsipoulakis, B. Chandramouli, J. Goldstein, and D. Kossmann. Mison: a fast JSON parser for data analytics. In *VLDB*, 2017](http://www.vldb.org/pvldb/vol10/p1118-li.pdf).\n\nThis JSON parser extracts values from a JSON record without using finite state machines (FSMs) and performing tokenization. It parses JSON records in the following procedures:\n\n1. [Indexing] Creates an index which maps logical locations of queried fields to their physical locations by using SIMD instructions and bit manipulation.\n2. 
[Basic parsing] Finds the values of queried fields by scanning a JSON record using the index created in the previous step, and learns their logical locations (i.e. the pattern of the JSON structure) in the early stages.\n3. [Speculative parsing] Speculates the logical locations of queried fields using the patterns learned during basic parsing, jumps directly to their physical locations and extracts values in the later stages. Falls back to basic parsing if the speculation fails.\n\nThis JSON parser performs well when there is a limited number of distinct JSON structural variants in a JSON data stream or JSON collection, which is a common case in the data analytics field.\n\nPlease read the paper mentioned in the opening paragraph for the details of the JSON parsing algorithm.\n\n## Performance\n\n### Benchmark Result\n\n![](https://raw.githubusercontent.com/pikkr/pikkr/master/img/benchmark.png)\n\n### Hardware\n\n```\nModel Name: MacBook Pro\nProcessor Name: Intel Core i7\nProcessor Speed: 3.3 GHz\nNumber of Processors: 1\nTotal Number of Cores: 2\nL2 Cache (per Core): 256 KB\nL3 Cache: 4 MB\nMemory: 16 GB\n```\n\n### Rust\n\n```bash\n$ cargo --version\ncargo 0.23.0-nightly (34c0674a2 2017-09-01)\n\n$ rustc --version\nrustc 1.22.0-nightly (d93036a04 2017-09-07)\n```\n\n### Crates\n\n* [serde_json](https://crates.io/crates/serde_json) 1.0.3\n* [json](https://crates.io/crates/json) 0.11.9\n* [pikkr](https://crates.io/crates/pikkr) 0.16.0\n\n### JSON Data\n\n* \"a JSON data set of startup company information\" on [JSON Data Sets | JSON Studio](http://jsonstudio.com/resources/).\n\n### Benchmark Code\n\n* [pikkr/rust-json-parser-benchmark: Rust JSON Parser Benchmark](https://github.com/pikkr/rust-json-parser-benchmark)\n\n## Example\n\n### Code\n\n```rust\nextern crate pikkr;\n\nfn main() {\n    let queries = vec![\n        \"$.f1\".as_bytes(),\n        \"$.f2.f1\".as_bytes(),\n    ];\n    let train_num = 2; // Number of records used as training data\n                       // before 
Pikkr starts speculative parsing.\n    let mut p = match pikkr::Pikkr::new(\u0026queries, train_num) {\n        Ok(p) =\u003e p,\n        Err(err) =\u003e panic!(\"There was a problem creating a JSON parser: {:?}\", err.kind()),\n    };\n    let recs = vec![\n        r#\"{\"f1\": \"a\", \"f2\": {\"f1\": 1, \"f2\": true}}\"#,\n        r#\"{\"f1\": \"b\", \"f2\": {\"f1\": 2, \"f2\": true}}\"#,\n        r#\"{\"f1\": \"c\", \"f2\": {\"f1\": 3, \"f2\": true}}\"#, // Speculative parsing starts from this record.\n        r#\"{\"f2\": {\"f2\": true, \"f1\": 4}, \"f1\": \"d\"}\"#,\n        r#\"{\"f2\": {\"f2\": true, \"f1\": 5}}\"#,\n        r#\"{\"f1\": \"e\"}\"#\n    ];\n    for rec in recs {\n        match p.parse(rec.as_bytes()) {\n            Ok(results) =\u003e {\n                for result in results {\n                    print!(\"{} \", match result {\n                        Some(result) =\u003e String::from_utf8(result.to_vec()).unwrap(),\n                        None =\u003e String::from(\"None\"),\n                    });\n                }\n                println!();\n            },\n            Err(err) =\u003e println!(\"There was a problem parsing a record: {:?}\", err.kind()),\n        }\n    }\n    /*\n    Output:\n        \"a\" 1\n        \"b\" 2\n        \"c\" 3\n        \"d\" 4\n        None 5\n        \"e\" None\n    */\n}\n```\n\n### Build\n\n```bash\n$ cargo --version\ncargo 0.23.0-nightly (34c0674a2 2017-09-01) # Make sure that nightly release is being used.\n$ RUSTFLAGS=\"-C target-cpu=native\" cargo build --release\n```\n\n### Run\n\n```bash\n$ ./target/release/[package name]\n\"a\" 1\n\"b\" 2\n\"c\" 3\n\"d\" 4\nNone 5\n\"e\" None\n```\n\n## Documentation\n\n* [pikkr - Rust](https://pikkr.github.io/doc/pikkr/)\n\n## Restrictions\n\n* [Rust nightly channel](https://github.com/rust-lang-nursery/rustup.rs/blob/master/README.md#working-with-nightly-rust) and [CPUs with AVX2](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions#CPUs_with_AVX2) 
are needed to build Rust source code that depends on Pikkr and to run the resulting executable, because Pikkr uses AVX2 instructions.\n\n## Contributing\n\nAny kind of contribution (e.g. comment, suggestion, question, bug report or pull request) is welcome.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpikkr%2Fpikkr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpikkr%2Fpikkr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpikkr%2Fpikkr/lists"}