{"id":19619700,"url":"https://github.com/fxamacker/fxamacker","last_synced_at":"2026-03-19T10:48:47.862Z","repository":{"id":45982096,"uuid":"359252948","full_name":"fxamacker/fxamacker","owner":"fxamacker","description":null,"archived":false,"fork":false,"pushed_at":"2025-02-16T18:22:28.000Z","size":167,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-02-26T18:39:46.906Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fxamacker.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-04-18T21:07:35.000Z","updated_at":"2025-02-16T18:22:32.000Z","dependencies_parsed_at":"2025-01-09T11:11:52.257Z","dependency_job_id":"dcb81923-2331-4a28-9330-e6682d08cfbd","html_url":"https://github.com/fxamacker/fxamacker","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/fxamacker/fxamacker","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fxamacker%2Ffxamacker","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fxamacker%2Ffxamacker/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fxamacker%2Ffxamacker/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fxamacker%2Ffxamacker/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fxamacker","download_url":"https://codeload.github.com/fxamacker/fxamacke
r/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fxamacker%2Ffxamacker/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259494860,"owners_count":22866580,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-11T11:14:43.110Z","updated_at":"2026-02-13T22:07:53.893Z","avatar_url":"https://github.com/fxamacker.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"On GitHub, I maintain or contribute to projects such as [fxamacker/cbor](https://github.com/fxamacker/cbor), [onflow/atree](https://github.com/onflow/atree), [onflow/ccf](https://github.com/onflow/ccf), [onflow/cadence](https://github.com/onflow/cadence), [onflow/flow-go](https://github.com/onflow/flow-go), etc.\n\nMy first open source project was [fxamacker/cbor](https://github.com/fxamacker/cbor). It's used in projects by Arm Ltd., EdgeX\u0026nbsp;Foundry, Flow\u0026nbsp;Foundation, Fraunhofer\u0026#8209;AISEC, IBM, Kubernetes, Linux\u0026nbsp;Foundation, Microsoft, Red Hat, Tailscale, Veraison, [and\u0026nbsp;others](https://github.com/fxamacker/cbor#who-uses-fxamackercbor).\n\n`fxamacker/cbor` passed multiple confidential security assessments in 2022.  
A [nonconfidential security assessment](https://github.com/veraison/go-cose/blob/v1.0.0-rc.1/reports/NCC_Microsoft-go-cose-Report_2022-05-26_v1.0.pdf) (prepared by NCC\u0026nbsp;Group for Microsoft\u0026nbsp;Corporation) includes a subset of `fxamacker/cbor` v2.4.0 without finding any vulnerabilities.\n\nMost of the code I wrote is closed source (in many languages but mostly multithreaded C++).  I'm currently enjoying open source projects and the amazing simplicity of Go.\n\nSome of my open source work is described here.\n\n## Design \u0026 Implementation\n\n\u003c!-- ![image](https://user-images.githubusercontent.com/57072051/145697520-4dc89ec2-435b-46f1-8e2c-f9e8ba0ca1df.png) --\u003e\n\n\u003cimg src=\"https://github.com/user-attachments/assets/81e44838-e1ab-4282-9c88-762055f412c5\" width=\"720\" alt=\"fxamacker stats in atree\"\u003e\n\n__[onflow/atree](https://github.com/onflow/atree)__: Atree provides scalable arrays and maps.  It is used by [Cadence](https://github.com/onflow/cadence) in the [Flow Blockchain](https://www.onflow.org/).\n\nAtree segments, encodes, and stores data into chunks of relatively small size.  This enables blockchains to only hash and transmit modified chunks (aka payloads) instead of the entire array, map, or large element.\n\nAtree deduplicates the metadata sent to Atree when feasible, and smaller data is automatically inlined into existing payloads to reduce the total number of payloads created.\n\nAcknowledgements:  Atree wouldn't exist without Dieter Shirley making priorities clear and inspiring us, Ramtin M. Seraj leading the R\u0026D and empowering us to innovate, and Bastian Müller improving Atree while leading the integration into Cadence. Many thanks to Supun Setunga for the very complex data migration work and more!\n\n### Invented Novel Collision Handler Used in Atree\n\nI invented, designed, and implemented a novel hash collision handling method.  
It is different from published methods such as [Cuckoo Hashing](https://en.wikipedia.org/wiki/Cuckoo_hashing), [Double Hashing](https://en.wikipedia.org/wiki/Double_hashing), [2-Choice Hashing](https://en.wikipedia.org/wiki/2-choice_hashing), etc.\n\nI designed it to allow tradeoffs to be balanced (speed, security, digest size) by using a deferred and segmented cryptographic hash digest that is gradually/partially combined with a fast non-cryptographic hash digest.\n\nThe cryptographic digest is not computed or stored unless there is a collision, and only a small part of the cryptographic hash digest can be stored to resolve collisions.\n\nCollisions on a full 512-bit cryptographic hash (such as SHA3-512) would be handled gracefully (not catastrophic), and the expensive cryptographic computation or storage of the cryptographic digest is practically never needed on non-malicious input data.  This allows us to choose a faster and smaller hash such as BLAKE3, etc.\n\nBy default, Atree uses [CircleHash64f](https://github.com/fxamacker/circlehash) together with BLAKE3 for a max combined digest size limit of 320 bits.  Larger digest sizes are possible by using SHA3-512, etc. instead of BLAKE3.\n\n\u003cdetails\u003e\u003csummary\u003e 🔍 Expand for more details\u003c/summary\u003e\n\nFor example, the seeded fast 64-bit digest is used first and a collision on that (if one were ever to happen) is handled by a single deferred cryptographic hash computation that is stored in the fewest segments needed to resolve the collision.\n\nThe worst-case scenario of a collision on the entire combined digest is handled gracefully, so this aspect allows the configurable size limit for the combined digest size to be significantly smaller than 576 bits.  For example, combined digest size can be configured to be in the range of 64..320 bits instead of 64..576 bits.\n\nAlso, the max combined digest size limit can be truncated until desired tradeoffs are reached.  
However, I generally prefer not to truncate 256-bit cryptographic digests and only truncate 512-bit digests if the tradeoffs are acceptable.  And if a system only requires 384 bits from SHA3-512, then I prefer to use SHA3-384 instead of truncating SHA3-512.\n\n\u003c/details\u003e\n\n## Optimization\n\nWhen feasible, my optimizations simultaneously improve speed, memory, storage, and network use without negative tradeoffs.\n\n### Optimized Unfamiliar Code (11.4 hours ➡️ 4 minutes, -431 GB/op, -7.6 billion allocs/op)\n\n__[onflow/flow-go](https://github.com/onflow/flow-go):__  Found optimizations by reading unfamiliar source code and [proposed improvements](https://github.com/onflow/flow-go/issues/1750#issuecomment-1004870851) to resolve [issue #1750](https://github.com/onflow/flow-go/issues/1750). Very grateful to Ramtin M. Seraj for opening a batch of issues and letting me tackle this one.\n\n[PR #1944](https://github.com/onflow/flow-go/pull/1944) (Optimize MTrie Checkpoint for speed, memory, and file size):\n- __SPEED__: 171x speedup (11.4 hours to 4 minutes) in MTrie traversing/flattening/writing phase (without adding concurrency) which led to a 47x speedup in checkpointing (11.7 hours to 15 mins).\n- __MEMORY__: -431 GB alloc/op (-54.35%) and -7.6 billion allocs/op (-63.67%)\n- __STORAGE__: -6.9 GB file size (without using compression yet)\n\nAfter [PR #1944](https://github.com/onflow/flow-go/pull/1944) reduced the MTrie flattening and serialization phase to under 5 minutes (which sometimes took 17 hours on mainnet16), creating a separate MTrie state used most of the remaining duration and memory.\n\nAdditional optimizations (add concurrency, add compression, etc.) 
were moved to a separate issue/PR and I switched my focus to related issues like [#1747](https://github.com/onflow/flow-go/issues/1747).\n\nUPDATE: About six months later, checkpoint file size grew from 53GB to 126GB and checkpointing frequency increased to every few hours (instead of about once daily) due to increased transactions and data size.  Without [PR #1944](https://github.com/onflow/flow-go/pull/1944), checkpointing would take over 20-30 hours each time, require more operational RAM, and slow down the system with increased GC pressure.  More info: [issue #2286](https://github.com/onflow/flow-go/issues/2286) and [PR #2792](https://github.com/onflow/flow-go/pull/2792).\n\n### Optimized By Adding New Features (Inlining and Deduplication)\n\n__[onflow/atree](https://github.com/onflow/atree):__  Designed and implemented [Atree Inlining \u0026 Deduplication](https://github.com/onflow/atree/releases/tag/v0.8.0) which was deployed on Sept. 4, 2024. It eliminated over 1 billion MTrie nodes (-61%) and inlined over 500 million payloads to improve memory, storage, and speed on the same hardware.\n\nReducing the total number of payloads by 500+ million and slowing down the future growth rate of payloads benefits different types of servers that run payload databases, payload indexers, payload caches, and MTrie (execution state). \n\nAs one example, Atree Inlining \u0026 Deduplication reduced peak RAM use by hundreds of GB on each Flow Execution Node, improved Flow transaction speed by ~7%, and also sped up other servers: \"btw amazing work this atree inlining, my tinyAN bootstrap time improved like 5x\".\n\n## Evaluations and Improvements\n\n__[fxamacker/circlehash](https://github.com/fxamacker/circlehash)__: I created CircleHash64 on weekends after evaluating state-of-the-art fast hashes for work. At the time, I needed a fast hash for short input sizes typically \u003c128 bytes but didn't like existing hashes.  
I didn't want to reinvent the wheel so I based it on Google Abseil C++ internal hash.  CircleHash64 is well-rounded: it balances speed, digest quality, and maintainability.\n\n#### CircleHash64 has good results in [Strict Avalanche Criterion (SAC)](https://en.wikipedia.org/wiki/Avalanche_effect#Strict_avalanche_criterion).\n\n|                | CircleHash64 | Abseil C++ | SipHash-2-4 | xxh64 |\n| :---           | :---:         | :---:  | :---: | :---: |\n| SAC worst-bit \u003cbr/\u003e 0-128 byte inputs \u003cbr/\u003e (lower % is better) | 0.791% 🥇 \u003cbr/\u003e w/ 99 bytes | 0.862% \u003cbr/\u003e w/ 67 bytes | 0.802% \u003cbr/\u003e w/ 75 \u0026 117 bytes | 0.817% \u003cbr/\u003e w/ 84 bytes |\n\n☝️ Using demerphq/smhasher updated to test all input sizes 0-128 bytes (SAC test will take hours longer to run).\n\n#### CircleHash64f is fast at hashing short inputs with a 64-bit seed\n\n|              | CircleHash64\u003cbr/\u003e(seeded) | XXH3\u003cbr/\u003e(seeded) | XXH64\u003cbr/\u003e(w/o seed) | SipHash\u003cbr/\u003e(seeded) |\n|:-------------|:---:|:---:|:---:|:---:|\n| 4 bytes | 1.34 GB/s | 1.21 GB/s| 0.877 GB/s | 0.361 GB/s |\n| 8 bytes | 2.70 GB/s | 2.41 GB/s | 1.68 GB/s | 0.642 GB/s |\n| 16 bytes | 5.48 GB/s | 5.21 GB/s | 2.94 GB/s | 1.03 GB/s |\n| 32 bytes | 8.01 GB/s | 7.08 GB/s | 3.33 GB/s | 1.46 GB/s |\n| 64 bytes | 10.3 GB/s | 9.33 GB/s | 5.47 GB/s | 1.83 GB/s |\n| 128 bytes | 12.8 GB/s | 11.6 GB/s | 8.22 GB/s | 2.09 GB/s |\n| 192 bytes | 14.2 GB/s | 9.86 GB/s | 9.71 GB/s | 2.17 GB/s |\n| 256 bytes | 15.0 GB/s | 8.19 GB/s | 10.2 GB/s | 2.22 GB/s |\n\n- Using Go 1.17.7, darwin_amd64, i7-1068NG7 CPU\n- Results from `go test -bench=. -count=20` and `benchstat`\n- Fastest XXH64 in Go+Assembly doesn't support seed\n\nCircleHash64 doesn't have big GB/s drops in throughput as input size gets larger.  
Other CircleHash variants are faster for larger input sizes and a bit slower for short inputs (not yet published).\n\n## 📚 Implement IETF Internet Standards (RFC 8949 \u0026 RFC 7049)\n\n__[fxamacker/cbor](https://github.com/fxamacker/cbor)__: I designed and implemented a secure CBOR codec after reading RFC 7049.  During implementation, I helped review [the draft](https://github.com/cbor-wg/CBORbis) leading to [RFC 8949](https://datatracker.ietf.org/doc/html/rfc8949).  The CBOR codec rejects malformed CBOR data and has an option to detect duplicate map keys.  It doesn't crash when decoding bad CBOR data.\n\nDecoding 9 or 10 bytes of malformed CBOR data shouldn't exhaust memory. For example,  \n`[]byte{0x9B, 0x00, 0x00, 0x42, 0xFA, 0x42, 0xFA, 0x42, 0xFA, 0x42}`\n\n|     | Decode bad 10 bytes to interface{} | Decode bad 10 bytes to []byte |\n| :--- | :------------------ | :--------------- |\n| fxamacker/cbor\u003cbr/\u003e1.0-2.3 | 49.44 ns/op, 24 B/op, 2 allocs/op* | 51.93 ns/op, 32 B/op, 2 allocs/op* |\n| ugorji/go 1.2.6 | ⚠️ 45021 ns/op, 262852 B/op, 7 allocs/op | 💥 runtime: out of memory: cannot allocate |\n| ugorji/go 1.1-1.1.7 | 💥 runtime: out of memory: cannot allocate | 💥 runtime: out of memory: cannot allocate|\n\n*Speed and memory are for latest codec version listed in the row (compiled with Go 1.17.5).\n\nfxamacker/cbor CBOR safety settings include: MaxNestedLevels, MaxArrayElements, MaxMapPairs, and IndefLength.\n\n## Professional Background\n\nI try to balance competing factors such as speed, security, usability, and maintainability based on each project's priorities.\n\nMost recently, I accepted an offer I received on April 13, 2021 from Dapper Labs. I had been working for them as an independent contractor for about two weeks to help optimize Cadence storage layer and to create a streaming mode branch of fxamacker/cbor.  
On my first day as a contractor, I created [issue 738](https://github.com/onflow/cadence/issues/738) and the Cadence team was very welcoming and productive to work with.  I subsequently opened 100+ issues and 100+ PRs at work in 2021.\n\nMy experience before Dapper Labs includes co-founding \u0026 bootstrapping an enterprise software company and working as an IT consultant.\n\n- My smallest consulting client - a startup.  I assisted with prototyping which helped secure their next round of funding.\n- My largest consulting client - an S\u0026P 500 company with almost 50,000 employees.  I evaluated (as part of a large team) various technologies to be selected for their new global stack for deployment to over 100 countries.\n- My largest software licensing+subscription+support client - a company with over 3,000 employees in the US that deployed my data security system to all their US-based offices and factories.  The tamper-resistant system used 4 types of servers to distribute and enforce security policies across multiple timezones for various client software.  The system was designed to repair and update itself with bugfixes without introducing downtime.  I was one of only two people to ever have access to the source code: just two of us conceived, designed, implemented, tested, and maintained the system.  Our system beat enterprise solutions from well-funded large competitors year after year during customer evaluations which included testing employee-attempted data theft.  
It was not approved for export or use outside the US.\n\nDeveloping commercial software provided the advantage of choosing the most appropriate language and framework for each part of the system because the customers didn't know what programming languages, tools, or frameworks were used.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffxamacker%2Ffxamacker","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffxamacker%2Ffxamacker","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffxamacker%2Ffxamacker/lists"}