{"id":18339023,"url":"https://github.com/xoofx/fast1brc","last_synced_at":"2025-04-06T05:31:59.386Z","repository":{"id":216933835,"uuid":"742586510","full_name":"xoofx/Fast1BRC","owner":"xoofx","description":"C# version for the The One Billion Row Challenge","archived":false,"fork":false,"pushed_at":"2024-03-17T14:21:05.000Z","size":4438,"stargazers_count":31,"open_issues_count":0,"forks_count":4,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-03-21T18:06:08.425Z","etag":null,"topics":["csharp","dotnet"],"latest_commit_sha":null,"homepage":"","language":"C#","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-2-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/xoofx.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"license.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":["xoofx"]}},"created_at":"2024-01-12T20:08:35.000Z","updated_at":"2024-12-13T05:37:52.000Z","dependencies_parsed_at":"2024-01-22T11:02:09.055Z","dependency_job_id":"0cadff9f-e4ab-4166-bb92-8f0e548af0e3","html_url":"https://github.com/xoofx/Fast1BRC","commit_stats":null,"previous_names":["xoofx/fast1brc"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xoofx%2FFast1BRC","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xoofx%2FFast1BRC/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xoofx%2FFast1BRC/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xoofx%2FFast1BRC/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/xoofx","download_url":"htt
ps://codeload.github.com/xoofx/Fast1BRC/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247440349,"owners_count":20939220,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csharp","dotnet"],"created_at":"2024-11-05T20:16:14.626Z","updated_at":"2025-04-06T05:31:54.602Z","avatar_url":"https://github.com/xoofx.png","language":"C#","funding_links":["https://github.com/sponsors/xoofx"],"categories":[],"sub_categories":[],"readme":"# 1️⃣🐝🏎️ The One Billion Row Challenge - .NET Edition\n\n\u003e The One Billion Row Challenge (1BRC [Original Java Challenge](https://github.com/gunnarmorling/1brc)) is a fun exploration of how far modern .NET can be pushed for aggregating one billion rows from a text file.\n\u003e Grab all your (virtual) threads, reach out to SIMD, optimize your GC, or pull any other trick, and create the fastest implementation for solving this task!\n\nAggregated results for C#/F# at https://github.com/praeclarum/1brc\n\n**Fast1BRC** is one of the fastest 😅 implementations in the .NET wild west. 
🚀\n\n## Techniques used\n\n- Multiple threads and some SIMD\n  - SIMD used [here](https://github.com/xoofx/Fast1BRC/blob/28589e047c4106357995d4bdb37b70d16f5184d7/Program.cs#L356-L388) mostly for finding the index of the `;` separating the city name from the temperature\n  - We can then keep the full city name in a single `Vector256\u003cbyte\u003e` register (when the city name is `\u003c=` 32 bytes)\n  - For the threads, I'm keeping the main thread busy processing the last block\n- On Windows, no memory-mapped file but `RandomAccess`, reopening the same handle per thread\n  - I discovered that this lowers OS contention on Windows\n  - On Linux, it seems to be less of an issue, performance being roughly the same\n  - On Mac, memory-mapped files seem to perform much better\n- Initially, I implemented the dictionary that stores the correspondence between city names and counters with an FNV-1a 64-bit hash aligned on an 8-byte boundary, but I was told that this is not accepted, as discussed [here](https://github.com/gunnarmorling/1brc/pull/186#issuecomment-1880132600), so I changed the implementation to 2 dictionaries:\n  - One with a key of 32 bytes - implemented with `Vector256\u003cbyte\u003e` that covers the default dataset (names below 32 chars)\n  - One with a key of 128 bytes to support the limit of city names (100 characters)\n  - Both are implemented with `Vector256\u003cbyte\u003e`\n  - Cache entries are aligned on 64 bytes to allow better cache usage\n  - [Dictionary implementation](https://github.com/xoofx/Fast1BRC/blob/28589e047c4106357995d4bdb37b70d16f5184d7/Program.cs#L889-L932) is taken from the .NET BCL but reimplemented with unsafe code, using pointers directly instead of indices\n  - For the 32 bytes key, the implementation inlines things directly so that we maximize the codegen with everything being kept in registers.\n  - For the hash of the dictionary (both 32 bytes and 128 bytes key), I'm hashing the first 16 bytes by XORing the first 2 `ulong` values with an 
intermediate multiplication by 397 for the first `ulong` (prime number that gives good results). It is simple and efficient enough to not have any collisions.\n- No particular tricks for parsing the temperature, apart from assuming that there is only 1 digit after the `.`.\n  - When I changed the code to not parse the temperature, it did not meaningfully change the results.\n\n## Results\n\nBenchmarks were run on 3 different machines with different combinations of OS, with the following top implementations:\n\n- Fast1BRC (This repository)\n- [Nietras's 1brc](https://github.com/nietras/1brc.cs)\n- [Buybackoff's 1brc](https://github.com/buybackoff/1brc)\n\n\n![Results](results.png)\n\nSome comments:\n\n- Results vary a lot! 📊\n- My solution comes really close 🥈 to Nietras's solution (5% on average), but his solution is probably slightly more consistent and overall winning! 🥇\n  - I discovered that it is easy to mess things up with a slight change that works on one machine/platform but not on another. Process/thread priority tweaks are super sensitive, so I stopped tweaking them in the end and I'm using the defaults. 
😅\n- The results vary vastly between HW / OS 💾\n  - The first aspect is the performance of the M.2/SSD disk access\n  - The second aspect is CPU performance (number of cores, caches, clock, etc.)\n  - When looking at a profiler, on my Windows machine, the time spent in I/O is almost equal to the time spent on the CPU.\n  - Some machines could give very different results (e.g. buybackoff's own results on his repo)\n  \n## Build\n\nYou need to have the [.NET 8 SDK installed](https://dotnet.microsoft.com/en-us/download/dotnet/8.0)\n\n```\ndotnet publish -c Release -r win-x64\n.\\bin\\Release\\net8.0\\win-x64\\publish\\Fast1BRC measurements.txt\n```\n\nTo fully test, you need to generate `measurements.txt`, which is easier on Ubuntu/macOS:\n\n- Install OpenJDK 21 https://jdk.java.net/21/\n- Install Maven 3.9+ https://maven.apache.org/download.cgi\n- Clone `https://github.com/gunnarmorling/1brc` and go to its folder\n- Run `mvn package` \n- Run `./create_measurements.sh 1000000000`\n- Copy `measurements.txt` to a place where you can use it with Fast1BRC\n\n## License\n\nThis software is released under the [BSD-2-Clause license](https://opensource.org/licenses/BSD-2-Clause). \n\n## Author\n\nAlexandre Mutel aka [xoofx](https://xoofx.github.io).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxoofx%2Ffast1brc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fxoofx%2Ffast1brc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxoofx%2Ffast1brc/lists"}