{"id":21835066,"url":"https://github.com/simdutf/simdbase64","last_synced_at":"2026-02-27T23:15:22.114Z","repository":{"id":256789832,"uuid":"803405147","full_name":"simdutf/SimdBase64","owner":"simdutf","description":"Fast WHATWG forgiving-base64 decoding in C#","archived":false,"fork":false,"pushed_at":"2025-04-03T02:02:04.000Z","size":27073,"stargazers_count":36,"open_issues_count":2,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-06-22T06:05:29.482Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"C#","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/simdutf.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-05-20T16:54:59.000Z","updated_at":"2025-04-11T00:47:55.000Z","dependencies_parsed_at":"2025-04-14T08:52:17.583Z","dependency_job_id":"be02f792-d8bc-4add-9d94-33970c63e2ad","html_url":"https://github.com/simdutf/SimdBase64","commit_stats":null,"previous_names":["simdutf/simdbase64"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/simdutf/SimdBase64","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simdutf%2FSimdBase64","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simdutf%2FSimdBase64/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simdutf%2FSimdBase64/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simdutf%2FSimdBase64/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/si
mdutf","download_url":"https://codeload.github.com/simdutf/SimdBase64/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simdutf%2FSimdBase64/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29918976,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-27T19:37:42.220Z","status":"ssl_error","status_checked_at":"2026-02-27T19:37:41.463Z","response_time":57,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-27T20:17:17.579Z","updated_at":"2026-02-27T23:15:22.095Z","avatar_url":"https://github.com/simdutf.png","language":"C#","readme":"# SimdBase64\n[![.NET](https://github.com/simdutf/SimdBase64/actions/workflows/dotnet.yml/badge.svg)](https://github.com/simdutf/SimdBase64/actions/workflows/dotnet.yml)\n\n## Fast WHATWG forgiving base64 decoding in C#\n\nBase64 is a standard approach to represent any binary data as ASCII. It is part of the email\nstandard (MIME) and is commonly used to embed data in XML, HTML or JSON files. For example,\nimages can be encoded as text using base64. Base64 is also used to represent cryptographic keys.\n\nOur processors have fast instructions (SIMD) that can process blocks of data at once. They are ideally \nsuited to encode and decode base64.\nThe C# .NET runtime library has fast (SIMD-based) base64 functions[^1] when the input is UTF-8. 
\n\nDecoding base64 is more challenging than encoding because\nof the presence of allowable white space characters and the need to validate the input. Indeed, all\ninputs are valid for encoding, but only some inputs are valid for decoding. Having to skip white space \ncharacters makes accelerated decoding somewhat difficult. We refer to this decoding as WHATWG forgiving-base64 decoding.\n\nTo handle spaces and validation, we recently designed a faster base64 decoding algorithm. It has been deployed\nin the [simdutf](https://github.com/simdutf/simdutf) C++ library and used in production systems (e.g., the JavaScript runtime systems Node.js and Bun).\nWith this new algorithm, we beat the C# .NET runtime functions by 1.7 x to 2.3 x on realistic inputs of a few kilobytes.\n\nThe algorithm is unpatented (free) and we make our\nC# code available under a liberal open-source licence (MIT).\n\n\n## Results (SimdBase64 vs. fast .NET functions)\n\nWe use the Enron base64 data for benchmarking; see benchmark/data/email.\nWe process the data as UTF-8 (ASCII) using the .NET accelerated functions\nas a reference (`System.Buffers.Text.Base64.DecodeFromUtf8`). Our benchmark results are\nfully reproducible.\n\n\n| processor and base freq.      | SimdBase64 (GB/s) | .NET speed (GB/s) | speed up |\n|:----------------|:------------------------|:-------------------|:-------------------|\n| Apple M2 processor (ARM, 3.5 GHz)   | 10                     | 3.8               | 2.6 x |\n| AWS Graviton 3 (ARM, 2.6 GHz)   | 5.1 | 2.0 | 2.6 x |\n| Intel Ice Lake (2.0 GHz)  | 7.6                     | 3.4              | 2.2 x |\n| AMD EPYC 7R32 (Zen 2, 2.8 GHz)    |  6.9       | 3.0 | 2.3 x |\n\n## Results (SimdBase64 vs. 
string .NET functions)\n\nThe .NET runtime did not accelerate the `Convert.FromBase64String(mystring)` functions.\nWe can multiply the decoding speed compared to the .NET standard library.\n\nReplace the following code based on the standard library...\n\n```C#\nbyte[] newBytes = Convert.FromBase64String(s);\n```\n\nwith our version...\n\n```C#\nbyte[] newBytes = SimdBase64.Base64.FromBase64String(s);\n```\n\n| processor and base freq.      | SimdBase64 (GB/s) | .NET speed (GB/s) | speed up |\n|:----------------|:------------------------|:-------------------|:-------------------|\n| Apple M2 processor (ARM, 3.5 GHz)   | 4.0                     | 1.1              | 3.6 x |\n| Intel Ice Lake (2.0 GHz)  | 2.5                      | 0.65             | 3.8 x |\n\n## AVX-512\n\nAs of .NET 9, the support for AVX-512 remains incomplete in C#. In particular, important\nVBMI2 instructions are missing. Hence, we are not using AVX-512 on x64 systems at this time.\nHowever, as soon as .NET offers the necessary support, we will update our results.\n\n## Requirements\n\nWe require .NET 9 or later: https://dotnet.microsoft.com/en-us/download/dotnet/9.0\n\n## Usage\n\nThe library only provides Base64 decoding functions, because the .NET library already has\nfast Base64 encoding functions. We support both `Span\u003cbyte\u003e` (ASCII or UTF-8) and\n`Span\u003cchar\u003e` (UTF-16) as input. 
If you have a C# string, you can get its `Span\u003cchar\u003e` with\nthe `AsSpan()` method.\n\n```c#\n        string base64 = \"SGVsbG8sIFdvcmxkIQ==\"; // could be span\u003cbyte\u003e in UTF-8 as well\n        byte[] buffer = new byte[SimdBase64.Base64.MaximalBinaryLengthFromBase64(base64.AsSpan())];\n        int bytesConsumed; // gives you the number of characters consumed\n        int bytesWritten;\n        var result = SimdBase64.Base64.DecodeFromBase64(base64.AsSpan(), buffer, out bytesConsumed, out bytesWritten, false); // false is for regular base64, true for base64url\n        // result == OperationStatus.Done\n        var answer = buffer.AsSpan().Slice(0, bytesWritten); // decoded result\n        // Encoding.UTF8.GetString(answer) == \"Hello, World!\"\n```\n\n## Running tests\n\n```\ndotnet test\n```\n\nTo get a list of available tests, enter the command:\n\n```\ndotnet test --list-tests\n```\n\nTo run specific tests, it is helpful to use the filter parameter:\n\n```\ndotnet test -c Release --filter DecodeBase64CasesScalar\n```\n\n## Running Benchmarks\n\nTo run the benchmarks, run the following command:\n```\ncd benchmark\ndotnet run -c Release\n```\n\nTo run just one benchmark, use a filter:\n\n```\ncd benchmark\ndotnet run -c Release --filter \"SimdUnicodeBenchmarks.RealDataBenchmark.AVX2DecodingRealDataUTF8(FileName: \\\"data/email/\\\")\"\n```\n\nIf you are on macOS or Linux, you may want to run the benchmarks in privileged mode:\n\n```\ncd benchmark\nsudo dotnet run -c Release\n```\n\nFor UTF-16 benchmarks, you need to pass a flag as they are not enabled by default:\n\n```\ncd benchmark\ndotnet run -c Release --anyCategories UTF16\n```\n\n\n## Building the library\n\n```\ncd src\ndotnet build\n```\n\n## Code format\n\nWe recommend you use `dotnet format`. 
E.g.,\n\n```\ncd test\ndotnet format\n```\n\n## Programming tips\n\nYou can print the content of a vector register like so:\n\n```C#\n        public static void ToString(Vector256\u003cbyte\u003e v)\n        {\n            Span\u003cbyte\u003e b = stackalloc byte[32];\n            v.CopyTo(b);\n            Console.WriteLine(Convert.ToHexString(b));\n        }\n        public static void ToString(Vector128\u003cbyte\u003e v)\n        {\n            Span\u003cbyte\u003e b = stackalloc byte[16];\n            v.CopyTo(b);\n            Console.WriteLine(Convert.ToHexString(b));\n        }\n```\n\nYou can convert an integer to a hex string like so: `$\"0x{MyVariable:X}\"`.\n\n## Performance tips\n\n- Be careful: `Vector128.Shuffle` is not the same as `Ssse3.Shuffle`, nor is `Vector256.Shuffle` the same as `Avx2.Shuffle`. Prefer the hardware-specific versions (`Ssse3.Shuffle`, `Avx2.Shuffle`).\n- Similarly, `Vector128.Shuffle` is not the same as `AdvSimd.Arm64.VectorTableLookup`; use the latter.\n- `stackalloc` arrays should probably not be used in class instances.\n- In C#, `struct` might be preferable to `class` instances as it makes it clear that the data is thread local.\n- You can ask for an asm dump: `DOTNET_JitDisasm=NEON64HTMLScan dotnet run -c Release`.  
See [Viewing JIT disassembly and dumps](https://github.com/dotnet/runtime/blob/main/docs/design/coreclr/jit/viewing-jit-dumps.md).\n- You can get profiling data: `dotnet run -c Release -- -p EP`.\n\n## Scientific References\n\n- Wojciech Muła, Daniel Lemire, [Base64 encoding and decoding at almost the speed of a memory copy](https://arxiv.org/abs/1910.05109), Software: Practice and Experience 50 (2), 2020.\n- Wojciech Muła, Daniel Lemire, [Faster Base64 Encoding and Decoding using AVX2 Instructions](https://arxiv.org/abs/1704.00605), ACM Transactions on the Web 12 (3), 2018.\n\n## References\n\n- [base64 encoding with simd-support](https://github.com/dotnet/runtime/issues/27433)\n- [gfoidl.Base64](https://github.com/gfoidl/Base64): original code that led to the SIMD-based code in the runtime\n- [simdutf's base64 decode](https://github.com/simdutf/simdutf/blob/74126531454de9b06388cb2de78b18edbfcfbe3d/src/westmere/sse_base64.cpp#L337)\n- [WHATWG forgiving-base64 decode](https://infra.spec.whatwg.org/#forgiving-base64-decode)\n\n## More reading \n\n- https://learn.microsoft.com/en-us/dotnet/standard/design-guidelines/\n- https://learn.microsoft.com/en-us/dotnet/csharp/fundamentals/coding-style/coding-conventions\n\n\n[^1]: The .NET runtime appears to have received some of its fast SIMD base64 functions from [gfoidl.Base64](https://github.com/gfoidl/Base64), which built on earlier work by Klomp, Muła and others. See [Faster Base64 Encoding and Decoding using AVX2 Instructions](https://arxiv.org/abs/1704.00605) for a review.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsimdutf%2Fsimdbase64","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsimdutf%2Fsimdbase64","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsimdutf%2Fsimdbase64/lists"}