{"id":23499630,"url":"https://github.com/stuartcarnie/go-simd","last_synced_at":"2025-04-15T17:35:58.784Z","repository":{"id":46244034,"uuid":"114180344","full_name":"stuartcarnie/go-simd","owner":"stuartcarnie","description":"Optimized functions for Go using SIMD","archived":false,"fork":false,"pushed_at":"2020-10-09T14:47:04.000Z","size":91,"stargazers_count":190,"open_issues_count":4,"forks_count":9,"subscribers_count":5,"default_branch":"master","last_synced_at":"2024-11-16T17:23:23.428Z","etag":null,"topics":["clang","go","golang","llvm","performance","utf-8"],"latest_commit_sha":null,"homepage":"","language":"Assembly","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/stuartcarnie.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-12-13T23:42:24.000Z","updated_at":"2024-10-03T20:35:35.000Z","dependencies_parsed_at":"2022-09-13T15:32:41.949Z","dependency_job_id":null,"html_url":"https://github.com/stuartcarnie/go-simd","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stuartcarnie%2Fgo-simd","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stuartcarnie%2Fgo-simd/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stuartcarnie%2Fgo-simd/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stuartcarnie%2Fgo-simd/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/stuartcarnie","download_url":"https://codeload.github.com/stuartcarnie/go-simd/tar.gz/refs/heads/master","host":{"name":"GitHub","url":
"https://github.com","kind":"github","repositories_count":231183392,"owners_count":18340353,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["clang","go","golang","llvm","performance","utf-8"],"created_at":"2024-12-25T06:18:01.357Z","updated_at":"2024-12-25T06:18:02.450Z","avatar_url":"https://github.com/stuartcarnie.png","language":"Assembly","funding_links":[],"categories":[],"sub_categories":[],"readme":"go-simd\n=======\n\nMake certain Go functions faster with SIMD, loop unrolling, c2goasm, or other optimization techniques.\n\nThis package chooses the most appropriate implementation at runtime based on the host CPU features. However,\nit is possible to disable certain implementations using the `INTEL_DISABLE_EXT` environment variable.\nSee the [cpu package README](https://github.com/stuartcarnie/go-simd/blob/master/internal/cpu/README.md) for\na description of this environment variable.\n\n\nBenchmarks\n----------\n\n### SumFloat64\n\nBenchmark various sum implementations, aggregating 1000- and 10000-element slices of `float64` values. 
\n\n* `Intrinsics` uses handwritten AVX intrinsics via clang\n* `AVX2` uses plain C code, exploiting auto-vectorization and AVX2 architecture enabled via clang \n* `SSE4` uses plain C code, exploiting auto-vectorization and SSE4 architecture enabled via clang \n* `Go` is an equivalent loop in Go\n    * `Unroll4` and `Unroll8` are unrolled versions\n\n```\nBenchmarkSumFloat64_1000-8                   20000000          59 ns/op    134057.61 MB/s\nBenchmarkSumFloat64_10000-8                   2000000         842 ns/op     94949.30 MB/s\nBenchmarkSumFloat64_Intrinsics_1000-8         5000000         245 ns/op     32550.11 MB/s\nBenchmarkSumFloat64_Intrinsics_10000-8         500000        2913 ns/op     27460.17 MB/s\nBenchmarkSumFloat64_AVX2_1000-8              30000000          56 ns/op    142336.45 MB/s\nBenchmarkSumFloat64_AVX2_10000-8              2000000         847 ns/op     94426.99 MB/s\nBenchmarkSumFloat64_SSE4_1000-8               5000000         277 ns/op     28806.44 MB/s\nBenchmarkSumFloat64_SSE4_10000-8               500000        2903 ns/op     27556.33 MB/s\nBenchmarkSumFloat64_Go_1000-8                 1000000        1124 ns/op      7116.81 MB/s\nBenchmarkSumFloat64_Go_10000-8                 200000       11583 ns/op      6906.38 MB/s\nBenchmarkSumFloat64_GoUnroll4_1000-8          5000000         287 ns/op     27790.03 MB/s\nBenchmarkSumFloat64_GoUnroll4_10000-8          500000        2896 ns/op     27616.44 MB/s\nBenchmarkSumFloat64_GoUnroll8_1000-8         10000000         188 ns/op     42341.91 MB/s\nBenchmarkSumFloat64_GoUnroll8_10000-8          500000        2924 ns/op     27358.12 MB/s\n```\n\n### unicode/utf8.Valid\n\nProvide a fast implementation of `utf8.Valid` using SSE and AVX2 functions. 
Credit for these SIMD implementations goes to \nDaniel Lemire.\n\nRead [this post](https://lemire.me/blog/2018/10/19/validating-utf-8-bytes-using-only-0-45-cycles-per-byte-avx-edition/)\nfor more information on these SIMD-optimized functions.\n\n\n```\nBenchmarkValid/utf8.Valid/ASCII/100-8          20000000            79 ns/op    1257.68 MB/s\nBenchmarkValid/utf8.Valid/ASCII/10000-8          200000          6140 ns/op    1628.48 MB/s\nBenchmarkValid/utf8.Valid/ASCII/1000000-8          2000        608369 ns/op    1643.74 MB/s\nBenchmarkValid/utf8.Valid/UTF8/100-8           10000000           139 ns/op     724.09 MB/s\nBenchmarkValid/utf8.Valid/UTF8/10000-8            50000         32722 ns/op     305.60 MB/s\nBenchmarkValid/utf8.Valid/UTF8/1000000-8            500       3953426 ns/op     252.95 MB/s\nBenchmarkValid/sse4.Valid/UTF8/100-8           30000000            43 ns/op    2311.65 MB/s\nBenchmarkValid/sse4.Valid/UTF8/10000-8           500000          2436 ns/op    4104.65 MB/s\nBenchmarkValid/sse4.Valid/UTF8/1000000-8          10000        243250 ns/op    4110.98 MB/s\nBenchmarkValid/sse4.Valid/ASCII/100-8          30000000            43 ns/op    2294.62 MB/s\nBenchmarkValid/sse4.Valid/ASCII/10000-8          500000          2439 ns/op    4099.68 MB/s\nBenchmarkValid/sse4.Valid/ASCII/1000000-8          5000        246138 ns/op    4062.75 MB/s\nBenchmarkValid/avx2.Valid/ASCII/100-8          50000000            24 ns/op    4042.96 MB/s\nBenchmarkValid/avx2.Valid/ASCII/10000-8         5000000           256 ns/op   39043.62 MB/s\nBenchmarkValid/avx2.Valid/ASCII/1000000-8         50000         30786 ns/op   32481.66 MB/s\nBenchmarkValid/avx2.Valid/UTF8/100-8           50000000            35 ns/op    2864.81 MB/s\nBenchmarkValid/avx2.Valid/UTF8/10000-8          1000000          1440 ns/op    6943.45 MB/s\nBenchmarkValid/avx2.Valid/UTF8/1000000-8          10000        142939 ns/op    6995.97 MB/s\n```\n\n\n### encoding/ascii.Valid\n\nA fast implementation for determining 
if a buffer is valid ASCII data. Credit for the SIMD implementations goes\nto Daniel Lemire.\n\n```\nBenchmarkValid/go.Valid/100-8         20000000          52 ns/op     1911.59 MB/s\nBenchmarkValid/go.Valid/10000-8         500000        3048 ns/op     3280.27 MB/s\nBenchmarkValid/go.Valid/1000000-8         5000      303508 ns/op     3294.80 MB/s\nBenchmarkValid/sse4.Valid/100-8      100000000          11 ns/op     8674.49 MB/s\nBenchmarkValid/sse4.Valid/10000-8      5000000         379 ns/op    26379.43 MB/s\nBenchmarkValid/sse4.Valid/1000000-8      50000       37061 ns/op    26982.04 MB/s\nBenchmarkValid/avx2.Valid/100-8      200000000           8 ns/op    12437.12 MB/s\nBenchmarkValid/avx2.Valid/10000-8     10000000         137 ns/op    72718.12 MB/s\nBenchmarkValid/avx2.Valid/1000000-8     100000       17767 ns/op    56280.99 MB/s\n``` \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstuartcarnie%2Fgo-simd","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fstuartcarnie%2Fgo-simd","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstuartcarnie%2Fgo-simd/lists"}