{"id":13687712,"url":"https://github.com/sslotin/amh-code","last_synced_at":"2025-04-04T17:07:52.782Z","repository":{"id":46749338,"uuid":"320249282","full_name":"sslotin/amh-code","owner":"sslotin","description":"Complete implementations from \"Algorithms for Modern Hardware\"","archived":false,"fork":false,"pushed_at":"2022-12-11T14:30:07.000Z","size":9293,"stargazers_count":738,"open_issues_count":2,"forks_count":46,"subscribers_count":28,"default_branch":"main","last_synced_at":"2025-03-28T16:07:07.174Z","etag":null,"topics":["algorithms","computer-science","hpc","performance"],"latest_commit_sha":null,"homepage":"https://en.algorithmica.org/hpc","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sslotin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-12-10T11:20:06.000Z","updated_at":"2025-03-27T02:28:06.000Z","dependencies_parsed_at":"2023-01-27T01:00:59.186Z","dependency_job_id":null,"html_url":"https://github.com/sslotin/amh-code","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sslotin%2Famh-code","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sslotin%2Famh-code/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sslotin%2Famh-code/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sslotin%2Famh-code/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sslotin","download_url":"https://codeload.github.com/sslotin/amh-code/tar.gz/refs/heads/main","host":{"n
ame":"GitHub","url":"https://github.com","kind":"github","repositories_count":247217183,"owners_count":20903009,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["algorithms","computer-science","hpc","performance"],"created_at":"2024-08-02T15:00:59.192Z","updated_at":"2025-04-04T17:07:52.764Z","avatar_url":"https://github.com/sslotin.png","language":"Jupyter Notebook","funding_links":[],"categories":["Jupyter Notebook"],"sub_categories":[],"readme":"# Algorithms for Modern Hardware\n\nThis repository contains full examples and other associated code from https://en.algorithmica.org/hpc\n\nThe book is still unfinished, and my writing process is very slow and non-sequential — sometimes the \"idea → code → benchmarks → article\" pipeline may take 6 months or even more — so in this repository you can get a preview on a lot of interesting things that I haven't yet properly written up and published.\n\nThings that have improved on the state-of-the-art:\n\n- Many variants of [binary search](https://github.com/sslotin/amh-code/tree/main/binsearch), the [fastest one](https://github.com/sslotin/amh-code/blob/main/binsearch/bplus.cc) achieving ~15x speedup over `std::lower_bound` for small arrays (that fit in cache) and ~8x speedup for large arrays (\u003e1e6).\n- [Argmin at the speed of memory](https://github.com/sslotin/amh-code/blob/main/argmin/simdmin.cc).\n- Implementation of [the Floyd-Warshall algorithm](https://github.com/sslotin/amh-code/tree/main/floyd) that is about 50x faster than the naive \"for-for-for\" algorithm.\n\nThings that match current state-of-the-art:\n\n- [A 
version of a segment tree](https://github.com/sslotin/amh-code/blob/main/segtree/refactor2.cc) that can compute prefix sums in ~2ns plus the time of the slowest memory read.\n- (✓ published) [An implementation of GCD](https://en.algorithmica.org/hpc/analyzing-performance/gcd/) that works 2-3x faster than `std::gcd`.\n- [Integer factorization](https://github.com/sslotin/amh-code/blob/main/factor/montgomery.cc) taking ~0.5ms per 60-bit integer.\n- An algorithm for parsing series of integers ~2x faster than `scanf(\"%d\")` does.\n- An implementation of [BLAS-level matrix multiplication](https://github.com/sslotin/amh-code/blob/main/matmul/v6.cc) that can be expressed in [about 30 lines of C](https://gist.github.com/sslotin/fae39ea49a812732ae45db7b72f6a7ff).\n- Various efficient [hash tables](https://github.com/sslotin/amh-code/tree/main/hash-tables).\n- Efficient [FFT](https://github.com/sslotin/amh-code/tree/main/fft) and Karatsuba algorithm implementations.\n\nVarious benchmarks:\n\n- Benchmarks for [branching and predication](https://github.com/sslotin/amh-code/tree/main/branching).\n- Benchmarks for [RAM and CPU cache system](https://github.com/sslotin/amh-code/tree/main/cpu-cache).\n\nAt the implementation stage:\n\n- Ordered Trees (apply the same technique as with binary searching, but with dynamically-allocated B-tree nodes)\n- Range minimum queries (both static and dynamic)\n- Filters (Bloom, cuckoo, xor, theoretical minimum)\n- Dot product / logistic regression (Newton's method, SIMD, quantization)\n- Prime number sieves (blocking plus wheel)\n- Sorting (speeding up quicksort and mergesort with SIMD and radix sort)\n- Writing series of integers (SIMD + fast mod-10)\n- Bitmaps (blocking, SIMD)\n\nAt the idea stage:\n\n- String searching (SIMD-based strstr and rolling hashing)\n- Using SIMD to speed up Pollard's algorithm (naive sqrt-parallelization)\n- SIMD-based random number generation and 
hashing\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsslotin%2Famh-code","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsslotin%2Famh-code","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsslotin%2Famh-code/lists"}