Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/andreysolovyev381/hash_table_no_invalidation
C++20 hash table that never invalidates its pointers and iterators. Slightly better than std::unordered - see ./benchmark
cpp cpp20 data-structures hash-table hashmap hashtable header-only invalidating invalidation
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/andreysolovyev381/hash_table_no_invalidation
- Owner: andreysolovyev381
- License: mit
- Created: 2024-05-26T11:10:01.000Z (8 months ago)
- Default Branch: master
- Last Pushed: 2024-07-27T10:36:26.000Z (5 months ago)
- Last Synced: 2024-07-27T11:43:57.401Z (5 months ago)
- Topics: cpp, cpp20, data-structures, hash-table, hashmap, hashtable, header-only, invalidating, invalidation
- Language: C++
- Homepage:
- Size: 4.43 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme.md
- License: LICENSE
## Hash table with *NO* invalidation of pointers and iterators
### Rationale
Another project of mine required a **hash table that never invalidates** its pointers and iterators. I made one with the humble hope of being on par with `std::unordered_set` and `std::unordered_map`. It is header-only, a single C++20 file; see the ./include folder. See ./tests for more detailed usage examples. In general, it more or less reproduces the `std::unordered_set` / `std::unordered_map` interface.
### Short usage example
```cpp
#include "include/hash_table.hpp"

...

// Template arguments are assumed here (int keys and values).
::containers::hash_table::Set<int> hashSet;
hashSet.insert(42);
auto found = hashSet.find(42);
if (found != hashSet.end()) {...}

::containers::hash_table::Map<int, int> hashMap;
hashMap[1] = 42;
hashMap.insert(std::pair{42, 42});
for (auto const& [k, v] : hashMap) {...}
```
### Reference to other hash tables
There are many other hash tables, but I found that all of them focus on performance and sacrifice exactly what I need: persistence of pointers and iterators. However, there are some serious attempts. Consider reading the [results](https://martin.ankerl.com/2019/04/01/hashmap-benchmarks-01-overview/) of the benchmark made by the people behind the `robin_hood` hash table. I also recommend this [reading](https://greg7mdp.github.io/parallel-hashmap/) from Greg Popovitch.
### Benchmark
See ./benchmark folder for code.
Here are the results for 1 million ints.
```
------------------------------------------------------------------------------------------------
Benchmark                                                           Time         CPU  Iterations
------------------------------------------------------------------------------------------------
Insertion std::unordered_set /min_time:5.000                    0.144 us    0.143 us    50408238
Insertion containers::hash_table::Set /min_time:5.000           0.129 us    0.128 us    53205182
Access std::unordered_set /min_time:5.000                       0.072 us    0.072 us    96885052
Access containers::hash_table::Set /min_time:5.000              0.067 us    0.066 us   105449693
Erase std::unordered_set /min_time:5.000                        0.009 us    0.009 us   618141538
Erase containers::hash_table::Set /min_time:5.000               0.089 us    0.089 us    69120837
```
### Implementation details
* Open addressing with linear probing is used for collision resolution, in case anybody cares.
* It is allowed to throw; you are the one who should catch.
* Indeed, to nail down all the data, a hash table should use a linked list as its underlying structure. The problem is that random placement of nodes in memory turns out to be bad for cache locality.
But thanks to [Bloomberg](https://github.com/bloomberg) and their contribution to committee work, we have the `std::pmr` namespace and polymorphic allocators.
Long story short, this hash table is built upon `std::pmr::list`, which allows list nodes to be placed in memory in an array-like fashion, making the container more cache-friendly.
Another issue that comes with using `std::pmr` is a strong requirement to follow the "rule of five". Specifically, see ./test/pmr.cpp: the copy constructor and copy assignment fail if they are defaulted. That is the reason why HashTable is move-only. An allocator-aware linked list may be a good solution here; see, e.g., the root source of everything, [Pablo Halpern](https://github.com/phalpern)'s CppCon 2017 [talk](https://www.youtube.com/watch?v=v3dz-AKOVL8). But it is beyond my needs, so feel free to contribute :smirk:
* My hypothesis is that after some insert / remove cycles this hash table will deteriorate in performance: "pogrom is a pogrom", a list is a list, and the appearance of "holes" in that initial array-like placement is inevitable.
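
To make the idea concrete, here is a minimal sketch of the general technique: values live in a `std::pmr::list` fed by a `std::pmr::monotonic_buffer_resource`, and an open-addressing slot array with linear probing stores only pointers to the list nodes. It is not the code from ./include; the class name `StableIntSet`, the buffer size, the 0.5 load factor and the absence of erase are all assumptions made for illustration. Growing the slot array re-points the slots but never moves a list node, so a pointer obtained from `insert` stays valid.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <memory_resource>
#include <utility>
#include <vector>

// Hypothetical illustration only; not the repository's HashTable.
class StableIntSet {
public:
    StableIntSet() : slots_(8, nullptr) {}

    // Inserts value (if absent) and returns a pointer to the stored element.
    // The pointer stays valid across later inserts and slot-array growth.
    const int* insert(int value) {
        if (const int* hit = find(value)) return hit;
        // Keep the load factor at or below 0.5 so probing always finds a free slot.
        if ((values_.size() + 1) * 2 > slots_.size()) grow();
        values_.push_back(value);
        int* stored = &values_.back();
        place(slots_, stored);
        return stored;
    }

    // Linear probing: walk forward from the home slot until a match or an empty slot.
    const int* find(int value) const {
        std::size_t i = std::hash<int>{}(value) % slots_.size();
        while (slots_[i] != nullptr) {
            if (*slots_[i] == value) return slots_[i];
            i = (i + 1) % slots_.size();
        }
        return nullptr;
    }

    std::size_t size() const { return values_.size(); }

private:
    static void place(std::vector<int*>& slots, int* stored) {
        std::size_t i = std::hash<int>{}(*stored) % slots.size();
        while (slots[i] != nullptr) i = (i + 1) % slots.size();
        slots[i] = stored;
    }

    // Rebuilds only the slot array; the list nodes themselves are untouched.
    void grow() {
        std::vector<int*> bigger(slots_.size() * 2, nullptr);
        for (int& v : values_) place(bigger, &v);
        slots_ = std::move(bigger);
    }

    // Declaration order matters: the buffer and arena must outlive the list.
    std::array<std::byte, 1 << 16> buffer_{};
    std::pmr::monotonic_buffer_resource arena_{buffer_.data(), buffer_.size()};
    std::pmr::list<int> values_{&arena_};  // nodes are carved sequentially from buffer_
    std::vector<int*> slots_;              // open-addressing index into the list
};
```

Because the monotonic resource hands out consecutive chunks of the buffer, freshly inserted nodes end up next to each other, which is the "array-like" placement described above. Once the buffer is exhausted the resource falls back to its upstream allocator, and the sketch is neither copyable nor movable because `std::pmr::monotonic_buffer_resource` is not.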
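
One standard-library behaviour that may contribute to the copy problem mentioned above (the actual failure in ./test/pmr.cpp is not reproduced here, so treat this as an assumption) is that `std::pmr` containers do not carry their memory resource over on copy construction: the copy uses the default resource, while a move-constructed container keeps the original one. The helper name `inspect_pmr_copy` and the struct `CopyReport` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <list>
#include <memory_resource>
#include <utility>

// Hypothetical helper types for illustration.
struct CopyReport {
    bool copy_uses_arena;
    bool copy_uses_default;
    bool move_uses_arena;
};

inline CopyReport inspect_pmr_copy() {
    std::array<std::byte, 1024> buffer{};
    std::pmr::monotonic_buffer_resource arena{buffer.data(), buffer.size()};

    std::pmr::list<int> original{&arena};
    original.push_back(42);

    // Copy construction: polymorphic_allocator does not propagate,
    // so the copy allocates from the default resource.
    std::pmr::list<int> copy(original);
    const bool copy_uses_arena = copy.get_allocator().resource() == &arena;
    const bool copy_uses_default =
        copy.get_allocator().resource() == std::pmr::get_default_resource();

    // Move construction: the allocator travels with the container.
    std::pmr::list<int> moved(std::move(original));
    const bool move_uses_arena = moved.get_allocator().resource() == &arena;

    return CopyReport{copy_uses_arena, copy_uses_default, move_uses_arena};
}
```

A container that also stores pointers or iterators into such a list would additionally have to rebuild them after a copy, which is one more reason a defaulted copy cannot work.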
### License
MIT
### Disclosure
Despite heavy testing performed (see ./tests/test_11.txt), no guarantees of any kind are given whatsoever. Use it at your own risk.