The Policy Hash Table has 3-6x faster insertion/deletion and 3-10x increase for writes/reads. As far as I can tell, there are no downsides. The policy hash table (specifically the open-addressing version), beats out unordered_map in all my benchmarks.
I've often been irritated by how slow unordered_map is in C++. Too often, I have something that runs fast enough in terms of complexity, but the constant factor from unordered_map slows down the solution too much. Yesterday though, after using the useful order statistics tree from https://codeforces.com/blog/entry/11080, I was curious if there were any other useful data structures hiding in the Policy STL. And lo and behold, I found a hash table.
Well, enough backstory, let's look at some numbers. All benchmarks below are compiled with C++14 -O2.
unordered_maplinear insertion: 0.689846 cc_hash_tablelinear insertion: 0.408233 gp_hash_tablelinear insertion: 0.256131 unordered_maplinear read/write: 1.69783 cc_hash_tablelinear read/write: 0.202474 gp_hash_tablelinear read/write: 0.26842 unordered_maprandom insertion: 2.90184 cc_hash_tablerandom insertion: 3.15129 gp_hash_tablerandom insertion: 0.56553 unordered_maprandom read/write: 2.02336 cc_hash_tablerandom read/write: 0.333415 gp_hash_tablerandom read/write: 0.403486
While for linear insertions, the policy hash table gives modest improvements, the policy hash table blows the unordered_map out of the water when it comes to reads/writes. These are order of magnitude improvements that make hash tables usable when they previously weren't.
Benchmarks of course, don't always reflect the real world. So here's an example of it allowing a solution to be accepted that TLE's with unordered_map.
Example problem (5000 ms time limit): http://codeforces.com/contest/264/problem/C
Solution with unordered_map: http://codeforces.com/contest/264/submission/40542899 (TLE on test case 8)
Solution with policy hash table directly substituted in: http://codeforces.com/contest/264/submission/40573491 (TLE on test case 26)
Solution with unordered_map, rewritten to not use clears: http://codeforces.com/contest/264/submission/40590437 (TLE on test case 26) Solution with policy hash table and rewritten to not use clears: http://codeforces.com/contest/264/submission/40574196 (AC with max time of 3180 ms)
To use this data structure:
#include <ext/pb_ds/assoc_container.hpp> using namespace __gnu_pbds; gp_hash_table<int, int> table;
From there, the API seems almost exactly the same.
PS: In other posts for unordered_map, I've seen people claim that reserve and max_load_factor could increase performance drastically. They didn't seem to do much for me. However, if you want to do something similar for these hash tables, use
typedef cc_hash_table< int, int, hash<int>, equal_to<int>, direct_mask_range_hashing<int>, hash_standard_resize_policy<hash_exponential_size_policy<>, hash_load_check_resize_trigger<true>, true>> ht;, and you should be able to resize and set load factor manually as well.
Code for the benchmarks can be found here: https://ideone.com/ZkNMbH
EDIT: I realized that gp_hash_table is the way to go, not cc_hash_table. gp_hash_table sacrifices ~10% reading/writing speed to gain 3-6x in insertion/deletion/clearing. I updated the post to reflect the new numbers.