[Tutorial] Everything about unordered_map

#	User	Rating
1	ecnerwala	3648
2	Benq	3580
3	orzdevinwang	3570
4	cnnfls_csy	3569
5	Geothermal	3568
6	tourist	3565
7	maroonrk	3530
8	Radewoosh	3520
9	Um_nik	3481
10	jiangly	3467

#	User	Contrib.
1	maomao90	174
2	awoo	164
2	adamant	164
4	TheScrasse	159
4	nor	159
6	maroonrk	156
7	-is-this-fft-	150
8	SecondThread	147
9	orz	146
10	pajenegod	145

UPD: Tricks to make unordered_map faster added.

Hi!

What is `unordred_map`?

It is a data structure like map but it is more than 4 times faster than map.you can use it in C++11 with including #include<unordered_map>.for example:

#include<unordered_map>
using namespace std;
int main(){
  unordered_map<int,int>mp;
  mp[5]=12;
  mp[4]=14;
  cout<<mp[5]<<' '<<mp[4]<<endl;//prints: 12 14
}

Lets explain it more.

How it works?

Focus on unordered_set for simplify.You can imagine that it has vector of vector like vector<vector<type> > mp. Every time you insert value V in that, it calculate its hash(I will explain how it works), let hash(V)=K; it inserts V into mp[K] (formally it calls mp[K].push_back(V)).When you call mp.count(V) it searchs for V in mp[K].

`map` VS `unordered_map` (and `set` VS `unordered_set`)

1-unordered_map is more than 4 times faster

Focus on problem 527E - Data Center Drama, it seems that it is good to use unordered_map instead of map.

My submission with map: 14269000 Time:484 MS.

My submission with unordered_map: 14269381 Time:358 MS.

Another good sample to show how fast is unordered_map is problem 178C3 - Smart Beaver and Resolving Collisions:

My submission with map: 15781810 Time limit exceeded on test 36 .

My submission with unordered_map: 15774685 Accepted (Time:966 MS).

2-unordered_set (and unordered_map) is not sorted

Please note that set (and map) is sorted, for example *(s.begin()) is always smallest number in the set; but in unordered_set it isn't. similarly there isn't lower_bound and upper_bound in unordered_set (and unordered_map similarly).

Creating hash function for structs

Now, one other problem remains, you can try to compile this code:

unordered_map<pair<int,int>,int>mp;

You will get Compilation Error! Why? See this page for unordered_map supported types. For unsupported types, you have to create your own hash function for use. For example lets see how we can create a hash function for pair<int,int>.

As you know any int value is between -2^31+1 to 2^31-1.so if we create function that for every pair<int,int> returns distinct value in type size_t(alias of unsigned int), it will be done. It is pretty easy: x.first^(x.second<<32) is good. but be careful about overflow ;for having good hash function we use hash<long long>.The code is looks like this:

struct HASH{
  size_t operator()(const pair<int,int>&x)const{
    return hash<long long>()(((long long)x.first)^(((long long)x.second)<<32));
  }
};
unordered_map<pair<int,int>,int,HASH>mp;

Now you have a unordered_map of pair<int,int> (it isn't problem what is second member, in example it was int).Creating hash function for other structs is same.

How to test `unordered_map` is faster than `map`?

You can test that for N=10^6, unordered_set(unordered_map) is more than 4 times faster than set(map) ;with this code:(compile code with command: g++ file.cpp -std=c++11 -O2)

#include <bits/stdc++.h>
using namespace std;
unordered_set<int>s;//replace it with set for test.
int main(){
  auto T=clock();
  s.reserve(32768); //updated !
  s.max_load_factor(0.25); //updated !
  for(int i=0;i<1000000;i++)
    s.insert(i);
  cout<<double(clock()-T)/CLOCKS_PER_SEC<<'\n';
}

Note1: Let your hash function H(V), it is better that H(V) returns distinct value for every distinct V, it makes unordered_map faster; but if you can't do that it doesn't problem. The only problem is that unordered_map becomes slower(however it is better than map).

Note2: Please be careful about your hash function time complexly! it is fun to use this hash function:

struct HASH{
  size_t operator()(const pair<int,int>&x)const{
    size_t ans=0;
    for(int i=0;i<x.first;i++)
      ans+=x.second;
    return ans;
  }
};

UPD I will explain hash<type> in my next post.

UPD It seems that sometimes unordered_map becames so slow.but it can improve with this two lines of code:

unordered_map<int,int>mp;
mp.reserve(1024);
mp.max_load_factor(0.25);

With this two lines unordered_map become about 10 times faster. You can replace 1024 with another suitable power of two.(it depends on number of insert s you will do).

Rev.	By	When	Δ	Comment
en16	Arpa	2020-08-08 04:58:21	1	Tiny change: ' is `unordred_map`?\' -> ' is `unordered_map`?\'
en15	Arpa	2017-02-24 19:05:47	18	post name updated
en14	Arpa	2016-02-07 21:16:53	71
en13	Arpa	2016-02-07 21:11:47	162
en12	Arpa	2016-02-07 21:09:22	74
en11	Arpa	2016-02-07 21:06:53	4	Tiny change: '____UPD: T' -> 'UPD: T'
en10	Arpa	2016-02-07 21:06:17	331
en9	Arpa	2016-02-07 20:54:44	438
en8	Arpa	2015-12-03 15:07:16	56	Tiny change: 'n};\n~~~~~' -> 'n};\n~~~~~\n\nUPD I will explain `hash<type>` in my next post.'
en7	Arpa	2015-11-30 22:21:36	2	Tiny change: ' `map`).\nNote2:' -> ' `map`).\n\nNote2:'
en6	Arpa	2015-11-30 22:07:16	448
en5	Arpa	2015-11-29 22:40:15	820
en4	Arpa	2015-11-29 19:18:52	4	Tiny change: 'red_map>`.example:\n' -> 'red_map>`.for example:\n'
en3	Arpa	2015-11-29 18:46:44	337
en2	Arpa	2015-11-29 17:21:04	349
en1	Arpa	2015-11-29 15:19:01	2957	Initial revision (published)

What is unordred_map?

How it works?

map VS unordered_map (and set VS unordered_set)

Creating hash function for structs

How to test unordered_map is faster than map?

History

What is `unordred_map`?

`map` VS `unordered_map` (and `set` VS `unordered_set`)

How to test `unordered_map` is faster than `map`?