Arpa's blog

By Arpa, 4 years ago, In English,

Hi!

I thought that unordered_set is faster than set, because set complexity is logarithmic in size and unordered_set complexity is constant time in average.(source: cplusplus.com)

But today I saw this:

TLE(>2000 MS) with unordered_set: 15494816

Accepted(155 MS) with set: 15494846

Also my other solution(witch I submitted during the contest) with unordered_map hacked :'(

Now my question is : Is unordered_set faster than set?

UPD: I have accepted(15497282).unordered_set time was better than set(15494846) : 93<155.

Only with adding this line:s.max_load_factor(0.25);s.reserve(500);.

UPD2: it seems that it is better to use a power of 2 in reserve(i.e. s.reserve(512)).(Thanks from keyvankhademi)

Read more »

 
 
 
 
  • Vote: I like it
  • +9
  • Vote: I do not like it

By Arpa, history, 4 years ago, In English,

Hi!

One of the C++ programmers problems is to work with integers greater than 2^64-1 (we can save 0to 2^64-1 in unsigned long long int). So I want to share the best Bignum implementation I have ever seen (Link) with CodeForces Community.

Its specifications are as follows:

  • Supported operations: + , -, / , * , % , ^(pow) , gcd , lcm , abs.

  • It is able to work with Standard Input/Output streams.

  • It can cast data to long long, string.

  • It uses fast multiplication.

source.(but I have edited that and added pow and size().)

UPD1 (September 2016): Bug in void operator=(long long v) is now fixed. Thanks to AmSen.

Read more »

 
 
 
 
  • Vote: I like it
  • +13
  • Vote: I do not like it

By Arpa, 4 years ago, In English,

Hi!

I am in virtual contest and I'm seeing this in my friends standings.

What is the problem?

P.S. top10 in contest:

Read more »

Tags bug
 
 
 
 
  • Vote: I like it
  • +10
  • Vote: I do not like it

By Arpa, 4 years ago, In English,

Hi!

After my previous post about unordered_map now I want to explain hash functions.

std::hash.

C++ STL has one hash function in library <functional>. You can use it for this data types:

template<> struct hash<bool>;
template<> struct hash<char>;
template<> struct hash<signed char>;
template<> struct hash<unsigned char>;
template<> struct hash<char16_t>;
template<> struct hash<char32_t>;
template<> struct hash<wchar_t>;
template<> struct hash<short>;
template<> struct hash<unsigned short>;
template<> struct hash<int>;
template<> struct hash<unsigned int>;
template<> struct hash<long>;
template<> struct hash<long long>;
template<> struct hash<unsigned long>;
template<> struct hash<unsigned long long>;
template<> struct hash<float>;
template<> struct hash<double>;
template<> struct hash<long double>;

For example:

hash<int>hi;
int x=69;
cout<<hf(x)<<endl;
hash<long double>hld;
long double y=69.6969696969;
cout<<hld(y)<<endl;
hash<vector<int> >hv;
vector<int>v({69,69,69,69});//v→ 69,69,69,69
cout<<hv(v)<<endl;
Hand-made hash functions.

There are many ways to create hash function for your struct. For example:

struct reval{
  vector<int>v;
  int n;
  string s;
  size_t hash(){//size_t is alias of unsigned int
    hash<string>hs;
    hash<long long>hll;
    hash<int>hi;
    hash<vector>hv;
    long long ans=hs(s);
    ans<<=32;
    ans|=hi(n);
    ans=hll(ans);
    ans<<=32;
    ans|=hv(v);
    ans=hll(ans);
    return ans;
  }
}

A simple example:(be careful;it is only a sample and it isn't good hash function of course)

struct reval{
  vector<int>v;
  int n;
  string s;
  size_t hash(){//size_t is alias of unsigned int
    hash<string>hs;
    hash<int>hi;
    hash<vector>hv;
    return hs(s)+hv(v)+hi(n);
  }
}

A trick:

struct S{
  string first_name;
  string last_name;
};
namespace std{
    template<>struct hash<S>{
        size_t operator()(S const& s) const{
            size_t h1=hash<string>()(s.first_name);
            size_t h1=hash<string>()(s.last_name);
            return hash<long long>()( (h1<<32)^h2 );
        }
    };
}
int main(){
    S s;
    s.first_name="MohammadSina";
    s.last_name="PakSeresht";
    cout <<"hash(s) = "<<hash<S>()(s)<<endl;
}

Note1: You can use hash<type>()(variable), like this:

int x=69;
cout<<hash<int>()(x)<<endl;

Instead of:

int x=69;
hash<int>hasher;
cout<<hasher(x)<<endl;

Note2: please be careful about combining two hashes. There are two good ways:

1-x*P+y mod M(when P is prime number (like 43) and M is less than 2^32 and not a power of 2 (like 10^9+7)).

2-hash<long long>()((x<<32)^y) or hash<long long>()(x*1000000007LL+y).

Read more »

 
 
 
 
  • Vote: I like it
  • +5
  • Vote: I do not like it

By Arpa, history, 4 years ago, In English,

UPD: Tricks to make unordered_map faster added.

Hi!

What is unordred_map?

It is a data structure like map but it is more than 4 times faster than map.you can use it in C++11 with including #include<unordered_map>.for example:

#include<unordered_map>
using namespace std;
int main(){
  unordered_map<int,int>mp;
  mp[5]=12;
  mp[4]=14;
  cout<<mp[5]<<' '<<mp[4]<<endl;//prints: 12 14
}

Lets explain it more.

How it works?

Focus on unordered_set for simplify.You can imagine that it has vector of vector like vector<vector<type> > mp. Every time you insert value V in that, it calculate its hash(I will explain how it works), let hash(V)=K; it inserts V into mp[K] (formally it calls mp[K].push_back(V)).When you call mp.count(V) it searchs for V in mp[K].

map VS unordered_map (and set VS unordered_set)

1-unordered_map is more than 4 times faster

Focus on problem 527E - Data Center Drama, it seems that it is good to use unordered_map instead of map.

My submission with map: 14269000 Time:484 MS.

My submission with unordered_map: 14269381 Time:358 MS.

Another good sample to show how fast is unordered_map is problem 178C3 - Smart Beaver and Resolving Collisions:

My submission with map: 15781810 Time limit exceeded on test 36 .

My submission with unordered_map: 15774685 Accepted (Time:966 MS).

2-unordered_set (and unordered_map) is not sorted

Please note that set (and map) is sorted, for example *(s.begin()) is always smallest number in the set; but in unordered_set it isn't. similarly there isn't lower_bound and upper_bound in unordered_set (and unordered_map similarly).

Creating hash function for structs

Now, one other problem remains, you can try to compile this code:

unordered_map<pair<int,int>,int>mp;

You will get Compilation Error! Why? See this page for unordered_map supported types. For unsupported types, you have to create your own hash function for use. For example lets see how we can create a hash function for pair<int,int>.

As you know any int value is between -2^31+1 to 2^31-1.so if we create function that for every pair<int,int> returns distinct value in type size_t(alias of unsigned int), it will be done. It is pretty easy: x.first^(x.second<<32) is good. but be careful about overflow ;for having good hash function we use hash<long long>.The code is looks like this:

struct HASH{
  size_t operator()(const pair<int,int>&x)const{
    return hash<long long>()(((long long)x.first)^(((long long)x.second)<<32));
  }
};
unordered_map<pair<int,int>,int,HASH>mp;

Now you have a unordered_map of pair<int,int> (it isn't problem what is second member, in example it was int).Creating hash function for other structs is same.

How to test unordered_map is faster than map?

You can test that for N=10^6, unordered_set(unordered_map) is more than 4 times faster than set(map) ;with this code:(compile code with command: g++ file.cpp -std=c++11 -O2)

#include <bits/stdc++.h>
using namespace std;
unordered_set<int>s;//replace it with set for test.
int main(){
  auto T=clock();
  s.reserve(32768); //updated !
  s.max_load_factor(0.25); //updated !
  for(int i=0;i<1000000;i++)
    s.insert(i);
  cout<<double(clock()-T)/CLOCKS_PER_SEC<<'\n';
}

Note1: Let your hash function H(V), it is better that H(V) returns distinct value for every distinct V, it makes unordered_map faster; but if you can't do that it doesn't problem. The only problem is that unordered_map becomes slower(however it is better than map).

Note2: Please be careful about your hash function time complexly! it is fun to use this hash function:

struct HASH{
  size_t operator()(const pair<int,int>&x)const{
    size_t ans=0;
    for(int i=0;i<x.first;i++)
      ans+=x.second;
    return ans;
  }
};

UPD I will explain hash<type> in my next post.

UPD It seems that sometimes unordered_map becames so slow.but it can improve with this two lines of code:

unordered_map<int,int>mp;
mp.reserve(1024);
mp.max_load_factor(0.25);

With this two lines unordered_map become about 10 times faster. You can replace 1024 with another suitable power of two.(it depends on number of insert s you will do).

Read more »

 
 
 
 
  • Vote: I like it
  • +47
  • Vote: I do not like it

By Arpa, history, 4 years ago, In English,

Hi!

COCI (CROATIAN OPEN COMPETITION IN INFORMATICS) will be held saturday.

Let's discuss about problems after contest.

Read more »

 
 
 
 
  • Vote: I like it
  • +9
  • Vote: I do not like it

By Arpa, history, 4 years ago, In English,

Hello

it will be good for all programmers :)

https://sercantutar.github.io/infint/

Geek trick: you can paste all InfInt.h to top your code for sending it for CF or another programming sites :D

Read more »

 
 
 
 
  • Vote: I like it
  • +8
  • Vote: I do not like it