Блог пользователя oversolver

Автор oversolver, история, 6 лет назад, По-английски

I've heard more about how vector<bool> is slow, and we need to dodge him or use vector<char>.

But today I ran some tests in "Codeforces custom test" with GNU G++.

First simple erato sieve

//N = 3e7
for(int i=2;i<N;++i) if(!u[i]) for(int j=i*2;j<N;j+=i) u[j] = 1;
int cnt = 0;
for(int i=0;i<N;++i) if(u[i]) ++cnt;
cout<<cnt<<endl;

Depends on u:

bool u[N] : 420 ms, 31204 KB

vector<bool> u(N): 218 ms, 5484 KB

vector<char> u(N): 451 ms, 31164 KB

You can say like "memory constant speed up this code"

Second.

//N = 1e4
double total = 0;
for(int it=0;it<N;++it){
	for(int i=0;i<N;++i){
		int x = rand()%N;
		u[x] = 1;
	}
	for(int i=0;i<N;++i){
		int x = rand()%N;
		u[x] = 0;
	}
	for(int i=0;i<N;++i){
		total+=u[i];
		u[i] = 0;
	}
}
cout<<total/N<<endl;

bool u[N] : 2683 ms, 1860 KB

vector<bool> u(N): 2667 ms, 1832 KB

vector<char> u(N): 2620 ms, 1868 KB

We see its equal!

So maybe its not so bad? Or you have examples where vector<bool> is really slower than alternatives?

  • Проголосовать: нравится
  • +57
  • Проголосовать: не нравится

»
6 лет назад, # |
  Проголосовать: нравится -11 Проголосовать: не нравится

I only read that you shouldn't use vector because it's not a real container, it doesn't really hold bools, but stores each value in one bit, see this on stackoverflow

»
6 лет назад, # |
  Проголосовать: нравится +16 Проголосовать: не нравится

Index operator for return proxy object which can cause confusion and errors if you try to get pointer or reference to element in vector<bool>. Many people are not aware of it. It is easier to tell beginners just not using it instead of trying to explain how it works.

As for performance it takes 8 time less memory. It may help in some tasks. Similar result can be achieved with bitset. The major differences being that bitset is fixed size but vector<bool> can be resized. Bitset also provides operators for performing bit operations with two bitsets.

On one hand accessing bit in vector<bool> takes more instructions as it is necessary to calculate the bit's position in byte and extract it, that may make vector<bool> slower. Additional operations might also make it more difficult to optimize operations accessing it. On the other hand reduced memory consumption can help fitting the vector cache which can improve the performance. Taking all that in to account performance difference between vector<bool> and vector<char> will probably depend on the size of vector and access patterns. The sizes that are more favorable to vector<bool> will depend on CPU as cache sizes are different.

In the end it may seem simpler to just use vector<char> for some people to avoid wasting time deciding which one is better, debugging problems caused by difference in behavior and just use bitset (or write your own bitset) if task requires it. Having a fixed size bitset might be problematic for real world applications but in case of programming contests limits are usually known and there is no point optimizing for tests smaller than max size.

»
6 лет назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

Hmmm... why there is no memory saving in the second example?

  • »
    »
    6 лет назад, # ^ |
      Проголосовать: нравится +19 Проголосовать: не нравится

    It is, but the size of array u is much less than the minimum that will be consumed even by the empty program.

    Btw, the second example is really bad, because all the execution time is consumed by rand() operation, and not access to the array.

    • »
      »
      »
      6 лет назад, # ^ |
      Rev. 2   Проголосовать: нравится -24 Проголосовать: не нравится

      Or you have examples where vector<bool> is really slower than alternatives?

      this feeling when you dont want to give examples because you are red

      UPD. this feeling when you get downvoted because you are not red