le9018468's blog

By le9018468, history, 6 years ago, In English

Dear all,

I tried to write a program that called a function that used a temporary vector of size 10 10^6 times. This program ran for around 1 ~ 2 seconds. However, when I switched the vector to a bool array, it ran in 0.18 seconds. Why is there such a big difference?

  • Vote: I like it
  • +11
  • Vote: I do not like it

»
6 years ago, # |
  Vote: I like it +13 Vote: I do not like it

in general vector is a little slower than array but not that much. but in your case you maybe trying to pass vector to function by value which takes a copy from the vector every time you call the function, maybe this is the problem, can you provide more details about what you are trying to do.

»
6 years ago, # |
  Vote: I like it +57 Vote: I do not like it

Creation of small temporary vector is very slow, because it uses dynamic memory allocation. Also access to the small local array is much faster because it is located on the stack.

»
6 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by le9018468 (previous revision, new revision, compare).

»
6 years ago, # |
Rev. 7   Vote: I like it +8 Vote: I do not like it

was that a vector<bool>?

https://en.cppreference.com/w/cpp/container/vector_bool

in libstdc++ (g++) at least, vector<bool> is 8 times more space-efficient than it may seem to be, but storing each bool as a bit makes vector<bool> slower. use vector<char> for faster write-access.

benchmarking: https://privatebin.net/?60dfb7b8ccbba335#1sAIKHl7q6epmIuSqYgASYc0Y6boMbgcDhfdY7tQhaY=

compiled with g++ o.cpp -std=c++17, so all compiler optimizations are disabled to prevent compiler from breaking benchmark. my copy's of GCC version is 8.2.0.

these are average results on my PC:

bool a[10]: 0.22284
char a[10]: 0.20345
vector<bool> a(10): 3.44415 /* see EDIT */
vector<char> a(10): 0.46760

is this a sufficient answer?

EDIT:

as people who answered to this comment said, bechmarking without compiler optimizations is a bad idea and it really is.

don't believe the numbers in this benchmark. in short, vector<bool> (with -O1 or more) is much faster than it is in this "benchmark" and compares well to (sometimes is faster than) vector<char>.

making excuses, i must say that i didn't except the case of OP not using any optimizations. honestly, now i can't think of a better reason of OP's time difference than no optimizations + vector<bool>.

  • »
    »
    6 years ago, # ^ |
      Vote: I like it +17 Vote: I do not like it

    compiled with g++ o.cpp -std=c++17

    Very baaaad idea to do any benchmarks without compiler optimizations. STL implementation has a lot of temporary objects, iterators, proxy objects, allocator structures and function calls, and so on. And all this stuff is inlined or omitted with enabled optimizations.

    You can see that at godbolt: Just look at generated asm code, then delete -O2 and you will see the difference =)

    • »
      »
      »
      6 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      My main goal was to show the difference between vector<bool> and vector<char>, and that vector<char> results are comparable with bool []. So there seems to be no reason for me to mess with "benchmark"'s functions to make them work as intended with optimizations.

      -On where n>=1 produces binary that seems to ignore f0() and f1() => breaks the "benchmark":

      -O1:

      bool a[10]: 0.00000
      char a[10]: 0.00000
      vector<bool> a(10): 0.20992
      vector<char> a(10): 0.08911
      

      -O2:

      bool a[10]: 0.00000
      char a[10]: 0.00000
      vector<bool> a(10): 0.15215
      vector<char> a(10): 0.08526
      

      -O3:

      bool a[10]: 0.00000
      char a[10]: 0.00000
      vector<bool> a(10): 0.00995
      vector<char> a(10): 0.00000
      
      • »
        »
        »
        »
        6 years ago, # ^ |
        Rev. 2   Vote: I like it +5 Vote: I do not like it

        If array is not used it can be removed (by optimization) but vector is not. Even more if you have small array it can be optimized to use registers (without memory allocation). But vector produces constructor/destructor calls (side effects) and can't be optimized as good as array. something like that ...

        • »
          »
          »
          »
          »
          6 years ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          I really think that this is out of the initial question's scope.

          • »
            »
            »
            »
            »
            »
            6 years ago, # ^ |
              Vote: I like it 0 Vote: I do not like it

            This is where difference of execution speed come from.

            • »
              »
              »
              »
              »
              »
              »
              6 years ago, # ^ |
                Vote: I like it 0 Vote: I do not like it

              I will break it down then.

              The question was about why is vector around 10 times slower than bool [].

              Because normally the vector<T> and T [] (where T is some (possibly not) numeric type) working times are of the same order of magnitude, I thought that the OP was using vector<bool> and provided an answer which took that almost as a fact.

              To prove the point, I made a simple "benchmark". The "benchmark" suffers from being able to be optimized by compiler, what would lead to "benchmarking" functions not doing what was intended. That's why I compiled the program without optimizations flags (what is equal to -O0).

              vector<bool> execution time was >> than vector<char> and bool []. Why are we talking about optimizations? In case of a real program (unlike this "benchmark"), optimizations wouldn't make vector<bool> as fast as vector<char> or well-comparing to bool []. (at least I hope so)

              • »
                »
                »
                »
                »
                »
                »
                »
                6 years ago, # ^ |
                  Vote: I like it 0 Vote: I do not like it

                If after optimization program not doing what was intended this is bad optimization. I think in your case this is not a bad optimization but bad benchmarks. Sorry.

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  6 years ago, # ^ |
                  Rev. 2   Vote: I like it 0 Vote: I do not like it

                  I have never said compiler optimizations are bad. The "benchmark" is not perfect at all, but is sufficient to prove the point.

              • »
                »
                »
                »
                »
                »
                »
                »
                6 years ago, # ^ |
                  Vote: I like it 0 Vote: I do not like it

                =======================================================================

                In case of a real program (unlike this "benchmark"), optimizations wouldn't make vector bool as fast as vector char or well-comparing to bool []. (at least I hope so)

                This is what my previous comment was about. With -O0 you get 7.5x factor for vector<bool> vs vector<char>. And with -O2 you get only 2x factor. And as I tried to show you, the compiler actually does a lot of optimization for the real program. Benchmarking code with -O0 is like benchmarking it in Python, that is the results do not reflect the real situation in any way for C++ real program (i.e. submission with -O2).

                The "benchmark" suffers from being able to be optimized by compiler.

                In your case, the problem is that your test code is very week. Your code should calculate some check-value and return it. The math under calculations should be simple, but not too much, so that the compiler can not make O(1) from it. Some Fibonacci-like DP or simple algorithms like Eratosthenes sieve will help you to make a good benchmark =)

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  6 years ago, # ^ |
                    Vote: I like it 0 Vote: I do not like it

                  My "benchmark" giving 2x factor with -O2 is because of that weakness you mentioned.

                  I only needed to prove that vector<bool> is slower because of memory (not as in -On) optimization. There is no need for a stronger test period.

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  6 years ago, # ^ |
                    Vote: I like it 0 Vote: I do not like it

                  As said above, you can prove nothing with -O0. The results with optimization will be vastly different.

                  Also, the statement "vector<bool> is slower" is false. There are situations when it may be faster.

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  6 years ago, # ^ |
                    Vote: I like it 0 Vote: I do not like it

                  Okay.

                  I made an Eratosthenes-sieve-like test (and readed some stuff on the internet). The factor between vectors changed to a much smaller one.

                  bool a[10]: 0.04205
                  char a[10]: 0.03920
                  vector<bool> a(10): 0.04339
                  vector<char> a(10): 0.03941
                  

                  So, yeah you're right.

                  I'll update my answer accordingly.

              • »
                »
                »
                »
                »
                »
                »
                »
                6 years ago, # ^ |
                Rev. 3   Vote: I like it 0 Vote: I do not like it

                Try to increase memory footprint for example take M=20 in your code and you will see vector bool "will shine". The performance depends on many factors and can't be checked with such naive benchmarks.

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  6 years ago, # ^ |
                    Vote: I like it 0 Vote: I do not like it
                  bool a[20]: 0.44639
                  char a[20]: 0.40669
                  vector<bool> a(20): 7.01062
                  vector<char> a(20): 0.89420