Madiyar's blog

By Madiyar, 7 years ago, In English

While solving Andrew Stankevich Contest 32, Problem K, I noticed that __builtin_popcount for long long doesn't work properly, I spent a lot of time to find my mistake, but when I wrote __builtin_popcount by myself it accepted.

__builtin_popcount defined only for int?

 
 
 
 
  • Vote: I like it
  • +31
  • Vote: I do not like it

»
7 years ago, # |
  Vote: I like it +83 Vote: I do not like it
  • __builtin_popcount = int
  • __builtin_popcountl = long int
  • __builtin_popcountll = long long
»
7 years ago, # |
Rev. 2   Vote: I like it -18 Vote: I do not like it

What is " __builtin_popcount"? I found something. But I couldn't find all of them :(

  • »
    »
    7 years ago, # ^ |
      Vote: I like it +6 Vote: I do not like it

    Funny — I could. http://lmgtfy.com/?q=popcount

    About the 3rd result from the top says "popcount returns the number of non-zero bits in x".

    • »
      »
      »
      7 years ago, # ^ |
        Vote: I like it -9 Vote: I do not like it

      It is easy to find what popcount do. But I couldn't find complexity, C++ or C++11, parameters, and works for any compiler? Also I couldn't find it on cplusplus.com .

      • »
        »
        »
        »
        7 years ago, # ^ |
          Vote: I like it +19 Vote: I do not like it

        Okay, you just said you couldn't find anything.

        • »
          »
          »
          »
          »
          7 years ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          I changed my reply. But still I don't know "complexity, C++ or C++11, parameters, and works for any compiler?"...

»
7 years ago, # |
Rev. 3   Vote: I like it +22 Vote: I do not like it

__builtin_popcount(x) is a function in C++ returns the number of 1-bits set in an int x. In fact, "popcount" stands for "population count," so this is a function to determine how "populated" an integer is.

For example, say we have an int x with value equal to 12. 12 in binary is just 1100, and the rest of the digits are just 0's. So, __builtin_popcount(12) = 2.

»
7 years ago, # |
Rev. 2   Vote: I like it +12 Vote: I do not like it

Interesting fact : __builtin_popcount is slow. Can anyone tell the complexity? For a constant time bitcount use this : http://pastie.org/9412225 .Please don't ask me how it works exactly! It'd be useful to add it to your library code :)

  • »
    »
    7 years ago, # ^ |
      Vote: I like it +5 Vote: I do not like it

    Is __builtin_popcount slow too much? Or only little?

    • »
      »
      »
      7 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      I don't know. Maybe you can tell me :)

    • »
      »
      »
      7 years ago, # ^ |
        Vote: I like it +2 Vote: I do not like it

      Seems that it is the fastest way to count bits in c++ in general case.

      • »
        »
        »
        »
        5 years ago, # ^ |
          Vote: I like it +36 Vote: I do not like it

        No. The fastest way is to precompute all popcounts for 16-bit parts (for int32 case) or 22-bit parts (for int64 case) and calculate the answer in 2 or 3 bit shifts, array element accesses and summations.

        • »
          »
          »
          »
          »
          5 years ago, # ^ |
            Vote: I like it -15 Vote: I do not like it

          Actually __builtin_popcount is faster than so-called array table algorithm (sorry for poor English :), don't know how to express it correctly) is my computer, because the memory reading without cache is slow.

          • »
            »
            »
            »
            »
            »
            5 years ago, # ^ |
              Vote: I like it +3 Vote: I do not like it

            __builtin_popcount should be faster because it compiles into one assembler instruction. But it's not true on codeforces, IIRC MinGW uses bit-shifting algorithm instead.

        • »
          »
          »
          »
          »
          5 years ago, # ^ |
            Vote: I like it +26 Vote: I do not like it
          inline int popcount(int x){
              int count = 0;
              __asm__ volatile("POPCNT %1, %0;"
                  :"=r"(count)
                  :"r"(x)
                  :
              );
              return count;
          }
          
          • »
            »
            »
            »
            »
            »
            18 months ago, # ^ |
              Vote: I like it 0 Vote: I do not like it

            How does the above compare to this approach?

            • »
              »
              »
              »
              »
              »
              »
              16 months ago, # ^ |
                Vote: I like it 0 Vote: I do not like it

              Well it varies with architecture, compiler flags, SSE enabled/disabled. I don't know of an implementation that dominates the rest all the time. Better to benchmark and see for yourself in the platform of your interest.

  • »
    »
    7 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    __builtin_popcount looks to have O (1) complexity. Here is the proposed improvement [1] and the source code [2], look from line number 808.

    [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36041
    [2] https://gcc.gnu.org/viewcvs/gcc/trunk/libgcc/libgcc2.c?view=markup&pathrev=200506

    • »
      »
      »
      7 years ago, # ^ |
        Vote: I like it +8 Vote: I do not like it

      O log k (k is number of bit). Because k is specified, the complexity is O(1).

      • »
        »
        »
        »
        18 months ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        So, the complexity for a number $$$n$$$ is $$$\mathcal{O}(\log\lfloor{\log_2(n)}\rfloor)$$$ ? I can't see where is the first $$$\log$$$ coming from. Can anybody explain this?

  • »
    »
    11 months ago, # ^ |
      Vote: I like it +34 Vote: I do not like it

    It's slow unless you instruct the compiler to use the popcnt instruction.

    #pragma GCC optimize("O3")
    #pragma GCC target("popcnt")
    #include <bits/stdc++.h>
    using namespace std;
    
    typedef long long ll;
    
    int main() {
      ios::sync_with_stdio(!cin.tie(0));
    
      int n;
      ll y = 0;
      cin >> n;
      
      for (int x=0; x<n; x++) {
          y += __builtin_popcountll(x);
      }
      
      cout << y << '\n';
    }
    

    In all tests, $$$n=10^9$$$.

    • Without pragmas, running time is 2245 ms.
    • With just O3, it's 2261 ms.
    • With just popcnt, it's 623 ms.
    • With both, it's 421ms.
  • »
    »
    8 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    The link is dead, someone who knows what was in it can please update/comment it.

    Thanks

»
22 months ago, # |
Rev. 2   Vote: I like it -25 Vote: I do not like it

There is NO __builtin_popcount in c++, it's a built in function of GCC. The function prototype is as follows: int __builtin_popcount(unsigned int) It returns the numbers of set bits in an integer (the number of ones in the binary representation of the integer).

  • »
    »
    22 months ago, # ^ |
      Vote: I like it +10 Vote: I do not like it

    Wow such a fast answer

    • »
      »
      »
      22 months ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Hmmm,,,hahaha.

      • »
        »
        »
        »
        11 months ago, # ^ |
          Vote: I like it -14 Vote: I do not like it

        Can you tell me what is the difference between GNU and GCC compilers. Because when i installed vscode I installed compilers from MinGW which had the GCC compilers i believe. But i get confused by GNU

        • »
          »
          »
          »
          »
          11 months ago, # ^ |
            Vote: I like it -14 Vote: I do not like it

          They refer to the same thing. GCC stands for GNU Compiler Collection.

»
16 months ago, # |
  Vote: I like it +8 Vote: I do not like it

Lol same thing happened with me in a codechef contest .. and i couldn't spot it during the whole contest

  • »
    »
    14 months ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    You can use just a bitset<32>(x).count() and bitset<64>(x).count() always. It will be optimized on any online judge, you can check assembly there and you will see only call __popcountdi2 in both cases.

»
5 months ago, # |
  Vote: I like it -18 Vote: I do not like it

INTEGER => __builtin_popcount() LONG INTEGER => __builtin_popcounl() LONG LONG INTEGER => __builtin_popcountll()