iamsr's blog

By iamsr, history, 5 years ago, In English

In problem 1167C, this submission gets a TLE. However, changing the line ele.clear(); to ele = unordered_set<lli>(); gets an AC. Why does that happen? I read the docs and std::unordered_set::clear has complexity "linear in the size of the container, i.e., the number of elements". We know that the container ele's size won't exceed n, the total number of elements. Thus, the total cost of clear operations does not exceed n. So, it just adds an extra complexity of n to my original code. At max, my code execution time should double, however, my AC code takes 499ms, so the code with .clear() shouldn't have taken more than around 1sec in any circumstances.

Could anyone help me understand the problem here?

  • Vote: I like it
  • +55
  • Vote: I do not like it

| Write comment?
»
5 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Your logic is correct, but you forgot that unordered_set is a hash table and to clear a hash table you need more time than creating a new one. I believe, if you will replace unordered_set with set you will get AC with .clear().

  • »
    »
    5 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Hi, I tried and you're right that set gets AC while unordered_set TLEs. However, I do not exactly understand how clearing a hash table takes more time than, say, clearing the red black tree a set uses?

    • »
      »
      »
      5 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Well, just try to implement a hash table yourself with all operations (insert, delete, clear) and you will understand what is the complexity... Or you can google it...

»
5 years ago, # |
Rev. 4   Vote: I like it +16 Vote: I do not like it

A simple solution is to insert ele.rehash(0) right after ele.clear().
Probably, TLE is due to two facts related to std::unordered_set::clear. First, its time complexity depends on the size of buckets as well as the number of elements. Second, it doesn't change the size of buckets while it erases all the elements. You can find these facts by running the following code.

int main()
{
    std::unordered_set<int> ust;
    for(int i = 0; i < 200000; ++i)
    {
        ust.insert(i);
    }
    ust.clear();
    for(int i = 0; i < 200000; ++i)
    {
        ust.clear();
    }
    return 0;
}

It's so time-consuming against our expectations. The bucket size of ust doesn't change even after the first ust.clear(), and is 218971 in my environment. Then ust.clear() is called 200000 times, that is, the amount of calculation is 218971x200000!!
As I mentioned, you can avoid this tricky behavior by rehashing right after the first ust.clear().

»
5 years ago, # |
  Vote: I like it +8 Vote: I do not like it

This is a known libstdc++ bug:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67922

Previously there was a related bug with erase at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41975 ; however it's fixed.