I was looking for a practical guide on how to actually use time complexity to predict whether a solution will TLE, and found nothing. Most of what's out there are charts of which time complexities are reasonable for a given value of n, but that's a heavy oversimplification. In this guide I will try to explain how to estimate the running time of your code.

First of all, let's understand what time complexity actually means. Formal definitions aside, we can say that if a piece of code is O(f(n)), its time consumption should be something like C*f(n) + S, where C is a constant factor and S is lower-order work that is small compared to the rest.

So let's say your code consists of reading n integers and then iterating through all pairs of them. Your time consumption should be something like n + n^2: n for the reading, n^2 for going through the pairs. What time complexity tells us is that if your code is too slow, it is probably because of the n^2 part, not the n part. That's why we will mostly focus on the "biggest" term of the running-time function.
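To make that concrete, here is a minimal sketch of such a program (the comparison `v[i] < v[j]` is just a stand-in for whatever per-pair work you might actually do):

    #include <cstdio>
    #include <vector>
    
    int main() {
        int n;
        scanf("%d", &n);
        std::vector<int> v(n);
        for (int i = 0; i < n; i++)     // the O(n) part: reading the input
            scanf("%d", &v[i]);
    
        long long pairs = 0;
        for (int i = 0; i < n; i++)     // the O(n^2) part: visiting all pairs
            for (int j = 0; j < n; j++)
                pairs += (v[i] < v[j]); // stand-in for the per-pair work
        printf("%lld\n", pairs);
    }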

Now that we have isolated the slow part of your algorithm, we want to know how much time it actually takes. And here comes an important point: not all O(1) operations cost the same. A modulo operation, for example, can be several times slower than an addition (about 3x in the benchmarks below).

    long long a = 0;  // long long: the sum reaches ~5*10^17, overflowing an int
    
    for(int i = 1; i <= 1000000000; i++)
        a += i;       // one addition per iteration

This takes 2.73s on my computer.

    int a = 1000000000;  // initialized: reading an uninitialized int is undefined
    
    for(int i = 1; i <= 1000000000; i++)
        a %= i;          // one modulo per iteration

This takes 8.14s.

Different underlying constants are why, for example, using a static array can be faster than using a vector, and, for more subtle reasons, why certain ways to traverse matrices are faster than others. Not all constant-time operations are equal, and you have to be able to estimate the constants of those O(1) operations. That means either knowing how C++ works under the hood or just trying a bunch of things yourself and getting a feel for it.
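As a sketch of the traversal effect (the array size and the summing loop are my own choices for illustration): both loops below perform exactly the same number of additions, but the row-major one walks memory sequentially and is typically several times faster than the column-major one.

    #include <cstdio>
    
    const int N = 2048;
    static int m[N][N];
    
    int main() {
        long long s = 0;
    
        // Row-major: consecutive memory accesses, cache-friendly.
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                s += m[i][j];
    
        // Column-major: strides of N ints per access, many more cache
        // misses, even though the operation count is identical.
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                s += m[i][j];
    
        printf("%lld\n", s);
    }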

Now, for the algorithm itself, you often can't just ignore the constants either. Let's say you want to iterate through all combinations of 5 elements of an array. The naive solution is 5 nested for loops, giving a total time of n^5 * (whatever you are doing with each combination).

    long long a = 0;  // 100^5 = 10^10 increments would overflow an int
    for(int i1 = 0; i1 < 100; i1++){
        for(int i2 = 0; i2 < 100; i2++){
            for(int i3 = 0; i3 < 100; i3++){
                for(int i4 = 0; i4 < 100; i4++){
                    for(int i5 = 0; i5 < 100; i5++){
                        a++;
                    }
                }
            }
        }
    }

This runs in 19.96s on my computer.

Now, if the order of those elements does not matter, we can require that the second element be greater than or equal to the first, the third greater than or equal to the second, and so on. Just by adjusting the for loops to do that, our code becomes around 100 times faster! (Each unordered combination of distinct elements was being visited up to 5! = 120 times before, so the speedup approaches 120 for large n.) That's a constant we can't just ignore.

    long long a = 0;  // initialized; long long for consistency with the version above
    for(int i1 = 0; i1 < 100; i1++){
        for(int i2 = i1; i2 < 100; i2++){
            for(int i3 = i2; i3 < 100; i3++){
                for(int i4 = i3; i4 < 100; i4++){
                    for(int i5 = i4; i5 < 100; i5++){
                        a++;
                    }
                }
            }
        }
    }

The time is now 0.18s.

That's a trick that appears very often in dynamic programming and matrix exponentiation. Just by cutting a dimension in half, you can get code that runs in less than a tenth of the time it did before.
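To see why halving helps that much in matrix exponentiation specifically (a back-of-the-envelope model of mine, not a benchmark from this post): one multiplication of k×k matrices costs about k^3 operations, so exponentiation costs about k^3 * log(n), and replacing k with k/2 already divides that by 8.

    #include <vector>
    
    typedef std::vector<std::vector<long long>> Mat;
    
    // One k-by-k matrix multiplication: about k^3 multiply-adds.
    Mat mul(const Mat& A, const Mat& B) {
        int k = A.size();
        Mat C(k, std::vector<long long>(k, 0));
        for (int i = 0; i < k; i++)
            for (int l = 0; l < k; l++)   // this loop order keeps B's accesses sequential
                for (int j = 0; j < k; j++)
                    C[i][j] += A[i][l] * B[l][j];
        return C;
    }
    // k = 100 -> ~10^6 operations per multiplication; k = 50 -> ~1.25*10^5.
    // Halving k divides the work by 8 before counting any further savings.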

On the other hand, in lazy segment trees I sometimes have huge O(1) "unlazy" (push-down) functions that take around 20 operations to finish. We can't just ignore that either.
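For reference, such a push-down might look like the sketch below; the names and the range-add operation are my own illustration, not code from an actual problem. Even though it is O(1), counting the individual branches, multiplications and stores gets you to roughly 15-20 machine operations per call.

    const int MAXN = 1 << 17;
    
    long long sum[4 * MAXN];   // aggregate of each node's segment
    long long lazy[4 * MAXN];  // pending range-add, not yet pushed to children
    int len[4 * MAXN];         // number of elements each node covers
    
    // The "unlazy" / push-down step: O(1), but a branch, index arithmetic,
    // two multiplies, four adds and five stores add up fast.
    void push_down(int node) {
        if (lazy[node] == 0) return;
        int l = 2 * node, r = 2 * node + 1;
        lazy[l] += lazy[node];
        lazy[r] += lazy[node];
        sum[l] += lazy[node] * len[l];
        sum[r] += lazy[node] * len[r];
        lazy[node] = 0;
    }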

With all that, we are ready to estimate the running time of our code.

We take our algorithm and isolate the part or parts that could give us trouble. We assign constants both to the C++ operations we use and to how many times we actually execute them. Then we just plug in the values and get the expected number of operations our code performs.

In C++ we can do approximately 4*10^8 simple operations per second. If your code does not have some huge constant and the number you get when you plug values into the complexity is less than 10^7 times the number of seconds you have, I would code it without a second thought. If the code is relatively "clean", up to 10^8 is fine. Beyond that, you should probably start doing things more carefully.
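For example (numbers of my own choosing): with n = 2*10^5 and an O(n log n) solution, plugging in gives about 2*10^5 * 18 ≈ 4*10^6 operations, comfortably below 10^7 per second of time limit, so I would code it straight away. An O(n^2) solution for the same n gives 4*10^10, hopeless no matter how small the constant.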

I know it is all much more complex than this, but I hope this helps people who are starting out to know what is worth coding. Please comment with any improvements I can make. :)
