(prologue) Some time ago this crossed my mind, but I only recalled it now and I think it could be worth a small blog post. This is nothing big and rarely useful but nevertheless, I found it interesting so hopefully you will too (don't expect to find this enriching).
It is widely known that the time complexity to compute the GCD (greatest common divisor) of two integers a, b, using the euclidean algorithm, is .
First notice that for arbitrary integers x, y such that x ≥ y, we get . This is due to:
- x % y < y
- x % y ≤ x - y
During the process of the algorithm, suppose we have our current integers a, b, a ≥ b. In the following 2 steps, both of them are being taken modulo a value not greater than them. Using the above observation, after these 2 steps we get 2 integers at most respectively. Once one of them becomes 0 the algorithm terminates, so the algorithm terminates in at most , which implies .
This bound is nice and all, but we can provide a slightly tighter bound to the algorithm:
We show this bound by adding a few sentences to the above proof: once the smaller element becomes 0, we know that the larger element becomes the resulting gcd. Therefor we can first bound by , and lastly notice that we can change the maximum to minimum, since after one step of the algorithm the current maximum is the previous minimum; min(a, b) = max(b, a % b) when a ≥ b.
This bound is of course negligible... for a single gcd computation. It turns out to be somewhat useful when used multiple times in a row. I'll explain with an example "problem":
(1) Given an array A of n integers in range [1, M], compute their greatest common divisor.
The solution is of course, we start with the initial answer G = A0, and iterate over all the remaining elements, assigning to G the value . The known time complexity analysis gives us the bound of , for computing gcd n times over values of order M. The tighter analysis actually gives a bound that is asymptotically better: (for practical values of M, you can refer to this as ). Why is that so? again, we can determine the time complexity more carefully:
The iteration over the array gives us the factor of n, while the remaining is recieved from gcd computations, which we analyze now; Let the value Gi be equal to G after the i-th iteration, that is:
- G0 = A0
On the i-th iteration, the gcd computation starts with 2 values Gi - 1, Ai, and results with Gi, so the time complexity of it is , which is worstcase , so we will assume it's the latter.
The total gcd iterations (differing by a constant factor) is:
And generally speaking, this analysis sometimes allows us to show that the solution is quicker by a factor of .
As a last note, we can use (1) to show more such improvements. For example, (2): suppose the problem of gcd queries for ranges, and point updates. Of course, we solve this by a segment tree. The known analysis gives us a bound of per query or update. We can use (1) to give ; an update consists of a starting value, and repeatedly for steps we assign to it its gcd with some other value. Following (1), this takes the desired complexity. The same analysis is done for queries.
If the constraints were n, q ≤ 5·105, Ai ≤ 1018, then this shows that instead of doing around 6·108 operations, we do around 4·107 operations (if we follow the big O notation and ignore the constant), which is close to the well known bitset optimization (factor of 1/32).
Thanks for reading :). As a question to the reader: What other tasks can utilize this, like (2) can?