Runtime of finding the GCD of an array?

#	User	Rating
1	tourist	3880
2	jiangly	3669
3	ecnerwala	3654
4	Benq	3627
5	orzdevinwang	3612
6	Geothermal	3569
6	cnnfls_csy	3569
8	jqdai0815	3532
9	Radewoosh	3522
10	gyh20	3447

#	User	Contrib.
1	awoo	161
2	maomao90	160
3	adamant	157
4	maroonrk	154
5	-is-this-fft-	148
5	atcoder_official	148
5	SecondThread	148
8	Petr	147
9	TheScrasse	145
10	nor	144

I started thinking about this during my livestream of the solution to round 731, and discussed this in a comment a bit earlier, but it seems novel enough and might require enough debate that I thought it makes sense to be it's own blog post:

Suppose I have an array $$$a$$$ of $$$n$$$ integers, each up to a million. What is the worst-case runtime of the following code?

int gcd(int a, int b) {
   return b==0?a:gcd(b, a%b);
}

//in main
int gcd=a[0];
for (int i:a) gcd=gcd(i, gcd);

Conventionally, I have always thought it was $$$n*log(MAXVAL)$$$. However, each time the gcd call runs, the answer decreases by an expected factor of about 1.6 (the golden ratio). But really, we only do all $$$log(n)$$$ calls if the gcd gets quite small. In that case, future gcd calls will be $$$O(1)$$$, not $$$O(log)$$$. So do we have a clean potential function like: the number of prime factors in gcd, which can bound the total GCD recursion? I know that this is some very hand-wavey logic, but it seems almost legit enough for me to believe it. Or is this totally not true? I'm secretly wishing it's true because it would be cool, but also I feel like I would have seen this before if it was both commonly known and true.

TLDR: is the above code really $$$O(n + log(MAXVAL))$$$, instead of $$$O(n * log(MAXVAL))$$$?

Comments (8)

Write comment?

12tqian

3 years ago, # |

← Rev. 2 →

+83

I think it is $$$\mathcal O(n + \log(MAXVAL))$$$. If the $$$\gcd$$$ doesn't change, calling $$$\gcd$$$ takes $$$\mathcal O(1)$$$ time. Otherwise, say on the $$$k$$$ times it changes, let's say we recurse $$$r_i$$$ times into $$$\gcd$$$. Then $$$\mathcal O(n + \sum r_i)$$$ is the number of times we recurse into $$$\gcd$$$, which also should be the time complexity. Let $$$g_0 = a[0]$$$, and let $$$g_i$$$ be the $$$\gcd$$$ after the ith time we change the $$$\gcd$$$. So note that $$$\frac{g_i}{g_{i-1}} \geq 2^{r_i}$$$, because each call into $$$\gcd$$$ reduces the maximum of $$$a, b$$$ by a factor of at least $$$2$$$. Then taking the product of this all for all $$$i$$$, we get that $$$\frac{g_0}{g_k} \geq 2^{\sum r_i}$$$, and taking the $$$\log$$$ of this gives our desired bound.

→ Reply

Nson

+14

For me, this is both commonly known and true. I think I first saw this for segtree of gcd.

clyring

+18

In short, yes. The runtime of a single call $$$gcd(a, b) = x$$$ can be bounded above by a constant times $$$1+\log{\frac{\min \{a, b\}}{x}}$$$ arithmetic operations by (for example) noticing that the size of smaller argument must be reduced by at least half every two iterations. From there, it's easy to bound the total runtime by a telescoping series to get $$$O(n + log(MAXVAL))$$$ runtime for the loop.

tch1cherin

-13

Deleted

SecondThread

3 years ago, # ^ |

Yes, but this is only for one gcd call. Since the gcd of two consecutive Fibonacci numbers is 1, all following calls will run in constant time.

Savior-of-Cross

It is $$$O(n + log(maxval))$$$. I remember it being discussed in cf blogs a few times. If you build gcd-segment tree, the query complexity is also $$$O(log(n) + log(maxval))$$$ for the same reason.

hbarp

← Rev. 3 →

Is this theory of gcd applicable to build for something like sparse tables / Segtrees too ? If not, what is the actual complexity on these structures ?

SPyofcode

Yet I think sparse table is $$$O(n \times (\log n + \log a))$$$ whereas $$$n$$$ is the size and $$$a$$$ is maximum value

Since the gcd will be calculated independently to the size of $$$n$$$. And there are $$$n$$$ such value. Then the gcd part should not be calculated more then $$$O(n \times \log a)$$$ as there are not thing more to reduce the gcd

And since the sparse table itself having $$$O(n \times \log n)$$$ complexity. This give us the total of $$$O(n \times (\log n + \log a))$$$ but I think it can be some what smaller as for random cases

SecondThread's blog