SecondThread's blog

By SecondThread, history, 3 years ago, In English

I started thinking about this during my livestream of the solution to round 731, and discussed it in a comment a bit earlier, but it seems novel enough, and might invite enough debate, that I thought it deserved its own blog post:

Suppose I have an array $$$a$$$ of $$$n$$$ integers, each up to a million. What is the worst-case runtime of the following code?

int gcd(int a, int b) {
    return b == 0 ? a : gcd(b, a % b);
}

// in main
int g = a[0];                  // running gcd of the prefix (renamed from "gcd" so it doesn't shadow the function)
for (int i : a) g = gcd(i, g);

Conventionally, I have always thought it was $$$O(n \cdot \log(MAXVAL))$$$. However, each recursive step of the gcd shrinks its arguments by a factor of about 1.6 (the golden ratio). But really, we only spend all $$$\log(MAXVAL)$$$ recursive calls if the gcd gets quite small. In that case, future gcd calls will be $$$O(1)$$$, not $$$O(\log)$$$. So is there a clean potential function, like the number of prime factors of the gcd, that bounds the total gcd recursion? I know that this is some very hand-wavy logic, but it seems almost legitimate enough for me to believe it. Or is this totally not true? I'm secretly wishing it's true because it would be cool, but I also feel like I would have seen this before if it were both commonly known and true.

TLDR: is the above code really $$$O(n + \log(MAXVAL))$$$, instead of $$$O(n \cdot \log(MAXVAL))$$$?
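
For anyone who wants to sanity-check this empirically before reading the arguments below, here is a minimal self-contained harness (my own test code, not part of the question; gcdCount is just the same recursion with a call counter) that runs the loop on random values up to a million and prints the total number of recursive calls, which you can compare against $$$n + \log_2(MAXVAL)$$$:

#include <bits/stdc++.h>
using namespace std;

long long calls = 0; // counts every invocation of gcdCount

int gcdCount(int a, int b) {
    calls++;
    return b == 0 ? a : gcdCount(b, a % b);
}

int main() {
    const int n = 1'000'000, MAXVAL = 1'000'000;
    mt19937 rng(12345);
    vector<int> a(n);
    for (int& x : a) x = rng() % MAXVAL + 1;

    int g = a[0];
    for (int x : a) g = gcdCount(x, g);

    // Expectation if the O(n + log(MAXVAL)) bound holds: "calls" stays within
    // a small constant factor of n + log2(MAXVAL).
    cout << "total recursive calls: " << calls << "\n";
    cout << "n + log2(MAXVAL) ~ " << n + (int)log2(MAXVAL) << "\n";
}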

»
3 years ago, # |
Rev. 2

I think it is $$$\mathcal O(n + \log(MAXVAL))$$$. If the $$$\gcd$$$ doesn't change, calling $$$\gcd$$$ takes $$$\mathcal O(1)$$$ time. Otherwise, say the $$$\gcd$$$ changes $$$k$$$ times, and on the $$$i$$$-th change we recurse $$$r_i$$$ times into $$$\gcd$$$. Then $$$\mathcal O(n + \sum r_i)$$$ is the total number of times we recurse into $$$\gcd$$$, which also should be the time complexity. Let $$$g_0 = a[0]$$$, and let $$$g_i$$$ be the $$$\gcd$$$ after the $$$i$$$-th time it changes. Note that $$$r_i \leq \mathcal O(1) + 2\log_2 \frac{g_{i-1}}{g_i}$$$, because every two calls into $$$\gcd$$$ reduce the maximum of $$$a, b$$$ by a factor of at least $$$2$$$. Summing this over all $$$i$$$, the logs telescope: $$$\sum r_i \leq \mathcal O(k) + 2\log_2 \frac{g_0}{g_k} = \mathcal O(n + \log(MAXVAL))$$$, and this gives our desired bound.
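
Concretely (my own toy numbers, just to illustrate the telescoping): if the running gcd ever drops $$$720720 \to 720 \to 6 \to 1$$$, then $$$\sum r_i \leq \mathcal O(3) + 2\log_2(720720) \approx 39$$$ plus a constant, no matter how the recursion is split among the three changes.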

»
3 years ago, # |

For me, this is both commonly known and true. I think I first saw this for segtree of gcd.

»
3 years ago, # |

In short, yes. The runtime of a single call $$$gcd(a, b) = x$$$ can be bounded above by a constant times $$$1+\log{\frac{\min \{a, b\}}{x}}$$$ arithmetic operations by (for example) noticing that the size of the smaller argument must be reduced by at least half every two iterations. From there, it's easy to bound the total runtime by a telescoping series to get $$$O(n + \log(MAXVAL))$$$ runtime for the loop.
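
(Spelled out, the halving claim is the standard fact that $$$a \bmod b < a/2$$$ whenever $$$0 < b \leq a$$$: if $$$b \leq a/2$$$ then $$$a \bmod b < b \leq a/2$$$, and if $$$b > a/2$$$ then $$$a \bmod b = a - b < a/2$$$. Two iterations later the smaller argument is exactly such a remainder of the previous smaller argument, so it has at least halved.)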

»
3 years ago, # |
Rev. 2

Deleted

  • »
    »
    3 years ago, # ^ |

    Yes, but this is only for one gcd call. Since the gcd of two consecutive Fibonacci numbers is 1, all following calls will run in constant time.
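
    As a small illustration (my own sketch, not part of the original comment; gcdCount is just the usual recursion with a call counter): the first call on two consecutive Fibonacci numbers hits the worst case and takes about $$$30$$$ recursive calls here, but it also drives the running gcd down to $$$1$$$, after which every call like $$$gcd(x, 1)$$$ finishes in two invocations.

    #include <bits/stdc++.h>
    using namespace std;

    long long calls = 0;

    int gcdCount(int a, int b) {
        calls++;
        return b == 0 ? a : gcdCount(b, a % b);
    }

    int main() {
        // Consecutive Fibonacci numbers F(30) and F(31): the classic worst case for one call.
        cout << gcdCount(1346269, 832040) << " after " << calls << " calls\n"; // prints 1, ~30 calls

        // Once the running gcd is 1, any further call gcd(x, 1) recurses once and stops.
        calls = 0;
        cout << gcdCount(123456, 1) << " after " << calls << " calls\n"; // prints 1 after 2 calls
    }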

»
3 years ago, # |

It is $$$O(n + \log(maxval))$$$. I remember it being discussed in CF blogs a few times. If you build a gcd segment tree, the query complexity is also $$$O(\log(n) + \log(maxval))$$$ for the same reason.
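
For reference, here is a minimal sketch of such a gcd segment tree (my own illustrative code, not from any particular blog): a range query combines $$$O(\log n)$$$ node values with gcd, and by the same telescoping argument the expensive part of those gcd merges totals $$$O(\log(maxval))$$$, giving $$$O(\log(n) + \log(maxval))$$$ per query.

#include <bits/stdc++.h>
using namespace std;

// Minimal iterative gcd segment tree (illustrative sketch).
struct GcdSegTree {
    int n;
    vector<long long> t; // t[n + i] holds a[i]; internal node i holds gcd of its two children

    GcdSegTree(const vector<long long>& a) : n(a.size()), t(2 * a.size()) {
        for (int i = 0; i < n; i++) t[n + i] = a[i];
        for (int i = n - 1; i >= 1; i--) t[i] = __gcd(t[2 * i], t[2 * i + 1]);
    }

    // gcd of a[l..r], 0-indexed, inclusive
    long long query(int l, int r) const {
        long long res = 0; // gcd(0, x) = x, so 0 acts as the identity
        for (l += n, r += n + 1; l < r; l >>= 1, r >>= 1) {
            if (l & 1) res = __gcd(res, t[l++]);
            if (r & 1) res = __gcd(res, t[--r]);
        }
        return res;
    }
};

// Usage: GcdSegTree st(a); st.query(2, 5);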

»
2 years ago, # |
Rev. 3

Does this gcd argument also apply when building something like sparse tables / segtrees? If not, what is the actual complexity for these structures?

  • »
    »
    2 years ago, # ^ |

    I think building the sparse table is $$$O(n \times (\log n + \log a))$$$, where $$$n$$$ is the size and $$$a$$$ is the maximum value.

    For a fixed left endpoint, the gcd over the growing ranges can only decrease, so the expensive part of the gcd recursion for that endpoint is bounded by $$$O(\log a)$$$ in total; over all $$$n$$$ endpoints, the gcd part costs no more than $$$O(n \times \log a)$$$, since there is nothing left to reduce the gcd further.

    And since the sparse table itself has $$$O(n \times \log n)$$$ cells, this gives a total of $$$O(n \times (\log n + \log a))$$$, though I think it can be somewhat smaller on random inputs.
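
    For concreteness, here is a minimal sketch of the sparse table build being discussed (my own illustrative code; the input array is just an example): the table has $$$O(n \log n)$$$ cells and each cell does one gcd merge, and since for a fixed left endpoint the stored gcds can only shrink as the range doubles, the expensive part of the recursion telescopes to $$$O(\log a)$$$ per endpoint, matching the bound above.

    #include <bits/stdc++.h>
    using namespace std;

    // Sparse table for range gcd (illustrative sketch).
    // table[j][i] = gcd of a[i .. i + 2^j - 1]
    int main() {
        vector<long long> a = {12, 18, 24, 36, 8, 16}; // example input
        int n = a.size(), LOG = 1;
        while ((1 << LOG) <= n) LOG++;

        vector<vector<long long>> table(LOG, vector<long long>(n));
        table[0] = a;
        for (int j = 1; j < LOG; j++)
            for (int i = 0; i + (1 << j) <= n; i++)
                table[j][i] = __gcd(table[j - 1][i], table[j - 1][i + (1 << (j - 1))]);

        // Range query: gcd of a[l..r] as the gcd of two overlapping power-of-two blocks.
        auto query = [&](int l, int r) {
            int j = 31 - __builtin_clz(r - l + 1); // largest power of two not exceeding the length
            return __gcd(table[j][l], table[j][r - (1 << j) + 1]);
        };
        cout << query(1, 4) << "\n"; // gcd(18, 24, 36, 8) = 2
    }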