Recently I learned about FFT algorithm, and I was confused because of it's precision errors.
Suppose that I want to multiply two polynomials A(x) and B(x) modulo 10^9 + 7. (max degree <= 10^6) Then I have two options :
use NTT with modulo 119 * 2^23 + 1. With these I don't have to worry about precision errors, but I have to split the coefficients in something like (c0 + c1 * 32 + c2 * 32^2 + c3 * 32^3 .... c5 * 32^5). This is the best we can use, since if we use something over 32, (33-1) * (33-1) * 1000000 > modulo. So I have to split it into at least 6 pieces, which is horrible..
use FFT with complex numbers : Well... we don't have modulos, so I have to guess here. I've experimented with 10^6-digit long integers with all digits in 9. (basically I calculated (10 ** 1000000 — 1) ** 2) In first try, I grouped two digits together (coeff < 100), and it worked fine. In second try, I grouped three digits together(coeff < 1000) , which had precision errors. So I can guess that it will be okay to split under 100. Now 5 pieces. Which is... eh, improvement, but still horrible.
Regarding problem F in here or this (spoiler alert), It's clear that there is a demand for big modulos. However, grouping into something like 30 or 100 is not only dumb, but very slow and constant-heavy, and it should be avoided.
I concluded that NTT won't help me in this problem, and arrived in following questions :
- How can we estimate the precision error of FFT? (or, is there some kind of rule of thumb in here?)
- What is the common way to avoid this situation? Isn't there any other way than grouping with number lesser than sqrt(mod), or karatsuba?
FYI, this is the code used in my experiment : https://gist.github.com/koosaga/ca557138cabe9923bdaacfacd9810cb1