smax's blog

By smax, 16 months ago, In English

»
16 months ago, # |

Thank you for the blog! The text is very easy to read in my opinion. I would like to add something.

You provided an algorithm for finding the $$$k$$$-th term of a linear recurrence, which works in $$$O(M(n)\log{k})$$$ time, where $$$M(n)$$$ is the time complexity of polynomial multiplication. It is actually not the fastest algorithm (anymore); check out this paper: what you described seems to be Fiduccia's algorithm, and it takes $$$3M(n)\log{k} + O(n\log{k})$$$ time (according to that paper). Its authors propose another algorithm, which runs in $$$2M(n)\log(k+1) + M(n)$$$ time. It works as follows:

Let $$$Q(x)$$$ be the characteristic polynomial of our recurrence, and let $$$F(x) = \sum_{i=0}^{\infty}a_ix^i$$$ be the generating formal power series of our sequence. Then it can be seen that all nonzero terms of $$$F(x)Q(x)$$$ have degree at most $$$n-1$$$. This means that $$$F(x) = P(x)/Q(x)$$$ for some polynomial $$$P(x)$$$. Moreover, we know what $$$P(x)$$$ is: it is simply the first $$$n$$$ terms of $$$F(x)Q(x)$$$, so it can be found with one multiplication of $$$a_0 + \ldots + a_{n-1}x^{n-1}$$$ by $$$Q(x)$$$, followed by trimming to the proper degree.
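As a tiny concrete sketch of this step (assumptions: Fibonacci as the example sequence, $$$Q(x) = 1 - x - x^2$$$ taken in the reversed form so that $$$F(x)Q(x)$$$ is a polynomial, and a naive `poly_mul` helper standing in for FFT):

```python
# Recovering P(x) from the initial terms: multiply the first n terms of F(x)
# by Q(x) and trim to degree < n.
def poly_mul(a, b):
    res = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            res[i + j] += x * y
    return res

n = 2
init = [0, 1]            # a_0 = 0, a_1 = 1 (Fibonacci)
Q = [1, -1, -1]          # 1 - x - x^2
P = poly_mul(init, Q)[:n]
print(P)                 # [0, 1], i.e. P(x) = x, so F(x) = x / (1 - x - x^2)
```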

So what remains is to find $$$[x^k]\dfrac{P(x)}{Q(x)}$$$ (recall that $$$[x^k]G(x)$$$ is the $$$k$$$-th coefficient of a formal power series or a polynomial $$$G(x)$$$). We do this recursively:

$$$[x^k]\dfrac{P(x)}{Q(x)} = [x^k]\dfrac{P(x)Q(-x)}{Q(x)Q(-x)} = [x^k]\dfrac{S(x)}{T(x^2)} = [x^k]\dfrac{S_0(x^2) + xS_1(x^2)}{T(x^2)}.$$$

Here $$$S(x) = P(x)Q(-x)$$$, and $$$S_0(x)$$$ and $$$S_1(x)$$$ are the polynomials formed by the even- and odd-indexed terms of $$$S(x)$$$, respectively. We also notice that $$$Q(x)Q(-x)$$$ is an even function, so its polynomial representation contains only even-indexed terms, which form a polynomial $$$T(x)$$$.

Now, if $$$k$$$ is even, then

$$$[x^k]\dfrac{S_0(x^2) + xS_1(x^2)}{T(x^2)} = [x^k]\dfrac{S_0(x^2)}{T(x^2)} = [x^{k/2}]\dfrac{S_0(x)}{T(x)},$$$

otherwise

$$$[x^k]\dfrac{S_0(x^2) + xS_1(x^2)}{T(x^2)} = [x^k]\dfrac{xS_1(x^2)}{T(x^2)} = [x^{(k-1)/2}]\dfrac{S_1(x)}{T(x)}.$$$

Thus we halved $$$k$$$ at the cost of only two multiplications. Moreover, one can probably use even fewer FFTs to compute $$$Q(x)Q(-x)$$$ and maybe apply other tricks to speed this up.
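The whole procedure above can be sketched as follows (a minimal implementation, not the paper's optimized version: `poly_mul` is a naive $$$O(n^2)$$$ multiply standing in for FFT, `MOD` is an assumed NTT-friendly prime, and $$$Q(x) = 1 - c_1x - \dots - c_nx^n$$$ is the reversed characteristic polynomial):

```python
# k-th term of a_i = c_1 a_{i-1} + ... + c_n a_{i-n} by the halving trick
# described above. Names P, Q, S, T follow the comment's notation.
MOD = 998244353

def poly_mul(a, b):
    res = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            res[i + j] = (res[i + j] + x * y) % MOD
    return res

def kth_term(init, c, k):
    """init = [a_0..a_{n-1}], c = [c_1..c_n]."""
    n = len(c)
    Q = [1] + [(-cj) % MOD for cj in c]      # 1 - c_1 x - ... - c_n x^n
    P = poly_mul(init, Q)[:n]                # first n terms of F(x)Q(x)
    while k > 0:
        # Qm(x) = Q(-x): negate odd-indexed coefficients
        Qm = [(q if i % 2 == 0 else (-q) % MOD) for i, q in enumerate(Q)]
        S = poly_mul(P, Qm)                  # numerator P(x)Q(-x)
        T = poly_mul(Q, Qm)                  # Q(x)Q(-x) = T(x^2), even function
        P = S[k % 2::2]                      # keep S_0 or S_1 by parity of k
        Q = T[::2]                           # T(x^2) -> T(x)
        k //= 2
    # k == 0: answer is the constant term P(0)/Q(0); Q(0) = 1 by construction.
    return P[0] * pow(Q[0], MOD - 2, MOD) % MOD if P else 0

# Fibonacci: a_i = a_{i-1} + a_{i-2}
print(kth_term([0, 1], [1, 1], 10))  # 55
```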

  • »
    »
    16 months ago, # ^ |

    Woah that's a neat trick, thanks for sharing!

  • »
    »
    14 months ago, # ^ |

    Interestingly, this approach (per Schönhage's article, inspired by Graeffe's method) also applies to reciprocals.

    $$$\frac{1}{Q(x)} = \frac{Q(-x)}{Q(-x)Q(x)} = \frac{Q(-x)}{T(x^2)}$$$

    If we want to calculate the result $$$\bmod x^{n+1}$$$, only the coefficients of $$$\frac{1}{T(x)}$$$ up to $$$x^{\lfloor n/2\rfloor}$$$ matter, which gives a half-sized reduction for calculating $$$\frac{1}{Q(x)}$$$. There seem to be no significant constant-factor savings here, but it might be simpler to comprehend than the usual Newton iteration.
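    A minimal sketch of this reciprocal (assumptions: naive multiplication in place of FFT, `MOD` an assumed NTT-friendly prime, and `cap` counting how many leading coefficients we want):

    ```python
    # 1/Q(x) mod x^cap by the Graeffe-style reduction above: multiply by Q(-x),
    # recurse on the half-size even part T(x), then expand back.
    MOD = 998244353

    def poly_mul(a, b, cap):
        res = [0] * min(cap, len(a) + len(b) - 1)
        for i, x in enumerate(a):
            for j, y in enumerate(b):
                if i + j < cap:
                    res[i + j] = (res[i + j] + x * y) % MOD
        return res

    def inverse(Q, cap):
        """First `cap` coefficients of 1/Q(x); requires Q[0] != 0."""
        if cap == 1:
            return [pow(Q[0], MOD - 2, MOD)]
        Qm = [(q if i % 2 == 0 else (-q) % MOD) for i, q in enumerate(Q)]
        T = poly_mul(Q, Qm, cap)[::2]        # Q(x)Q(-x) = T(x^2)
        invT = inverse(T, (cap + 1) // 2)    # half-size recursion
        u = [0] * (2 * len(invT))            # invT(x^2)
        u[::2] = invT
        return poly_mul(Qm, u, cap)          # Q(-x) / T(x^2)

    print(inverse([1, -1, -1], 6))           # 1/(1-x-x^2): [1, 1, 2, 3, 5, 8]
    ```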

»
16 months ago, # |

In the first code shown, you write that if $$$n$$$ is $$$1$$$ (which means $$$f(x) = x - c_0$$$) then $$$a = [c[0]]$$$, without explaining why. For some of us it might not be obvious that $$$x^k \equiv c_0^k \pmod{x - c_0}$$$. If someone else didn't get that, notice that $$$x^k - r^k = (x-r)(x^{k-1} + x^{k-2}r + x^{k-3}r^2 + \dots + xr^{k-2} + r^{k-1})$$$. This extends to arbitrary polynomials to give $$$f(x) \equiv f(r) \pmod{x - r}$$$, which is known as the polynomial remainder theorem.

  • »
    »
    16 months ago, # ^ |

    Good catch, I did forget to explain that part of the code. You can also think of it as $$$x^k \bmod (x - c_0) = (x \bmod (x - c_0))^k \bmod (x - c_0) = c_0^k \bmod (x - c_0)$$$. $$$x \bmod (x - c_0) = c_0$$$ by doing one round of long division.
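    A quick numeric check of this identity (a sketch; `rem_by_linear` is a hypothetical helper using Horner's scheme, which is exactly synthetic division by $$$x - r$$$):

    ```python
    # By the polynomial remainder theorem, the remainder of p(x) divided by
    # (x - r) is just p(r), which Horner's scheme computes directly.
    def rem_by_linear(p, r):
        """Remainder of p(x) (coefficients high-to-low) divided by (x - r)."""
        acc = 0
        for coef in p:
            acc = acc * r + coef
        return acc

    k, c0 = 7, 3
    xk = [1] + [0] * k                       # x^7
    print(rem_by_linear(xk, c0), c0 ** k)    # both 2187
    ```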

»
10 months ago, # |

At the end of the proof, shouldn't $$$L_i = i - m + L_m$$$ be $$$L_i = i - m + L_{m - 1}$$$?

We are using the previous version of the linear recurrence relation there, right?

  • »
    »
    10 months ago, # ^ |

    It's been a while since I've looked at this proof and I've forgotten a lot of the fine details, but I think you're right: I wrote earlier that $$$L_{m-1} = m + 1 - L_{i-1}$$$, but then substituted that in for $$$L_m$$$ instead of $$$L_{m-1}$$$. I'll make the change. Indexing in this proof is a pain D:

»
10 months ago, # |

btw, to prove that a linear recurrence for $$$A^m$$$ implies a linear recurrence for a specific entry of $$$dp[m]$$$ (say we want $$$dp[m][i]$$$), maybe we can just manipulate the Cayley–Hamilton theorem a bit, like this?

($$$A$$$ is an $$$n \times n$$$ matrix, and $$$b_i$$$ is an $$$n \times 1$$$ vector whose entries are all $$$0$$$ except the $$$i$$$-th row, which is $$$1$$$)

$$$A^{m} = \sum_{j = 1}^{n}c_j A^{m - j}$$$

$$$A^{m}dp[0] = \sum_{j = 1}^{n}c_j A^{m - j}dp[0] = dp[m]$$$

$$$b_i^TA^{m}dp[0] = \sum_{j = 1}^{n}c_j b_i^TA^{m - j}dp[0] = b^T_idp[m] = dp[m][i]$$$

$$$\sum_{j = 1}^{n}c_j dp[m - j][i]= dp[m][i]$$$
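A small numeric check of this conclusion (an assumed example: the $$$2 \times 2$$$ Fibonacci transition matrix, where Cayley–Hamilton gives $$$A^2 = \operatorname{tr}(A)\,A - \det(A)\,I$$$, so $$$c_1 = \operatorname{tr}(A)$$$ and $$$c_2 = -\det(A)$$$):

```python
# Every entry of dp[m] = A^m dp[0] should satisfy
# dp[m][i] = c1*dp[m-1][i] + c2*dp[m-2][i].
A = [[1, 1],
     [1, 0]]                   # Fibonacci transition matrix
dp = [[1, 0]]                  # dp[0]
for _ in range(10):
    prev = dp[-1]
    dp.append([A[0][0] * prev[0] + A[0][1] * prev[1],
               A[1][0] * prev[0] + A[1][1] * prev[1]])

c1 = A[0][0] + A[1][1]                       # trace
c2 = -(A[0][0] * A[1][1] - A[0][1] * A[1][0])  # -det
for m in range(2, 11):
    for i in range(2):
        assert dp[m][i] == c1 * dp[m - 1][i] + c2 * dp[m - 2][i]
print("recurrence holds for every entry of dp")
```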
  • »
    »
    10 months ago, # ^ |

    Hi, I saw your other blog earlier containing an explanation for how we get a linear recurrence of the $$$dp$$$ from a linear recurrence on the transition matrix, and I just realized you commented the same explanation under this blog.

    Anyways, the math seems to check out to me. Not sure how I didn't think of just doing more algebra on top of the result from Cayley-Hamilton when I wrote this article a while back. Do you mind if I link to your comment in the article for future reference?