hly1204's blog

By hly1204, history, 21 month(s) ago, In English

A simple way to understand the transposed algo of multi-eval

A simple way to understand the transposed algorithm of multi-point evaluation of polynomials in Tellegen’s Principle into Practice.

I learned this from Bernstein's paper Scaled Remainder Trees. If there are any mistakes, please tell me, thanks! (Sorry for my poor English)

Let $$$\mathbb{R}((x))$$$ denote the ring of formal Laurent series, $$$\mathbb{R}\lbrack x\rbrack$$$ denote the ring of polynomial. $$$f(x)\bmod 1:=f_{-1}x^{-1}+f_{-2}x^{-2}+\cdots$$$.

Bernstein showed that division in $$$\mathbb{R}\lbrack x\rbrack$$$ is division in $$$\mathbb{R}((x^{-1}))$$$. Here is a concrete example. Let $$$A(x):=1+x+4x^2+5x^3+x^4+4x^5\in\mathbb{R}\lbrack x\rbrack$$$ and $$$B(x):=1+9x+8x^2+x^3\in\mathbb{R}\lbrack x\rbrack$$$, we want to compute $$$Q(x),R(x)\in\mathbb{R}\lbrack x\rbrack$$$ such that $$$A(x)=B(x)Q(x)+R(x)$$$ and $$$\deg R(x)\lt \deg B(x)$$$ (with $$$\deg 0=-\infty$$$). Let's do this in $$$\mathbb{R}((x^{-1}))$$$. First, we should find the multiplicative inverse of $$$B(x)$$$ which is

$$$ B^{-1}(x)=x^{-3}-8x^{-4}+55x^{-5}-369x^{-6}+2465x^{-7}-16454x^{-8}+109816x^{-9}-\cdots $$$

and compute

$$$ A(x)B^{-1}(x)=4x^2-31x+217-1457x^{-1}+9735x^{-2}-64983x^{-3}+433706x^{-4}-\cdots $$$

We could find that $$$Q(x)$$$ is exactly $$$A(x)B^{-1}(x)-(A(x)B^{-1}(x)\bmod 1)$$$, and the leading coefficient of $$$R(x)$$$ equals $$$[x^{-1}]A(x)B^{-1}(x)$$$. That's not a coincidence. Consider the simplest case

$$$ (x-v)^{-1}=x^{-1}+vx^{-2}+v^2x^{-3}+\cdots=\sum_{n\geq 0}v^nx^{-(n+1)} $$$

So we have $$$f(v)=[x^{-1}]f/(x-v)$$$ cause

$$$ \begin{aligned} f(x)/(x-v)&=f_0x^{-1}+f_1+f_2x+\cdots +f_nx^{n-1}\\ &+f_0vx^{-2}+f_1vx^{-1}+f_2v+\cdots +f_nvx^{n-2}\\ &+\cdots \end{aligned} $$$

The problem of multi-point evaluation is as follows: Given polynomial $$$f(x)=\sum_{i=0}^nf_ix^i$$$ and $$$v_0,v_1,\dots ,v_n$$$, print $$$f(v_0),f(v_1),\dots ,f(v_n)$$$.

The traditional algorithm for multi-point evaluation is that we make an almost balanced binary (sub)product tree with leaves are polynomials $$$p_i(x)=x-v_i$$$ and non-leaf nodes are the product of its children. Then we compute $$$f\bmod (p_1\cdots p_n)$$$ and then compute $$$f\bmod (p_1\cdots p_{n/2})$$$ and $$$f\bmod (p_{n/2+1}\cdots p_n)$$$ cause $$$(f\bmod (p_1\cdots p_n))\bmod (p_1\cdots p_{n/2})=f\bmod (p_1\cdots p_{n/2})$$$ (suppose $$$n$$$ is even) and then ... The transposed algorithm is that we compute $$$f/(p_1\cdots p_n)$$$ and $$$f/(p_1\cdots p_n)\cdot (p_{n/2+1}\cdots p_{n})=f/(p_1\cdots p_{n/2})$$$, finally we will have $$$f/p_i$$$ at the leaf of the tree, the coefficient of $$$x^{-1}$$$ is exactly what we want. The only remaining problem is how many terms should we track when doing the computation. It's clear that we don't care about the coefficients of $$$x^0,x^1,\dots$$$ in $$$f/(p_1\cdots p_n)$$$ cause they do nothing with the coefficient of $$$x^{-1},x^{-2},\dots$$$ when $$$f/(p_1\cdots p_n)$$$ was multiplied by a polynomial like $$$g_0+g_1x+g_2x^2+\cdots$$$. Another observation is that for $$$f/P\bmod 1=a_{-1}x^{-1}+a_{-2}x^{-2}+\cdots +a_{-\deg P}x^{\deg P}+\cdots$$$, only $$$a_{-1},\dots ,a_{-\deg P}$$$ are enough. Consider a very simple case that we have $$$f/(p_1p_2)\bmod 1=c_{-1}x^{-1}+c_{-2}x^{-2}+c_{-3}x^{-3}+\cdots$$$ and want to compute $$$f/p_1\bmod 1$$$ via multiplying $$$f/(p_1p_2)\bmod 1$$$ by $$$p_2$$$, the coefficients of $$$x^{-3},x^{-4},\dots$$$ have no effect on the coefficient of $$$x^{-1}$$$ (it's more complicated for the integer case). So we could almost halve the size of problem after one multiplication.

About how to compute $$$(p_1\cdots p_n)^{-1}$$$ in $$$\mathbb{R}((x^{-1}))$$$, we could simply use the method that we compute the multiplicative inverse of formal power series $$$x^{\deg(p_1\cdots p_n)}p_1(x^{-1})\cdots p_n(x^{-1})$$$ in $$$\mathbb{R}\lbrack \lbrack x\rbrack \rbrack$$$.

  • Vote: I like it
  • +60
  • Vote: I do not like it

| Write comment?
21 month(s) ago, # |
  Vote: I like it +16 Vote: I do not like it

This seems like a nice (and obvious-in-hindsight) optimization over the more widely known mod-only recursion. Using a higher level of precision for the initial multiplication by $$$\left(\prod p_i\right)^{-1}$$$ seems like a very small price to pay for making every later step in the algorithm use a multiplication instead of a mod.

But I don't understand in what sense it is transposed: It calculates the same things in the same order as that more widely known algorithm (albeit using a different representation), whereas transposing an algorithm (as I understand it) involves calculating related quantities but in exactly the opposite order. Am I missing something here?