### grhkm's blog

By grhkm, history, 4 weeks ago,

# Lagrange Interpolation by grhkm

No polynomial stuff because I don't know $O(n\log{n})$ polynomial multiplication :(

## Main results:

Given $f(x_1)=y_1, f(x_2)=y_2, ..., f(x_n)=y_n$ we can calculate $f(X)$ where $f$ is the unique $(n-1)$-degree polynomial

For general {$x_i$} we can do so in $O(n^2)$

For consecutive {$x_i$} i.e. $x_j=x_1+(j-1)$, we can do so in $O(n)$ excluding binary exponentiation

## Why useful?

Calculate $1^n+2^n+...+X^n$ in $O(n)$

DP optimization

With polynomials you can do cool things but not gonna cover this blog

Let's get started :)

Assume all polynomials below are $(n-1)$ degree unless specified.

disclaimer
Why (n-1) degree?

## Idea:

To start let's deal with first problem:

Given $f(x_1)=y_1, f(x_2)=y_2, ..., f(x_n)=y_n$ and an integer $X$ we can calculate $f(X)$ where $f$ is the unique $(n-1)$-degree polynomial

Let's consider an easier version:

Find polynomial satisfying $f(x_1)=y_i, f(x_2)=0, ..., f(x_n)=0$

As we learn in school, $f(r)=0 \implies (x-r)$ is divisor

We can write this down:

$f(x)=(x-x_2)(x-x_3)\cdots (x-x_n)$

Now notice that $f(x_1)=\prod_{i=2}^n (x_1-x_i)$ (a constant), but we want it to be $y_1$

We can simply scale the polynomial by a correct factor i.e.

$f_1(x)= \prod_{i=2}^n (x-x_i) \cdot \frac{y_1}{\prod_{i=2}^n (x_1-x_i)}=y_1\prod_{i=2}^n \frac{x-x_i}{x_1-x_i}$

Similarly, we can find a similar polynomial such that $f_i(x_i)=y_i$ and $f_i(x_j)=0 \forall j \neq i$

$f_i(x) = y_i \prod_{j=1,j\neq i}^n \frac{x-x_i}{x_i-x_i}$

Okay now finally to answer the original question, we can notice that $f(x)=\sum_{i=1}^n f_i(x)$ satisfies the constraints. So there we have it! Lagrange interpolation:

$f(x)=\sum_{i=1}^n y_i \prod_{j=1,j\neq i}^n \frac{x-x_j}{x_i-x_j}$

$O(n^2)$ excluding inverse for division

## Optimization:

Let's try to optimise it now! Assume that we're given $f(1), f(2), \cdots, f(n)$, how can we calculate $f(X)$ quickly?

Notice that here, $x_i=i$ and $y_i=f(i)$. Substitute!

$f(x) = \sum_{i=1}^n f(i) \prod_{j=1,j\neq i}^n \frac{x-j}{i-j} = \sum_{i=1}^n f(i) \frac{\prod_{j=1,j\neq i}^n x-j}{\prod_{j=1,j\neq i}^n i-j}$

Let's consider the numerator and denominator separately

### Numerator

$\prod_{j=1,j\neq i}^n x-j = [(x-1)(x-2)\cdots (x-(i-1))] \cdot [(x-(i+1))(x-(i+2))\cdots (x-n)]$

We can pre-calculate prefix and suffix product of x, then we can calculate the numerator (for each $i$) in $O(1)$

### Denominator

$\prod_{j=1,j\neq i}^n i-j = [(i-1)(i-2)(i-3)\cdots(i-(i-1))] \cdot [(i-(i+1))(i-(i+2))\cdots (i-n)]$
$=(i-1)! \cdot (-1)^{n-(i+1)+1} \cdot (n-i)!$
$=(-1)^{n-i} (n-i)! (i-1)!$

So we can precompute factorials (preferably their inverse directly) then $O(1)$ calculation

So now we can calculate $f(X)$ in $O(n)$

### Example 0

Find_ $\sum_{i=1}^N i^k$, $k\leq 1e6, N\leq 1e12$

Solution: Notice that the required sum will be a degree $(k+1)$ polynomial, and thus we can interpolate the answer with $(k+2)$ data points (Remember, we need $deg(f)+1$ points)

To find the data points simply calculate $f(0)=0, f(x)=f(x-1)+x^k$

Code can be found below

### Example 1

(BZOJ 3453, judge dead): Find_ $\sum_{i=0}^n \sum_{j=1}^{a+id} \sum_{k=1}^j k^x \mod 1234567891, x \leq 123, a, n, d \leq 123456789$

Hint

(Luogu P4463 submit here): Given $n \leq 500, k \leq 1e9$, we call a sequence ${a_n}$ nice if $1\leq a_i \leq k \forall i$ and $(a_i\neq a_j) \forall i\neq j$. Find the sum of (products of elements in the sequence) over all nice sequences. MY code below

Hint
Hint2
Hint3
Hint4

Problem https://www.codechef.com/problems/XETF : Find $\sum_{i=1,gcd(i,N)=1}^N i^k, N \leq 1e12, k \leq 256$. My code below

Hint

If you have any questions feel free to ask below

## Practice question:

CF 622F (Example 0)

Luogu P4463 (Example 2)

Codechef XETF (Example 3)

Not in order of difficulty:

SPOJ ASUMEXTR

SPOJ ASUMHARD which is harder than EXTREME

Codechef COUNTIT

Atcoder ARC33D (Japanese)

Atcoder S8PC3G

• Unfortunately you can't get full score, but you can get up to subtask 4

• Still not trivial, do first 3 subtasks would give insight

• For full solution please wait for my next blog (next year)

HDU 4059

SPOJ SOMESUMS very interesting

PE 487 Direct application

• +229

 » 4 weeks ago, # | ← Rev. 4 →   +38 Preview for next post that's coming next year or something:Interpolate MULTIPLE values in O(nlog^2(n))and moreAlso if you know any more problems using this technique please tell me I'll put it in and credit you too thanks EDIT 1: sorry if the 'and more' sounds depressing I don't have enough energy :(EDIT 2: The O(n) optimization can also be done if the data points given are $f(a),f(a+1),\cdots,f(a+k-1)$ instead. Left as exercise for the reader :)EDIT 3: After staring at the ceiling for a few seconds I believe that $O(n)$ is achievable for all arithmetic sequence input, as you can factor out the common difference and multiply/divide by the power of it directly. Left as an exercise for the reader :) I'll implement when I'm free
•  » » 4 weeks ago, # ^ |   +8 UPDATE: I believe for arithmetic sequence input, you can simply apply transformation g(x)=f(a+dx), then the input will be g(0), g(1), ..., g(k-1), which you can use the techniques in the blog to do. You can also directly find $f(x)=g(\frac{x-ad}{d})$ since you don't need to find the polynomial
•  » » » 4 weeks ago, # ^ |   0 This reminded me of a problem I set long ago where the input is an arithmetic sequence: https://www.spoj.com/problems/BTCODE_E/
 » 4 weeks ago, # |   +9 Great blog!I learnt about this for the first time when I couldn't solve 622F.This problem is exactly what is taught in this blog.
•  » » 4 weeks ago, # ^ | ← Rev. 2 →   +6 2600hmmWell I guess a lot of people will solve their first 2600 today :D Great that this blog helps
 » 4 weeks ago, # |   +13 Very Very Nice Blog.. Never thought of this much simple explanation of lagrange interpolation .Thanks a lot .. I got this thing for the first time.
•  » » 4 weeks ago, # ^ |   +8 Glad that this is helpful(Tbh the explanation given is standard but I hope I explained it more detailedly and with clean code too :))
•  » » » 4 weeks ago, # ^ |   +8 Yaa I know .. Even it seemed like very basic school maths to me , But the point is, u explained it using only basic terms without much technical terms .. and tbh i have always skipped this because of its name.
•  » » » » 4 weeks ago, # ^ | ← Rev. 2 →   +8 Glad that it helped :)
 » 4 weeks ago, # |   +8 Nice blog! but I have a question, on the idea part, shouldn't the degree of $f$ be $n - 1$? since you only had $n$ points at the start (from $(x_1,y_1),(x_2,y_2),...(x_n,y_n)$)
•  » » 4 weeks ago, # ^ |   +11 Oops, you're correct. n points -> (n-1) degree, n degree -> (n + 1) points required
 » 4 weeks ago, # |   0 Just a note on spoj ASUMHARD. There are other methods with higher complexity that run faster than Lagrange interpolation for that particular problem (since k is small and number of test cases is large). For e.g. using a Newton divided difference table (O(k^2) pre-computation) with O(k) per test case runs super fast. See difference in run times here: https://www.spoj.com/status/ASUMHARD,suh_ash2008/ (only ID:26436058 uses Lagrange interpolation).
•  » » 4 weeks ago, # ^ |   0 Interesting. Tbf I haven't Read them carefully so you're probably correct.
•  » » » 4 weeks ago, # ^ | ← Rev. 3 →   0 The complexity of the solution using Lagrange interpolation is actually O(k*log k) and not O(k) since the computation of f(0), f(1), f(2), ..., f(k) involves modular exponentiation. Even for the Lagrange interpolation, I've seen people do a modulo inverse (to divide by (x-i) instead of prefix and suffix products, which makes it O(k*log k). For ASUMHARD, let g[i][j] = i^j. We can write this as g[i][j] = g[i][j-1] * i and since k is small (321), this table can be pre-computed easily in O(k^2) and each test case can be answered in O(k) instead of O(k*log k). This seems to make a huge difference due to the large number of test cases. There are other methods (for e.g. http://oeis.org/A080779) which involve pre-computing the polynomial coefficients in O(k^2) or O(k^3) and answer each test case in O(k). :)
•  » » » » 3 weeks ago, # ^ |   0 Shouldn't the k*log k be k*log p instead, if I understand properly? It is from exponentiation right?If so, I wrote a blog that explains how to optimise to O(k+log p) lol feel free to check it outOr maybe I'm misunderstanding what you meanThank you for the information though :)
•  » » » » » 3 weeks ago, # ^ | ← Rev. 2 →   +1 No, I did mean O(k*log k) due to modular exponentiation (and not due to inverse computation). In your Example 0, you mention "to find the data points, simply calculate f(0) = 0, f(x) = f(x-1) + x^k". This involves computation of x^k for x = 0 to k, each of which is O(log k). Hence, total complexity of computing f(0), f(1), ..., f(k) is O(k*log k) and not O(k).
 » 4 weeks ago, # |   0 One more nice problem to try out if anyone wants to play around with their Lagrange interpolation code and try to optimise it: https://www.spoj.com/problems/PWSUMC/