Tutorial on FFT/NTT — The tough made simple. ( Part 1 )

Aim: To multiply two polynomials of degree n in $O(n \log n)$ instead of the trivial O(n²).

I have poked around a lot of resources to understand the FFT (Fast Fourier Transform), but the math behind it would intimidate me and I would never really try to learn it. Finally, last week I learned it from some PDFs and CLRS by building up an intuition of what is actually happening in the algorithm. With this article I intend to clarify the concept for myself and bring everything I read into one article that is simple to understand and helps others struggling with the FFT.

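Let’s get started:
$A(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_{n-1} x^{n-1}$, $B(x) = b_0 + b_1 x + b_2 x^2 + \dots + b_{n-1} x^{n-1}$, $C(x) = A(x) \cdot B(x)$
Here A(x) and B(x) are polynomials of degree n - 1. Now we want to retrieve C(x) in $O(n \log n)$.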

So our methodology would be this:

1. Convert A(x) and B(x) from coefficient form to point value form. (FFT)
2. Multiply the two point value forms pointwise in O(n) to obtain C(x) in point value form, i.e. C(x_k) = A(x_k) * B(x_k) at every sample point x_k. This pointwise product is what the convolution of the coefficients turns into.
3. Convert C(x) from point value form to coefficient form. (Inverse FFT)

Q) What is point value form?
Ans) A polynomial A(x) of degree n - 1 can be represented in point value form as a set of n pairs: $A(x) = \{(x_0, y_0), (x_1, y_1), (x_2, y_2), \dots, (x_{n-1}, y_{n-1})\}$, where $y_k = A(x_k)$ and all the $x_k$ are distinct.
So the first element of each pair is the value of x at which we evaluated the polynomial, and the second element is the value that was computed, i.e. $A(x_k)$.
The two forms are related as follows: a point value form with at least k + 1 pairs determines exactly one polynomial of degree at most k in coefficient form. The converse does not hold, since the same polynomial has many point value forms, one for every choice of sample points.
The reason is simple: with n pairs there are n unknowns $a_0, a_1, \dots, a_{n-1}$ and n equations $y_0 = A(x_0), y_1 = A(x_1), \dots, y_{n-1} = A(x_{n-1})$, and because the $x_k$ are distinct this system has exactly one solution.
Now, using matrix multiplication, the conversion from coefficient form to point value form for the polynomial $A(x) = a_0 + a_1 x + \dots + a_{n-1} x^{n-1}$ can be shown like this:

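$$\begin{pmatrix} 1 & x_0 & x_0^2 & \cdots & x_0^{n-1} \\ 1 & x_1 & x_1^2 & \cdots & x_1^{n-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n-1} & x_{n-1}^2 & \cdots & x_{n-1}^{n-1} \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_{n-1} \end{pmatrix} = \begin{pmatrix} y_0 \\ y_1 \\ \vdots \\ y_{n-1} \end{pmatrix} \qquad (1)$$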

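And the inverse, that is, the conversion from point value form to coefficient form for the same polynomial, can be shown like this:

$$\begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_{n-1} \end{pmatrix} = \begin{pmatrix} 1 & x_0 & x_0^2 & \cdots & x_0^{n-1} \\ 1 & x_1 & x_1^2 & \cdots & x_1^{n-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n-1} & x_{n-1}^2 & \cdots & x_{n-1}^{n-1} \end{pmatrix}^{-1} \begin{pmatrix} y_0 \\ y_1 \\ \vdots \\ y_{n-1} \end{pmatrix} \qquad (2)$$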

Now, let's assume A(x) = x² + x + 1 = {(1, 3), (2, 7), (3, 13)} and B(x) = x² - 3 = {(1, -2), (2, 1), (3, 6)}, where A(x) and B(x) both have degree 2.
Then C(x) = A(x) * B(x) = x⁴ + x³ - 2x² - 3x - 3, and
C(1) = A(1) * B(1) = 3 * (-2) = -6, C(2) = A(2) * B(2) = 7 * 1 = 7, C(3) = A(3) * B(3) = 13 * 6 = 78

So C(x) = {(1, -6), (2, 7), (3, 78)}, where degree of C(x) = degree of A(x) + degree of B(x) = 4.
But we know that a polynomial of degree n - 1 requires n point value pairs, so 3 pairs are not sufficient to determine C(x) uniquely: it is a polynomial of degree 4 and needs 5 pairs.
Therefore we need to evaluate A(x) and B(x) at 2n points instead of n points, so that the point value form of C(x) contains 2n pairs. That is enough to determine C(x) uniquely, since its degree is 2(n - 1) and 2n - 1 pairs would already suffice.
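To make the example above concrete, here is a small Python sketch (the helper name `evaluate` and the choice of sample points 1 to 5 are my own assumptions, not part of the original article). It evaluates A(x) and B(x) at 5 points, multiplies the values pointwise, and compares them with C(x) evaluated directly.

```python
def evaluate(coeffs, x):
    """Evaluate a polynomial (coefficients listed lowest degree first) with Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

A = [1, 1, 1]            # x^2 + x + 1
B = [-3, 0, 1]           # x^2 - 3
C = [-3, -3, -2, 1, 1]   # x^4 + x^3 - 2x^2 - 3x - 3

# C has degree 4, so 5 distinct points are needed to pin it down.
points = [1, 2, 3, 4, 5]
a_vals = [evaluate(A, x) for x in points]
b_vals = [evaluate(B, x) for x in points]

# Pointwise product of the two point value forms.
c_vals = [a * b for a, b in zip(a_vals, b_vals)]

print(list(zip(points, c_vals)))
print(c_vals == [evaluate(C, x) for x in points])
```

The first three pairs are the ones computed by hand in the text; the last two are the extra pairs that make the point value form of C(x) unambiguous.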

Now if we had performed this algorithm naively, it would have gone like this:

Note: This is NOT the actual FFT algorithm, but understanding it lays out the framework for the real thing.
Note: This is essentially the naive way of computing a DFT (Discrete Fourier Transform), except that the DFT proper evaluates at the roots of unity rather than at arbitrary points.

We construct the point value forms of A(x) and B(x) at the points $x_0, x_1, \dots, x_{2n-1}$, which can be any distinct values (for example, random distinct integers). So the point value form of A(x) is $\{(x_0, \alpha_0), (x_1, \alpha_1), (x_2, \alpha_2), \dots, (x_{2n-1}, \alpha_{2n-1})\}$ and that of B(x) is $\{(x_0, \beta_0), (x_1, \beta_1), (x_2, \beta_2), \dots, (x_{2n-1}, \beta_{2n-1})\}$. This is the coefficient-to-point-value matrix conversion, equation (1).
Note: The points $x_0, x_1, \dots, x_{2n-1}$ must be the same for A(x) and B(x). This conversion takes O(n²).
As C(x) = A(x) * B(x), what is the point value form of C(x)?
If we plug $x_0$ into all three polynomials, we see that:
$C(x_0) = A(x_0) \cdot B(x_0)$
$C(x_0) = \alpha_0 \cdot \beta_0$
So C(x) in point value form is $C(x) = \{(x_0, \alpha_0 \beta_0), (x_1, \alpha_1 \beta_1), (x_2, \alpha_2 \beta_2), \dots, (x_{2n-1}, \alpha_{2n-1} \beta_{2n-1})\}$
This pointwise multiplication plays the role of the convolution, and its time complexity is O(n).
Now, converting C(x) back from point value form to coefficient form can be represented using equation (2), the inverse matrix equation. Calculating the inverse of the matrix requires LU decomposition or Lagrange’s formula. I won’t go into depth on how to do the inverse, as it won’t be required in the REAL FFT. But note that with Lagrange’s formula this step can be done in O(n²).
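The three naive steps can be sketched in Python as follows. This is only an illustration under my own assumptions: the function names are hypothetical, the sample points are simply 0, 1, ..., 2n - 1, and exact `Fraction` arithmetic is used to avoid rounding. For brevity the interpolation rebuilds every Lagrange basis polynomial from scratch, which costs O(n³) here; a more careful implementation of Lagrange's formula reaches the O(n²) mentioned above.

```python
from fractions import Fraction

def evaluate(coeffs, x):
    """Horner evaluation, coefficients listed lowest degree first."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def times_linear(poly, root):
    """Return poly(x) * (x - root)."""
    result = [Fraction(0)] * (len(poly) + 1)
    for i, c in enumerate(poly):
        result[i + 1] += c
        result[i] -= c * root
    return result

def lagrange_interpolate(xs, ys):
    """Coefficients of the unique polynomial of degree < len(xs) through the points."""
    n = len(xs)
    coeffs = [Fraction(0)] * n
    for i in range(n):
        basis, denom = [Fraction(1)], Fraction(1)
        for j in range(n):
            if j != i:
                basis = times_linear(basis, xs[j])
                denom *= xs[i] - xs[j]
        scale = Fraction(ys[i]) / denom
        for d in range(n):
            coeffs[d] += basis[d] * scale
    return coeffs

def naive_multiply(a, b):
    n = max(len(a), len(b))
    xs = list(range(2 * n))                        # 2n distinct sample points
    ya = [evaluate(a, x) for x in xs]              # step 1: point value forms
    yb = [evaluate(b, x) for x in xs]
    yc = [p * q for p, q in zip(ya, yb)]           # step 2: pointwise product, O(n)
    return [int(c) for c in lagrange_interpolate(xs, yc)]  # step 3: back to coefficients

print(naive_multiply([1, 1, 1], [-3, 0, 1]))
```

The result has 2n entries; the trailing zero reflects that 2n points were used although 2n - 1 coefficients are enough.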

Note: Here the algorithm used ordinary real numbers for $x_0, x_1, \dots, x_{2n-1}$. The FFT, on the other hand, uses roots of unity, and this lets us optimize the O(n²) conversions from coefficient form to point value form and back to $O(n \log n)$, because the special mathematical properties of roots of unity allow a divide and conquer approach. I would recommend stopping here and re-reading the article up to this point until the algorithm is crystal clear, as this is the raw concept of the FFT.

A math primer on complex numbers and roots of unity would be a must now.

Q) What is a complex number?
Answer: Quoting Wikipedia, “A complex number is a number that can be expressed in the form a + bi, where a and b are real numbers and i is the imaginary unit, that satisfies the equation i² = -1. In this expression, a is the real part and b is the imaginary part of the complex number.” The modulus of a complex number is the magnitude of the vector from the origin (0, 0) to (a, b), therefore $|z| = \sqrt{a^2 + b^2}$ where z = a + bi. The argument arg(z) is the angle this vector makes with the positive real axis.

Q) What are the roots of unity?
Answer: An nth root of unity, where n is a positive integer (i.e. n = 1, 2, 3, ...), is a number z satisfying the equation zⁿ = 1.
[Figure: the nth roots of unity on the unit circle for n = 2, n = 3 and n = 4, from left to right.]
Intuitively, we can see that the nth roots of unity lie on the circle of radius 1 (as their modulus is equal to 1) and that they are symmetrically placed, i.e. they are the vertices of a regular n-sided polygon.

The n complex nth roots of unity can be represented as $e^{2\pi i k / n}$ for k = 0, 1, ..., n - 1.
Also, by Euler's formula, $e^{2\pi i k / n} = \cos(2\pi k / n) + i \sin(2\pi k / n)$. If you look at the roots of unity on a circle graphically, this is quite intuitive.

If n = 4, then the 4 roots of unity are $e^{2\pi i \cdot 0 / n}, e^{2\pi i \cdot 1 / n}, e^{2\pi i \cdot 2 / n}, e^{2\pi i \cdot 3 / n} = (e^{2\pi i / n})^0, (e^{2\pi i / n})^1, (e^{2\pi i / n})^2, (e^{2\pi i / n})^3$, where n should be substituted by 4.
Now we notice that all the roots are actually powers of $e^{2\pi i / n}$. So we can represent the n complex nth roots of unity by $w_n^0, w_n^1, w_n^2, \dots, w_n^{n-1}$, where $w_n = e^{2\pi i / n}$.
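As a quick numerical illustration, the roots can be generated with Python's standard `cmath` module. The function name `roots_of_unity` is my own choice for this sketch, and the values are floating point, so they are only accurate up to rounding.

```python
import cmath

def roots_of_unity(n):
    """Return [w_n^0, w_n^1, ..., w_n^(n-1)] with w_n = e^(2*pi*i/n)."""
    return [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]

roots = roots_of_unity(4)
for k, z in enumerate(roots):
    # Every root has modulus 1, and its argument is 2*pi*k/n (reported in (-pi, pi]).
    print(k, z, abs(z), cmath.phase(z))
```

For n = 4 the printed values are, up to rounding, 1, i, -1 and -i, which are the corners of a square inscribed in the unit circle.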

Now let us prove some lemmas before proceeding further:

Note: Please try to prove these lemmas yourself before you look at the solution :)

Lemma 1: For any integers n > 0, k ≥ 0 and d > 0, $w_{dn}^{dk} = w_n^k$

Proof: $w_{dn}^{dk} = (e^{2\pi i / (dn)})^{dk} = (e^{2\pi i / n})^k = w_n^k$

Lemma 2: For any even integer n > 0, $w_n^{n/2} = w_2 = -1$

Proof: $w_n^{n/2} = w_{2 \cdot (n/2)}^{n/2} = w_{d \cdot 2}^{d \cdot 1}$ where d = n / 2

$w_{d \cdot 2}^{d \cdot 1} = w_2^1$ (using Lemma 1)

$w_2^1 = e^{i\pi} = \cos(\pi) + i \sin(\pi) = -1 + 0 = -1$

Lemma 3: If n > 0 is even, then the squares of the n complex nth roots of unity are the (n/2) complex (n/2)th roots of unity, formally $(w_n^k)^2 = (w_n^{k + n/2})^2 = w_{n/2}^k$

Proof: By Lemma 1 we have $(w_n^k)^2 = w_{2 \cdot (n/2)}^{2k} = w_{n/2}^k$ for any non-negative integer k. Note that if we square all the complex nth roots of unity, then we obtain each (n/2)th root of unity exactly twice, since

$(w_n^k)^2 = w_{n/2}^k$ (proved above)

Also, $(w_n^{k + n/2})^2 = w_n^{2k + n} = e^{2\pi i k' / n}$, where k' = 2k + n

$e^{2\pi i k' / n} = e^{2\pi i (2k + n) / n} = e^{2\pi i (2k/n + 1)} = e^{(2\pi i \cdot 2k / n) + 2\pi i} = e^{2\pi i \cdot 2k / n} \cdot e^{2\pi i} = w_n^{2k} \cdot (\cos(2\pi) + i \sin(2\pi)) = w_n^{2k}$

$w_n^{2k} = (w_n^k)^2 = w_{n/2}^k$ (proved above)

Therefore, $(w_n^k)^2 = (w_n^{k + n/2})^2 = w_{n/2}^k$

Lemma 4: For any even integer n > 0 and any integer k ≥ 0, $w_n^{k + n/2} = -w_n^k$

Proof: $w_n^{k + n/2} = e^{2\pi i (k + n/2) / n} = e^{2\pi i (k/n + 1/2)} = e^{(2\pi i k / n) + \pi i} = e^{2\pi i k / n} \cdot e^{\pi i} = w_n^k \cdot (\cos(\pi) + i \sin(\pi)) = w_n^k \cdot (-1) = -w_n^k$
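If you would like to convince yourself numerically before trusting the algebra, the four lemmas can be spot-checked in Python. This is only a sanity check for one choice of n and d (both picked arbitrarily here), not a proof, and the helper names `w` and `close` are assumptions of this sketch.

```python
import cmath

def w(n, k):
    """w_n^k = e^(2*pi*i*k/n)."""
    return cmath.exp(2j * cmath.pi * k / n)

def close(a, b, eps=1e-9):
    """Floating point comparison with a small tolerance."""
    return abs(a - b) < eps

n, d = 8, 3

# Lemma 1: w_{dn}^{dk} = w_n^k
lemma1 = all(close(w(d * n, d * k), w(n, k)) for k in range(n))
# Lemma 2: w_n^{n/2} = -1
lemma2 = close(w(n, n // 2), -1)
# Lemma 3: (w_n^k)^2 = (w_n^{k+n/2})^2 = w_{n/2}^k
lemma3 = all(
    close(w(n, k) ** 2, w(n // 2, k)) and close(w(n, k + n // 2) ** 2, w(n // 2, k))
    for k in range(n // 2)
)
# Lemma 4: w_n^{k+n/2} = -w_n^k
lemma4 = all(close(w(n, k + n // 2), -w(n, k)) for k in range(n))

print(lemma1, lemma2, lemma3, lemma4)
```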

1. The FFT — Converting from coefficient form to point value form

Note: Let us assume that we have to multiply two polynomials of degree n - 1 (n coefficients each), where n is a power of 2. If n is not a power of 2, then make it a power of 2 by padding the polynomial's higher degree coefficients with zeroes.
Now we will see how A(x) is converted from coefficient form to point value form in $O(n \log n)$ using the special properties of the n complex nth roots of unity.

$y_k = A(x_k)$, where the evaluation points are $x_k = w_n^k$:
$y_k = A(w_n^k) = \sum_{j=0}^{n-1} a_j w_n^{kj}$

Let us define:

$A^{even}(x) = a_0 + a_2 x + a_4 x^2 + \dots + a_{n-2} x^{n/2 - 1}$, $A^{odd}(x) = a_1 + a_3 x + a_5 x^2 + \dots + a_{n-1} x^{n/2 - 1}$

Here, $A^{even}(x)$ contains all even-indexed coefficients of A(x) and $A^{odd}(x)$ contains all odd-indexed coefficients of A(x).

It follows that $A(x) = A^{even}(x^2) + x \cdot A^{odd}(x^2)$
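The even/odd identity is easy to check on a concrete polynomial. In this sketch the coefficient list and the test value x = 3 are arbitrary choices of mine, and `evaluate` is a hypothetical helper.

```python
def evaluate(coeffs, x):
    """Horner evaluation, coefficients listed lowest degree first."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

a = [5, 3, -2, 7, 1, 0, 4, 6]   # a_0 .. a_7, so n = 8
a_even = a[0::2]                # a_0, a_2, a_4, a_6
a_odd = a[1::2]                 # a_1, a_3, a_5, a_7

x = 3
lhs = evaluate(a, x)
rhs = evaluate(a_even, x * x) + x * evaluate(a_odd, x * x)
print(lhs, rhs, lhs == rhs)
```

Both halves have n/2 coefficients, which is what makes the recursion in the following steps possible.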

So now the problem of evaluating A(x) at the n complex nth roots of unity, i.e. at $w_n^0, w_n^1, \dots, w_n^{n-1}$, reduces to:

Evaluating the polynomials $A^{even}(x^2)$ and $A^{odd}(x^2)$, each of which has n/2 coefficients. A(x) is evaluated at the points $w_n^0, w_n^1, \dots, w_n^{n-1}$.

Therefore A(x²) would be evaluated at $(w_n^0)^2, (w_n^1)^2, \dots, (w_n^{n-1})^2$.

Extending this logic to $A^{even}(x^2)$ and $A^{odd}(x^2)$, we can say that $A^{even}$ and $A^{odd}$ only need to be evaluated at $(w_n^0)^2, (w_n^1)^2, \dots, (w_n^{n/2 - 1})^2 = w_{n/2}^0, w_{n/2}^1, \dots, w_{n/2}^{n/2 - 1}$, because by Lemma 3 the squares of the remaining roots repeat these values.

Here we can clearly see that evaluating $A^{even}$ and $A^{odd}$ at $w_{n/2}^0, w_{n/2}^1, \dots, w_{n/2}^{n/2 - 1}$ is recursively solving a problem of exactly the same form as the original one, i.e. evaluating A(x) at $w_n^0, w_n^1, \dots, w_n^{n-1}$. (The divide part of the divide and conquer algorithm.)
Combining these results using the equation $A(x) = A^{even}(x^2) + x \cdot A^{odd}(x^2)$. (The conquer part of the divide and conquer algorithm.)

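Now, if k < n / 2, then $A(w_n^k) = A^{even}(w_n^{2k}) + w_n^k \cdot A^{odd}(w_n^{2k}) = A^{even}(w_{n/2}^k) + w_n^k \cdot A^{odd}(w_{n/2}^k)$, which is quite straightforward.

And if k ≥ n / 2, then $A(w_n^k) = A^{even}(w_{n/2}^{k - n/2}) - w_n^{k - n/2} \cdot A^{odd}(w_{n/2}^{k - n/2})$

Proof: $A(w_n^k) = A^{even}(w_n^{2k}) + w_n^k \cdot A^{odd}(w_n^{2k}) = A^{even}(w_{n/2}^k) + w_n^k \cdot A^{odd}(w_{n/2}^k)$, using $(w_n^k)^2 = w_{n/2}^k$.

Since $w_{n/2}^{n/2} = 1$, we have $w_{n/2}^k = w_{n/2}^{k - n/2}$, and by Lemma 4 with k' = k - n / 2 we have $w_n^k = w_n^{k' + n/2} = -w_n^{k'}$. Hence $A(w_n^k) = A^{even}(w_{n/2}^{k - n/2}) - w_n^{k - n/2} \cdot A^{odd}(w_{n/2}^{k - n/2})$. The two cases differ only because the recursive calls return values for the indices 0 to n/2 - 1, so for k ≥ n / 2 we reuse the value at index k - n / 2 and flip the sign of the odd term.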

So the pseudocode (taken from CLRS) for the FFT would be like this:

RECURSIVE-FFT(a)
    n = a.length
    if n == 1 then return a            // base case
    w_n = e^{2πi / n}
    w = 1
    a^even = (a_0, a_2, ..., a_{n-2})
    a^odd = (a_1, a_3, ..., a_{n-1})
    y^even = RECURSIVE-FFT(a^even)
    y^odd = RECURSIVE-FFT(a^odd)
    for k = 0 to n / 2 - 1
        y_k = y_k^even + w * y_k^odd
        y_{k + n / 2} = y_k^even - w * y_k^odd
        w = w * w_n
    return y
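A direct Python transcription of this pseudocode might look like the sketch below. It assumes the input length is a power of 2, and the helper `naive_dft` is my own addition so that the result can be compared against the O(n²) definition $y_k = \sum_j a_j w_n^{kj}$.

```python
import cmath

def recursive_fft(a):
    """Evaluate the polynomial with coefficients a at the n-th roots of unity.

    Assumes len(a) is a power of 2.
    """
    n = len(a)
    if n == 1:
        return list(a)                      # base case
    w_n = cmath.exp(2j * cmath.pi / n)      # principal n-th root of unity
    w = 1
    y_even = recursive_fft(a[0::2])         # even-indexed coefficients
    y_odd = recursive_fft(a[1::2])          # odd-indexed coefficients
    y = [0] * n
    for k in range(n // 2):
        y[k] = y_even[k] + w * y_odd[k]
        y[k + n // 2] = y_even[k] - w * y_odd[k]
        w *= w_n
    return y

def naive_dft(a):
    """O(n^2) evaluation straight from the definition, used only for comparison."""
    n = len(a)
    return [
        sum(a[j] * cmath.exp(2j * cmath.pi * k * j / n) for j in range(n))
        for k in range(n)
    ]

coeffs = [1, 2, 3, 4, 5, 6, 7, 8]
fast = recursive_fft(coeffs)
slow = naive_dft(coeffs)
print(max(abs(p - q) for p, q in zip(fast, slow)))  # should be close to 0
```

The slicing creates new lists at every level, which is fine for understanding the recursion; practical implementations usually work in place and iteratively.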

2. The Multiplication OR Convolution

This is simply:
a = RECURSIVE-FFT(a), b = RECURSIVE-FFT(b)   // doing the FFT
for k = 0 to n - 1
    c(k) = a(k) * b(k)   // the convolution step, done pointwise in O(n)

3. The Inverse FFT

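Now we have to recover c(x) from point value form to coefficient form and we are done. Well, here I am back after like 8 months, sorry for the trouble. So the whole FFT process can be shown as the matrix equation:

$$\begin{pmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & w_n & w_n^2 & \cdots & w_n^{n-1} \\ 1 & w_n^2 & w_n^4 & \cdots & w_n^{2(n-1)} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & w_n^{n-1} & w_n^{2(n-1)} & \cdots & w_n^{(n-1)(n-1)} \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ a_2 \\ \vdots \\ a_{n-1} \end{pmatrix} = \begin{pmatrix} y_0 \\ y_1 \\ y_2 \\ \vdots \\ y_{n-1} \end{pmatrix}$$

The square matrix on the left is the Vandermonde matrix ($V_n$), where the (k, j) entry of $V_n$ is $w_n^{kj}$.
Now, for finding the inverse, we can write the above equation as:

$$\begin{pmatrix} a_0 \\ a_1 \\ a_2 \\ \vdots \\ a_{n-1} \end{pmatrix} = V_n^{-1} \begin{pmatrix} y_0 \\ y_1 \\ y_2 \\ \vdots \\ y_{n-1} \end{pmatrix}$$

Now if we can find $V_n^{-1}$ and figure out the symmetry in it, like in the case of the FFT where symmetry lets us solve the problem in $O(n \log n)$, then we can do the inverse FFT pretty much like the FFT. Given below are Lemma 5 and Lemma 6, where Lemma 6 shows what $V_n^{-1}$ is, using Lemma 5 as a result.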

Lemma 5: For n ≥ 1 and nonzero integer k not a multiple of n, $\sum_{j=0}^{n-1} (w_n^k)^j = 0$

Proof: $\sum_{j=0}^{n-1} (w_n^k)^j = \frac{(w_n^k)^n - 1}{w_n^k - 1}$, the sum of a G.P. of n terms.
$\frac{(w_n^k)^n - 1}{w_n^k - 1} = \frac{(w_n^n)^k - 1}{w_n^k - 1} = \frac{1^k - 1}{w_n^k - 1} = 0$

We required that k is not a multiple of n because $w_n^k = 1$ only when k is a multiple of n, so this constraint ensures that the denominator is not 0.
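Lemma 5 can also be checked numerically. In the sketch below, `geometric_sum` is a hypothetical helper name and n = 8 is an arbitrary choice; the last line shows why the condition on k matters, since for a multiple of n every term equals 1 and the sum is n instead of 0.

```python
import cmath

def geometric_sum(n, k):
    """Sum of (w_n^k)^j for j = 0 .. n-1, computed term by term."""
    w = cmath.exp(2j * cmath.pi * k / n)
    return sum(w ** j for j in range(n))

n = 8
for k in range(1, n):
    print(k, abs(geometric_sum(n, k)))   # all close to 0

print(abs(geometric_sum(n, n)))          # close to n, since w_n^n = 1
```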

Lemma 6: For j, k = 0, 1, ..., n - 1, the (j, k) entry of $V_n^{-1}$ is $\frac{w_n^{-kj}}{n}$

Proof: We show that $V_n^{-1} \cdot V_n = I_n$, the n * n identity matrix. Consider the (j, j') entry of $V_n^{-1} \cdot V_n$ and let it be denoted by $[V_n^{-1} \cdot V_n]_{jj'}$
So now $[V_n^{-1} \cdot V_n]_{jj'} = \sum_{k=0}^{n-1} \frac{w_n^{-kj}}{n} \cdot w_n^{kj'}$
$= \frac{1}{n} \sum_{k=0}^{n-1} w_n^{k(j' - j)}$

Now if j' = j then $w_n^{k(j' - j)} = w_n^0 = 1$, so the sum of n ones divided by n is 1; otherwise it is 0 in accordance with Lemma 5 given above. Note that the constraints for Lemma 5 are satisfied here: n ≥ 1, and j' - j cannot be a multiple of n, since j' ≠ j in this case and the maximum and minimum possible values of j' - j are (n - 1) and -(n - 1) respectively.

So now we have proven that the (j, k) entry of $V_n^{-1}$ is $\frac{w_n^{-kj}}{n}$.

Therefore, $a_j = \frac{1}{n} \sum_{k=0}^{n-1} y_k w_n^{-kj}$
The above equation is similar to the FFT equation $y_k = \sum_{j=0}^{n-1} a_j w_n^{kj}$

The only differences are that a and y are swapped, $w_n$ is replaced by $w_n^{-1}$, and each element of the result is divided by n.
Therefore, as rightly said by adamant, for the inverse FFT we use the conjugates of the roots instead of the roots (since $w_n^{-1}$ is the complex conjugate of $w_n$) and divide the results by n.
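Putting the three parts together, one possible Python sketch of the whole multiplication is shown below. The function names and the `invert` flag are my own choices for illustration, not something prescribed by CLRS; the inputs are assumed to be integer coefficient lists, so the final rounding to the nearest integer is safe only as long as floating point error stays well below 0.5.

```python
import cmath

def fft(a, invert=False):
    """Recursive FFT; with invert=True it uses w_n^(-1) instead of w_n (no scaling)."""
    n = len(a)
    if n == 1:
        return list(a)
    sign = -1 if invert else 1
    w_n = cmath.exp(sign * 2j * cmath.pi / n)
    w = 1
    y_even = fft(a[0::2], invert)
    y_odd = fft(a[1::2], invert)
    y = [0] * n
    for k in range(n // 2):
        y[k] = y_even[k] + w * y_odd[k]
        y[k + n // 2] = y_even[k] - w * y_odd[k]
        w *= w_n
    return y

def inverse_fft(y):
    """Same recursion with conjugate roots, then every entry divided by n."""
    n = len(y)
    return [v / n for v in fft(y, invert=True)]

def multiply(a, b):
    """Multiply two integer polynomials given as coefficient lists (lowest degree first)."""
    result_len = len(a) + len(b) - 1
    size = 1
    while size < result_len:                  # pad to a power of 2 that fits the product
        size *= 2
    fa = fft(list(a) + [0] * (size - len(a)))
    fb = fft(list(b) + [0] * (size - len(b)))
    fc = [p * q for p, q in zip(fa, fb)]      # pointwise product in O(n)
    c = inverse_fft(fc)
    return [round(v.real) for v in c[:result_len]]

print(multiply([1, 1, 1], [-3, 0, 1]))
```

Note that the division by n happens once at the top level in `inverse_fft`, not inside the recursion. The printed list is the coefficient form of the product of the example polynomials x² + x + 1 and x² - 3 used earlier in the article.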

That is it, folks. The inverse FFT might seem a bit hazy in terms of its implementation, but it is just the actual FFT with those slight changes, and I have shown how we arrive at them. In the near future I will write a follow-up article covering the implementation and problems related to the FFT.

Part 2 is here

References used: Introduction to Algorithms (CLRS) and Wikipedia

Feedback would be appreciated. Also please notify in the comments about any typos and formatting errors :)

Comments (56)

AlexandruValeanu

8 years ago

+24

How can you do a CONVULSION using FFT?


sidhant

8 years ago

Sorry it is a typo, thanks for pointing out.


mirceadino

+13

This is gold to me, I finally start to understand how FFT works. I'm looking forward to the 3rd part!

Another great resource : http://www.cs.cmu.edu/afs/cs/academic/class/15451-s10/www/lectures/lect0423.txt

xuanquang1999

+18

Finally, a topic on FFT that a high school student can understand. Very useful topic, thank you a lot!

Animadversion

6 years ago

Hi xuanquang1999 what topics do you need to understand fft?

I'm not sure if I understand your question correctly, but I meant this topic (the one that you're commenting on).

Mmm I think I was not clear enough, My question is that if you need to know some topics (math's, computer science's or some algorithms) before studying FFT?

Errichto

polynomials, matrices, complex numbers

mbrc

+10

Great tutorial! :D

Thanks! :D

zscoder

Finally an understandable tutorial on FFT!

Example of DFT elaborated.

rajat1603

Waiting for inverse FFT.

Coming in a week!

Diego1149

+25

Still waiting :)

downvoteplz

still waiting dogo

Sorry I was a bit busy the past few months, would be finishing off this article within next week!!

Such busy dogo much wow!

sampriti


SarvagyaAgarwal

7 years ago

-11

This is his real account NibNalin

adamant

Spoiler: use FFT with conjugation to root instead of root itself. And divide values by n after this.

tiagomontalvao

Nice tutorial xD

Just two observations:

1) When you explain the roots of unity and give the example of n=4, you wrote that e^{2π i * 0 / n} = e¹, whereas it should be e⁰

2) In the proof of lemma 3, you used twice the expression (cos(2) + i * sin(2)), instead of (cos(2π) + i * sin(2π))

Thanks for pointing out, will fix it :)

javacoder1

Any questions on codeforces using this concept?

TouchMe

Is that a typo?

C(x) = {(1, 6), (2, 7), (3, 78)} should be C(x) = {(1, -6), (2, 7), (3, 78)}???

Thanks for pointing out, fixed :)

advitiyabrijesh

Please do post Implementation and problems too! BTW Awesome Tutorial (y).

tttayushag10004

-10

nice editorial

i_love_emilia_clarke

7 years ago

+11

"for each point value form there is exactly one coefficient representation and vice — versa". Above statement is not true, for each coefficient representation there may be many point value form, point value form is not unique for a polynomial, you can choose any N point to uniquely identify a n-bounded polynomial.

Yes, my bad. I fill fix it.

baukaman

good job sidhant!

Russian speaking contestants use e-maxx. Very comprehensive explanation with codes. As bonus you get optimization tricks + references to base problems.

lifecodemohit

Nice tutorial.

I think, there is a typo here: Q) What is point value form ? Ans) Well, a polynomial A(x) of degree n can be represented in its point value form like this

Degree of A(x) should be n-1.

choutii

6 years ago

is really a good TV drama, I mean season 1, season 2 confused me

YoyOyoYOy000y000

is it possible in fft.. all possible x^y minimum value...

like , A = a1x^3 + b1x^2+ c1x + d1 B = a2x^3 + b2x^2+ c2x + d1 now possible possible way to create x^3= (a1*d1),(b1*c2),(c1*b2),(d1*a2) now i want min((a1*d1),(b1*c2),(c1*b2),(d1*a2))

if any other algorithm what is it?

JoaoBapt

Hi sidhant! I would just like to point that you mixed a concept in the article! Where you said "The argument of a complex number is equal to the magnitude of the vector from origin" you are actually talking of the modulus of the complex number, not its argument. Aside from that, excellent article! EDIT: and the modulus is actually the square root of both terms squared, I think you forgot the square root.

GeorgeRaouf

5 years ago

problems on FFT?

pleasant

5 years ago

Here.

thanks :D

hellojarvis

a̶r̶g̶u̶m̶e̶n̶t̶ modulus. argument is the angle.

nj099

4 years ago

This is a great source to learn fft — http://web.cecs.pdx.edu/~maier/cs584/Lectures/lect07b-11-MG.pdf

TheEpicCowOfLife

In lemma 6 there was a typo that confused me a bit. I know this blog post is 4 years old but the fist summation on line 2 of lemma 6 has the /n in the wrong place. The first term in the summation should be (1/n) * w_n^(-kj), not w_n^(-kj/n). Either way, great blog! Time to see how this stuff is impled.

spookywooky

4 years ago

it should look like $$$\frac{w_n^{-kj}}{n}$$$ or $$$\frac{1}{n}\cdot w_n^{-kj}$$$

code026

3 years ago

An excellent complimentary video https://www.youtube.com/watch?v=h7apO7q16V0

Parvej

I have started learning FFT a few days ago.

Suppose, I have a vector <.Complex> A.

Will anyone please tell me

1.What does FFT(A) really mean?

2.What is the physical meaning of it?

Killever

2 years ago

sidhant I don't see this anymore it says Unable to parse markup [type=CF_MARKDOWN]

QuangBuiCPP

2 years ago

https://codeforces.cc/topic/43659/en14

This is the last viewable version I can find!

-QuangBui(YT/CP)

mychecksdead

11 months ago

-16

This tutorial is too good to exist.

abhishek97.emp

10 months ago

Can someone please tell me, in the conquer part, why A(Wn ^k) is different for k<n/2 and k>=n/2?

Luci_badea1000

6 months ago

Why do you need to divide by n for the roots of the inverse matrix? Doesn't it also work without the division by n? Can someone explain please? (I mean that I 'think' the sum from lemma 6 is true without the /n...)

6 months ago

Nvm, I took a closer look and I understood..Sorry for spam

okgourav

7 weeks ago

checkout this video for hindi explanation: https://www.youtube.com/watch?v=tXpzsSLxx3Q

sidhant's blog