Hi, this is going to be a solution to a "harder" version of the last problem of the recent div 2 round that I coordinated. We can actually solve the problem for $$$|s| \leq 60$$$ but we decided to lower the constraint as forcing this solution is not interesting and also way too hard for the last problem of a div 2. I would like to thank the authors for allowing me to write this blog and shamelessly farm contribution from their contest.

Firstly, I would like to share a solution (credits to Sexpert) which solves $$$|s| \leq 60$$$ without explictly factoring the entire polynomial.

**spoiler**

Now, let's get into the algorithm to factorize polynomials. We will use Berlakamp's algorithm, which is an algorithm to factorise polynomials where the coefficients of polynomials are elements of the field $$$GF(q)$$$, where $$$q=p^k$$$. I am writing this blog because the original paper was pretty hard to follow for me. So, I hope writing this blog would make understanding this algorithm from a competitive programming standpoint easy.

Before we get to the algorithm we have to prove $$$2$$$ things first:

$$$f(x)^q-f(x)=\prod\limits_{u \in GF(q)} (f(x)+u)$$$

**proof**

$$$f(x)^q=f(x^q)$$$

**proof**

### The actual algorithm

Let's say we found some polynomial $$$h(x)^q-h(x) = 0 \pmod {f(x)}$$$ where $$$1 \leq \deg h<\deg f$$$ (ignore how we have done this for now).

We know that $$$h(x)^q-h(x)= \prod\limits_{u \in GF(q)} (h(x)+u)$$$ which is a multiple of $$$f(x)$$$.

$$$\gcd$$$ is a multiplicative function in the sense that for coprime $$$a,b$$$, we have $$$\gcd(a,c) \cdot \gcd(b,c) = \gcd(a \cdot b,c)$$$. It is obvious that $$$h(x)+a$$$ is coprime from $$$h(x)+b$$$ when $$$a \neq b$$$. So $$$\prod\limits_{u \in GF(q)} \gcd(h(x)+u,f(x)) = f(x)$$$.

We are almost done, we just need to show that the product above gives us a useful factorization of $$$f(x)$$$, that is it doesn't tell us that $$$f(x)=f(x) \cdot 1 \cdot 1 \ldots$$$. But we have $$$\deg h<\deg f$$$, so it is impossible that $$$\gcd(h(x)+a,f(x))=f(x)$$$.

### Finding $$$h$$$

We have $$$h(x)^q-h(x) = h(x^q)-h(x) = 0 \pmod{f(x)}$$$. Therefore, we can reduce this to finding $$$h_0,h_1,\ldots$$$ such that $$$h_0(x^{0q}-x^0)+h_1(x^{1q}-x^1)+\ldots = 0 \pmod{f(x)}$$$. This is some linear algebra as we want to find the null-space of polynomial of $$$x^{iq}-x^i \mod f(x)$$$. This can be done using XOR basis Since we are finding $$$h(x)$$$ under modulo $$$f(x)$$$, we can reduce any non-trivial solution to $$$h(x)$$$ to one with $$$1 \leq \deg h < \deg f$$$.

Now, the only problem is figuring out when such $$$h$$$ exists. Suppose that $$$f(x)=f_0(x)f_1(x)$$$ for some **coprime** non-constant polynoimials $$$f_0(x)$$$ and $$$f_1(x)$$$.

By Chinese remainder theorem, we know that for any $$$(c_0(x),c_1(x))$$$ with $$$\deg c_i < \deg f_i$$$, we can find a $$$h(x)$$$ that satisfies $$$h(x) \pmod{f_i(x)}=c_i(x)$$$.

Let us look at the particular case of $$$(c_0(x),c_1(x))=(1,1)$$$.

We have $$$h(x) = 1 = 1^q = h(x)^q \pmod{f_i(x)}$$$.

Since we have $$$h(x)=h(x)^q \pmod{f_i(x)}$$$, it follows that $$$h(x)=h(x)^q \pmod{f(x)}$$$.

However, a shortcoming of this process is that $$$h(x)$$$ only exists when we can find appropriate $$$f_0(x)$$$ and $$$f_1(x)$$$. In particular, we will have to seperately handle the case where $$$f(x)=g(x)^k$$$ where $$$g(x)$$$ is an irreducible polynomial and $$$k>1$$$.

### Handling Repeated Powers

Notice that $$$\gcd(f(x),f'(x)) = \gcd(g(x)^k,k g(x)^{k-1})$$$.

- If $$$k$$$ is a multiple of $$$p$$$, then we can use the identity of $$$f(x^p)=f(x)^p$$$ to recurse to a smaller polynomial.
- Otherwise, $$$\gcd(f(x),f'(x))=g(x)^{k-1}$$$ and we immediately have $$$g(x)=\frac{f(x)}{\gcd(f(x),f'(x))}$$$.

## Time Complexity

Here is my implementation: 162171925. Note that it does not actually solve $$$|s| \leq 60$$$ quickly because it does not do factorize on integers quickly.

Let the degree of the polynomial be want to factorize be $$$d$$$. The each call of `factor`

requires us to build a xor basis and also find the $$$\gcd$$$ of $$$q$$$ polynomials. They respectively take $$$O(d^2 \cdot A(d))$$$ and $$$O(q \cdot d \cdot A(d))$$$ time, where $$$O(A(d))$$$ is the time complexity required to add two polynomials. `factor`

is called at most $$$d$$$ times.

In $$$GF(2)$$$, we can assume that $$$A(d)=1$$$ because it is literally XOR, so the time complexity is $$$O(d^3)$$$.

In $$$GF(q)$$$, it does not make sense to assume that $$$A(d)=1$$$, so we will use $$$A(d)=d$$$. The time complexity balloons to $$$O(d^4 + q d^2)$$$.

As I understand, Berlekamp's original algorithm is not polynomial-time (as in, you won't be able to use it for larger $$$p$$$).

I was looking into it several years ago, and it appears that all polynomial-time algorithms are randomized. For example, to factorize polynomials over $$$\mathbb Z_p$$$ in polynomial time, one can use Cantor-Zassenhaus algorithm (see Knuth's Seminumerical algorithms 3rd Ed., p. 448, section "Distinct-degree factorization" for reference):

## Getting rid of repeated factors

Algorithm takes a polynomial $$$f(x)$$$ to be factored as an input and expects it to be square-free (i.e. not have repeated irreducible divisors). To get rid of repeated factors, you'd need to first divide $$$f$$$ with $$$\gcd(f, f')$$$.

Note that it is possible that $$$f'=0$$$, in which case factorization should proceed as

per the identity in the article.

## Filtering by degrees

Over $$$\mathbb Z_p$$$, any irreducible polynomial $$$q(x)$$$ of degree $$$d$$$ is a divisor of $$$x^{p^d}-x$$$ and

nota divisor of $$$x^{p^t} - x$$$ for $$$t < d$$$.Say, we want to factorize $$$f(x)$$$ of degree $$$n$$$, then we may start by factoring it as

where $$$h_i(x)$$$ collects all irreducible divisors of $$$f(x)$$$ that have degree $$$i$$$. To do this, we use

where $$$f_1(x) = f(x)$$$ and $$$f_i(x) = \frac{f_{i-1}(x)}{h_{i-1}(x)}$$$.

## Fixed-degree factorization

Now, each individual $$$h_i(x)$$$ has $$$\frac{\deg h_i}{i}$$$ irreducible divisors of degree $$$i$$$ each.

David Cantor and Hans Zassenhaus proved that

for any polynomial $$$t(x)$$$, so to factorize $$$h_i(x)$$$ we may try to compute these $$$\gcd$$$ with different random $$$t(x)$$$, until all the divisors are found. It can be proven that the probability of finding a non-trivial divisor is at least $$$\frac{1}{2}$$$ on each iteration.

Note that executing it just with $$$i=1$$$ it allows to find all roots of $$$q(x)$$$.

## Notes

The algorithm expects odd $$$p$$$. Otherwise, one may use another divisor $$$d$$$ of $$$p^i-1$$$ and compute $$$\gcd(h_i, t^d - 1)$$$.

Why is irreducible polynomial $$$q(x)$$$ of degree $$$d$$$ not a divisor of $$$x^{p^t}-x$$$ for $$$t<d$$$?

I would assume it is because magically $$$ord(x)$$$ is same as the order of the field, but it is not the case. For example, $$$q(x)=1+x^3+x^4+x^5+x^8$$$ we have $$$ord(x)=17$$$.

It follows from Lemma 2.3 in Berlekamp's article Factoring polynomials over large finite fields:

in $$$GF(q)$$$, $$$x^{q^m} - x$$$ factors into the product of all monic irreducible polynomials of degrees dividing $$$m$$$.

So, after we trimmed smaller degrees with $$$f_i$$$, we're only left with degrees that are exactly $$$m$$$.

If you allow $$$q(x)$$$ to contain repeated divisors, then $$$h_i(x)$$$ will not contain all irreducible divisors of $$$q(x)$$$ as $$$\gcd$$$ only takes a single copy of each polynomial. I think you might want to explictly state that the first step of this algorithm is to remove repeated divisors to work with square-free polynomials.

This is true, algorithm as a whole expects that repeated factors are removed from $$$f(x)$$$ (which is doable in the same way you do it in the article). I amended the comment to reflect it.

And here I am trying to understand how this problem is even connected to polynomial factorization. The answer is... it is not, really? You only use the factorization to get some multiple of the answer so that you can then check its divisors. For that purpose, you can use $$$2^{n} \prod_{i=1}^{n} (2^i - 1)$$$. It is not even important that this number doesn't fit in long long, you only need its factorization.

If we know that $$$x^N=1$$$ for $$$N = p_1^{d_1} \dots p_m^{d_m}$$$, is it correct that the order of $$$x$$$ is something like

where $$$k_i$$$ is the largest possible such that $$$x^{N / p_i^{k_i}} = 1$$$? Or is there some other way to find the order?

That's right