Linear Recurrence and Berlekamp-Massey Algorithm

#	User	Rating
1	tourist	3880
2	jiangly	3669
3	ecnerwala	3654
4	Benq	3627
5	orzdevinwang	3612
6	Geothermal	3569
6	cnnfls_csy	3569
8	jqdai0815	3532
9	Radewoosh	3522
10	gyh20	3447

#	User	Contrib.
1	awoo	161
2	maomao90	160
3	adamant	156
4	maroonrk	153
5	-is-this-fft-	148
5	SecondThread	148
5	atcoder_official	148
8	Petr	147
9	nor	144
9	TheScrasse	144

#IjustWantContribution

It seems there isn't any blog about Berlekamp-Massey Algorithm around here, so I decided to go on a try. :P

Acknowledgement: Hats off to matthew99 for introducing this algorithm.

What is 'linear recurrence'?

Assuming there is a (probably infinity) sequence a₀, a₁...a_n - 1, we call this sequence satisfies a linear recurrence relation p₁, p₂...p_m, iff $\text{[math]}$ . (Obviously, if m ≥ n any p can do :P)

How to calculate k-th term of a linear recurrence? (Bonus)

For a polynomial $\text{[math]}$ , we define $\text{[math]}$ .

Obviously G satisfies G(f) ± G(g) = G(f ± g).

Because $\text{[math]}$ , if we let $\text{[math]}$ , then G(f) = 0. Also G(fx), G(fx²)... = 0. So G(fg) = 0 (g is any polynomial).

What we want is G(x^k). Because G(f⌊ x^k / f⌋) = 0, then $\text{[math]}$ .

We can calculate $\text{[math]}$ in a binary-exponentiation manner, then calculate $\text{[math]}$ . Time complexity is $\text{[math]}$ or $\text{[math]}$ (if you use fft etc.)

How to find (the shortest) linear recurrence relation?

It's Berlekamp-Massey Algorithm to the rescue! For a given sequence x₀, x₁...x_n - 1, it can calculate one shortest linear recurrence relation for every prefix in O(n²) time.

Let's define the value of a relation sequence p₁, p₂...p_m evaluated at position t: $\text{[math]}$ (t ≥ m). A valid linear recurrence relation is a relation sequence with correct value evaluated at every position ≥ m.

Let's consider the numbers from left to right. Start from {}, we evaluate the current relation sequence at current position t (from 1 to n). If we got a_t, then it's still good, go on. Assume we've got value v, if we somehow got some relation sequence x that evaluated as 1 at position t, and evaluated as 0 (or undefined) at positions < t, then minus current sequence with (v - a_t)x, we're done.

If this is not first non-zero position, we have run into this situation before. Let's say s = {s₁, s₂...s_m} evaluated as x_t' + v' at position t' and correct at positions before t', then {1, - s₁, - s₂... - s_m} should evaluated as v' at position t' + 1 and 0 otherwise. Divide it with v' and add proper (t - t' - 1) zeroes in front, we've got the x we need!

If we run into this situation several times before, we can choose the one that is shortest after filling zeroes.

a sample (in case you didn't understand clearly)

A not-so-elegant proof-of-concept

#include <bits/stdc++.h>
using namespace std;
#define pb push_back
typedef double ld;
int n,pn=0,fail[2333];
typedef vector<ld> vld;
vld ps[2333]; ld x[2333],delta[2333];
int main()
{
    scanf("%d",&n);
    for(int i=1;i<=n;i++) scanf("%lf",x+i);
    int best=0;
    for(int i=1;i<=n;i++)
    {
        ld dt=-x[i];
        for(int j=0;j<(int)ps[pn].size();j++)
            dt+=x[i-j-1]*ps[pn][j];
        delta[i]=dt;
        if(fabs(dt)<=1e-7) continue;
        fail[pn]=i; if(!pn) {ps[++pn].resize(i); continue;}
        vld&ls=ps[best]; ld k=-dt/delta[fail[best]];
        vld cur; cur.resize(i-fail[best]-1); cur.pb(-k);
        for(int j=0;j<(int)ls.size();j++) cur.pb(ls[j]*k);
        if(cur.size()<ps[pn].size()) cur.resize(ps[pn].size());
        for(int j=0;j<(int)ps[pn].size();j++) cur[j]+=ps[pn][j];
        if(i-fail[best]+ps[best].size()>=ps[pn].size()) best=pn;
        ps[++pn]=cur;
    }
    for(int g=0;g<(int)ps[pn].size();g++)
        cout<<ps[pn][g]<<" "; puts("");
}

Combine the above two section, we can acquire a handy weapon for these kind of problems :)

#include <bits/stdc++.h>
using namespace std;
#define pb push_back
typedef long long ll;
#define SZ 233333
const int MOD=1e9+7;
ll qp(ll a,ll b)
{
	ll x=1; a%=MOD;
	while(b)
	{
		if(b&1) x=x*a%MOD;
		a=a*a%MOD; b>>=1;
	}
	return x;
}
namespace linear_seq {
inline vector<int> BM(vector<int> x)
{
	vector<int> ls,cur; int lf,ld;
	for(int i=0;i<int(x.size());++i)
	{
		ll t=-x[i]%MOD;
		for(int j=0;j<int(cur.size());++j)
			t=(t+x[i-j-1]*(ll)cur[j])%MOD;
		if(!t) continue;
		if(!cur.size()) {cur.resize(i+1); lf=i; ld=t; continue;}
		ll k=-t*qp(ld,MOD-2)%MOD;
		vector<int> c(i-lf-1); c.pb(-k);
		for(int j=0;j<int(ls.size());++j) c.pb(ls[j]*k%MOD);
		if(c.size()<cur.size()) c.resize(cur.size());
		for(int j=0;j<int(cur.size());++j)
			c[j]=(c[j]+cur[j])%MOD;
		if(i-lf+(int)ls.size()>=(int)cur.size())
			ls=cur,lf=i,ld=t;
		cur=c;
	}
	vector<int>&o=cur;
	for(int i=0;i<int(o.size());++i)
		o[i]=(o[i]%MOD+MOD)%MOD;
	return o;
}
int N; ll a[SZ],h[SZ],t_[SZ],s[SZ],t[SZ];
inline void mull(ll*p,ll*q)
{
	for(int i=0;i<N+N;++i) t_[i]=0;
	for(int i=0;i<N;++i) if(p[i])
		for(int j=0;j<N;++j)
			t_[i+j]=(t_[i+j]+p[i]*q[j])%MOD;
	for(int i=N+N-1;i>=N;--i) if(t_[i])
		for(int j=N-1;~j;--j)
			t_[i-j-1]=(t_[i-j-1]+t_[i]*h[j])%MOD;
	for(int i=0;i<N;++i) p[i]=t_[i];
}
inline ll calc(ll K)
{
	for(int i=N;~i;--i) s[i]=t[i]=0;
	s[0]=1; if(N!=1) t[1]=1; else t[0]=h[0];
	for(;K;mull(t,t),K>>=1) if(K&1) mull(s,t); ll su=0;
	for(int i=0;i<N;++i) su=(su+s[i]*a[i])%MOD;
	return (su%MOD+MOD)%MOD;
}
inline int work(vector<int> x,ll n)
{
	if(n<int(x.size())) return x[n];
	vector<int> v=BM(x); N=v.size(); if(!N) return 0;
	for(int i=0;i<N;++i) h[i]=v[i],a[i]=x[i];
	return calc(n);
}
}
using linear_seq::work;
int main()
{
	cout<<work({1,1,2,3,5,8,13,21},10)<<"\n";
}

Applications

Or, in other words, where can we find linear recurrences?

From the point of generating function, let A and P be the generating function of a and p, then A = AP + A₀ (A₀ depends on the first terms of a), then A = A₀ / (1 - P). Moreover, if A = B / C and the constant term of C is 1 then there is a linear recurrence relation for a. So, provided with the generating function of a, one can tell if it's a linear recurrence easily.

If we have some kind of dynamic-programming f[i][j] (i ≤ n, j ≤ m), we want to find f[n][1]. The transitions of f is something like $\text{[math]}$ . In old days, we may use matrix-multiplications. But things have changed! Calculate f[1][1], f[2][1]...f[m + m + m][1] and plug in the above code, we're done!

Why? Consider f[i] as a vector and v as a matrix, then f[i] = f[i - 1]v, so f[n] = f[1]v^n - 1. Consider the minimal polynomial of v, it's degree must be ≤ m and obviously there's a corresponding linear recurrence relation with length ≤ m. With a prefix of length m + m + m it's enough to figure out a correct relation.

Why is it better than matrix multiplication? Besides it's $\text{[math]}$ instead of $\text{[math]}$ , sometimes it's hard to acquire the exact transition matrix (or maybe just you're lazy enough), and this algorithm makes life better.

Try your hands

http://codeforces.com/contest/506/problem/E Write a naive dynamic-programming for small n, plug in BM, you're done! Life has never been so easy.

https://loj.ac/problem/2463 A chinese problem: Let n be a power of 2, you're given a directed graph with n nodes, from i to j there're A_i, j directed edges. Little Q is going on several trips, for every trip he will start from some node, make at least one step (i.e. go through at least one edge) and end at some node. He is wondering the number of ways if he's going on several travels, making x steps at total, and the bitwise-and of all start nodes and end nodes equals to y. For every $\text{[math]}$ , $\text{[math]}$ , you need to find the way modulo 998244353. To reduce output size, only output the bitwise-xor of all m × n answers. 2 ≤ n ≤ 64, 1 ≤ m ≤ 20000.

There're many more problems that can be solved in this way, but since posting them here is already spoiling I'm not going to post more :)