Editorial of Educational Codeforces Round 11

660C - Hard Process

The problem was suggested by Mohammad Amin Raeisi Smaug.

Let's call the segment [l, r] good if it contains no more than k zeroes. Note that if the segment [l, r] is good then the segment [l + 1, r] is also good. So we can use the two pointers method: the first pointer is l and the second is r. Let's iterate over l from left to right and move r to the right while the segment stays good (to do that we simply maintain the number of zeroes in the current segment).

C++ solution

const int N = 1200300;

int n, k;
int a[N];

bool read() {
	if (!(cin >> n >> k)) return false;
	forn(i, n) assert(scanf("%d", &a[i]) == 1);
	return true;
}

void solve() {
	int ansl = 0, ansr = 0;
	int j = 0, cnt = 0;
	forn(i, n) {
		if (j < i) {
			j = i;
			cnt = 0;
		}

		while (j < n) {
			int ncnt = cnt + !a[j];
			if (ncnt > k) break;
			cnt += !a[j];
			j++;
		}
		
		if (j - i > ansr - ansl)
			ansl = i, ansr = j;

		if (cnt > 0) cnt -= !a[i];
	}

	cout << ansr - ansl << endl;
	fore(i, ansl, ansr) a[i] = 1;
	forn(i, n) {
		if (i) putchar(' ');
		printf("%d", a[i]);
	}
	puts("");
}

Complexity: O(n).
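For comparison, here is a self-contained sketch of the same two pointers idea without the author's macros. It is a variant that iterates over the right end r and advances l while the window holds more than k zeroes, rather than iterating over l as in the text. The function name `longestGoodSegment` and the half-open return convention are illustrative assumptions, not part of the original solution.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Returns the half-open range [l, r) of a longest segment of `a`
// that contains at most k zeroes (a is assumed to hold only 0s and 1s).
std::pair<int, int> longestGoodSegment(const std::vector<int>& a, int k) {
    assert(k >= 0);
    const int n = static_cast<int>(a.size());
    int bestL = 0, bestR = 0;
    int zeroes = 0;  // number of zeroes inside the current window [l, r]
    int l = 0;
    for (int r = 0; r < n; ++r) {
        zeroes += (a[r] == 0);
        // Shrink from the left until the window is good again.
        while (zeroes > k) {
            zeroes -= (a[l] == 0);
            ++l;
        }
        if (r + 1 - l > bestR - bestL) {
            bestL = l;
            bestR = r + 1;
        }
    }
    return {bestL, bestR};
}
```

Both pointers only move forward, so the total work is still O(n). To print the answer, set a[i] = 1 for every i in the returned range, as the original solution does.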

660D - Number of Parallelograms

The problem was suggested by Sadegh Mahdavi smahdavi4.

It's known that the diagonals of a parallelogram bisect each other. Let's iterate over the pairs of points a, b and consider the middle of the segment $ab$: $c = \frac{a + b}{2}$. For each middle c let's calculate the value cnt_c, the number of segments a, b with the middle c. It's easy to see that the answer is $\sum_{c} \frac{cnt_c (cnt_c - 1)}{2}$. (The solution below stores a + b instead of (a + b) / 2 to stay in integers.)

C++ solution

const int N = 2020;

int n;
int x[N], y[N];

bool read() {
	if (!(cin >> n)) return false;
	forn(i, n)
		assert(scanf("%d%d", &x[i], &y[i]) == 2);
	return true;
}

inline li C2(li n) { return n * (n - 1) / 2; }

void solve() {
	map<pti, int> cnt;

	forn(i, n)
		forn(j, i) {
			int cx = x[i] + x[j];
			int cy = y[i] + y[j];
			cnt[{cx, cy}]++;
		}

	li ans = 0;
	for (const auto& p : cnt)
		ans += C2(p.y);
	cout << ans << endl;
}

Complexity: O(n²logn).

660E - Different Subsets For All Tuples

The problem was suggested by Lewin Gan Lewin.

Let's consider some subsequence of length k > 0 (the empty subsequences are counted separately by adding the value mⁿ at the end) and count the number of sequences that contain it. We should do that carefully so as not to count the same sequence multiple times. Let x₁, x₂, ..., x_k be the fixed subsequence. In the original sequence there can be some other elements before the element x₁, but none of them can be equal to x₁ (because we want to count the subsequence exactly once, at its leftmost occurrence). So we have m - 1 choices for each of the elements before x₁. Similarly, there can be other elements between x₁ and x₂, and we have m - 1 choices for each of them. And so on. After the element x_k there can be some elements (suppose there are j of them) with no additional constraints, so we have m choices for each of them.

We fixed the number j of elements at the end, so we should distribute the remaining n - k - j elements among the k gaps: before x₁, between x₁ and x₂, ..., between x_{k-1} and x_k. It's easy to see that there are $\binom{n - j - 1}{k - 1}$ ways to do that (it's simply the binomial coefficient with repetitions allowed). The number of sequences x₁, x₂, ..., x_k equals m^k. So the number of non-empty subsequences is $\sum_{k=1}^{n} \sum_{j=0}^{n-k} m^k m^j (m - 1)^{n-j-k} \binom{n-j-1}{k-1}$. It's easy to transform this sum into $\sum_{s=1}^{n} m^s (m - 1)^{n-s} \sum_{k=0}^{s-1} \binom{n-s+k}{k}$.

Note that the inner sum can be calculated using the formula for parallel summing: $\sum_{k=0}^{b} \binom{a+k}{k} = \binom{a+b+1}{b}$ (here with a = n - s and b = s - 1). So the answer equals $m^n + \sum_{s=1}^{n} m^s (m - 1)^{n-s} \binom{n}{s-1}$. We can also get a closed formula for the last sum to obtain a logarithmic solution, but it is not required in the problem.

C++ solution

int n, m;

bool read() {
	return !!(cin >> n >> m);
}

const int N = 1200300;

const int mod = 1000 * 1000 * 1000 + 7;

int gcd(int a, int b, int& x, int& y) {
	if (!a) {
		x = 0, y = 1;
		return b;
	}
	int xx, yy, g = gcd(b % a, a, xx, yy);
	x = yy - b / a * xx;
	y = xx;
	return g;
}

inline int inv(int a) {
	int x, y;
	assert(gcd(a, mod, x, y) == 1);
	x %= mod;
	return x < 0 ? x + mod : x;
}

inline int mul(int a, int b) { return int(a * 1ll * b % mod); }
inline int add(int a, int b) { return a + b >= mod ? a + b - mod : a + b; }
inline int sub(int a, int b) { return a - b < 0 ? a - b + mod : a - b; }

inline void inc(int& a, int b) { a = add(a, b); }

int fact[N], ifact[N];

inline int C(int n, int k) {
	if (k < 0 || k > n) return 0;
	return mul(fact[n], mul(ifact[k], ifact[n - k]));
}

int pm[N], pm1[N];

void solve() {
	const int N = n + 1;

	fact[0] = 1; fore(i, 1, N) fact[i] = mul(fact[i - 1], i);
	forn(i, N) ifact[i] = inv(fact[i]);

	pm[0] = 1; fore(i, 1, N) pm[i] = mul(pm[i - 1], m);
	pm1[0] = 1; fore(i, 1, N) pm1[i] = mul(pm1[i - 1], sub(m, 1));

	int ans = pm[n];
	fore(s, 1, n + 1) {
		int cur = 1;
		cur = mul(cur, pm[s]);
		cur = mul(cur, pm1[n - s]);
		cur = mul(cur, C(n, s - 1));
		inc(ans, cur);
	}
	cout << ans << endl;
}

Complexity: O((n + m)log MOD), where MOD = 10⁹ + 7.

660F - Bear and Bowling 4

The problem was prepared by Kamil Debowski Errichto. The problem analysis is also prepared by him.

The key is to use divide and conquer. We need a recursive function f(left, right) that runs f(left, mid) and f(mid+1, right) (where mid = (left + right) / 2) and also considers all intervals going through mid. We will eventually need a convex hull of lines (linear functions) and let's see how to achieve it.

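For variables L, R ($L \in [left, mid]$, $R \in [mid + 1, right]$) we will try to write the score of the interval [L, R] as a linear function. It would be good to get something close to a_L·x_R + b_L where a_L and b_L depend on L only, and x_R depends on R only.

$score(L, R) = \sum_{i=L}^{R} t_i \cdot (i - L + 1) = \sum_{i=L}^{mid} t_i \cdot (i - L + 1) + (mid - L + 1) \cdot \sum_{i=mid+1}^{R} t_i + \sum_{i=mid+1}^{R} t_i \cdot (i - mid)$

For each L we should find a linear function f_L(x) = a_L·x + b_L where a_L, b_L should fit the equation ( * ):

$score(L, R) = f_L(x_R) + const_R = a_L \cdot x_R + b_L + const_R$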

Now we have a set of linear functions representing all possible left endpoints L. For each right endpoint R we should find x_R and const_R to fit equation ( * ) again. With value of x_R we can iterate over functions f_L to find the one maximizing value of b_L + a_L·x_R. And (still for fixed R) we should add const_R to get the maximum possible score of interval ending in R.
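One decomposition that matches the solutions in this editorial takes a_L = mid - L + 1, b_L = the score of [L, mid], x_R = t[mid+1] + ... + t[R], and const_R = the score of [mid+1, R]. The sketch below checks that choice against the direct definition; the function names are illustrative assumptions, and t is 1-based with t[0] unused, as in the author's code.

```cpp
#include <cassert>
#include <vector>

// Direct score of the 1-based inclusive interval [L, R]:
// t[L]*1 + t[L+1]*2 + ... + t[R]*(R-L+1).
long long directScore(const std::vector<long long>& t, int L, int R) {
    long long s = 0;
    for (int i = L; i <= R; ++i) s += t[i] * (i - L + 1);
    return s;
}

// The same score written as a_L * x_R + b_L + const_R for L <= mid < R.
long long decomposedScore(const std::vector<long long>& t, int L, int mid, int R) {
    assert(L <= mid && mid < R);
    long long a = mid - L + 1;  // a_L: how many elements lie in [L, mid]
    long long b = 0;            // b_L: score of [L, mid]
    long long x = 0;            // x_R: plain sum of [mid+1, R]
    long long c = 0;            // const_R: score of [mid+1, R]
    for (int i = L; i <= mid; ++i) b += t[i] * (i - L + 1);
    for (int i = mid + 1; i <= R; ++i) {
        x += t[i];
        c += t[i] * (i - mid);
    }
    return a * x + b + c;
}
```

Only a and b depend on L, and only x and c depend on R, which is what lets the left endpoints be treated as lines and the right endpoints as query points.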

Brute Force with functions

#include<bits/stdc++.h>
using namespace std;
typedef long long ll;
const int nax = 1e6 + 5;
ll ans;
ll t[nax];

struct Fun {
	ll a, b;
	ll getValue(ll x) { return a * x + b; }
};

void rec(int first, int last) {
	if(first == last) {
		ans = max(ans, t[first]);
		return;
	}
	int mid = (first + last) / 2;
	
	rec(first, mid); // the left half is [first, mid]
	rec(mid+1, last); // the right half is [mid+1, last]
	
	// we must consider all intervals starting in [first,mid] and ending in [mid+1, last]
	
	vector<Fun> functions;
	ll sum_so_far = 0; // t[i]+t[i+1]+...+t[mid]
	ll score_so_far = 0; // t[i]*1 + t[i+1]*2 + ... + t[mid]*(mid-i+1)
	for(int i = mid; i >= first; --i) {
		sum_so_far += t[i];
		score_so_far += sum_so_far;
		functions.push_back(Fun{mid - i + 1, score_so_far});
	}
	
	sum_so_far = 0;
	score_so_far = 0;
	for(int i = mid+1; i <= last; ++i) {
		sum_so_far += t[i];
		score_so_far += t[i] * (i - mid);
		for(Fun & f : functions) {
			ll score = score_so_far + f.getValue(sum_so_far);
			ans = max(ans, score);
		}
	}
}

int main() {
	int n;
	scanf("%d", &n);
	for(int i = 1; i <= n; ++i) scanf("%lld", &t[i]);
	rec(1, n);
	printf("%lld\n", ans);
	return 0;
}

Now let's make it faster. After finding the set of linear functions f_L we should build a convex hull of them (note that they're already sorted by slope). To achieve this we need a way to compare 3 functions and decide whether one of them is unnecessary because it's always below one of the other two functions. Note that the standard convex hull of points needs something similar (but for 3 points). Below you can find an almost-fast-enough solution with a useful function bool is_middle_needed(f1, f2, f3). You may check that the numbers calculated there do fit in long long.

Almost fast enough

#include<bits/stdc++.h>
using namespace std;
typedef long long ll;
const int nax = 1e6 + 5;
ll ans;
ll t[nax];

struct Fun {
	ll a, b;
	ll getValue(ll x) { return a * x + b; }
};

bool is_middle_needed(const Fun & f1, const Fun & f2, const Fun & f3) {
	// we ask if for at least one 'x' there is f2(x) > max(f1(x), f3(x))
	assert(0 < f1.a && f1.a < f2.a && f2.a < f3.a);
	
	// where is the intersection of f1 and f2?
	// f1.a * x + f1.b = f2.a * x + f2.b
	// x * (f2.a - f1.a) = f1.b - f2.b
	// x = (f1.b - f2.b) / (f2.a - f1.a)
	ll p1 = f1.b - f2.b;
	ll q1 = f2.a - f1.a;
	// and the intersection of f1 and f3
	ll p2 = f1.b - f3.b;
	ll q2 = f3.a - f1.a;
	assert(q1 > 0 && q2 > 0);
	// return p1 / q1 < p2 / q2
	return p1 * q2 < q1 * p2;
}

void rec(int first, int last) {
	if(first == last) {
		ans = max(ans, t[first]);
		return;
	}
	int mid = (first + last) / 2;
	
	rec(first, mid); // the left half is [first, mid]
	rec(mid+1, last); // the right half is [mid+1, last]
	
	// we must consider all intervals starting in [first,mid] and ending in [mid+1, last]
	
	vector<Fun> functions;
	ll sum_so_far = 0; // t[i]+t[i+1]+...+t[mid]
	ll score_so_far = 0; // t[i]*1 + t[i+1]*2 + ... + t[mid]*(mid-i+1)
	for(int i = mid; i >= first; --i) {
		sum_so_far += t[i];
		score_so_far += sum_so_far;
		Fun f = Fun{mid - i + 1, score_so_far};
		while(true) {
			int s = functions.size();
			if(s >= 2 && !is_middle_needed(functions[s-2], functions[s-1], f))
				functions.pop_back();
			else
				break;
		}
		functions.push_back(f);
	}
	
	sum_so_far = 0;
	score_so_far = 0;
	for(int i = mid+1; i <= last; ++i) {
		sum_so_far += t[i];
		score_so_far += t[i] * (i - mid);
		for(Fun & f : functions) {
			ll score = score_so_far + f.getValue(sum_so_far);
			ans = max(ans, score);
		}
	}
}

int main() {
	int n;
	scanf("%d", &n);
	for(int i = 1; i <= n; ++i) scanf("%lld", &t[i]);
	rec(1, n);
	printf("%lld\n", ans);
	return 0;
}

Finally, one last thing is needed to make it faster than O(n²). We should use the fact that we have built a convex hull of functions (lines). For each R you should binary search for the optimal function. Alternatively, you can sort the pairs (x_R, const_R) and then use the two pointers method; check the implementation in my solution below. It gives complexity $O(n \log^2 n)$ because we sort by x_R inside of a recursive function. I think it's possible to get rid of this by sorting the prefix sums $\sum_{i=1}^{R} t_i$ in advance because it's equivalent to sorting by x_R. And we should use the already known order when we run the recursive function for smaller intervals. So, I think $O(n \log n)$ is possible this way. Has anybody implemented it?

Intended solution with two pointers

// O(n log^2(n))
#include<bits/stdc++.h>
using namespace std;
typedef long long ll;
const int nax = 1e6 + 5;
ll ans;
ll t[nax];

struct Fun {
	ll a, b;
	ll getValue(ll x) { return a * x + b; }
};

bool is_middle_needed(const Fun & f1, const Fun & f2, const Fun & f3) {
	// we ask if for at least one 'x' there is f2(x) > max(f1(x), f3(x))
	assert(0 < f1.a && f1.a < f2.a && f2.a < f3.a);
	
	// where is the intersection of f1 and f2?
	// f1.a * x + f1.b = f2.a * x + f2.b
	// x * (f2.a - f1.a) = f1.b - f2.b
	// x = (f1.b - f2.b) / (f2.a - f1.a)
	ll p1 = f1.b - f2.b;
	ll q1 = f2.a - f1.a;
	// and the intersection of f1 and f3
	ll p2 = f1.b - f3.b;
	ll q2 = f3.a - f1.a;
	assert(q1 > 0 && q2 > 0);
	// return p1 / q1 < p2 / q2
	return p1 * q2 < q1 * p2;
}

void rec(int first, int last) {
	if(first == last) {
		ans = max(ans, t[first]);
		return;
	}
	int mid = (first + last) / 2;
	
	rec(first, mid); // the left half is [first, mid]
	rec(mid+1, last); // the right half is [mid+1, last]
	
	// we must consider all intervals starting in [first,mid] and ending in [mid+1, last]
	
	vector<Fun> functions;
	ll sum_so_far = 0; // t[i]+t[i+1]+...+t[mid]
	ll score_so_far = 0; // t[i]*1 + t[i+1]*2 + ... + t[mid]*(mid-i+1)
	for(int i = mid; i >= first; --i) {
		sum_so_far += t[i];
		score_so_far += sum_so_far;
		Fun f = Fun{mid - i + 1, score_so_far};
		while(true) {
			int s = functions.size();
			if(s >= 2 && !is_middle_needed(functions[s-2], functions[s-1], f))
				functions.pop_back();
			else
				break;
		}
		functions.push_back(f);
	}
	
	vector<pair<ll, ll>> points;
	sum_so_far = 0;
	score_so_far = 0;
	for(int i = mid+1; i <= last; ++i) {
		sum_so_far += t[i];
		score_so_far += t[i] * (i - mid);
		points.push_back({sum_so_far, score_so_far});
		/*for(Fun & f : functions) {
			ll score = score_so_far + f.getValue(sum_so_far);
			ans = max(ans, score);
		}*/
	}
	
	sort(points.begin(), points.end());
	int i = 0; // which function is the best
	for(pair<ll, ll> p : points) {
		sum_so_far = p.first;
		score_so_far = p.second;
		while(i + 1 < (int) functions.size()
			&& functions[i].getValue(sum_so_far) <= functions[i+1].getValue(sum_so_far))
				++i;
		ans = max(ans, score_so_far + functions[i].getValue(sum_so_far));
	}
}

int main() {
	int n;
	scanf("%d", &n);
	for(int i = 1; i <= n; ++i) scanf("%lld", &t[i]);
	rec(1, n);
	printf("%lld\n", ans);
	return 0;
}

Complexity: O(nlog²n).

for(int i=n; i>=1; i--) {
    sum += ts + a[i], ts+=a[i];
    if (sum < 0) {
        sum=0, ts=0;
        if(a[i] < 0) bef[i-1]=1;
        else bef[i]=1;
    }
}

import sys
n, m = map(int, sys.stdin.readline().split())
MOD = int(1e9+7)
first_term = pow(2*m-1, n, MOD)
rr = first_term + ( first_term - pow(m, n, MOD) ) * pow(m-1, MOD-2, MOD)
print rr % MOD if m > 1 else n+1

Comments (43)

P_Nyagolov

8 years ago

+26

We don't need divide and conquer in F. We can use only convex hull trick. This way the solution has better complexity and is easier to implement.

Let's say that S1 is a normal prefix sum array, that is S1[i]=A[1]+A[2]+...+A[i], and S2 is again a prefix sum array but this time every element is multiplied by its index, that is S2[i]=A[1]+2*A[2]+3*A[3]+...+i*A[i]. Let's choose some R which will be the right end of our chosen interval. Now we are looking for the L that maximizes S2[R]-S2[L-1]-(L-1)*(S1[R]-S1[L-1]), which is a standard use of CHT: http://codeforces.com/contest/660/submission/17245184.

The complexity is O(NlogN).

Errichto

8 years ago

Nice! And is it possible to implement it with integers only?

Ah yes, since we need to compare A1/B1 and A2/B2 where A1, B1, A2 and B2 are integers. Thanks if you asked to make me think about it, I will know that in future! :)

But I think your B may be up to N²·10⁷ and A is up to N·10⁷ so long long's won't be enough to multiply them. So maybe the intended solution was valuable anyway because it allowed to use integers only.

Oh yeah, they are too big, sorry. But don't think that I want to say it's not valuable, of course it is.

forest

+15

In a previous educational round we've learnt how to compare fractions of long longs in integers http://codeforces.com/blog/entry/21588?#comment-262867

kalimm

sum_so_far += t[i];
score_so_far += sum_so_far;
Fun f = Fun{mid - i + 1, score_so_far};

sum_so_far can be N * 10⁷

score_so_far can be N² * 10⁷

When we will calculate f.a*f.b it can be N³ * 10⁷, which is bigger than 10²².

Am I missing something?

f.a is up to N and f.b is up to N²·10⁷ but we don't multiply some random values. Values of f.a increase by 1 and values of f.b increase by N·10⁷, as we move from f_i to f_{i+1} (and we multiply differences like f1.a - f2.a). I came up with a proof that it amortizes but I don't see that proof right now. I will try to get it again (and I hope it exists).

Mhammad1

My approach for Problem F is as follows (it didn't work, but I can't find the bug in this algorithm):

We can determine the stop point (deleted suffix) by looping back from n to 1:

then I implemented this for loop to get the best stop point for every i:

bef[n]=1;
    int last = 1; 
    for(int i=1; i<=n; i++) {
        if(bef[i]) {
            for(int j=last; j<=i; j++) go[j] = i; 
            last = i+1; 
        }
    }

So now go[i]+1---> n will be the deleted suffix for item i.

If this part of the code works, I think this problem can be solved in O(n) complexity, and if it doesn't work, my approach will fail entirely.

So, what is wrong with my code?

hellman_

In C problem, why complexity O(n + k)? even if k > n it will be O(n).

-Morass-

Well technically you are right.

Anyway considering the fact, that (by the statement) K<=N, then O(n+k) == O(n) (the complexities are equal)

So → you are right, but so is Edvard (at least in asymptote) ^_^

But I agree that it might be slightly misleading, considering, that the "k" really does not have to be used for counting of the complexity :)

Edvard

Thanks. Fixed.

For D, another interpretation is to count the vectors (dx, dy) over all pairs of points and then, for each such vector, add count * (count - 1) / 2 to the answer, since opposite sides of a parallelogram are parallel and have the same length. But this counts each parallelogram twice, so divide the answer by 2.

MStrechen

+13

It's just a building vectors on each parallelogram's side, isn't it?

smahdavi4

An easier approach, and one that is easy to implement, is to find the middle of each segment and then use the fact that in every parallelogram the diagonals intersect at their middles.

It will not work if more than 3 points can lie on the same line. For example, {(2,2),(4,4),(-2,-2),(-4,-4)} is not a parallelogram. Yes, I know, according to the problem statement it's impossible, but the fact remains.

exactly as you say!

Build vectors, merge them (count number of same vectors) and use Gauss's formula!

^_^

fifiman

How can you solve Problem E?

aayushr

I could not understand the editorial. Can someone please explain?

I_love_Captain_America

Out of A,B,C,D guess which one I found the hardest? That's right! A. FML ;_;

-8

There is another method of solving B with sorting. Some people might find it easier and shorter to code

17235532

Fadeaway

For Problem F you can just make a slope condition of the form (p[j] - p[k]) / (j - k) <= s[i], where p[i] = i * sigma(a[i]) - sigma(i * a[i]) and s[i] = sigma(a[i]). Then you can build a convex hull and, for each i, use binary search to find the best choice and update the answer. The complexity is O(nlogn).

20140355

for E ,in 5th line ,there should be m - 1 choices for each of them but not k-1 choices.

pqhuy98

-36

Here's just a funny story I want to tell you guys.

I solved problem F by a weird O(N) algorithm, which is not correct.

http://codeforces.com/contest/660/submission/17264773 (Accepted)

Maybe test cases are weak, so my solution passed 52/54 tests (failed on 2 tests, I had to write "if n=... cout..." in order to AC). Seem crazy right ?

EvgeniSergeev

An O(n) alternative for E:

    LL powr = 1;
    LL current = 1;

    ii (n) {
        current = ((2*m)*current - (current - powr)) % MOD;
        powr = (powr * m) % MOD;
    }

    cout << current << '\n';

This is more direct from the statement of the problem. We build the set of all sequences, element by element, from left to right, and current tracks the count of distinct subsequences within all the sequences (visualise a different room for each sequence maybe). At each step for each room we create m new rooms. Also, in each room, each subsequence splits into two: one with the newly added element, and one without. Except we have just created some subsequences that were already there; all of them, in fact (except the empty ones), so we subtract them (the number of empty ones is powr, which is m**i).

Seems to work: 17242031.

It seems I forgot how to solve recurrences. But Wolfram Alpha hasn't. Edit: for some reason that link doesn't work. Here it is: https://www.wolframalpha.com/input/?i=solve+recurrence+a%5Bi%5D+%3D+a%5Bi-1%5D*K+%2B+M%5E%28i-1%29

So there's a O(lg (n+m+MOD)) solution. Logarithmic time means we can solve it in Python!

Complete solution:

Apparently it works: 17273456

Nams

Hey,this may be a bit naive question but please can you explain how the answer for n=2 & m=2 for this problem is 14.

There are 4 sequences and in each we need to count unique subsequences:

00: [], [0], [0,0]

01: [], [0], [1], [0,1]

10: [], [0], [1], [1,0]

11: [], [1], [1,1]

That's 3+4+4+3.

Aviously

For B, you say "There are no tricks." What about 17246985?

zas97

I don't understand why the problem D is complexity O(n^2*logn), I know that the n^2 is there because we have to compare every segment with every other segment but I don't understand why the log(n).

abelramos

The log(n) comes up from the complexity of the data structure needed to handle cnt, such as a C++ map. Note that you need to count how many times a point appears as a middle point.

himanshujaju

I didnt understand the samples in the question for E, could someone help me out here?

ivanzuki

7 years ago

Can someone elaborate on the following from problem E's editorial?

lior5654

3 years ago

4 years later, you finally get your answer! :D

Claim 1:

$$$\sum_{k=1}^{n}{\sum_{j = 0}^{n - k}{m^km^j(m - 1)^{n - j - k}{n - j - 1 \choose k - 1}}} = \sum_{s = 1}^{n}{m^s(m - 1)^{n - s}\sum_{k = 0}^{s - 1}{n - s + k \choose k}}$$$

Proof:

We have:

$$$\sum_{k=1}^{n}{\sum_{j = 0}^{n - k}{m^km^j(m - 1)^{n - j - k}{n - j - 1 \choose k - 1}}} = \sum_{k = 1}^{n}{\sum_{j = 0}^{n - k}{m^{k + j}(m - 1)^{n - (k + j)}{n - j - 1 \choose k - 1}}}$$$

Observe that the quantity $$$k + j$$$ appears in most of the members of the inner multiplication, so summing over it in the outer sigma would simplify the sum. Let $$$s = j + k$$$. Instead of summing over $$$k$$$ in the outer and $$$j$$$ in the inner, let's sum over all of the values of $$$s$$$ in the outer and $$$j$$$ in the inner, and the corresponding value of $$$k$$$ will be $$$s - j$$$.

This is useful because $$$m^{k + j}(m-1)^{n - (k + j)}$$$ will turn into $$$m^{s}(m - 1)^{n - s}$$$ and we can take that quantity outside of the inner sum because it's independent of $$$j$$$. Also, note that the minimum value for $$$s$$$ is $$$1 + 0 = 1$$$, and the maximum value for $$$s$$$ is simply $$$n$$$. Finally, note that when fixing $$$s$$$, $$$j$$$'s possible range is $$$[0, s - 1]$$$, as $$$k$$$'s possible range is $$$[1, s]$$$ and $$$j = s - k$$$.

So we get: $$$\sum_{k = 1}^{n}{\sum_{j = 0}^{n - k}{m^{k + j}(m - 1)^{n - (k + j)}{n - j - 1 \choose k - 1}}} = \sum_{s = 1}^{n}{\sum_{j = 0}^{s - 1}{m^s(m - 1)^{n - s}{n - j - 1 \choose s - j - 1}}} = \sum_{s = 1}^{n}{m^s(m - 1)^{n - s}\sum_{j = 0}^{s - 1}{n - j - 1 \choose s - j - 1}}$$$

Now we have to deal with the inner creature, $$$\sum_{j = 0}^{s - 1}{n - j - 1 \choose s - j - 1}$$$. Let $$$k$$$ (note that this is NOT the $$$k$$$ from before, this is simply bad variable name choice) be $$$s - j - 1$$$. We want to sum over all values of this $$$k$$$ instead of all values of $$$j$$$. Observe that based on the definition we gave for $$$k$$$, $$$j = s - k - 1$$$, and because the range for $$$j$$$ is $$$[0, s - 1]$$$, the range for $$$k$$$ will be $$$[0, s-1]$$$ (because when $$$j$$$ is $$$0$$$, $$$k$$$ will be $$$s - 1$$$, and when $$$j$$$ is $$$s - 1$$$, $$$k$$$ will be $$$0$$$; we did a similar thing when converting to summing over $$$s$$$).

We get: $$$\sum_{j = 0}^{s - 1}{n - j - 1 \choose s - j - 1} = \sum_{k = 0}^{s - 1}{n - (s - k - 1) - 1 \choose k} = \sum_{k = 0}^{s - 1}{n - s + k \choose k}$$$

Finally, substituting and combining everything: $$$\sum_{s = 1}^{n}{m^s(m - 1)^{n - s}\sum_{j = 0}^{s - 1}{n - j - 1 \choose s - j - 1}} = \sum_{s = 1}^{n}{m^s(m - 1)^{n - s}\sum_{k = 0}^{s - 1}{n - s + k \choose k}}$$$.

To conclude, coming back to the initial equality we found: $$$\sum_{k=1}^{n}{\sum_{j = 0}^{n - k}{m^km^j(m - 1)^{n - j - k}{n - j - 1 \choose k - 1}}} = \sum_{s = 1}^{n}{m^s(m - 1)^{n - s}\sum_{k = 0}^{s - 1}{n - s + k \choose k}}$$$, as desired. $$$\blacksquare$$$

Claim 2:

$$$\sum_{k = 0}^{s - 1}{n - s + k \choose k} = {n \choose s-1}$$$

Introducing: The Hockey-Stick Identity

Our intuition is to use the Hockey-Stick Identity. The Mirror-Image Hockey-Stick Identity is stated as follows (specifically MHS & RHS of it in Wikipedia): $$$\sum_{k = 0}^{n - r}{k + r \choose k} = {n+1 \choose n-r}$$$ (note that $$$n \ge r$$$ must be satisfied).

We have $$$k$$$ in the bottom part of the choose, so the thing added to it, $$$r$$$, should be the upper part of the choose in the summation minus $$$k$$$, namely, let $$$r = n - s$$$. Also, let the $$$n$$$ from the Hockey-Stick identity be $$$n - 1$$$ (where this $$$n$$$ is the length of the sequence). Note that $$$n - 1 \ge n - s$$$, therefore the $$$n \ge r$$$ condition is satisfied.

Substituting: (HockeyStick($$$r = n - s$$$, $$$n = n - 1$$$))

$$$\sum_{k = 0}^{n - 1 - (n - s)}{k + n - s \choose k} = {n - 1 + 1 \choose n - 1 - (n - s)} \implies \sum_{k = 0}^{s - 1}{n - s + k \choose k} = {n \choose s-1}$$$, as desired. $$$\blacksquare$$$
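Claim 2 is easy to spot-check numerically. The sketch below evaluates both sides in exact arithmetic for small arguments; `binom` and `claim2Lhs` are illustrative helper names, not part of the comment above.

```cpp
#include <cassert>

// Exact binomial coefficient C(n, k) for small arguments.
long long binom(int n, int k) {
    if (k < 0 || k > n) return 0;
    long long r = 1;
    // After step i, r equals C(n - k + i, i), so the division is always exact.
    for (int i = 1; i <= k; ++i) r = r * (n - k + i) / i;
    return r;
}

// Left-hand side of Claim 2: sum_{k=0}^{s-1} C(n - s + k, k).
long long claim2Lhs(int n, int s) {
    assert(1 <= s && s <= n);
    long long sum = 0;
    for (int k = 0; k < s; ++k) sum += binom(n - s + k, k);
    return sum;
}
```

For instance, with n = 5 and s = 3 the left side is C(2,0) + C(3,1) + C(4,2) = 1 + 3 + 6 = 10, which equals C(5,2).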

Substituting into Claim 1:

$$$\sum_{k=1}^{n}{\sum_{j = 0}^{n - k}{m^km^j(m - 1)^{n - j - k}{n - j - 1 \choose k - 1}}} = \sum_{s = 1}^{n}{m^s(m - 1)^{n - s}{n \choose s-1}}$$$

And now, the only thing left is to include the empty sequence, we should count every sequence of length $$$n$$$, and there are $$$m$$$ options for each position therefore $$$m^n$$$ such sequences therefore we arrive at our final answer:

$$$m^n + \sum_{s = 1}^{n}{m^s(m - 1)^{n - s}{n \choose s-1}}$$$ $$$\square$$$

I will need to go back to this problem again lol. Thanks for writing this down!

You're welcome! :)

Once you finish reading & upsolving, please update here, I wrote this explanation in 2 AM and error checking would be cool :p

roll_no_1

5 years ago

The closed formula being referred to in problem E is:

$$$ans = m^n + \frac{m}{m-1}((2m-1)^n - m^n)$$$.

It doesn't work when $$$m = 1$$$ (because of division by 0). But in that case, since there is only a single sequence of length $$$n$$$ comprising of all $$$1$$$-s, the answer is simply $$$(n + 1)$$$.
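A minimal C++ sketch of this closed formula modulo 10⁹ + 7, using Fermat's little theorem for the division by m - 1 (which assumes m - 1 is not a multiple of the modulus, as is the case under the problem's limits). The names `powMod` and `closedForm` are illustrative assumptions.

```cpp
#include <cassert>
#include <cstdint>

const std::int64_t MOD = 1000000007LL;

// Fast exponentiation of b^e modulo MOD, for e >= 0.
std::int64_t powMod(std::int64_t b, std::int64_t e) {
    assert(e >= 0);
    b %= MOD;
    if (b < 0) b += MOD;
    std::int64_t r = 1;
    while (e > 0) {
        if (e & 1) r = r * b % MOD;
        b = b * b % MOD;
        e >>= 1;
    }
    return r;
}

// ans = m^n + m / (m - 1) * ((2m - 1)^n - m^n)  (mod MOD), with m = 1 handled separately.
std::int64_t closedForm(std::int64_t n, std::int64_t m) {
    if (m == 1) return (n + 1) % MOD;
    std::int64_t a = powMod(2 * m - 1, n);
    std::int64_t b = powMod(m, n);
    // m / (m - 1) computed with the modular inverse of (m - 1).
    std::int64_t ratio = m % MOD * powMod(m - 1, MOD - 2) % MOD;
    return (b + ratio * ((a - b + MOD) % MOD)) % MOD;
}
```

The two modular powers and the inverse each take O(log) steps, which gives the logarithmic solution that the editorial mentions.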

3 years ago

A note for Problem E: Different Subsets For All Tuples

What got me stuck on the problem is that I initially considered the sequence x1, x2 .. xk but I said xi cant appear in the range (position of xi-1, position of xi+1), but that completely ignored cases in which the value appears multiple times in the same segment.

So usually, in combinatorics we like to rephrase things and identify things, so the idea for cases in which the sequence appears multiple times is to only consider the sequence with the smallest lexicographical position vector. this is what allows for the described approach.

Olympia

2 years ago

Since math is hard, I have a solution for $$$E$$$ which requires basically no math and is really easy to implement! I can't prove the correctness of it in any nice way.

So first is to try out small $$$(n, m)$$$ and brute force the answers, probably using some computer program or something. That way, we can try to find a pattern.

Code

#include <iostream>
#include <vector>
#include <cassert>
#include <cmath>
#include <cstdio>
#include <map>
#include <algorithm>
#include <climits>
#include <cstring>
#include <set>
#include <queue>
#include <stack>
#include <list>
#include <cstring>
#include <random>
#include <array>
#include <chrono>

using namespace std;

vector<vector<int>> vec;

int ds (vector<int> v) {
    int dp[v.size() + 1];
    dp[0] = 1;
    map<int,int> last;
    for (int i = 1; i <= v.size(); i++) {
        dp[i] = 2 * dp[i - 1];
        if (last.count(v[i - 1])) {
            dp[i] = dp[i] - dp[last[v[i - 1]]];
        }
        last[v[i - 1]] = i - 1;
    }
    return dp[(int)v.size()];
}

void rec (int n, int m, vector<int> v) {
    if (v.size() == n) {
        vec.push_back(v);
        return;
    }
    for (int i = 1; i <= m; i++) {
        v.push_back(i);
        rec(n, m, v);
        v.pop_back();
    }
}

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);
    int N, M;
    cin >> N >> M;
    rec(N, M, {});
    int a = 0;
    for (auto& v: vec) {
        //for (int i: v) {
            //cout << i << ' ';
        //}
        //cout << '\n';
        //cout << ds(v) << '\n';
        a += ds(v);
        //cout << '\n';
    }
    cout << a;
}

If you put this in a table, we get:

n \ m	1	2	3	4	5	6
1	2	4	6	8	10	12
2	3	14	33	60	95	138
3	4	46	174	436	880	1554
4	5	146	897	3116	8045	17310
5	6	454	4566	22068	73030	191706

I index the table such that the column index represents $$$m$$$ and the row index represents $$$n$$$.

Let's read the table column by column. The third column ($$$m = 3$$$) is:

6 33 174 897 4566

Notice that everything seems to be around $$$5$$$ times bigger than the previous number. There's a little bit of an offset, though. The offsets are powers of $$$3$$$.

The fourth column ($$$m = 4$$$):

8 60 436 3116 22068

Notice that everything seems to be around $$$7$$$ times bigger than the previous number. There's a bit of an offset, by powers of $$$4$$$.

And that's the main observation.

$$$\boxed{f(n, m) = f(n - 1, m) \cdot (2 \cdot m - 1) + m^{n - 1}}.$$$

It's easy to construct a recursive solution from there.
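A sketch of that recurrence as a loop, assuming the multiplier 2m - 1 (which matches the factors 5 and 7 observed for m = 3 and m = 4) and the base value f(0, m) = 1. The function name is illustrative, and the arithmetic is done modulo 10⁹ + 7 as the problem requires.

```cpp
#include <cassert>

// f(0, m) = 1 (the empty sequence has only the empty subsequence),
// f(n, m) = (2m - 1) * f(n - 1, m) + m^(n - 1), all modulo 1e9 + 7.
long long sumDistinctSubsequences(long long n, long long m) {
    assert(n >= 0 && m >= 1);
    const long long MOD = 1000000007LL;
    long long f = 1;
    long long power = 1;  // m^(i-1) at the start of step i
    for (long long i = 1; i <= n; ++i) {
        f = ((2 * m - 1) % MOD * f + power) % MOD;
        power = power * (m % MOD) % MOD;
    }
    return f;
}
```

The values it produces agree with the brute-force table, for example 14 for n = 2, m = 2 and 191706 for n = 5, m = 6.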

inszva

DP solution for Problem E:

For a specific sequence $$$a_1, a_2, a_3, ... $$$, it's known that $$$dp[i] = dp[i-1] * 2 - dp[j-1]$$$ in which $$$j$$$ is the latest position that $$$a_j = a_i$$$.

Now, let $$$dp[i]$$$ represent the sum of $$$dp[i]$$$ over the sequences whose last element is a specific $$$x$$$. $$$x$$$ may be any number from $$$1$$$ to $$$m$$$, but it doesn't matter which. $$$dp[i] = dp[i-1] * m * 2$$$ holds for every sequence, and we should subtract some $$$dp[j]$$$ from it. We can iterate over the latest $$$j$$$ and count the sequences, so we get:

$$$dp[i] = dp[i-1] * 2 - \sum_j dp[j-1] * m * {(m-1)}^{i-j-1} $$$

And the sum is very easy to maintain when we loop it.

146751249

hugo4CF

22 months ago

-10

Here's insight into "parallel summing" in problem e solution: https://codeforces.com/blog/entry/104172

Edvard's blog

660A - Co-prime Array

660B - Seating On Bus

660C - Hard Process

660D - Number of Parallelograms

660E - Different Subsets For All Tuples

660F - Bear and Bowling 4