How to solve this expected value problem, from Winter Petrozavodsk 2015? - Codeforces


By Errichto, 8 years ago, In English


I'm not able to find any link. I think that this problem was used on the last day of the camp (Petrozavodsk, Winter 2015). I didn't manage to solve it during the contest and I still don't see a solution.

Given n ≤ 10⁶, find the standard deviation (or variance) of a_n, with allowed precision 10⁻⁶. A sequence a₁, a₂, ..., a_n is generated as follows:

a[1] = 1
for(int i = 2; i <= n; i++) {
	int j = rand(1, i-1); // random integer from interval [1, i-1]
	int k = rand(1, i-1);
	a[i] = a[j] + a[k];
}

I'm not sure about constraints but I think it doesn't matter (I would appreciate any polynomial solution). Also, maybe the answer was required modulo some number, instead of a real number with some precision — I don't know, sorry. And yes, the expected value is n.

Tags

petrozavodsk, help, expected value, swistakkdidntwanttohelp

Comments (15)

8 years ago, # |

Like for the tag "swistakkdidntwanttohelp" :)). I googled what standard deviation means and it gave me a headache. I'm more than curious how to solve this problem and I'll think about it. Thanks for sharing it with us. I hope someone will be able to solve it.

8 years ago, # ^ |

Yeah, googling it may be scary. The most important formula to know is:

Var(X) = E[(X - E[X])²] = E[X²] - (E[X])²

So, it's enough to calculate E[X²]. In other words, the task is to find the expected value of a_n².

8 years ago, # ^ |

Sorry for asking this, but could you please be more thorough? I really don't understand what Var(X) or E[X] means. Can you define them? I think it may be helpful (for me, at least) to know exactly what you are saying, because this is the kind of topic that may come up in a lot of problems.

8 years ago, # ^ |

E[X] is the expected value of X, and Var(X) is something not very important to understand. Let's talk via PM, not to spam people.

8 years ago, # ^ |

Var(X) is dispersion, I think. We study it a lot in probability classes. Roughly, it measures how much the values vary around the expected value.

For example, take F(x) = x and the numbers 1 and 101. The expected value is 51, and the dispersion is much bigger than for the numbers 50 and 52, which are both very close to their expected value (also 51).

8 years ago, # |

The problem is easier than you think :) It's a shame that we didn't manage to solve it during the actual contest. The solution is just to write down what you want to compute and then, well, compute it. The following description might not be accurate, but you should get the idea. I'll index n from 0, and let ξ_n be the random variable. For some reason \frac and \sum commands do not work, so everything is super-ugly:

d_n = Dξ_n = Eξ_n² - (Eξ_n)²

q_n = Eξ_n² = n⁻² Σ_{i, j < n} [E(ξ_i + ξ_j)²] = n⁻² Σ_{i, j < n} [E(ξ_i²) + 2E(ξ_iξ_j) + E(ξ_j²)]

s_n = Σ_{i < n} q_i, just partial sums

t_n = Σ_{i < n} E(ξ_iξ_n)

r_n = Σ_{i < j < n} E(ξ_iξ_j) = Σ_{j < n} t_j, just partial sums

p_n = Σ_{i, j < n} E(ξ_iξ_j) = 2r_n + s_n

Computing q_n from s_n and p_n should be straightforward, so let's focus on computing t_n.

t_n = Σ_{i < n} E(ξ_iξ_n) = n⁻² Σ_{i, j, k < n} E(ξ_i(ξ_j + ξ_k)) = n⁻¹ · 2 Σ_{i, j < n} E(ξ_iξ_j) = 2n⁻¹ p_n
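The step described as straightforward can be spelled out as follows. This is a sketch using only the definitions of q_n, s_n, t_n and p_n above (0-indexed, so there are n earlier terms to choose from); the two update rules on the last line follow directly from those definitions.

```latex
\begin{aligned}
q_n &= \frac{1}{n^2}\sum_{i,j<n}\Big[E(\xi_i^2) + 2E(\xi_i\xi_j) + E(\xi_j^2)\Big]
     = \frac{1}{n^2}\Big[n\,s_n + 2p_n + n\,s_n\Big]
     = \frac{2n\,s_n + 2p_n}{n^2}, \\
s_{n+1} &= s_n + q_n, \qquad
p_{n+1} = p_n + 2t_n + q_n = p_n + \frac{4}{n}\,p_n + q_n .
\end{aligned}
```

Each value therefore depends only on s_n and p_n, so the whole table can be filled in O(n) time.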

8 years ago, # ^ |

Thank you very much! I've managed to implement it, and the results are confirmed by a Monte Carlo simulation.

~~In the last line you have a small mistake. It should be t_n = ... = n⁻² · 2 Σ_{i, j, k < n} E(ξ_iξ_j) = n⁻¹ · 2 Σ_{i, j < n} E(ξ_iξ_j).~~ And btw. r_n turns out to be unnecessary. Anyway, thanks again!

my code

#include <bits/stdc++.h>
using namespace std;

// random integer from the interval [a, b]
int rand(int a, int b) {
	return a + rand() % (b - a + 1);
}
// estimates EV(a[n]^2) by simulation; requires n < 105
double monte_carlo(int n) {
	int T = 100000;
	double total = 0;
	for(int rep = 0; rep < T; ++rep) {
		int a[105];
		a[1] = 1;
		for(int i = 2; i <= n; ++i)
			a[i] = a[rand(1,i-1)] + a[rand(1,i-1)];
		total += (double) a[n] * a[n]; // cast first: a[n]^2 may overflow int
	}
	return total / T;
}

const int nax = 105;

double q[nax]; // EV(a[n]^2) - this is the answer we need
double s[nax]; // s[n] = sum(q[i]), for 1 <= i <= n
double t[nax]; // t[n] = sum( EV(a[i]*a[n]) ), for 1 <= i < n
double p[nax]; // p[n] = sum( EV(a[i]*a[j]) ), for 1 <= i, j <= n

double solve(int N) {
	q[1] = s[1] = p[1] = 1;
	for(int n = 2; n <= N; ++n) {
		q[n] = (2 * p[n-1] + 2 * (n - 1) * s[n-1]) / pow(n-1, 2);
		s[n] = s[n-1] + q[n];
		t[n] = 2 * p[n-1] / (n-1);
		p[n] = p[n-1] + 2 * t[n] + q[n]; // equal to 2*r[n]+s[n]
	}
	return q[N];
}

int main() {
	srand(42);
	int n = 20;
	printf("%lf\n", monte_carlo(n));
	printf("%lf\n", solve(n));
	return 0;
}

8 years ago, # ^ |

Corrected n⁻¹, thanks! I used r_n because it was the most straightforward way for me. Good luck with the other problems of this contest :)

8 years ago, # ^ |

I don't want to solve other problems from the contest. It's just that I didn't manage to solve this problem back then (during the camp) and I tried it again a few times since then. Finally, I really wanted to know how to solve it :)

Though, I have many more problems waiting to be upsolved. I'm not good at telling myself to sit and solve some old hard problem. Inventing new problems is so much more interesting.

8 years ago, # |

Using the law of total variance, it's not hard to get this recurrence for f_n = Var(a_n):

but I'm not sure if there is enough precision for n = 10⁶. Should work fine if the answer is only required modulo a prime.

8 years ago, # ^ |

Could you please share your proof, or correct the formula? Because it seems like this one produces an incorrect output for n = 4 (the correct one is 16/9).

8 years ago, # ^ |

Oops, I thought all a_i's are independent, but they are obviously not. I'm sorry :(

8 years ago, # ^ |

I'm not familiar with this theorem. Can you provide some example where it can be used? In something that could be a competitive-programming problem.

8 years ago, # |

After going through post and all comments, I was like

8 years ago, # ^ |

why?
