### anusatya.choudhary's blog

By anusatya.choudhary, history, 2 months ago,

Find the expected value of the number of segments in a string of length A in a language having alphabet size B.

A segment is defined as the maximum contiguous substring containing the same character. Eg. In string 10011. The segments are 1, 00 and 11. The number of segments will be 3.

Input format: The first argument is A and second argument is B.

Output format: Return the expected value of the number of segments. This can be represented in the form of x/y. Return x.y^(-1)(mid 10^9 + 7).

Constraints : 1<=A,B<=10^9

Example : A=1,B=2. Output is 1. A=2,B=2. Output is 500000005.

• +4

 » 2 months ago, # |   0 Auto comment: topic has been updated by anusatya.choudhary (previous revision, new revision, compare).
 » 2 months ago, # |   0 The test was a week before so I am not cheating.
 » 2 months ago, # |   0 Auto comment: topic has been updated by anusatya.choudhary (previous revision, new revision, compare).
 » 2 months ago, # |   +14 let $p_i$ be probability of answer being i after some stage.After some number of stages answer is $\sum_{p_i \neq 0}(i * p_i)$Then we add character to the end of the string, there are to types — it's the same as last (probability of that = 1/B), it starts new segment (probability = $(B - 1) / B$)So now answer becomes : $\sum_{p_i \neq 0}(i * p_i / B + (i + 1) * p_i * (B - 1) / B)$= $\sum_{p_i \neq 0}(i * p_i + (i * p_i + p_i) * (B - 1)) / B$= $\sum_{p_i \neq 0}(B * i * p_i + B * p_i - p_i) / B$= $\sum_{p_i \neq 0}(i * p_i) + \sum_{p_i \neq 0}(p_i) * (B - 1) / B$= "previous answer" $+ (B - 1) / B$answer for string length 1 is 1so final answer is get by adding A — 1 characters, so answer := $1 + (A - 1)* (B - 1) / B$
•  » » 2 months ago, # ^ |   0 Thank you!
•  » » » 2 months ago, # ^ |   0 You can actually avoid most of the math that you did using the linearity of expectation. Using your observation at the beginning (that a new segment starts with probability $(B-1)/B$), you can notice that your answer increases with probability $(B-1)/B$, because the number of segments is the number of times a segment starts. This basically means you can skip all of the math until the line "previous answer"+$(B-1)/B$, reaching the same answer in a much more intuitive way.