#### Single Round Match 797

Topcoder SRM 797 is scheduled to start at 12:00 UTC-5 on Jan 9, 2021.

Registration is now open for the SRM in the Arena or Applet and closes at 11:55 UTC-5. The coding phase will start at 12:05 UTC-5, so make sure that you are all ready to go. Click here to what time it starts in your area.

This SRM marks the beginning of Stage 3 of TCO21 and also the qualifications for TCO21 Regional Events

**Some Important Links**:: Match Results (match results, rating changes, challenges, individual test case results), Problem Archive, Problem Writing, Algorithm Rankings, Editorials and Older Editorials(SRM 710 and before),

Best of luck!

- the Topcoder Team

I think the date is supposed to be 2021, not 2020.

Thanks for the catch! :)

Gentle Reminder: Round begins in 2 hours :)

How is it possible to get 591/600 pts for med, seroiusly, wtf xD? Summon maroonrk. Has the exact/very similar problem appeared somewhere else already?

On a second thought I admit it is rather standard and I have seen it definitely, but probably never coded it.

comment on cf

I've seen the same idea in a DPRK contest and a recent CCPC, so to me, the problem was just a quick-typing.

I've first seen the idea from the editorial in "A Practical Guide To Quantitative Finance Interviews" (aka green book). The Coin Sequence problem in chapter about martingales.

I've also seen it in "Probability with Martingales" by David Williams, as an application of "Doob's optional stopping theorem". His version is called the "abracadabra problem" and has many articles/blogs covering it:

https://jeremykun.com/2014/03/03/martingales-and-the-optional-stopping-theorem/

https://math.uchicago.edu/~may/VIGRE/VIGRE2011/REUPapers/Ai.pdf

The recent CCPC problem is https://codeforces.com/gym/102832/problem/G. Kevin, Scott and I virtualled this contest recently on stream actually, and Scott told us the solution then.

During the challenge phase I wasn't able to open source of few submission, why is that so?

also draft of editorial: https://docs.google.com/document/d/e/2PACX-1vS1-Zy1xo9jJ363yXpYynBMacuf2mc89WMc_19hyxZRHMgGRfmJcaD87K0rTWD1CwThVL7iYc5KGsXK/pub

Any ideas for the 600 point problem? (div1) The one about rolling a dice till the last few rolled letters match a given goal string?

Specifically, all the states in the DP, (using a KMP as a support for the DP), seem to be kind of circular and interdependent on each other. Is there any way to resolve this?

For each $$$i$$$, find the expected number of steps until you get to the KMP state $$$i+1$$$ from the state $$$i$$$.

I ended up coding a Gauss elimination and it turned out to be fast enough to pass. I suspect it may have been because I chose the right order of traversal and it actually worked in O(N^2) but I didn't prove it on the contest. On the contest I also added some heuristic for the case when there were a lot of zeroes in each row, but I tried to submit it in the practice room without that optimization and it still passed.

WOW!! how exactly did you speed up O(N^3) to O(N^2)?

Gauss elimination will work in O(N^2) since there is nothing above the diagonal in the matrix. You can even solve in O(N*|Dice|) — number of non-zero elements there.

Actually stupid, don't know why it worths 600. Just represent $$$dp[i]$$$ as $$$a\cdot dp[0] + b$$$. Then we get $$$dp[0]$$$ since $$$dp[length(goal)] = 0$$$.

That is pretty much equivalent to actually solving the system of linear equations. So your solution is basically the same as what the other people used. I guess, the challenge was implementing the gaussian elimination along with KMP in the time which was allotted

How to copy paste a list of integers during hacking phase?

At least in the applet the following works. Copy an entire string that looks like this:

paste it into the text box where you normally enter one element, and press the {} button.

Draft of the editorial

It's funny that the formula for the medium can also be discovered and proven using the theorem from the hard about expected time for a Markov chain to return :) (motivated by reading this)

Consider a Markov chain where state is a k-1 character string and transition is adding one character to the end (according to our probabilities) and removing the first character. It's easy to verify that the stationary distribution of this chain has probabilities equal to the product of probabilities comprising the string, and therefore the expected repeat time for any state is one over that.

Now consider our string s and its largest border t. We first need to find the first occurrence of t, and then occurrences of t will occur every 1/(product of probabilities of characters of t) steps on average, and distances between subsequent occurrences are independent random variables.

For each such occurrence, the probability that we will then complete t to become s is (product of probabilities of characters of s-t), and these probabilities are independent for different occurrences of t (here we use the fact that t is the largest border, not just any border).

So the average number of attempts we need to complete s is 1/(product of probabilities of characters of s-t). Since t occurs at the end of s, completing s after the x-th occurrence of t is on average the same as the (x+1)-th occurence of t.

So we have the answer for s as the answer for t plus 1/(product of probabilities of characters of t)*1/(product of probabilities of characters of s-t)=1/(product of probabilities of characters of s).

Actually "Since t occurs at the end of s, completing s after the x-th occurrence of t is on average the same as the (x+1)-th occurence of t." part is a bit shaky. Is there a way to make this step cleaner?

can someone please explain the last third and last-second para?

In last-third I don't understand how we get independent probability and how we used the fact that t is the largest border.

I couldn't get last-second, it's a complete bouncer :-(

also at last para, I am a bit confused (I think I figured it out why it is so, but still can someone give some intuitive explanation regarding why the expression should be as it is, why not $$$E(f(t)s't)$$$ )

Last para I think is something like this? am I right?

\end{align*}$$$

I also don't understand how we get independent probability and how we used the fact that t is the largest border.

Consider $$$s=abaca$$$. It's largest border is $$$t=a$$$. Consider two subsequent occurrences of $$$a$$$, distance between which is two ($$$a?a????$$$). Completing $$$s$$$ after the first $$$a$$$ and after the second $$$a$$$ are dependent.

Could anyone elaborate this please?

I did Div1A same as given in editorial, but on systest it fails on test 6 showing SIGILL ,but on compiling locally with debug flags it shows no error and gives correct output can someone please tell me what's causing this error.

How do you see which test a particular submission fails on? Also, if you are using BFS / Dijkstra, maybe your queue is overflowing somehow, BTW.

There is an option in "practice option" to run system tests.

But debug flags can catch such type of errors?

I use this:

g++ -std=c++17 -Wshadow -Wall -o "%e" "%f" -g -fsanitize=address -fsanitize=undefined -D_GLIBCXX_DEBUG

Maybe only on large tests, the overflow happens.

BTW "There is an option in "practice option" to run system tests." That only shows me pass / fail, not TLE etc or on what test it failed. any ideas?

You can copy the arguments of the test on which your solution fails.

Nice idea!

How do you copy the arguments of the failed system test?

It's my first time using Topcoder and I was able to "Run system tests" yesterday but now the button is grayed out. Is this because I tried too many times?

EDIT: Resubmitting my code fixed the problem of the grayed out "Run system tests button"

EDIT2: Using the applet instead of the web interface I can now see the arguments of the failed system test

Can anyone confirm if 'uncaught exception' means MLE, or is there supposed to be a separate MLE error message? I keep getting 'uncaught exception' for a particular test case, which I suspect to be due to MLE for the div1A problem (TC 30, if that is relevant, anyhow).

I got the verdict "uncaught exception" for problem div1A from this contest. I ran the failing test case locally and noticed that my BFS queue was growing indefinitely. I fixed the problem and the test cases passed. So "uncaught exception" can indeed be because of MLE.

Can anyone suggest an intuitive way to think about the solution of Div. 1 Medium problem RollMe? I understood the solution from the editorial but with my current level of understanding, I won't be able to solve if a similar question appears in a contest again.

Yes please! I had to read and re-read it multiple times, yet my understanding is shaky, in fact very weak. Someone kindly help.

Assume you already performed some dice rolls and got some string. What is the information you want to keep about it? Just the longest suffix of it which is also a prefix of our pattern. That single number (length of this suffix) contains everything that you want to know about it. If you know sth about regular languages you can think of things like this in kind of a Myhill-Nerode type where you distinguish two strings based on all possible futures. That "what information do I want to know about what I already processed?" is also very common in all types of dynamic programming.

Given that, you can represent current state as a single number and consider a graph of transitions and you are good to go

Not a closed form solution as mentioned in editorial but here is my approach — Starting with standard approach to this type of problems — $$$f(i)$$$ denotes expected turns to reach the goal given that we start with prefix of length $$$i$$$ Recurrence — $$$f(i)$$$ = $$$p(s[i+1]).(f(i+1) + 1)$$$ $$$+$$$ $$$sum_{ch!=s[i+1]}p(ch).(f(trans(i,ch)) + 1)$$$ where $$$p(ch)$$$ is probability to get the digit $$$ch$$$ and $$$trans(i,ch)$$$ is normal KMP failure function. Problem is we cannot use $$$DP$$$ to solve since recurrences are cyclic. We need Gauss but it is slow(Can we use some other kind of substitution trick to solve?). After that I thought if we can find the expected value to get from $$$trans(i,ch)$$$ to $$$i$$$ we can get the acyclic recurrence. Let's denote $$$DP[from][to] = EV$$$ to reach $$$to$$$ from $$$from$$$. Assume $$$from = to - 1$$$. We can write $$$DP[to-1][to] = p(s[to]).1 + sum_{ch!=s[to]}p(ch).(DP[trans(to-1,ch)][to-1] + DP[to-1][to])$$$. Open the recurrence accumulate coefficients of $$$DP[to-1][to]$$$ to left side which is simply equal to $$$p(s[to])$$$. When $$$from < to - 1$$$ we need $$$DP$$$ values for $$$DP[from+1][to]$$$ and $$$DP[loc][from]$$$ where $$$loc < from$$$ so process $$$DP[i][j]$$$ in increasing order of $$$j$$$ and decreasing order of $$$i$$$ you have acyclic recurrence now you can use $$$DP$$$ to solve it. Proof from the editorial of the formula is very good.

That's a great way to think and although still not very simple, it seems much more intuitive than the editorial. Thanks for sharing.

With,

$$$dp[i]$$$ = expected rolls to reach $$$i^{th}$$$ place in $$$goal$$$. ($$$dp[0] = 0$$$, 1 based indexing)

$$$f(i, d)$$$ = fallback position for digit $$$d$$$ at the $$$i^{th}$$$ place

$$$p_d$$$ = probability of getting the digit $$$d$$$

Now you can easily get the following recurrence,

Answer is obviously $$$dp[len(goal)]$$$.

This is a linear $$$dp$$$, with the overhead of $$$f(i, d)$$$.

$$$f(i, d)$$$ can be calculated using the KMP array (can be pre-calculated), or do a linear search with Hashing.

The time complexity will vary with the method used to calculate $$$f(i, d)$$$.

The best time complexity I can think of is $$$O(len(goal) \cdot len(die)) \equiv O(len(goal))$$$, with pre-calculated KMP array.

AC Code (linear search)