This post will be a tutorial on 313B - Ilya and Queries, as well as some general python I/O optimizations that provided me with enough of a speedup to have my solution finish in the allotted time.
Solution
As an example, consider the input string #..###
. From it, we create arrays
S
=[ # . . # # # ]
A
=[ 0 1 0 1 1 X ]
=[ a_1 a_2 a_3 a_4 a_5 a_6 ]
where
S
is an array encoding the input string.A[i]
= 1 ifS[i]
==S[i+1]
and 0 otherwise (A[-1]
is undefined).
Let f(l, r)
be a function where f(l, r)
= w_{l}
+ w_{l+1}
+ ... + w{r-1}
. Clearly f
corresponds to the quantity of interest in 313B - Ilya and Queries (convince yourself if you don't see it!).
A naive solution is to compute f(l, r)
iteratively (i.e. by looping from l
to r-1
). However, the runtime of this algorithm is O(nm)
= O(n^2)
as 0 <= m <= n
. Hence, in the worst case we will do (10^5)^2 = 10^10 operations as 0 <= n <= 10^5
, which it too large a number for the allotted time. Hence we need to do better.
One idea is to pre-compute f(l, r)
for 1
<= l
< r
<= n
. However, it turns out this table would have 10^10 elements in it, so it won't do.
It turns out that instead of pre-computing every possible f(l, r)
pair, we can pre-compute a smaller number of values which we can use very quickly to compute any f(l, r)
we want. Consider the following function F(r)
, where F(r)
= w_1
+ w_2
+ ... + w_{r-1}. Notice that
f(l, r)=
F(l)=
F(r-1). To see, this consider
f(2, 5)`:
`f(2, 5)` = `w_2` + `w_3` + `w_4`
= `w_1` + `w_2` + `w_3` + `w_4`
- `w_1`
= F(5)
- F(2)
Hence all that remains is to compute F(r)
for 0 <= r
<= n
. Note that this cuts the runtime down by a factor of m
down to O(n*1)
= O(n)
as F(l)
— F(r)
are constant-time lookups. It turns out this is fast enough!
I/O Optimizations
Even with this optimization, my python solution was still exceeding the time limit. After profiling my code, I deduced that the way I was handling I/O was the bottleneck. I replaced
for _ in range(m):
l, r = [int(num) for num in raw_input().split()]
print A[r] - A[l]
with
lines = sys.stdin.readlines()
inputs = [[int(num)-1 for num in line.split()] for line in lines]
outputs = [str(SA[r]-SA[l]) for l, r in inputs]
print '\n'.join(outputs)
Eliminating the repeated calls to raw_input()
and print
gave me a substantial speedup which was fast enough to have my solution finish within the allotted time!