fofao_funk's blog

By fofao_funk, history, 7 years ago, In English

Hi all.

Can someone give me some tips on how to solve the following problem?

Given a non-empty string S consisting of lowercase letters with length at most 105, calculate the number of distinct sums for every substring of S. The sum of a string is defined as the sum of the values of all the characters in it. The values of the characters are in the range [1, 26], starting from a; i.e., the value of a is 1, the value of b is 2 and so on.

For example, the distinct sums for the string S = acd are 1, 3, 4, 7, 8; in this case, the answer should be 5.

This problem is from brazilian first subregional, which occurred yesterday, and the given time limit was 7 seconds.

Thanks for the help!

  • Vote: I like it
  • +20
  • Vote: I do not like it

»
7 years ago, # |
  Vote: I like it +8 Vote: I do not like it

Can be done with FFT( Fast Fourier Transform )

»
7 years ago, # |
Rev. 2   Vote: I like it +5 Vote: I do not like it

I think one could divide the string in half, solve recursively for the 2 parts and merge the answers.

For the merge part, we would need to have calculated the possible value of prefixes, possible value of sufixes and infixes for every part we have divided, and then update that value in the merge process.

The calculation of values would have to be done using FFT as it seems just like polynomial convolution (you can check a "similar" problem here).

Complexity would be about (or actually as the computation cost is lower when it is not in the root of the recursion tree)

»
7 years ago, # |
Rev. 3   Vote: I like it +13 Vote: I do not like it

Think in the following way:
First, lets calculate the sum of every prefix in the string.
Now, we can reduce the problem to finding the number of different values in the form Sum[i] — Sum[j], where sum[i] is the prefix sum of position i.
It's very close to standard FFT problem, but it's a subtraction, to solve this we can just add an constant making all values bigger than zero, in the case 26 * (max size of the string)
Now, just do FFT, and count how much indexes have value different from zero. You have to see, that with this method the empty string will be counted one time and any other will be counted twice.
The complexity will be O(C * log(C)) where C is the maximum value
My team didn't solved by a little bug and strict TL.

  • »
    »
    7 years ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    N^2/26 didn't pass?

    • »
      »
      »
      7 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      I also thought it should have passed, was nothing big, just optimizations. When I have time, I will recalculate everything, the constant factor shiuld be small.

      • »
        »
        »
        »
        7 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        Could you explain that solution?

»
7 years ago, # |
Rev. 3   Vote: I like it +10 Vote: I do not like it

Let sum[x] be the sum of the values up until position x — 1. A value is V is counted if there's r > l where sum[r] — sum[l] == V.

Create 2 polynomials:

A[i] == 1 iff there's sum[x] == i, else 0

B[i] == 1 iff there's sum[n] — sum[x] == i, else 0

C = A * B, on position C[k] it has A[i] * B[j] for every i + j == k. Rewriting i and j we have:

k == sum[x1] + sum[n] — sum[x2] == constant + sum[x1] — sum[x2]

So every position C[i] with value V greater than 0 means that you can find the the value V — sum[n] using the difference of 2 prefix sums. Count how many values > 1 starting from C[sum[n] + 1] and that's the answer. The TLE was kinda strict, so the FFT implementation needed to be optimized to pass (we needed to optimize cache).

»
7 years ago, # |
  Vote: I like it +3 Vote: I do not like it

Thanks everyone for the answers, I think I got the idea.