tom's blog

By tom, history, 3 years ago, In English

You are given two strings — text t and pattern p. You need to find index i such that t[i...i + p.size() - 1] and p have the largest "match value". Match value of two strings of equal length a and b is total number of such j that a[j] = b[j].

Can you do this faster than O(n2) (tricks with bitsets do not count)?

 
 
 
 
  • Vote: I like it
  • +15
  • Vote: I do not like it

»
3 years ago, # |
Rev. 3   Vote: I like it +20 Vote: I do not like it

Is the alphabet small?

One can do it in O(A * N * log N) time. For a fixed letter, we need to compute dot products of the corresponding bit vectors (where 1 stands for this letter and 0 stands for the rest) for all values of i. We can do it efficiently be reversing the bit vector for t and multiplying two bit vectors as polynomials using FFT.

Once we know how to compute the answer for one letter, we just need to find the sum over all letters in the alphabet.

Here's a problem that asks to do something like this: Fuzzy Search.

  • »
    »
    3 years ago, # ^ |
      Vote: I like it +5 Vote: I do not like it

    Thanks, I didn't think of using FFT here.

    Is the alphabet small? — yeah, during thinking about the problem I also assumed that we have small alphabet. I wonder if there is a solution without this assumption.