zed_b's blog

By zed_b, history, 4 years ago, In English

Hello, does anybody know if I can modify KMP for a pattern string with '?' symbols, that match any single character, so that I can find whether a pattern P that may contain '?' is in a text T (without '?') in O(|T|) time? Cheers

  • Vote: I like it
  • +29
  • Vote: I do not like it

»
4 years ago, # |
  Vote: I like it -33 Vote: I do not like it

Why you can't create one if construction, like:

 while(j != 0 && (s[j] != '?' && s[i] != s[j])
  • »
    »
    4 years ago, # ^ |
    Rev. 4   Vote: I like it +125 Vote: I do not like it

    Because algorithms are not just a random code. It is much more about proofs. In case of KMP we use transitivity of the equality, when iterating over the borders.

    If j is a border of S[1..i] and h is a border of S[1..j], then h is a border of S[1..i]

    And with the wildcards equality is not transitive: $$$a= ?$$$, $$$?= b$$$, $$$a\neq b$$$. So with the naive algorithm after appending b to string a?ab we will get $$$p(5)=2$$$ ($$$p(p(4))=1$$$, $$$b=?$$$)


    There are efficient algorithms based on FFT (works in O(|T| log)) for the pattern matching with wildcards.

»
4 years ago, # |
Rev. 2   Vote: I like it +5 Vote: I do not like it

I don't know something very related to KMP that works, but this problem can be solved with bitsets in $$$O(|P|*|T|/32)$$$ (bruteforce complexity but $$$1/32$$$ constant) with bitsets, or in $$$O((|P|+|T|)*log(|P|+|T|))$$$ using FFT. To see how, check the editorial of http://codeforces.com/problemset/problem/754/E and http://codeforces.com/blog/entry/49613?#comment-335977

»
4 years ago, # |
Rev. 3   Vote: I like it 0 Vote: I do not like it

Do you have a link for this problem ?

A year ago I implemented that (in linear time) for buscando-palabras (the constraints allow quadratic solutions but I wanted to challenge myself with a linear solution).