skavurskaa's blog

By skavurskaa, history, 8 years ago, In English

Short problem statement: Given T < 200 binary strings S (|S| ≤ 105), find the size of the shortest pattern that doesn't match S for all input strings.

Examples :

S = 011101001, answer = 3 (doesnt match '000')

S = 11111, answer = 1 (doesnt match '0')

My current solution is building a suffix automaton for S and searching all patterns of size i (i=1, i=2, ...) while the number of matches of this size equals 2i. When i find some k such that matches(k) < 2k, this is the answer. This is O(|S|) for building suffix automata plus for matchings, which i think is always small but still relevant.

This solution gets TLE. Can anyone help me with a faster solution for this problem? Thanks in advance.

EDIT 2: Solved the problem using the strategy below. I will leave the blog here any way because this problem looks interesting so i want to share the solution in case any one is interested :)

Run BFS in suffix automata starting from root node until we find some node that has less than 2 links. Let p be the length of the path from root to this node. This node represents some pattern of length p+1 that doesn't appear in the string S. So the answer is p+1.

  • Vote: I like it
  • -1
  • Vote: I do not like it

»
8 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by skavurskaa (previous revision, new revision, compare).

»
8 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by skavurskaa (previous revision, new revision, compare).

»
8 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by skavurskaa (previous revision, new revision, compare).

»
8 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Can you share the link of the problem?, please. I think that I can solve the problem more easily(without suffix automaton).

»
8 years ago, # |
Rev. 3   Vote: I like it 0 Vote: I do not like it

If you analyze the problem more thoroughly, you may observe the interesting fact that the length of the solution callot exceed log(L)⌉ (why?). This way you can solve the problem much more easily (albeit in a "worse" time bound of O(L * log(L))).

Edit: Also, keep in mind that if you want to keep track of strings of small length, you can as well use a simpler "Rabin Karp" approach (hashing).

  • »
    »
    8 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    I had this feeling that the answer could never be too big! That's why my first idea was testing all small patterns using suffix automaton pattern matching utility. Unfortunately it got TLE. Maybe a simpler approach like yours, which doesn't build the SA, can pass in the time limits.

    Thanks for the contribution

  • »
    »
    4 years ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    can we do it by binary-search + Bruteforce(to generate all binary string of particular length) + Rabin-Karp(to find a match)? but it has high time complexity which will not pass in given constraints. can you suggest any better way?

»
8 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by skavurskaa (previous revision, new revision, compare).