orchidmajumder's blog

By orchidmajumder, 8 years ago, In English

Hi,

I'm trying to solve this problem from SPOJ.

SPOJ-TREEBA

My idea is to run Aho-corasick algorithm to find all intervals [a,b] in the text which contains one word from the dictionary and then to run a greedy interval scheduling to find the maximum number of non overlapping intervals.If this count is <=k ,I'm outputting it as answer.else I am outputting ( k+1) as k noises can accommodate k+1 words at max,So,if we have number of non overlapping intervals > k+1 we can just consider some of those words as noises and can find k+1 words.I think the last greedy decision part is causing me WA.

I'll be very happy if somebody can point out my mistake.This is the first time I'm using Aho-Corasick algorithm,so not much experience as such.

 
 
 
 
  • Vote: I like it
  • 0
  • Vote: I do not like it

»
8 years ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

I believe, correct me if i am wrong, that since the maximum length of a word is 20, you can apply the rolling hashing algorithhm and generate all hashes of substrings of length( 1, 2, ... , maxL ) on the original string. Since the aho-corasick algorithm is quite tricky to implement, i think you can try this approach.

  • »
    »
    8 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    I think my aho corasick implementation is not wrong.But the later part has some wrong logic.Also,I don't know rolling hashing well.Will go through it.thanks! :)

    • »
      »
      »
      8 years ago, # ^ |
      Rev. 2   Vote: I like it +1 Vote: I do not like it

      The problem is probably solvable with some sort of dp algorithm. I don't know whether your logic is correct or not, but if you can find a dp algorithm with an appropriate complexity, i think you should go for it.

      Anyway, I wish you good luck and a happy new year :)

»
8 years ago, # |
  Vote: I like it 0 Vote: I do not like it

You can do this by using DP. Consider the state as (n, k, run) , where n is the position in the string where you are currently, k is the remaining number of noises , run is true/false iff the previous character was considered as noise.For the transitions , you can either start a word at this character OR consider this character as noise. For the first transition , a trie or a hash table will work. Complexity:- O(N * K * 20 * 2)