kardeepak's blog

By kardeepak, history, 6 years ago, In English

Can anyone help me with suffix arrays implementation?

https://ideone.com/GN888y I did this by reading the codechef tutorial. https://discuss.codechef.com/questions/21385/a-tutorial-on-suffix-arrays

This problem gives the wrong answer for my code. http://www.spoj.com/problems/DISUBSTR/

  • Vote: I like it
  • 0
  • Vote: I do not like it

»
6 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by kardeepak (previous revision, new revision, compare).

»
6 years ago, # |
  Vote: I like it 0 Vote: I do not like it

I just now solved it with the logic that since SA gives us the suffixes in sorted order I just compare every suffix with the one lexicographically previous to it. The solution is n^2 but it works because of the smaller bounds.

https://ideone.com/MEQrkh

I don't understand the idea behind your implementation. Can you elaborate a little.

  • »
    »
    6 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Basically, for each suffix, I store the rank of first 2^i characters in entry.rank starting at idx and entry.next_rank stores first 2^i characters starting idx+2^i. Then I am sorting them and by that, I can assign tanks to 2^(i+1) characters starting at idx. If idx+2^i > n then I am storing next_rank as -1 so it would come before others.

    I also tried counting Sort separately and it got accepted. But I don't understand what is wrong with my approach.

»
6 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Your value of const LL MAXLG = 10; is to low, it should be 11. Alternatively, you could increase the size of HP by one (HP[MAXLG+1][MAXN]).

(In the k-th iteration, suffixes are sorted according to the first 2k symbols. In the end, 2k > n, so you need , i.e 10 iterations. In the 10-th iteration, you write into HP[10], which is out of bounds for MAXLG = 10.)

The easiest way of finding these bugs is to take the wrong and a correct solution and run then on many random test cases of different sizes until you get a difference. If you haven't got a fast correct solution, a naive brute force usually suffices.