drod's blog

By drod, 10 years ago, In English

Given two strings A and B, your task is to find a substring of A called justice string, which has the same length as B, and only has at most two characters different from B.

Both string A and B contain lowercase English letters from a to z only. And the length of these two strings is between 1 and 100000, inclusive.

  • Vote: I like it
  • +5
  • Vote: I do not like it

| Write comment?
»
10 years ago, # |
  Vote: I like it +17 Vote: I do not like it

Well, the first thing which quickly came to mind is to use FFT. (I guess it's possible to pass TL)
But while I was writing this I understood that there's a much easier solution: let's try each substring of A. Let's denote it C. Firstly, we find largest common prefix of B and C (in O(log(|B|)) using binary search and substring hashing). If it's not the whole string B then we delete this prefix and the next symbols after it from each string (in our imagination, of course) and repeat the whole procedure. Each time we find a pair of different letters, so when we find the third one we stop.

  • »
    »
    10 years ago, # ^ |
      Vote: I like it +1 Vote: I do not like it

    Brute force search and largest common prefix, how can I never thought about it. Thank you!

  • »
    »
    10 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    However, I'm not really sure about FFT here. There was a similar problem but with only 4 possible letters instead of 26 and I had real problem with time limit.

    • »
      »
      »
      10 years ago, # ^ |
        Vote: I like it +5 Vote: I do not like it

      Btw, we need not 26+26 FFT. Even not 26+1 FFT. We need only 13+1 FFT. But anyway you will have troubles with TL =) Solution in 3*NlogN with hashes is much faster in work and simpler to implement.

      • »
        »
        »
        »
        10 years ago, # ^ |
        Rev. 2   Vote: I like it 0 Vote: I do not like it

        Yeah, I'm pretty aware of that. Even using 2+1 FFTs in that problem my realisation was very slow.
        Well, there's no doubt that hashes and binary search are easier. On the other hand, FFT is much more interesting and cooler=)

  • »
    »
    10 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Can you give an advice how to solve it with FFT?

    • »
      »
      »
      10 years ago, # ^ |
      Rev. 4   Vote: I like it +3 Vote: I do not like it

      Consider the following (easier) problem: given binary strings A and B for each substring in A of length |B| (call it C) calculate the number of ones in B & C (binary AND of B and C).
      UPD: Oh, yeah, it's much easier to call this number as the scalar product of B and C)
      There's a hint in the rev. 1.

»
10 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Polynomial hashes? We need to check O(n) substrings. You can check number of different letters (we are interested in four cases only: 0, 1, 2 and >2) in unsing binary search. Total time is .

  • »
    »
    10 years ago, # ^ |
    Rev. 2   Vote: I like it +9 Vote: I do not like it

    You also can improve this solution changing hashes to Z-function for string B#A and get linear solution.

    upd: that's wrong solution, see below.

    • »
      »
      »
      10 years ago, # ^ |
        Vote: I like it +9 Vote: I do not like it

      How to detect a difference in two symbols using Z-function?

      • »
        »
        »
        »
        10 years ago, # ^ |
        Rev. 2   Vote: I like it 0 Vote: I do not like it

        Oops, it works for finding the first difference but doesn't work for the second one. My mistake, solution with Z-function is wrong, but hashes still work.

        • »
          »
          »
          »
          »
          10 years ago, # ^ |
          Rev. 3   Vote: I like it 0 Vote: I do not like it

          I remember there was similar task, but you have to find cyclic shift s' of string s such that diff(s', s) <= 2 (or i can be wrong, maybe <= 1). Task was at Izhevsk training camp. (2008-2010?) Author of the problem was AlexSkidanov, and he has O(n) solution.

        • »
          »
          »
          »
          »
          10 years ago, # ^ |
            Vote: I like it +7 Vote: I do not like it

          "it works for finding the first difference but doesn't work for the second one" you can build another z-function for reverted strings. Now it's linear.

          • »
            »
            »
            »
            »
            »
            10 years ago, # ^ |
              Vote: I like it +1 Vote: I do not like it

            Good idea! And it works for 1 difference. But for case of 2 differences how to check substring between 2 letters that are different?

            • »
              »
              »
              »
              »
              »
              »
              10 years ago, # ^ |
              Rev. 3   Vote: I like it +9 Vote: I do not like it

              suffix array + Least Common Prefix in O(NLogN), or polynomial hashes in O(N).

              • »
                »
                »
                »
                »
                »
                »
                »
                10 years ago, # ^ |
                  Vote: I like it 0 Vote: I do not like it

                Well, Z-function + hashes is a bit weird combination, but it gives indeed linear solution.

»
10 years ago, # |
Rev. 2   Vote: I like it +11 Vote: I do not like it

sorry to be off-topic, but this is blog post #12345! :)