Given two strings A and B, your task is to find a substring of A called justice string, which has the same length as B, and only has at most two characters different from B.
Both string A and B contain lowercase English letters from a to z only. And the length of these two strings is between 1 and 100000, inclusive.
Well, the first thing which quickly came to mind is to use FFT. (I guess it's possible to pass TL)
But while I was writing this I understood that there's a much easier solution: let's try each substring of A. Let's denote it C. Firstly, we find largest common prefix of B and C (in O(log(|B|)) using binary search and substring hashing). If it's not the whole string B then we delete this prefix and the next symbols after it from each string (in our imagination, of course) and repeat the whole procedure. Each time we find a pair of different letters, so when we find the third one we stop.
Brute force search and largest common prefix, how can I never thought about it. Thank you!
However, I'm not really sure about FFT here. There was a similar problem but with only 4 possible letters instead of 26 and I had real problem with time limit.
Btw, we need not 26+26 FFT. Even not 26+1 FFT. We need only 13+1 FFT. But anyway you will have troubles with TL =) Solution in 3*NlogN with hashes is much faster in work and simpler to implement.
Yeah, I'm pretty aware of that. Even using 2+1 FFTs in that problem my realisation was very slow.
Well, there's no doubt that hashes and binary search are easier. On the other hand, FFT is much more interesting and cooler=)
Can you give an advice how to solve it with FFT?
Consider the following (easier) problem: given binary strings A and B for each substring in A of length |B| (call it C) calculate the number of ones in B & C (binary AND of B and C).
UPD: Oh, yeah, it's much easier to call this number as the scalar product of B and C)
There's a hint in the rev. 1.
Thanks!
Polynomial hashes? We need to check O(n) substrings. You can check number of different letters (we are interested in four cases only: 0, 1, 2 and >2) in
unsing binary search. Total time is
.
You also can improve this solution changing hashes to Z-function for string B#A and get linear solution.
upd: that's wrong solution, see below.
How to detect a difference in two symbols using Z-function?
Oops, it works for finding the first difference but doesn't work for the second one. My mistake, solution with Z-function is wrong, but hashes still work.
I remember there was similar task, but you have to find cyclic shift s' of string s such that diff(s', s) <= 2 (or i can be wrong, maybe <= 1). Task was at Izhevsk training camp. (2008-2010?) Author of the problem was AlexSkidanov, and he has O(n) solution.
"it works for finding the first difference but doesn't work for the second one" you can build another z-function for reverted strings. Now it's linear.
Good idea! And it works for 1 difference. But for case of 2 differences how to check substring between 2 letters that are different?
suffix array + Least Common Prefix in O(NLogN), or polynomial hashes in O(N).
Well, Z-function + hashes is a bit weird combination, but it gives indeed linear solution.
sorry to be off-topic, but this is blog post #12345! :)