intptr's blog

By intptr, history, 4 years ago, , Paul Masurel offers a great article about the Levenstein automata : link

He offers an algorithm to compute if the levenstein distance between two strings is less than or equal to D.

The naive algorithm is fairly straightforward :

def levenshtein(s1, s2, D=2):
"""
Returns True iff the two string
s1 and s2 is lesser or equal to D
"""
if D == -1:
return False
if len(s1) < len(s2):
return levenshtein(s2, s1)
if len(s2) == 0:
return len(s1) <= D
return (levenshtein(s1[1:], s2[1:], D-1)   # substitution\
or levenshtein(s1, s2[1:], D-1)       # insertion\
or levenshtein(s1[1:], s2, D-1)       # deletion\
or (
# character match
(s1 == s2) and \
levenshtein(s1[1:], s2[1:], D)
))

This basically tries all possibilities which is with exponential complexity if we don't save any of the calls.

He offers a better algorithm but for the last week I have been trying to understand how it works, here it is :

def levenshtein(s1, s2, D=2): """ Returns True iff the edit distance between the two strings s1 and s2 is lesser or equal to D """ if len(s1) == 0: return len(s2) <= D if len(s2) == 0: return len(s1) <= D # assuming s1 is NOT used to build s2, if D > 0: if levenshtein(s1[1:], s2, D - 1): # deletion return True if levenshtein(s1[1:], s2[1:], D - 1): # substitution return True # assuming s1 is used to build s2 for d in range(min(D+1, len(s2))): # d is the position where s1 # might be used. # it is also the number of character # that are required to be inserted before # using s1[d]. if s1 == s2[d]: if levenshtein(s1[1:], s2[d+1:], D - d): return True return False

It would be great if anyone can shed some light on how this works! Thanks!

By intptr, history, 4 years ago, , Looking at the algorithm for the problem C in contest 316 I couldn't figure out how it works so I decided it could be a good idea to ask for an explanation here.

Reference to the solution is here : http://codeforces.com/contest/570/submission/12523751

Let's assume an input of ".b..bz...."

After precomputation the array named 'ok' looks like this

[0,1,0,1,1,0,0,1,1,1] num = 7; group = 3

Now we start queries :

1st one is [1,h] :

We evaluate a = ok(1) ( True ) and b ( False )

thus (a != b) = ( True )

Then num-- => num = 6

How does decreasing the total length by one make sense here?

In general could someone give a good explanation for this implementation? I understand what must happen to solve the problem but my solutions end up being too long/convoluted.