### intptr's blog

By intptr, history, 5 years ago,

Paul Masurel offers a great article about the Levenstein automata : link

He offers an algorithm to compute if the levenstein distance between two strings is less than or equal to D.

The naive algorithm is fairly straightforward :

def levenshtein(s1, s2, D=2):
"""
Returns True iff the two string
s1 and s2 is lesser or equal to D
"""
if D == -1:
return False
if len(s1) < len(s2):
return levenshtein(s2, s1)
if len(s2) == 0:
return len(s1) <= D
return (levenshtein(s1[1:], s2[1:], D-1)   # substitution\
or levenshtein(s1, s2[1:], D-1)       # insertion\
or levenshtein(s1[1:], s2, D-1)       # deletion\
or (
# character match
(s1[0] == s2[0]) and \
levenshtein(s1[1:], s2[1:], D)
))


This basically tries all possibilities which is with exponential complexity if we don't save any of the calls.

He offers a better algorithm but for the last week I have been trying to understand how it works, here it is :

def levenshtein(s1, s2, D=2):
"""
Returns True iff the edit distance between
the two strings s1 and s2 is lesser or
equal to D
"""
if len(s1) == 0:
return len(s2) <= D
if len(s2) == 0:
return len(s1) <= D
# assuming s1[0] is NOT used to build s2,
if D > 0:
if levenshtein(s1[1:], s2, D - 1):
# deletion
return True
if levenshtein(s1[1:], s2[1:], D - 1):
# substitution
return True
# assuming s1[0] is used to build s2
for d in range(min(D+1, len(s2))):
# d is the position where s1[0]
# might be used.
# it is also the number of character
# that are required to be inserted before
# using s1[d].
if s1[0] == s2[d]:
if levenshtein(s1[1:], s2[d+1:], D - d):
return True
return False


It would be great if anyone can shed some light on how this works! Thanks!

• +5

 » 5 years ago, # |   0 Auto comment: topic has been updated by intptr (previous revision, new revision, compare).
 » 5 months ago, # |   0 import java.util.Scanner; public class MED { public static void main(String[] args) { Scanner sc = new Scanner(System.in); int t = sc.nextInt(); sc.nextLine(); for (int i = 0; i < t; i++) { String s1 =sc.nextLine(); String s2 = sc.nextLine(); System.out.println(minimumEditDistance(s1,s2)); } } private static int minimumEditDistance(String s1,String s2) { int[][] dp = new int[s1.length()+1][s2.length()+1]; for (int i = 0; i < s1.length(); i++) { dp[i][0] = i; } for (int i = 0; i < s2.length(); i++) { dp[0][i] = i; } for (int i = 1; i <= s1.length(); i++) { for (int j = 1; j <= s2.length(); j++) { if(s1.charAt(i-1) == s2.charAt(j-1)) dp[i][j] = dp[i-1][j-1]; else dp[i][j] = min(dp[i-1][j-1],dp[i][j-1],dp[i-1][j]) + 1; } } return dp[s1.length()][s2.length()]; } private static int min(int a,int b,int c) { int min = Math.min(a, b); return Math.min(min, c); }}