The Levenshtein distance is a metric for measuring the difference between two sequences. It is one way to calculate what is known as the “edit distance.” (see levenshtein.net for a little more on how it works)

The Levenshtein distance between two strings will give you the smallest amount of character changes (insert, replace, delete) needed to transform string A into string B. Perhaps you’ve seen a game in an online forum, or a puzzle which asks you to change one letter of a word to make it a different, yet still valid, word. Those two words would have a Levenshtein distance of one.

One use for this type of function when it comes to programming is word suggestion in a spell checker. Words with a Levenshtein distance of one or two are most often the expected word mistyped.

$a = "first";
$b = "flirt";
echo levenshtein($a, $b); // 2 (insert "l", delete "s")
echo levenshtein($b, $a); // 2 (delete "l", insert "s")

As you can see, the Levenshtein distance is the same either way. You can, however, apply “costs” to each of the three types of changes. For example, you may want a replace to be twice as expensive as an insert or a delete. Do this by using three additional parameters: insert, replace, delete.

/* we'll use these costs for this example:
insert   => 1
replace => 2
delete  => 5
*/
$a = "watts";
$b = "wits";
echo levenshtein($a, $b, 1, 2, 5); // 7 (replace "a" for 2, delete "t" for 5)
echo levenshtein($b, $a, 1, 2, 5); // 3 (replace "i" for 2, insert "t" for 1)

Note: This function has a limitation of accepting strings with a max of 255 characters. In such a case, it will return a value of -1 and throw a warning.

There are plenty of examples in the manual for this function. Be sure to go there and have a look. One thing you won’t find, however, is a twist on an old classic.

$a = "Robert Purcell";
$b = "Kevin Bacon";
echo levenshtein($a, $b); // 12 (not close enough. oh,well. i'd rather be closer to...)
 
$a = "Robert Purcell";
$b = "John Cusac";
echo levenshtein($a, $b); // 11 (a little better!)