levenshtein

Compute Levenshtein distances between strings using the Wagner-Fischer algorithm.

There are 4 functions, EditDistance() and BufferedEditDistance (and their compact counterparts). If you're going to be computing a lot of distances in a loop, you should use the buffered version to avoid all the allocation/deallocation costs. Buffered functions do not allocate. Here's the performance difference on my machine. Each run is a 1000 pair comparison:


BenchmarkUnbuffered-4                   	    3000	    552195 ns/op	  628480 B/op	    8120 allocs/op
BenchmarkBuffered-4                     	   10000	    161841 ns/op	       0 B/op	       0 allocs/op
BenchmarkUnbufferedCompact-4            	    5000	    237236 ns/op	  128000 B/op	    2000 allocs/op
BenchmarkBufferedCompact-4              	   10000	    172415 ns/op	       0 B/op	       0 allocs/op
BenchmarkBasePairsUnBuffered-4          	     200	   9482169 ns/op	12736000 B/op	   38000 allocs/op
BenchmarkBasePairsBuffered-4            	     200	   7109534 ns/op	      63 B/op	       0 allocs/op
BenchmarkBasePairsUnbufferedCompact-4   	     200	   6832891 ns/op	  640000 B/op	    2000 allocs/op
BenchmarkBasePairsBufferedCompact-4     	     300	   5796142 ns/op	       0 B/op	       0 allocs/op

The BasePair benchmarks use 36x36 character string comparisons (DNA/RNA base pairs). They highlight the benefit of the compact functions over building the whole distance matrix when inputs are large.

The compact versions just use the current and previous rows, so they run in O(m) memory rather than O(mn)

You set edit weights by passing in a Weights{} struct as a configuration parameter to the functions. Benchmarks were run with all weights set to 1.

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
README.md		README.md
levenshtein.go		levenshtein.go
levenshtein_test.go		levenshtein_test.go
pairs_test.go		pairs_test.go

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

levenshtein

About

Releases

Packages

Languages

andreiamatuni/levenshtein

Folders and files

Latest commit

History

Repository files navigation

levenshtein

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages