Strsim is an Elixir wrapper for the Rust strsim crate, built with Rustler.
It is a NIF-based bridge to the crate, which implements the following string similarity algorithms:
- Levenshtein
- Damerau-Levenshtein
- Jaro
- Jaro-Winkler
- Hamming
- Optimal String Alignment
- Sørensen–Dice
The crate offers several functions for both strings and generic sequences, and this library exposes all of them except for the generic Damerau-Levenshtein for now.
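To make the distinction between the string and generic-sequence variants concrete, here is a minimal Rust sketch of Hamming distance. It is an illustration written for this README, not the crate's source: the `DifferentLengthArgs` type and both function bodies are assumptions, chosen only to mirror the `{:error, :different_length_args}` result shown in the Elixir examples.

```rust
/// Hypothetical error type for this sketch: the inputs had different lengths.
#[derive(Debug, PartialEq)]
struct DifferentLengthArgs;

/// Counts the positions at which two equal-length sequences differ.
/// Works on anything iterable whose items can be compared.
fn generic_hamming<A, B>(a: A, b: B) -> Result<usize, DifferentLengthArgs>
where
    A: IntoIterator,
    B: IntoIterator,
    A::Item: PartialEq<B::Item>,
{
    let mut a = a.into_iter();
    let mut b = b.into_iter();
    let mut distance = 0;
    loop {
        match (a.next(), b.next()) {
            // Both sequences still have items: compare them.
            (Some(x), Some(y)) => {
                if x != y {
                    distance += 1;
                }
            }
            // Both ended together: lengths matched.
            (None, None) => return Ok(distance),
            // One ended before the other: lengths differ.
            _ => return Err(DifferentLengthArgs),
        }
    }
}

/// String variant: a thin wrapper that compares chars rather than bytes.
fn hamming(a: &str, b: &str) -> Result<usize, DifferentLengthArgs> {
    generic_hamming(a.chars(), b.chars())
}

fn main() {
    println!("{:?}", hamming("hamming", "hammers"));
    println!("{:?}", generic_hamming(vec![1, 2], vec![1, 3]));
    println!("{:?}", hamming("hamming", "ham"));
}
```

In this sketch the string function is just the generic one applied to a sequence of characters, which is why the library can offer both a `hamming` and a `generic_hamming`.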
Every function exposed by this library has an Elixir equivalent that returns an {:ok, result} tuple on success or an {:error, reason} tuple on invalid input:
iex(1)> Strsim.damerau_levenshtein("ab", "bca")
{:ok, 2}
iex(2)> Strsim.generic_hamming([1, 2], [1, 3])
{:ok, 1}
iex(3)> Strsim.generic_jaro([1, 2], [1, 3, 4])
{:ok, 0.611111111111111}
iex(4)> Strsim.generic_jaro_winkler([1, 2], [1, 3, 4])
{:ok, 0.6499999999999999}
iex(5)> Strsim.generic_levenshtein([1, 2, 3], [1, 2, 3, 4, 5, 6])
{:ok, 3}
iex(6)> Strsim.hamming("hamming", "hammers")
{:ok, 3}
iex(7)> Strsim.hamming("hamming", "ham")
{:error, :different_length_args}
iex(8)> Strsim.jaro("Friedrich Nietzsche", "Jean-Paul Sartre")
{:ok, 0.39188596491228067}
iex(9)> Strsim.jaro_winkler("cheeseburger", "cheese fries")
{:ok, 0.9111111111111111}
iex(10)> Strsim.levenshtein("kitten", "sitting")
{:ok, 3}
iex(11)> Strsim.normalized_damerau_levenshtein("levenshtein", "löwenbräu")
{:ok, 0.2727272727272727}
iex(12)> Strsim.normalized_levenshtein("kitten", "sitting")
{:ok, 0.5714285714285714}
iex(13)> Strsim.osa_distance("ab", "bca")
{:ok, 3}
iex(14)> Strsim.sorensen_dice("ferris", "feris")
{:ok, 0.8888888888888888}
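The normalized results in the session above are consistent with rescaling the edit distance by the length of the longer input: for "kitten" and "sitting" the distance is 3 and the longer string has 7 characters, so 1 - 3/7 ≈ 0.5714. The Rust sketch below shows that relationship under this assumption. It is a textbook two-row Levenshtein written for illustration, not the crate's implementation, and returning 1.0 for two empty strings is likewise an assumption.

```rust
/// Two-row dynamic-programming Levenshtein distance over chars.
/// Illustrative only; not taken from the strsim crate.
fn levenshtein(a: &str, b: &str) -> usize {
    let b_chars: Vec<char> = b.chars().collect();
    // prev[j] holds the distance between the processed prefix of `a`
    // and the first j characters of `b`.
    let mut prev: Vec<usize> = (0..=b_chars.len()).collect();
    for (i, ca) in a.chars().enumerate() {
        let mut curr = Vec::with_capacity(b_chars.len() + 1);
        curr.push(i + 1); // deleting i + 1 chars of `a` to reach an empty prefix
        for (j, cb) in b_chars.iter().enumerate() {
            let cost = if ca == *cb { 0 } else { 1 };
            let substitution = prev[j] + cost;
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr.push(substitution.min(deletion).min(insertion));
        }
        prev = curr;
    }
    prev[b_chars.len()]
}

/// Similarity in [0.0, 1.0]: 1.0 means identical, 0.0 means nothing in common.
fn normalized_levenshtein(a: &str, b: &str) -> f64 {
    let max_len = a.chars().count().max(b.chars().count());
    if max_len == 0 {
        // Assumption for this sketch: two empty strings count as identical.
        return 1.0;
    }
    1.0 - levenshtein(a, b) as f64 / max_len as f64
}

fn main() {
    println!("{}", levenshtein("kitten", "sitting"));
    println!("{}", normalized_levenshtein("kitten", "sitting"));
}
```

Note that the plain functions return a distance (lower means more similar), while the normalized ones return a similarity (higher means more similar).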
Everybody loves benchmarks. There is one benchmark covering all of the implemented strsim functions, plus separate benchmarks for jaro, jaro_winkler, levenshtein and hamming that compare the Rust implementation with various Elixir implementations.
To run the benchmarks:
# run Elixir vs Rust Jaro benchmarks
$ MIX_ENV=bench mix bench.jaro
# run Elixir vs Rust Jaro-Winkler benchmarks
$ MIX_ENV=bench mix bench.jaro_winkler
# run Elixir vs Rust Levenshtein benchmarks
$ MIX_ENV=bench mix bench.levenshtein
# run Elixir vs Rust Hamming benchmarks
$ MIX_ENV=bench mix bench.hamming
# run a benchmark with all of the Rust functions
$ MIX_ENV=bench mix bench.strsim
# run 'em all
$ MIX_ENV=bench mix bench.all
The package can be installed by adding strsim to your list of dependencies in mix.exs:
def deps do
  [
    {:strsim, "~> 0.1.1"}
  ]
end
Documentation can be generated with ExDoc and published on HexDocs. Once published, the docs can be found at https://hexdocs.pm/strsim.