Commit

fix: switch to JaroWinklerSimilarity from JaroWinklerDistance
Distance was mistakenly used before, but due to a bug in commons-text the two returned the same value.
This was fixed in 1.10.0: https://issues.apache.org/jira/browse/TEXT-191
dim5 authored and kamaladafrica committed Nov 14, 2022
1 parent 7333608 commit ce63739
Showing 1 changed file with 2 additions and 2 deletions.
@@ -13,7 +13,7 @@
import java.util.stream.Stream;

import org.apache.commons.lang3.tuple.Pair;
-import org.apache.commons.text.similarity.JaroWinklerDistance;
+import org.apache.commons.text.similarity.JaroWinklerSimilarity;
import org.apache.commons.text.similarity.SimilarityScoreFrom;

import com.google.common.collect.ImmutableList;
@@ -63,7 +63,7 @@ public City findByName(String name) {

result = cityByName.get(term);
if (minimumMatchScore != EXACT_MATCH_SCORE && result == null) {
-final SimilarityScoreFrom<Double> score = new SimilarityScoreFrom<>(new JaroWinklerDistance(), term);
+final SimilarityScoreFrom<Double> score = new SimilarityScoreFrom<>(new JaroWinklerSimilarity(), term);
result = cityByName.entrySet().stream().map(e -> Pair.of(e.getValue(), score.apply(e.getKey())))
.filter(e -> e.getValue() >= minimumMatchScore).max(Comparator.comparing(Entry::getValue))
.map(Entry::getKey).orElse(null);
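
The fallback in findByName keeps candidates whose score is at least minimumMatchScore and returns the one with the highest score, so it only makes sense for a scorer where higher means closer. The sketch below illustrates why the scorer convention matters. It is a minimal, self-contained illustration and not the project's code: the class FuzzyLookupSketch, the helper bestMatch, and the prefix-based scorer are hypothetical stand-ins (the prefix scorer is not Jaro-Winkler), and the distance is assumed to be the complement of the similarity, which is how commons-text is understood to behave once TEXT-191 is fixed.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleBiFunction;

public class FuzzyLookupSketch {

    // Hypothetical stand-in similarity in [0, 1]: length of the shared prefix
    // divided by the longer string's length. NOT Jaro-Winkler; only the
    // "higher means closer" convention matters for this illustration.
    public static double prefixSimilarity(String a, String b) {
        int longer = Math.max(a.length(), b.length());
        if (longer == 0) {
            return 1.0;
        }
        int shared = 0;
        int shorter = Math.min(a.length(), b.length());
        while (shared < shorter && a.charAt(shared) == b.charAt(shared)) {
            shared++;
        }
        return (double) shared / longer;
    }

    // Distance convention (assumed): complement of the similarity,
    // so lower means closer.
    public static double prefixDistance(String a, String b) {
        return 1.0 - prefixSimilarity(a, b);
    }

    // Same shape as the pipeline in the diff: keep candidates scoring at least
    // minimumMatchScore, then return the highest-scoring one (or null).
    // The score is computed twice per candidate to keep the sketch short.
    public static String bestMatch(List<String> names, String term,
            ToDoubleBiFunction<String, String> score, double minimumMatchScore) {
        return names.stream()
                .filter(n -> score.applyAsDouble(term, n) >= minimumMatchScore)
                .max(Comparator.comparingDouble((String n) -> score.applyAsDouble(term, n)))
                .orElse(null);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("roma", "romano", "rieti", "milano");

        // Similarity scorer: the closest name wins.
        System.out.println(bestMatch(names, "rom", FuzzyLookupSketch::prefixSimilarity, 0.7)); // prints "roma"

        // Distance scorer with the same pipeline: the least similar name wins.
        System.out.println(bestMatch(names, "rom", FuzzyLookupSketch::prefixDistance, 0.7)); // prints "milano"
    }
}
```

With a distance scorer the same filter and max select the candidate furthest from the search term, which is the behavior this commit avoids by switching to JaroWinklerSimilarity before upgrading to a commons-text release that includes the TEXT-191 fix.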