Food Items Mapping using Fuzzy Matching
Foodmapping is an R package for comparing any two food items in terms of name similarity. Name similarity is computed using fuzzy matching (the "Partial Token Sort Ratio" metric, as implemented in the fuzzywuzzyR package).
This package is an interface to the excellent fuzzywuzzyR R package.
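To give an intuition for the "token sort" part of the metric, here is a minimal base-R sketch of the normalization step only: lowercase the name, replace punctuation with spaces, split into tokens, sort them, and rejoin. The helper name token_sort is hypothetical and is not part of Foodmapping or fuzzywuzzyR; the actual scoring (including the "partial ratio" step) is done by the underlying fuzzywuzzy library, and its preprocessing may differ in details.

```r
# Illustrative only: a simplified token-sort normalization in base R.
# `token_sort` is a hypothetical helper, not a Foodmapping function.
token_sort <- function(x) {
  x <- tolower(x)                      # case-insensitive comparison
  x <- gsub("[^a-z0-9]+", " ", x)      # replace punctuation with spaces
  x <- trimws(x)                       # drop leading/trailing spaces
  vapply(
    strsplit(x, " +"),
    function(tokens) paste(sort(tokens), collapse = " "),  # sort tokens, rejoin
    character(1)
  )
}

# Word order and punctuation no longer matter after normalization:
token_sort("raw. tomatoes")   # "raw tomatoes"
token_sort("tomatoes, raw")   # "raw tomatoes"
```

Because both names normalize to the same string, a similarity ratio computed on the normalized forms is insensitive to word order, which is why this family of metrics suits food names such as "tomato, raw" versus "raw tomato".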
It has the following additional dependencies (see the fuzzywuzzyR installation guide):
- Python (>= 2.4)
- fuzzywuzzyR (>= 1.0.2)
- fuzzywuzzy (>= 0.15.0)
- python-Levenshtein (>= 0.12.0, optional, but can speed up the computation)
To install, run the following commands in R:
install.packages("devtools")
devtools::install_github("armandvalsesia/Foodmapping", build_vignettes = TRUE)
require("Foodmapping")
# comparison between two food names:
v_fz_tk_sort_r( "Tomatoes" , "raw. tomatoes" )
# pairwise comparison between a single item and a list of items
item <- "Tomatoes"
query_list <- c( "raw. tomatoes", "Tomatoe soup with basil", "Carot soup" )
df <- expand.grid( A = item, B = query_list, stringsAsFactors = FALSE ) # create pairwise comparison
df$score <- v_fz_tk_sort_r( df$A , df$B ) # compute fuzzy score
df
# pairwise comparison between elements of two lists of items
query_list <- c( "raw. tomatoes", "Tomatoe soup with basil", "Carot soup", "chicken" )
query_list2 <- c( "tomato, raw", "soup tomatoes and basil", "tiramisu" )
df <- expand.grid( A = query_list, B = query_list2, stringsAsFactors = FALSE ) # create pairwise comparison
df$score <- v_fz_tk_sort_r( df$A , df$B ) # compute fuzzy score
df
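A common next step with the pairwise table above is to keep, for each item in column A, the candidate in column B with the highest score. The following is a minimal base-R sketch of that step; best_match is a hypothetical helper (not provided by Foodmapping) and it assumes a data frame with the columns A, B and score built as in the examples above.

```r
# Illustrative only: `best_match` is a hypothetical helper, not part of Foodmapping.
# Expects a data frame with columns A, B and score (as built with expand.grid above).
best_match <- function(df) {
  # sort by item, then by decreasing score within each item
  ordered <- df[order(df$A, -df$score), ]
  # keep the first (highest-scoring) row for each item in A
  best <- ordered[!duplicated(ordered$A), ]
  rownames(best) <- NULL
  best
}

# Usage, with df computed as in the examples above:
# best_match(df)
```

When several candidates tie for the top score, this sketch keeps whichever comes first in the original row order; a real mapping workflow would probably also apply a minimum score threshold before accepting a match.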
This software is released under the GPL v2 license; see LICENSE. Authors and copyright are listed in DESCRIPTION.