String distance

levitate is powered by the stringdist package, which offers a number of string distance measures.

lev_distance()

String distance metrics
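
A minimal sketch of how lev_distance() might be called, assuming the package is installed and that extra arguments such as method are forwarded to stringdist (that forwarding is an assumption drawn from the package being built on stringdist, not something this page states):

```r
library(levitate)

# Distance between two single strings: one substitution separates them.
d_single <- lev_distance("cat", "bat")

# Distance from one string to each of several candidates.
d_many <- lev_distance("Bilbo", c("Frodo", "Merry"))

# Assumption: `method` is passed through to stringdist, where "lv"
# selects the plain Levenshtein distance.
d_lv <- lev_distance("Bilbo", "Frodo", method = "lv")

print(d_single)
print(d_many)
print(d_lv)
```

Smaller values mean the strings are closer, and identical strings have a distance of zero.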

String similarity ratio

Similarity ratios are easier to work with than raw distance measures when ranking strings by similarity, because a higher ratio always means a closer match.

Simple ratios

These measures are useful for comparing individual words or other short strings.

lev_ratio()

String similarity ratio

lev_partial_ratio()

Ratio of the best-matching substring
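
The difference between the two simple ratios can be sketched as follows. The example strings are illustrative only. The expectation is that lev_partial_ratio() scores a short string highly when it appears in full inside a longer one, while lev_ratio() penalises the length difference.

```r
library(levitate)

short <- "Bruce Springsteen"
long  <- "Bruce Springsteen and the E Street Band"

# Whole-string ratio: the extra words in `long` lower the score.
full_ratio <- lev_ratio(short, long)

# Partial ratio: compares `short` with its best-matching substring of `long`.
partial_ratio <- lev_partial_ratio(short, long)

print(full_ratio)
print(partial_ratio)
```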

Token-based ratios

Splitting strings into tokens is useful when comparing the similarity of longer phrases.

lev_token_set_ratio()

Matching based on common tokens

lev_token_sort_ratio()

Ordered token matching
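
A short sketch of where the token-based ratios help, using made-up phrases. The idea is that lev_token_sort_ratio() ignores word order, and lev_token_set_ratio() concentrates on the tokens the two phrases share.

```r
library(levitate)

# Same words, different order: the plain ratio is penalised by the reordering.
plain  <- lev_ratio("Springsteen Bruce", "Bruce Springsteen")

# Tokens are sorted before comparison, so word order no longer matters.
sorted <- lev_token_sort_ratio("Springsteen Bruce", "Bruce Springsteen")

# One phrase is a subset of the other: scoring is driven by the shared tokens.
set_based <- lev_token_set_ratio("the quick brown fox", "brown fox")

print(plain)
print(sorted)
print(set_based)
```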

Weighted token functions

Functions that let you adjust the weight given to individual tokens.

lev_weighted_token_ratio()

Weighted token similarity measure

lev_weighted_token_sort_ratio()

Weighted version of lev_token_sort_ratio()

lev_weighted_token_set_ratio()

Weighted version of lev_token_set_ratio()
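
A hedged sketch of down-weighting an uninformative token. It assumes the weighted functions accept a named list through a weights argument, with token names as keys. The company-style strings and the weight of 0.1 are purely illustrative.

```r
library(levitate)

# Without weights, the shared token "ltd" contributes fully to the score.
unweighted <- lev_weighted_token_ratio("jim ltd", "tim ltd")

# Assumption: `weights` is a named list keyed by token. Here "ltd" is
# down-weighted so that the distinguishing tokens matter more.
weighted <- lev_weighted_token_ratio("jim ltd", "tim ltd",
                                     weights = list(ltd = 0.1))

print(unweighted)
print(weighted)
```

This kind of weighting tends to be useful when many strings share boilerplate tokens (for example company suffixes) that would otherwise inflate their similarity.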

Ranking functions

Functions for comparing multiple candidates to a single string and ranking the results.

lev_score_multiple()

Score multiple candidate strings against a single input

lev_best_match()

Get the best-matching string from a list of candidates
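
The two ranking functions might be used together as in the sketch below. The candidate list is invented for illustration, and the .fn argument for swapping in a different scoring function is an assumption about the interface rather than something stated on this page.

```r
library(levitate)

candidates <- c("apple", "banana", "cherry")

# Score every candidate against the input; results are expected to be
# ordered from best to worst match.
scores <- lev_score_multiple("aple", candidates)

# Return only the top-scoring candidate.
best <- lev_best_match("aple", candidates)

# Assumption: `.fn` selects the scoring function used for ranking.
best_sorted <- lev_best_match("aple", candidates, .fn = lev_token_sort_ratio)

print(scores)
print(best)
print(best_sorted)
```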