| Package | Description |
|---|---|
| com.wcohen.ss | This package contains a collection of approximate string comparators, plus code for performing controlled experiments with them. |


| Modifier and Type | Class and Description |
|---|---|
| class | AbbreviationAlignment: Abbreviation distance metric which evaluates the probability of a short-form string being an abbreviation/acronym of another long-form string. |
| class | AbstractStatisticalTokenDistance: Abstract token distance metric that uses frequency statistics. |
| class | DirichletJS: Jensen-Shannon distance of two unigram language models, smoothed using a Dirichlet prior. |
| class | JaroTFIDF: Soft TFIDF-based distance metric, extended to use "soft" token-matching with the Jaro distance metric. |
| class | JaroWinklerTFIDF: Soft TFIDF-based distance metric, extended to use "soft" token-matching with the JaroWinkler distance metric. |
| class | JelinekMercerJS: Jensen-Shannon distance of two unigram language models, smoothed using a Jelinek-Mercer mixture model. |
| class | JensenShannonDistance: Distance metrics based on the Jensen-Shannon distance of two smoothed unigram language models. |
| class | Level2: Generic version of Monge & Elkan's "level 2" recursive field matching. |
| class | Level2Jaro: "Level 2" recursive field matching algorithm, based on Jaro distance. |
| class | Level2JaroWinkler: "Level 2" recursive field matching algorithm, based on Jaro-Winkler distance. |
| class | Level2Levenstein: "Level 2" recursive field matching algorithm using Levenshtein distance. |
| class | Level2MongeElkan: Monge & Elkan's "level 2" recursive field matching algorithm. |
| class | Mixture: Mixture-based distance metric. |
| class | MongeElkanTFIDF: Soft TFIDF-based distance metric, extended to use "soft" token-matching with the MongeElkan distance metric. |
| class | SoftTFIDF: TFIDF-based distance metric, extended to use "soft" token-matching. |
| class | SoftTokenFelligiSunter: Highly simplified model of Fellegi-Sunter's method 1, applied to tokens. |
| class | TagLink |
| class | TFIDF: TFIDF-based distance metric. |
| class | TokenFelligiSunter: Highly simplified model of Fellegi-Sunter's method 1, applied to tokens. |
| class | UnsmoothedJS: Jensen-Shannon distance of two unsmoothed unigram language models. |
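
Several of the classes listed here are variants of "level 2" recursive field matching, which scores a pair of strings by matching each token of the first string to its best-scoring token in the second and averaging those best scores. The following is a minimal, self-contained sketch of that idea and is not the com.wcohen.ss implementation: the class `Level2Sketch` and its helpers are hypothetical names, whitespace tokenization is an assumption, and the inner token similarity (edit distance scaled into [0, 1]) is chosen only for illustration, where the library classes plug in measures such as Jaro or Jaro-Winkler.

```java
import java.util.function.ToDoubleBiFunction;

// Illustrative sketch only; names and normalization are assumptions,
// not the API of com.wcohen.ss.
final class Level2Sketch {

    /** Unit-cost edit distance computed with two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Assumed inner measure: 1 minus edit distance over the longer length. */
    static double tokenSimilarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        return longest == 0 ? 1.0 : 1.0 - (double) editDistance(a, b) / longest;
    }

    /** Lower-cases and splits on whitespace (an assumed tokenizer). */
    static String[] tokenize(String s) {
        String trimmed = s.trim().toLowerCase();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    /**
     * Level-2 score: for each token of s take the best inner similarity
     * against any token of t, then average over the tokens of s.
     * Returns 0.0 when either string has no tokens.
     */
    static double level2(String s, String t,
                         ToDoubleBiFunction<String, String> inner) {
        String[] sTokens = tokenize(s);
        String[] tTokens = tokenize(t);
        if (sTokens.length == 0 || tTokens.length == 0) {
            return 0.0;
        }
        double total = 0.0;
        for (String a : sTokens) {
            double best = 0.0;
            for (String b : tTokens) {
                best = Math.max(best, inner.applyAsDouble(a, b));
            }
            total += best;
        }
        return total / sTokens.length;
    }

    public static void main(String[] args) {
        // Token order does not matter, but argument order does.
        System.out.println(level2("Wiliam Cohen", "William W. Cohen",
                                  Level2Sketch::tokenSimilarity));
        System.out.println(level2("William W. Cohen", "Wiliam Cohen",
                                  Level2Sketch::tokenSimilarity));
    }
}
```

Because the average runs over the tokens of the first argument only, the score is asymmetric: an extra token in the first string lowers the score, while an extra token in the second does not. Passing the inner measure as a function mirrors how the `Level2*` variants differ only in the token-level distance they use.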
Copyright © 2016. All rights reserved.