| Package | Description |
|---|---|
| com.wcohen.ss | This package contains a collection of approximate string comparators, plus code for performing controlled experiments with them. |
| com.wcohen.ss.api | |
| com.wcohen.ss.expt | |
| com.wcohen.ss.lookup | |
| com.wcohen.ss.tokens | |
| Modifier and Type | Class and Description |
|---|---|
class |
BasicStringWrapper
An extendible (non-final) class that implements some of the
functionality of a string.
|
class |
MultiStringWrapper
A StringWrapper that stores a version of the string
that has been either (a) split into a number of distinct fields,
(b) duplicated k times, so that k different StringDistances
can preprocess it, or (c) both of the above.
|
protected class |
SourcedTFIDF.UnitVector
Marker class extending BagOfTokens
|
protected class |
TagLink.UnitVector
Marker class extending BagOfTokens
|
protected class |
TFIDF.UnitVector
Marker class extending BagOfTokens
|
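
The two storage modes described for MultiStringWrapper (split into fields, or duplicated k times) can be illustrated with a small stand-alone sketch. `MultiFieldSketch` is a hypothetical class written for this illustration. It does not depend on com.wcohen.ss, it stores plain `String` fields rather than StringWrapper objects, and the delimiter-based constructor is an assumption about how splitting might be configured.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical stand-in for a multi-field wrapper; not the library's class.
final class MultiFieldSketch {
    private final String[] fields;

    // Mode (a): split the string into distinct fields on a literal delimiter.
    MultiFieldSketch(String s, String delimiter) {
        // Pattern.quote treats the delimiter as plain text, not as a regex;
        // the -1 limit keeps trailing empty fields.
        this.fields = s.split(Pattern.quote(delimiter), -1);
    }

    // Mode (b): duplicate the string k times, one copy per distance function.
    MultiFieldSketch(String s, int k) {
        this.fields = new String[k];
        Arrays.fill(this.fields, s);
    }

    int size() {
        return fields.length;
    }

    // Return the i-th field.
    String get(int i) {
        return fields[i];
    }

    // Set the i-th field, e.g. to a preprocessed version of it.
    void set(int i, String w) {
        fields[i] = w;
    }
}
```

For example, `new MultiFieldSketch("Cohen|William|CMU", "|").get(1)` returns `"William"`, while `new MultiFieldSketch("abc", 2)` holds two copies of `"abc"`.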
| Modifier and Type | Field and Description |
|---|---|
protected StringWrapper |
CombinedStringDistanceLearner.MyDistanceInstance.a |
protected StringWrapper |
CombinedStringDistanceLearner.MyDistanceInstance.b |
protected StringWrapper |
MemoMatrix.s |
protected StringWrapper |
MemoMatrix.t |
| Modifier and Type | Method and Description |
|---|---|
StringWrapper |
MultiStringWrapper.get(int i)
Return the i-th field.
|
StringWrapper |
CombinedStringDistanceLearner.MyDistanceInstance.getA() |
StringWrapper |
CombinedStringDistanceLearner.MyMultiDistanceInstance.getA(int j) |
StringWrapper |
CombinedStringDistanceLearner.MyDistanceInstance.getB() |
StringWrapper |
CombinedStringDistanceLearner.MyMultiDistanceInstance.getB(int j) |
StringWrapper |
CombinedStringDistanceLearner.JthStringWrapperValueIterator.nextStringWrapper() |
StringWrapper |
BasicStringWrapperIterator.nextStringWrapper() |
StringWrapper |
BasicSourcedStringWrapperIterator.nextStringWrapper() |
StringWrapper |
WinklerRescorer.prepare(String s) |
StringWrapper |
TokenFelligiSunter.prepare(String s)
Preprocess a string by finding tokens and giving them appropriate weights
|
StringWrapper |
TFIDF.prepare(String s)
Preprocess a string by finding tokens and giving them TFIDF weights
|
StringWrapper |
TagLink.prepare(String s)
Preprocess a string by finding tokens and giving them TFIDF weights
|
StringWrapper |
SourcedTFIDF.prepare(String s)
Preprocess a string by finding tokens and giving them TFIDF weights
|
StringWrapper |
SoftTokenFelligiSunter.prepare(String s)
Preprocess a string by finding tokens
|
StringWrapper |
MultiStringDistance.prepare(String s)
Prepare a string.
|
StringWrapper |
JensenShannonDistance.prepare(String s)
Preprocess a string by finding tokens and giving them weights W
such that W is the smoothed probability of the token appearing
in the document.
|
StringWrapper |
Jaro.prepare(String s) |
StringWrapper |
Jaccard.prepare(String s)
Preprocess a string by finding tokens.
|
StringWrapper |
CombinedStringDistanceLearner.CombinedStringDistance.prepare(String s) |
StringWrapper |
AbstractStringDistance.prepare(String s)
Default way to preprocess a string for distance computation.
|
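
The `prepare(String s)` methods listed above all follow one idea: do the expensive preprocessing (such as tokenizing) once, then reuse the result across many comparisons. The sketch below shows that idea for a Jaccard-style token overlap. `PreparedString` and `JaccardSketch` are hypothetical names that do not depend on com.wcohen.ss. Lowercasing and whitespace tokenization are assumptions made for this example, not a description of the library's tokenizer.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a prepared string: raw text plus a cached token set.
final class PreparedString {
    final String raw;
    final Set<String> tokens = new HashSet<>();

    PreparedString(String raw) {
        this.raw = raw;
        // Assumed tokenization: lowercase, then split on runs of whitespace.
        for (String tok : raw.toLowerCase().trim().split("\\s+")) {
            if (!tok.isEmpty()) {
                tokens.add(tok);
            }
        }
    }
}

// Hypothetical Jaccard-style similarity over the cached token sets.
final class JaccardSketch {
    // Preprocess a string by finding its tokens.
    static PreparedString prepare(String s) {
        return new PreparedString(s);
    }

    // |intersection| / |union| of the two token sets; 0.0 if both are empty.
    static double score(PreparedString s, PreparedString t) {
        Set<String> common = new HashSet<>(s.tokens);
        common.retainAll(t.tokens);
        int union = s.tokens.size() + t.tokens.size() - common.size();
        return union == 0 ? 0.0 : (double) common.size() / union;
    }
}
```

With these assumptions, `"William W Cohen"` and `"william cohen"` share two of three distinct tokens, so the score is 2/3. Preparing each string once matters most when one string is compared against many candidates.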
| Modifier and Type | Method and Description |
|---|---|
protected com.wcohen.ss.BagOfTokens |
AbstractTokenizedStringDistance.asBagOfTokens(StringWrapper w) |
protected MultiStringWrapper |
MultiStringDistance.asMultiStringWrapper(StringWrapper w)
Lazily prepare a string.
|
protected MultiStringWrapper |
CombinedStringDistanceLearner.asMultiStringWrapper(StringWrapper w) |
protected TFIDF.UnitVector |
TFIDF.asUnitVector(StringWrapper w) |
protected TagLink.UnitVector |
TagLink.asUnitVector(StringWrapper w) |
protected void |
AbstractStatisticalTokenDistance.checkTrainingHasHappened(StringWrapper s,
StringWrapper t) |
protected void |
AbstractSourcedStatisticalTokenDistance.checkTrainingHasHappened(StringWrapper s,
StringWrapper t) |
String |
WinklerRescorer.explainScore(StringWrapper s,
StringWrapper t) |
String |
TokenFelligiSunter.explainScore(StringWrapper s,
StringWrapper t)
Explain how the distance was computed.
|
String |
TFIDF.explainScore(StringWrapper s,
StringWrapper t)
Explain how the distance was computed.
|
String |
TagLink.explainScore(StringWrapper s,
StringWrapper t)
explainStringMetric gives a brief explanation of how the stringMetric was
computed.
|
String |
SourcedTFIDF.explainScore(StringWrapper s,
StringWrapper t)
Explain how the distance was computed.
|
String |
SourcedSoftTFIDF.explainScore(StringWrapper s,
StringWrapper t)
Explain how the distance was computed.
|
String |
SoftTokenFelligiSunter.explainScore(StringWrapper s,
StringWrapper t)
Explain how the distance was computed.
|
String |
SoftTFIDF.explainScore(StringWrapper s,
StringWrapper t)
Explain how the distance was computed.
|
String |
SmithWaterman.explainScore(StringWrapper s,
StringWrapper t) |
String |
NeedlemanWunsch.explainScore(StringWrapper s,
StringWrapper t) |
String |
MultiStringDistance.explainScore(StringWrapper s,
StringWrapper t) |
String |
MongeElkan.explainScore(StringWrapper s,
StringWrapper t)
Version where the distance is possibly scaled to [0,1].
|
String |
Mixture.explainScore(StringWrapper s,
StringWrapper t)
Explain how the distance was computed.
|
String |
Level2.explainScore(StringWrapper s,
StringWrapper t)
Explain how the distance was computed.
|
String |
JensenShannonDistance.explainScore(StringWrapper s,
StringWrapper t) |
String |
Jaro.explainScore(StringWrapper s,
StringWrapper t) |
String |
Jaccard.explainScore(StringWrapper s,
StringWrapper t)
Explain how the distance was computed.
|
String |
CombinedStringDistanceLearner.CombinedStringDistance.explainScore(StringWrapper s,
StringWrapper t) |
String |
ApproxNeedlemanWunsch.explainScore(StringWrapper s,
StringWrapper t) |
String |
AffineGap.explainScore(StringWrapper s,
StringWrapper t) |
abstract String |
AbstractStringDistance.explainScore(StringWrapper s,
StringWrapper t)
This method needs to be implemented by subclasses.
|
String |
AbbreviationAlignment.explainScore(StringWrapper s,
StringWrapper t) |
double |
WinklerRescorer.score(StringWrapper s,
StringWrapper t) |
double |
TokenFelligiSunter.score(StringWrapper s,
StringWrapper t) |
double |
TFIDF.score(StringWrapper s,
StringWrapper t) |
double |
TagLink.score(StringWrapper s,
StringWrapper t)
getStringMetric computes the similarity between a pair of strings T and U.
|
double |
SourcedTFIDF.score(StringWrapper s0,
StringWrapper t0) |
double |
SourcedSoftTFIDF.score(StringWrapper s0,
StringWrapper t0) |
double |
SoftTokenFelligiSunter.score(StringWrapper s,
StringWrapper t) |
double |
SoftTFIDF.score(StringWrapper s,
StringWrapper t) |
double |
SmithWaterman.score(StringWrapper s,
StringWrapper t) |
double |
ScaledLevenstein.score(StringWrapper s,
StringWrapper t) |
double |
NeedlemanWunsch.score(StringWrapper s,
StringWrapper t) |
double |
MultiStringDistance.score(StringWrapper s,
StringWrapper t) |
double |
MongeElkan.score(StringWrapper s,
StringWrapper t)
Version of distance which is possibly scaled to [0,1].
|
double |
Mixture.score(StringWrapper s,
StringWrapper t)
Distance is argmax_lambda prod_{w in s} lambda Pr(w|t) * (1-lambda) Pr(w|background).
|
double |
Level2.score(StringWrapper s,
StringWrapper t) |
double |
JensenShannonDistance.score(StringWrapper s,
StringWrapper t)
Jensen-Shannon distance between distributions.
|
double |
Jaro.score(StringWrapper s,
StringWrapper t) |
double |
Jaccard.score(StringWrapper s,
StringWrapper t) |
double |
CombinedStringDistanceLearner.CombinedStringDistance.score(StringWrapper s,
StringWrapper t) |
double |
ApproxNeedlemanWunsch.score(StringWrapper s,
StringWrapper t) |
double |
AffineGap.score(StringWrapper s,
StringWrapper t) |
abstract double |
AbstractStringDistance.score(StringWrapper s,
StringWrapper t)
This method needs to be implemented by subclasses.
|
double |
AbbreviationAlignment.score(StringWrapper s,
StringWrapper t) |
void |
MultiStringWrapper.set(int i,
StringWrapper w)
Set the i-th field.
|
| Constructor and Description |
|---|
InsertSMatrix(StringWrapper s,
StringWrapper t) |
InsertTMatrix(StringWrapper s,
StringWrapper t) |
MatrixTrio(StringWrapper s,
StringWrapper t) |
MyDistanceInstance(StringWrapper a,
StringWrapper b,
boolean correct,
double distance) |
MyMultiDistanceInstance(StringWrapper a,
StringWrapper b,
boolean correct,
double distance) |
| Modifier and Type | Interface and Description |
|---|---|
interface |
IdentifiedStringWrapper |
interface |
SourcedStringWrapper |
| Modifier and Type | Method and Description |
|---|---|
StringWrapper |
DistanceInstance.getA() |
StringWrapper |
DistanceInstance.getB() |
StringWrapper |
StringWrapperIterator.nextStringWrapper() |
StringWrapper |
StringDistance.prepare(String s)
Preprocess a string for distance computation
|
| Modifier and Type | Method and Description |
|---|---|
String |
StringDistance.explainScore(StringWrapper s,
StringWrapper t)
Explain how the distance was computed.
|
double |
StringDistance.score(StringWrapper s,
StringWrapper t)
Find the distance between s and t.
|
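
The three StringDistance methods listed above (`prepare`, `score`, `explainScore`) form a small contract. The following stand-alone sketch mirrors that shape with hypothetical types (`WrappedString`, `DistanceSketch`, `EditSimilaritySketch`) that do not depend on com.wcohen.ss. The metric used here, one minus the Levenshtein distance divided by the longer length, was chosen for illustration and is not claimed to match any class in the library.

```java
// Hypothetical stand-in for a wrapped string.
interface WrappedString {
    String unwrap();
}

// Hypothetical interface mirroring the three-method shape listed above.
interface DistanceSketch {
    WrappedString prepare(String s);
    double score(WrappedString s, WrappedString t);
    String explainScore(WrappedString s, WrappedString t);
}

final class EditSimilaritySketch implements DistanceSketch {
    @Override
    public WrappedString prepare(String s) {
        return () -> s; // no real preprocessing is needed for this metric
    }

    // Classic two-row dynamic program for Levenshtein distance.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] cur = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int substitute = prev[j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1);
                cur[j] = Math.min(substitute, Math.min(prev[j] + 1, cur[j - 1] + 1));
            }
            int[] tmp = prev;
            prev = cur;
            cur = tmp;
        }
        return prev[b.length()];
    }

    // Similarity in [0,1]: 1.0 for identical strings, lower as edits increase.
    @Override
    public double score(WrappedString s, WrappedString t) {
        int longest = Math.max(s.unwrap().length(), t.unwrap().length());
        if (longest == 0) {
            return 1.0;
        }
        return 1.0 - (double) editDistance(s.unwrap(), t.unwrap()) / longest;
    }

    // Report the intermediate quantities behind the score.
    @Override
    public String explainScore(WrappedString s, WrappedString t) {
        int edits = editDistance(s.unwrap(), t.unwrap());
        int longest = Math.max(s.unwrap().length(), t.unwrap().length());
        return "edits=" + edits + ", longest=" + longest + ", score=" + score(s, t);
    }
}
```

A caller would typically write `d.score(d.prepare(x), d.prepare(y))` and fall back to `explainScore` when a result looks surprising.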
| Modifier and Type | Class and Description |
|---|---|
static class |
MatchData.Instance
A single item (aka record, string, etc) to match against others.
|
static class |
SourcedMatchData.Instance
A single item (aka record, string, etc) to match against
others.
|
| Modifier and Type | Method and Description |
|---|---|
StringWrapper |
Blocker.Pair.getA() |
StringWrapper |
Blocker.Pair.getB() |
StringWrapper |
SourcedMatchData.MatchIterator.nextStringWrapper()
Return the next StringWrapper.
|
StringWrapper |
MatchData.MatchIterator.nextStringWrapper()
Return the next StringWrapper.
|
| Modifier and Type | Method and Description |
|---|---|
StringWrapper |
SoftDictionary.prepare(String s)
Prepare a string for quicker lookup.
|
| Modifier and Type | Method and Description |
|---|---|
Object |
SoftDictionary.lookup(String id,
StringWrapper toFind)
Lookup a prepared string in the dictionary.
|
Object |
SoftDictionary.lookup(StringWrapper toFind)
Lookup a prepared string in the dictionary.
|
double |
SoftDictionary.lookupDistance(String id,
StringWrapper toFind)
Return the distance to the best match.
|
double |
SoftDictionary.lookupDistance(StringWrapper toFind)
Return the distance to the best match.
|
void |
SoftDictionary.put(String id,
StringWrapper toInsert,
Object value)
Insert a prepared string into the dictionary.
|
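
The `put` / `lookup` / `lookupDistance` trio above describes a dictionary that returns the value stored under the closest key rather than an exact match. A minimal sketch of that idea follows. `SoftDictionarySketch` is a hypothetical class that does not depend on com.wcohen.ss. It omits the `id` parameter, uses plain strings instead of StringWrapper, and does a linear scan with a pluggable similarity function, all of which are simplifying assumptions for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleBiFunction;

// Hypothetical approximate-match dictionary; not the library's implementation.
final class SoftDictionarySketch {
    private static final class Entry {
        final String key;
        final Object value;

        Entry(String key, Object value) {
            this.key = key;
            this.value = value;
        }
    }

    private final List<Entry> entries = new ArrayList<>();
    private final ToDoubleBiFunction<String, String> similarity;

    SoftDictionarySketch(ToDoubleBiFunction<String, String> similarity) {
        this.similarity = similarity;
    }

    // Illustrative similarity: shared prefix length over the longer length.
    static double commonPrefixSimilarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        if (longest == 0) {
            return 1.0;
        }
        int n = Math.min(a.length(), b.length());
        int i = 0;
        while (i < n && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        return (double) i / longest;
    }

    // Insert a key and its associated value.
    void put(String key, Object value) {
        entries.add(new Entry(key, value));
    }

    // Linear scan for the entry whose key is most similar to toFind.
    private Entry best(String toFind) {
        Entry best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Entry e : entries) {
            double sc = similarity.applyAsDouble(e.key, toFind);
            if (sc > bestScore) {
                bestScore = sc;
                best = e;
            }
        }
        return best;
    }

    // Value of the best match, or null if the dictionary is empty.
    Object lookup(String toFind) {
        Entry e = best(toFind);
        return e == null ? null : e.value;
    }

    // Similarity to the best match, or negative infinity if empty.
    double lookupScore(String toFind) {
        Entry e = best(toFind);
        return e == null ? Double.NEGATIVE_INFINITY : similarity.applyAsDouble(e.key, toFind);
    }
}
```

A linear scan is the simplest choice and is only reasonable for small dictionaries. Any speedup from preparing strings in advance or from indexing candidates is outside what this sketch shows.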
| Modifier and Type | Method and Description |
|---|---|
String |
TagLinkToken.explainScore(StringWrapper s,
StringWrapper t)
explainScore returns an explanation of how the string distance was
computed.
|
double |
TagLinkToken.score(StringWrapper s,
StringWrapper t)
score returns a string distance value between 0 and 1 for a pair
of tokens.
|
Copyright © 2016. All rights reserved.