| Modifier and Type | Field and Description |
|---|---|
| `int` | `count`: The frequency of n-gram in the corpus. |

| Constructor and Description |
|---|
| `NGram(java.lang.String[] words, int count)`: Constructor. |

| Modifier and Type | Method and Description |
|---|---|
| `int` | `compareTo(NGram o)` |
| `static NGram[][]` | `of(java.util.Collection<java.lang.String[]> sentences, int maxNGramSize, int minFrequency)`: Extracts n-gram phrases by an Apriori-like algorithm. |
| `java.lang.String` | `toString()` |

`public NGram(java.lang.String[] words, int count)`

Parameters:

- `words` - the n-gram word sequence.
- `count` - the frequency of the n-gram in the corpus.

`public int compareTo(NGram o)`

Specified by: `compareTo` in interface `java.lang.Comparable<NGram>`

`public static NGram[][] of(java.util.Collection<java.lang.String[]> sentences, int maxNGramSize, int minFrequency)`

The algorithm takes a collection of sentences and generates all n-grams of length at most `maxNGramSize` that occur at least `minFrequency` times in the sentences.

Parameters:

- `sentences` - a collection of sentences (already split into words).
- `maxNGramSize` - the maximum length of an n-gram.
- `minFrequency` - the minimum frequency of an n-gram in the sentences.
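
To make the idea of an Apriori-like extraction concrete, the following is a minimal, self-contained sketch. It is not this library's implementation: the class `NGramSketch`, the method `extract`, and the return type (one map per n-gram size) are hypothetical names chosen for illustration. It assumes the pruning rule is the usual Apriori one, namely that an n-gram is counted only if both its (n-1)-word prefix and its (n-1)-word suffix were already frequent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of the documented API.
class NGramSketch {
    /**
     * Returns one map per n-gram size: result.get(n - 1) maps each frequent
     * n-gram (words joined by a single space) to its count.
     * Assumes individual words contain no spaces.
     */
    public static List<Map<String, Integer>> extract(Collection<String[]> sentences,
                                                     int maxNGramSize, int minFrequency) {
        List<Map<String, Integer>> levels = new ArrayList<>();
        for (int n = 1; n <= maxNGramSize; n++) {
            // Frequent (n-1)-grams from the previous pass; null for unigrams.
            Map<String, Integer> previous = (n == 1) ? null : levels.get(n - 2);
            Map<String, Integer> counts = new HashMap<>();
            for (String[] sentence : sentences) {
                for (int i = 0; i + n <= sentence.length; i++) {
                    // Apriori pruning: skip a candidate whose prefix or suffix
                    // of length n-1 did not reach minFrequency.
                    if (previous != null
                            && (!previous.containsKey(join(sentence, i, n - 1))
                                || !previous.containsKey(join(sentence, i + 1, n - 1)))) {
                        continue;
                    }
                    counts.merge(join(sentence, i, n), 1, Integer::sum);
                }
            }
            // Keep only n-grams that occur at least minFrequency times.
            counts.values().removeIf(c -> c < minFrequency);
            levels.add(counts);
        }
        return levels;
    }

    // Joins words[start .. start + length - 1] with single spaces.
    private static String join(String[] words, int start, int length) {
        return String.join(" ", Arrays.copyOfRange(words, start, start + length));
    }

    public static void main(String[] args) {
        List<String[]> sentences = Arrays.asList(
                new String[] {"new", "york", "city"},
                new String[] {"new", "york", "times"},
                new String[] {"york", "city"});
        List<Map<String, Integer>> levels = extract(sentences, 3, 2);
        System.out.println(levels.get(1).get("new york")); // prints 2
    }
}
```

In this example "times" occurs only once, so it is dropped at the unigram level and the bigram "york times" is never counted. The documented `of` method returns `NGram[][]` rather than maps; how that array is indexed by n-gram size is not stated here, so consult the library itself before relying on a particular layout.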