| Interface | Description |
|---|---|
| AnchorText | The anchor text is the visible, clickable text in a hyperlink. |
| Corpus | A corpus is a collection of documents. |
| TextTerms | The terms in a text. |


| Class | Description |
|---|---|
| Bigram | Bigrams, or digrams, are groups of two words and are very commonly used as the basis for simple statistical analysis of text. |
| NGram | An n-gram is a contiguous sequence of n words from a given sequence of text. |
| SimpleCorpus | An in-memory text corpus. |
| SimpleText | A list-of-words representation of documents. |
| Text | A minimal interface of text in the corpus. |
| Trie<K,V> | A trie, also called a digital tree or prefix tree, is an ordered tree data structure that is used to store a dynamic set or associative array where the keys are usually strings. |
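
The n-gram and trie descriptions above can be made concrete with a small standalone sketch. It does not use this package's classes: `NGramTrieSketch`, `ngrams`, and `WordTrie` are hypothetical names introduced only for illustration, and the actual `NGram` and `Trie<K,V>` APIs may differ. The sketch extracts contiguous word sequences of length n and counts bigrams in a minimal prefix tree keyed by word sequences.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; these are not the package's own classes. */
public class NGramTrieSketch {

    /** Returns every contiguous run of n words, in order of appearance. */
    public static List<List<String>> ngrams(List<String> words, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<List<String>> result = new ArrayList<>();
        for (int i = 0; i + n <= words.size(); i++) {
            // Copy the window so the result does not depend on the input list.
            result.add(new ArrayList<>(words.subList(i, i + n)));
        }
        return result;
    }

    /** A minimal prefix tree whose keys are sequences of words (hypothetical). */
    public static class WordTrie<V> {
        private final Map<String, WordTrie<V>> children = new HashMap<>();
        private V value; // null when no key ends at this node

        /** Walks (and creates) one node per word, then stores the value. */
        public void put(List<String> key, V v) {
            WordTrie<V> node = this;
            for (String word : key) {
                node = node.children.computeIfAbsent(word, w -> new WordTrie<>());
            }
            node.value = v;
        }

        /** Returns the stored value, or null if the key is absent. */
        public V get(List<String> key) {
            WordTrie<V> node = this;
            for (String word : key) {
                node = node.children.get(word);
                if (node == null) {
                    return null;
                }
            }
            return node.value;
        }
    }

    public static void main(String[] args) {
        List<String> words = List.of("to", "be", "or", "not", "to", "be");

        // Count each bigram by using the bigram itself as the trie key.
        WordTrie<Integer> counts = new WordTrie<>();
        for (List<String> bigram : ngrams(words, 2)) {
            Integer seen = counts.get(bigram);
            counts.put(bigram, seen == null ? 1 : seen + 1);
        }

        System.out.println(ngrams(words, 2));
        System.out.println(counts.get(List.of("to", "be")));
    }
}
```

In this sketch a bigram is simply an n-gram with n = 2, and keys that share a leading word (such as "to be" and "to sleep") would share the first node of the tree. The sample sentence contains the bigram "to be" twice, so its stored count is 2.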