public class PolishWordTokenizer extends WordTokenizer
| Constructor and Description |
|---|
| PolishWordTokenizer() |
| Modifier and Type | Method and Description |
|---|---|
| void | setTagger(Tagger tagger) Set the tagger to use in tokenizing. |
| List<String> | tokenize(String text) Tokenizes text. |
Methods inherited from class WordTokenizer: getProtocols, getTokenizingCharacters, isCurrencyExpression, isEMail, isUrl, joinEMails, joinEMailsAndUrls, joinUrls, replaceEmojis, restoreEmojis, splitCurrencyExpression

public List<String> tokenize(String text)
Specified by: tokenize in interface Tokenizer
Overrides: tokenize in class WordTokenizer
Parameters: text - String of words to tokenize.

public void setTagger(Tagger tagger)
Parameters: tagger - The tagger to use (compatible only with the Polish BaseTagger that uses the delivered PoliMorfologik 2.1 or later).
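
The calling pattern implied by the two methods above (optionally supply a tagger, then tokenize) can be sketched with stand-in types. This is a minimal, self-contained illustration and not the library's implementation: `SketchTagger`, `SketchWordTokenizer`, the delimiter set, and the hyphen-splitting rule are all hypothetical names and assumptions chosen so that the example runs without the library on the classpath. Only the method names `setTagger` and `tokenize` and the `List<String>` return type come from the documentation above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerSketch {

    /** Hypothetical stand-in for a tagger: reports whether a word is in its dictionary. */
    interface SketchTagger {
        boolean isKnown(String word);
    }

    /** Hypothetical stand-in that mirrors only the two documented method signatures. */
    static class SketchWordTokenizer {
        private SketchTagger tagger; // stays null until setTagger is called

        public void setTagger(SketchTagger tagger) {
            this.tagger = tagger;
        }

        public List<String> tokenize(String text) {
            List<String> tokens = new ArrayList<>();
            // returnDelims = true keeps whitespace and punctuation as separate tokens.
            // The delimiter set is an assumption made for this sketch.
            StringTokenizer st = new StringTokenizer(text, " .,;!?", true);
            while (st.hasMoreTokens()) {
                String token = st.nextToken();
                boolean keepWhole = !token.contains("-")
                        || (tagger != null && tagger.isKnown(token));
                if (keepWhole) {
                    tokens.add(token);
                } else {
                    // Illustrative rule only: split hyphenated words the tagger does not know.
                    StringTokenizer parts = new StringTokenizer(token, "-", true);
                    while (parts.hasMoreTokens()) {
                        tokens.add(parts.nextToken());
                    }
                }
            }
            return tokens;
        }
    }

    public static void main(String[] args) {
        SketchWordTokenizer tokenizer = new SketchWordTokenizer();
        // Without a tagger, the hyphenated word is split at the hyphen.
        System.out.println(tokenizer.tokenize("polsko-niemiecki traktat."));
        // With a tagger that knows the compound, it is kept as one token.
        tokenizer.setTagger("polsko-niemiecki"::equals);
        System.out.println(tokenizer.tokenize("polsko-niemiecki traktat."));
    }
}
```

With the real class, the same sequence would presumably be a `PolishWordTokenizer()` constructor call, an optional `setTagger(...)` with a compatible Polish tagger, and then `tokenize(text)`; how the tagger actually influences the result is not described in this page, so the rule in the sketch should not be read as the library's behavior.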