| Class Summary | |
|---|---|
| JapaneseAnalyzer | Analyzer for Japanese that uses the "Sen" morphological analyzer. |
| JapaneseBasicFormFilter | Replaces term text with the BasicFormAttribute. |
| JapaneseKatakanaStemFilter | Converts a katakana word to a normalized form by removing a KATAKANA-HIRAGANA PROLONGED SOUND MARK (U+30FC) at the end of the string. |
| JapanesePartOfSpeechKeepFilter | Removes tokens that do NOT match a set of POS tags. |
| JapanesePartOfSpeechStopFilter | Removes tokens that match a set of POS tags. |
| JapanesePunctuationFilter | Removes punctuation tokens. |
| JapaneseTokenizer | A Japanese tokenizer that uses the "Sen" morphological analyzer. |
| JapaneseWidthFilter | A TokenFilter that normalizes CJK width differences: folds fullwidth ASCII variants into the equivalent Basic Latin; folds halfwidth Katakana variants into the equivalent kana. NOTE: this filter can be viewed as a (practical) subset of NFKC/NFKD Unicode normalization. |
| StreamTagger2 | Breaks text into sentences according to UAX #29: Unicode Text Segmentation (http://www.unicode.org/reports/tr29/) |
| ToStringUtil | |
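
The two normalizations described for JapaneseWidthFilter and JapaneseKatakanaStemFilter can be illustrated with plain JDK calls. This is a minimal sketch and not the package's implementation: the class and method names (`JapaneseNormalizationSketch`, `foldWidth`, `stemKatakana`) are hypothetical, full NFKC is used as a stand-in that does more than the width folding listed above, and the rule of stemming only all-katakana terms longer than one character is an assumption.

```java
import java.text.Normalizer;

// Hypothetical helper class for illustration only; not part of this package.
final class JapaneseNormalizationSketch {

    // KATAKANA-HIRAGANA PROLONGED SOUND MARK
    private static final char PROLONGED_SOUND_MARK = '\u30FC';

    private JapaneseNormalizationSketch() {}

    // Folds fullwidth ASCII to Basic Latin and halfwidth Katakana to
    // regular kana. Assumption: full NFKC is applied here, which is a
    // superset of the width folding the filter is described as doing.
    static String foldWidth(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFKC);
    }

    // Removes one trailing prolonged sound mark from a katakana term.
    // Assumption: terms that are not entirely in the Katakana block, or
    // that consist of the mark alone, are returned unchanged.
    static String stemKatakana(String term) {
        int last = term.length() - 1;
        if (last < 1 || term.charAt(last) != PROLONGED_SOUND_MARK) {
            return term;
        }
        for (int i = 0; i < last; i++) {
            Character.UnicodeBlock block = Character.UnicodeBlock.of(term.charAt(i));
            if (block != Character.UnicodeBlock.KATAKANA) {
                return term;
            }
        }
        return term.substring(0, last);
    }
}
```

Under these assumptions, `foldWidth` maps fullwidth Latin letters and digits to their ASCII forms and halfwidth kana to regular katakana, while `stemKatakana` shortens a term only when its last character is U+30FC. Terms that differ only by a trailing long-vowel mark then index to the same form.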