Package org.apache.lucene.analysis.ja

Class Summary
JapaneseAnalyzer Analyzer for Japanese that uses the "Sen" morphological analyzer.
JapaneseBasicFormFilter Replaces term text with the BasicFormAttribute.
JapaneseKatakanaStemFilter Converts a katakana word to a normalized form by removing the KATAKANA-HIRAGANA PROLONGED SOUND MARK (U+30FC) at the end of the string.
JapanesePartOfSpeechKeepFilter Removes tokens that do NOT match a set of POS tags.
JapanesePartOfSpeechStopFilter Removes tokens that match a set of POS tags.
JapanesePunctuationFilter Removes punctuation tokens.
JapaneseTokenizer Japanese tokenizer that uses the "Sen" morphological analyzer.
JapaneseWidthFilter A TokenFilter that normalizes CJK width differences: it folds fullwidth ASCII variants into the equivalent Basic Latin characters, and folds halfwidth Katakana variants into the equivalent kana. NOTE: this filter can be viewed as a (practical) subset of NFKC/NFKD Unicode normalization.
StreamTagger2 Breaks text into sentences according to UAX #29: Unicode Text Segmentation (http://www.unicode.org/reports/tr29/)
ToStringUtil  
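
The two normalization steps described in the table above for JapaneseKatakanaStemFilter and JapaneseWidthFilter can be illustrated on plain strings. The following is a minimal sketch, not the filters' actual implementation: the class name NormalizationSketch and both method names are hypothetical, the stemming rule is assumed to drop exactly one trailing U+30FC, and width folding is approximated with full NFKC normalization from the JDK (java.text.Normalizer), which folds more characters than the width filter is described as handling.

```java
import java.text.Normalizer;

// Hypothetical illustration class; it does not use or mirror the Lucene token filter API.
public class NormalizationSketch {

    // KATAKANA-HIRAGANA PROLONGED SOUND MARK
    private static final char PROLONGED_SOUND_MARK = '\u30FC';

    // Assumption: remove a single trailing prolonged sound mark, and leave
    // a term that consists only of the mark untouched.
    static String stripTrailingProlongedMark(String term) {
        int end = term.length();
        if (end > 1 && term.charAt(end - 1) == PROLONGED_SOUND_MARK) {
            return term.substring(0, end - 1);
        }
        return term;
    }

    // Assumption: full NFKC stands in for width folding here. NFKC is a
    // superset of the width changes described for JapaneseWidthFilter.
    static String foldWidth(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        // Only the final mark is removed; the one inside the word is kept.
        System.out.println(stripTrailingProlongedMark("コンピューター")); // コンピュータ
        // Fullwidth ASCII variants become Basic Latin.
        System.out.println(foldWidth("ＡＢＣ１２３")); // ABC123
        // Halfwidth Katakana becomes regular Katakana.
        System.out.println(foldWidth("ｶﾀｶﾅ")); // カタカナ
    }
}
```

In an analysis chain these operations would be applied per token rather than on whole strings; the sketch only shows the character-level effect of each step.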
 



Copyright © 2012. All Rights Reserved.