org.apache.lucene.analysis.ja
Class JapaneseKatakanaStemFilter
java.lang.Object
org.apache.lucene.util.AttributeSource
org.apache.lucene.analysis.TokenStream
org.apache.lucene.analysis.TokenFilter
org.apache.lucene.analysis.ja.JapaneseKatakanaStemFilter
- All Implemented Interfaces:
- Closeable
public final class JapaneseKatakanaStemFilter
- extends org.apache.lucene.analysis.TokenFilter
Converts a katakana word to a normalized form by removing a KATAKANA-HIRAGANA
PROLONGED SOUND MARK (U+30FC) that appears at the end of the word. Most
Japanese full-text search engines use more sophisticated, dictionary-based
methods. These probably give better quality than this filter, but they need
a well-tuned dictionary. In contrast, this filter is simple and
maintenance-free.
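The rule described above can be sketched in plain Java. This is a minimal illustration, not the filter's actual source: the class name KatakanaStemSketch, the helper isKatakana, and the guard that leaves a one-character term untouched are all assumptions made for the example.

```java
// Hypothetical sketch of the stemming rule; not taken from the filter's source.
final class KatakanaStemSketch {
    /** KATAKANA-HIRAGANA PROLONGED SOUND MARK (U+30FC). */
    static final char PROLONGED_SOUND_MARK = '\u30FC';

    /** True if the term is non-empty and every char is in the Katakana block (U+30A0..U+30FF). */
    static boolean isKatakana(String term) {
        for (int i = 0; i < term.length(); i++) {
            if (Character.UnicodeBlock.of(term.charAt(i)) != Character.UnicodeBlock.KATAKANA) {
                return false;
            }
        }
        return !term.isEmpty();
    }

    /** Drops one trailing prolonged sound mark from an all-katakana term. */
    static String stem(String term) {
        int length = term.length();
        // Assumption: a one-character term is left alone so it never becomes empty.
        if (length > 1
                && term.charAt(length - 1) == PROLONGED_SOUND_MARK
                && isKatakana(term)) {
            return term.substring(0, length - 1);
        }
        return term;
    }

    public static void main(String[] args) {
        String computer = "\u30B3\u30F3\u30D4\u30E5\u30FC\u30BF\u30FC"; // コンピューター
        String stemmed = "\u30B3\u30F3\u30D4\u30E5\u30FC\u30BF";        // コンピュータ
        System.out.println(stem(computer).equals(stemmed));             // prints true
    }
}
```

Only the final mark is removed, so the prolonged sound mark in the middle of the word is preserved.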
Note: This filter does not support hankaku (half-width) katakana characters,
so you must convert them to full-width before using this filter. It also
supports only pre-composed characters.
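One possible way to do this conversion is Unicode NFKC normalization with the JDK's java.text.Normalizer, which maps half-width katakana to full-width and composes base characters with voiced sound marks. The sketch below is an assumption about how a caller might preprocess text, not something this filter does; the class name HankakuNormalizeSketch is hypothetical. Note that NFKC also rewrites other compatibility characters (for example full-width Latin letters), which may or may not be wanted.

```java
import java.text.Normalizer;

// Hypothetical preprocessing helper; not part of the filter.
final class HankakuNormalizeSketch {
    /** Applies NFKC: half-width katakana becomes full-width, combining marks are composed. */
    static String toFullWidthComposed(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        // Half-width ｺﾝﾋﾟｭｰﾀｰ, written with escapes to avoid source-encoding issues.
        String hankaku = "\uFF7A\uFF9D\uFF8B\uFF9F\uFF6D\uFF70\uFF80\uFF70";
        // Full-width, pre-composed コンピューター.
        String zenkaku = "\u30B3\u30F3\u30D4\u30E5\u30FC\u30BF\u30FC";
        System.out.println(toFullWidthComposed(hankaku).equals(zenkaku)); // prints true
    }
}
```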
To prevent terms from being stemmed, use an instance of
KeywordMarkerFilter or a custom TokenFilter that sets
the KeywordAttribute before this TokenStream.
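The effect of keyword protection can be illustrated without Lucene classes. In the sketch below, a plain Set stands in for the terms that a KeywordMarkerFilter would mark, and a simplified stem method stands in for this filter; KeywordProtectSketch, stem, and stemAll are hypothetical names, and the real filter chain operates on token attributes rather than strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of keyword protection; does not use the Lucene API.
final class KeywordProtectSketch {
    /** Simplified stand-in for the stemmer: drops one trailing U+30FC. */
    static String stem(String term) {
        int length = term.length();
        if (length > 1 && term.charAt(length - 1) == '\u30FC') {
            return term.substring(0, length - 1);
        }
        return term;
    }

    /** Stems every term except those listed as protected keywords. */
    static List<String> stemAll(List<String> terms, Set<String> protectedTerms) {
        List<String> result = new ArrayList<>();
        for (String term : terms) {
            // A protected term passes through unchanged, as a keyword-marked token would.
            result.add(protectedTerms.contains(term) ? term : stem(term));
        }
        return result;
    }

    public static void main(String[] args) {
        String server = "\u30B5\u30FC\u30D0\u30FC"; // サーバー
        String taxi = "\u30BF\u30AF\u30B7\u30FC";   // タクシー
        Set<String> keywords = new HashSet<>(Arrays.asList(taxi));
        List<String> out = stemAll(Arrays.asList(server, taxi), keywords);
        System.out.println(out.get(0).equals("\u30B5\u30FC\u30D0")); // prints true
        System.out.println(out.get(1).equals(taxi));                 // prints true
    }
}
```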
| Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource |
org.apache.lucene.util.AttributeSource.AttributeFactory, org.apache.lucene.util.AttributeSource.State |
| Fields inherited from class org.apache.lucene.analysis.TokenFilter |
input |
| Method Summary |
boolean incrementToken(): Returns the next input token after stemming it. |
| Methods inherited from class org.apache.lucene.analysis.TokenFilter |
close, end, reset |
| Methods inherited from class org.apache.lucene.util.AttributeSource |
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString |
JapaneseKatakanaStemFilter
public JapaneseKatakanaStemFilter(org.apache.lucene.analysis.TokenStream in)
incrementToken
public boolean incrementToken()
throws IOException
- Returns the next input token after stemming it.
- Specified by:
incrementToken in class org.apache.lucene.analysis.TokenStream
- Throws:
IOException
Copyright © 2012. All Rights Reserved.