net.java.sen.tokenizers.ja
Class JapaneseTokenizer
java.lang.Object
  net.java.sen.dictionary.Tokenizer
    net.java.sen.tokenizers.ja.JapaneseTokenizer
public class JapaneseTokenizer
- extends Tokenizer
A Tokenizer for Japanese text.
Method Summary
Node lookup(SentenceIterator iterator, char[] surface)
    Searches for possible morphemes from the given SentenceIterator.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
JapaneseTokenizer
public JapaneseTokenizer(Dictionary dictionary,
String unknownPartOfSpeechDescription)
- Creates a JapaneseTokenizer with the given Dictionary.
- Parameters:
dictionary - The Dictionary in which to search for possible morphemes
unknownPartOfSpeechDescription - The part-of-speech code to use for unknown tokens
lookup
public Node lookup(SentenceIterator iterator,
char[] surface)
- Description copied from class: Tokenizer
- Searches for possible morphemes from the given SentenceIterator. The Node that is returned links through Node.rnext to a list of matches, which may be of varying lengths.
- Specified by:
lookup in class Tokenizer
- Parameters:
iterator - The iterator to search from
surface - The underlying character surface
- Returns:
- The head of a chain of Nodes representing the possible morphemes beginning at the given index.
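A caller typically walks the chain returned by lookup by following rnext from the head. The sketch below illustrates only that traversal pattern. It does not use the Sen library: the nested Node class is a hypothetical stand-in that models just an rnext link plus an illustrative surface string, and it assumes the chain is terminated by null, which this page does not state.

```java
import java.util.ArrayList;
import java.util.List;

public class RnextChainSketch {

    /**
     * Hypothetical stand-in for net.java.sen.dictionary.Node.
     * Only the rnext link is taken from the documentation above;
     * the surface field exists purely for illustration.
     */
    static final class Node {
        final String surface;
        Node rnext; // next candidate starting at the same position

        Node(String surface) {
            this.surface = surface;
        }
    }

    /**
     * Collects the surface of every candidate in the chain,
     * assuming (not confirmed by the source) that the last node's
     * rnext is null.
     */
    static List<String> candidates(Node head) {
        List<String> result = new ArrayList<>();
        for (Node n = head; n != null; n = n.rnext) {
            result.add(n.surface);
        }
        return result;
    }

    public static void main(String[] args) {
        // Three candidates of different lengths that begin at the same
        // position, written as Unicode escapes (1, 2 and 3 characters).
        Node one = new Node("\u6771");
        Node two = new Node("\u6771\u4eac");
        Node three = new Node("\u6771\u4eac\u90fd");
        one.rnext = two;
        two.rnext = three;

        for (String s : candidates(one)) {
            System.out.println(s.length() + " char(s): " + s);
        }
    }
}
```

With the real library, the head node would come from lookup(iterator, surface) rather than being built by hand, and each candidate's length and part-of-speech would be read from the actual Node fields.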
Copyright © 2012. All Rights Reserved.