StreamFilter
StreamFilter to be applied during analysis
StringTagger.analyze(String, List) instead.
StringTagger.analyze(char[], List) instead.
Morpheme.getBasicForm().
Node representing a beginning-of-string
chars
Tokens with a single composite Token
Token with one or more alternative Tokens.
Morpheme.getConjugationalForm() and Morpheme.getConjugationalType().
Node, comprising this.prev, this Node, and this.next.
Token.getCost().
org.apache.lucene.analysis.util.ReusableAnalyzerBase.TokenStreamComponents used to tokenize all the text in the provided Reader.
Dictionary class wraps access to a compiled Sen dictionary
Dictionary used to find possible morphemes
Node representing an end-of-string
Viterbi.getBestTokens(Sentence, List) instead
Node.
CToken.
Node.
CToken.
Node with the specified
characteristics.
CToken.
Tokens are available
BasicFormAttribute.
JapaneseBasicFormFilter.
JapaneseKatakanaStemFilter.
JapanesePartOfSpeechKeepFilter.
JapanesePartOfSpeechStopFilter.
JapanesePunctuationFilter.
JapaneseTokenizer.
TokenFilter that normalizes CJK width differences:
- Folds fullwidth ASCII variants into the equivalent basic latin
- Folds halfwidth Katakana variants into the equivalent kana
NOTE: this filter can be viewed as a (practical) subset of NFKC/NFKD Unicode normalization.
JapaneseWidthFilter.
Node covers
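The width folding described above (fullwidth ASCII to basic Latin, halfwidth Katakana to kana) can be illustrated with the JDK's own NFKC normalizer. This is only a sketch: it uses java.text.Normalizer as a stand-in, not the filter's implementation, and full NFKC changes more characters than the width filter is described as folding. The class name WidthFoldSketch and the method fold are hypothetical.

```java
import java.text.Normalizer;

// Hypothetical illustration only; not part of Sen or the Lucene analysis package.
class WidthFoldSketch {

    // Applies full NFKC normalization. The width filter is described as a
    // practical subset of this, so NFKC may alter characters the filter leaves alone.
    static String fold(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        // Fullwidth "ABC" (U+FF21..U+FF23) folds to basic Latin "ABC".
        System.out.println(fold("\uFF21\uFF22\uFF23"));
        // Halfwidth Katakana KA, NA (U+FF76, U+FF85) fold to U+30AB, U+30CA.
        System.out.println(fold("\uFF76\uFF85"));
    }
}
```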
Node returned for the same ending position within the sentence by the Dictionary
Morpheme that does not link to any Dictionary
Morpheme that does not link to any Dictionary
Morpheme that is contained within this Node
Node lattice
Viterbi cost lattice
Morpheme.getPartOfSpeech().
Node lattice
ReadingProcessor.ReadingResult object isolated from further changes to the reading processor
Morpheme.getPronunciations().
Morpheme.getReadings().
StreamFilters
Node returned for the same starting position within the sentence by the Dictionary
Tokens
used to replace them
Viterbi, StringTagger, and ReadingProcessor objects
position; any existing constraints that overlap the new constraint will be removed.
- setReadingConstraint(Reading) -
Method in class net.java.sen.ReadingProcessor
- Sets a reading constraint on the currently analysed text.
- setSentenceStart(boolean) -
Method in class net.java.sen.dictionary.Token
- Sets whether or not this token begins a new sentence.
- setSentenceStart(boolean) -
Method in interface org.apache.lucene.analysis.ja.tokenAttributes.SentenceStartAttribute
-
- setSentenceStart(boolean) -
Method in class org.apache.lucene.analysis.ja.tokenAttributes.SentenceStartAttributeImpl
-
- setStart(int) -
Method in class net.java.sen.dictionary.Token
- Sets the start of the character range of this Token within the
underlying sentence
- setSurface(String) -
Method in class net.java.sen.dictionary.Token
- Sets the character range of this Token within the underlying sentence
- setText(String) -
Method in class net.java.sen.ReadingProcessor
- Sets the currently analysed text.
- setVisible(int, Boolean) -
Method in class net.java.sen.filter.reading.OverrideFilter
- Sets a visibility override at a given character index
- size() -
Method in class net.java.sen.compiler.VirtualTupleList
- Returns the number of entries in the list
- SIZE -
Static variable in class net.java.sen.dictionary.CToken
- The length in bytes of a stored CToken
- skippedCharCount() -
Method in interface net.java.sen.dictionary.SentenceIterator
- Returns the number of characters skipped between the previous and
current character spans
- sort() -
Method in class net.java.sen.compiler.VirtualTupleList
- Sorts the list
- span -
Variable in class net.java.sen.dictionary.Node
- The number of characters between the end of the previous Node and the end of this one, including any ignored characters that do not form part of the Morpheme
- start -
Variable in class net.java.sen.dictionary.Node
- The index of the first character of this Node within the surface
- start -
Variable in class net.java.sen.dictionary.Reading
- The starting point within the sentence
- StreamFilter - Interface in net.java.sen.filter
- Represents a Node filter capable of both pre- and post-processing.
- StreamTagger - Class in net.java.sen
- Tokenizes text read from a java.io.Reader.
See examples.StreamTaggerDemo in the Sen source for an example of how to use this class.
Thread Safety: Objects of this class are NOT thread safe and should not be accessed simultaneously by multiple threads.
- StreamTagger(StringTagger, Reader) -
Constructor for class net.java.sen.StreamTagger
-
- StreamTagger2 - Class in org.apache.lucene.analysis.ja
- Breaks text into sentences according to UAX #29: Unicode Text Segmentation
(http://www.unicode.org/reports/tr29/)
- StreamTagger2(StringTagger, Reader) -
Constructor for class org.apache.lucene.analysis.ja.StreamTagger2
- Constructs a new StreamTagger2 that breaks text from the given Reader into sentences.
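As a rough illustration of sentence segmentation in the style of UAX #29, the sketch below uses the JDK's java.text.BreakIterator. It is an assumption that this resembles what StreamTagger2 does; the index does not say how the class is implemented, and the JDK's rules are not guaranteed to match UAX #29 exactly. The class name SentenceSplitSketch and the method split are hypothetical.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not the StreamTagger2 implementation.
class SentenceSplitSketch {

    // Returns the sentences of text, each including its trailing whitespace,
    // as delimited by the JDK sentence BreakIterator for the given locale.
    static List<String> split(String text, Locale locale) {
        BreakIterator boundaries = BreakIterator.getSentenceInstance(locale);
        boundaries.setText(text);
        List<String> sentences = new ArrayList<>();
        int start = boundaries.first();
        for (int end = boundaries.next(); end != BreakIterator.DONE; end = boundaries.next()) {
            sentences.add(text.substring(start, end));
            start = end;
        }
        return sentences;
    }

    public static void main(String[] args) {
        for (String sentence : split("First sentence. Second one.", Locale.ENGLISH)) {
            System.out.println("[" + sentence + "]");
        }
    }
}
```

Each sentence produced this way could then be handed to a tagger one at a time, which keeps the amount of text analysed per call bounded.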
- StringCTokenTuple - Class in net.java.sen.compiler
- A tuple comprising a String and a CToken
- StringCTokenTuple(String, CToken) -
Constructor for class net.java.sen.compiler.StringCTokenTuple
-
- StringTagger - Class in net.java.sen
- Tokenizes strings.
See examples.StringTaggerDemo in the Sen source for an example of how to use this class.
Thread Safety: Objects of this class are NOT thread safe and should not be accessed simultaneously by multiple threads.
- StringTagger(Tokenizer) -
Constructor for class net.java.sen.StringTagger
-
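Because StringTagger and StreamTagger objects are documented as not thread safe, one common way to use such objects from several threads is to give each thread its own instance. The sketch below shows that pattern with java.lang.ThreadLocal. FakeTagger is a hypothetical stand-in with mutable state; it is not the Sen API, and how a real tagger is constructed is left out on purpose.

```java
// Hypothetical illustration only; FakeTagger is NOT net.java.sen.StringTagger.
class PerThreadTaggerSketch {

    // Stand-in for an analyzer that keeps mutable scratch state between calls,
    // which is what makes sharing one instance across threads unsafe.
    static final class FakeTagger {
        private final StringBuilder scratch = new StringBuilder();

        int analyze(String text) {
            scratch.setLength(0);
            scratch.append(text);
            return scratch.length();
        }
    }

    // One FakeTagger per thread, created lazily on first use.
    private static final ThreadLocal<FakeTagger> TAGGER =
            ThreadLocal.withInitial(FakeTagger::new);

    static FakeTagger current() {
        return TAGGER.get();
    }

    // Returns true if a second thread receives a different instance from the caller's.
    static boolean distinctAcrossThreads() {
        FakeTagger mine = current();
        FakeTagger[] other = new FakeTagger[1];
        Thread worker = new Thread(() -> other[0] = current());
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return other[0] != null && other[0] != mine;
    }
}
```

The alternative is to guard a single shared instance with a lock, which is simpler but serializes all analysis.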
Dictionary to assist the decomposition of strings into potential morphemes
Tokenizer that uses the specified Dictionary to find possible morphemes within a given string
CToken representing an unknown morpheme
StringCTokenTuples.
true if the stored readings, if any, are to be shown, otherwise false