net.java.sen
Class StringTagger

java.lang.Object
  extended by net.java.sen.StringTagger

public class StringTagger
extends Object

Tokenizes strings

See examples.StringTaggerDemo in the Sen source for an example of how to use this class.

Thread Safety: Objects of this class are NOT thread safe and must not be accessed simultaneously by multiple threads. Creating additional instances using SenFactory is relatively cheap in both memory and time, so each thread can be given its own instance.
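The thread-safety note above suggests confining each instance to a single thread. The following is a minimal sketch of that pattern using ThreadLocal. FakeTagger is a hypothetical stand-in for StringTagger so that the example compiles on its own; in real code the supplier would obtain the instance from SenFactory, whose exact factory method is not shown on this page and is therefore not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

public class PerThreadTagger {

    /** Hypothetical stand-in for StringTagger: it keeps mutable state, so it is not thread safe. */
    static class FakeTagger {
        private final List<String> scratch = new ArrayList<>(); // per-instance working buffer

        List<String> analyze(String surface) {
            scratch.clear();
            for (String part : surface.split(" ")) {
                scratch.add(part);
            }
            return new ArrayList<>(scratch); // hand back a copy of the buffer
        }
    }

    // Each thread lazily receives its own tagger, so no instance is ever shared.
    // ASSUMPTION: with Sen, this supplier would create the tagger through SenFactory.
    private static final ThreadLocal<FakeTagger> TAGGER =
            ThreadLocal.withInitial(FakeTagger::new);

    /** Analyzes text with the calling thread's private tagger. */
    static List<String> analyze(String surface) {
        return TAGGER.get().analyze(surface);
    }

    /** Returns true if another thread sees the same tagger instance as the caller. */
    static boolean sameInstanceAcrossThreads() {
        FakeTagger mine = TAGGER.get();
        FakeTagger[] other = new FakeTagger[1];
        Thread t = new Thread(() -> other[0] = TAGGER.get());
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return mine == other[0];
    }
}
```

With a thread pool, each worker thread creates one tagger on first use and keeps it for its lifetime, which fits the note that instances are cheap to create.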


Constructor Summary
StringTagger(Tokenizer tokenizer)
           
 
Method Summary
 void addFilter(StreamFilter filter)
          Add a StreamFilter to be applied during analysis
 List<Token> analyze(char[] surface)
          Deprecated. Use analyze(char[], List) instead.
 List<Token> analyze(char[] surface, List<Token> reuse)
          Decompose a string into its most likely constituent morphemes
 List<Token> analyze(String surface)
          Deprecated. Use analyze(String, List) instead.
 List<Token> analyze(String surface, List<Token> reuse)
          Decompose a string into its most likely constituent morphemes
 void removeFilters()
          Remove all current StreamFilters
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

StringTagger

public StringTagger(Tokenizer tokenizer)
Parameters:
tokenizer - The Tokenizer to use for analysis
Method Detail

addFilter

public void addFilter(StreamFilter filter)
Add a StreamFilter to be applied during analysis

Parameters:
filter - The StreamFilter to add

removeFilters

public void removeFilters()
Remove all current StreamFilters


analyze

public List<Token> analyze(String surface,
                           List<Token> reuse)
                    throws IOException
Decompose a string into its most likely constituent morphemes

Parameters:
surface - The string to analyse
reuse - A list of Tokens to reuse for the result
Returns:
A list of Tokens representing the most likely morphemes
Throws:
IOException
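The two-argument form takes a list that the caller supplies on every call. The sketch below shows one way to use it in a loop, under stated assumptions: Token and WhitespaceTagger are hypothetical stand-ins that only mimic the shape of analyze(String, List), and the stand-in clears the passed list and returns it, which this page does not confirm for the real class.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class ReuseListSketch {

    /** Hypothetical stand-in for net.java.sen.Token. */
    static class Token {
        final String surface;

        Token(String surface) {
            this.surface = surface;
        }
    }

    /** Hypothetical stand-in with the same method shape as StringTagger.analyze(String, List). */
    static class WhitespaceTagger {
        List<Token> analyze(String surface, List<Token> reuse) throws IOException {
            reuse.clear(); // ASSUMPTION: the reuse list is emptied and refilled on each call
            for (String part : surface.trim().split("\\s+")) {
                if (!part.isEmpty()) {
                    reuse.add(new Token(part));
                }
            }
            return reuse;
        }
    }

    /** Counts tokens over many lines while allocating only one result list. */
    static int countTokens(List<String> lines) {
        WhitespaceTagger tagger = new WhitespaceTagger();
        List<Token> tokens = new ArrayList<>();
        int total = 0;
        try {
            for (String line : lines) {
                // Always use the returned list; do not rely on it being the same object.
                tokens = tagger.analyze(line, tokens);
                total += tokens.size();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```

If the list is refilled on each call, as assumed here, any tokens that must outlive the next call should be copied into a separate collection first.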

analyze

@Deprecated
public List<Token> analyze(String surface)
                    throws IOException
Deprecated. Use analyze(String, List) instead.

Throws:
IOException

analyze

public List<Token> analyze(char[] surface,
                           List<Token> reuse)
                    throws IOException
Decompose a string into its most likely constituent morphemes

Parameters:
surface - The characters to analyse
reuse - A list of Tokens to reuse for the result
Returns:
A list of Tokens representing the most likely morphemes
Throws:
IOException

analyze

@Deprecated
public List<Token> analyze(char[] surface)
                    throws IOException
Deprecated. Use analyze(char[], List) instead.

Throws:
IOException


Copyright © 2012. All Rights Reserved.