Package net.sf.okapi.steps.tokenization
Class TokenizationStep
- java.lang.Object
  - net.sf.okapi.common.pipeline.BasePipelineStep
    - net.sf.okapi.steps.tokenization.TokenizationStep
- All Implemented Interfaces:
IPipelineStep
public class TokenizationStep extends BasePipelineStep
-
-
Constructor Summary
Constructor          Description
TokenizationStep()
-
Method Summary
Modifier and Type, Method, Description
- List<Token> apostrophe(Token token, LocaleId locale)
  Breaks French and Italian words containing an apostrophe into three tokens: WORD, PUNCTUATION, WORD.
- String getDescription()
  Gets a short localizable description of what this step does.
- String getName()
  Gets the localizable name of this step.
- LocaleId getSourceLocale()
  Delegate to concrete class
- LocaleId getTargetLocale()
  Delegate to concrete class
- protected Event handleStartDocument(Event event)
  Handles the EventType.START_DOCUMENT event.
- protected Event handleTextUnit(Event event)
  Handles the EventType.TEXT_UNIT event.
- Collection<? extends Token> postProcess(Token t, LocaleId language)
  Applies various rules to correct the output of RbbiTokenizer.
- void setSourceLocale(LocaleId sourceLocale)
  Delegate to concrete class
- void setTargetLocale(LocaleId targetLocale)
Methods inherited from class net.sf.okapi.common.pipeline.BasePipelineStep
cancel, destroy, getHelpLocation, getParameters, handleCustom, handleDocumentPart, handleEndBatch, handleEndBatchItem, handleEndDocument, handleEndGroup, handleEndSubDocument, handleEndSubfilter, handleEvent, handleMultiEvent, handlePipelineParameters, handleRawDocument, handleStartBatch, handleStartBatchItem, handleStartGroup, handleStartSubDocument, handleStartSubfilter, isDone, isLastOutputStep, setLastOutputStep, setParameters
-
Method Detail
-
handleStartDocument
protected Event handleStartDocument(Event event)
Description copied from class: BasePipelineStep
Handles the EventType.START_DOCUMENT event.
- Overrides:
  handleStartDocument in class BasePipelineStep
- Parameters:
  event - event to handle.
- Returns:
  the event returned.
-
handleTextUnit
protected Event handleTextUnit(Event event)
Description copied from class: BasePipelineStep
Handles the EventType.TEXT_UNIT event.
- Overrides:
  handleTextUnit in class BasePipelineStep
- Parameters:
  event - event to handle.
- Returns:
  the event returned.
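The handler methods above are overrides of per-event hooks in the base class. The following is a minimal, self-contained sketch of that override pattern. It does not use the Okapi classes: `StepDispatchSketch`, `BaseStep`, `CountingStep`, and the simplified `Event` and `EventType` types are hypothetical stand-ins, and the way the real BasePipelineStep routes events is assumed here, not taken from this page.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: simplified stand-ins, not the Okapi API.
final class StepDispatchSketch {

    enum EventType { START_DOCUMENT, TEXT_UNIT, END_DOCUMENT }

    // Hypothetical event carrying a type and a text payload.
    static final class Event {
        final EventType type;
        final String payload;

        Event(EventType type, String payload) {
            this.type = type;
            this.payload = payload;
        }
    }

    // Hypothetical base step: routes each event to a hook; the default hooks
    // return the event unchanged.
    static class BaseStep {
        Event handleEvent(Event event) {
            switch (event.type) {
                case START_DOCUMENT:
                    return handleStartDocument(event);
                case TEXT_UNIT:
                    return handleTextUnit(event);
                default:
                    return event;
            }
        }

        protected Event handleStartDocument(Event event) { return event; }

        protected Event handleTextUnit(Event event) { return event; }
    }

    // Concrete step overriding only the two hooks it cares about.
    static final class CountingStep extends BaseStep {
        final List<String> seen = new ArrayList<>();

        @Override
        protected Event handleStartDocument(Event event) {
            seen.clear(); // reset per-document state
            return event;
        }

        @Override
        protected Event handleTextUnit(Event event) {
            seen.add(event.payload); // a real step would tokenize here
            return event;
        }
    }

    public static void main(String[] args) {
        CountingStep step = new CountingStep();
        step.handleEvent(new Event(EventType.START_DOCUMENT, "doc"));
        step.handleEvent(new Event(EventType.TEXT_UNIT, "Hello"));
        step.handleEvent(new Event(EventType.END_DOCUMENT, "doc"));
        System.out.println(step.seen); // prints [Hello]
    }
}
```

Each hook returns the event it received, which matches the "the event returned" contract documented for both handlers.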
-
getSourceLocale
public LocaleId getSourceLocale()
Description copied from interface: IPipelineStep
Delegate to concrete class
- Specified by:
  getSourceLocale in interface IPipelineStep
- Overrides:
  getSourceLocale in class BasePipelineStep
- Returns:
  LocaleId
-
setSourceLocale
public void setSourceLocale(LocaleId sourceLocale)
Description copied from interface: IPipelineStep
Delegate to concrete class
- Specified by:
  setSourceLocale in interface IPipelineStep
- Overrides:
  setSourceLocale in class BasePipelineStep
-
getTargetLocale
public LocaleId getTargetLocale()
Description copied from interface: IPipelineStep
Delegate to concrete class
- Specified by:
  getTargetLocale in interface IPipelineStep
- Overrides:
  getTargetLocale in class BasePipelineStep
- Returns:
  LocaleId
-
setTargetLocale
public void setTargetLocale(LocaleId targetLocale)
- Specified by:
  setTargetLocale in interface IPipelineStep
- Overrides:
  setTargetLocale in class BasePipelineStep
-
postProcess
public Collection<? extends Token> postProcess(Token t, LocaleId language)
Applies various rules to correct the output of RbbiTokenizer.
- Parameters:
  t - the Token
- Returns:
  list of corrected tokens, or the original token if no changes were made
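The return contract of postProcess (corrected tokens, or the original token when nothing changes) lets a caller replace each raw token with the returned collection. Below is a minimal sketch of that calling pattern. Plain strings stand in for Token objects, and `PostProcessSketch`, `correct`, and `applyAll` are hypothetical names; the apostrophe rule is just one example correction and is not claimed to be what the real method applies.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Illustrative only: strings stand in for Okapi Token objects.
final class PostProcessSketch {

    // Hypothetical correction rule following the documented contract: returns
    // the corrected tokens, or the original token alone when nothing changes.
    static Collection<String> correct(String token) {
        int pos = token.indexOf('\'');
        // Require text on both sides of the apostrophe.
        if (pos <= 0 || pos >= token.length() - 1) {
            return Collections.singletonList(token);
        }
        return List.of(token.substring(0, pos), "'", token.substring(pos + 1));
    }

    // Replaces every raw token with the rule's result, preserving order.
    static List<String> applyAll(List<String> rawTokens) {
        List<String> out = new ArrayList<>();
        for (String token : rawTokens) {
            out.addAll(correct(token));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(applyAll(List.of("l'eau", "est", "froide")));
        // prints [l, ', eau, est, froide]
    }
}
```

Because unchanged tokens come back as a one-element collection, the caller needs no special case for "no correction applied".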
-
apostrophe
public List<Token> apostrophe(Token token, LocaleId locale)
Breaks French and Italian words containing an apostrophe into three tokens: WORD, PUNCTUATION, WORD.
- Parameters:
  token
- Returns:
  list of transformed tokens, if any
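The three-way split described above can be sketched as follows. This is an illustration under stated assumptions, not the Okapi implementation: `ApostropheSplitSketch` and `splitApostrophe` are hypothetical names, plain strings and `java.util.Locale` stand in for Token and LocaleId, and returning an empty list when no split applies is an assumed reading of "list of transformed tokens, if any".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative only: strings and java.util.Locale stand in for Okapi types.
final class ApostropheSplitSketch {

    // Splits a French or Italian word at its first apostrophe into
    // WORD, PUNCTUATION, WORD. Returns an empty list when no split applies
    // (an assumption of this sketch).
    static List<String> splitApostrophe(String word, Locale locale) {
        List<String> parts = new ArrayList<>();
        String lang = locale.getLanguage();
        if (!lang.equals("fr") && !lang.equals("it")) {
            return parts;
        }
        // Accept the ASCII apostrophe and the typographic one (U+2019).
        int pos = word.indexOf('\'');
        if (pos < 0) {
            pos = word.indexOf('\u2019');
        }
        // Require a non-empty word on both sides of the apostrophe.
        if (pos <= 0 || pos >= word.length() - 1) {
            return parts;
        }
        parts.add(word.substring(0, pos));       // WORD
        parts.add(word.substring(pos, pos + 1)); // PUNCTUATION
        parts.add(word.substring(pos + 1));      // WORD
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitApostrophe("l'amour", Locale.FRENCH));    // prints [l, ', amour]
        System.out.println(splitApostrophe("dell'arte", Locale.ITALIAN)); // prints [dell, ', arte]
        System.out.println(splitApostrophe("don't", Locale.ENGLISH));     // prints []
    }
}
```

The locale check comes first so that English contractions such as "don't" are left alone, in line with the method being documented for French and Italian only.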
-
getName
public String getName()
Description copied from interface: IPipelineStep
Gets the localizable name of this step.
- Returns:
  the localizable name of this step.
-
getDescription
public String getDescription()
Description copied from interface: IPipelineStep
Gets a short localizable description of what this step does.
- Returns:
  the text of a short description of what this step does.
-
-