public class LocationTextExtractionStrategy extends Object implements TextExtractionStrategy
| Modifier and Type | Class and Description |
|---|---|
| static class | LocationTextExtractionStrategy.TextChunk: Represents a chunk of text, its orientation, and its location relative to the orientation vector |
| static interface | LocationTextExtractionStrategy.TextChunkFilter: Specifies a filter for LocationTextExtractionStrategy.TextChunk objects during text extraction |
| static interface | LocationTextExtractionStrategy.TextChunkLocation |
| static class | LocationTextExtractionStrategy.TextChunkLocationDefaultImp |
| static interface | LocationTextExtractionStrategy.TextChunkLocationStrategy |
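
The TextChunkFilter interface listed above can be pictured as a predicate applied to chunks that have already been collected, so the same page can be queried several times without being parsed again. The following is a minimal standalone sketch of that idea, not iText code: `ChunkFilterSketch`, `Chunk`, `ChunkFilter`, `extract` and `samplePage` are hypothetical stand-ins, and the real interface's method signature is not shown on this page.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone model of chunk filtering; all names here are hypothetical, NOT the iText classes.
class ChunkFilterSketch {

    /** Simplified chunk: its text plus the x/y position where it starts. */
    static final class Chunk {
        final String text;
        final float x;
        final float y;

        Chunk(String text, float x, float y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    /** Stand-in for the filter idea: decides whether a chunk is kept. */
    interface ChunkFilter {
        boolean accept(Chunk chunk);
    }

    /** Joins the text of accepted chunks in list order, separated by single spaces. */
    static String extract(List<Chunk> chunks, ChunkFilter filter) {
        StringBuilder sb = new StringBuilder();
        for (Chunk c : chunks) {
            if (filter == null || filter.accept(c)) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(c.text);
            }
        }
        return sb.toString();
    }

    /** Chunks "collected" once, as a listener would do while the page is parsed. */
    static List<Chunk> samplePage() {
        List<Chunk> chunks = new ArrayList<>();
        chunks.add(new Chunk("Header", 50f, 800f));
        chunks.add(new Chunk("Body", 50f, 400f));
        chunks.add(new Chunk("Footer", 50f, 30f));
        return chunks;
    }

    public static void main(String[] args) {
        List<Chunk> page = samplePage();
        // Two different regions are extracted from the same collected chunks.
        System.out.println(extract(page, c -> c.y >= 400f)); // Header Body
        System.out.println(extract(page, c -> c.y < 100f));  // Footer
    }
}
```

Because the filter only sees what the chunk itself stores (here text and position), it cannot select on anything else the renderer knew at the time, which is the trade-off the method description for the filtered getResultantText points out.
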
| Constructor and Description |
|---|
| LocationTextExtractionStrategy(): Creates a new text extraction renderer. |
| LocationTextExtractionStrategy(LocationTextExtractionStrategy.TextChunkLocationStrategy strat): Creates a new text extraction renderer with a custom strategy for creating new TextChunkLocation objects based on the input of the TextRenderInfo. |
| Modifier and Type | Method and Description |
|---|---|
| void | beginTextBlock(): Called when a new text block is beginning. |
| void | endTextBlock(): Called when a text block has ended. |
| String | getResultantText(): Returns the result so far. |
| String | getResultantText(LocationTextExtractionStrategy.TextChunkFilter chunkFilter): Gets the text that meets the specified filter. If multiple text extractions will be performed for the same page (for example, for different physical regions of the page), filtering at this level is more efficient than filtering using FilteredRenderListener, but not nearly as powerful, because most of the RenderInfo state is not captured in LocationTextExtractionStrategy.TextChunk. |
| protected boolean | isChunkAtWordBoundary(LocationTextExtractionStrategy.TextChunk chunk, LocationTextExtractionStrategy.TextChunk previousChunk): Determines if a space character should be inserted between a previous chunk and the current chunk. |
| void | renderImage(ImageRenderInfo renderInfo): No-op method; this renderer isn't interested in image events. |
| void | renderText(TextRenderInfo renderInfo): Called when text should be rendered. |
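
The decision made by isChunkAtWordBoundary, whether a space belongs between the previous chunk and the current one, can be illustrated with a one-dimensional model. This is a hedged sketch rather than iText's implementation: `WordBoundarySketch`, its `Chunk` class and `join` are hypothetical, and the thresholds (a gap wider than half a space, or an overlap larger than a full space) are assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Standalone one-dimensional model; class and field names are hypothetical, NOT iText API.
class WordBoundarySketch {

    /** A run of text on a single baseline, spanning [startX, endX]. */
    static final class Chunk {
        final String text;
        final float startX;
        final float endX;
        final float spaceWidth; // assumed width of a space character in this chunk's font

        Chunk(String text, float startX, float endX, float spaceWidth) {
            this.text = text;
            this.startX = startX;
            this.endX = endX;
            this.spaceWidth = spaceWidth;
        }
    }

    /** True if the gap (or overlap) between the chunks suggests a word break. Thresholds are assumed. */
    static boolean isAtWordBoundary(Chunk chunk, Chunk previous) {
        float gap = chunk.startX - previous.endX;
        return gap > chunk.spaceWidth / 2f || gap < -chunk.spaceWidth;
    }

    /** Joins chunks left to right, adding a space only at detected word boundaries. */
    static String join(List<Chunk> chunks) {
        StringBuilder sb = new StringBuilder();
        Chunk previous = null;
        for (Chunk c : chunks) {
            // Skip the extra space when either side already carries one.
            if (previous != null && isAtWordBoundary(c, previous)
                    && !c.text.startsWith(" ") && !previous.text.endsWith(" ")) {
                sb.append(' ');
            }
            sb.append(c.text);
            previous = c;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Chunk> line = Arrays.asList(
                new Chunk("Hel", 0f, 15f, 3f),
                new Chunk("lo", 15.2f, 25f, 3f),    // gap of about 0.2: same word
                new Chunk("world", 28f, 55f, 3f));  // gap of 3.0: new word
        System.out.println(join(line)); // Hello world
    }
}
```

Since the method is protected, a subclass can override it to apply a different rule, for instance a looser threshold for fonts with tight letter spacing.
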
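public LocationTextExtractionStrategy()
Creates a new text extraction renderer.

public LocationTextExtractionStrategy(LocationTextExtractionStrategy.TextChunkLocationStrategy strat)
Creates a new text extraction renderer with a custom strategy for creating new TextChunkLocation objects based on the input of the TextRenderInfo.
Parameters: strat - the custom strategy

public void beginTextBlock()
Description copied from interface: RenderListener
Specified by: beginTextBlock in interface RenderListener
See Also: RenderListener.beginTextBlock()

public void endTextBlock()
Description copied from interface: RenderListener
Specified by: endTextBlock in interface RenderListener
See Also: RenderListener.endTextBlock()

protected boolean isChunkAtWordBoundary(LocationTextExtractionStrategy.TextChunk chunk, LocationTextExtractionStrategy.TextChunk previousChunk)
Determines if a space character should be inserted between a previous chunk and the current chunk.
Parameters: chunk - the new chunk being evaluated
previousChunk - the chunk that appeared immediately before the current chunk

public String getResultantText(LocationTextExtractionStrategy.TextChunkFilter chunkFilter)
Gets the text that meets the specified filter. If multiple text extractions will be performed for the same page (for example, for different physical regions of the page), filtering at this level is more efficient than filtering using FilteredRenderListener, but not nearly as powerful, because most of the RenderInfo state is not captured in LocationTextExtractionStrategy.TextChunk.
Parameters: chunkFilter - the filter to apply

public String getResultantText()
Returns the result so far.
Specified by: getResultantText in interface TextExtractionStrategy

public void renderText(TextRenderInfo renderInfo)
Description copied from interface: RenderListener
Specified by: renderText in interface RenderListener
Parameters: renderInfo - information specifying what to render
See Also: RenderListener.renderText(com.itextpdf.text.pdf.parser.TextRenderInfo)

public void renderImage(ImageRenderInfo renderInfo)
No-op method; this renderer isn't interested in image events.
Specified by: renderImage in interface RenderListener
Parameters: renderInfo - information specifying what to render
See Also: RenderListener.renderImage(com.itextpdf.text.pdf.parser.ImageRenderInfo)

Copyright © 2016. All rights reserved.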