java.lang.Object
  org.opencms.search.extractors.A_CmsTextExtractor
    org.opencms.search.extractors.CmsExtractorPdf
public final class CmsExtractorPdf
Extracts the text from a PDF document.
| Method Summary | |
|---|---|
| I_CmsExtractionResult | extractText(java.io.InputStream in) - Extracts the text and meta information from the document on the input stream. |
| static I_CmsTextExtractor | getExtractor() - Returns an instance of this text extractor. |
| Methods inherited from class org.opencms.search.extractors.A_CmsTextExtractor |
|---|
combineContentItem, extractText, extractText, extractText, extractText, removeControlChars |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Method Detail |
|---|
public static I_CmsTextExtractor getExtractor()
Returns an instance of this text extractor.

public I_CmsExtractionResult extractText(java.io.InputStream in)
                                  throws java.lang.Exception
Description copied from interface: I_CmsTextExtractor
Extracts the text and meta information from the document on the input stream.
The encoding of the input stream is either not required (the document type may have one common default encoding), or the extractor is able to determine the encoding from the provided input stream automatically.
Delivers the same result as calling I_CmsTextExtractor.extractText(InputStream, String) when the String argument is null.
Specified by: extractText in interface I_CmsTextExtractor
Overrides: extractText in class A_CmsTextExtractor
Parameters: in - the input stream for the document to extract the text from
Throws: java.lang.Exception - if the text extraction fails
See Also: I_CmsTextExtractor.extractText(java.io.InputStream, java.lang.String)
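
The calling pattern described above (obtain the extractor through the static factory, pass it an input stream, read the result) can be sketched without the OpenCms library on the classpath. Everything below is an illustrative assumption: `ExtractionResult`, `TextExtractor` and `ToyExtractor` are hypothetical stand-ins for I_CmsExtractionResult, I_CmsTextExtractor and CmsExtractorPdf, the `getContent()` accessor is not documented on this page, and the toy extractor merely decodes bytes rather than parsing a PDF. The sketch only mirrors the documented rule that the one-argument extractText delivers the same result as the two-argument form with a null String.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class ExtractorUsageSketch {

    /** Hypothetical stand-in for I_CmsExtractionResult; only the text content is modeled. */
    interface ExtractionResult {
        String getContent();
    }

    /** Hypothetical stand-in for I_CmsTextExtractor with the two overloads named on this page. */
    interface TextExtractor {
        ExtractionResult extractText(InputStream in, String encoding) throws Exception;

        // Mirrors the documented contract: same result as passing a null String.
        default ExtractionResult extractText(InputStream in) throws Exception {
            return extractText(in, null);
        }
    }

    /** Toy extractor (assumption): decodes bytes as text instead of parsing a PDF. */
    static final class ToyExtractor implements TextExtractor {
        private static final ToyExtractor INSTANCE = new ToyExtractor();

        // Static factory shaped like getExtractor(); sharing one instance is an assumption.
        static TextExtractor getExtractor() {
            return INSTANCE;
        }

        @Override
        public ExtractionResult extractText(InputStream in, String encoding) throws Exception {
            byte[] bytes = in.readAllBytes();
            // With no encoding supplied, fall back to a default, as the description allows.
            String charsetName = (encoding == null) ? StandardCharsets.UTF_8.name() : encoding;
            String text = new String(bytes, charsetName);
            return () -> text;
        }
    }

    /** Extracts text from an in-memory document and closes the stream afterwards. */
    static String extract(byte[] document) throws Exception {
        try (InputStream in = new ByteArrayInputStream(document)) {
            return ToyExtractor.getExtractor().extractText(in).getContent();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] document = "Hello PDF".getBytes(StandardCharsets.UTF_8);
        System.out.println(extract(document));
    }
}
```

With the real library, the equivalent call would presumably be `CmsExtractorPdf.getExtractor().extractText(in)`, wrapped in a try block because the method is declared to throw java.lang.Exception when extraction fails.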