java.lang.Object
  org.opencms.search.extractors.A_CmsTextExtractor
    org.opencms.search.extractors.CmsExtractorMsOfficeOOXML
public final class CmsExtractorMsOfficeOOXML
Extracts text data from a VFS resource that is an OOXML MS Office document.
Supported formats are MS Word (.docx), MS PowerPoint (.pptx) and MS Excel (.xlsx).
The OLE 2 format was introduced in Microsoft Office 97 and remained the default format until Office 2007 introduced the new XML-based OOXML format.
| Method Summary | |
|---|---|
| I_CmsExtractionResult | extractText(java.io.InputStream in) Extracts the text and meta information from the document on the input stream. |
| static I_CmsTextExtractor | getExtractor() Returns an instance of this text extractor. |
| Methods inherited from class org.opencms.search.extractors.A_CmsTextExtractor |
|---|
| combineContentItem, extractText, extractText, extractText, extractText, removeControlChars |

| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Method Detail |
|---|
getExtractor
public static I_CmsTextExtractor getExtractor()
Returns an instance of this text extractor.

extractText
public I_CmsExtractionResult extractText(java.io.InputStream in)
throws java.lang.Exception
Description copied from interface: I_CmsTextExtractor
Extracts the text and meta information from the document on the input stream.
The encoding of the input stream is either not required (the document type may have one common default encoding) or the extractor is able to determine the encoding from the provided input stream automatically.
Delivers the same result as calling I_CmsTextExtractor.extractText(InputStream, String) when String == null.
Specified by: extractText in interface I_CmsTextExtractor
Overrides: extractText in class A_CmsTextExtractor
Parameters: in - the input stream for the document to extract the text from
Throws: java.lang.Exception - if the text extraction fails
See Also: I_CmsTextExtractor.extractText(java.io.InputStream, java.lang.String)
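The contract described for extractText(java.io.InputStream), namely that it delivers the same result as the two-argument overload called with a null encoding, can be illustrated with a small self-contained sketch. The types below (ExtractionResult, TextExtractor, PlainTextExtractor) are hypothetical stand-ins written for this example, not the OpenCms classes, and the UTF-8 fallback for a null encoding is an assumption of the sketch rather than documented behaviour of CmsExtractorMsOfficeOOXML.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ExtractorContractSketch {

    /** Stand-in for I_CmsExtractionResult; only a content accessor is modelled. */
    interface ExtractionResult {
        String getContent();
    }

    /** Stand-in for I_CmsTextExtractor with the two overloads the Javadoc names. */
    interface TextExtractor {
        ExtractionResult extractText(InputStream in) throws Exception;
        ExtractionResult extractText(InputStream in, String encoding) throws Exception;
    }

    /** Hypothetical plain-text extractor, used only to demonstrate the contract. */
    static final class PlainTextExtractor implements TextExtractor {

        private static final PlainTextExtractor INSTANCE = new PlainTextExtractor();

        private PlainTextExtractor() { }

        /** Static accessor in the style of getExtractor(); sharing one instance is an assumption. */
        static TextExtractor getExtractor() {
            return INSTANCE;
        }

        @Override
        public ExtractionResult extractText(InputStream in) throws Exception {
            // The one-argument overload forwards a null encoding.
            return extractText(in, null);
        }

        @Override
        public ExtractionResult extractText(InputStream in, String encoding) throws Exception {
            // Assumption for this sketch: a null encoding falls back to UTF-8.
            Charset charset = (encoding == null) ? StandardCharsets.UTF_8 : Charset.forName(encoding);
            String text = new String(in.readAllBytes(), charset);
            return () -> text;
        }
    }

    /** Runs one of the two overloads on the given bytes and returns the extracted content. */
    public static String contentOf(byte[] data, boolean explicitNull) {
        try (InputStream in = new ByteArrayInputStream(data)) {
            TextExtractor extractor = PlainTextExtractor.getExtractor();
            ExtractionResult result = explicitNull
                ? extractor.extractText(in, null)
                : extractor.extractText(in);
            return result.getContent();
        } catch (Exception e) {
            // extractText is declared to throw Exception, so callers must handle it.
            throw new IllegalStateException("text extraction failed", e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "Hello OOXML".getBytes(StandardCharsets.UTF_8);
        // Compares the one-argument call with the explicit null-encoding call.
        System.out.println(contentOf(data, false).equals(contentOf(data, true)));
    }
}
```

With the real class, the equivalent call would presumably obtain the extractor through CmsExtractorMsOfficeOOXML.getExtractor() and pass it a stream over a .docx, .pptx or .xlsx resource. Because extractText is declared with throws java.lang.Exception, the caller has to catch or propagate the exception either way.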