public final class HtmlParserSettings extends com.univocity.parsers.remote.RemoteParserSettings<com.univocity.parsers.common.CommonParserSettings,HtmlEntityList,HtmlParsingContext>
Configuration class for the HtmlParser. Properties that also exist in HtmlEntitySettings are global and are used by each entity configuration by default. An individual HtmlEntitySettings can override these defaults with its own specific configuration.
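The relationship between global and per-entity properties can be pictured with a minimal sketch. The classes below (GlobalSettingsSketch, EntitySettingsSketch, SettingsOverrideDemo) are hypothetical stand-ins written for illustration only; they are not the univocity classes, and the real library may resolve defaults differently (for example, by copying values when an entity is created). The property name is borrowed from the inherited getTrimLeadingWhitespaces method; whether HtmlEntitySettings exposes the same property is an assumption.

```java
// Hypothetical stand-ins for illustration; these are NOT the univocity classes.
class GlobalSettingsSketch {
    private boolean trimLeadingWhitespaces = true;

    boolean getTrimLeadingWhitespaces() {
        return trimLeadingWhitespaces;
    }

    void setTrimLeadingWhitespaces(boolean trim) {
        this.trimLeadingWhitespaces = trim;
    }
}

class EntitySettingsSketch {
    private final GlobalSettingsSketch global;
    // null means "no entity-specific value set, fall back to the global one"
    private Boolean trimLeadingWhitespaces;

    EntitySettingsSketch(GlobalSettingsSketch global) {
        this.global = global;
    }

    void setTrimLeadingWhitespaces(boolean trim) {
        this.trimLeadingWhitespaces = trim;
    }

    boolean getTrimLeadingWhitespaces() {
        // The entity-level value wins when present; otherwise the global default applies.
        return trimLeadingWhitespaces != null
                ? trimLeadingWhitespaces
                : global.getTrimLeadingWhitespaces();
    }
}

class SettingsOverrideDemo {
    public static void main(String[] args) {
        GlobalSettingsSketch global = new GlobalSettingsSketch();
        global.setTrimLeadingWhitespaces(false);

        EntitySettingsSketch items = new EntitySettingsSketch(global);  // inherits the global value
        EntitySettingsSketch prices = new EntitySettingsSketch(global); // overrides it
        prices.setTrimLeadingWhitespaces(true);

        System.out.println(items.getTrimLeadingWhitespaces());
        System.out.println(prices.getTrimLeadingWhitespaces());
    }
}
```

In this sketch the fallback is resolved at read time, so a later change to the global value is still visible to entities that have not overridden it.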
See also: HtmlParser, HtmlEntityList

| Constructor and Description |
|---|
| HtmlParserSettings(): Creates a new HtmlParserSettings, which will process an input to produce records for entities defined by a HtmlEntityList. |
| Modifier and Type | Method and Description |
|---|---|
| protected HtmlParserSettings | clone() |
| protected com.univocity.parsers.common.CommonParserSettings | createGlobalSettings() |
| void | fetchResourcesBeforeParsing(FetchOptions fetchOptions) |
| boolean | fetchResourcesBeforeParsingEnabled() |
| String | getDefaultFileExtension() |
| FetchOptions | getFetchOptions() |
| HtmlPaginator | getPaginator(): Returns the HtmlPaginator associated with this HtmlParserSettings. |
| int | getParserThreadCount(): Returns the maximum number of threads used by the parser when processing data of multiple entities from the same HTML input. |
| protected HtmlPaginator | newPaginator(com.univocity.parsers.remote.RemoteParserSettings parserSettings): Creates a new HtmlPaginator and returns it. |
| void | setPaginator(HtmlPaginator paginator): Configures a HtmlPaginator to handle multiple pages of remote content that needs to be parsed. |
| void | setParserThreadCount(int parserThreadCount): Explicitly defines a maximum number of threads that should be used by the parser when processing data of multiple entities from the same HTML input. |
Inherited methods:
clearFileNameParameters, getBatchId, getDownloadContentDirectory, getDownloadListener, getDownloadThreads, getEmptyValue, getExecutorService, getFileNameParameter, getFileNameParameters, getFileNamePattern, getNesting, getParseDate, getRemoteInterval, getTextEncoding, ignoreFollowingErrors, isColumnReorderingEnabled, isDownloadBeforeParsingEnabled, isDownloadEnabled, isDownloadOverwritingEnabled, isIgnoreFollowingErrors, setBatchId, setColumnReorderingEnabled, setDownloadBeforeParsingEnabled, setDownloadContentDirectory, setDownloadContentDirectory, setDownloadEnabled, setDownloadListener, setDownloadOverwritingEnabled, setDownloadThreads, setEmptyValue, setExecutorService, setFileNameParameter, setFileNamePattern, setNesting, setPaginator, setParseDate, setParseDate, setParseDate, setRemoteInterval, setTextEncoding, setTextEncoding
addEntitiesToRead, addEntitiesToRead, addEntitiesToSkip, addEntitiesToSkip, createEmptyGlobalSettings, getEntitiesToRead, getEntitiesToSkip, getErrorContentLength, getNullValue, getProcessorErrorHandler, getTrimLeadingWhitespaces, getTrimTrailingWhitespaces, setEntitiesToRead, setEntitiesToRead, setEntitiesToSkip, setEntitiesToSkip, setErrorContentLength, setNullValue, setProcessorErrorHandler, setTrimLeadingWhitespaces, setTrimTrailingWhitespaces, shouldRead, shouldSkip, trimValues
public HtmlParserSettings()
Creates a new HtmlParserSettings, which will process an input to produce records for entities defined by a HtmlEntityList. The HtmlEntityList is used to manage HtmlEntitySettings for each entity whose records will be parsed.
protected HtmlPaginator newPaginator(com.univocity.parsers.remote.RemoteParserSettings parserSettings)
Creates a new HtmlPaginator and returns it. Used by getPaginator().
newPaginator in class com.univocity.parsers.remote.RemoteParserSettings<com.univocity.parsers.common.CommonParserSettings,HtmlEntityList,HtmlParsingContext>
Returns: the HtmlPaginator that was created
public final void setPaginator(HtmlPaginator paginator)
Configures a HtmlPaginator to handle multiple pages of remote content that needs to be parsed.
paginator - a HtmlPaginator to be associated with the current HtmlParserSettings
public final HtmlPaginator getPaginator()
Returns the HtmlPaginator associated with this HtmlParserSettings.
getPaginator in class com.univocity.parsers.remote.RemoteParserSettings<com.univocity.parsers.common.CommonParserSettings,HtmlEntityList,HtmlParsingContext>
Returns: the HtmlPaginator stored within this HtmlParserSettings
public final int getParserThreadCount()
Returns the maximum number of threads used by the parser when processing data of multiple entities from the same HTML input.
Defaults to the number of processors available to the JVM (via Runtime.getRuntime().availableProcessors()).
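As a rough illustration of what a maximum thread count means when several entities are processed from the same input, the sketch below bounds a fixed thread pool by a configurable limit. ParserThreadCountSketch and its methods are hypothetical names invented for this example, and the "parsed:" result strings are placeholders; this is not how the parser is implemented internally, only a sketch of the general idea under those assumptions. The only element taken from the documentation is the default of Runtime.getRuntime().availableProcessors().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Hypothetical sketch; not part of the univocity API.
class ParserThreadCountSketch {

    // Mirrors the documented default: one thread per processor available to the JVM.
    static int defaultThreadCount() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Never use more workers than entities, and always use at least one.
    static int effectiveThreads(int maxThreads, int entityCount) {
        return Math.max(1, Math.min(maxThreads, entityCount));
    }

    // Runs one placeholder task per entity on a pool capped at maxThreads.
    static List<String> processEntities(List<String> entityNames, int maxThreads) {
        ExecutorService pool =
                Executors.newFixedThreadPool(effectiveThreads(maxThreads, entityNames.size()));
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String name : entityNames) {
                // Placeholder for the per-entity work.
                futures.add(pool.submit(() -> "parsed:" + name));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // collected in submission order
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while processing entities", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Entity processing failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling ParserThreadCountSketch.processEntities(names, ParserThreadCountSketch.defaultThreadCount()) would correspond to leaving the thread count at its default, while passing a smaller number corresponds to lowering the limit explicitly.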
public final void setParserThreadCount(int parserThreadCount)
Explicitly defines a maximum number of threads that should be used by the parser when processing data of multiple entities from the same HTML input.
By default, the number of processors available to the JVM will be used (via Runtime.getRuntime().availableProcessors()).
parserThreadCount - the maximum number of threads to use
public String getDefaultFileExtension()
getDefaultFileExtension in class com.univocity.parsers.remote.RemoteParserSettings<com.univocity.parsers.common.CommonParserSettings,HtmlEntityList,HtmlParsingContext>
protected com.univocity.parsers.common.CommonParserSettings createGlobalSettings()
createGlobalSettings in class com.univocity.parsers.common.EntityParserSettings<com.univocity.parsers.common.CommonParserSettings,HtmlEntityList,HtmlParsingContext>
protected HtmlParserSettings clone()
clone in class com.univocity.parsers.remote.RemoteParserSettings<com.univocity.parsers.common.CommonParserSettings,HtmlEntityList,HtmlParsingContext>
public final FetchOptions getFetchOptions()
public final boolean fetchResourcesBeforeParsingEnabled()
public final void fetchResourcesBeforeParsing(FetchOptions fetchOptions)
Copyright © 2018 uniVocity Software Pty Ltd. All rights reserved.