public abstract class HtmlParserListener extends Object
An abstract class that is used by the HtmlParser to provide information about events that occur during the parsing process.
Important: This listener is used in a concurrent environment. If you assign the same instance to multiple entities, make sure your implementation is thread-safe, or limit the number of threads used when parsing to 1 with HtmlParserSettings.setParserThreadCount(int).
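As an illustration of the thread-safety note above, here is a minimal sketch of a listener whose only state is a pair of atomic counters, so one instance could be shared by several entities. The `HtmlElement`, `HtmlParsingContext` and `HtmlParserListener` declarations in the block are bare stand-ins written so the sketch compiles on its own; they are not the library's classes. `CountingListener` and its accessor names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

// Stand-ins for the library types so this sketch compiles on its own.
// They are NOT the real classes; real code would use the library's own.
class HtmlElement { }

class HtmlParsingContext { }

abstract class HtmlParserListener {
    public void parsingStarted(HtmlParsingContext context) { }
    public void elementVisited(HtmlElement element, HtmlParsingContext context) { }
    public void elementMatched(HtmlElement element, HtmlParsingContext context) { }
    public void parsingEnded(HtmlParsingContext context) { }
}

// Hypothetical listener that keeps only atomic counters, so a single
// instance can receive callbacks from several threads at once.
class CountingListener extends HtmlParserListener {
    private final AtomicLong visited = new AtomicLong();
    private final AtomicLong matched = new AtomicLong();

    @Override
    public void elementVisited(HtmlElement element, HtmlParsingContext context) {
        visited.incrementAndGet();
    }

    @Override
    public void elementMatched(HtmlElement element, HtmlParsingContext context) {
        matched.incrementAndGet();
    }

    long visitedCount() { return visited.get(); }

    long matchedCount() { return matched.get(); }
}
```

If a listener needs state that is harder to make thread-safe, the other option named above, limiting parsing to one thread with HtmlParserSettings.setParserThreadCount(int), may be the simpler choice.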
See Also: HtmlParser, HtmlParsingContext, HtmlElement

| Constructor and Description |
|---|
| HtmlParserListener() |

| Modifier and Type | Method and Description |
|---|---|
| void | elementMatched(HtmlElement element, HtmlParsingContext context): A method that runs when an HTML element is matched based on the path set when creating a field in the corresponding HtmlEntitySettings. |
| void | elementVisited(HtmlElement element, HtmlParsingContext context): A method that runs every time the HtmlParser visits an HTML element in an HTML document. |
| void | parsingEnded(HtmlParsingContext context): A method that runs when the parsing process has ended. |
| void | parsingStarted(HtmlParsingContext context): A method that runs when the HtmlParser begins parsing a web page. |

public void parsingStarted(HtmlParsingContext context)
A method that runs when the HtmlParser begins parsing a web page. Note that if the current entity is associated with an HtmlLinkFollower, this method will be called every time the link follower opens a new web page.
context - the HtmlParsingContext used by the HtmlParser during the parsing process of a single web page.
public void elementVisited(HtmlElement element, HtmlParsingContext context)
A method that runs every time the HtmlParser visits an HTML element in an HTML document.
element - the element that was visited. Note that only elements with tags are visited; text nodes do not trigger this method. The element will be destroyed after parsingEnded(HtmlParsingContext) is called.
context - the HtmlParsingContext used by the HtmlParser during the parsing process.
public void elementMatched(HtmlElement element, HtmlParsingContext context)
A method that runs when an HTML element is matched based on the path set when creating a field in the corresponding HtmlEntitySettings.
element - the element that was matched. The element will be destroyed after parsingEnded(HtmlParsingContext) is called.
context - the HtmlParsingContext used by the HtmlParser during the parsing process.
public void parsingEnded(HtmlParsingContext context)
A method that runs when the parsing process has ended. Note that if the current entity is associated with an HtmlLinkFollower, this method will be called every time the processing of a linked web page stops. Any HtmlElement you may have collected from the other methods will be destroyed after this method executes.
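Because collected elements do not survive this method, one possible approach is to copy the values you need out of each element as it arrives, instead of storing the element itself. The sketch below assumes that approach. The three library type declarations are bare stand-ins so the block compiles on its own, and the `text()` accessor on the stand-in `HtmlElement` is an assumption made for illustration, not a confirmed library method. `MatchedTextCollector` and `snapshot()` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Stand-ins for the library types so this sketch compiles on its own.
// They are NOT the real classes; real code would use the library's own.
class HtmlElement {
    private final String text;

    HtmlElement(String text) { this.text = text; }

    // Assumed accessor, for illustration only.
    String text() { return text; }
}

class HtmlParsingContext { }

abstract class HtmlParserListener {
    public void parsingStarted(HtmlParsingContext context) { }
    public void elementVisited(HtmlElement element, HtmlParsingContext context) { }
    public void elementMatched(HtmlElement element, HtmlParsingContext context) { }
    public void parsingEnded(HtmlParsingContext context) { }
}

// Hypothetical listener that stores plain strings rather than elements,
// so nothing it holds depends on an element after parsingEnded returns.
class MatchedTextCollector extends HtmlParserListener {
    // Synchronized list, because callbacks may arrive from several threads.
    private final List<String> texts =
            Collections.synchronizedList(new ArrayList<>());

    @Override
    public void elementMatched(HtmlElement element, HtmlParsingContext context) {
        // Copy the value now; do not keep a reference to the element.
        texts.add(element.text());
    }

    // Returns an independent copy of the text collected so far.
    List<String> snapshot() {
        synchronized (texts) {
            return new ArrayList<>(texts);
        }
    }
}
```

When an HtmlLinkFollower is involved, parsingStarted and parsingEnded run once per page, so a collector like this accumulates values across all pages unless it is cleared in one of those callbacks.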
context - the HtmlParsingContext used by the HtmlParser during the parsing process.
Copyright © 2018 uniVocity Software Pty Ltd. All rights reserved.