java.lang.Object
  com.google.gwt.thirdparty.streamhtmlparser.util.HtmlUtils
public final class HtmlUtils
Utility functions for HTML and Javascript that are most likely not interesting to users outside this package.
The HtmlParser will be open-sourced, hence we decided to keep these utilities in this package rather than leverage others that may exist in the google3 code base.
The functionality exposed is designed to be 100% compatible with the corresponding logic in the C-version of the HtmlParser; as such, we are particularly concerned with cross-language compatibility.
Note: The words Javascript and ECMAScript are used interchangeably unless otherwise noted.
| Nested Class Summary | |
|---|---|
| static class | HtmlUtils.META_REDIRECT_TYPE: Indicates the type of content contained in the content HTML attribute of the meta HTML tag. |
| Method Summary | |
|---|---|
| static java.lang.String | encodeCharForAscii(char chr): Encodes the specified character using Ascii for convenient insertion into a single-quote enclosed String. |
| static boolean | isAttributeJavascript(java.lang.String attribute): Determines if the HTML attribute specified expects javascript for its value. |
| static boolean | isAttributeStyle(java.lang.String attribute): Determines if the HTML attribute specified expects a style for its value. |
| static boolean | isAttributeUri(java.lang.String attribute): Determines if the HTML attribute specified expects a URI for its value. |
| static boolean | isHtmlSpace(char chr): Determines if the specified character is an HTML whitespace character. |
| static boolean | isJavascriptIdentifier(char chr): Determines if the specified character is a valid character in an ECMAScript identifier. |
| static boolean | isJavascriptRegexpPrefix(java.lang.String input): Determines if the input token provided is a valid token prefix to a javascript regular expression. |
| static boolean | isJavascriptWhitespace(char chr): Determines if the specified character is an ECMAScript whitespace or line terminator character. |
| static HtmlUtils.META_REDIRECT_TYPE | parseContentAttributeForUrl(java.lang.String value): Parses the given String to determine if it contains a URL in the format followed by the content attribute of the meta HTML tag. |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Method Detail |
|---|
public static boolean isAttributeJavascript(java.lang.String attribute)
Determines if the HTML attribute specified expects javascript for its value, as is the case with, for example, the onclick attribute.
Currently returns true for any attribute name that starts with "on". This is not exactly correct, but we trust a developer not to use non-spec compliant attribute names (e.g. onbogus).
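The prefix rule described above can be sketched as follows. This is a minimal illustration, not the actual implementation: the class and method names are hypothetical, and the sketch does no case normalization because the page does not say how case is handled.

```java
// Hypothetical helper; not part of the documented HtmlUtils API.
final class EventAttributeSketch {
  private EventAttributeSketch() {}

  // Returns true for any non-null name that starts with "on".
  // Case handling is an assumption: the name is compared as given.
  static boolean looksLikeJavascriptAttribute(String attribute) {
    return attribute != null && attribute.startsWith("on");
  }

  public static void main(String[] args) {
    System.out.println(looksLikeJavascriptAttribute("onclick"));
    System.out.println(looksLikeJavascriptAttribute("onbogus"));
    System.out.println(looksLikeJavascriptAttribute("href"));
  }
}
```

Note that a non-spec name such as onbogus is accepted by this rule, which is the imprecision the description acknowledges.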
Parameters: attribute - the name of an HTML attribute
Returns: false if the input is null or is not an attribute that expects javascript code; true otherwise
public static boolean isAttributeStyle(java.lang.String attribute)
Determines if the HTML attribute specified expects a style for its value. Currently this is only true for the style HTML attribute.
Parameters: attribute - the name of an HTML attribute
Returns: true iff the attribute name is one that expects a style for a value; otherwise false
public static boolean isAttributeUri(java.lang.String attribute)
Determines if the HTML attribute specified expects a URI for its value. For example, both href and src expect a URI but style does not. Returns false if the attribute given was null.
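One plausible way to express this check is a lookup in a fixed set of attribute names. The sketch below assumes that approach. The class name, method name and set contents are hypothetical, and only href and src are listed because they are the only URI attributes this page names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper; not part of the documented HtmlUtils API.
final class UriAttributeSketch {
  private UriAttributeSketch() {}

  // Illustrative subset only; the full list of URI attributes is not given here.
  private static final Set<String> URI_ATTRIBUTES =
      Collections.unmodifiableSet(new HashSet<>(Arrays.asList("href", "src")));

  // Returns false for null, matching the documented behaviour.
  static boolean expectsUri(String attribute) {
    return attribute != null && URI_ATTRIBUTES.contains(attribute);
  }

  public static void main(String[] args) {
    System.out.println(expectsUri("href"));
    System.out.println(expectsUri("style"));
  }
}
```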
Parameters: attribute - the name of an HTML attribute
Returns: true if the attribute name is one that expects a URI for a value; otherwise false
See Also: ATTRIBUTE_EXPECTS_URI
public static boolean isHtmlSpace(char chr)
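Determines if the specified character is an HTML whitespace character, that is, one of:
- Space character
- Tab character
- Line feed character
- Carriage Return character
- Zero-Width Space character
Note: the list includes the Zero-Width Space character (U+200B), which is not included in the C version.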
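The list above maps directly onto a switch over the five characters. A minimal sketch, assuming nothing beyond that list; the class name is hypothetical and the method is a stand-in for the documented one, not its actual source.

```java
// Hypothetical helper; mirrors the five characters listed above.
final class HtmlSpaceSketch {
  private HtmlSpaceSketch() {}

  static boolean isHtmlSpace(char chr) {
    switch (chr) {
      case ' ':     // Space
      case '\t':    // Tab
      case '\n':    // Line feed
      case '\r':    // Carriage Return
      case 0x200B:  // Zero-Width Space (not in the C version)
        return true;
      default:
        return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(isHtmlSpace(' '));
    System.out.println(isHtmlSpace((char) 0x200B));
    System.out.println(isHtmlSpace('a'));
  }
}
```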
Parameters: chr - the char to check
Returns: true if the character is an HTML whitespace character
See Also: White space
public static boolean isJavascriptWhitespace(char chr)
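Determines if the specified character is an ECMAScript whitespace or line terminator character, that is, one of:
- a whitespace character (Tab, Vertical Tab, Form Feed, Space, No-break space)
- a line terminator character (Line Feed, Carriage Return, Line separator, Paragraph Separator).
Encompasses the characters in sections 7.2 and 7.3 of ECMAScript 3. In particular, this list is quite different from that in Character.isWhitespace.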
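To make the difference from Character.isWhitespace concrete, the sketch below spells out the nine characters named above by code point. The class and method names are hypothetical, and this is an illustration of the list rather than the actual source.

```java
// Hypothetical helper; enumerates the characters named in the description.
final class JsWhitespaceSketch {
  private JsWhitespaceSketch() {}

  static boolean isJsWhitespace(char chr) {
    switch (chr) {
      case 0x0009:  // Tab
      case 0x000B:  // Vertical Tab
      case 0x000C:  // Form Feed
      case 0x0020:  // Space
      case 0x00A0:  // No-break space
      case 0x000A:  // Line Feed
      case 0x000D:  // Carriage Return
      case 0x2028:  // Line separator
      case 0x2029:  // Paragraph Separator
        return true;
      default:
        return false;
    }
  }

  public static void main(String[] args) {
    char nbsp = (char) 0x00A0;
    System.out.println(isJsWhitespace(nbsp));
    System.out.println(Character.isWhitespace(nbsp));
  }
}
```

The No-break space is one point where the two definitions disagree: it is in the list above, while Character.isWhitespace explicitly excludes non-breaking spaces.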
ECMAScript Language Specification
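Parameters: chr - the char to check
Returns: true if the character is an ECMAScript whitespace or line terminator character; otherwise false
public static boolean isJavascriptIdentifier(char chr)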
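Determines if the specified character is a valid character in an ECMAScript identifier. An alternative would be to use Character.isJavaIdentifierStart and Character.isJavaIdentifierPart, given that Java and Javascript follow similar identifier naming rules, but we would lose compatibility with the C-version.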
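For reference, the two standard Character methods named above behave as sketched here when applied to a whole token. This only illustrates the Java-based alternative, which the description says would break compatibility with the C-version; it is not the check this method performs. The class and method names are hypothetical.

```java
// Hypothetical helper showing the Java identifier rules on a whole token.
final class IdentifierSketch {
  private IdentifierSketch() {}

  // First character must be a valid identifier start; the rest valid identifier parts.
  static boolean isJavaStyleIdentifier(String s) {
    if (s == null || s.isEmpty()) {
      return false;
    }
    if (!Character.isJavaIdentifierStart(s.charAt(0))) {
      return false;
    }
    for (int i = 1; i < s.length(); i++) {
      if (!Character.isJavaIdentifierPart(s.charAt(i))) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(isJavaStyleIdentifier("$foo"));
    System.out.println(isJavaStyleIdentifier("1abc"));
  }
}
```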
Parameters: chr - char to check
Returns: true if the chr is a valid Javascript identifier character; otherwise false
public static boolean isJavascriptRegexpPrefix(java.lang.String input)
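Determines if the input token provided is a valid token prefix to a javascript regular expression. Checks the token against a Set of identifiers that can precede a regular expression in the javascript grammar, and returns true if the provided String is in that Set.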
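The Set lookup described above might look like the sketch below. The class name, method name and token list are hypothetical: the tokens are a small illustrative sample of keywords after which a slash starts a regular expression, not the actual Set. Returning false for null is also an assumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper; not part of the documented HtmlUtils API.
final class RegexpPrefixSketch {
  private RegexpPrefixSketch() {}

  // Illustrative sample only; the real Set is not listed in this page.
  private static final Set<String> PREFIX_TOKENS =
      Collections.unmodifiableSet(new HashSet<>(
          Arrays.asList("return", "typeof", "case", "in", "delete", "void", "throw")));

  // Null handling is an assumption of this sketch.
  static boolean isRegexpPrefix(String input) {
    return input != null && PREFIX_TOKENS.contains(input);
  }

  public static void main(String[] args) {
    System.out.println(isRegexpPrefix("return"));
    System.out.println(isRegexpPrefix("foo"));
  }
}
```

The lookup matters because a slash after a token such as return begins a regular expression, while a slash after an ordinary identifier is a division operator.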
Parameters: input - the String token to check
Returns: true iff the token is a valid prefix of a regexp
public static java.lang.String encodeCharForAscii(char chr)
Encodes the specified character using Ascii for convenient insertion into a single-quote enclosed String. Printable characters are returned as-is. Carriage Return, Line Feed, Horizontal Tab, back-slash and single quote are all backslash-escaped. All other characters are returned hex-encoded.
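The three cases above (as-is, backslash-escaped, hex-encoded) can be sketched as follows. Two details are assumptions of this sketch, since the page does not state them: "printable" is taken to mean the Ascii range 0x20 to 0x7E, and the hex form is a backslash, the letter u and four lowercase hex digits. The class and method names are hypothetical.

```java
// Hypothetical helper; not part of the documented HtmlUtils API.
final class AsciiEncodeSketch {
  private AsciiEncodeSketch() {}

  static String encode(char chr) {
    switch (chr) {
      case '\'': return "\\'";   // single quote
      case '\\': return "\\\\";  // back-slash
      case '\n': return "\\n";   // Line Feed
      case '\r': return "\\r";   // Carriage Return
      case '\t': return "\\t";   // Horizontal Tab
      default:
        // Assumed printable range: 0x20 through 0x7E.
        if (chr >= 0x20 && chr <= 0x7E) {
          return String.valueOf(chr);
        }
        // Assumed hex form: four zero-padded hex digits after a backslash and 'u'.
        return String.format("\\u%04x", (int) chr);
    }
  }

  public static void main(String[] args) {
    System.out.println(encode('a'));
    System.out.println(encode('\''));
    System.out.println(encode((char) 0x00E9));
  }
}
```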
Parameters: chr - char to encode
Returns: the encoded form of the given char
public static HtmlUtils.META_REDIRECT_TYPE parseContentAttributeForUrl(java.lang.String value)
Parses the given String to determine if it contains a URL in the format followed by the content attribute of the meta HTML tag.
This function expects to receive the value of the content HTML
attribute. This attribute takes on different meanings depending on the
value of the http-equiv HTML attribute of the same meta
tag. Since we may not have access to the http-equiv attribute,
we instead rely on parsing the given value to determine if it contains
a URL.
The specification of the meta HTML tag can be found in:
http://dev.w3.org/html5/spec/Overview.html#attr-meta-http-equiv-refresh
We return HtmlUtils.META_REDIRECT_TYPE indicating whether the
value contains a URL and whether we are at the start of the URL or past
the start. We are at the start of the URL if and only if one of the two
conditions below is true:
Examples:
- meta tag where the content attribute contains a URL [we are not at the start of the URL]: <meta http-equiv="refresh" content="5; URL=http://www.google.com">
- meta tag where the content attribute contains a URL [we are at the start of the URL]: <meta http-equiv="refresh" content="5; URL=">
- meta tag where the content attribute does not contain a URL: <meta http-equiv="content-type" content="text/html">
Parameters: value - String to parse
Returns: HtmlUtils.META_REDIRECT_TYPE indicating the presence of a URL in the given value
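The three examples above can be reproduced with a single prefix pattern, as in the sketch below. Everything here is an assumption made for illustration: the class, the enum and its constant names are hypothetical, the pattern is a guess at the refresh format, and "at the start of the URL" is taken to mean that nothing follows the URL= prefix, which is what the second example suggests.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not part of the documented HtmlUtils API.
final class MetaRefreshSketch {
  private MetaRefreshSketch() {}

  // Hypothetical stand-in for the documented return type.
  enum RedirectKind { NONE, URL_START, URL }

  // Assumed shape: optional delay, a semicolon, then URL= and an optional quote.
  private static final Pattern URL_PREFIX =
      Pattern.compile("\\s*\\d*\\s*;\\s*URL\\s*=\\s*['\"]?", Pattern.CASE_INSENSITIVE);

  static RedirectKind classify(String value) {
    if (value == null) {
      return RedirectKind.NONE;
    }
    Matcher m = URL_PREFIX.matcher(value);
    if (!m.lookingAt()) {
      return RedirectKind.NONE;  // no URL= prefix at the beginning of the value
    }
    // Prefix consumed the whole value: the URL has not started yet.
    return m.end() == value.length() ? RedirectKind.URL_START : RedirectKind.URL;
  }

  public static void main(String[] args) {
    System.out.println(classify("5; URL=http://www.google.com"));
    System.out.println(classify("5; URL="));
    System.out.println(classify("text/html"));
  }
}
```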