Package org.opencms.util
Interface I_CmsHtmlNodeVisitor
All Known Implementing Classes:
CmsHtml2TextConverter, CmsHtmlDecorator, CmsHtmlParser, CmsLinkProcessor
public interface I_CmsHtmlNodeVisitor
Interface combining a visitor of HTML documents with the hook to start the parser / lexer that triggers the visit.
Since:
6.1.3
Method Summary
Modifier and Type  Method
    Description
java.lang.String  getConfiguration()
    Returns the configuration String of this visitor, or the empty String if none was provided before.
java.lang.String  getResult()
    Returns the text extraction result.
java.lang.String  process(java.lang.String html, java.lang.String encoding)
    Extracts the text from the given HTML content, assuming the given HTML encoding.
void  setConfiguration(java.lang.String configuration)
    Sets a configuration String for this visitor.
void  setNoAutoCloseTags(java.util.List<java.lang.String> noAutoCloseTags)
    Sets a list of upper case tag names for which parsing / visiting should not correct missing closing tags.
void  visitEndTag(org.htmlparser.Tag tag)
    Visitor method (callback) invoked when a closing Tag is encountered.
void  visitRemarkNode(org.htmlparser.Remark remark)
    Visitor method (callback) invoked when a remark Tag (HTML comment) is encountered.
void  visitStringNode(org.htmlparser.Text text)
    Visitor method (callback) invoked when a text node is encountered.
void  visitTag(org.htmlparser.Tag tag)
    Visitor method (callback) invoked when a starting Tag is encountered.
Method Detail

getConfiguration
java.lang.String getConfiguration()
Returns the configuration String of this visitor, or the empty String if none was provided before.
Returns:
the configuration String of this visitor; by contract never null, but an empty String if none was provided.
See Also:
setConfiguration(String)
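The never-null contract described above can be honored with a small guard in the getter. The following is a minimal sketch under stated assumptions: `ConfigurableVisitor` and `SimpleVisitor` are hypothetical stand-ins that mirror only the two configuration methods of this interface, so the example compiles without the OpenCms and HTML Parser libraries. It is not how any of the known implementing classes are actually written.

```java
class ConfigurationContractSketch {

    /** Hypothetical stand-in for the two configuration methods of I_CmsHtmlNodeVisitor. */
    interface ConfigurableVisitor {
        String getConfiguration();
        void setConfiguration(String configuration);
    }

    /** Hypothetical implementation that never returns null from getConfiguration(). */
    static class SimpleVisitor implements ConfigurableVisitor {

        private String configuration; // stays null until setConfiguration is called

        @Override
        public String getConfiguration() {
            // Map "not provided" (null) to the empty String, as the contract requires.
            return configuration == null ? "" : configuration;
        }

        @Override
        public void setConfiguration(String configuration) {
            this.configuration = configuration;
        }
    }

    public static void main(String[] args) {
        SimpleVisitor visitor = new SimpleVisitor();
        System.out.println(visitor.getConfiguration().isEmpty()); // true: nothing set yet
        visitor.setConfiguration("mode=strict"); // made-up configuration value
        System.out.println(visitor.getConfiguration()); // mode=strict
    }
}
```

Callers can then test the returned value with `isEmpty()` instead of a null check.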

getResult
java.lang.String getResult()
Returns the text extraction result.
Returns:
the text extraction result

process
java.lang.String process(java.lang.String html, java.lang.String encoding) throws org.htmlparser.util.ParserException
Extracts the text from the given HTML content, assuming the given HTML encoding.
Parameters:
html - the content to extract the plain text from
encoding - the encoding to use
Returns:
the text extracted from the given HTML content
Throws:
org.htmlparser.util.ParserException - if parsing the HTML content fails

setConfiguration
void setConfiguration(java.lang.String configuration)
Sets a configuration String for this visitor. This will most likely be done with data from an XSD, a custom JSP tag, ...
Parameters:
configuration - the configuration of this visitor to set.

setNoAutoCloseTags
void setNoAutoCloseTags(java.util.List<java.lang.String> noAutoCloseTags)
Sets a list of upper case tag names for which parsing / visiting should not correct missing closing tags. This has to be called before process(String, String) is invoked to take effect.
Parameters:
noAutoCloseTags - the list of upper case tag names for which parsing / visiting should not correct missing closing tags.
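The ordering requirement (set the tag list first, then call process) can be illustrated with a toy model. This is only a sketch: `ToyVisitor` is a hypothetical stand-in, not an OpenCms class, and its "correction" simply appends a closing tag for every opening tag whose name has no closing tag anywhere in the input. The real parser's behavior is not documented here and is probably more involved. The `encoding` argument is ignored in this toy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NoAutoCloseSketch {

    /** Hypothetical toy visitor: mirrors three method names of the interface only. */
    static class ToyVisitor {

        private List<String> noAutoCloseTags = new ArrayList<>();
        private String result = "";

        void setNoAutoCloseTags(List<String> tags) {
            noAutoCloseTags = new ArrayList<>(tags);
        }

        String process(String html, String encoding) {
            String upperHtml = html.toUpperCase(Locale.ROOT);
            StringBuilder out = new StringBuilder(html);
            // Matches opening tags only; "</p>" does not match because '/' is not a letter.
            Matcher m = Pattern.compile("<([a-zA-Z][a-zA-Z0-9]*)[^>]*>").matcher(html);
            while (m.find()) {
                String name = m.group(1);
                String upperName = name.toUpperCase(Locale.ROOT);
                boolean closed = upperHtml.contains("</" + upperName + ">");
                // The tag list is consulted here, at processing time.
                if (!closed && !noAutoCloseTags.contains(upperName)) {
                    out.append("</").append(name).append(">");
                }
            }
            result = out.toString();
            return result;
        }

        String getResult() {
            return result;
        }
    }

    public static void main(String[] args) {
        ToyVisitor visitor = new ToyVisitor();
        System.out.println(visitor.process("<p>Hello", "UTF-8")); // <p>Hello</p>
        visitor.setNoAutoCloseTags(Arrays.asList("P"));
        System.out.println(visitor.getResult()); // still <p>Hello</p>: set too late for that run
        System.out.println(visitor.process("<p>Hello", "UTF-8")); // <p>Hello
    }
}
```

Because the list is only read while processing runs, changing it afterwards leaves an already computed result untouched, which is why the setter has to come first. Note also that the names are upper case (`"P"`, not `"p"`).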

visitEndTag
void visitEndTag(org.htmlparser.Tag tag)
Visitor method (callback) invoked when a closing Tag is encountered.
Parameters:
tag - the tag that is ended.
See Also:
NodeVisitor.visitEndTag(org.htmlparser.Tag)

visitRemarkNode
void visitRemarkNode(org.htmlparser.Remark remark)
Visitor method (callback) invoked when a remark Tag (HTML comment) is encountered.
Parameters:
remark - the remark Tag to visit.
See Also:
NodeVisitor.visitRemarkNode(org.htmlparser.Remark)

visitStringNode
void visitStringNode(org.htmlparser.Text text)
Visitor method (callback) invoked when a text node is encountered.
Parameters:
text - the text that is visited.
See Also:
NodeVisitor.visitStringNode(org.htmlparser.Text)

visitTag
void visitTag(org.htmlparser.Tag tag)
Visitor method (callback) invoked when a starting Tag is encountered.
Parameters:
tag - the tag that is visited.
See Also:
NodeVisitor.visitTag(org.htmlparser.Tag)