Package org.htmlparser.visitors
Class TextExtractingVisitor
java.lang.Object
org.htmlparser.visitors.NodeVisitor
org.htmlparser.visitors.TextExtractingVisitor
Extracts text from a web page.
Usage:
Parser parser = new Parser(...);
TextExtractingVisitor visitor = new TextExtractingVisitor();
parser.visitAllNodesWith(visitor);
String textInPage = visitor.getExtractedText();
-
Constructor Summary
Constructors
TextExtractingVisitor()
-
Method Summary
Modifier and Type    Method                              Description
String               getExtractedText()
void                 visitEndTag(Tag tag)                Called for each Tag visited that is an end tag.
void                 visitStringNode(Text stringNode)    Called for each StringNode visited.
void                 visitTag(Tag tag)                   Called for each Tag visited.
Methods inherited from class org.htmlparser.visitors.NodeVisitor
beginParsing, finishedParsing, shouldRecurseChildren, shouldRecurseSelf, visitRemarkNode
-
Constructor Details
-
TextExtractingVisitor
public TextExtractingVisitor()
-
-
Method Details
-
getExtractedText
public String getExtractedText()
-
visitStringNode
public void visitStringNode(Text stringNode)
Description copied from class: NodeVisitor
Called for each StringNode visited.
- Overrides:
visitStringNode in class NodeVisitor
- Parameters:
stringNode - The string node being visited.
-
visitTag
public void visitTag(Tag tag)
Description copied from class: NodeVisitor
Called for each Tag visited.
- Overrides:
visitTag in class NodeVisitor
- Parameters:
tag - The tag being visited.
-
visitEndTag
public void visitEndTag(Tag tag)
Description copied from class: NodeVisitor
Called for each Tag visited that is an end tag.
- Overrides:
visitEndTag in class NodeVisitor
- Parameters:
tag - The end tag being visited.
-
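Example (illustrative sketch)
The callbacks documented above follow the visitor pattern: the parser walks the node tree and calls visitTag, visitStringNode and visitEndTag as it goes, and a text-extracting visitor only needs to collect what arrives in visitStringNode. The sketch below shows that idea with small stand-in types so that it runs without the library. Every type whose name starts with Sketch, and the class TextExtractionSketch, are hypothetical and are not part of org.htmlparser; the real TextExtractingVisitor may do more than append raw text.

```java
// Hypothetical stand-in for a parsed node; not an org.htmlparser type.
interface SketchNode {
    void accept(SketchVisitor visitor);
}

// Base visitor with no-op callbacks, named after the methods documented above.
abstract class SketchVisitor {
    public void visitTag(String tagName) {}
    public void visitEndTag(String tagName) {}
    public void visitStringNode(String text) {}
}

// A run of text between tags.
final class SketchText implements SketchNode {
    private final String text;

    SketchText(String text) {
        this.text = text;
    }

    @Override
    public void accept(SketchVisitor visitor) {
        visitor.visitStringNode(text);
    }
}

// A tag with children: reports the start tag, then the children, then the end tag.
final class SketchTag implements SketchNode {
    private final String name;
    private final SketchNode[] children;

    SketchTag(String name, SketchNode... children) {
        this.name = name;
        this.children = children;
    }

    @Override
    public void accept(SketchVisitor visitor) {
        visitor.visitTag(name);
        for (SketchNode child : children) {
            child.accept(visitor);
        }
        visitor.visitEndTag(name);
    }
}

// Collects text from string nodes only; the tag callbacks stay as no-ops.
final class SketchTextExtractingVisitor extends SketchVisitor {
    private final StringBuilder buffer = new StringBuilder();

    @Override
    public void visitStringNode(String text) {
        buffer.append(text);
    }

    public String getExtractedText() {
        return buffer.toString();
    }
}

final class TextExtractionSketch {
    // Mirrors the usage pattern: create a visitor, walk the nodes, read the text.
    static String extract(SketchNode root) {
        SketchTextExtractingVisitor visitor = new SketchTextExtractingVisitor();
        root.accept(visitor);
        return visitor.getExtractedText();
    }

    public static void main(String[] args) {
        SketchNode page = new SketchTag("p",
                new SketchText("Hello, "),
                new SketchTag("b", new SketchText("world")),
                new SketchText("!"));
        System.out.println(extract(page)); // prints "Hello, world!"
    }
}
```

Keeping the tag callbacks empty is a simplification chosen for this sketch; a visitor that needs tag-aware behaviour would override visitTag and visitEndTag as well.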