java.lang.Object
  de.fu_berlin.ties.xml.dom.TokenWalker
public class TokenWalker
Walks through a document, handing all textual tokens over to a TokenProcessor.
Instances of this class are thread-safe if and only if the provided TokenProcessor is, but subclass implementations might not be.
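The walking strategy described above can be illustrated with a minimal, self-contained sketch. It is not the TIES implementation: it assumes the JDK's `org.w3c.dom` API, splits text on whitespace instead of using a TextTokenizer, and replaces TokenProcessor with a hypothetical `SimpleTokenHandler` callback. All names in the block are illustrative only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class TokenWalkSketch {

    /** Hypothetical callback standing in for the role of a token processor. */
    public interface SimpleTokenHandler {
        void handle(Element element, String token);
    }

    /** Walks a whole document, starting at its root element. */
    public static void walk(Document document, SimpleTokenHandler handler) {
        walk(document.getDocumentElement(), handler);
    }

    /** Tokenizes the text of an element and recurses into nested elements. */
    static void walk(Element element, SimpleTokenHandler handler) {
        StringBuilder collected = new StringBuilder();
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            short type = child.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                collected.append(child.getNodeValue());
            } else if (type == Node.ELEMENT_NODE) {
                // Mixed content: flush the text seen so far, then descend,
                // so that tokens are reported in document order.
                processCollectedText(element, collected, handler);
                collected.setLength(0);
                walk((Element) child, handler);
            }
        }
        processCollectedText(element, collected, handler);
    }

    /** Assumption: whitespace-separated tokens; a real tokenizer may differ. */
    static void processCollectedText(Element element, CharSequence text,
                                     SimpleTokenHandler handler) {
        for (String token : text.toString().trim().split("\\s+")) {
            if (!token.isEmpty()) {
                handler.handle(element, token);
            }
        }
    }

    /** Walks a small sample document and records "tagName:token" pairs. */
    public static List<String> demo() {
        List<String> seen = new ArrayList<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader("<p>Hello <b>big</b> world</p>")));
            walk(doc, (element, token) -> seen.add(element.getTagName() + ":" + token));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the sample document", e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [p:Hello, b:big, p:world]
    }
}
```

The sketch keeps no mutable state between calls, so whether it can be shared across threads depends only on the handler passed in, which mirrors the thread-safety condition stated above.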
Constructor Summary | |
---|---|
TokenWalker(TokenProcessor processor, TokenizerFactory tFactory) | Creates a new instance. |
Method Summary | | |
---|---|---|
protected void | processCollectedText(Element element, CharSequence collectedText, TokenCounter tokenCounter, TextTokenizer tokenizer, ContextMap context) | Helper method that tokenizes the collected textual contents of an element and delegates to the token processor for each token. |
protected void | processToken(Element element, String left, TokenDetails details, String right, ContextMap context) | Processes a token in an XML element by delegating to the configured TokenProcessor. |
String | toString() | Returns a string representation of this object. |
void | walk(Document document, ContextMap context) | Walks through the contents of an XML document, tokenizing the textual contents. |
protected void | walk(Element element, TokenCounter tokenCounter, TextTokenizer tokenizer, ContextMap context) | Walks through the contents of a node, tokenizing textual contents and recursing through nested elements. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
public TokenWalker(TokenProcessor processor, TokenizerFactory tFactory)
Creates a new instance.
Parameters:
processor - used to process the tokens
tFactory - used to instantiate tokenizers
Method Detail |
---|
protected void processCollectedText(Element element, CharSequence collectedText, TokenCounter tokenCounter, TextTokenizer tokenizer, ContextMap context) throws IOException, ProcessingException
Helper method that tokenizes the collected textual contents of an element and delegates to the token processor for each token.
Parameters:
element - the element to walk through
collectedText - the collected textual contents (limited to the text between/before/after child elements in case of mixed content)
tokenCounter - keeps track of the encountered tokens
tokenizer - used to tokenize text
context - a map of objects that are made available for processing
Throws:
IOException - might be thrown by the token processor
ProcessingException - might be thrown by the token processor

protected void processToken(Element element, String left, TokenDetails details, String right, ContextMap context) throws IOException, ProcessingException
Processes a token in an XML element by delegating to the configured TokenProcessor.
Parameters:
element - the element containing the token
left - the textual contents of the element to the left of the token (in case of mixed contents, only up to the last preceding child element, if any)
details - details about the token to process
right - the textual contents of the element to the right of the token (in case of mixed contents, only up to the next following child element, if any)
context - a map of objects that are made available for processing
Throws:
IOException - if an I/O error occurs
ProcessingException - if an error occurs during processing

public void walk(Document document, ContextMap context) throws IOException, ProcessingException
Walks through the contents of an XML document, tokenizing the textual contents and handing each token over to the TokenProcessor.
Parameters:
document - the document to walk through
context - a map of objects that are made available for processing; might be null if not required by the token processor
Throws:
IOException - might be thrown by the token processor
ProcessingException - might be thrown by the token processor

protected void walk(Element element, TokenCounter tokenCounter, TextTokenizer tokenizer, ContextMap context) throws IOException, ProcessingException
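Walks through the contents of a node, tokenizing textual contents and recursing through nested elements.
Parameters:
element - the element to walk through
tokenCounter - keeps track of the encountered tokens
tokenizer - used to tokenize text
context - a map of objects that are made available for processing
Throws:
IOException - might be thrown by the token processor
ProcessingException - might be thrown by the token processor

public String toString()
Returns a string representation of this object.
Overrides:
toString in class Object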
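
The left and right arguments documented for processToken can be pictured with a small sketch. It is an illustration under assumptions, not the library's code: tokens are taken to be runs of non-whitespace characters, the input stands for one collected text segment (the text between two child elements in mixed content), and the hypothetical `TokenInContext` holder replaces the real TokenDetails type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenContextSketch {

    /** Hypothetical holder for a token and the text on either side of it. */
    public static final class TokenInContext {
        public final String left;
        public final String token;
        public final String right;

        TokenInContext(String left, String token, String right) {
            this.left = left;
            this.token = token;
            this.right = right;
        }
    }

    // Assumption: a token is any run of non-whitespace characters.
    private static final Pattern TOKEN = Pattern.compile("\\S+");

    /** Splits one text segment into tokens, each with its left and right context. */
    public static List<TokenInContext> split(CharSequence segment) {
        List<TokenInContext> result = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(segment);
        while (matcher.find()) {
            // Context never reaches beyond the segment, which models the
            // "only up to the preceding/following child element" rule.
            String left = segment.subSequence(0, matcher.start()).toString();
            String right = segment.subSequence(matcher.end(), segment.length()).toString();
            result.add(new TokenInContext(left, matcher.group(), right));
        }
        return result;
    }

    public static void main(String[] args) {
        for (TokenInContext t : split("to be or")) {
            System.out.println("[" + t.left + "] " + t.token + " [" + t.right + "]");
        }
    }
}
```

For the segment `"to be or"`, the middle token `be` is reported with `"to "` as its left context and `" or"` as its right context.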