de.fu_berlin.ties.xml.dom
Class XMLStripper
java.lang.Object
de.fu_berlin.ties.ConfigurableProcessor
de.fu_berlin.ties.TextProcessor
de.fu_berlin.ties.DocumentReader
de.fu_berlin.ties.xml.dom.XMLStripper
- All Implemented Interfaces:
- Processor
public class XMLStripper
- extends DocumentReader
An XML stripper converts an XML document to plain text, removing all markup.
This class is thread-safe and can be used to convert several documents
in parallel.
- Version:
- $Revision: 1.4 $, $Date: 2004/08/23 17:11:21 $, $Author: siefkes $
- Author:
- Christian Siefkes
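The thread-safety note above can be illustrated with a minimal sketch. It does not use XMLStripper itself: it assumes only the JDK's built-in DOM parser, and the names ParallelStripSketch, strip and stripAll are hypothetical. The point is that a converter that keeps no state between calls can be invoked from several threads at once.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical stand-in for a thread-safe stripper; not part of the TIES API.
final class ParallelStripSketch {

    // Stateless: each call creates its own parser, so calls may overlap freely.
    static String strip(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse XML input", e);
        }
    }

    // Converts all inputs on a fixed-size pool; results keep the input order.
    static List<String> stripAll(List<String> inputs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String xml : inputs) {
                Callable<String> task = () -> strip(xml);
                pending.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : pending) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while converting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("conversion failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

With the real class, the analogous pattern would be to create one XMLStripper and call its process method from each worker, giving every call its own Writer.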
Method Summary
void process(Document document, Writer writer, ContextMap context)
- Strips all markup from an XML document and stores the resulting plain
text.
XMLStripper
public XMLStripper()
- Creates a new instance, using a default extension and the
standard configuration.
XMLStripper
public XMLStripper(String outExt)
- Creates a new instance, using the
standard configuration.
- Parameters:
outExt
- the extension to use for output files
XMLStripper
public XMLStripper(String outExt,
TiesConfiguration config)
- Creates a new instance.
- Parameters:
outExt
- the extension to use for output files
config
- used to configure superclasses
process
public void process(Document document,
Writer writer,
ContextMap context)
throws IOException
- Strips all markup from an XML document and stores the resulting plain
text. This method simply delegates to
DOMUtils.collectText(org.dom4j.Branch, StringBuffer).
- Specified by:
process
in class DocumentReader
- Parameters:
document
- the document to read
writer
- the writer to write the resulting plain text to; flushed
but not closed by this method
context
- a map of objects that are made available for processing
- Throws:
IOException
- if an I/O error occurs
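The behaviour described for process can be approximated with the following sketch. It is an illustration under stated assumptions, not the TIES implementation: it uses the JDK's org.w3c.dom API instead of dom4j, a StringBuilder instead of a StringBuffer, and omits the ContextMap argument. The names StripSketch, collectText, process and strip are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical sketch of markup stripping; not part of the TIES API.
final class StripSketch {

    // Appends all character data below the node, in document order.
    // Comments and processing instructions have no children and are skipped.
    static void collectText(Node node, StringBuilder out) {
        short type = node.getNodeType();
        if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
            out.append(node.getNodeValue());
            return;
        }
        for (Node child = node.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            collectText(child, out);
        }
    }

    // Follows the documented contract: the writer is flushed but not closed.
    static void process(Document document, Writer writer) throws IOException {
        StringBuilder text = new StringBuilder();
        collectText(document, text);
        writer.write(text.toString());
        writer.flush();
    }

    // Convenience wrapper: parses a string and returns its plain text.
    static String strip(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringWriter writer = new StringWriter();
            process(document, writer);
            return writer.toString();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not strip XML input", e);
        }
    }

    public static void main(String[] args) {
        // prints "Hello, plain text!"
        System.out.println(strip("<p>Hello, <b>plain</b> text<!-- note -->!</p>"));
    }
}
```

Leaving the writer open lets a caller write several documents to the same stream and close it once at the end.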
Copyright © 2003-2004 Christian Siefkes. All Rights Reserved.