java.lang.Object
  de.fu_berlin.ties.xml.dom.DOMUtils
public final class DOMUtils
A static class that provides utility constants and methods for working with DOM-like XML representations, focussing especially on dom4j. No instances of this class can be created; only the static members should be used.
Method Summary | |
---|---|
static Attribute | attributeByName(Element element, String name): Returns the attribute with the given name, compatible with the name format returned by name(Attribute). |
static String | collectText(Branch branch): Recursively collects the complete textual content of a branch. |
static void | collectText(Branch branch, StringBuilder appender): Recursively collects the complete textual content of a branch. |
static void | collectText(Branch branch, Writer writer): Recursively collects the complete textual content of a branch. |
static OutputFormat | createDefaultOutFormat(): Creates the default output format used by this class for storing XML. |
static QName | defaultName(String localName): Converts a local name into a qualified name in the default namespace. |
static void | deleteAllAttributes(Element element, boolean recurse): Deletes all attributes of an element and optionally of all its descendants. |
static List | elementsByName(Element element, String name): Returns the child elements with the given name, compatible with the name format returned by name(Element). |
static String | name(Attribute attrib): Static method that returns a String representing the name of an attribute in an XML document. |
static String | name(Element element): Static method that returns a String representing the name of an element in an XML document. |
static Document | readDocument(File file): Reads an XML document from a local file. |
static Document | readDocument(File file, Configuration config): Reads an XML document from a local file, using a configured charset. |
static Document | readDocument(File file, String charset): Reads an XML document from a local file, using a given charset. |
static Document | readDocument(InputStream in): Reads an XML document from a given stream. |
static Document | readDocument(Reader reader): Reads an XML document from a given reader. |
static String | showElement(Element element): Builds a simple partial representation of an element, containing the name of the element and its normalized and shortened textual content. |
static String | showToken(Element element, String token): Builds a simple partial representation of a textual token in an element, containing the name of the element and the normalized and shortened text of the token. |
static void | writeDocument(Document document, File file, TiesConfiguration config, String suffix): Writes an XML document to a file, consulting a given configuration about whether to use compression. |
static void | writeDocument(Document document, OutputStream out): Writes an XML document to a given stream. |
static void | writeDocument(Document document, OutputStreamWriter writer): Writes an XML document to a given writer, using the character set of the underlying output stream. |
static void | writeDocument(Document document, Writer writer, String charset): Writes an XML document to a given writer, using the given character set. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Method Detail |
---|
public static Attribute attributeByName(Element element, String name)
Returns the attribute with the given name, compatible with the name format returned by name(Attribute). If there is more than one attribute with the given name (e.g. in different namespaces), the first one is returned.
Parameters:
element - the element whose attribute to return
name - the name of the attribute, compatible with the name format returned by name(Attribute)
Returns:
the attribute with the given name, or null if none exists

public static String collectText(Branch branch)
Recursively collects the complete textual content of a branch.
Parameters:
branch - the branch to recurse
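
The recursion behind collectText can be illustrated without dom4j. The following is a minimal sketch, not the DOMUtils implementation: it uses the JDK's org.w3c.dom API as a stand-in for dom4j's Branch, and the class name CollectTextSketch and its parse helper are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

/** Hypothetical stand-in for illustration; not part of DOMUtils. */
final class CollectTextSketch {

    /** Parses an XML string with the JDK parser, wrapping checked exceptions. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /** Appends the text of a node and of all its descendants, in document order. */
    static void collectText(Node node, StringBuilder appender) {
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            short type = child.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                appender.append(child.getNodeValue());
            } else if (type == Node.ELEMENT_NODE) {
                collectText(child, appender); // recurse into child elements
            }
            // comments and processing instructions are skipped (an assumption)
        }
    }

    /** Convenience variant that returns the collected text as a String. */
    static String collectText(Node node) {
        StringBuilder appender = new StringBuilder();
        collectText(node, appender);
        return appender.toString();
    }

    public static void main(String[] args) {
        Document doc = parse("<p>Hello <b>dom4j</b> world</p>");
        System.out.println(collectText(doc));
    }
}
```

The String variant simply delegates to the appender variant, which mirrors the overload pairing listed in the method summary.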

public static void collectText(Branch branch, StringBuilder appender)
Recursively collects the complete textual content of a branch.
Parameters:
branch - the branch to recurse
appender - the collected text of the branch and all its child elements is appended to this string builder

public static void collectText(Branch branch, Writer writer) throws IOException
Recursively collects the complete textual content of a branch.
Parameters:
branch - the branch to recurse
writer - the collected text of the branch and all its child elements is appended to this writer; flushed but not closed by this method
Throws:
IOException - if an I/O error occurs while writing to the writer

public static OutputFormat createDefaultOutFormat()
Creates the default output format used by this class for storing XML.
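
public static QName defaultName(String localName)
Converts a local name into a qualified name in the default namespace.
Parameters:
localName - the local name to use
Returns:
the qualified name in the default namespace, or null if localName is null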

public static void deleteAllAttributes(Element element, boolean recurse)
Deletes all attributes of an element and optionally of all its descendants.
Parameters:
element - the element whose attributes should be deleted
recurse - whether to recursively delete the attributes of all direct and indirect child elements as well

public static List elementsByName(Element element, String name)
Returns the child elements with the given name, compatible with the name format returned by name(Element). If no elements are found, this method returns an empty list.
Parameters:
element - the element whose child elements to return
name - the name of the child elements, compatible with the name format returned by name(Element)
Returns:
the child Elements for the given name

public static String name(Attribute attrib)
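Static method that returns a String representing the name of an attribute in an XML document. See name(Element) for details.
Parameters:
attrib - the attribute to name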

public static String name(Element element)
Static method that returns a String representing the name of an element in an XML document. [...] Node.getName() or Element.getQualifiedName() or similar methods directly in such cases.
Currently, only the local name is used; namespace URIs and namespace prefixes are ignored. Including namespace prefixes in context representations would be quite useless, because in different documents different prefixes can represent the same namespace and vice versa.
Including namespace URIs might lead to higher precision by avoiding the risk of confusing elements from totally different namespaces. On the other hand, it might lead to lower recall and slower learning because elements from similar namespaces (e.g. different versions of the HTML standard) would all be considered separate from each other.
Parameters:
element - the element to name
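
To illustrate why namespace prefixes carry no information, the sketch below derives a name from the local name only. It is an illustration under stated assumptions rather than the actual name(Element) code: it uses the JDK's namespace-aware org.w3c.dom parser instead of dom4j, and LocalNameSketch, parseRoot and name are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical stand-in for illustration; not part of DOMUtils. */
final class LocalNameSketch {

    /** Parses an XML string and returns its root element. */
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required so that getLocalName() is populated
            return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /** Returns only the local name; prefix and namespace URI are ignored. */
    static String name(Node node) {
        String local = node.getLocalName();
        // getLocalName() is null for nodes created without namespace support
        return (local != null) ? local : node.getNodeName();
    }

    public static void main(String[] args) {
        // Same namespace, different prefixes, and one element without a namespace
        Element a = parseRoot("<h:title xmlns:h='http://www.w3.org/1999/xhtml'/>");
        Element b = parseRoot("<x:title xmlns:x='http://www.w3.org/1999/xhtml'/>");
        Element c = parseRoot("<title/>");
        System.out.println(name(a) + " " + name(b) + " " + name(c));
    }
}
```

All three elements map to the same name, which shows the trade-off described above: prefixes no longer cause spurious differences, but elements from unrelated namespaces can no longer be told apart.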

public static Document readDocument(File file) throws DocumentException, IOException
Reads an XML document from a local file. Compressed files are automatically decompressed (cf. IOUtils.openCompressableInStream(InputStream)).
Parameters:
file - the file to read
Throws:
DocumentException - if an error occurs during parsing
IOException - if an I/O error occurs

public static Document readDocument(File file, Configuration config) throws DocumentException, IOException
Reads an XML document from a local file, using IOUtils.openReader(File, Configuration) to determine the character set. Compressed files are automatically decompressed (cf. IOUtils.openCompressableInStream(InputStream)).
Parameters:
file - the file to read
config - the configuration to use
Throws:
DocumentException - if an error occurs during parsing
IOException - if an I/O error occurs

public static Document readDocument(File file, String charset) throws DocumentException, IOException
Reads an XML document from a local file, using a given charset. Compressed files are automatically decompressed (cf. IOUtils.openCompressableInStream(InputStream)).
Parameters:
file - the file to read
charset - the character set to use for reading the file; if null, the default charset of the current platform is used
Throws:
DocumentException - if an error occurs during parsing
IOException - if an I/O error occurs

public static Document readDocument(InputStream in) throws DocumentException, IOException
Reads an XML document from a given stream. Compressed content is automatically decompressed (cf. IOUtils.openCompressableInStream(InputStream)).
Parameters:
in - stream containing the text to parse; not closed by this method
Throws:
DocumentException - if an error occurs during parsing
IOException - if an I/O error occurs

public static Document readDocument(Reader reader) throws DocumentException
Reads an XML document from a given reader.
Parameters:
reader - reader containing the text to parse; not closed by this method
Throws:
DocumentException - if an error occurs during parsing

public static String showElement(Element element)
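Builds a simple partial representation of an element, containing the name of the element and its normalized and shortened textual content.
Parameters:
element - the element to show (may be null)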
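
public static String showToken(Element element, String token)
Builds a simple partial representation of a textual token in an element, containing the name of the element and the normalized and shortened text of the token.
Parameters:
element - the element containing the token; must not be null
token - the token to show (may be null)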
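
The exact normalization rules and length limit used by showElement and showToken are not documented here. The following is only a rough sketch under stated assumptions: whitespace runs collapse to single spaces, text is cut after a hypothetical limit of 20 characters, and the output format is invented for illustration. ShowSketch and all of its members are hypothetical names.

```java
/** Hypothetical stand-in for illustration; not part of DOMUtils. */
final class ShowSketch {

    /** Assumed maximum length of the shown text; the real limit is unknown. */
    static final int MAX_LENGTH = 20;

    /** Trims the text and collapses each run of whitespace into one space. */
    static String normalize(String text) {
        return (text == null) ? "" : text.trim().replaceAll("\\s+", " ");
    }

    /** Cuts the text after maxLength characters and marks the cut with "...". */
    static String shorten(String text, int maxLength) {
        if (text.length() <= maxLength) {
            return text;
        }
        return text.substring(0, maxLength) + "...";
    }

    /** Combines an element name with normalized and shortened text. */
    static String show(String elementName, String text) {
        return elementName + ": \"" + shorten(normalize(text), MAX_LENGTH) + "\"";
    }

    public static void main(String[] args) {
        System.out.println(show("p", "  Hello\n   world  "));
        System.out.println(show("p", "A considerably longer paragraph of text"));
    }
}
```

A representation of this kind is intended for log and debug output, where a short, single-line view of an element is more useful than its full content.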

public static void writeDocument(Document document, File file, TiesConfiguration config, String suffix) throws IOException
Writes an XML document to a file, consulting a given configuration about whether to use compression.
Parameters:
document - the document to write
file - the file to write the document to
config - used to decide whether to use compression
suffix - an optional suffix that allows overwriting the general value of the configuration parameter with a more specific value
Throws:
IOException - if an I/O error occurs while writing

public static void writeDocument(Document document, OutputStream out) throws IOException
Writes an XML document to a given stream.
Parameters:
document - the document to write
out - the stream to write the document to; flushed but not closed by this method
Throws:
IOException - if an I/O error occurs during writing

public static void writeDocument(Document document, OutputStreamWriter writer) throws IOException
Writes an XML document to a given writer, using the character set of the underlying output stream.
Parameters:
document - the document to write
writer - the writer to write the document to; flushed but not closed by this method
Throws:
IOException - if an I/O error occurs during writing

public static void writeDocument(Document document, Writer writer, String charset) throws IllegalArgumentException, IOException
Writes an XML document to a given writer, using the given character set.
Parameters:
document - the document to write
writer - the writer to write the document to; flushed but not closed by this method
charset - the character set of the writer; this must be a valid charset name (not null or empty etc.), and it should be the canonical (standard) name of the charset used
Throws:
IllegalArgumentException - if the specified charset is null or empty
IOException - if an I/O error occurs during writing
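
The writer contract of writeDocument (reject a null or empty charset name, flush but do not close the writer) can be sketched with JDK classes only. This is an illustration, not the dom4j-based implementation: WriteSketch, write and writeToString are hypothetical names, and the JDK Transformer stands in for dom4j's XML output classes.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

/** Hypothetical stand-in for illustration; not part of DOMUtils. */
final class WriteSketch {

    /** Serializes the document to the writer; the writer is flushed but left open. */
    static void write(Document document, Writer writer, String charset) throws IOException {
        if (charset == null || charset.isEmpty()) {
            throw new IllegalArgumentException("Charset name must not be null or empty");
        }
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            // The charset name ends up in the encoding pseudo-attribute of the XML declaration
            transformer.setOutputProperty(OutputKeys.ENCODING, charset);
            transformer.transform(new DOMSource(document), new StreamResult(writer));
        } catch (TransformerException e) {
            throw new IOException("Could not serialize document", e);
        }
        writer.flush(); // closing remains the caller's responsibility
    }

    /** Builds a one-element document and returns its serialized form. */
    static String writeToString(String rootName, String charset) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            doc.appendChild(doc.createElement(rootName));
            StringWriter out = new StringWriter();
            write(doc, out, charset);
            return out.toString();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Could not create document", e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeToString("root", "UTF-8"));
    }
}
```

Leaving the writer open lets a caller write several documents, or other content, to the same destination and close it once at the end.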