java.lang.Object
de.fu_berlin.ties.ConfigurableProcessor
de.fu_berlin.ties.TextProcessor
de.fu_berlin.ties.DocumentReader
de.fu_berlin.ties.DocumentProcessor
de.fu_berlin.ties.xml.convert.AttributeUnflatten
public class AttributeUnflatten
Unflattens an XML document, reading labels for a CombinationStrategy from an XML attribute ("class" by default). The value of this attribute will only be considered for leaf elements, i.e. elements without child elements. If it is missing, CombinationState.OUTSIDE will be assumed.

Attributes used for unflattening will be deleted from the resulting document.
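The leaf-element rule above can be sketched with the standard W3C DOM API. This is a minimal illustration, not the TIES implementation: the class `LeafLabelSketch`, its method names, and the string `"O"` standing in for CombinationState.OUTSIDE are all assumptions made for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Illustrative sketch only; not part of the TIES API. */
final class LeafLabelSketch {
    /** Assumed textual stand-in for CombinationState.OUTSIDE. */
    static final String OUTSIDE = "O";

    /** Parses an XML string into a W3C DOM document. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse XML", e);
        }
    }

    /** Returns true if the element has no child elements. */
    static boolean isLeaf(Element element) {
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                return false;
            }
        }
        return true;
    }

    /**
     * Collects the label of every leaf element in document order and
     * removes the label attribute from each leaf. Leaves without the
     * attribute yield OUTSIDE; attributes on non-leaf elements are ignored.
     */
    static List<String> readAndStripLabels(Element root, String attrib) {
        List<String> labels = new ArrayList<>();
        collect(root, attrib, labels);
        return labels;
    }

    private static void collect(Element element, String attrib, List<String> labels) {
        if (isLeaf(element)) {
            labels.add(element.hasAttribute(attrib)
                    ? element.getAttribute(attrib) : OUTSIDE);
            element.removeAttribute(attrib); // no effect if the attribute is absent
            return;
        }
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                collect((Element) n, attrib, labels);
            }
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<document><token class=\"B-person\">Mr.</token>"
                + "<token>for</token></document>");
        System.out.println(readAndStripLabels(doc.getDocumentElement(), "class"));
    }
}
```

Running `main` prints `[B-person, O]`: the second token has no "class" attribute, so the assumed outside label is used for it.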
For example, using IOB2 tagging, the document:

    <document>
      <token class="O">Please</token>
      <token class="O">consult</token>
      <token class="B-person">Mr.</token>
      <token class="I-person">John</token>
      <token class="I-person">Smith</token>
      <token>for</token>
      <token>assistance</token>
    </document>

will be unflattened as follows:

    <document>
      <token>Please</token>
      <token>consult</token>
      <person>
        <token>Mr.</token>
        <token>John</token>
        <token>Smith</token>
      </person>
      <token>for</token>
      <token>assistance</token>
    </document>
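The IOB2 grouping in this example can be reproduced with a short stand-alone sketch. It works on plain token and label lists rather than on a TIES document, and the class `Iob2GroupingSketch` and its `unflatten` method are hypothetical names for illustration; the actual class delegates this work to a CombinationStrategy and a StrategyAdapter. Missing labels are assumed to have been mapped to `"O"` by the caller.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; not part of the TIES API. */
final class Iob2GroupingSketch {
    /**
     * Renders tokens as XML, wrapping each run of a B-x label followed by
     * I-x labels in an <x> element. Token text is not XML-escaped, so the
     * sketch assumes tokens without markup characters.
     */
    static String unflatten(List<String> tokens, List<String> labels) {
        if (tokens.size() != labels.size()) {
            throw new IllegalArgumentException("One label per token is required");
        }
        StringBuilder out = new StringBuilder("<document>");
        String open = null; // type of the currently open element, if any
        for (int i = 0; i < tokens.size(); i++) {
            String label = labels.get(i);
            boolean begins = label.startsWith("B-");
            boolean continues = label.startsWith("I-")
                    && label.substring(2).equals(open);
            if (open != null && !continues) {
                out.append("</").append(open).append('>'); // close the finished run
                open = null;
            }
            if (begins) {
                open = label.substring(2);
                out.append('<').append(open).append('>');
            } else if (label.startsWith("I-") && !continues) {
                throw new IllegalArgumentException(
                        "I- label without matching B- label: " + label);
            }
            out.append("<token>").append(tokens.get(i)).append("</token>");
        }
        if (open != null) {
            out.append("</").append(open).append('>');
        }
        return out.append("</document>").toString();
    }

    public static void main(String[] args) {
        System.out.println(unflatten(
                Arrays.asList("Please", "consult", "Mr.", "John", "Smith", "for", "assistance"),
                Arrays.asList("O", "O", "B-person", "I-person", "I-person", "O", "O")));
    }
}
```

For the input shown in `main`, the three tokens labelled `B-person`, `I-person`, `I-person` end up inside a single `<person>` element, matching the unflattened document above (without the indentation).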
Field Summary |
---|
Fields inherited from class de.fu_berlin.ties.TextProcessor |
---|
CONFIG_POST, KEY_DIRECTORY, KEY_LOCAL_NAME, KEY_OUT_DIRECTORY, KEY_URL |
Constructor Summary | |
---|---|
AttributeUnflatten(String outExt) | Creates a new instance, using the standard configuration. |
AttributeUnflatten(String outExt, CombinationStrategy combiStrategy, StrategyAdapter stratAdapter, QName labelAttrib, TiesConfiguration conf) | Creates a new instance. |
AttributeUnflatten(String outExt, TiesConfiguration conf) | Creates a new instance, configuring all fields from the provided configuration. |
Method Summary | |
---|---|
Document | process(Document document, ContextMap context): Processes an XML document. This implementation delegates to the unflatten(Document) method. |
String | toString(): Returns a string representation of this object. |
void | unflatten(Document document): Unflattens an XML document using the combination strategy and the strategy adapter stored in this instance. |
Methods inherited from class de.fu_berlin.ties.DocumentProcessor |
---|
process |
Methods inherited from class de.fu_berlin.ties.DocumentReader |
---|
doProcess |
Methods inherited from class de.fu_berlin.ties.TextProcessor |
---|
getOutFileExt, process, process, process, process, process, process |
Methods inherited from class de.fu_berlin.ties.ConfigurableProcessor |
---|
getConfig |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
public AttributeUnflatten(String outExt) throws ProcessingException
Creates a new instance, using the standard configuration.
Parameters:
outExt - the extension to use for output files
Throws:
ProcessingException - if an error occurs while initializing the strategy or the adapter

public AttributeUnflatten(String outExt, TiesConfiguration conf) throws ProcessingException
Creates a new instance, configuring all fields from the provided configuration.
Parameters:
outExt - the extension to use for output files
conf - used to configure this instance
Throws:
ProcessingException - if an error occurs while initializing the strategy or the adapter

public AttributeUnflatten(String outExt, CombinationStrategy combiStrategy, StrategyAdapter stratAdapter, QName labelAttrib, TiesConfiguration conf)
Creates a new instance.
Parameters:
outExt - the extension to use for output files
combiStrategy - the combination strategy to use; must not be null
stratAdapter - used to translate the labels returned by the used combination strategy if necessary; must not be null, but a dummy adapter can be used
labelAttrib - the attribute used to read the labels from
conf - used to configure this instance

Method Detail |
---|
public Document process(Document document, ContextMap context) throws ProcessingException
Processes an XML document. This implementation delegates to the unflatten(Document) method.
process in class DocumentProcessor
Parameters:
document - the document to process
context - a map of objects that are made available for processing
Returns:
the document passed to it
Throws:
ProcessingException - if an error occurs during processing

public String toString()
Returns a string representation of this object.
toString in class TextProcessor

public void unflatten(Document document) throws ProcessingException
Unflattens an XML document using the combination strategy and the strategy adapter stored in this instance.
Parameters:
document - the document to unflatten; will be modified by this method
Throws:
ProcessingException - if the input document contains an illegal sequence of labels which cannot be processed by the used strategy
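
Because unflatten(Document) modifies its argument and rejects illegal label sequences, it can be useful to check the labels beforehand. The following is a minimal sketch of such a check for IOB2, under the assumption that a label is illegal when it is neither `O`, nor `B-x`, nor an `I-x` that continues a run of the same type `x`. Which sequences a given CombinationStrategy actually rejects is not stated here, and `Iob2ValidationSketch` is a hypothetical helper, not part of TIES.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; not part of the TIES API. */
final class Iob2ValidationSketch {
    /**
     * Returns the index of the first label that breaks the assumed IOB2
     * rules, or -1 if the whole sequence is acceptable.
     */
    static int firstIllegalIndex(List<String> labels) {
        String open = null; // type of the run currently in progress, if any
        for (int i = 0; i < labels.size(); i++) {
            String label = labels.get(i);
            if (label.equals("O")) {
                open = null; // outside any run
            } else if (label.startsWith("B-") && label.length() > 2) {
                open = label.substring(2); // a new run starts here
            } else if (label.startsWith("I-") && label.substring(2).equals(open)) {
                // legal continuation of the current run; nothing to update
            } else {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstIllegalIndex(
                Arrays.asList("O", "B-person", "I-person", "O")));
        System.out.println(firstIllegalIndex(Arrays.asList("O", "I-person")));
    }
}
```

`main` prints `-1` for the well-formed sequence and `1` for the second one, where `I-person` appears without a preceding `B-person`.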