de.fu_berlin.ties.xml.convert
Class AttributeUnflatten

java.lang.Object
  extended by de.fu_berlin.ties.ConfigurableProcessor
      extended by de.fu_berlin.ties.TextProcessor
          extended by de.fu_berlin.ties.DocumentReader
              extended by de.fu_berlin.ties.DocumentProcessor
                  extended by de.fu_berlin.ties.xml.convert.AttributeUnflatten
All Implemented Interfaces:
Processor

public class AttributeUnflatten
extends DocumentProcessor

Unflattens an XML document, reading labels for a CombinationStrategy from an XML attribute ("class" by default). The value of this attribute will only be considered for leaf elements, i.e. elements without child elements. If it is missing, CombinationState.OUTSIDE will be assumed. Attributes used for unflattening will be deleted from the resulting document.

For example, using IOB2 tagging, the document:

 <document>
   <token class="O">Please</token>
   <token class="O">consult</token>
   <token class="B-person">Mr.</token>
   <token class="I-person">John</token>
   <token class="I-person">Smith</token>
   <token>for</token>
   <token>assistance</token>
 </document>
 
will be unflattened as follows:

 <document>
   <token>Please</token>
   <token>consult</token>
   <person>
     <token>Mr.</token>
     <token>John</token>
     <token>Smith</token>
   </person>
   <token>for</token>
   <token>assistance</token>
 </document>
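
The grouping step in the example above can be illustrated with a small, self-contained sketch. This is not the TIES implementation: it uses the JDK's org.w3c.dom API rather than the library's own Document type, hard-codes plain IOB2 rules instead of a configurable CombinationStrategy, and only looks at direct children of the root element, which it assumes are all leaf elements. The class name Iob2UnflattenSketch and its methods are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of TIES.
public class Iob2UnflattenSketch {

    /** Wraps each B-x/I-x run of root children in a new <x> element. */
    public static void unflatten(Document doc, String labelAttrib) {
        Element root = doc.getDocumentElement();
        // Snapshot the element children first, since the tree is modified below.
        List<Element> children = new ArrayList<>();
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                children.add((Element) n);
            }
        }
        Element open = null; // wrapper element currently being filled, if any
        for (Element child : children) {
            // A missing attribute is treated like "O" (outside).
            String label = child.hasAttribute(labelAttrib)
                    ? child.getAttribute(labelAttrib) : "O";
            child.removeAttribute(labelAttrib); // label attributes are dropped
            if (label.startsWith("B-")) {
                open = doc.createElement(label.substring(2));
                root.insertBefore(open, child);
                open.appendChild(child); // appendChild moves the node
            } else if (label.startsWith("I-") && open != null
                    && open.getTagName().equals(label.substring(2))) {
                open.appendChild(child);
            } else if (label.equals("O")) {
                open = null;
            } else {
                throw new IllegalArgumentException("Illegal label sequence at: " + label);
            }
        }
    }

    /** Minimal serializer for the demo: elements and text only, no escaping. */
    public static String toXml(Node node) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            return node.getNodeValue();
        }
        StringBuilder sb = new StringBuilder("<" + node.getNodeName() + ">");
        for (Node n = node.getFirstChild(); n != null; n = n.getNextSibling()) {
            sb.append(toXml(n));
        }
        return sb.append("</").append(node.getNodeName()).append(">").toString();
    }

    /** Parses the input, unflattens it, and returns the serialized result. */
    public static String unflattenXml(String xml, String labelAttrib) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse demo input", e);
        }
        unflatten(doc, labelAttrib);
        return toXml(doc.getDocumentElement());
    }

    public static void main(String[] args) {
        String flat = "<document><token class=\"O\">Please</token>"
                + "<token class=\"B-person\">John</token>"
                + "<token class=\"I-person\">Smith</token>"
                + "<token>for</token></document>";
        System.out.println(unflattenXml(flat, "class"));
    }
}
```

The input is written without whitespace between elements so that the demo does not have to deal with whitespace-only text nodes; a real document would need that handled as well.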
 

Version:
$Revision: 1.10 $, $Date: 2006/10/21 16:04:31 $, $Author: siefkes $
Author:
Christian Siefkes

Field Summary
 
Fields inherited from class de.fu_berlin.ties.TextProcessor
CONFIG_POST, KEY_DIRECTORY, KEY_LOCAL_NAME, KEY_OUT_DIRECTORY, KEY_URL
 
Constructor Summary
AttributeUnflatten(String outExt)
          Creates a new instance, using the standard configuration.
AttributeUnflatten(String outExt, CombinationStrategy combiStrategy, StrategyAdapter stratAdapter, QName labelAttrib, TiesConfiguration conf)
          Creates a new instance.
AttributeUnflatten(String outExt, TiesConfiguration conf)
          Creates a new instance, configuring all fields from the provided configuration.
 
Method Summary
 Document process(Document document, ContextMap context)
          Processes an XML document. This implementation delegates to the unflatten(Document) method.
 String toString()
          Returns a string representation of this object.
 void unflatten(Document document)
          Unflattens an XML document using the combination strategy and the strategy adapter stored in this instance.
 
Methods inherited from class de.fu_berlin.ties.DocumentProcessor
process
 
Methods inherited from class de.fu_berlin.ties.DocumentReader
doProcess
 
Methods inherited from class de.fu_berlin.ties.TextProcessor
getOutFileExt, process, process, process, process, process, process
 
Methods inherited from class de.fu_berlin.ties.ConfigurableProcessor
getConfig
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

AttributeUnflatten

public AttributeUnflatten(String outExt)
                   throws ProcessingException
Creates a new instance, using the standard configuration.

Parameters:
outExt - the extension to use for output files.
Throws:
ProcessingException - if an error occurs while initializing the strategy or the adapter

AttributeUnflatten

public AttributeUnflatten(String outExt,
                          TiesConfiguration conf)
                   throws ProcessingException
Creates a new instance, configuring all fields from the provided configuration.

Parameters:
outExt - the extension to use for output files.
conf - used to configure this instance
Throws:
ProcessingException - if an error occurs while initializing the strategy or the adapter

AttributeUnflatten

public AttributeUnflatten(String outExt,
                          CombinationStrategy combiStrategy,
                          StrategyAdapter stratAdapter,
                          QName labelAttrib,
                          TiesConfiguration conf)
Creates a new instance.

Parameters:
outExt - the extension to use for output files
combiStrategy - the combination strategy to use; must not be null
stratAdapter - used to translate the labels returned by the combination strategy if necessary; must not be null, but a dummy adapter can be used
labelAttrib - the attribute used to read the labels from
conf - used to configure this instance
Method Detail

process

public Document process(Document document,
                        ContextMap context)
                 throws ProcessingException
Processes an XML document. Callers must always continue working on the returned document instance rather than the passed-in instance: document processors are allowed, but not required, to modify the document in place. This implementation delegates to the unflatten(Document) method.

Specified by:
process in class DocumentProcessor
Parameters:
document - the document to process
context - a map of objects that are made available for processing
Returns:
the processed document; this object may or may not be identical to the document passed in
Throws:
ProcessingException - if an error occurs during processing

toString

public String toString()
Returns a string representation of this object.

Overrides:
toString in class TextProcessor
Returns:
a textual representation

unflatten

public void unflatten(Document document)
               throws ProcessingException
Unflattens an XML document using the combination strategy and the strategy adapter stored in this instance.

Parameters:
document - the document to unflatten; will be modified by this method
Throws:
ProcessingException - if the input document contains an illegal sequence of labels which cannot be processed by the combination strategy in use
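
As an illustration of what an "illegal sequence of labels" can look like, the following sketch checks a label list against the usual IOB2 rules. This is an assumption for illustration only: which sequences are actually rejected depends on the CombinationStrategy in use, and the class Iob2LabelCheck and its method are hypothetical, not part of TIES.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; not part of TIES.
public class Iob2LabelCheck {

    /**
     * Returns the index of the first label that breaks the IOB2 rules
     * assumed here, or -1 if the whole sequence is acceptable.
     */
    public static int firstIllegalIndex(List<String> labels) {
        String openType = null; // type of the span currently open, or null when outside
        for (int i = 0; i < labels.size(); i++) {
            String label = labels.get(i);
            if (label.equals("O")) {
                openType = null;
            } else if (label.startsWith("B-") && label.length() > 2) {
                openType = label.substring(2); // a B- label always starts a new span
            } else if (label.startsWith("I-") && label.substring(2).equals(openType)) {
                // Continues the open span of the same type; nothing to update.
            } else {
                return i; // e.g. I-x after O, I-x after a span of another type, or an unknown label
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstIllegalIndex(Arrays.asList("O", "B-person", "I-person", "O")));
        System.out.println(firstIllegalIndex(Arrays.asList("O", "I-person")));
    }
}
```

Under these assumed rules, the first call in main reports no problem, while the second flags the I-person label at index 1 because no span is open at that point.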


Copyright © 2003-2007 Christian Siefkes. All Rights Reserved.