de.fu_berlin.ties.classify.feature
Class OSBTransformer

java.lang.Object
  extended by de.fu_berlin.ties.classify.feature.FeatureTransformer
      extended by de.fu_berlin.ties.classify.feature.OSBTransformer
All Implemented Interfaces:
XMLStorable

public class OSBTransformer
extends FeatureTransformer

Transforms a feature vector using the orthogonal sparse bigrams (OSB) technique developed by Fidelis Assis. This transformer discards all comment-only features. It slides a window of length N over the remaining original features. At each window position it generates N-1 joint features, as exemplified below (assuming the pipe character "|" is used as separator and N=5):

    -   -   -  w4 | w5
    -   -   w3  | | w5
    -  w2   |   | | w5
   w1   |   |   | | w5
 

If isPreserving() returns true, the original features are kept in addition to the joint features; otherwise they are discarded.
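
The windowing scheme described above can be sketched in a few lines of plain Java. This is only an illustration, not the code of this class: the class name OsbSketch and its transform method are hypothetical, features are modelled as plain strings rather than FeatureVector entries, and the filtering of comment-only features is left out. The sketch reads the diagram as encoding the gap between two joined features by repeating the separator, and it assumes that features near the start of the input are simply paired with however many predecessors exist.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-alone sketch of OSB feature generation (not part of TIES). */
final class OsbSketch {

    /**
     * Pairs each feature with up to (windowLength - 1) predecessors.
     * The separator is repeated once per position of distance, so a
     * direct neighbour yields "a|b" and a gap of one yields "a||b".
     */
    static List<String> transform(List<String> features, int windowLength,
                                  String separator, boolean preserve) {
        if (windowLength < 2) {
            throw new IllegalArgumentException("window length must be >= 2");
        }
        List<String> result = new ArrayList<>();
        for (int end = 0; end < features.size(); end++) {
            if (preserve) {
                // Keep the original feature alongside the joint ones.
                result.add(features.get(end));
            }
            // dist = 1 is the direct predecessor; dist = N-1 is the window start.
            for (int dist = 1; dist < windowLength && dist <= end; dist++) {
                StringBuilder joint = new StringBuilder(features.get(end - dist));
                for (int i = 0; i < dist; i++) {
                    joint.append(separator);
                }
                joint.append(features.get(end));
                result.add(joint.toString());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("w1", "w2", "w3", "w4", "w5");
        // The last four lines printed correspond to the window ending at w5.
        for (String feature : transform(input, 5, "|", false)) {
            System.out.println(feature);
        }
    }
}
```

With N=5 and the five inputs above, the window ending at w5 contributes the four joint features shown in the diagram; earlier positions contribute fewer because fewer predecessors are available.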

Instances of this class are thread-safe.

Version:
$Revision: 1.19 $, $Date: 2006/10/21 16:03:57 $, $Author: siefkes $
Author:
Christian Siefkes

Field Summary
(package private) static QName ATTRIB_LENGTH
          Attribute name used for XML serialization.
(package private) static QName ATTRIB_SEPARATOR
          Attribute name used for XML serialization.
 
Fields inherited from class de.fu_berlin.ties.classify.feature.FeatureTransformer
CONFIG_TRANSFORMERS, ELEMENT_MAIN
 
Constructor Summary
OSBTransformer(Element element)
          Creates a new instance from an XML element, fulfilling the recommendation of the XMLStorable interface.
OSBTransformer(FeatureTransformer precTrans, int len, String sepString, boolean preserve)
          Creates a new instance.
OSBTransformer(FeatureTransformer precTrans, TiesConfiguration config)
          Creates a new instance.
 
Method Summary
protected  FeatureVector doTransform(FeatureVector orgFeatures)
          Transforms a feature vector.
 int getLength()
          Returns the maximum number of original features joined.
 String getSeparator()
          Returns the string used to separate original features (by default a space character).
 boolean isPreserving()
          Returns whether original features are preserved in addition to the generated joint features.
 ObjectElement toElement()
          Stores all relevant fields of this object in an XML element for serialization. An equivalent object can be created by calling ObjectElement.createObject(org.dom4j.Element, Class) on the created element.
 String toString()
          Returns a string representation of this object.
 
Methods inherited from class de.fu_berlin.ties.classify.feature.FeatureTransformer
createTransformer, createTransformer, getPrecedingTransformer, transform
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

ATTRIB_LENGTH

static final QName ATTRIB_LENGTH
Attribute name used for XML serialization.


ATTRIB_SEPARATOR

static final QName ATTRIB_SEPARATOR
Attribute name used for XML serialization.

Constructor Detail

OSBTransformer

public OSBTransformer(Element element)
               throws InstantiationException
Creates a new instance from an XML element, fulfilling the recommendation of the XMLStorable interface.

Parameters:
element - the XML element containing the serialized representation
Throws:
InstantiationException - if the given element does not contain a valid transformer description

OSBTransformer

public OSBTransformer(FeatureTransformer precTrans,
                      int len,
                      String sepString,
                      boolean preserve)
               throws IllegalArgumentException
Creates a new instance.

Parameters:
precTrans - the preceding transformer to use if this transformer is part of a chain; null otherwise
len - the maximum number of original features joined; minimum value is 2
sepString - the string used to separate original features -- this string should never occur within original features
preserve - whether to preserve the original features as well or only to use joint features
Throws:
IllegalArgumentException - if len < 2

OSBTransformer

public OSBTransformer(FeatureTransformer precTrans,
                      TiesConfiguration config)
Creates a new instance.

Parameters:
precTrans - the preceding transformer to use if this transformer is part of a chain; null otherwise
config - used to configure this instance
Method Detail

doTransform

protected FeatureVector doTransform(FeatureVector orgFeatures)
Transforms a feature vector.

Specified by:
doTransform in class FeatureTransformer
Parameters:
orgFeatures - the original feature vector to transform
Returns:
a new feature vector containing the transformed features

getLength

public int getLength()
Returns the maximum number of original features joined.

Returns:
the maximum number of original features joined

getSeparator

public String getSeparator()
Returns the string used to separate original features (by default a space character). This string should never occur within original features.

Returns:
the separator string
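
Why the separator should never occur within original features can be seen in a small stand-alone example. The class SeparatorCollisionDemo and its join helper are hypothetical and not part of this API; the example assumes that the distance between two joined features is encoded by repeating the separator, as the diagram in the class description suggests.

```java
/** Hypothetical demo: a separator inside a feature makes joint features ambiguous. */
final class SeparatorCollisionDemo {

    /** Joins two features, repeating the separator once per position of distance. */
    static String join(String left, int distance, String right, String sep) {
        StringBuilder sb = new StringBuilder(left);
        for (int i = 0; i < distance; i++) {
            sb.append(sep);
        }
        return sb.append(right).toString();
    }

    public static void main(String[] args) {
        // "a|" next to "b" and "a" two positions before "b" both give "a||b".
        String adjacent = join("a|", 1, "b", "|");
        String skipped = join("a", 2, "b", "|");
        System.out.println(adjacent.equals(skipped)); // prints "true"
    }
}
```

Under this assumption, two different inputs collapse into the same joint feature, so a classifier could no longer tell them apart.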

isPreserving

public boolean isPreserving()
Returns whether original features are preserved in addition to the generated joint features.

Returns:
true if the original features are preserved; false if they are discarded

toElement

public ObjectElement toElement()
Stores all relevant fields of this object in an XML element for serialization. An equivalent object can be created by calling ObjectElement.createObject(org.dom4j.Element, Class) on the created element.

Specified by:
toElement in interface XMLStorable
Overrides:
toElement in class FeatureTransformer
Returns:
the created XML element

toString

public String toString()
Returns a string representation of this object.

Overrides:
toString in class FeatureTransformer
Returns:
a textual representation


Copyright © 2003-2007 Christian Siefkes. All Rights Reserved.