java.lang.Object
de.fu_berlin.ties.classify.feature.FeatureTransformer
de.fu_berlin.ties.classify.feature.OSBTransformer
public class OSBTransformer extends FeatureTransformer
Transforms a feature vector using the orthogonal sparse bigrams (OSB) technique developed by Fidelis Assis. This transformer discards all comment-only features. It slides a window of length N over the remaining original features. At each window position it generates N-1 joint features, as exemplified below (assuming the pipe character "|" is used as the separator and N=5):
- - - w4 | w5
- - w3 | | w5
- w2 | | | w5
w1 | | | | w5
If isPreserving() returns true, the original features are preserved as well; otherwise they are discarded.
Instances of this class are thread-safe.
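The windowing described above can be sketched in a few lines of Java. This is a minimal illustration, not the TIES implementation: the class OsbSketch and the method osb are hypothetical names, features are plain strings, and the sketch omits both the filtering of comment-only features and the strength values. It also assumes that positions with fewer than N-1 predecessors simply yield fewer joint features, and it joins tokens without the spaces shown in the diagram.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch; not part of the TIES library.
final class OsbSketch {

    /**
     * Pairs each token with each of its up to n-1 predecessors.
     * A pair at distance d is joined by d copies of the separator,
     * so the gap between the two tokens stays visible in the feature.
     */
    static List<String> osb(List<String> tokens, int n, String sep,
                            boolean preserve) {
        if (n < 2) {
            throw new IllegalArgumentException("n must be at least 2");
        }
        List<String> result = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            String current = tokens.get(i);
            if (preserve) {
                // Keep the original feature alongside the joint ones.
                result.add(current);
            }
            // Assumption: near the start, fewer predecessors are available,
            // so fewer joint features are generated.
            for (int dist = 1; dist < n && i - dist >= 0; dist++) {
                StringBuilder joint = new StringBuilder(tokens.get(i - dist));
                for (int k = 0; k < dist; k++) {
                    joint.append(sep);
                }
                joint.append(current);
                result.add(joint.toString());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> window = List.of("w1", "w2", "w3", "w4", "w5");
        for (String feature : osb(window, 5, "|", false)) {
            System.out.println(feature);
        }
    }
}
```

For the tokens w1 to w5 with N=5 and separator "|", the last four features returned are w4|w5, w3||w5, w2|||w5 and w1||||w5, which correspond to the four rows of the diagram.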
Field Summary |
---|
Fields inherited from class de.fu_berlin.ties.classify.feature.FeatureTransformer |
---|
CONFIG_TRANSFORMERS |
Constructor Summary | |
---|---|
OSBTransformer(FeatureTransformer precTrans, int len, String sep, boolean preserve, float[] strengthArray, float singleTokenStrength) | Creates a new instance. |
OSBTransformer(FeatureTransformer precTrans, TiesConfiguration config) | Creates a new instance. |
Method Summary | |
---|---|
protected FeatureVector | doTransform(FeatureVector orgFeatures): Transforms a feature vector. |
int | getLength(): Returns the maximum number of original features joined. |
String | getSeparator(): Returns the string used to separate original features (by default a space character). |
boolean | isPreserving(): Whether original features are preserved in addition to the generated joint features. |
String | toString(): Returns a string representation of this object. |
Methods inherited from class de.fu_berlin.ties.classify.feature.FeatureTransformer |
---|
createTransformer, createTransformer, getPrecedingTransformer, transform |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
public OSBTransformer(FeatureTransformer precTrans, int len, String sep, boolean preserve, float[] strengthArray, float singleTokenStrength) throws IllegalArgumentException
Parameters:
precTrans - the preceding transformer to use if this transformer is part of a chain; null otherwise
len - the maximum number of original features joined; minimum value is 2
sep - the string used to separate original features; this string should never occur within original features
preserve - whether to preserve the original features as well or to use only joint features
strengthArray - array of strength values used for bigrams with different distances
singleTokenStrength - strength value used for unigrams (single tokens); ignored if preserve is false
Throws:
IllegalArgumentException - if len < 2 or if strengthArray is empty

public OSBTransformer(FeatureTransformer precTrans, TiesConfiguration config)
Parameters:
precTrans - the preceding transformer to use if this transformer is part of a chain; null otherwise
config - used to configure this instance
Method Detail |
---|
protected FeatureVector doTransform(FeatureVector orgFeatures)
doTransform in class FeatureTransformer
Parameters:
orgFeatures - the original feature vector to transform
public int getLength()
public String getSeparator()
public boolean isPreserving()
public String toString()
Overrides: toString in class FeatureTransformer