de.fu_berlin.ties.context
Class AbstractRepresentation

java.lang.Object
  extended by de.fu_berlin.ties.context.Representation
      extended by de.fu_berlin.ties.context.AbstractRepresentation
Direct Known Subclasses:
DefaultRepresentation, SimpleRepresentation

public abstract class AbstractRepresentation
extends Representation

Provides basic functionality shared by different representations.

Version:
$Revision: 1.2 $, $Date: 2004/09/06 17:22:41 $, $Author: siefkes $
Author:
Christian Siefkes

Field Summary
static String CONFIG_RECOGN_NUM
          Configuration key: The number of preceding recognitions to represent.
static String CONFIG_SPLIT_MAXIMUM
          Configuration key: The maximum number of subsequences to keep when a feature value must be split.
static String CONFIG_STORE_NTH
          Configuration key: Each n-th context representation is stored for debugging and inspection purposes, if n > 0.
 
Constructor Summary
AbstractRepresentation(int recogNum, int splitMax, int n, String outCharset)
          Creates a new instance.
 
Method Summary
 FeatureVector buildContext(Element element, String leftText, String mainText, String rightText, PriorRecognitions priorRecognitions, Map<Element,List<LocalFeature>> featureCache, String logPurpose)
          Builds the context representation of text in an element.
protected abstract  FeatureVector doBuildContext(Element element, String leftText, String mainText, String rightText, PriorRecognitions priorRecognitions, Map<Element,List<LocalFeature>> featureCache, String logPurpose)
          Builds the context representation of text in an element.
 int getSplitMaximum()
          Returns the maximum number of subsequences to keep when a feature value must be split (at whitespace).
 int getStoreN()
          Returns n, where each n-th context representation is stored for debugging and inspection purposes (if n > 0; otherwise no representation is stored).
 String toString()
          Returns a string representation of this object.
 
Methods inherited from class de.fu_berlin.ties.context.Representation
buildContext, buildContext, createRecognitionBuffer, getRecognitionNumber
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

CONFIG_RECOGN_NUM

public static final String CONFIG_RECOGN_NUM
Configuration key: The number of preceding recognitions to represent.

See Also:
Constant Field Values

CONFIG_SPLIT_MAXIMUM

public static final String CONFIG_SPLIT_MAXIMUM
Configuration key: The maximum number of subsequences to keep when a feature value must be split.

See Also:
Constant Field Values

CONFIG_STORE_NTH

public static final String CONFIG_STORE_NTH
Configuration key: Each n-th context representation is stored for debugging and inspection purposes, if n > 0.

See Also:
Constant Field Values

Constructor Detail

AbstractRepresentation

public AbstractRepresentation(int recogNum,
                              int splitMax,
                              int n,
                              String outCharset)
Creates a new instance.

Parameters:
recogNum - the number of preceding recognitions to represent
splitMax - the maximum number of subsequences to keep when a feature value must be split (at whitespace)
n - each n-th context representation is stored if n > 0; otherwise no representation is stored
outCharset - the output character set to use (only used to store some configurations for inspection purposes, if n > 0); if null, the default charset of the current platform is used
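
The null fallback described for outCharset can be illustrated with a minimal sketch. The class and method names below (CharsetFallbackSketch, resolve) are hypothetical and not part of this API; only java.nio.charset.Charset from the JDK is used, and how the constructor actually resolves the charset is an assumption.

```java
import java.nio.charset.Charset;

/** Hypothetical helper illustrating the documented null fallback; not part of the TIES API. */
final class CharsetFallbackSketch {

    private CharsetFallbackSketch() {
        // static helper only
    }

    /** Resolves a charset name; null yields the default charset of the current platform. */
    static Charset resolve(final String outCharset) {
        return (outCharset == null)
            ? Charset.defaultCharset()
            : Charset.forName(outCharset);
    }

    public static void main(final String[] args) {
        System.out.println(resolve("UTF-8"));
        System.out.println(resolve(null));
    }
}
```

Passing an explicit charset name keeps any stored representations readable across platforms, whereas null ties them to whatever default the running JVM reports.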

Method Detail

buildContext

public final FeatureVector buildContext(Element element,
                                        String leftText,
                                        String mainText,
                                        String rightText,
                                        PriorRecognitions priorRecognitions,
                                        Map<Element,List<LocalFeature>> featureCache,
                                        String logPurpose)
                                 throws ClassCastException,
                                        IllegalArgumentException
Builds the context representation of text in an element. Returns a feature vector of all context features considered relevant for representation.

Specified by:
buildContext in class Representation
Parameters:
element - the element whose context should be represented
leftText - textual content to the left of (preceding) mainText, might be empty
mainText - the main textual content to represent, might be empty
rightText - textual content to the right of (following) mainText, might be empty
priorRecognitions - a buffer of the last Recognitions from the document, created by calling Representation.createRecognitionBuffer(); might be null
featureCache - a cache of (local) features; it should be reused across all calls for the nodes of a single document, but must not be reused when building the context of nodes in different documents
logPurpose - the type of contexts of main interest to the caller (e.g. "Token" or "Sentence"), used for logging
Returns:
a vector of features considered relevant for representation
Throws:
ClassCastException - if the priorRecognitions buffer contains objects that aren't Recognitions
IllegalArgumentException - if the specified element is of an unsupported type
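
The reuse rule for featureCache (one cache per document, never shared between documents) might be followed as in the sketch below. It is self-contained and uses stand-in types: plain strings replace Element and LocalFeature, and localFeatures and processDocument are hypothetical names, not methods of this class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of per-document cache scoping; not part of the TIES API. */
final class FeatureCacheSketch {

    /** Counts how often features were actually computed (cache misses). */
    static int computations = 0;

    private FeatureCacheSketch() {
        // static helper only
    }

    /** Stand-in for feature extraction: returns cached features or computes and caches them. */
    static List<String> localFeatures(final String element,
                                      final Map<String, List<String>> cache) {
        List<String> features = cache.get(element);
        if (features == null) {
            computations++;
            features = new ArrayList<String>();
            features.add("name=" + element);
            cache.put(element, features);
        }
        return features;
    }

    /** Processes one document with a cache that lives only for the duration of this call. */
    static void processDocument(final List<String> elements) {
        final Map<String, List<String>> cache = new HashMap<String, List<String>>();
        for (final String element : elements) {
            localFeatures(element, cache);
        }
    }

    public static void main(final String[] args) {
        processDocument(Arrays.asList("p", "b", "p")); // second "p" is served from the cache
        processDocument(Arrays.asList("p"));           // new document, new cache
        System.out.println("computations: " + computations);
    }
}
```

Creating the map inside the per-document method makes accidental sharing between documents impossible, at the cost of recomputing features that happen to recur across documents.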

doBuildContext

protected abstract FeatureVector doBuildContext(Element element,
                                                String leftText,
                                                String mainText,
                                                String rightText,
                                                PriorRecognitions priorRecognitions,
                                                Map<Element,List<LocalFeature>> featureCache,
                                                String logPurpose)
                                         throws ClassCastException,
                                                IllegalArgumentException
Builds the context representation of text in an element. Returns a feature vector of all context features considered relevant for representation.

Parameters:
element - the element whose context should be represented
leftText - textual content to the left of (preceding) mainText, might be empty
mainText - the main textual content to represent, might be empty
rightText - textual content to the right of (following) mainText, might be empty
priorRecognitions - a buffer of the last Recognitions from the document, created by calling Representation.createRecognitionBuffer(); might be null
featureCache - a cache of (local) features; it should be reused across all calls for the nodes of a single document, but must not be reused when building the context of nodes in different documents
logPurpose - the type of contexts of main interest to the caller (e.g. "Token" or "Sentence"), used for logging
Returns:
a vector of features considered relevant for representation
Throws:
ClassCastException - if the priorRecognitions buffer contains objects that aren't Recognitions
IllegalArgumentException - if the specified element is of an unsupported type
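
The pairing of a public final buildContext with a protected abstract doBuildContext suggests the template method pattern, although this page does not state how the two methods interact. The sketch below shows the pattern under that assumption, with simplified stand-in types (a String in place of FeatureVector and of the full parameter list); Base, Upper and getCalls are hypothetical names.

```java
import java.util.Locale;

/** Hypothetical illustration of the template method pattern; not the TIES implementation. */
final class TemplateMethodSketch {

    private TemplateMethodSketch() {
        // container for the nested sketch classes
    }

    /** Stand-in for the abstract base class. */
    abstract static class Base {
        private int calls = 0;

        /** Public entry point; final so that subclasses cannot bypass the shared steps. */
        public final String buildContext(final String mainText) {
            calls++; // shared bookkeeping (assumed; the real class may do something else)
            return doBuildContext(mainText);
        }

        /** Subclasses supply the representation-specific part here. */
        protected abstract String doBuildContext(String mainText);

        public int getCalls() {
            return calls;
        }
    }

    /** Stand-in for a concrete subclass. */
    static final class Upper extends Base {
        @Override
        protected String doBuildContext(final String mainText) {
            return mainText.toUpperCase(Locale.ROOT);
        }
    }

    public static void main(final String[] args) {
        final Base representation = new Upper();
        System.out.println(representation.buildContext("token"));
        System.out.println(representation.getCalls());
    }
}
```

Under this reading, a subclass such as DefaultRepresentation would implement only doBuildContext, while callers always go through buildContext.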

getSplitMaximum

public int getSplitMaximum()
Returns the maximum number of subsequences to keep when a feature value must be split (at whitespace).

Returns:
the maximum number of subsequences to keep
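
As a rough illustration of what a split maximum means, the sketch below splits a value at whitespace and keeps at most splitMax subsequences. Which subsequences the real implementation retains is not stated on this page; keeping the leading ones is an assumption made for this sketch, and SplitMaximumSketch and split are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of a capped whitespace split; not the TIES implementation. */
final class SplitMaximumSketch {

    private SplitMaximumSketch() {
        // static helper only
    }

    /** Splits value at runs of whitespace and keeps at most splitMax leading parts (assumed policy). */
    static List<String> split(final String value, final int splitMax) {
        final List<String> kept = new ArrayList<String>();
        final String trimmed = value.trim();
        if (trimmed.isEmpty()) {
            return kept; // nothing to split
        }
        for (final String part : trimmed.split("\\s+")) {
            if (kept.size() >= splitMax) {
                break; // cap reached, drop the remaining parts
            }
            kept.add(part);
        }
        return kept;
    }

    public static void main(final String[] args) {
        System.out.println(split("one two  three four", 2));
        System.out.println(split("single", 5));
        System.out.println(Arrays.asList().equals(split("   ", 3)));
    }
}
```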

getStoreN

public int getStoreN()
Returns n, where each n-th context representation is stored for debugging and inspection purposes (if n > 0; otherwise no representation is stored).

Returns:
the storage interval n
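
The "each n-th representation is stored" rule can be sketched with a counter and a modulo test. This is an illustration only: whether counting starts so that the n-th (rather than the first) representation is the first one stored is an assumption, and StoreNthSketch and shouldStore are hypothetical names.

```java
/** Hypothetical illustration of storing every n-th item; not the TIES implementation. */
final class StoreNthSketch {

    private final int storeN;
    private long count = 0;

    StoreNthSketch(final int storeN) {
        this.storeN = storeN;
    }

    /** Call once per built representation; returns true if this one should be stored. */
    boolean shouldStore() {
        count++;
        // A non-positive storeN disables storing altogether.
        return storeN > 0 && count % storeN == 0;
    }

    public static void main(final String[] args) {
        final StoreNthSketch everyThird = new StoreNthSketch(3);
        for (int i = 1; i <= 6; i++) {
            System.out.println(i + ": " + everyThird.shouldStore());
        }
    }
}
```

With a value of 1 every representation would be stored, which is useful for small test runs; larger values sample the output so that inspection files stay manageable.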

toString

public String toString()
Returns a string representation of this object.

Overrides:
toString in class Representation
Returns:
a textual representation


Copyright © 2003-2004 Christian Siefkes. All Rights Reserved.