de.fu_berlin.ties.eval
Class FeatureCount

java.lang.Object
  extended by de.fu_berlin.ties.io.BaseStorable
      extended by de.fu_berlin.ties.eval.FeatureCount
All Implemented Interfaces:
FeatureCountView, Storable

public class FeatureCount
extends BaseStorable
implements FeatureCountView

Keeps track of the average number of features and of unique features in context representations and of the average number of contexts in documents. Comment-only features are excluded when counting features; comments are ignored when comparing features.

Instances of this class are not thread-safe and must be synchronized externally, if required.
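
A caller that shares one counter between threads has to supply the locking itself. The following is a minimal sketch of external synchronization. It uses a hypothetical stand-in class, `PlainDocumentCounter`, instead of the real `FeatureCount` so that it compiles without the library on the classpath; the lock object and the helper method are likewise illustrative.

```java
/** Hypothetical stand-in for a non-thread-safe counter such as FeatureCount. */
class PlainDocumentCounter {
    private long documents;

    void countDocument() { documents++; }   // read-modify-write, not atomic

    long getDocuments() { return documents; }
}

/** Guards every access to the shared counter with one common monitor. */
class SynchronizedCounterDemo {
    static long countInParallel(int threads, int perThread) {
        final PlainDocumentCounter counter = new PlainDocumentCounter();
        final Object lock = new Object();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    synchronized (lock) {        // external synchronization
                        counter.countDocument();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        synchronized (lock) {                    // reads take the same lock
            return counter.getDocuments();
        }
    }
}
```

With the lock in place, `SynchronizedCounterDemo.countInParallel(4, 1000)` returns 4000; without it, updates could be lost.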

Version:
$Revision: 1.11 $, $Date: 2006/10/21 16:04:11 $, $Author: siefkes $
Author:
Christian Siefkes

Field Summary
static String KEY_AVERAGE_CONTEXTS
          Serialization key for the average number of context representations in a document.
static String KEY_AVERAGE_FEATURES
          Serialization key for the average number of features in a context representation.
static String KEY_AVERAGE_UNIQUE_FEATURES
          Serialization key for the average number of unique features in a context representation.
static String KEY_CHARS
          Serialization key for characters.
static String KEY_CHARS_PER_CONTEXT
          Serialization key for the average number of characters in a context representation.
static String KEY_CHARS_PER_FEATURE
          Serialization key for the average number of characters in a feature.
static String KEY_CONTEXTS
          Serialization key for context representations.
static String KEY_DOCUMENTS
          Serialization key for documents.
static String KEY_FEATURES
          Serialization key for features.
static String KEY_UNIQUE_FEATURES
          Serialization key for unique features.
 
Constructor Summary
FeatureCount()
          Creates a new instance.
FeatureCount(FieldMap fieldMap)
          Creates a new instance from a field map, fulfilling the Storable contract.
 
Method Summary
 void countDocument()
          Counts a document (increases the number of documents by one).
 double getAverageContexts()
          Calculates and returns the average number of context representations in a document.
 double getAverageFeatures()
          Calculates and returns the average number of non-comment features in a context representation.
 double getAverageUniqueFeatures()
          Calculates and returns the average number of unique non-comment features in a context representation.
 long getCharacters()
          Returns the number of characters counted so far.
 double getCharactersPerContext()
          Calculates and returns the average number of characters in a context representation.
 double getCharactersPerFeature()
          Calculates and returns the average number of characters in a feature.
 long getContexts()
          Returns the number of context representations evaluated so far.
 long getDocuments()
          Returns the number of documents counted so far.
 long getFeatureSum()
          Returns the number of non-comment features encountered so far.
 long getUniqueFeatureSum()
          Returns the number of non-comment non-duplicate features encountered so far.
 FieldMap storeFields()
          Stores all relevant fields of this object in a field map for serialization.
 void update(FeatureVector features)
          Evaluates a feature vector and updates the statistics accordingly.
 
Methods inherited from class de.fu_berlin.ties.io.BaseStorable
toString, toString
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

KEY_CONTEXTS

public static final String KEY_CONTEXTS
Serialization key for context representations.

See Also:
Constant Field Values

KEY_DOCUMENTS

public static final String KEY_DOCUMENTS
Serialization key for documents.

See Also:
Constant Field Values

KEY_FEATURES

public static final String KEY_FEATURES
Serialization key for features.

See Also:
Constant Field Values

KEY_CHARS

public static final String KEY_CHARS
Serialization key for characters.

See Also:
Constant Field Values

KEY_UNIQUE_FEATURES

public static final String KEY_UNIQUE_FEATURES
Serialization key for unique features.

See Also:
Constant Field Values

KEY_AVERAGE_CONTEXTS

public static final String KEY_AVERAGE_CONTEXTS
Serialization key for the average number of context representations in a document.

See Also:
Constant Field Values

KEY_AVERAGE_FEATURES

public static final String KEY_AVERAGE_FEATURES
Serialization key for the average number of features in a context representation.

See Also:
Constant Field Values

KEY_AVERAGE_UNIQUE_FEATURES

public static final String KEY_AVERAGE_UNIQUE_FEATURES
Serialization key for the average number of unique features in a context representation.

See Also:
Constant Field Values

KEY_CHARS_PER_CONTEXT

public static final String KEY_CHARS_PER_CONTEXT
Serialization key for the average number of characters in a context representation.

See Also:
Constant Field Values

KEY_CHARS_PER_FEATURE

public static final String KEY_CHARS_PER_FEATURE
Serialization key for the average number of characters in a feature.

See Also:
Constant Field Values

Constructor Detail

FeatureCount

public FeatureCount()
Creates a new instance.


FeatureCount

public FeatureCount(FieldMap fieldMap)
             throws IllegalArgumentException
Creates a new instance from a field map, fulfilling the Storable contract.

Parameters:
fieldMap - map containing the serialized fields
Throws:
IllegalArgumentException - if at least one of the required fields is missing from the map or has a negative value
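
The validation described above can be sketched as follows. The API of `FieldMap` is not shown on this page, so the sketch substitutes a plain `java.util.Map<String, Long>`; the class name, the helper `requireCount` and any key strings passed to it are hypothetical, not the library's actual code or constant values.

```java
import java.util.Map;

class FieldValidationSketch {
    /** Returns the stored count, rejecting missing or negative values. */
    static long requireCount(Map<String, Long> fields, String key) {
        Long value = fields.get(key);
        if (value == null) {
            throw new IllegalArgumentException("Missing field: " + key);
        }
        if (value < 0) {
            throw new IllegalArgumentException("Negative field: " + key + "=" + value);
        }
        return value;
    }
}
```

A deserializing constructor written along these lines would call the helper once per raw counter and let the exception propagate to the caller.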

Method Detail

countDocument

public void countDocument()
Counts a document (increases the number of documents by one).


getAverageContexts

public double getAverageContexts()
Calculates and returns the average number of context representations in a document.

Specified by:
getAverageContexts in interface FeatureCountView
Returns:
the average number of context representations

getAverageFeatures

public double getAverageFeatures()
Calculates and returns the average number of non-comment features in a context representation.

Specified by:
getAverageFeatures in interface FeatureCountView
Returns:
the average number of features

getAverageUniqueFeatures

public double getAverageUniqueFeatures()
Calculates and returns the average number of unique non-comment features in a context representation.

Specified by:
getAverageUniqueFeatures in interface FeatureCountView
Returns:
the average number of unique features

getCharacters

public long getCharacters()
Returns the number of characters counted so far. Only characters within features are counted; separators between different features are ignored.

Specified by:
getCharacters in interface FeatureCountView
Returns:
the number of characters counted so far

getCharactersPerContext

public double getCharactersPerContext()
Calculates and returns the average number of characters in a context representation. Only characters within features are considered; separators between different features are ignored.

Specified by:
getCharactersPerContext in interface FeatureCountView
Returns:
the average number of characters in a context

getCharactersPerFeature

public double getCharactersPerFeature()
Calculates and returns the average number of characters in a feature.

Specified by:
getCharactersPerFeature in interface FeatureCountView
Returns:
the average number of characters in a feature
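
The five averages above are presumably ratios of the raw counters: contexts per document, features and unique features per context, and characters per context and per feature. A minimal sketch of such a ratio follows. Both the class name and the choice to return 0.0 when nothing has been counted yet are assumptions, since this page does not say how an empty denominator is handled.

```java
class AverageSketch {
    /**
     * Divides a running total by the number of items it was collected over.
     * Returning 0.0 for an empty denominator is an assumption of this sketch.
     */
    static double average(long total, long count) {
        return count == 0 ? 0.0 : (double) total / count;
    }
}
```

For example, with 2 documents, 5 contexts, 40 features and 200 characters, the sketch yields 2.5 contexts per document, 8.0 features per context, 40.0 characters per context and 5.0 characters per feature.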

getContexts

public long getContexts()
Returns the number of context representations evaluated so far.

Specified by:
getContexts in interface FeatureCountView
Returns:
the number of context representations evaluated so far

getDocuments

public long getDocuments()
Returns the number of documents counted so far.

Specified by:
getDocuments in interface FeatureCountView
Returns:
the number of documents counted so far

getFeatureSum

public long getFeatureSum()
Returns the number of non-comment features encountered so far.

Specified by:
getFeatureSum in interface FeatureCountView
Returns:
the number of non-comment features encountered so far

getUniqueFeatureSum

public long getUniqueFeatureSum()
Returns the number of non-comment, non-duplicate features encountered so far. Duplicates within the same context representation are ignored, but equal features in different representations are not recognized as duplicates.

Specified by:
getUniqueFeatureSum in interface FeatureCountView
Returns:
the number of unique non-comment features encountered so far
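
The per-context scope of uniqueness can be illustrated with a small sketch. Features are modelled as plain strings and contexts as lists, which is a simplification; `UniqueSumSketch` is a hypothetical name and not part of the library.

```java
import java.util.HashSet;
import java.util.List;

class UniqueSumSketch {
    /** Sums the number of distinct features of each context separately. */
    static long uniqueFeatureSum(List<List<String>> contexts) {
        long sum = 0;
        for (List<String> context : contexts) {
            // A fresh set per context: duplicates are only detected within it.
            sum += new HashSet<String>(context).size();
        }
        return sum;
    }
}
```

For the two contexts `[a, a, b]` and `[a, c]` the sketch returns 4 (2 + 2), although only three distinct features occur overall, because the second `a` sits in a different context.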

storeFields

public FieldMap storeFields()
Stores all relevant fields of this object in a field map for serialization. An equivalent object can be created by calling FieldMap.createObject(Class) on the created field map. The calculated averages are also stored (they are ignored when deserializing a stored instance).

Specified by:
storeFields in interface Storable
Returns:
the created field map
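
The round trip described above, in which derived averages are written out but not read back, might look roughly like this. The sketch replaces `FieldMap` with a `java.util.Map<String, Number>` and tracks only two counters; the class name, the method `fromFields` and the key strings are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class StoreSketch {
    final long documents;
    final long contexts;

    StoreSketch(long documents, long contexts) {
        this.documents = documents;
        this.contexts = contexts;
    }

    /** Writes the raw counters plus the derived average. */
    Map<String, Number> storeFields() {
        Map<String, Number> fields = new LinkedHashMap<String, Number>();
        fields.put("documents", documents);
        fields.put("contexts", contexts);
        fields.put("averageContexts",
                documents == 0 ? 0.0 : (double) contexts / documents);
        return fields;
    }

    /** Rebuilds an instance from the raw counters; the stored average is not read. */
    static StoreSketch fromFields(Map<String, Number> fields) {
        return new StoreSketch(fields.get("documents").longValue(),
                               fields.get("contexts").longValue());
    }
}
```

Because the average is recomputed from the counters, a stale or hand-edited average in the stored map has no effect on the restored object.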

update

public void update(FeatureVector features)
            throws ClassCastException
Evaluates a feature vector and updates the statistics accordingly. Comment-only features are excluded when counting features; comments are ignored when comparing features.

Parameters:
features - a feature vector representing a context
Throws:
ClassCastException - if the feature vector contains objects that aren't Features
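
The handling of comments can be sketched as below. The feature model is an assumption: `SketchFeature` is a hypothetical class with a representation and an optional comment, a `null` representation stands for a comment-only feature, and characters are counted for every non-comment feature including duplicates. None of this is confirmed by this page beyond the rules stated in the method description.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical feature model: a representation plus an optional comment. */
class SketchFeature {
    final String representation;  // null marks a comment-only feature
    final String comment;

    SketchFeature(String representation, String comment) {
        this.representation = representation;
        this.comment = comment;
    }
}

class UpdateSketch {
    long contexts;
    long features;
    long uniqueFeatures;
    long characters;

    /** Evaluates one context representation and updates the counters. */
    void update(List<SketchFeature> context) {
        Set<String> seen = new HashSet<String>();
        for (SketchFeature feature : context) {
            if (feature.representation == null) {
                continue;  // comment-only features are excluded from all counts
            }
            features++;
            // Only the feature text is measured; separators never enter the sum.
            characters += feature.representation.length();
            // Equality is decided on the representation alone, ignoring comments.
            if (seen.add(feature.representation)) {
                uniqueFeatures++;
            }
        }
        contexts++;
    }
}
```

Feeding the sketch one context with the features `tok=the` (comment "first"), `tok=the` (comment "second"), a comment-only feature and `pos=DT` gives 3 features, 2 unique features and 20 characters.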


Copyright © 2003-2007 Christian Siefkes. All Rights Reserved.