de.fu_berlin.ties.extract.reestimate
Class LengthFilter

java.lang.Object
  extended by de.fu_berlin.ties.extract.reestimate.Reestimator
      extended by de.fu_berlin.ties.extract.reestimate.LengthFilter

public class LengthFilter
extends Reestimator

A very simple re-estimator that discards any extraction longer than the longest extraction of the same type seen in the training corpus, multiplied by a tolerance factor.

Version:
$Revision: 1.9 $, $Date: 2006/10/21 16:04:17 $, $Author: siefkes $
Author:
Christian Siefkes

Field Summary
static int DEFAULT_LENGTH
          The default length used for unknown types: 3.
 
Fields inherited from class de.fu_berlin.ties.extract.reestimate.Reestimator
CONFIG_REESTIMATORS
 
Constructor Summary
LengthFilter(Reestimator precReestimator, TiesConfiguration config)
          Creates a new instance.
 
Method Summary
protected  Extraction doReestimate(Extraction extraction)
          Re-estimates the probability of an extraction.
protected  void doTrain(Extraction extraction)
          Trains this re-estimator on an extraction.
 int toleratedLength(String type)
          Returns the maximum length tolerated for extractions of a given type.
 String toString()
          Returns a string representation of this object.
 
Methods inherited from class de.fu_berlin.ties.extract.reestimate.Reestimator
createReestimators, createReestimators, getPrecedingReestimator, reestimate, train, trainOtherToken
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

DEFAULT_LENGTH

public static final int DEFAULT_LENGTH
The default length used for unknown types: 3.

See Also:
Constant Field Values
Constructor Detail

LengthFilter

public LengthFilter(Reestimator precReestimator,
                    TiesConfiguration config)
Creates a new instance.

Parameters:
precReestimator - the preceding re-estimator to use if this re-estimator is part of a chain; null otherwise
config - the configuration to use
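
The chaining described for precReestimator can be illustrated with a minimal sketch. It assumes that the inherited reestimate method first delegates to the preceding re-estimator (if any) and then calls doReestimate. That ordering is an assumption, not something this page confirms. ChainedReestimatorSketch and maxWords are hypothetical names, and plain strings stand in for Extraction objects.

```java
/** Hypothetical stand-in for a chainable re-estimator; not the actual Reestimator class. */
abstract class ChainedReestimatorSketch {
    /** The preceding element of the chain, or null if there is none. */
    private final ChainedReestimatorSketch preceding;

    ChainedReestimatorSketch(ChainedReestimatorSketch preceding) {
        this.preceding = preceding;
    }

    /** Assumed order: run the preceding re-estimator first, then this one. */
    final String reestimate(String extraction) {
        String current = extraction;
        if (preceding != null) {
            current = preceding.reestimate(current);
        }
        // A null result means an earlier stage already discarded the extraction.
        return (current == null) ? null : doReestimate(current);
    }

    protected abstract String doReestimate(String extraction);

    /** Hypothetical filter that drops strings with more than 'limit' words. */
    static ChainedReestimatorSketch maxWords(ChainedReestimatorSketch preceding, int limit) {
        return new ChainedReestimatorSketch(preceding) {
            @Override
            protected String doReestimate(String extraction) {
                int words = extraction.trim().split("\\s+").length;
                return words > limit ? null : extraction;
            }
        };
    }

    public static void main(String[] args) {
        ChainedReestimatorSketch first = maxWords(null, 5);   // no predecessor
        ChainedReestimatorSketch second = maxWords(first, 3); // chained after 'first'
        System.out.println(second.reestimate("a b"));         // prints "a b"
        System.out.println(second.reestimate("a b c d"));     // prints "null"
    }
}
```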
Method Detail

toleratedLength

public int toleratedLength(String type)
Returns the maximum length tolerated for extractions of a given type. The returned value is calculated by multiplying the maximum extraction length seen so far during training by the tolerance factor and rounding the result down to the nearest integer.

Parameters:
type - the extraction type
Returns:
the maximum tolerated length for extractions of this type
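
The calculation described above can be sketched as follows. This is an illustration of the arithmetic only. ToleratedLengthSketch and its parameter names are hypothetical, and how the real class obtains the tolerance factor is not covered by this page.

```java
/** Hypothetical illustration of the documented calculation; not the actual implementation. */
class ToleratedLengthSketch {
    /**
     * Multiplies the longest length seen in training by the tolerance factor
     * and rounds the result down to the nearest integer.
     */
    static int toleratedLength(int maxSeenLength, double toleranceFactor) {
        return (int) Math.floor(maxSeenLength * toleranceFactor);
    }

    public static void main(String[] args) {
        System.out.println(toleratedLength(5, 1.5));   // 7.5 rounded down: prints 7
        System.out.println(toleratedLength(10, 1.25)); // 12.5 rounded down: prints 12
        System.out.println(toleratedLength(3, 2.0));   // prints 6
    }
}
```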

doReestimate

protected Extraction doReestimate(Extraction extraction)
Re-estimates the probability of an extraction. For this filter, an extraction that is longer than the tolerated length for its type is discarded.

Specified by:
doReestimate in class Reestimator
Parameters:
extraction - the extraction to re-estimate
Returns:
the re-estimated extraction; or null if the extraction should be deleted
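
A minimal sketch of the filtering step, under stated assumptions: length is measured as a token count, limits are looked up per type, and unknown types fall back to a fixed limit. LengthFilterSketch and SimpleExtraction are hypothetical stand-ins for the real classes.

```java
import java.util.Map;

/** Hypothetical length filter; not the actual LengthFilter implementation. */
class LengthFilterSketch {
    /** Hypothetical minimal extraction: a type and a length in tokens (assumed unit). */
    static final class SimpleExtraction {
        final String type;
        final int tokenCount;

        SimpleExtraction(String type, int tokenCount) {
            this.type = type;
            this.tokenCount = tokenCount;
        }
    }

    private final Map<String, Integer> toleratedLengths;
    private final int fallbackLength;

    LengthFilterSketch(Map<String, Integer> toleratedLengths, int fallbackLength) {
        this.toleratedLengths = toleratedLengths;
        this.fallbackLength = fallbackLength;
    }

    /** Returns the extraction unchanged, or null if it exceeds the tolerated length. */
    SimpleExtraction doReestimate(SimpleExtraction extraction) {
        int limit = toleratedLengths.getOrDefault(extraction.type, fallbackLength);
        return extraction.tokenCount > limit ? null : extraction;
    }

    public static void main(String[] args) {
        LengthFilterSketch filter = new LengthFilterSketch(Map.of("speaker", 7), 3);
        System.out.println(filter.doReestimate(new SimpleExtraction("speaker", 7)) != null);  // prints true
        System.out.println(filter.doReestimate(new SimpleExtraction("speaker", 8)) != null);  // prints false
        System.out.println(filter.doReestimate(new SimpleExtraction("location", 4)) != null); // prints false
    }
}
```

An extraction whose length equals the limit is kept here, which matches the wording "longer than" in the class description.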

doTrain

protected void doTrain(Extraction extraction)
Trains this re-estimator on an extraction.

Specified by:
doTrain in class Reestimator
Parameters:
extraction - the extraction to train on
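
Training for a filter of this kind plausibly amounts to remembering the longest extraction seen per type. The sketch below assumes exactly that, plus a fallback to the documented DEFAULT_LENGTH of 3 for types never seen in training. MaxLengthTrainerSketch and its method signatures are hypothetical and simplified (type and length are passed directly instead of an Extraction).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical training bookkeeping; not the actual LengthFilter implementation. */
class MaxLengthTrainerSketch {
    /** Mirrors the documented default length for unknown types. */
    static final int DEFAULT_LENGTH = 3;

    /** Longest length seen so far for each extraction type. */
    private final Map<String, Integer> maxLengths = new HashMap<>();

    /** Records one training extraction, keeping only the maximum per type. */
    void doTrain(String type, int length) {
        maxLengths.merge(type, length, Integer::max);
    }

    /** Returns the longest length seen for the type, or DEFAULT_LENGTH if unseen. */
    int maxLength(String type) {
        return maxLengths.getOrDefault(type, DEFAULT_LENGTH);
    }

    public static void main(String[] args) {
        MaxLengthTrainerSketch trainer = new MaxLengthTrainerSketch();
        trainer.doTrain("speaker", 2);
        trainer.doTrain("speaker", 5);
        trainer.doTrain("speaker", 4);
        System.out.println(trainer.maxLength("speaker"));  // prints 5
        System.out.println(trainer.maxLength("location")); // prints 3
    }
}
```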

toString

public String toString()
Returns a string representation of this object.

Overrides:
toString in class Reestimator
Returns:
a textual representation


Copyright © 2003-2007 Christian Siefkes. All Rights Reserved.