de.fu_berlin.ties.filter
Class TrainableFilteringTokenWalker

java.lang.Object
  extended by de.fu_berlin.ties.xml.dom.TokenWalker
      extended by de.fu_berlin.ties.filter.FilteringTokenWalker
          extended by de.fu_berlin.ties.filter.TrainableFilteringTokenWalker

public class TrainableFilteringTokenWalker
extends FilteringTokenWalker

A filtering token walker that can be trained.

Version:
$Revision: 1.8 $, $Date: 2004/09/15 15:56:54 $, $Author: siefkes $
Author:
Christian Siefkes

Constructor Summary
TrainableFilteringTokenWalker(TokenProcessor processor, TokenizerFactory tFactory, TrainableFilter elementFilter, SkipHandler sHandler, Oracle elementOracle)
          Creates a new instance with training of the filter enabled.
TrainableFilteringTokenWalker(TokenProcessor processor, TokenizerFactory tFactory, TrainableFilter elementFilter, SkipHandler sHandler, Oracle elementOracle, boolean enableTraining)
          Creates a new instance.
 
Method Summary
protected  boolean handleAccept(Element element, Element filteredElement, boolean decision)
          This method can be overridden by subclasses to modify decisions of the element filter. This implementation relies on the oracle to make the final decision and joins the predicted decision and the correct decision via OR.
 boolean isTrainingEnabled()
          Returns true if training the embedded filter is enabled (default).
 
Methods inherited from class de.fu_berlin.ties.filter.FilteringTokenWalker
getAcceptedElements, getFilter, getRejectedElements, processToken, toString, walk
 
Methods inherited from class de.fu_berlin.ties.xml.dom.TokenWalker
processCollectedText, walk
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

TrainableFilteringTokenWalker

public TrainableFilteringTokenWalker(TokenProcessor processor,
                                     TokenizerFactory tFactory,
                                     TrainableFilter elementFilter,
                                     SkipHandler sHandler,
                                     Oracle elementOracle)
Creates a new instance with training of the filter enabled.

Parameters:
processor - used to process the tokens
tFactory - used to instantiate tokenizers
elementFilter - the trainable element filter to use
sHandler - a handler that is called whenever some tokens are skipped; may be null
elementOracle - oracle queried to decide which elements should be accepted by the trainable filter

TrainableFilteringTokenWalker

public TrainableFilteringTokenWalker(TokenProcessor processor,
                                     TokenizerFactory tFactory,
                                     TrainableFilter elementFilter,
                                     SkipHandler sHandler,
                                     Oracle elementOracle,
                                     boolean enableTraining)
Creates a new instance.

Parameters:
processor - used to process the tokens
tFactory - used to instantiate tokenizers
elementFilter - the trainable element filter to use
sHandler - a handler that is called whenever some tokens are skipped; may be null
elementOracle - oracle queried to decide which elements should be accepted by the trainable filter
enableTraining - if true, the embedded filter is trained from the decisions of the oracle; otherwise the oracle is only queried to log whether the filter made a mistake
Method Detail

handleAccept

protected boolean handleAccept(Element element,
                               Element filteredElement,
                               boolean decision)
                        throws ProcessingException
This method can be overridden by subclasses to modify decisions of the element filter. The standard behavior is to accept the decision as is; this implementation instead relies on the oracle to make the final decision and joins the predicted decision and the correct decision via OR. This allows the next step to view the tokenized text in all necessary cases: when the oracle determines that it should view it, or when it would view it because of the trainable classifier's prediction. It also gives the trainable filter the chance to train itself on the correct decision, even if the original decision was already correct, since there are classifiers (e.g. Winnow) that are not purely error-driven but also learn from reinforcement of (some) correct instances.

Overrides:
handleAccept in class FilteringTokenWalker
Parameters:
element - the element to test
filteredElement - the element that was actually filtered (element or a parent), or null if the decision had been cached (no filtering took place)
decision - the decision of the element filter
Returns:
the revised decision
Throws:
ProcessingException - if an error occurs while revising the decision
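
The OR join described for handleAccept can be illustrated with a small, self-contained sketch. It does not use the TIES classes: OracleJoinSketch, RecordingFilter, revise, and the mismatches counter are hypothetical names introduced only for illustration, and the boolean "correct" stands in for the answer an Oracle would give. Two points are assumptions rather than documented behavior: that the OR join is applied whether or not training is enabled, and that a wrong prediction is merely counted where the real class would log it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from the TIES API.
final class OracleJoinSketch {

    // Stand-in for a trainable filter: it only records what it was shown.
    static final class RecordingFilter {
        final List<Boolean> trainedOn = new ArrayList<>();
        int mismatches = 0;

        void train(boolean correctDecision) {
            trainedOn.add(correctDecision);
        }
    }

    // Joins the filter's prediction with the oracle's answer via OR.
    static boolean revise(boolean predicted, boolean correct,
                          boolean trainingEnabled, RecordingFilter filter) {
        if (predicted != correct) {
            // Assumption: the real class would log the mistake here.
            filter.mismatches++;
        }
        if (trainingEnabled) {
            // Train on every decision, not only on errors, so that
            // reinforcement-style learners (e.g. Winnow) also see
            // the cases the filter already got right.
            filter.train(correct);
        }
        // Accept if either the filter or the oracle says "accept".
        return predicted || correct;
    }

    public static void main(String[] args) {
        RecordingFilter filter = new RecordingFilter();
        boolean[][] cases = {
            {false, false}, {false, true}, {true, false}, {true, true}
        };
        for (boolean[] c : cases) {
            boolean revised = revise(c[0], c[1], true, filter);
            System.out.println("predicted=" + c[0]
                + " correct=" + c[1] + " revised=" + revised);
        }
    }
}
```

In this sketch an element is rejected only when both the prediction and the oracle reject it, which mirrors the statement that the next step sees the tokenized text in all necessary cases.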

isTrainingEnabled

public boolean isTrainingEnabled()
Returns true if training the embedded filter is enabled (default).

Returns:
whether training is enabled


Copyright © 2003-2004 Christian Siefkes. All Rights Reserved.