de.fu_berlin.ties.classify
Class MultiBinaryClassifier

java.lang.Object
  extended by de.fu_berlin.ties.classify.TrainableClassifier
      extended by de.fu_berlin.ties.classify.MultiBinaryClassifier
All Implemented Interfaces:
Classifier, XMLStorable

public class MultiBinaryClassifier
extends TrainableClassifier

This classifier converts a multi-class classification task into several binary (two-class) classification tasks. It wraps several instances of another classifier, uses them to perform the binary classifications, and combines their results.

The first class from the set of classes passed to the constructor is considered as the "background" class, while all further members are considered as "foreground" classes.
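
The decomposition described above can be sketched as follows. This is only an illustration under stated assumptions, not the actual implementation: the class BinaryTaskSketch and the method binaryTasks are hypothetical names, and an insertion-ordered set (LinkedHashSet) is assumed so that "the first class" is well defined.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration; not part of the TIES API. */
final class BinaryTaskSketch {

    /** Returns one {background, foreground} pair per foreground class. */
    static List<String[]> binaryTasks(Set<String> allValidClasses) {
        if (allValidClasses.size() < 3) {
            // Mirrors the documented constructor contract: with fewer than
            // three classes there is nothing to decompose.
            throw new IllegalArgumentException("Need at least three classes");
        }
        Iterator<String> it = allValidClasses.iterator();
        String background = it.next(); // first member is the background class
        List<String[]> tasks = new ArrayList<>();
        while (it.hasNext()) {
            // every further member is a foreground class
            tasks.add(new String[] {background, it.next()});
        }
        return tasks;
    }

    public static void main(String[] args) {
        // Assumed example labels; any class names would do.
        Set<String> classes =
            new LinkedHashSet<>(Arrays.asList("O", "PERSON", "LOCATION"));
        for (String[] task : binaryTasks(classes)) {
            System.out.println(task[0] + " vs. " + task[1]);
        }
    }
}
```

With n classes this yields n - 1 binary tasks, one per foreground class, each of which would be handled by its own inner classifier.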

Instances of this class are thread-safe if and only if instances of the wrapped classifier are.

WARNING: The current implementation does not query the shouldTrain method of inner classifiers. As a result, classifiers that override shouldTrain might not work correctly within this classifier.

Version:
$Revision: 1.19 $, $Date: 2004/12/09 18:09:14 $, $Author: siefkes $
Author:
Christian Siefkes

Field Summary
 
Fields inherited from class de.fu_berlin.ties.classify.TrainableClassifier
META_CLASSIFIER, MULTI_CLASSIFIER, OAR_CLASSIFIER
 
Fields inherited from interface de.fu_berlin.ties.classify.Classifier
CONFIG_CLASSIFIER
 
Constructor Summary
MultiBinaryClassifier(Set<String> allValidClasses, FeatureTransformer trans, File runDirectory, String[] innerSpec, TiesConfiguration conf)
          Creates a new instance.
 
Method Summary
protected  Set<String> createBinarySet(String foregroundClass)
          Helper method that creates a set containing the two classes of a binary classifier.
protected  PredictionDistribution doClassify(FeatureVector features, Set candidateClasses, ContextMap context)
          Classifies an item that is represented by a feature vector by choosing the most probable class among a set of candidate classes. This implementation combines the foreground-class predictions of all involved inner classifiers.
protected  void doTrain(FeatureVector features, String targetClass, ContextMap context)
          Incorporates an item that is represented by a feature vector into the classification model.
 String getBackgroundClass()
          Returns the "background" class of this classifier.
 void reset()
          Resets the classifier, completely deleting the prediction model.
 String toString()
          Returns a string representation of this object.
protected  boolean trainOnErrorHook(PredictionDistribution predDist, FeatureVector features, String targetClass, Set candidateClasses, ContextMap context)
          Subclasses can implement this hook for more refined error-driven learning.
 
Methods inherited from class de.fu_berlin.ties.classify.TrainableClassifier
classify, createClassifier, createClassifier, createClassifier, createClassifier, getAllClasses, getConfig, shouldTrain, toElement, train, trainOnError
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

MultiBinaryClassifier

public MultiBinaryClassifier(Set<String> allValidClasses,
                             FeatureTransformer trans,
                             File runDirectory,
                             String[] innerSpec,
                             TiesConfiguration conf)
                      throws IllegalArgumentException,
                             ProcessingException
Creates a new instance.

Parameters:
allValidClasses - the set of all valid classes; the first member of this set is considered as the "background" class, all further members are considered as "foreground" classes
trans - the last transformer in the transformer chain to use, or null if no feature transformers should be used
runDirectory - optional run directory passed to inner classifiers of the ExternalClassifier type
innerSpec - the specification used to initialize the inner classifiers, passed to the TrainableClassifier.createClassifier(Set, File, FeatureTransformer, String[], TiesConfiguration) factory method
conf - used to configure this instance and the inner classifiers
Throws:
IllegalArgumentException - if there are fewer than three classes (in which case you should use the inner classifier directly since there is no need to wrap several instances)
ProcessingException - if an error occurred while creating this classifier or one of the wrapped classifiers
Method Detail

createBinarySet

protected Set<String> createBinarySet(String foregroundClass)
Helper method that creates a set containing the two classes of a binary classifier.

Parameters:
foregroundClass - the "foreground" class to use
Returns:
a set containing the "background" class and the specified foregroundClass; this implementation returns the two classes in alphabetic order
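
A minimal sketch of how such a two-element, alphabetically ordered set could be built. The class name BinarySetSketch is hypothetical, and the background class is passed explicitly here because the sketch has no classifier instance to read it from. A TreeSet is assumed as one way to obtain the ordering; note that the natural String order is case-sensitive, which may differ from the ordering the real method uses.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical illustration; not part of the TIES API. */
final class BinarySetSketch {

    /** Builds the class set for one binary (background vs. foreground) task. */
    static Set<String> createBinarySet(String backgroundClass,
                                       String foregroundClass) {
        // TreeSet iterates in natural String order (assumed to be the
        // "alphabetic order" mentioned in the documentation).
        Set<String> result = new TreeSet<>();
        result.add(backgroundClass);
        result.add(foregroundClass);
        return result;
    }
}
```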

doClassify

protected PredictionDistribution doClassify(FeatureVector features,
                                            Set candidateClasses,
                                            ContextMap context)
                                     throws ProcessingException
Classifies an item that is represented by a feature vector by choosing the most probable class among a set of candidate classes. This implementation combines the foreground-class predictions of all involved inner classifiers.

If the background class is among the candidateClasses, the inner classifier whose background probability is closest to 0.5 determines the probability of the background class. This ensures that all classes are sorted correctly: foreground classes with a probability higher than 0.5 come before the background class, those with a lower probability come after it. pR values are not considered for this purpose, so be careful when using classifiers that mainly rely on the pR value in a multi-binary setup.
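
The "closest to 0.5" rule can be illustrated with a small sketch. It assumes that each inner classifier reports a foreground probability p and that its background probability is 1 - p; the class and method names are hypothetical, and a plain Map stands in for the library's own result types.

```java
import java.util.Map;

/** Hypothetical illustration; not part of the TIES API. */
final class BackgroundProbabilitySketch {

    /**
     * Picks the background probability that is closest to 0.5.
     * Keys are foreground classes, values their foreground probabilities.
     * Returns NaN for an empty map.
     */
    static double backgroundProbability(Map<String, Double> foregroundProbs) {
        double best = Double.NaN;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (double fg : foregroundProbs.values()) {
            double bg = 1.0 - fg; // assumption: binary estimates sum to 1
            double distance = Math.abs(bg - 0.5);
            if (distance < bestDistance) {
                bestDistance = distance;
                best = bg;
            }
        }
        return best;
    }
}
```

For example, with foreground probabilities 0.875 and 0.25, the candidate background probabilities are 0.125 and 0.75. The latter is closer to 0.5, so sorting by probability gives 0.875 (foreground), 0.75 (background), 0.25 (foreground): the foreground class above 0.5 ranks before the background class, the one below 0.5 after it.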

The probability estimates returned by each classifier are used "as is", so the result will not be a real probability distribution because the sum of all probabilities will be more than 1. If you want to work with a real probability distribution, you have to normalize it yourself.
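
One simple way to normalize such scores is to divide each by their sum. The sketch below works on a plain Map from class name to score, because the accessors of PredictionDistribution are not shown on this page; the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration; not part of the TIES API. */
final class NormalizeSketch {

    /** Rescales the scores so that they sum to 1, keeping their order. */
    static Map<String, Double> normalize(Map<String, Double> scores) {
        double sum = 0.0;
        for (double value : scores.values()) {
            sum += value;
        }
        if (sum <= 0.0) {
            throw new IllegalArgumentException("Scores must have a positive sum");
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Double> entry : scores.entrySet()) {
            result.put(entry.getKey(), entry.getValue() / sum);
        }
        return result;
    }
}
```

Dividing by the sum preserves the ranking of the classes, so the most probable class stays the same; only the scale changes.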

Specified by:
doClassify in class TrainableClassifier
Parameters:
features - the feature vector to consider
candidateClasses - a set of classes that are allowed for this item
context - can be used to transport implementation-specific contextual information between the TrainableClassifier.doClassify(FeatureVector, Set, ContextMap), TrainableClassifier.doTrain(FeatureVector, String, ContextMap), and TrainableClassifier.trainOnErrorHook(PredictionDistribution, FeatureVector, String, Set, ContextMap) methods
Returns:
the result of the classification; you can call PredictionDistribution.best() to get the most probable class
Throws:
ProcessingException - if an error occurs during classification

doTrain

protected void doTrain(FeatureVector features,
                       String targetClass,
                       ContextMap context)
                throws ProcessingException
Incorporates an item that is represented by a feature vector into the classification model.

Specified by:
doTrain in class TrainableClassifier
Parameters:
features - the feature vector to consider
targetClass - the class of this feature vector
context - can be used to transport implementation-specific contextual information between the TrainableClassifier.doClassify(FeatureVector, Set, ContextMap), TrainableClassifier.doTrain(FeatureVector, String, ContextMap), and TrainableClassifier.trainOnErrorHook(PredictionDistribution, FeatureVector, String, Set, ContextMap) methods
Throws:
ProcessingException - if an error occurs during training

getBackgroundClass

public String getBackgroundClass()
Returns the "background" class of this classifier.

Returns:
the "background" class of this classifier

toString

public String toString()
Returns a string representation of this object.

Overrides:
toString in class TrainableClassifier
Returns:
a textual representation

trainOnErrorHook

protected boolean trainOnErrorHook(PredictionDistribution predDist,
                                   FeatureVector features,
                                   String targetClass,
                                   Set candidateClasses,
                                   ContextMap context)
                            throws ProcessingException
Subclasses can implement this hook for more refined error-driven learning. It is called from the TrainableClassifier.trainOnError(FeatureVector, String, Set) method after classifying. This method can do any necessary training itself and return true to signal that no further action is necessary. This implementation is just a placeholder that always returns false.

Overrides:
trainOnErrorHook in class TrainableClassifier
Parameters:
predDist - the prediction distribution returned by TrainableClassifier.classify(FeatureVector, Set)
features - the feature vector to consider
targetClass - the expected class of this feature vector; must be contained in the set of candidateClasses
candidateClasses - a set of classes that are allowed for this item (the actual targetClass must be one of them)
context - can be used to transport implementation-specific contextual information between the TrainableClassifier.doClassify(FeatureVector, Set, ContextMap), TrainableClassifier.doTrain(FeatureVector, String, ContextMap), and TrainableClassifier.trainOnErrorHook(PredictionDistribution, FeatureVector, String, Set, ContextMap) methods
Returns:
this implementation always returns false; subclasses can return true to signal that any error-driven learning was already handled
Throws:
ProcessingException - if an error occurs during training
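
The hook mechanism described above follows the template-method pattern. The sketch below shows the general idea with simplified, hypothetical types (plain strings instead of feature vectors and prediction distributions). In particular, the fallback behaviour when the hook returns false (train only if the prediction was wrong) and the meaning of the return value are assumptions, since this page does not spell them out.

```java
/** Hypothetical illustration; not part of the TIES API. */
abstract class HookedLearnerSketch {

    /** Counts fallback training calls so the effect can be observed. */
    int defaultTrainCalls = 0;

    /** Stand-in for classification; returns the predicted class. */
    protected abstract String classify(String item);

    /** Placeholder hook: returning false means "nothing handled here". */
    protected boolean trainOnErrorHook(String predicted, String item,
                                       String target) {
        return false;
    }

    /** Stand-in for regular training. */
    protected void train(String item, String target) {
        defaultTrainCalls++;
    }

    /** Classifies first, then gives the hook a chance to handle learning. */
    final boolean trainOnError(String item, String target) {
        String predicted = classify(item);
        if (trainOnErrorHook(predicted, item, target)) {
            return true; // the hook did all necessary training itself
        }
        if (!predicted.equals(target)) {
            // assumed fallback: train only on misclassified items
            train(item, target);
            return true;
        }
        return false;
    }
}
```

A subclass that overrides trainOnErrorHook to return true takes over error-driven learning completely; one that keeps the placeholder gets the fallback behaviour.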

reset

public void reset()
           throws ProcessingException
Resets the classifier, completely deleting the prediction model.

Specified by:
reset in class TrainableClassifier
Throws:
ProcessingException - if an error occurs during reset


Copyright © 2003-2004 Christian Siefkes. All Rights Reserved.