de.fu_berlin.ties.classify
Class MultiBinaryClassifier

java.lang.Object
  extended by de.fu_berlin.ties.classify.TrainableClassifier
      extended by de.fu_berlin.ties.classify.MultiBinaryClassifier
All Implemented Interfaces:
Classifier, XMLStorable

public class MultiBinaryClassifier
extends TrainableClassifier

This classifier converts a multi-class classification task into several binary (two-class) classification tasks. It wraps several instances of another classifier that are used to perform the binary classifications and combines their results.

The first class from the set of classes passed to the constructor is considered as the "background" class, while all further members are considered as "foreground" classes.
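
The split into binary tasks described above can be sketched as follows. This is a minimal illustration, not the actual implementation: the class BinaryTaskSketch and the method binaryTasks are hypothetical names, and the sketch assumes the set has a well-defined iteration order (for example a LinkedHashSet), since "first" is otherwise ambiguous for a Set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustrative sketch only; not part of the de.fu_berlin.ties API. */
final class BinaryTaskSketch {

    /**
     * Builds one (background, foreground) pair per foreground class.
     * The first member in iteration order is taken as the background class.
     */
    static List<String[]> binaryTasks(Set<String> allValidClasses) {
        if (allValidClasses.size() < 3) {
            // With fewer than three classes there is nothing to split.
            throw new IllegalArgumentException("need at least three classes");
        }
        Iterator<String> it = allValidClasses.iterator();
        String background = it.next();
        List<String[]> tasks = new ArrayList<>();
        while (it.hasNext()) {
            tasks.add(new String[] {background, it.next()});
        }
        return tasks;
    }

    public static void main(String[] args) {
        Set<String> classes =
            new LinkedHashSet<>(Arrays.asList("other", "person", "place"));
        for (String[] task : binaryTasks(classes)) {
            System.out.println(task[0] + " vs. " + task[1]);
        }
    }
}
```

With n classes this yields n - 1 binary tasks, one per foreground class, each of which would be handled by its own inner classifier.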

Instances of this class are thread-safe if and only if instances of the wrapped classifier are.

WARNING: The current implementation does not query the shouldTrain method of inner classifiers. Because of this, classifiers overriding shouldTrain might not work correctly within this classifier.

Version:
$Revision: 1.26 $, $Date: 2006/10/21 16:03:54 $, $Author: siefkes $
Author:
Christian Siefkes

Field Summary
(package private) static QName ATTRIB_FOR
          Attribute name used for XML serialization.
(package private) static QName ELEMENT_INNER
          Element name used for XML serialization.
 
Fields inherited from class de.fu_berlin.ties.classify.TrainableClassifier
ATTRIB_CLASSES, ATTRIB_TRAIN_ALL, ELEMENT_MAIN, META_CLASSIFIER, MULTI_CLASSIFIER, OAR_CLASSIFIER, TIE_CLASSIFIER
 
Fields inherited from interface de.fu_berlin.ties.classify.Classifier
CONFIG_CLASSIFIER
 
Constructor Summary
MultiBinaryClassifier(Element element)
          Creates a new instance from an XML element, fulfilling the recommendation of the XMLStorable interface.
MultiBinaryClassifier(Set<String> allValidClasses, FeatureTransformer trans, File runDirectory, String[] innerSpec, TiesConfiguration conf)
          Creates a new instance.
 
Method Summary
protected  Set<String> createBinarySet(String foregroundClass)
          Helper method that creates a set containing the two classes of a binary classifier.
 void destroy()
          Destroys the classifier.
protected  PredictionDistribution doClassify(FeatureVector features, Set candidateClasses, ContextMap context)
          Classifies an item that is represented by a feature vector by choosing the most probable class among a set of candidate classes. This implementation combines the predictions for the foreground of all involved inner classifiers.
protected  void doTrain(FeatureVector features, String targetClass, ContextMap context)
          Incorporates an item that is represented by a feature vector into the classification model.
 String getBackgroundClass()
          Returns the "background" class of this classifier.
 void reset()
          Resets the classifier, completely deleting the prediction model.
 ObjectElement toElement()
          Stores all relevant fields of this object in an XML element for serialization. An equivalent object can be created by calling ObjectElement.createObject(org.dom4j.Element, Class) on the created element. Subclasses of TrainableClassifier should extend this method and the corresponding constructor from Element to ensure (de)serialization works as expected.
 String toString()
          Returns a string representation of this object.
protected  boolean trainOnErrorHook(PredictionDistribution predDist, FeatureVector features, String targetClass, Set candidateClasses, ContextMap context)
          Subclasses can implement this hook for more refined error-driven learning.
 
Methods inherited from class de.fu_berlin.ties.classify.TrainableClassifier
classify, createClassifier, createClassifier, createClassifier, createClassifier, createClassifier, doTrainOnError, getAllClasses, getConfig, shouldTrain, train, trainOnError
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

ELEMENT_INNER

static final QName ELEMENT_INNER
Element name used for XML serialization.


ATTRIB_FOR

static final QName ATTRIB_FOR
Attribute name used for XML serialization.

Constructor Detail

MultiBinaryClassifier

public MultiBinaryClassifier(Element element)
                      throws InstantiationException
Creates a new instance from an XML element, fulfilling the recommendation of the XMLStorable interface.

Parameters:
element - the XML element containing the serialized representation
Throws:
InstantiationException - if the given element does not contain a valid classifier description

MultiBinaryClassifier

public MultiBinaryClassifier(Set<String> allValidClasses,
                             FeatureTransformer trans,
                             File runDirectory,
                             String[] innerSpec,
                             TiesConfiguration conf)
                      throws IllegalArgumentException,
                             ProcessingException
Creates a new instance.

Parameters:
allValidClasses - the set of all valid classes; the first member of this set is considered as the "background" class, all further members are considered as "foreground" classes
trans - the last transformer in the transformer chain to use, or null if no feature transformers should be used
runDirectory - optional run directory passed to inner classifiers of the ExternalClassifier type
innerSpec - the specification used to initialize the inner classifiers, passed to the TrainableClassifier.createClassifier(Set, File, FeatureTransformer, String[], TiesConfiguration) factory method
conf - used to configure this instance and the inner classifiers
Throws:
IllegalArgumentException - if there are fewer than three classes (in which case you should use the inner classifier directly since there is no need to wrap several instances)
ProcessingException - if an error occurred while creating this classifier or one of the wrapped classifiers
Method Detail

createBinarySet

protected Set<String> createBinarySet(String foregroundClass)
Helper method that creates a set containing the two classes of a binary classifier.

Parameters:
foregroundClass - the "foreground" class to use
Returns:
a set containing the "background" class and the specified foregroundClass; this implementation returns the two classes in alphabetic order
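
A helper of this kind could look roughly like the sketch below. It is an assumption-laden illustration, not the library code: BinarySetSketch and binarySet are hypothetical names, and the background class is passed in explicitly instead of being read from the classifier. Note that a TreeSet orders strings by their natural (case-sensitive, code-unit based) ordering, which only approximates "alphabetic order".

```java
import java.util.SortedSet;
import java.util.TreeSet;

/** Illustrative sketch only; not part of the de.fu_berlin.ties API. */
final class BinarySetSketch {

    /**
     * Returns a two-element set holding the background and the foreground
     * class, iterated in the natural ordering of String.
     */
    static SortedSet<String> binarySet(String backgroundClass,
                                       String foregroundClass) {
        SortedSet<String> result = new TreeSet<>();
        result.add(backgroundClass);
        result.add(foregroundClass);
        return result;
    }

    public static void main(String[] args) {
        // Iteration order is independent of the argument order.
        System.out.println(binarySet("other", "location"));
    }
}
```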

destroy

public void destroy()
             throws ProcessingException
Destroys the classifier. This method must be called only if the classifier will never be used again. The default implementation delegates to TrainableClassifier.reset(), but subclasses can override this behaviour if appropriate.

Specified by:
destroy in interface Classifier
Overrides:
destroy in class TrainableClassifier
Throws:
ProcessingException - if an error occurs while the classifier is being destroyed

doClassify

protected PredictionDistribution doClassify(FeatureVector features,
                                            Set candidateClasses,
                                            ContextMap context)
                                     throws ProcessingException
Classifies an item that is represented by a feature vector by choosing the most probable class among a set of candidate classes. This implementation combines the predictions for the foreground of all involved inner classifiers.

If the background class is part of candidateClasses, the inner classifier whose background probability is closest to 0.5 determines the probability of the background class. In this way all classes are sorted correctly (foreground classes with a probability higher than 0.5 before the background class, those with a lower probability after it). pR values are not considered for this purpose, so you should be careful when combining classifiers that mainly rely on the pR value in a multi-binary setup.

The probability estimates returned by each classifier are used "as is", so the result will not be a real probability distribution because the sum of all probabilities will be more than 1. If you want to work on a real probability distribution, you have to normalize it yourself.
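
The combination rule and a possible normalization step can be sketched as follows. This is a simplified illustration rather than the actual implementation: CombineSketch, combine and normalize are hypothetical names, plain maps stand in for PredictionDistribution, and the sketch assumes that each inner classifier's background probability is 1 minus its foreground probability.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch only; not part of the de.fu_berlin.ties API. */
final class CombineSketch {

    /**
     * Copies the foreground probabilities and adds the background class,
     * using the background probability that is closest to 0.5.
     * Assumption: background probability = 1 - foreground probability.
     */
    static Map<String, Double> combine(String backgroundClass,
                                       Map<String, Double> foregroundProbs) {
        if (foregroundProbs.isEmpty()) {
            throw new IllegalArgumentException("no inner predictions");
        }
        Map<String, Double> combined = new LinkedHashMap<>(foregroundProbs);
        double bestBackground = 0.0;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (double foreground : foregroundProbs.values()) {
            double background = 1.0 - foreground;
            double distance = Math.abs(background - 0.5);
            if (distance < bestDistance) {
                bestDistance = distance;
                bestBackground = background;
            }
        }
        combined.put(backgroundClass, bestBackground);
        return combined;
    }

    /** Rescales the scores so that they sum to 1. */
    static Map<String, Double> normalize(Map<String, Double> scores) {
        double sum = 0.0;
        for (double value : scores.values()) {
            sum += value;
        }
        if (sum <= 0.0) {
            throw new IllegalArgumentException("scores must have a positive sum");
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Double> entry : scores.entrySet()) {
            result.put(entry.getKey(), entry.getValue() / sum);
        }
        return result;
    }
}
```

For example, foreground estimates of 0.9 and 0.3 give background candidates of 0.1 and 0.7; the latter is closer to 0.5, so the background class is placed between the two foreground classes. The raw scores then sum to 1.9, which is why normalization is needed before treating them as a distribution.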

Specified by:
doClassify in class TrainableClassifier
Parameters:
features - the feature vector to consider
candidateClasses - a set of classes that are allowed for this item
context - can be used to transport implementation-specific contextual information between the TrainableClassifier.doClassify(FeatureVector, Set, ContextMap), TrainableClassifier.doTrain(FeatureVector, String, ContextMap), and TrainableClassifier.trainOnErrorHook(PredictionDistribution, FeatureVector, String, Set, ContextMap) methods
Returns:
the result of the classification; you can call PredictionDistribution.best() to get the most probable class
Throws:
ProcessingException - if an error occurs during classification

doTrain

protected void doTrain(FeatureVector features,
                       String targetClass,
                       ContextMap context)
                throws ProcessingException
Incorporates an item that is represented by a feature vector into the classification model.

Specified by:
doTrain in class TrainableClassifier
Parameters:
features - the feature vector to consider
targetClass - the class of this feature vector
context - can be used to transport implementation-specific contextual information between the TrainableClassifier.doClassify(FeatureVector, Set, ContextMap), TrainableClassifier.doTrain(FeatureVector, String, ContextMap), and TrainableClassifier.trainOnErrorHook(PredictionDistribution, FeatureVector, String, Set, ContextMap) methods
Throws:
ProcessingException - if an error occurs during training

getBackgroundClass

public String getBackgroundClass()
Returns the "background" class of this classifier.

Returns:
the "background" class

toElement

public ObjectElement toElement()
Stores all relevant fields of this object in an XML element for serialization. An equivalent object can be created by calling ObjectElement.createObject(org.dom4j.Element, Class) on the created element. Subclasses of TrainableClassifier should extend this method and the corresponding constructor from Element to ensure (de)serialization works as expected.

Specified by:
toElement in interface XMLStorable
Overrides:
toElement in class TrainableClassifier
Returns:
the created XML element

toString

public String toString()
Returns a string representation of this object.

Overrides:
toString in class TrainableClassifier
Returns:
a textual representation

trainOnErrorHook

protected boolean trainOnErrorHook(PredictionDistribution predDist,
                                   FeatureVector features,
                                   String targetClass,
                                   Set candidateClasses,
                                   ContextMap context)
                            throws ProcessingException
Subclasses can implement this hook for more refined error-driven learning. It is called from the TrainableClassifier.trainOnError(FeatureVector, String, Set) method after classifying. This method can do any necessary training itself and return true to signal that no further action is necessary. This implementation is just a placeholder that always returns false.

Overrides:
trainOnErrorHook in class TrainableClassifier
Parameters:
predDist - the prediction distribution returned by TrainableClassifier.classify(FeatureVector, Set)
features - the feature vector to consider
targetClass - the expected class of this feature vector; must be contained in the set of candidateClasses
candidateClasses - a set of classes that are allowed for this item (the actual targetClass must be one of them)
context - can be used to transport implementation-specific contextual information between the TrainableClassifier.doClassify(FeatureVector, Set, ContextMap), TrainableClassifier.doTrain(FeatureVector, String, ContextMap), and TrainableClassifier.trainOnErrorHook(PredictionDistribution, FeatureVector, String, Set, ContextMap) methods
Returns:
this implementation always returns false; subclasses can return true to signal that any error-driven learning was already handled
Throws:
ProcessingException - if an error occurs during training
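
The hook mechanism described above follows the template-method pattern and can be sketched as below. This is a strongly simplified illustration under stated assumptions: HookSketch and its members are hypothetical, strings stand in for feature vectors and prediction distributions, and the fallback behaviour (train only when the hook returned false and the prediction was wrong) is an assumption about error-driven learning, not something confirmed by this page.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not part of the de.fu_berlin.ties API. */
abstract class HookSketch {

    /** Records the training calls so that the flow can be observed. */
    final List<String> trained = new ArrayList<>();

    /** Stand-in for the real classification step. */
    abstract String classify(String item);

    /** Placeholder hook: returns false, i.e. nothing was handled here. */
    boolean trainOnErrorHook(String predicted, String item, String targetClass) {
        return false;
    }

    /** Stand-in for the real training step. */
    void doTrain(String item, String targetClass) {
        trained.add(item + "->" + targetClass);
    }

    /** Classifies first, then lets the hook decide whether to take over. */
    final void trainOnError(String item, String targetClass) {
        String predicted = classify(item);
        boolean handled = trainOnErrorHook(predicted, item, targetClass);
        // Assumed fallback: train only on a misclassification.
        if (!handled && !predicted.equals(targetClass)) {
            doTrain(item, targetClass);
        }
    }
}
```

A subclass that overrides trainOnErrorHook and returns true suppresses the fallback training entirely, so it has to perform any training it needs inside the hook.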

reset

public void reset()
           throws ProcessingException
Resets the classifier, completely deleting the prediction model.

Specified by:
reset in class TrainableClassifier
Throws:
ProcessingException - if an error occurs during reset


Copyright © 2003-2007 Christian Siefkes. All Rights Reserved.