java.lang.Object
de.fu_berlin.ties.classify.TrainableClassifier
de.fu_berlin.ties.classify.winnow.Winnow
public class Winnow extends TrainableClassifier
Classifier implementing the Winnow algorithm (Nick Littlestone). Winnow supports only error-driven training, so you always have to use the TrainableClassifier.trainOnError(FeatureVector, String, Set) method. Trying to call the TrainableClassifier.train(FeatureVector, String) method instead will result in an UnsupportedOperationException.
Instances of this class are thread-safe.
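The error-driven regime described above can be sketched without the TIES classes. The following stand-alone example is an illustration under stated assumptions, not this class's implementation: the class name ErrorDrivenWinnowSketch, the two-class setup, the factors 1.5 and 0.5, the initial weight 1.0 and the threshold (the number of active features) are all chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-alone sketch of error-driven Winnow; not part of the TIES API. */
final class ErrorDrivenWinnowSketch {
    private final Map<String, Float> weights = new HashMap<>();
    private final float promotion = 1.5f; // assumed value; must be > 1.0
    private final float demotion = 0.5f;  // assumed value; must be < 1.0

    /** Sums the weights of all active features; unknown features weigh 1.0 (assumed). */
    float score(Set<String> features) {
        float sum = 0f;
        for (String f : features) {
            sum += weights.getOrDefault(f, 1.0f);
        }
        return sum;
    }

    /** Predicts the positive class if the score exceeds the number of active features. */
    boolean predict(Set<String> features) {
        return score(features) > features.size();
    }

    /** Changes weights only after a wrong prediction; returns true if an update happened. */
    boolean trainOnError(Set<String> features, boolean target) {
        if (predict(features) == target) {
            return false; // correct prediction: the model stays untouched
        }
        float factor = target ? promotion : demotion;
        for (String f : features) {
            // An unseen feature starts at 1.0, so its first update yields the factor itself.
            weights.merge(f, factor, (old, fac) -> old * fac);
        }
        return true;
    }
}
```

Calling trainOnError on an instance that is already classified correctly leaves every weight unchanged, which is the defining property of error-driven training.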
Field Summary |
---|
Fields inherited from class de.fu_berlin.ties.classify.TrainableClassifier |
---|
ELEMENT_MAIN, META_CLASSIFIER, MULTI_CLASSIFIER, OAR_CLASSIFIER, TIE_CLASSIFIER |
Fields inherited from interface de.fu_berlin.ties.classify.Classifier |
---|
CONFIG_CLASSIFIER |
Constructor Summary | | |
---|---|---|
public | Winnow(Element element) | Creates a new instance from an XML element, fulfilling the recommendation of the XMLStorable interface. |
public | Winnow(Set<String> allValidClasses) | Creates a new instance based on the standard configuration. |
public | Winnow(Set<String> allValidClasses, FeatureTransformer trans, boolean balance, float promotionFactor, float demotionFactor, float thresholdThick, int ignoreExp, TiesConfiguration config, String configSuffix) | Creates a new instance. |
public | Winnow(Set<String> allValidClasses, FeatureTransformer trans, TiesConfiguration config) | Creates a new instance based on the provided configuration. |
protected | Winnow(Set<String> allValidClasses, FeatureTransformer trans, TiesConfiguration config, String configSuffix) | Creates a new instance based on the provided configuration. |
protected | Winnow(Set<String> allValidClasses, String configSuffix) | Creates a new instance based on the standard configuration. |
public | Winnow(Set<String> allValidClasses, TiesConfiguration config) | Creates a new instance based on the provided configuration. |
protected | Winnow(Set<String> allValidClasses, TiesConfiguration config, String configSuffix) | Creates a new instance based on the provided configuration. |
Method Summary | | |
---|---|---|
protected void | adjustWeights(Feature feature, short[] directions) | Adjusts the weights of a feature for all classes. |
protected boolean | checkRelevance(float[] weights) | Checks whether a feature is relevant for classification. |
protected void | chooseClassesToAdjust(WinnowDistribution winnowDist, String targetClass, Set<String> classesToPromote, Set<String> classesToDemote) | Chooses the classes to promote and the classes to demote. |
protected double | confidence(float normalized, float sum) | Converts a normalized activation value into a confidence estimate. |
protected float | defaultWeight() | Returns the default weight to use if a feature is unknown. |
void | destroy() | Destroys the classifier. |
protected PredictionDistribution | doClassify(FeatureVector features, Set candidateClasses, ContextMap context) | Classifies an item that is represented by a feature vector by choosing the most probable class among a set of candidate classes. |
protected void | doTrain(FeatureVector features, String targetClass, ContextMap context) | Winnow supports only error-driven training, so you always have to use the TrainableClassifier.trainOnError(FeatureVector, String, Set) method instead of this one. |
protected FeatureSet | featureSet(FeatureVector fv) | Converts a feature vector into a FeatureSet (a multi-set of features). |
float | getDemotion() | Returns the demotion factor used by the algorithm. |
float | getPromotion() | Returns the promotion factor used by the algorithm. |
float | getThresholdThickness() | Returns the thickness of the threshold if the "thick threshold" heuristic is used. |
protected float[] | initScores() | Initializes the scores (activation values) to use for all classes. |
protected float | initWeight() | Returns the initial weight to use for each feature per class. |
protected float[] | initWeightArray() | Returns the initial weight array to use for a feature for all classes. |
boolean | isBalanced() | Whether the Balanced Winnow or the standard Winnow algorithm is used. |
protected float | majorThreshold(float threshold, float rawThreshold) | Calculates the major threshold (theta+) to use for classification with the "thick threshold" heuristic. |
protected float | minorThreshold(float threshold, float rawThreshold) | Calculates the minor threshold (theta-) to use for classification with the "thick threshold" heuristic. |
protected float | normalizeScore(float score, float threshold, float rawThreshold) | Converts the raw score (activation value) to a normalized value depending on the threshold theta. |
protected float | rawThreshold(FeatureSet features) | Calculates the threshold (theta) to use for classification, based on the number of active features. |
void | reset() | Resets the classifier, completely deleting the prediction model. |
Map<String,List<Float>> | showFeatureWeights(FeatureVector features) | Returns a mapping from feature representations to weights. |
protected float | threshold(float rawThreshold) | Calculates the threshold (theta) to use for classification. |
ObjectElement | toElement() | Stores all relevant fields of this object in an XML element for serialization. An equivalent object can be created by calling ObjectElement.createObject(org.dom4j.Element, Class) on the created element. Subclasses of TrainableClassifier should extend this method and the corresponding constructor from Element to ensure (de)serialization works as expected. |
String | toString() | Returns a string representation of this object. |
protected boolean | trainOnErrorHook(PredictionDistribution predDist, FeatureVector features, String targetClass, Set candidateClasses, ContextMap context) | Hook implementing error-driven learning, promoting and demoting weights as required. |
protected void | updateScores(Feature feature, float[] scores) | Updates the scores (activation values) for all classes by adding the weights of a feature. |
Methods inherited from class de.fu_berlin.ties.classify.TrainableClassifier |
---|
classify, createClassifier, createClassifier, createClassifier, createClassifier, createClassifier, doTrainOnError, getAllClasses, getConfig, shouldTrain, train, trainOnError |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
public Winnow(Element element) throws InstantiationException
Creates a new instance from an XML element, fulfilling the recommendation of the XMLStorable interface.
Parameters:
element - the XML element containing the serialized representation
Throws:
InstantiationException - if the given element does not contain a valid classifier description

public Winnow(Set<String> allValidClasses) throws IllegalArgumentException, ProcessingException
Creates a new instance based on the standard configuration.
Parameters:
allValidClasses - the set of all valid classes
Throws:
IllegalArgumentException - if one of the parameters is outside the allowed range
ProcessingException - if an error occurred while creating the feature transformer(s)

protected Winnow(Set<String> allValidClasses, String configSuffix) throws IllegalArgumentException, ProcessingException
Creates a new instance based on the standard configuration.
Parameters:
allValidClasses - the set of all valid classes
configSuffix - optional suffix appended to the configuration keys when configuring this instance; may be null
Throws:
IllegalArgumentException - if one of the parameters is outside the allowed range
ProcessingException - if an error occurred while creating the feature transformer(s)

public Winnow(Set<String> allValidClasses, TiesConfiguration config) throws IllegalArgumentException, ProcessingException
Creates a new instance based on the provided configuration.
Parameters:
allValidClasses - the set of all valid classes
config - contains configuration properties
Throws:
IllegalArgumentException - if one of the parameters is outside the allowed range
ProcessingException - if an error occurred while creating the feature transformer(s)

protected Winnow(Set<String> allValidClasses, TiesConfiguration config, String configSuffix) throws IllegalArgumentException, ProcessingException
Creates a new instance based on the provided configuration.
Parameters:
allValidClasses - the set of all valid classes
config - contains configuration properties
configSuffix - optional suffix appended to the configuration keys when configuring this instance; may be null
Throws:
IllegalArgumentException - if one of the parameters is outside the allowed range
ProcessingException - if an error occurred while creating the feature transformer(s)

public Winnow(Set<String> allValidClasses, FeatureTransformer trans, TiesConfiguration config) throws IllegalArgumentException, ProcessingException
Creates a new instance based on the provided configuration.
Parameters:
allValidClasses - the set of all valid classes
trans - the last transformer in the transformer chain to use, or null if no feature transformers should be used
config - contains configuration properties
Throws:
IllegalArgumentException - if one of the parameters is outside the allowed range
ProcessingException - if an error occurred while creating the feature transformer(s)

protected Winnow(Set<String> allValidClasses, FeatureTransformer trans, TiesConfiguration config, String configSuffix) throws IllegalArgumentException, ProcessingException
Creates a new instance based on the provided configuration.
Parameters:
allValidClasses - the set of all valid classes
trans - the last transformer in the transformer chain to use, or null if no feature transformers should be used
config - contains configuration properties
configSuffix - optional suffix appended to the configuration keys when configuring this instance; may be null
Throws:
IllegalArgumentException - if one of the parameters is outside the allowed range
ProcessingException - if an error occurred while creating the feature transformer(s)

public Winnow(Set<String> allValidClasses, FeatureTransformer trans, boolean balance, float promotionFactor, float demotionFactor, float thresholdThick, int ignoreExp, TiesConfiguration config, String configSuffix) throws IllegalArgumentException
Creates a new instance.
Parameters:
allValidClasses - the set of all valid classes
trans - the last transformer in the transformer chain to use, or null if no feature transformers should be used
balance - whether to use the Balanced Winnow or the standard Winnow algorithm
promotionFactor - the promotion factor used by the algorithm; must be > 1.0
demotionFactor - the demotion factor used by the algorithm; must be < 1.0
thresholdThick - the thickness of the threshold if the "thick threshold" heuristic is used (must be < 1.0), 0.0 otherwise
ignoreExp - exponent used to calculate which features to consider irrelevant for classification (if any)
config - contains configuration properties
configSuffix - optional suffix appended to the configuration keys when configuring this instance; may be null
Throws:
IllegalArgumentException - if one of the parameters is outside the allowed range

Method Detail |
---|
protected void adjustWeights(Feature feature, short[] directions)
Adjusts the weights of a feature for all classes.
Parameters:
feature - the feature to process
directions - an array specifying for each class (in alphabetic order) whether it should be promoted (positive value), demoted (negative value) or left unmodified (0)

protected void chooseClassesToAdjust(WinnowDistribution winnowDist, String targetClass, Set<String> classesToPromote, Set<String> classesToDemote)
Chooses the classes to promote and the classes to demote. The targetClass is chosen for promotion if its score is less than or equal to the threshold. All other classes are chosen for demotion if their score is greater than the threshold.
Parameters:
winnowDist - the prediction distribution returned by TrainableClassifier.classify(FeatureVector, Set)
targetClass - the expected class of this instance; must be contained in the set of candidateClasses
classesToPromote - the classes to promote are added to this set
classesToDemote - the classes to demote are added to this set

protected double confidence(float normalized, float sum)
Converts a normalized activation value into a confidence estimate.
Parameters:
normalized - the normalized activation value to convert
sum - the sum of all normalized activation values
Returns:
the confidence estimate, normalized / sum
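The division described for confidence can be tried out in isolation. The sketch below is a stand-alone illustration, not the TIES code: the class name ConfidenceSketch and the helper confidences (which applies the division to every class at once) are assumptions made for the example.

```java
/** Hypothetical stand-alone sketch; not the TIES implementation. */
final class ConfidenceSketch {
    /** Confidence of one class: its normalized activation divided by the sum over all classes. */
    static double confidence(float normalized, float sum) {
        return (double) normalized / sum;
    }

    /** Assumed helper: computes the confidence of every class from the normalized activations. */
    static double[] confidences(float[] normalizedValues) {
        float sum = 0f;
        for (float v : normalizedValues) {
            sum += v;
        }
        double[] result = new double[normalizedValues.length];
        for (int i = 0; i < result.length; i++) {
            result[i] = confidence(normalizedValues[i], sum);
        }
        return result;
    }
}
```

Because each value is divided by the same sum, the resulting confidences add up to 1 whenever the sum is positive.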
protected boolean checkRelevance(float[] weights)
Checks whether a feature is relevant for classification.
Parameters:
weights - the weights of the feature
Returns:
true iff the feature is relevant for classification

protected float defaultWeight()
Returns the default weight to use if a feature is unknown. A separate value applies in case of Balanced Winnow (where positive and negative weights should cancel each other out); initWeight() is used otherwise.

public void destroy()
Destroys the classifier. The default behaviour is that of TrainableClassifier.reset(), but subclasses can override it if appropriate.
destroy in interface Classifier
destroy in class TrainableClassifier
protected PredictionDistribution doClassify(FeatureVector features, Set candidateClasses, ContextMap context)
Classifies an item that is represented by a feature vector by choosing the most probable class among a set of candidate classes.
doClassify in class TrainableClassifier
Parameters:
features - the feature vector to consider
candidateClasses - a set of classes that are allowed for this item
context - can be used to transport implementation-specific contextual information between the TrainableClassifier.doClassify(FeatureVector, Set, ContextMap), TrainableClassifier.doTrain(FeatureVector, String, ContextMap), and TrainableClassifier.trainOnErrorHook(PredictionDistribution, FeatureVector, String, Set, ContextMap) methods
Returns:
a prediction distribution; call PredictionDistribution.best() to get the most probable class

protected void doTrain(FeatureVector features, String targetClass, ContextMap context) throws UnsupportedOperationException
Winnow supports only error-driven training, so you always have to use the TrainableClassifier.trainOnError(FeatureVector, String, Set) method instead of this one. Calling this method results in an UnsupportedOperationException.
doTrain in class TrainableClassifier
Parameters:
features - ignored by this method
targetClass - ignored by this method
context - ignored by this method
Throws:
UnsupportedOperationException - always thrown by this method; use TrainableClassifier.trainOnError(FeatureVector, String, Set) instead

protected FeatureSet featureSet(FeatureVector fv)
Converts a feature vector into a FeatureSet (a multi-set of features). If the last transformation of the provided vector already is a FeatureSet instance, it is cast and returned. Otherwise a new FeatureSet with the same contents is created, reading the method used for considering feature frequencies in strength values from the "classifier.winnow.strength.frequency" configuration key.
Parameters:
fv - the feature vector to convert
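public float getDemotion()
Returns the demotion factor used by the algorithm.

public float getPromotion()
Returns the promotion factor used by the algorithm.

public boolean isBalanced()
Whether the Balanced Winnow or the standard Winnow algorithm is used.

protected float[] initScores()
Initializes the scores (activation values) to use for all classes.

public float getThresholdThickness()
Returns the thickness of the threshold if the "thick threshold" heuristic is used.

protected float initWeight()
Returns the initial weight to use for each feature per class.

protected float[] initWeightArray()
Returns the initial weight array to use for a feature for all classes, including the case of Balanced Winnow. Each element is initialized to initWeight().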
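protected float majorThreshold(float threshold, float rawThreshold)
Calculates the major threshold (theta+) to use for classification with the "thick threshold" heuristic.
Parameters:
threshold - the threshold theta
rawThreshold - the raw threshold theta_r
See Also:
minorThreshold(float, float)

protected float minorThreshold(float threshold, float rawThreshold)
Calculates the minor threshold (theta-) to use for classification with the "thick threshold" heuristic.
Parameters:
threshold - the threshold theta
rawThreshold - the raw threshold theta_r
See Also:
majorThreshold(float, float)

protected float normalizeScore(float score, float threshold, float rawThreshold)
Converts the raw score (activation value) to a normalized value depending on the threshold theta:
norm(score, theta, theta_r) = e^((score - theta) / theta_r)
Parameters:
score - the raw score (activation value); must be a positive value in case of normal (non-balanced) Winnow
threshold - the threshold theta used for this instance
rawThreshold - the raw threshold theta_r used for this instance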
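The normalization formula given for normalizeScore translates directly into code. This is a minimal stand-alone sketch of that formula only; the class name NormalizeScoreSketch is an assumption, and the real method may add checks that are not documented here.

```java
/** Hypothetical stand-alone sketch of the documented formula; not the TIES implementation. */
final class NormalizeScoreSketch {
    /** Computes e^((score - threshold) / rawThreshold), as stated in the description above. */
    static float normalizeScore(float score, float threshold, float rawThreshold) {
        return (float) Math.exp((score - threshold) / rawThreshold);
    }
}
```

Under this formula a score equal to the threshold normalizes to 1, scores above the threshold give values greater than 1, and scores below it give values between 0 and 1.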
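protected float rawThreshold(FeatureSet features)
Calculates the threshold (theta) to use for classification, based on the number of active features.
Parameters:
features - the feature set to consider

public void reset()
Resets the classifier, completely deleting the prediction model.
reset in class TrainableClassifier

public Map<String,List<Float>> showFeatureWeights(FeatureVector features)
Returns a mapping from feature representations to weights, relating to the classes given by TrainableClassifier.getAllClasses(). This method exists for debugging and demonstration purposes.
Parameters:
features - the feature vector to consider

protected float threshold(float rawThreshold)
Calculates the threshold (theta) to use for classification. The result is the rawThreshold multiplied by the default weight. Subclasses can override this method to calculate the threshold in a different way.
Parameters:
rawThreshold - the raw threshold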
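The relation between the raw threshold and the threshold can be sketched as follows. This is an illustration under assumptions: the class name ThresholdSketch is hypothetical, the raw threshold is taken to be simply the number of active features (the documentation only says it is based on that number), and the default weight is passed as a parameter instead of being read from defaultWeight().

```java
/** Hypothetical stand-alone sketch; not the TIES implementation. */
final class ThresholdSketch {
    /** Assumed raw threshold: the number of active features. */
    static float rawThreshold(int activeFeatures) {
        return activeFeatures;
    }

    /** Threshold as described above: the raw threshold multiplied by the default weight. */
    static float threshold(float rawThreshold, float defaultWeight) {
        return rawThreshold * defaultWeight;
    }
}
```

With these assumptions, an item with four active features and a default weight of 1.0 gets a threshold of 4.0, which is exactly the score it would reach if none of its features had been trained yet.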
protected boolean trainOnErrorHook(PredictionDistribution predDist, FeatureVector features, String targetClass, Set candidateClasses, ContextMap context) throws ProcessingException
Hook implementing error-driven learning, promoting and demoting weights as required.
trainOnErrorHook in class TrainableClassifier
Parameters:
predDist - the prediction distribution returned by TrainableClassifier.classify(FeatureVector, Set); must be a WinnowDistribution
features - the feature vector to consider
targetClass - the expected class of this feature vector; must be contained in the set of candidateClasses
candidateClasses - a set of classes that are allowed for this item (the actual targetClass must be one of them)
context - ignored by this implementation
Returns:
true to signal that any error-driven learning was already handled
Throws:
ProcessingException - if an error occurs during training

public ObjectElement toElement()
Stores all relevant fields of this object in an XML element for serialization. An equivalent object can be created by calling ObjectElement.createObject(org.dom4j.Element, Class) on the created element. Subclasses of TrainableClassifier should extend this method and the corresponding constructor from Element to ensure (de)serialization works as expected.
toElement in interface XMLStorable
toElement in class TrainableClassifier

public String toString()
Returns a string representation of this object.
toString in class TrainableClassifier

protected void updateScores(Feature feature, float[] scores)
Updates the scores (activation values) for all classes by adding the weights of a feature.
Parameters:
feature - the feature to process
scores - an array of floats containing the scores for each class; will be updated by this method
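The in-place accumulation that updateScores describes can be illustrated with plain arrays. The sketch below is a stand-alone assumption, not the TIES code: the class name UpdateScoresSketch is hypothetical, and the feature is represented directly by an array holding one weight per class instead of a Feature object.

```java
/** Hypothetical stand-alone sketch; not the TIES implementation. */
final class UpdateScoresSketch {
    /** Adds the per-class weights of one feature to the per-class scores, modifying scores in place. */
    static void updateScores(float[] featureWeights, float[] scores) {
        if (featureWeights.length != scores.length) {
            throw new IllegalArgumentException("expected one weight per class");
        }
        for (int i = 0; i < scores.length; i++) {
            scores[i] += featureWeights[i];
        }
    }
}
```

Calling the method once per active feature leaves each entry of scores holding the sum of that class's weights over all active features.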