java.lang.Object
  de.fu_berlin.ties.classify.TrainableClassifier
    de.fu_berlin.ties.classify.winnow.Winnow
public class Winnow
Classifier implementing the Winnow algorithm (Nick Littlestone). Winnow supports only error-driven training, so you always have to use the TrainableClassifier.trainOnError(FeatureVector, String, Set) method. Calling the TrainableClassifier.train(FeatureVector, String) method instead results in an UnsupportedOperationException.

Instances of this class are thread-safe.
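
To illustrate what error-driven training means here, the following is a minimal standalone sketch of a binary Winnow learner. It is not the TIES implementation: the class name WinnowSketch, the initial weight of 1.0, and the threshold equal to the number of active features are assumptions chosen for illustration. Weights change only when the current prediction is wrong.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical binary Winnow sketch; not the TIES implementation. */
class WinnowSketch {
    private final Map<String, Float> weights = new HashMap<>();
    private final float promotion; // multiplies weights after a missed positive
    private final float demotion;  // multiplies weights after a false positive

    WinnowSketch(float promotion, float demotion) {
        if (promotion <= 1.0f || demotion >= 1.0f) {
            throw new IllegalArgumentException("promotion must be > 1.0, demotion < 1.0");
        }
        this.promotion = promotion;
        this.demotion = demotion;
    }

    /** Sums the weights of all active features (assumed initial weight: 1.0). */
    float score(String[] activeFeatures) {
        float sum = 0f;
        for (String f : activeFeatures) {
            sum += weights.getOrDefault(f, 1.0f);
        }
        return sum;
    }

    /** Predicts positive if the score exceeds the assumed threshold (feature count). */
    boolean predict(String[] activeFeatures) {
        return score(activeFeatures) > activeFeatures.length;
    }

    /** Updates weights only on a misclassification; returns whether an update happened. */
    boolean trainOnError(String[] activeFeatures, boolean target) {
        if (predict(activeFeatures) == target) {
            return false; // prediction was correct: leave the model untouched
        }
        float factor = target ? promotion : demotion;
        for (String f : activeFeatures) {
            weights.put(f, weights.getOrDefault(f, 1.0f) * factor);
        }
        return true;
    }
}
```

Calling trainOnError repeatedly with the same correctly classified item leaves the weights unchanged, which is why a plain train method without a prior prediction has no meaningful behavior for this kind of learner.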
Field Summary |
---|
Fields inherited from class de.fu_berlin.ties.classify.TrainableClassifier |
---|
META_CLASSIFIER, MULTI_CLASSIFIER, OAR_CLASSIFIER |
Fields inherited from interface de.fu_berlin.ties.classify.Classifier |
---|
CONFIG_CLASSIFIER |
Constructor Summary | |
---|---|
|
Winnow(Set<String> allValidClasses)
Creates a new instance based on the standard configuration. |
|
Winnow(Set<String> allValidClasses,
FeatureTransformer trans,
boolean balance,
float promotionFactor,
float demotionFactor,
float thresholdThick,
TiesConfiguration config,
String configSuffix)
Creates a new instance. |
|
Winnow(Set<String> allValidClasses,
FeatureTransformer trans,
TiesConfiguration config)
Creates a new instance based on the provided configuration. |
protected |
Winnow(Set<String> allValidClasses,
FeatureTransformer trans,
TiesConfiguration config,
String configSuffix)
Creates a new instance based on the provided configuration. |
protected |
Winnow(Set<String> allValidClasses,
String configSuffix)
Creates a new instance based on the standard configuration. |
|
Winnow(Set<String> allValidClasses,
TiesConfiguration config)
Creates a new instance based on the provided configuration. |
protected |
Winnow(Set<String> allValidClasses,
TiesConfiguration config,
String configSuffix)
Creates a new instance based on the provided configuration. |
Method Summary | |
---|---|
protected void |
adjustWeights(Feature feature,
short[] directions)
Adjusts the weights of a feature for all classes. |
protected void |
chooseClassesToAdjust(WinnowDistribution winnowDist,
String targetClass,
Set<String> classesToPromote,
Set<String> classesToDemote)
Chooses the classes to promote and the classes to demote. |
protected double |
confidence(float sigmoid,
float sum)
Converts a sigmoid activation value into a confidence estimate. |
protected float |
defaultWeight()
Returns the default weight to use if a feature is unknown. |
protected PredictionDistribution |
doClassify(FeatureVector features,
Set candidateClasses,
ContextMap context)
Classifies an item that is represented by a feature vector by choosing the most probable class among a set of candidate classes. |
protected void |
doTrain(FeatureVector features,
String targetClass,
ContextMap context)
Winnow supports only error-driven training, so you always have to use the TrainableClassifier.trainOnError(FeatureVector, String, Set) method
instead of this one. |
protected FeatureSet |
featureSet(FeatureVector fv)
Converts a feature vector into a FeatureSet (a multi-set of
features). |
float |
getDemotion()
Returns the demotion factor used by the algorithm. |
float |
getPromotion()
Returns the promotion factor used by the algorithm. |
float |
getThresholdThickness()
Returns the thickness of the threshold if the "thick threshold" heuristic is used. |
protected float[] |
initScores()
Initializes the score (activation values) to use for all classes. |
protected float |
initWeight()
Returns the initial weight to use for each feature per class. |
protected float[] |
initWeightArray()
Returns the initial weight array to use for a feature for all classes. |
boolean |
isBalanced()
Whether the Balanced Winnow or the standard Winnow algorithm is used. |
protected float |
majorThreshold(float threshold,
float rawThreshold)
Calculates the major threshold (theta+) to use for classification with the "thick threshold" heuristic. |
protected float |
minorThreshold(float threshold,
float rawThreshold)
Calculates the minor threshold (theta-) to use for classification with the "thick threshold" heuristic. |
protected float |
rawThreshold(FeatureSet features)
Calculates the threshold (theta) to use for classification, based on the number of active features. |
void |
reset()
Resets the classifier, completely deleting the prediction model. |
protected float |
sigmoid(float score,
float threshold,
float rawThreshold)
Converts the raw score (activation value) to a value in the range from 0 to 1 via a sigmoid function depending on the threshold theta. |
protected float |
threshold(float rawThreshold)
Calculates the threshold (theta) to use for classification. |
String |
toString()
Returns a string representation of this object. |
protected boolean |
trainOnErrorHook(PredictionDistribution predDist,
FeatureVector features,
String targetClass,
Set candidateClasses,
ContextMap context)
Hook implementing error-driven learning, promoting and demoting weights as required. |
protected void |
updateScores(Feature feature,
double strength,
float[] scores)
Updates the score (activation values) for all classes by adding the weights of a feature. |
Methods inherited from class de.fu_berlin.ties.classify.TrainableClassifier |
---|
classify, createClassifier, createClassifier, createClassifier, createClassifier, getAllClasses, getConfig, shouldTrain, toElement, train, trainOnError |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
public Winnow(Set<String> allValidClasses) throws IllegalArgumentException, ProcessingException
Creates a new instance based on the standard configuration.
Parameters:
allValidClasses - the set of all valid classes
Throws:
IllegalArgumentException - if one of the parameters is outside the allowed range
ProcessingException - if an error occurred while creating the feature transformer(s)

protected Winnow(Set<String> allValidClasses, String configSuffix) throws IllegalArgumentException, ProcessingException
Creates a new instance based on the standard configuration.
Parameters:
allValidClasses - the set of all valid classes
configSuffix - optional suffix appended to the configuration keys when configuring this instance; may be null
Throws:
IllegalArgumentException - if one of the parameters is outside the allowed range
ProcessingException - if an error occurred while creating the feature transformer(s)

public Winnow(Set<String> allValidClasses, TiesConfiguration config) throws IllegalArgumentException, ProcessingException
Creates a new instance based on the provided configuration.
Parameters:
allValidClasses - the set of all valid classes
config - contains configuration properties
Throws:
IllegalArgumentException - if one of the parameters is outside the allowed range
ProcessingException - if an error occurred while creating the feature transformer(s)

protected Winnow(Set<String> allValidClasses, TiesConfiguration config, String configSuffix) throws IllegalArgumentException, ProcessingException
Creates a new instance based on the provided configuration.
Parameters:
allValidClasses - the set of all valid classes
config - contains configuration properties
configSuffix - optional suffix appended to the configuration keys when configuring this instance; may be null
Throws:
IllegalArgumentException - if one of the parameters is outside the allowed range
ProcessingException - if an error occurred while creating the feature transformer(s)

public Winnow(Set<String> allValidClasses, FeatureTransformer trans, TiesConfiguration config) throws IllegalArgumentException, ProcessingException
Creates a new instance based on the provided configuration.
Parameters:
allValidClasses - the set of all valid classes
trans - the last transformer in the transformer chain to use, or null if no feature transformers should be used
config - contains configuration properties
Throws:
IllegalArgumentException - if one of the parameters is outside the allowed range
ProcessingException - if an error occurred while creating the feature transformer(s)

protected Winnow(Set<String> allValidClasses, FeatureTransformer trans, TiesConfiguration config, String configSuffix) throws IllegalArgumentException, ProcessingException
Creates a new instance based on the provided configuration.
Parameters:
allValidClasses - the set of all valid classes
trans - the last transformer in the transformer chain to use, or null if no feature transformers should be used
config - contains configuration properties
configSuffix - optional suffix appended to the configuration keys when configuring this instance; may be null
Throws:
IllegalArgumentException - if one of the parameters is outside the allowed range
ProcessingException - if an error occurred while creating the feature transformer(s)

public Winnow(Set<String> allValidClasses, FeatureTransformer trans, boolean balance, float promotionFactor, float demotionFactor, float thresholdThick, TiesConfiguration config, String configSuffix) throws IllegalArgumentException
Creates a new instance.
Parameters:
allValidClasses - the set of all valid classes
trans - the last transformer in the transformer chain to use, or null if no feature transformers should be used
balance - whether to use the Balanced Winnow or the standard Winnow algorithm
promotionFactor - the promotion factor used by the algorithm; must be > 1.0
demotionFactor - the demotion factor used by the algorithm; must be < 1.0
thresholdThick - the thickness of the threshold if the "thick threshold" heuristic is used (must be < 1.0), 0.0 otherwise
config - contains configuration properties
configSuffix - optional suffix appended to the configuration keys when configuring this instance; may be null
Throws:
IllegalArgumentException - if one of the parameters is outside the allowed range

Method Detail |
---|
protected void adjustWeights(Feature feature, short[] directions)
Adjusts the weights of a feature for all classes.
Parameters:
feature - the feature to process
directions - an array specifying for each class (in alphabetic order) whether it should be promoted (positive value), demoted (negative value) or left unmodified (0)

protected void chooseClassesToAdjust(WinnowDistribution winnowDist, String targetClass, Set<String> classesToPromote, Set<String> classesToDemote)
Chooses the classes to promote and the classes to demote. This method chooses the targetClass for promotion if its score is less than or equal to the threshold. It chooses all other classes for demotion if their score is greater than the threshold.
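
The selection rule described above can be sketched as follows. This is a hedged illustration only: the class ChooseClassesSketch, the use of a plain Map of per-class scores, and the single threshold argument are assumptions; the actual method reads its scores and thresholds from a WinnowDistribution.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the promote/demote selection rule; not the TIES code. */
class ChooseClassesSketch {
    static void chooseClassesToAdjust(Map<String, Float> scores, float threshold,
            String targetClass, Set<String> classesToPromote, Set<String> classesToDemote) {
        for (Map.Entry<String, Float> entry : scores.entrySet()) {
            boolean isTarget = entry.getKey().equals(targetClass);
            if (isTarget && entry.getValue() <= threshold) {
                // the expected class scored too low: promote it
                classesToPromote.add(entry.getKey());
            } else if (!isTarget && entry.getValue() > threshold) {
                // a wrong class scored too high: demote it
                classesToDemote.add(entry.getKey());
            }
        }
    }
}
```

Classes that are already on the correct side of the threshold end up in neither set, so their weights stay as they are.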
Parameters:
winnowDist - the prediction distribution returned by TrainableClassifier.classify(FeatureVector, Set)
targetClass - the expected class of this instance; must be contained in the set of candidateClasses
classesToPromote - the classes to promote are added to this set
classesToDemote - the classes to demote are added to this set

protected double confidence(float sigmoid, float sum)
Converts a sigmoid activation value into a confidence estimate.
Parameters:
sigmoid - the sigmoid activation value to convert
sum - the sum of all sigmoid activation values
Returns:
the confidence estimate, sigmoid / sum
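
As a small worked example of the sigmoid / sum normalization, the sketch below converts a set of per-class sigmoid values into confidence estimates that add up to 1. The class ConfidenceSketch and the helper confidences are hypothetical names introduced for illustration.

```java
/** Hypothetical sketch of the sigmoid / sum normalization; not the TIES code. */
class ConfidenceSketch {
    /** Divides one sigmoid activation value by the sum of all of them. */
    static double confidence(float sigmoid, float sum) {
        return (double) sigmoid / sum;
    }

    /** Normalizes all sigmoid values so that the resulting confidences sum to 1. */
    static double[] confidences(float[] sigmoids) {
        float sum = 0f;
        for (float s : sigmoids) {
            sum += s;
        }
        double[] result = new double[sigmoids.length];
        for (int i = 0; i < sigmoids.length; i++) {
            result[i] = confidence(sigmoids[i], sum);
        }
        return result;
    }
}
```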
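protected float defaultWeight()
Returns the default weight to use if a feature is unknown.
Returns:
the default weight: a separate value for Balanced Winnow (where positive and negative weights should cancel each other out), initWeight() otherwise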

protected PredictionDistribution doClassify(FeatureVector features, Set candidateClasses, ContextMap context)
Classifies an item that is represented by a feature vector by choosing the most probable class among a set of candidate classes.
doClassify in class TrainableClassifier
Parameters:
features - the feature vector to consider
candidateClasses - a set of classes that are allowed for this item
context - can be used to transport implementation-specific contextual information between the TrainableClassifier.doClassify(FeatureVector, Set, ContextMap), TrainableClassifier.doTrain(FeatureVector, String, ContextMap), and TrainableClassifier.trainOnErrorHook(PredictionDistribution, FeatureVector, String, Set, ContextMap) methods
Returns:
the prediction distribution; call PredictionDistribution.best() to get the most probable class

protected void doTrain(FeatureVector features, String targetClass, ContextMap context) throws UnsupportedOperationException
Winnow supports only error-driven training, so you always have to use the TrainableClassifier.trainOnError(FeatureVector, String, Set) method instead of this one. Calling this method results in an UnsupportedOperationException.
doTrain in class TrainableClassifier
Parameters:
features - ignored by this method
targetClass - ignored by this method
context - ignored by this method
Throws:
UnsupportedOperationException - always thrown by this method; use TrainableClassifier.trainOnError(FeatureVector, String, Set) instead

protected FeatureSet featureSet(FeatureVector fv)
Converts a feature vector into a FeatureSet (a multi-set of features). If the provided vector already is a FeatureSet instance, it is cast and returned. Otherwise a new FeatureSet with the same contents is created, reading the method used for considering feature frequencies in strength values from the "classifier.winnow.strength.frequency" configuration key.
Parameters:
fv - the feature vector to convert
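
public float getDemotion()
Returns the demotion factor used by the algorithm.

public float getPromotion()
Returns the promotion factor used by the algorithm.

public boolean isBalanced()
Whether the Balanced Winnow or the standard Winnow algorithm is used.

protected float[] initScores()
Initializes the scores (activation values) to use for all classes.

public float getThresholdThickness()
Returns the thickness of the threshold if the "thick threshold" heuristic is used.

protected float initWeight()
Returns the initial weight to use for each feature per class.

protected float[] initWeightArray()
Returns the initial weight array to use for a feature for all classes. […] Balanced Winnow. Each element is initialized to initWeight().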
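
protected float majorThreshold(float threshold, float rawThreshold)
Calculates the major threshold (theta+) to use for classification with the "thick threshold" heuristic.
Parameters:
threshold - the threshold theta
rawThreshold - the raw threshold theta_r
See Also:
minorThreshold(float, float)

protected float minorThreshold(float threshold, float rawThreshold)
Calculates the minor threshold (theta-) to use for classification with the "thick threshold" heuristic.
Parameters:
threshold - the threshold theta
rawThreshold - the raw threshold theta_r
See Also:
majorThreshold(float, float)

protected float rawThreshold(FeatureSet features)
Calculates the threshold (theta) to use for classification, based on the number of active features.
Parameters:
features - the feature set to consider

public void reset()
Resets the classifier, completely deleting the prediction model.
reset in class TrainableClassifier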

protected float sigmoid(float score, float threshold, float rawThreshold) throws IllegalArgumentException
Converts the raw score (activation value) to a value in the range from 0 to 1 via a sigmoid function depending on the threshold theta.
Parameters:
score - the raw score (activation value); must be a positive value in case of normal (non-balanced) Winnow
threshold - the threshold theta used for this instance
rawThreshold - the raw threshold theta_r used for this instance
Throws:
IllegalArgumentException - if normal Winnow is used and score <= 0

protected float threshold(float rawThreshold)
Calculates the threshold (theta) to use for classification. This implementation uses the rawThreshold multiplied by the default weight. Subclasses can override this method to calculate the threshold in a different way.
Parameters:
rawThreshold - the raw threshold

protected boolean trainOnErrorHook(PredictionDistribution predDist, FeatureVector features, String targetClass, Set candidateClasses, ContextMap context) throws ProcessingException
Hook implementing error-driven learning, promoting and demoting weights as required.
trainOnErrorHook in class TrainableClassifier
Parameters:
predDist - the prediction distribution returned by TrainableClassifier.classify(FeatureVector, Set); must be a WinnowDistribution
features - the feature vector to consider
targetClass - the expected class of this feature vector; must be contained in the set of candidateClasses
candidateClasses - a set of classes that are allowed for this item (the actual targetClass must be one of them)
context - ignored by this implementation
Returns:
true to signal that any error-driven learning was already handled
Throws:
ProcessingException - if an error occurs during training

public String toString()
Returns a string representation of this object.
toString in class TrainableClassifier

protected void updateScores(Feature feature, double strength, float[] scores)
Updates the scores (activation values) for all classes by adding the weights of a feature.
Parameters:
feature - the feature to process
strength - the strength of this feature
scores - an array of floats containing the scores for each class; will be updated by this method
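
The score update can be pictured with the following minimal sketch. It assumes that each feature carries one weight per class and that the weight is scaled by the feature's strength before it is added; the scaling by strength and the names UpdateScoresSketch and featureWeights are assumptions for illustration, not a statement about the actual implementation.

```java
/** Hypothetical sketch of a per-class score update; not the TIES code. */
class UpdateScoresSketch {
    /**
     * Adds the (strength-scaled) weight of one feature to the score of each class.
     * featureWeights[i] is the assumed weight of the feature for class i.
     */
    static void updateScores(float[] featureWeights, double strength, float[] scores) {
        for (int i = 0; i < scores.length; i++) {
            scores[i] += (float) (featureWeights[i] * strength);
        }
    }
}
```

The scores array is modified in place, matching the contract stated for the scores parameter above.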