java.lang.Object
  de.fu_berlin.ties.classify.TrainableClassifier
      de.fu_berlin.ties.classify.winnow.Winnow
Classifier implementing the Winnow algorithm (Nick Littlestone). Winnow supports only error-driven training, so you always have to use the TrainableClassifier.trainOnError(FeatureVector, String, Set) method. Trying to call the TrainableClassifier.train(FeatureVector, String) method instead will result in an UnsupportedOperationException.
Instances of this class are thread-safe.
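The error-driven contract described above can be illustrated with a minimal, self-contained sketch. It does not use the TIES classes: the class name ErrorDrivenSketch, the string features, the fixed factors 1.5 and 0.5, and the threshold equal to the number of active features are all assumptions chosen for illustration, and the sketch makes no attempt at thread safety.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical two-class Winnow; a stand-in, not the TIES implementation. */
final class ErrorDrivenSketch {
    private final Map<String, Float> weights = new HashMap<>();
    private final float promotion = 1.5f; // assumed value; must be > 1.0
    private final float demotion = 0.5f;  // assumed value; must be < 1.0

    /** Plain training is rejected, mirroring the documented contract. */
    void train(Set<String> features, boolean target) {
        throw new UnsupportedOperationException("use trainOnError instead");
    }

    /** Predicts true if the summed weights exceed the number of active features. */
    boolean classify(Set<String> features) {
        float score = 0f;
        for (String f : features) {
            score += weights.getOrDefault(f, 1.0f); // unknown features weigh 1.0
        }
        return score > features.size();
    }

    /** Updates the weights only after a wrong prediction; returns whether it did. */
    boolean trainOnError(Set<String> features, boolean target) {
        if (classify(features) == target) {
            return false; // correct prediction: nothing to learn
        }
        float factor = target ? promotion : demotion;
        for (String f : features) {
            // An absent feature starts at 1.0, so its new weight is the factor itself.
            weights.merge(f, factor, (old, fac) -> old * fac);
        }
        return true;
    }

    public static void main(String[] args) {
        ErrorDrivenSketch w = new ErrorDrivenSketch();
        Set<String> item = Set.of("buy", "now");
        System.out.println("updated: " + w.trainOnError(item, true));
        System.out.println("updated again: " + w.trainOnError(item, true));
    }
}
```

In this sketch the second call to trainOnError leaves the weights untouched because the first update already corrected the prediction.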
Field Summary |
Fields inherited from interface de.fu_berlin.ties.classify.Classifier |
CONFIG_CLASSIFIER |
Constructor Summary | |
Winnow(Set allValidClasses)
Creates a new instance based on the standard configuration. |
|
Winnow(Set allValidClasses,
FeatureTransformer trans,
boolean balance,
float promotionFactor,
float demotionFactor,
float thresholdThick,
int featureNum)
Creates a new instance. |
|
Winnow(Set allValidClasses,
FeatureTransformer trans,
TiesConfiguration config)
Creates a new instance based on the provided configuration. |
|
Winnow(Set allValidClasses,
TiesConfiguration config)
Creates a new instance based on the provided configuration. |
Method Summary | |
protected void |
adjustWeights(Feature feature,
short[] directions)
Adjusts the weights of a feature for all classes. |
protected void |
chooseClassesToAdjust(WinnowDistribution winnowDist,
String targetClass,
Set classesToPromote,
Set classesToDemote)
Chooses the classes to promote and the classes to demote. |
protected double |
confidence(float sigmoid,
float sum)
Converts a sigmoid activation value into a confidence estimate. |
protected float |
defaultWeight()
Returns the default weight to use if a feature is unknown. |
protected PredictionDistribution |
doClassify(FeatureVector features,
Set candidateClasses)
Classifies an item that is represented by a feature vector by choosing the most probable class among a set of candidate classes. |
protected void |
doTrain(FeatureVector features,
String targetClass)
Winnow supports only error-driven training, so you always have to use the TrainableClassifier.trainOnError(FeatureVector, String, Set) method
instead of this one. |
protected FeatureSet |
featureSet(FeatureVector fv)
Converts a feature vector into a FeatureSet (a multi-set of
features). |
float |
getDemotion()
Returns the demotion factor used by the algorithm. |
float |
getPromotion()
Returns the promotion factor used by the algorithm. |
float |
getThresholdThickness()
Returns the thickness of the threshold if the "thick threshold" heuristic is used. |
protected float[] |
initScores()
Initializes the score (activation values) to use for all classes. |
protected float |
initWeight()
Returns the initial weight to use for each feature per class. |
protected float[] |
initWeightArray()
Returns the initial weight array to use for a feature for all classes. |
boolean |
isBalanced()
Whether the Balanced Winnow or the standard Winnow algorithm is used. |
protected float |
majorThreshold(float threshold,
float rawThreshold)
Calculates the major threshold (theta+) to use for classification with the "thick threshold" heuristic. |
protected float |
minorThreshold(float threshold,
float rawThreshold)
Calculates the minor threshold (theta-) to use for classification with the "thick threshold" heuristic. |
protected float |
rawThreshold(FeatureSet features)
Calculates the threshold (theta) to use for classification, based on the number of active features. |
protected float |
sigmoid(float score,
float threshold,
float rawThreshold)
Converts the raw score (activation value) to a value in the range from 0 to 1 via a sigmoid function depending on the threshold theta. |
protected float |
threshold(float rawThreshold)
Calculates the threshold (theta) to use for classification. |
String |
toString()
Returns a string representation of this object. |
protected boolean |
trainOnErrorHook(PredictionDistribution predDist,
FeatureVector features,
String targetClass,
Set candidateClasses)
Hook implementing error-driven learning, promoting and demoting weights as required. |
protected void |
updateScores(Feature feature,
float[] scores)
Updates the score (activation values) for all classes by adding the weights of a feature. |
Methods inherited from class de.fu_berlin.ties.classify.TrainableClassifier |
classify, createClassifier, createClassifier, createClassifier, getAllClasses, train, trainOnError |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Constructor Detail |
public Winnow(Set allValidClasses) throws IllegalArgumentException
Parameters:
allValidClasses - the set of all valid classes
Throws:
IllegalArgumentException - if one of the parameters is outside the allowed range
public Winnow(Set allValidClasses, TiesConfiguration config) throws IllegalArgumentException
Parameters:
allValidClasses - the set of all valid classes
config - contains configuration properties
Throws:
IllegalArgumentException - if one of the parameters is outside the allowed range
public Winnow(Set allValidClasses, FeatureTransformer trans, TiesConfiguration config) throws IllegalArgumentException
Parameters:
allValidClasses - the set of all valid classes
trans - the last transformer in the transformer chain to use, or null if no feature transformers should be used
config - contains configuration properties
Throws:
IllegalArgumentException - if one of the parameters is outside the allowed range
public Winnow(Set allValidClasses, FeatureTransformer trans, boolean balance, float promotionFactor, float demotionFactor, float thresholdThick, int featureNum) throws IllegalArgumentException
Parameters:
allValidClasses - the set of all valid classes
trans - the last transformer in the transformer chain to use, or null if no feature transformers should be used
balance - whether to use the Balanced Winnow or the standard Winnow algorithm
promotionFactor - the promotion factor used by the algorithm; must be > 1.0
demotionFactor - the demotion factor used by the algorithm; must be < 1.0
thresholdThick - the thickness of the threshold if the "thick threshold" heuristic is used (must be < 1.0), 0.0 otherwise
featureNum - the number of features to store
Throws:
IllegalArgumentException - if one of the parameters is outside the allowed range
Method Detail |
protected void adjustWeights(Feature feature, short[] directions)
Parameters:
feature - the feature to process
directions - an array specifying for each class (in alphabetic order) whether it should be promoted (positive value), demoted (negative value) or left unmodified (0)
protected void chooseClassesToAdjust(WinnowDistribution winnowDist, String targetClass, Set classesToPromote, Set classesToDemote)
Chooses the classes to promote and the classes to demote. The targetClass is chosen for promotion if its score is less than or equal to the threshold. All other classes are chosen for demotion if their score is greater than the threshold.
Parameters:
winnowDist - the prediction distribution returned by TrainableClassifier.classify(FeatureVector, Set)
targetClass - the expected class of this instance; must be contained in the set of candidateClasses
classesToPromote - the classes to promote are added to this set
classesToDemote - the classes to demote are added to this set
protected double confidence(float sigmoid, float sum)
Parameters:
sigmoid - the sigmoid activation value to convert
sum - the sum of all sigmoid activation values
Returns:
sigmoid / sum
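The confidence estimate above is a plain normalization: each class's sigmoid activation is divided by the sum over all classes, so the estimates add up to 1. A minimal sketch follows; the class ConfidenceSketch and the helper confidences are hypothetical names introduced here, not part of the documented API.

```java
/** Hypothetical illustration of the sigmoid / sum normalization. */
final class ConfidenceSketch {
    /** Share of one sigmoid activation in the sum of all activations. */
    static double confidence(float sigmoid, float sum) {
        return (double) sigmoid / sum;
    }

    /** Normalizes a whole array of sigmoid activations (assumed helper). */
    static double[] confidences(float[] sigmoids) {
        float sum = 0f;
        for (float s : sigmoids) {
            sum += s;
        }
        double[] result = new double[sigmoids.length];
        for (int i = 0; i < sigmoids.length; i++) {
            result[i] = confidence(sigmoids[i], sum);
        }
        return result;
    }

    public static void main(String[] args) {
        double[] c = confidences(new float[] {0.5f, 0.25f, 0.25f});
        System.out.println(java.util.Arrays.toString(c));
    }
}
```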
protected float defaultWeight()
Balanced Winnow (where positive and negative weights should cancel each other out), initWeight() otherwise.
protected PredictionDistribution doClassify(FeatureVector features, Set candidateClasses)
doClassify in class TrainableClassifier
Parameters:
features - the feature vector to consider
candidateClasses - a set of classes that are allowed for this item
Returns:
the prediction distribution; call PredictionDistribution.best() to get the most probable class
protected void doTrain(FeatureVector features, String targetClass) throws UnsupportedOperationException
Winnow supports only error-driven training, so you always have to use the TrainableClassifier.trainOnError(FeatureVector, String, Set) method instead of this one. Trying to call this method instead will result in an UnsupportedOperationException.
doTrain in class TrainableClassifier
Parameters:
features - ignored by this method
targetClass - ignored by this method
Throws:
UnsupportedOperationException - always thrown by this method; use TrainableClassifier.trainOnError(FeatureVector, String, Set) instead
protected FeatureSet featureSet(FeatureVector fv)
Converts a feature vector into a FeatureSet (a multi-set of features). If the provided vector already is a FeatureSet instance, it is cast and returned. Otherwise a new FeatureSet with the same contents is created.
Parameters:
fv - the feature vector to convert
public float getDemotion()
public float getPromotion()
public boolean isBalanced()
protected float[] initScores()
public float getThresholdThickness()
protected float initWeight()
protected float[] initWeightArray()
Balanced Winnow. Each element is initialized to initWeight().
protected float majorThreshold(float threshold, float rawThreshold)
Parameters:
threshold - the threshold theta
rawThreshold - the raw threshold theta_r
See Also:
minorThreshold(float, float)
protected float minorThreshold(float threshold, float rawThreshold)
Parameters:
threshold - the threshold theta
rawThreshold - the raw threshold theta_r
See Also:
majorThreshold(float, float)
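The "thick threshold" heuristic widens the region around theta in which an instance counts as a mistake: the target class is promoted unless its score clears the major threshold, and other classes are demoted once their score exceeds the minor threshold. The sketch below shows one way this could work. The formulas theta * (1 + thickness) and theta * (1 - thickness) are an assumption, since this page does not state how the two thresholds are computed, and the class ThickThresholdSketch, its map of scores, and its method signatures are hypothetical and differ from the documented ones.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical illustration of class selection with a thick threshold. */
final class ThickThresholdSketch {
    /** Assumed form of the major threshold: theta widened upwards. */
    static float majorThreshold(float threshold, float thickness) {
        return threshold * (1f + thickness);
    }

    /** Assumed form of the minor threshold: theta widened downwards. */
    static float minorThreshold(float threshold, float thickness) {
        return threshold * (1f - thickness);
    }

    /** Adds the classes that need promotion or demotion to the given sets. */
    static void chooseClassesToAdjust(Map<String, Float> scores, String targetClass,
            float threshold, float thickness,
            Set<String> classesToPromote, Set<String> classesToDemote) {
        float major = majorThreshold(threshold, thickness);
        float minor = minorThreshold(threshold, thickness);
        for (Map.Entry<String, Float> e : scores.entrySet()) {
            float score = e.getValue();
            if (e.getKey().equals(targetClass)) {
                if (score <= major) {
                    classesToPromote.add(e.getKey()); // target not clearly above theta
                }
            } else if (score > minor) {
                classesToDemote.add(e.getKey()); // other class not clearly below theta
            }
        }
    }

    public static void main(String[] args) {
        Set<String> promote = new TreeSet<>();
        Set<String> demote = new TreeSet<>();
        chooseClassesToAdjust(Map.of("A", 10.2f, "B", 9.8f, "C", 5.0f), "A",
                10.0f, 0.05f, promote, demote);
        System.out.println("promote=" + promote + " demote=" + demote);
    }
}
```

With a thickness of 0.0 both thresholds collapse to theta and the selection rule reduces to the plain comparison against the threshold.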
protected float rawThreshold(FeatureSet features)
Parameters:
features - the feature set to consider
protected float sigmoid(float score, float threshold, float rawThreshold) throws IllegalArgumentException
Parameters:
score - the raw score (activation value); must be a positive value in case of normal (non-balanced) Winnow
threshold - the threshold theta used for this instance
rawThreshold - the raw threshold theta_r used for this instance
Throws:
IllegalArgumentException - if normal Winnow is used and score <= 0
protected float threshold(float rawThreshold)
Calculates the threshold (theta) to use for classification: the rawThreshold multiplied by the default weight. Subclasses can override this method to calculate the threshold in a different way.
Parameters:
rawThreshold - the raw threshold
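As a small worked example of the relation above, the sketch below multiplies a raw threshold by a default weight. Treating the raw threshold as exactly the number of active features is an assumption (this page only says it is based on that number), and ThresholdSketch with its explicit defaultWeight parameter is a hypothetical stand-in for the documented methods.

```java
/** Hypothetical illustration of threshold = rawThreshold * default weight. */
final class ThresholdSketch {
    /** Assumption: the raw threshold equals the number of active features. */
    static float rawThreshold(int activeFeatures) {
        return activeFeatures;
    }

    /** The raw threshold scaled by the weight an unknown feature would get. */
    static float threshold(float rawThreshold, float defaultWeight) {
        return rawThreshold * defaultWeight;
    }

    public static void main(String[] args) {
        float raw = rawThreshold(4);
        System.out.println("theta = " + threshold(raw, 1.0f));
    }
}
```

Under these assumptions, an item with only unknown features scores exactly theta, so a feature has to be promoted before the item's score can exceed the threshold.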
protected boolean trainOnErrorHook(PredictionDistribution predDist, FeatureVector features, String targetClass, Set candidateClasses) throws ProcessingException
trainOnErrorHook in class TrainableClassifier
Parameters:
predDist - the prediction distribution returned by TrainableClassifier.classify(FeatureVector, Set); must be a WinnowDistribution
features - the feature vector to consider
targetClass - the expected class of this feature vector; must be contained in the set of candidateClasses
candidateClasses - a set of classes that are allowed for this item (the actual targetClass must be one of them)
Returns:
true to signal that any error-driven learning was already handled
Throws:
ProcessingException - if an error occurs during training
public String toString()
toString in class TrainableClassifier
protected void updateScores(Feature feature, float[] scores)
Parameters:
feature - the feature to process
scores - an array of floats containing the scores for each class; will be updated by this method
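The interplay of updateScores and adjustWeights can be pictured as a table that maps each feature to one weight per class. The sketch below is a simplified, non-balanced illustration: WeightTableSketch, its string feature keys, and the initial and default weight of 1.0 are assumptions made for this example, not details taken from the class itself.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-feature weight table with one weight per class. */
final class WeightTableSketch {
    private final int classCount;
    private final float promotion;
    private final float demotion;
    private final Map<String, float[]> weights = new HashMap<>();

    WeightTableSketch(int classCount, float promotion, float demotion) {
        if (promotion <= 1.0f || demotion >= 1.0f) {
            throw new IllegalArgumentException("need promotion > 1.0 and demotion < 1.0");
        }
        this.classCount = classCount;
        this.promotion = promotion;
        this.demotion = demotion;
    }

    /** Adds the feature's weight for each class to the running scores. */
    void updateScores(String feature, float[] scores) {
        float[] w = weights.get(feature);
        for (int i = 0; i < scores.length; i++) {
            scores[i] += (w == null) ? 1.0f : w[i]; // unknown feature: assumed weight 1.0
        }
    }

    /** Promotes (> 0), demotes (< 0) or keeps (0) the weight of each class. */
    void adjustWeights(String feature, short[] directions) {
        float[] w = weights.computeIfAbsent(feature, f -> initWeightArray());
        for (int i = 0; i < directions.length; i++) {
            if (directions[i] > 0) {
                w[i] *= promotion;
            } else if (directions[i] < 0) {
                w[i] *= demotion;
            }
        }
    }

    /** Every class starts with the assumed initial weight 1.0. */
    private float[] initWeightArray() {
        float[] w = new float[classCount];
        Arrays.fill(w, 1.0f);
        return w;
    }

    public static void main(String[] args) {
        WeightTableSketch table = new WeightTableSketch(3, 2.0f, 0.5f);
        table.adjustWeights("f", new short[] {1, -1, 0});
        float[] scores = new float[3];
        table.updateScores("f", scores);
        System.out.println(Arrays.toString(scores));
    }
}
```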