java.lang.Object
de.fu_berlin.ties.ConfigurableProcessor
de.fu_berlin.ties.DirectoryProcessor
de.fu_berlin.ties.extract.TrainEval
Trains an extractor and evaluates extraction quality.
Nested Class Summary | |
static class |
TrainEval.Results
An inner class wrapping the results of a training + evaluation run. |
Field Summary | |
static String |
CONFIG_FILE_EXT
Configuration key: The extension(s) of files to evaluate. |
static String |
CONFIG_RUN
Configuration key: Number of evaluation runs to do to get average results. |
static String |
CONFIG_TRAIN_SPLIT
Configuration key: The percentage of a corpus to use for training. |
static String |
CONFIG_UNIFORM
Configuration key for the isUniform() attribute. |
static String |
KEY_RUN
Serialization key for the number of the run. |
static String |
OUTPUT_DIR
The base name of the subdirectory created and used to store the output results. |
static String |
RUN_DIR
The base name of the subdirectories created in the OUTPUT_DIR
to store the results of each evaluation run. |
Constructor Summary | |
TrainEval()
Creates a new instance, using the standard configuration. |
|
TrainEval(FileFilter filter,
float trainingSplit,
int runNo,
boolean uniformTesting,
TiesConfiguration config)
Creates a new instance. |
|
TrainEval(TiesConfiguration config)
Creates a new instance. |
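The five-argument constructor is documented to throw an IllegalArgumentException when trainingSplit is not a percentage between 0 and 1 or when the run count is non-positive. The following is a minimal, self-contained sketch of such a check, not the TIES implementation; the class TrainEvalArgsSketch and its check method are hypothetical names, and rejecting NaN is an added assumption.

```java
/** Hypothetical helper illustrating the documented argument checks; not part of TIES. */
final class TrainEvalArgsSketch {
    private TrainEvalArgsSketch() { }

    /**
     * Rejects a training split outside [0, 1] and a non-positive run count.
     * Treating NaN as invalid is an assumption of this sketch.
     */
    static void check(float trainingSplit, int runNo) {
        if (Float.isNaN(trainingSplit) || trainingSplit < 0.0f || trainingSplit > 1.0f) {
            throw new IllegalArgumentException(
                "trainingSplit must be between 0 and 1: " + trainingSplit);
        }
        if (runNo <= 0) {
            throw new IllegalArgumentException(
                "number of runs must be positive: " + runNo);
        }
    }

    public static void main(String[] args) {
        check(0.8f, 3); // accepted: 80% training, three runs
        try {
            check(1.5f, 3); // rejected: not a percentage
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```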
Method Summary | |
float |
getEvalSplit()
Returns the percentage of a corpus to use for evaluation. |
int |
getRuns()
Returns the number of evaluation runs to do to get average results. |
float |
getTrainSplit()
Returns the percentage of a corpus to use for training; the remaining documents (1-x) are used for evaluation. |
protected Extractor |
initExtractor(Trainer trainer)
Creates and initializes an extractor to use for an evaluation run, re-using the components of the provided trainer. |
protected Trainer |
initTrainer(File runDirectory)
Creates and initializes a trainer to use for an evaluation run, configured from the stored configuration. |
boolean |
isUniform()
If true, the evaluator does two runs with a 50/50 split,
using each file once for training and once for evaluation. |
void |
process(File[] files,
ContextMap context)
Processes an array of files, calling the trainAndEval(File[], ContextMap, File, int) method
getRuns() times. |
String |
toString()
Returns a string representation of this object. |
TrainEval.Results |
trainAndEval(File[] files,
ContextMap context,
File runDirectory,
int runNo)
Processes an array of files. |
Methods inherited from class de.fu_berlin.ties.DirectoryProcessor |
process, process |
Methods inherited from class de.fu_berlin.ties.ConfigurableProcessor |
getConfig |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
public static final String OUTPUT_DIR
public static final String RUN_DIR
The base name of the subdirectories created in the OUTPUT_DIR to store the results of each evaluation run.
public static final String CONFIG_FILE_EXT
public static final String CONFIG_TRAIN_SPLIT
public static final String CONFIG_RUN
public static final String CONFIG_UNIFORM
Configuration key for the isUniform() attribute.
public static final String KEY_RUN
Constructor Detail |
public TrainEval() throws IllegalArgumentException, ClassCastException, NoSuchElementException
IllegalArgumentException - if the configured values are outside the allowed ranges
ClassCastException - if the configured numeric values cannot be parsed
NoSuchElementException - if one of the required values is missing from the configuration
public TrainEval(TiesConfiguration config) throws IllegalArgumentException, ClassCastException, NoSuchElementException
config - used to configure this instance
IllegalArgumentException - if the configured values are outside the allowed ranges
ClassCastException - if the configured numeric values cannot be parsed
NoSuchElementException - if one of the required values is missing from the configuration
public TrainEval(FileFilter filter, float trainingSplit, int runNo, boolean uniformTesting, TiesConfiguration config) throws IllegalArgumentException
filter - the filter used to decide which files to accept
trainingSplit - the percentage of a corpus to use for training; the remaining documents (1-x) are used for evaluation
runNo - the number of evaluation runs to do to get average results
uniformTesting - if true, the evaluator does two runs with a 50/50 split, using each file once for training and once for evaluation (ignoring the trainingSplit and runNo arguments)
config - used to configure superclasses, trainer, and extractor; if null, the standard configuration is used
IllegalArgumentException - if trainingSplit is not a percentage (larger than 1 or smaller than 0) or if runNo is non-positive
Method Detail |
public float getEvalSplit()
getTrainSplit()
public int getRuns()
public float getTrainSplit()
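The relationship between the two splits can be illustrated with a short sketch: the evaluation share is the remainder (1-x) of the training share x. The class SplitSketch and its methods are hypothetical and not part of TIES; in particular, how TIES rounds a fractional file count is not documented here, so the rounding below is an assumption.

```java
/** Hypothetical helper illustrating the train/eval split arithmetic; not part of TIES. */
final class SplitSketch {
    private SplitSketch() { }

    /** The evaluation share is whatever is not used for training (1 - x). */
    static float evalSplit(float trainSplit) {
        return 1.0f - trainSplit;
    }

    /**
     * Number of files that would go into training for a corpus of the given size.
     * Rounding to the nearest integer is an assumption of this sketch.
     */
    static int trainCount(int totalFiles, float trainSplit) {
        return Math.round(totalFiles * trainSplit);
    }

    public static void main(String[] args) {
        int total = 8;
        float trainSplit = 0.75f;
        int train = trainCount(total, trainSplit);
        System.out.println(train + " training files, "
            + (total - train) + " evaluation files, eval share "
            + evalSplit(trainSplit));
    }
}
```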
protected Extractor initExtractor(Trainer trainer)
trainer - trainer whose components should be re-used
protected Trainer initTrainer(File runDirectory) throws ProcessingException
Creates and initializes a trainer to use for an evaluation run, configured from the stored configuration. Subclasses can override this method to provide a different trainer.
runDirectory - directory used to run the classifier
ProcessingException - if an error occurs during initialization
public boolean isUniform()
If true, the evaluator does two runs with a 50/50 split, using each file once for training and once for evaluation. The getRuns() and getTrainSplit() settings are ignored in this case.
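The uniform mode described above can be sketched as a two-fold swap: the file list is cut into two halves, and each half serves once for training and once for evaluation. This is only an illustration under stated assumptions; UniformSplitSketch and Run are hypothetical names, file names stand in for File objects, and how TIES actually picks the two halves (for example, whether it shuffles first) is not documented here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of a two-run 50/50 evaluation; not part of TIES. */
final class UniformSplitSketch {
    private UniformSplitSketch() { }

    /** The files used for training and for evaluation in a single run. */
    static final class Run {
        final List<String> train;
        final List<String> eval;

        Run(List<String> train, List<String> eval) {
            this.train = train;
            this.eval = eval;
        }
    }

    /**
     * Builds two runs with swapped halves. Cutting the list in the middle
     * without shuffling is an assumption of this sketch.
     */
    static List<Run> uniformRuns(List<String> files) {
        int mid = files.size() / 2;
        List<String> first = new ArrayList<>(files.subList(0, mid));
        List<String> second = new ArrayList<>(files.subList(mid, files.size()));

        List<Run> runs = new ArrayList<>();
        runs.add(new Run(first, second)); // run 1: train on first half
        runs.add(new Run(second, first)); // run 2: roles swapped
        return runs;
    }

    public static void main(String[] args) {
        List<Run> runs = uniformRuns(Arrays.asList("a.txt", "b.txt", "c.txt", "d.txt"));
        for (int i = 0; i < runs.size(); i++) {
            System.out.println("run " + (i + 1) + ": train=" + runs.get(i).train
                + " eval=" + runs.get(i).eval);
        }
    }
}
```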
public void process(File[] files, ContextMap context) throws IOException, ProcessingException
Processes an array of files, calling the trainAndEval(File[], ContextMap, File, int) method getRuns() times. For each file, a corresponding answer key (*.ans) must exist.
process in class DirectoryProcessor
files - the array of files to process
context - a map of objects that are made available for processing; will be empty when called from the implemented process methods in this class
IOException - if an I/O error occurs
ProcessingException - if an error occurs during processing
public String toString()
toString in class DirectoryProcessor
public TrainEval.Results trainAndEval(File[] files, ContextMap context, File runDirectory, int runNo) throws IOException, ProcessingException
files - the array of files to process
context - a map of objects that are made available for processing; will be empty when called from the implemented process methods in this class
runDirectory - directory used to do this run and store the results
runNo - the number of this run (counting starts with 1)
IOException - if an I/O error occurs
ProcessingException - if an error occurs during processing
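The interplay of process and trainAndEval (one call per run, run numbers starting with 1, results averaged over the runs) can be sketched roughly as follows. This is a simplified stand-in rather than the TIES code: RunLoopSketch is a hypothetical class, each run is reduced to a single numeric score instead of a TrainEval.Results object, and the per-run directory handling is left out because the values of OUTPUT_DIR and RUN_DIR are not given here.

```java
import java.util.function.IntToDoubleFunction;

/** Hypothetical illustration of repeating an evaluation run and averaging; not part of TIES. */
final class RunLoopSketch {
    private RunLoopSketch() { }

    /**
     * Calls the given single-run function once per run, numbering runs from 1,
     * and returns the mean of the scores. Reducing a run to one score is an
     * assumption of this sketch.
     */
    static double averageOverRuns(int runs, IntToDoubleFunction singleRun) {
        if (runs <= 0) {
            throw new IllegalArgumentException("number of runs must be positive: " + runs);
        }
        double sum = 0.0;
        for (int runNo = 1; runNo <= runs; runNo++) {
            sum += singleRun.applyAsDouble(runNo);
        }
        return sum / runs;
    }

    public static void main(String[] args) {
        // Stand-in for a train-and-evaluate step: the score depends only on the run number.
        double average = averageOverRuns(4, runNo -> 0.5 + 0.1 * runNo);
        System.out.println("average score over 4 runs: " + average);
    }
}
```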