de.fu_berlin.ties.eval
Class MistakeAnalyzer

java.lang.Object
  extended by de.fu_berlin.ties.ConfigurableProcessor
      extended by de.fu_berlin.ties.TextProcessor
          extended by de.fu_berlin.ties.eval.MistakeAnalyzer
All Implemented Interfaces:
Closeable, Processor

public class MistakeAnalyzer
extends TextProcessor
implements Closeable

Reads an EvaluatedExtractionContainer (in DSV format) and analyses the types of prediction errors that occurred. Detects misplaced borders (early or late start or end), type confusion (e.g. "end-time" instead of "start-time") and some other kinds of errors.
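
The border categories named above (early or late start or end) can be illustrated with a small self-contained sketch. It is not the TIES implementation: the class BorderMistakeSketch, the enum constants, and the use of token offsets with an exclusive end are all assumptions made for illustration.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical illustration only; not part of the TIES API. */
final class BorderMistakeSketch {

    /** Assumed names for the four border mistakes mentioned in the class description. */
    enum BorderMistake { EARLY_START, LATE_START, EARLY_END, LATE_END }

    /**
     * Compares a predicted span with the expected span.
     * Offsets are assumed to be token indices; the end offset is exclusive.
     * An empty result means both borders match.
     */
    static Set<BorderMistake> classify(int predStart, int predEnd,
                                       int trueStart, int trueEnd) {
        Set<BorderMistake> mistakes = EnumSet.noneOf(BorderMistake.class);
        if (predStart < trueStart) {
            mistakes.add(BorderMistake.EARLY_START);
        } else if (predStart > trueStart) {
            mistakes.add(BorderMistake.LATE_START);
        }
        if (predEnd < trueEnd) {
            mistakes.add(BorderMistake.EARLY_END);
        } else if (predEnd > trueEnd) {
            mistakes.add(BorderMistake.LATE_END);
        }
        return mistakes;
    }

    public static void main(String[] args) {
        // Prediction starts two tokens too early, end border is correct.
        System.out.println(classify(3, 9, 5, 9)); // prints [EARLY_START]
    }
}
```

A single prediction can carry two border mistakes at once (for example a late start and a late end), which is why the sketch returns a set instead of a single value.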

Needs access to the preprocessed ("augmented") input files. The directory containing these files can be specified using the CONFIG_AUG_DIR configuration key.

Instances of this type are not thread-safe.

Version:
$Revision: 1.20 $, $Date: 2006/10/21 16:04:11 $, $Author: siefkes $
Author:
Christian Siefkes

Field Summary
static String CONFIG_AUG_DIR
          Configuration key: the directory containing the augmented (preprocessed) input texts, relative to the current working directory.
 
Fields inherited from class de.fu_berlin.ties.TextProcessor
CONFIG_POST, KEY_DIRECTORY, KEY_LOCAL_NAME, KEY_OUT_DIRECTORY, KEY_URL
 
Constructor Summary
MistakeAnalyzer()
          Creates a new instance, using a default extension and the standard configuration.
MistakeAnalyzer(String outExt)
          Creates a new instance, using the standard configuration.
MistakeAnalyzer(String outExt, TiesConfiguration conf)
          Creates a new instance.
 
Method Summary
 void analyzeBatch(ExtractionContainer predictions, ExtractionContainer answers, String batchSource)
          Analyses a batch of predictions and answer keys for a specific source file, determining the types of mistakes that occurred.
 MistakeMatrix analyzeMistakes(ExtractionContainer extractions, String localName)
          Analyses an evaluated extraction container, determining the types of mistakes that occurred.
 MistakeMatrix analyzeMistakes(Reader reader, String localName)
          Analyses the serialized contents of an evaluated extraction container and determines the types of mistakes that occurred, delegating to analyzeMistakes(ExtractionContainer, String).
 void close(int errorCount)
          Closes this instance, releasing all resources and stopping any background threads.
protected  void doProcess(Reader reader, Writer writer, ContextMap context)
          Processes the contents of a reader, writing a modified version to a writer.
 
Methods inherited from class de.fu_berlin.ties.TextProcessor
getOutFileExt, process, process, process, process, process, process, toString
 
Methods inherited from class de.fu_berlin.ties.ConfigurableProcessor
getConfig
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

CONFIG_AUG_DIR

public static final String CONFIG_AUG_DIR
Configuration key: the directory containing the augmented (preprocessed) input texts, relative to the current working directory.
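
As a rough sketch of what "relative to the current working directory" means in practice, the following resolves a configured directory name against a working directory using java.nio.file. The class AugDirSketch, the method resolveAugDir and the directory name "aug" are hypothetical; how TIES itself resolves the value is not shown in this documentation.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical illustration only; not part of the TIES API. */
final class AugDirSketch {

    /**
     * Resolves a configured directory against a working directory.
     * If configuredDir is absolute, Path.resolve returns it unchanged.
     */
    static Path resolveAugDir(Path workingDir, String configuredDir) {
        return workingDir.resolve(configuredDir).normalize();
    }

    public static void main(String[] args) {
        // "user.dir" is the JVM's current working directory.
        Path cwd = Paths.get(System.getProperty("user.dir"));
        // "aug" is a placeholder value for the configured directory.
        System.out.println(resolveAugDir(cwd, "aug"));
    }
}
```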

See Also:
Constant Field Values
Constructor Detail

MistakeAnalyzer

public MistakeAnalyzer()
Creates a new instance, using a default extension and the standard configuration.


MistakeAnalyzer

public MistakeAnalyzer(String outExt)
Creates a new instance, using the standard configuration.

Parameters:
outExt - the extension to use for output files

MistakeAnalyzer

public MistakeAnalyzer(String outExt,
                       TiesConfiguration conf)
Creates a new instance.

Parameters:
outExt - the extension to use for output files
conf - the configuration to use
Method Detail

analyzeBatch

public void analyzeBatch(ExtractionContainer predictions,
                         ExtractionContainer answers,
                         String batchSource)
                  throws IOException,
                         ProcessingException
Analyses a batch of predictions and answer keys for a specific source file, determining the types of mistakes that occurred.

Parameters:
predictions - the container of predicted extractions
answers - the container of expected extractions (answer keys)
batchSource - the source of extractions of this batch
Throws:
IOException - if an I/O error occurs while reading the source (AUG) file
ProcessingException - if an error occurs during processing

analyzeMistakes

public MistakeMatrix analyzeMistakes(ExtractionContainer extractions,
                                     String localName)
                              throws IOException,
                                     ProcessingException
Analyses an evaluated extraction container, determining the types of mistakes that occurred.

Parameters:
extractions - the container of evaluated extractions
localName - how to refer to this run (will be used for statistics calculated by the close(int) method)
Returns:
a MistakeMatrix giving details and statistics on the mistakes that occurred
Throws:
IOException - if an I/O error occurs while reading the source (AUG) file
ProcessingException - if an error occurs during processing

analyzeMistakes

public MistakeMatrix analyzeMistakes(Reader reader,
                                     String localName)
                              throws IOException,
                                     ProcessingException
Analyses the serialized contents of an evaluated extraction container and determines the types of mistakes that occurred, delegating to analyzeMistakes(ExtractionContainer, String).

Parameters:
reader - reader containing the extractions to analyse in DelimSepValues format; not closed by this method
localName - how to refer to this run (will be used for statistics calculated by the close(int) method)
Returns:
a MistakeMatrix giving details and statistics on the mistakes that occurred
Throws:
IOException - if an I/O error occurs while reading the extractions or a corresponding source (AUG) file
ProcessingException - if an error occurs during processing
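
Because the reader is not closed by this method, the caller remains responsible for closing it. The sketch below shows that ownership pattern with try-with-resources. The method countRecords is a hypothetical stand-in that only counts non-empty lines; it is not the TIES method, and the input text is placeholder data, not real DelimSepValues content.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** Hypothetical illustration only; not part of the TIES API. */
final class ReaderOwnershipSketch {

    /** Stand-in for a method that reads from a reader but never closes it. */
    static int countRecords(Reader reader) throws IOException {
        // Wrapping is fine; the wrapper is deliberately left unclosed
        // so that the caller's reader stays open.
        BufferedReader in = new BufferedReader(reader);
        int count = 0;
        String line;
        while ((line = in.readLine()) != null) {
            if (!line.trim().isEmpty()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // The caller opens the reader and closes it via try-with-resources.
        try (Reader reader = new StringReader("first record\nsecond record\n")) {
            System.out.println(countRecords(reader)); // prints 2
        }
    }
}
```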

close

public void close(int errorCount)
           throws IOException
Closes this instance, releasing all resources and stopping any background threads.

Specified by:
close in interface Closeable
Parameters:
errorCount - the number of errors (exceptions) that occurred during calls to this instance (0 if none)
Throws:
IOException - if an I/O error occurs

doProcess

protected void doProcess(Reader reader,
                         Writer writer,
                         ContextMap context)
                  throws IOException,
                         ProcessingException
Processes the contents of a reader, writing a modified version to a writer.

Specified by:
doProcess in class TextProcessor
Parameters:
reader - reader containing the text to process; should not be closed by this method
writer - the writer to write the processed text to; might be flushed but not closed by this method; if this method does not use the writer, the underlying file will be deleted afterwards
context - a map of objects that are made available for processing; when called from the implemented process methods in this class, it will contain mappings from IOUtils.KEY_LOCAL_CHARSET to the character set of the output writer; from TextProcessor.KEY_OUT_DIRECTORY to the output directory (File); from ContentType.KEY_MIME_TYPE to the document's MIME type; from TextProcessor.KEY_LOCAL_NAME to the local name (String); and either from TextProcessor.KEY_DIRECTORY to the input directory (File, in case of a local file) or from TextProcessor.KEY_URL to the URL (otherwise) of the processed document
Throws:
IOException - if an I/O error occurs
ProcessingException - if an error occurs during processing


Copyright © 2003-2007 Christian Siefkes. All Rights Reserved.