java.lang.Object
de.fu_berlin.ties.ConfigurableProcessor
de.fu_berlin.ties.TextProcessor
de.fu_berlin.ties.DocumentReader
de.fu_berlin.ties.extract.AnswerBuilder
Builds an ExtractionContainer of answer keys from an annotated text (in XML format).
Instances of this class are thread-safe and can process several documents in parallel.
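The thread-safety statement above means that one instance can be shared by several worker threads. The following is only a minimal sketch of that usage pattern: `StandInBuilder` is a hypothetical stand-in (it is not the TIES `AnswerBuilder` and uses none of its API), and documents are represented by plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelBuildSketch {

    /** Hypothetical stand-in for a thread-safe builder; NOT the TIES AnswerBuilder. */
    static final class StandInBuilder {
        /** Keeps no mutable state, so concurrent calls cannot interfere with each other. */
        String build(String documentName) {
            return "answers(" + documentName + ")";
        }
    }

    /** Processes all documents on a fixed-size pool, sharing a single builder instance. */
    static List<String> buildAll(List<String> documentNames, int threads) {
        final StandInBuilder shared = new StandInBuilder();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (final String name : documentNames) {
                tasks.add(() -> shared.build(name));
            }
            // invokeAll returns the futures in the same order as the submitted tasks
            List<String> results = new ArrayList<>();
            for (Future<String> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        docs.add("a.xml");
        docs.add("b.xml");
        System.out.println(buildAll(docs, 2));
    }
}
```

With the real class, the shared object would be a single `AnswerBuilder` and each task would call one of its processing methods; how documents are loaded is outside the scope of this sketch.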
Field Summary | |
static String |
EXT_ANSWERS
The recommended file extension to use for storing answer keys. |
static String |
KEY_ANSWERS
Context key referring to the extraction container used for storing the answer keys. |
Fields inherited from class de.fu_berlin.ties.TextProcessor |
CONFIG_POST, KEY_DIRECTORY, KEY_LOCAL_NAME, KEY_OUT_DIRECTORY, KEY_URL |
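KEY_ANSWERS is described as a context key that refers to the extraction container. As a rough illustration of that idea, the sketch below uses a plain `Map<String, Object>` in place of ContextMap and a `List<String>` in place of ExtractionContainer. Both substitutions, the key value `"answers"`, and the fail-fast check are assumptions made for this example; the actual key string and error handling are not given on this page.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ContextKeySketch {

    /** Hypothetical key value; the real KEY_ANSWERS string is not stated here. */
    static final String ASSUMED_KEY_ANSWERS = "answers";

    private ContextKeySketch() { }

    /** Looks up the container and fails fast if the caller did not provide one. */
    @SuppressWarnings("unchecked")
    static List<String> requireContainer(Map<String, Object> context) {
        Object value = context.get(ASSUMED_KEY_ANSWERS);
        if (!(value instanceof List)) {
            throw new IllegalStateException(
                "context must map '" + ASSUMED_KEY_ANSWERS + "' to a container");
        }
        return (List<String>) value;
    }

    /** Stand-in for element processing: stores the element text as an answer key. */
    static void processElement(String elementText, Map<String, Object> context) {
        requireContainer(context).add(elementText);
    }

    public static void main(String[] args) {
        Map<String, Object> context = new HashMap<>();
        context.put(ASSUMED_KEY_ANSWERS, new ArrayList<String>());
        processElement("placeholder element text", context);
        System.out.println(requireContainer(context));
    }
}
```

Passing the container through the context rather than as a field is one way a shared processor can stay free of per-document state.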
Constructor Summary | |
AnswerBuilder(String outExt)
Creates a new instance, configuring the target structure from the standard configuration. |
|
AnswerBuilder(String outExt,
TargetStructure targetStruct,
TokenizerFactory tFactory,
TiesConfiguration config)
Creates a new instance. |
|
AnswerBuilder(String outExt,
TiesConfiguration config)
Creates a new instance, configuring the target structure from the provided configuration. |
Method Summary | |
ExtractionContainer |
buildAnswers(Document document)
Builds an ExtractionContainer of
answer keys from an annotated XML document. |
TargetStructure |
getTargetStructure()
Returns the target structure specifying the classes to recognize. |
void |
process(Document document,
Writer writer,
ContextMap context)
Builds an ExtractionContainer of
answer keys from an annotated XML document. |
void |
processElement(Element element,
TokenContainer tokenContainer,
ContextMap context)
Classifies an element in an XML document, building features and delegating to the classifier. |
static ExtractionContainer |
readAnswerKeys(TargetStructure targetStruct,
File file,
Configuration config)
Reads back answer keys stored by the process(Document, Writer, ContextMap) method of an instance of
this class. |
static ExtractionContainer |
readCorrespondingAnswerKeys(TargetStructure targetStruct,
File orgFile,
Configuration config)
Reads the answer keys corresponding to a file. |
String |
toString()
Returns a string representation of this object. |
Methods inherited from class de.fu_berlin.ties.DocumentReader |
doProcess |
Methods inherited from class de.fu_berlin.ties.TextProcessor |
getOutFileExt, process, process, process, process |
Methods inherited from class de.fu_berlin.ties.ConfigurableProcessor |
getConfig |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
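readCorrespondingAnswerKeys locates the answer keys for a file by using EXT_ANSWERS in place of the original file's extension. The helper below sketches that name mapping only. `withExtension` is a hypothetical method, not part of the documented API, and `"ans"` is an assumed extension value chosen for illustration; how the real method handles files without an extension is likewise an assumption here.

```java
import java.io.File;

final class AnswerKeyFiles {

    /** Assumed value for illustration only; the real EXT_ANSWERS constant may differ. */
    static final String ASSUMED_EXT_ANSWERS = "ans";

    private AnswerKeyFiles() { }

    /**
     * Returns a file in the same directory as orgFile whose extension is
     * replaced by newExt. If orgFile has no extension, newExt is appended.
     */
    static File withExtension(File orgFile, String newExt) {
        String name = orgFile.getName();
        int dot = name.lastIndexOf('.');
        // A leading dot (as in ".hidden") is treated as part of the name, not as an extension
        String base = (dot > 0) ? name.substring(0, dot) : name;
        return new File(orgFile.getParentFile(), base + "." + newExt);
    }

    public static void main(String[] args) {
        File original = new File("corpus", "document.xml");
        System.out.println(withExtension(original, ASSUMED_EXT_ANSWERS).getName());
    }
}
```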
Field Detail |
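public static final String KEY_ANSWERS
Context key referring to the extraction container used for storing the answer keys.
public static final String EXT_ANSWERS
The recommended file extension to use for storing answer keys.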
Constructor Detail |
public AnswerBuilder(String outExt)
Creates a new instance, configuring the target structure from the standard configuration.
Parameters:
outExt - the extension to use for output files
public AnswerBuilder(String outExt, TiesConfiguration config)
Creates a new instance, configuring the target structure from the provided configuration.
Parameters:
outExt - the extension to use for output files
config - the configuration to use
public AnswerBuilder(String outExt, TargetStructure targetStruct, TokenizerFactory tFactory, TiesConfiguration config)
Creates a new instance.
Parameters:
outExt - the extension to use for output files
targetStruct - the target structure specifying the classes to recognize
tFactory - used to instantiate tokenizers
config - the configuration to use
Method Detail |
public static ExtractionContainer readAnswerKeys(TargetStructure targetStruct, File file, Configuration config) throws IllegalArgumentException, IOException
Reads back answer keys stored by the process(Document, Writer, ContextMap) method of an instance of this class.
Parameters:
targetStruct - the target structure used when creating the answer keys
file - the file containing the answer keys
config - configuration used to determine the character set of the keys (cf. IOUtils.openReader(File, Configuration))
Throws:
IllegalArgumentException - if the type (see de.fu_berlin.ties.classify.Prediction.getType()) of some of the answer keys doesn't fit the target structure
IOException - if an I/O error occurs while reading the file
public static ExtractionContainer readCorrespondingAnswerKeys(TargetStructure targetStruct, File orgFile, Configuration config) throws IllegalArgumentException, IOException
Reads the answer keys corresponding to a file. The answer keys are read from a file that uses EXT_ANSWERS instead of the extension of the original file.
Parameters:
targetStruct - the target structure used when creating the answer keys
orgFile - the file whose answer keys should be returned
config - configuration used to determine the character set of the keys (cf. IOUtils.openReader(File, Configuration))
Throws:
IllegalArgumentException - if the type (see de.fu_berlin.ties.classify.Prediction.getType()) of some of the answer keys doesn't fit the target structure
IOException - if an I/O error occurs while reading the file
public ExtractionContainer buildAnswers(Document document) throws IOException, ProcessingException
Builds an ExtractionContainer of answer keys from an annotated XML document.
Parameters:
document - the document to read
Throws:
IOException - if an I/O error occurs
ProcessingException - if an error occurs during processing
public TargetStructure getTargetStructure()
Returns the target structure specifying the classes to recognize.
public void process(Document document, Writer writer, ContextMap context) throws IOException, ProcessingException
Builds an ExtractionContainer of answer keys from an annotated XML document.
Overrides:
process in class DocumentReader
Parameters:
document - the document to read
writer - the writer to write the processed text to; flushed but not closed by this method
context - a map of objects that are made available for processing
Throws:
IOException - if an I/O error occurs
ProcessingException - if an error occurs during processing
public void processElement(Element element, TokenContainer tokenContainer, ContextMap context)
Specified by:
processElement in interface ElementProcessor
Parameters:
element - the element to process
tokenContainer - a container storing all tokens seen in the document so far; TokenContainer.getLast() contains the textual content of the element and its child elements
context - a map of objects that are made available for processing; the KEY_ANSWERS key must map to an extraction container used for storing the answer keys
public String toString()
Overrides:
toString in class TextProcessor
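The writer passed to process(Document, Writer, ContextMap) is documented as flushed but not closed, which leaves ownership of the writer with the caller. The sketch below illustrates that contract with a hypothetical `writeAnswers` stand-in and made-up placeholder lines; it does not reproduce the real output format or call any TIES code.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Arrays;

final class WriterContractSketch {

    private WriterContractSketch() { }

    /** Stand-in for a process(...)-style method: writes, flushes, leaves the writer open. */
    static void writeAnswers(Iterable<String> lines, Writer writer) throws IOException {
        for (String line : lines) {
            writer.write(line);
            writer.write('\n');
        }
        writer.flush(); // flushed, but deliberately not closed
    }

    /** The caller owns the writer: it may reuse it across calls and closes it at the end. */
    static String demo() {
        StringWriter buffer = new StringWriter();
        try (Writer writer = buffer) {
            writeAnswers(Arrays.asList("first placeholder line"), writer);
            // The writer is still open here, so a second call can append to it
            writeAnswers(Arrays.asList("second placeholder line"), writer);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

Because the method never closes the writer, the same writer can collect output from several calls, and a try-with-resources block on the caller's side is enough to release it.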