TIE Processing Goals

| Name | Class | Output Extension | Further Arguments | Description |
|---|---|---|---|---|
| adjust | de.fu_berlin.ties.xml.XMLAdjuster | xml | | Tries to fix corrupt XML documents, especially documents containing nesting errors |
| analyze | de.fu_berlin.ties.eval.MistakeAnalyzer | mistakes | | Analyses the types of prediction errors that occurred during a test run |
| answers | de.fu_berlin.ties.extract.AnswerBuilder | ans | | Builds answer keys from an annotated text (in XML format) |
| avg-length | de.fu_berlin.ties.eval.AverageLength | avl | | Calculates the average length for extractions of different types and evaluation statuses |
| class-train | de.fu_berlin.ties.classify.ClassTrain | cls | | Classifies a list of files, training the text classifier on each error |
| dsv2xml | de.fu_berlin.ties.xml.convert.DSVtoXMLConverter | xml | | Converts data in DSV format into XML |
| eval-preds | de.fu_berlin.ties.eval.PredictionEvaluator | | | Reads a set of files that must contain predictions and evaluates them against the corresponding answer keys (*.ans files) |
| externalize | de.fu_berlin.ties.io.Externalize | dsv | | Externalizes the contents of a file in DSV format. For each entry, the contents of one specified field (read from the "externalize.key" configuration parameter) are written to an external file, and the output DSV file stores the name of that file in place of the field's content. |
| extract | de.fu_berlin.ties.extract.Extractor | pred | | Extracts relevant information from texts |
| filter | de.fu_berlin.ties.classify.TextFilter | | | A simple filter for classifying and/or training text files |
| preprocess | de.fu_berlin.ties.preprocess.PreProcessor | aug | | Preprocesses documents by converting them to a suitable XML format and adding linguistic information |
| re-eval | de.fu_berlin.ties.eval.ReEvaluator | ext | | Re-evaluates evaluated extractions (useful for switching the match mode -- eval.match.all) |
| shuffle | de.fu_berlin.ties.eval.ShuffleGenerator | | | Creates random "shuffles" of input arguments (e.g. files or URLs) |
| shuffle-lines | de.fu_berlin.ties.eval.LineShuffleGenerator | rand | | Randomly reshuffles the lines in a file |
| simple-quotes | de.fu_berlin.ties.text.SimplifyQuotes | txt | | Simplifies different kinds of quotes that can occur in text files |
| split | de.fu_berlin.ties.io.Split | | | Splits an input file into a series of output files |
| strip | de.fu_berlin.ties.xml.dom.XMLStripper | txt | | Strips all markup from an XML document and stores the resulting plain text |
| train | de.fu_berlin.ties.extract.Trainer | | | Trains the classifier used to extract information |
| train-eval | de.fu_berlin.ties.extract.TrainEval | metrics | | Trains an extractor and evaluates extraction quality |
| unflatten | de.fu_berlin.ties.xml.convert.AttributeUnflatten | xml | | Unflattens an XML document, reading labels for a combination strategy from an XML attribute ("class" by default) |
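
The description of the externalize goal is easier to follow with a small illustration. The sketch below is not the code of de.fu_berlin.ties.io.Externalize; it only mimics the described idea under stated assumptions: entries are already parsed into maps (DSV reading and writing is left out), the class name ExternalizeSketch, the method externalize, and the file naming scheme are hypothetical, and the key argument stands in for the value of the "externalize.key" configuration parameter.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of the "externalize" idea; not part of TIE. */
public class ExternalizeSketch {

    /**
     * Writes the value stored under {@code key} in each entry to its own
     * file inside {@code dir} and returns copies of the entries in which
     * that value is replaced by the name of the file.
     */
    public static List<Map<String, String>> externalize(
            List<Map<String, String>> entries, String key, Path dir)
            throws IOException {
        Files.createDirectories(dir);
        List<Map<String, String>> result = new ArrayList<>();
        int counter = 0;
        for (Map<String, String> entry : entries) {
            // Copy the entry so the input list is left unchanged
            Map<String, String> copy = new LinkedHashMap<>(entry);
            String content = entry.get(key);
            if (content != null) {
                // Assumed naming scheme: <key>-<running number>.txt
                String fileName = key + "-" + (counter++) + ".txt";
                Files.write(dir.resolve(fileName),
                        content.getBytes(StandardCharsets.UTF_8));
                // The entry now stores the file name instead of the content
                copy.put(key, fileName);
            }
            result.add(copy);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("externalize-sketch");
        Map<String, String> entry = new LinkedHashMap<>();
        entry.put("id", "1");
        entry.put("text", "A long text field");
        List<Map<String, String>> out =
                externalize(List.of(entry), "text", dir);
        System.out.println(out.get(0));
    }
}
```

Entries that lack the chosen field are copied unchanged in this sketch; how the actual goal treats such entries is not stated in the table above.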