This document is the API specification for the Trainable Information Extractor (TIE) software. TIE is an incrementally trainable system for information extraction, text classification, and language engineering in general. It employs classification models for working with texts. Other modules make it possible to augment text with linguistic annotations (by delegating to external tools) and to resolve nesting errors and other kinds of well-formedness violations in XML-like input.

Usage Notes

Thread Pooling and Asynchronous Execution

For asynchronous execution of tasks, the static {@link de.fu_berlin.ties.util.TaskRunner} functionality is available. It is often used internally, e.g. by several {@link de.fu_berlin.ties.Processor}s and by {@link de.fu_berlin.ties.util.ExternalCommand}.

To allow efficient thread reuse, it is highly recommended to {@linkplain de.fu_berlin.ties.util.TaskRunner#registerInterest() register your interest} in the default task runner before you start and to {@linkplain de.fu_berlin.ties.util.TaskRunner#deregisterInterest() deregister} when you are done. A good place to do this is at the beginning and end of your main method. Deregister in a finally block so that the call cannot be skipped: if you forget to deregister, your program might run forever, because the worker threads continue waiting for tasks even after all other threads have terminated.
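
The register/deregister pattern described above can be sketched as follows. To keep the example self-contained, it does not use TIE itself: the nested class DemoTaskRunner is a hypothetical stand-in whose internals (an interest counter and a fixed-size thread pool) are assumptions made for illustration and are not taken from the actual TaskRunner implementation. Only the method names registerInterest() and deregisterInterest() come from this specification; submit() and isActive() are invented for the demo.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class RegisterInterestDemo {

    /** Hypothetical stand-in for the interest counting of a shared task runner. */
    static final class DemoTaskRunner {
        private static ExecutorService pool;
        private static int interested;

        private DemoTaskRunner() { }

        /** The first registration creates the shared (non-daemon) worker threads. */
        static synchronized void registerInterest() {
            if (interested == 0) {
                pool = Executors.newFixedThreadPool(2);
            }
            interested++;
        }

        /** The last deregistration shuts the pool down so the JVM can exit. */
        static synchronized void deregisterInterest() {
            if (interested == 0) {
                throw new IllegalStateException("no interest registered");
            }
            interested--;
            if (interested == 0) {
                pool.shutdown();
                pool = null;
            }
        }

        static synchronized boolean isActive() {
            return interested > 0;
        }

        static synchronized <T> Future<T> submit(Callable<T> task) {
            if (pool == null) {
                throw new IllegalStateException("register interest first");
            }
            return pool.submit(task);
        }
    }

    /** Registers, runs one asynchronous task, and always deregisters. */
    static int runWithRegisteredInterest() throws Exception {
        DemoTaskRunner.registerInterest();
        try {
            Future<Integer> result = DemoTaskRunner.submit(() -> 6 * 7);
            return result.get();
        } finally {
            // Runs even if the task fails; without it the idle worker
            // threads would keep the JVM alive.
            DemoTaskRunner.deregisterInterest();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runWithRegisteredInterest());
    }
}
```

In a program that uses TIE, the two stand-in calls would presumably be replaced by the static TaskRunner.registerInterest() and TaskRunner.deregisterInterest() methods linked above, placed around the body of the main method in the same try/finally arrangement.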