Storable
contract.
Accuracy
statistics and the underlying raw counts.
Storable
to this container, by
calling its Storable.storeFields()
method and adding the
resulting field map.
StorableContainer
to this container, by
delegating to StorableContainer.storeEntries(FieldContainer)
.
Extraction.addToken(TokenDetails, Probability, boolean)
with a probability
of -1 ("confirmed").
TokenCounter.isWhitespaceAfterLast()
to true
.
<All>
.
MistakeAnalyzer.analyzeMistakes(ExtractionContainer, String)
.
ExtractionContainer
of answer
keys from an annotated text (in XML format).
Util.asBoolean(Object)
on each element.
Util.asBoolean(char)
on each character.
Util.asChar(Object)
on each element.
Util.asDouble(Object)
on each element.
CollUtils.asDoubleArray(Object[])
on the result.
Util.asFloat(Object)
on each element.
CollUtils.asFloatArray(Object[])
on the result.
Util.asInt(Object)
on each element.
CollUtils.asIntArray(Object[])
on the result.
Util.asLong(Object)
on each element.
CollUtils.asLongArray(Object[])
on the result.
Util.asShort(Object)
on each element.
CollUtils.asShortArray(Object[])
on the result.
Util.asString(Object)
on each element.
Object.toString()
method for each non-null
object.
DOMUtils.name(Attribute)
.
CombinationStrategy
from an XML attribute
("class" by default).
EvaluatedExtractionContainer
s and
calculates the average length (in characters and tokens) for extractions
of all types (e.g. speaker, location etc.) and all
evaluation statuses
(e.g. correct,
missing etc.)
FieldContainer.add(FieldMap)
operation, any key/value pairs from this map are
added to field map prior to storing it.
Sensor
interface that stores a
configuration and provides a factory method to initialize a set of sensors.
Storable
interface that implements the BaseStorable.toString()
method based on the field
map created by Storable.storeFields()
.
BeginEndStrategy
,
similar to the ELIE/2 system by Finn and Kushmerick.
BeginEndStrategy
).
ExtractionContainer
of
answer keys from an annotated XML document.
featureList
.
input
text must contain a well-formed
XML element, otherwise this method will not work.
statistical summary
for each
object contained in at least one of the bags
added
to this instance.
Pattern
by
combining several alternatives in a capturing group.
XMLAdjuster.logEvent(String, String)
methods whenever
an event occurred to ensure the event is acceptable.
TargetClass
object for a given class name, if
defined in this target structure.
standard configuration
.
standard configuration
.
Processor
that collects
all the input arguments and processes the collected arguments when shutting
down.
Collection
s and arrays.
CombinationState.isDiscardPreceding()
to
false
.
Prediction
s.
VelocityService.TEMPLATE_DIR
and appending the VelocityService.TEMPLATE_EXT
.
TiesConfiguration.TiesConfiguration(String)
using "ties" as base name.
PropertiesConfiguration
format.
gzip
format.
control characters
(which are not allowed in
XML 1.0 and discouraged in XML 1.1).
true
, a fully incremental setup is
used where the trainer is trained on each document after the extractor
processed it.
Util.CONFIG_LOGGER_LOG
).
EvaluatedExtractionContainer.isMatchingPosition()
.
Tuner.CONFIG_TUNE
is used.
Trainer.isTrainingOnlyErrors()
).
Tuner.CONFIG_TUNE_EACH
is enabled.
[+|-]key[=value]
pairs in a string array.
logger
from the Util.CONFIG_LOGGER_LOG
and Util.CONFIG_LOGGER_SHOW
values in the provided configuration.
true
if this map contains a mapping for the
specified key.
TagVariety.TENTATIVE
.
TokenDetails
class by also storing
the context of a token.
TokenDetails
instance.
FinalReextractor
.
FinalReextractor
.
ClassTrain.KEY_CLASSIFICATION
field for correct predictions:
"+".
TrainableClassifier.createClassifier(Set, TiesConfiguration)
using the
standard configuration.
TrainableClassifier.createClassifier(Set, TiesConfiguration, String)
without specifying a suffix.
TrainableClassifier.createClassifier(Set, File, TiesConfiguration, String)
without specifying a run directory.
TrainableClassifier.createClassifier(Set, File, FeatureTransformer, String[],
TiesConfiguration)
.
TreeSet
.
DelimSepValues
format.
FMetrics
instance of the
required type.
FMetrics
instance of the
required type from a field map.
ObjectElement.createObject(Document)
,
reading from a file.
ObjectElement.createObject(Element)
,
using the root element of a given document.
Util.createObject(Class, String[])
, reading the
class name from the first element in the array.
Util.createObject(Class, Object[], Class)
, setting
the paramType
to the String
class.
FieldMap.createObject(Class)
for each of the field maps contained
in this container.
Reestimator.createReestimators()
using
the standard configuration.
Reestimator.CONFIG_REESTIMATORS
key in the provided configuration.
CombinationStrategy.createStrategy(Set, TiesConfiguration)
using the
standard configuration.
CombinationStrategy.createStrategy(Set, String, TiesConfiguration)
using
the CombinationStrategy.CONFIG_COMBINATION
key in the provided configuration.
combinationName
.
FeatureTransformer.createTransformer()
using the
standard configuration.
FeatureTransformer.CONFIG_TRANSFORMERS
key in the provided configuration.
Feature
class.
XMLStorable
interface.
XMLStorable
interface.
XMLStorable
interface.
PropertiesConfiguration
format).
DefaultRepresentation.getHeadElement()
.
TextProcessor.KEY_OUT_DIRECTORY
configuration key in a given
configuration.
DefaultRepresentation.calculatePositionalValues(String, ElementPosition, List)
to
collapse a position into one of five values.
Processor
that operates
on the contents of directories.
Processor
that operates
on XML documents.
Processor
that reads
XML documents.
NodeFilter
over to an
ElementProcessor
.
Processor
that writes
XML documents.
ClassTrain.classifyAndTrain(FieldContainer, File, String, String)
.
LineShuffleGenerator.shuffleLines(Reader, Writer, int)
method, using the configured
number of lines to ignore.
Externalize.externalize(FieldContainer, File, String, String)
, using
DSV format for input and output.
Split.split(Reader, File, String, String)
.
input
text with the output of the TreeTagger.
TrainableClassifier.trainOnError(FeatureVector, String, Set)
method instead of this one.
TrainableClassifier.trainOnError(FeatureVector, String, Set)
method instead of this one.
TrainableClassifier.trainOnError(FeatureVector, String, Set)
method
instead of this one.
TrainableClassifier.trainOnError(FeatureVector, String, Set)
method.
TrainableClassifier.trainOnError(FeatureVector, String, Set)
method.
TrainableClassifier.trainOnError(FeatureVector, String, Set)
method.
DSV format
(or
any other FieldContainer
s) into XML format.
DOMUtils.name(Element)
.
FieldMap
s in this container in the
order they were added.
Object.equals(java.lang.Object)
contract.
Object.equals(java.lang.Object)
contract.
Object.equals(java.lang.Object)
contract.
Object.equals(java.lang.Object)
contract.
Object.equals(java.lang.Object)
contract.
Object.equals(java.lang.Object)
contract.
embeddingElements
.
FieldContainer
.
Prediction
by also storing the
extracted text and location data.
Storable
contract.
EvalStatus.TRUTH
.
EvalStatus.TRUTH
.
EvalStatus.UNKNOWN
.
Extraction
s of different
classes.
ExtractionContainer.restoreEntries(FieldContainer)
.
ExtractionLocator.isRetrySilently()
to
false
.
Classifier
on a list of items/nodes and combines their results using a
CombinationStrategy
.
Extractor
and
Trainer
.
DefaultRepresentation
, node filter and combination strategy from
the provided configuration.
DefaultRepresentation
, node filter, combination strategy and
tokenizer factory from the provided configuration.
false
: '-'.
Storable
contract.
FeatureCount
class and the underlying raw
counts.
FeatureSet
(a multi-set of
features).
XMLStorable
interface.
FieldMap
s.
XMLStorable
interface.
standard
constructor
and then puts a first key/value pair into the map.
XMLStorable
interface.
XMLStorable
interface.
document filters
(if any) to modify it.
TokenProcessor
on the subset of tokens
that are children of an element accepted by a provided
ElementFilter
.
FieldTokenizingExtractor.END_OF_FIELDS_WS
matched): "" (the empty string).
Util.TRUE_CHAR
and Util.FALSE_CHAR
characters,
without using separator characters.
Storable
contract.
FMetrics
class and the underlying raw counts.
IOUtils.getExtension(File)
and preceding dot
).
IOUtils.getExtension(File)
and preceding dot
).
TargetClass
es at the top of the inheritance
hierarchy.
null
.
last
added
string in the original text (indexing starts
with 0).
last
added
string in the original text (counting starts
with 0, as the first occurrence is the "0th repetition").
Extraction.isFirstTokenRepIgnored()
is true
.
TokenContainer.add(String)
operation.
TokenContainer.add(String)
operation.
last
added token in the
original text (counting starts with 0, as the first occurrence is the
"0th repetition").
null
, no second level is used.
normalized
score
(activation value) of this prediction.
TextTokenizer.isNormalizedWhitespacePrepended()
is true
.
Double.NaN
if not known/not relevant.
Recognition
s from the current
document.
null
, the returned string should be used to separate
the training from the testing section of the corpus (e.g. "---") and
the train split and
test split values should be ignored.
CombinationStrategy
for this token.
TargetClass
es.
-1
, all remaining documents should be used.
null
if this instance
has never been transformed.
Tuner.isTuneEach()
is
true
.
Tuner.isTuneEach()
is enabled.
null
if we're outside of any instance (CombinationState.OUTSIDE
).
int
value.
TagVariety
of this tag.
LocalFeature
s into global features,
adding the created global features to a linked list.
decision
and the correct
decision via OR.
Feature.compactRepresentation()
instead
Object.hashCode()
contract.
Object.hashCode()
contract.
Object.hashCode()
contract.
Object.hashCode()
contract.
Object.hashCode()
contract.
true
if there is a next element.
TextTokenizer.nextToken()
is preceded by whitespace (i.e., text not matched by any token).
true
if there is a previous element.
ElementFilter.matches(Element)
or ElementFilter.prefers(Element)
on elements of
this document.
Recognition
s and passed
as argument to the Representation.buildContext(Element, String, String, String,
PriorRecognitions, Map, String)
method.
Recognition
s and passed
as argument to the Representation.buildContext(Element, String, String, String,
PriorRecognitions, Map, String)
method.
FinalReextractor
. This implementation returns a BeginEndReextractor
,
if configured.
FinalReextractor
.
stored
configuration
.
InsideOutsideStrategy.isBStartingAll()
to false
).
control characters
are deleted (these
characters are not allowed in XML 1.0 and discouraged in XML 1.1).
true
if this map contains no key-value mappings.
true
, the positions of extraction and answer keys must
match; otherwise only their contents must match (string compare).
TextTokenizer.getNormalizedWhitespace()
) to those tokens where TextTokenizer.hasPrecedingWhitespace()
would return true
.
true
the trainer only ensures that all answer keys exist
and can be located in the document instead of doing any training.
true
if training the embedded filter is enabled
(default).
last
added
string.
last
added
token.
last
added
string.
last
added
token.
Feature
s stored in this vector.
Class
of a Java object stored in an
element.
Class
of the stored object.
Class
of Java object stored
in an element.
Class
of the stored object.
Pattern
by
combining several alternatives.
ClassTrain.CORRECT_CLASS
if the correct class was predicted or the
wrongly predicted class in case of an error.
TextProcessor.KEY_DIRECTORY
is used instead.
AverageLength.metricsByLength()
method to serialize the
token lengths.
TokenContainer.add(String)
operation
contains the specified token.
TokenContainer.add(String)
operation.
FeatureVector.lastTransformation(FeatureVector)
method,
passing this instance as argument.
TextTokenizer.nextToken()
.
PropertiesConfiguration
or
XML
format.
TiesConfiguration.CONFIG_LANG
key (if this key doesn't exist, the language of the
default locale used by the Java Virtual Machine is used).
null
, but null
are not
allowed).
null
, but null
are not
allowed).
standard configuration
to specified
files (or standard out).
MetaClassifier
.
metrics
F-measure, precision and
recall, calculated separately for all extractions of the same type (as
usual) and token length.
EvaluatedExtractionContainer
(in DSV format) and analyses the types of prediction errors that occurred.
MistakeAnalyzer
.
key[=value]
pair.
MultiBinaryClassifier
.
XMLStorable
interface.
FMetrics
for different types.
FMetrics
and the sums and averages calculated over them.
MultiValueMap
allows storing multiple values for each key.
HashMap
as storage.
int
whose value can be changed.
TextUtils.NEWLINE_ALTERNATIVES
in a non-capturing group).
null
if there are no
more tokens left in the provided text.
OneAgainstTheRestClassifier
.
XMLStorable
interface.
gzip
format.
gzip
format).
gzip
format).
gzip
format).
ElementFilter
s
should match elements.
XMLStorable
interface.
null
).
cause
.
push
ed into this container.
push
ed into this container.
push
ed into this container.
push
ed into this container.
TextTokenizer.nextToken()
.
TextTokenizer.nextToken()
matches the defined whitespace pattern.
Storable
contract.
EvalStatus.UNKNOWN
.
Prediction
s based on their
probabilities.
Recognition
s that should
be considered in the context representation.
Double.NaN
(unknown).
ExtractionContainer
of
answer keys from an annotated XML document.
TextProcessor.doProcess(Reader, Writer, ContextMap)
method and invokes a post-processor, if configured.
TextProcessor.process(File, Writer, ContextMap)
method.
TextProcessor.process(Reader, Writer, ContextMap)
method.
process
method.
TextProcessor.process(URLConnection, Writer, ContextMap)
method.
TextProcessor.process(Reader, Writer, ContextMap)
method.
AttributeUnflatten.unflatten(Document)
method.
DSVtoXMLConverter.convert(FieldContainer)
, reading input in
DSV format.
cause
.
TokenProcessor
.
TokenProcessor
.
Configuration.getProperty(String)
is empty.
XMLAdjuster.isEscapingPseudoEntities()
is true
.
LocalFeature.OPEN
and
LocalFeature.CLOSE
character).
AnswerBuilder.process(Document, Writer, ContextMap)
method of an instance of
this class.
bytes
array is full or end-of-input is reached or an end-of-line character is
encountered.
text/uri-list
) into an array of strings.
text/uri-list
) into an array of strings.
FieldContainer
s.
EvaluatedExtractionContainer
.
ReEvaluator.reEvalulate(ExtractionContainer, EvaluatedExtractionContainer)
.
equals
-based comparisons used by
Collection.remove(Object)
.
GlobalFeature
s to remove extraneous
FeatureType.MARKER
features.
added
to this
container.
input
matched by the
given pattern matcher with the given
replacement.
input
that matches the given
Pattern
with the given replacement.
Representation
to convert elements into
feature vectors.
CombinationStrategy.state()
of this instance to the initial value
CombinationState.OUTSIDE
.
CombinationStrategy.reset()
method to query whether the last
extraction should be discarded, analogously to
CombinationState.isDiscardPreceding()
.
CombinationStrategy.reset()
method to query whether the last
extraction should be discarded, analogously to
CombinationState.isDiscardPreceding()
.
Storable
objects and support serialization
and deserialization of these objects in a human-readable format.
TextTokenizer.nextToken()
.
LengthEstimator
that uses
the rounded square root of the length (token count) instead of the raw
token count.
PropertiesConfiguration
format.
PropertiesConfiguration
format.
XMLStorable
interface.
null
, no second level is used.
TextTokenizer.isNormalizedWhitespacePrepended()
is true
.
TextTokenizer.getNormalizedWhitespace()
) to those tokens where TextTokenizer.hasPrecedingWhitespace()
would return true
.
int
value.
TagVariety
of this tag.
XMLStorable
interface.
TextUtils.shorten(String, int, int)
, using the same number
of characters at the start and the end of the shortened string.
TextUtils.shorten(String, int, int)
, showing up to
24 characters at the start and the end of the shortened
string.
TrainableClassifier.trainOnError(FeatureVector, String, Set)
to decide
whether to train an instance.
TrainableClassifier.trainOnError(FeatureVector, String, Set)
to decide
whether to train an instance.
TrainableClassifier.trainOnError(FeatureVector, String, Set)
to decide
whether to train an instance.
ignoreFirst
lines).
FilteringTokenWalker
whenever some
tokens are skipped.
FilteringTokenWalker
whenever some
tokens are skipped.
FilteringTokenWalker
whenever some tokens
are skipped.
MultiValueMap
that sorts the values
stored for each key, discarding duplicates.
TreeMap
.
UnsupportedOperationException
instead.
Pattern.split(java.lang.CharSequence)
and storing each member
of the returned array in a separate file.
Split.split(Reader, File, String, String, Pattern)
method, using the
configured default pattern.
FieldMap
.
Storable
objects and support serialization of
these objects in a human-readable format, by storing them in a
FieldContainer
.
FieldContainer.store(Writer)
.
FieldContainer.createFieldContainer(TiesConfiguration)
and
FieldContainer.storeInFile(File, String, String, Configuration)
.
Storable
items in this object to a field
container for serialization.
Storable
items in this object to a field
container for serialization.
CombinationStrategy
, using a list of regular
expressions and replacement texts (or the other way around).
Classifier.CONFIG_CLASSIFIER
configuration key to modify
the type of classifiers used by this instance.
FMetrics
extension that additionally
calculates a StatisticalSummary
of the
intermediate precision, recall, and F1 metrics resulting from different
update
operations.
Storable
contract.
Runnable
tasks.
Thread.NORM_PRIORITY
) for threads.
Processor
that operates
on text documents.
TieClassifier
.
XMLStorable
interface.
TiesConfiguration.addConfiguration(Configuration, Configuration)
.
TiesConfiguration.load(String)
.
ObjectElement.createObject(org.dom4j.Element,
Class)
on the created element.
Subclasses of TrainableClassifier
should extend this method and
the corresponding constructor from Element
to
ensure (de)serialization works as expected. Currently, this classifier does not support XML
serialization, throwing an UnsupportedOperationException
instead.
ObjectElement.createObject(org.dom4j.Element,
Class)
on the created element.
ObjectElement.createObject(org.dom4j.Element,
Class)
on the created element.
ObjectElement.createObject(org.dom4j.Element,
Class)
on the created element.
Subclasses of TrainableClassifier
should extend this method and
the corresponding constructor from Element
to
ensure (de)serialization works as expected. Currently, this classifier does not support XML
serialization, throwing an UnsupportedOperationException
instead.
ObjectElement.createObject(org.dom4j.Element,
Class)
on the created element.
Subclasses of TrainableClassifier
should extend this method and
the corresponding constructor from Element
to
ensure (de)serialization works as expected. Currently, this classifier does not support XML
serialization, throwing an UnsupportedOperationException
instead.
ObjectElement.createObject(org.dom4j.Element,
Class)
on the created element.
Subclasses of TrainableClassifier
should extend this method and
the corresponding constructor from Element
to
ensure (de)serialization works as expected.
ObjectElement.createObject(org.dom4j.Element,
Class)
on the created element.
Subclasses of TrainableClassifier
should extend this method and
the corresponding constructor from Element
to
ensure (de)serialization works as expected.
ObjectElement.createObject(org.dom4j.Element,
Class)
on the created element.
Subclasses of TrainableClassifier
should extend this method and
the corresponding constructor from Element
to
ensure (de)serialization works as expected.
TrainableClassifier
should extend this method and
the corresponding constructor from Element
to
ensure (de)serialization works as expected.
ObjectElement.createObject(org.dom4j.Element,
Class)
on the created element.
Subclasses of TrainableClassifier
should extend this method and
the corresponding constructor from Element
to
ensure (de)serialization works as expected.
FieldContainer.toElement(String)
, setting the name of the main element to "list".
FieldMap.toElement(boolean)
, setting the Java attribute.
DefaultRepresentation.calculateValuesFromText(String, String, List)
to determine the
"tokenType" value.
TextTokenizer
s of
different types.
TokenizerFactory.CONFIG_TOKEN_PATTERNS
and
TokenizerFactory.CONFIG_WHITESPACE_PATTERN
keys of the provided configuration.
TokenizerFactory.CONFIG_TOKEN_PATTERNS
and
TokenizerFactory.CONFIG_WHITESPACE_PATTERN
keys of the provided configuration,
adapted by
appending the suffix
.
TokenProcessor
.
Storable
object,
printing all field name/value pairs in the order used to insert them
into the FieldMap
.
FieldMap
.
TrainableClassifier.doTrain(FeatureVector, String, ContextMap)
method.
XMLStorable
interface.
trainable classifier
for training.
Classifier
to be used for extraction.
standard configuration
to configure the
remaining fields.
TrainableClassifier.trainOnError(FeatureVector, String,
java.util.Set)
method on the stored trainable classifier.
true
: '+'.
Class
of the stored object.
AverageLength.updateAverageLengths(ExtractionContainer)
.
Util
instances should NOT be constructed in standard
programming.
StatisticalSummary
for any number of items ("keys") that occur zero or more times in any
number of runs ("identifiers").
FMetrics
of the specified type.
FMetrics
of the specified type.
FMetrics
containing the sums and
averages over all types.
FMetrics
containing the sums and
averages over all types.
statistical
summaries of precision, recall, and F1 metrics
over all types,
if calculated.
statistical
summaries of precision, recall, and F1 metrics
over all types,
if calculated by the used implementation.
Trainer.resetGlobalAccuracy()
) by
each classifier.
flattened
string
representation.
Mistake.MistakeTypes
that occurred.
statistical
summaries of precision, recall, and F1 metrics
of the specified type,
if calculated.
statistical
summaries of precision, recall, and F1 metrics
of the specified type,
if calculated by the used implementation.
XMLStorable
interface.
Winnow classifier
.
Winnow
algorithm.
Storable
contract.
EvalStatus.UNKNOWN
.
Winnow
.
<body ...
- writeDocument(Document, File, TiesConfiguration, String) -
Static method in class de.fu_berlin.ties.xml.dom.DOMUtils
- Writes an XML document to a file, consulting a given configuration about
whether to use compression.
- writeDocument(Document, OutputStream) -
Static method in class de.fu_berlin.ties.xml.dom.DOMUtils
- Writes an XML document to a given stream.
- writeDocument(Document, OutputStreamWriter) -
Static method in class de.fu_berlin.ties.xml.dom.DOMUtils
- Writes an XML document to a given writer, using the character set of the
underlying output stream.
- writeDocument(Document, Writer, String) -
Static method in class de.fu_berlin.ties.xml.dom.DOMUtils
- Writes an XML document to a given writer, using the given character set.
- writeHTMLHead(Writer) -
Method in class de.fu_berlin.ties.demo.FilterResult
- Writes HTML code that must be inserted into the contents of the
<head>
element of an HTML file containing the output
of the FilterResult.writeVizualization(Writer)
method.
- writeLine(String, Writer) -
Static method in class de.fu_berlin.ties.io.IOUtils
- Writes a line of text to a writer, followed by a
line separator.
- writeln(Writer, String) -
Static method in class de.fu_berlin.ties.text.TextUtils
- Convenience method that writes a text to a writer and appends a
line separator.
- writeTestHTML(Writer) -
Method in class de.fu_berlin.ties.demo.FilterResult
- Writes a simple but complete HTML file that combines the output
of the
FilterResult.writeHTMLHead(Writer)
,
FilterResult.writeBodyAttribute(Writer)
and
FilterResult.writeVizualization(Writer)
methods.
- writeToWriter(CharSequence, Writer) -
Static method in class de.fu_berlin.ties.io.IOUtils
- Writes the contents of a character sequence to a writer.
- writeVizualization(Writer) -
Method in class de.fu_berlin.ties.demo.FilterResult
- Writes an HTML fragment that contains a visualization of the classified
mail (showing which features have been most important for classification
etc.).
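
The writeLine(String, Writer) and writeToWriter(CharSequence, Writer) entries above describe small writer helpers. The following is a minimal sketch of what such helpers might look like, using only java.io classes. The class name LineWriterSketch is hypothetical and the method bodies are assumptions based on the one-line descriptions in this index; they are not the actual IOUtils code.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

/** Hypothetical stand-in for the helpers described above; not the TIES implementation. */
public class LineWriterSketch {

    /** Writes a line of text to a writer, followed by the platform line separator. */
    public static void writeLine(String text, Writer writer) throws IOException {
        writer.write(text);
        writer.write(System.lineSeparator());
    }

    /** Writes the contents of a character sequence to a writer. */
    public static void writeToWriter(CharSequence chars, Writer writer) throws IOException {
        writer.append(chars);
    }

    public static void main(String[] args) throws IOException {
        // Collect the output in memory so the result is easy to inspect.
        StringWriter out = new StringWriter();
        writeLine("first line", out);
        writeToWriter(new StringBuilder("second line, no separator"), out);
        System.out.print(out);
    }
}
```

Neither helper flushes or closes the writer in this sketch, so the caller stays in charge of the stream's lifetime. Whether the real methods behave that way is not stated in the index.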
TextTokenizer
s
for XML-like input.