Package de.fu_berlin.ties.text

This package contains utility classes for working with texts.


Class Summary
FieldTokenizingExtractor A tokenizing extractor that prepends field names to each token.
SimplifyQuotes Simplifies the different kinds of quotation marks that can occur in text files, replacing each of them with a plain " character.
TextTokenizer Splits a text into a sequence of tokens.
TextUtils A static class that provides utility constants and methods for working with texts and regular expressions.
TokenContainer A container that keeps track of the tokens in a document.
TokenCounter A simple container that keeps track of the tokens in a document.
TokenDetails Stores details on a token in a document.
TokenizerFactory Factory for creating TextTokenizers of different types.
TokenizingExtractor Uses a tokenizer to convert a text into a feature vector.
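
The summaries above name three small text operations: simplifying quotes, splitting a text into tokens, and prepending a field name to each token. The following is a minimal, self-contained sketch of those ideas only. It does not use or reproduce the classes in this package: the class name TextSketch, its method names, the set of quote characters, the token pattern, and the ":" separator are all hypothetical choices made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only; NOT part of de.fu_berlin.ties.text. */
final class TextSketch {

    /** Assumed token pattern: a run of letters, a run of digits, or one other non-space character. */
    private static final Pattern TOKEN =
        Pattern.compile("\\p{L}+|\\p{N}+|[^\\s\\p{L}\\p{N}]");

    private TextSketch() { }

    /**
     * Replaces a (hypothetical, incomplete) set of typographic quote
     * characters with the plain " character.
     */
    static String simplifyQuotes(final String text) {
        return text.replaceAll(
            "[\u201C\u201D\u201E\u00AB\u00BB\u2018\u2019`]", "\"");
    }

    /** Splits a text into tokens according to the assumed TOKEN pattern. */
    static List<String> tokenize(final String text) {
        final List<String> tokens = new ArrayList<String>();
        final Matcher matcher = TOKEN.matcher(text);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    /** Prepends "field:" to every token; the separator is an assumption. */
    static List<String> prefixTokens(final String field,
                                     final List<String> tokens) {
        final List<String> result = new ArrayList<String>(tokens.size());
        for (final String token : tokens) {
            result.add(field + ":" + token);
        }
        return result;
    }
}
```

With these definitions, TextSketch.tokenize("Hello, world 42!") yields the five tokens Hello , world 42 and !, and TextSketch.prefixTokens("title", TextSketch.tokenize("a b")) yields title:a and title:b. The actual classes may tokenize and label tokens differently; consult their own documentation for the real behavior.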
 




Copyright © 2003-2007 Christian Siefkes. All Rights Reserved.