java.lang.Object
  de.fu_berlin.ties.text.TokenizingExtractor
public class TokenizingExtractor
Uses a tokenizer to convert a text into a feature vector. Each token is stored as a feature, preserving the original order of tokens in a text.
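The idea of turning a token stream into an order-preserving feature sequence can be illustrated with a minimal, self-contained sketch. It does not use the TIES classes: `TokenFeatureSketch`, its token pattern, and the use of a plain `List<String>` in place of `FeatureVector` are all assumptions made for illustration, and the actual tokenizer rules of this class may differ.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for illustration; not part of the TIES library. */
final class TokenFeatureSketch {
    // Assumed token pattern: a run of word characters, or a single
    // non-whitespace, non-word character (e.g. punctuation).
    private static final Pattern TOKEN = Pattern.compile("\\w+|[^\\w\\s]");

    /** Reads all text from the reader and returns one feature per token, in input order. */
    static List<String> buildFeatures(Reader reader) throws IOException {
        // Collect the complete input first so the pattern can run over it.
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[4096];
        int read;
        while ((read = reader.read(buffer)) != -1) {
            text.append(buffer, 0, read);
        }

        // Each match becomes one feature; the list keeps the original token order.
        List<String> features = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(text);
        while (matcher.find()) {
            features.add(matcher.group());
        }
        return features;
    }

    public static void main(String[] args) throws IOException {
        List<String> features = buildFeatures(new StringReader("Each token is a feature."));
        System.out.println(features); // prints [Each, token, is, a, feature, .]
    }
}
```

Repeated tokens are kept as separate entries, which is what distinguishes an order-preserving representation from a bag of words.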
Instances of this class are not thread-safe and must be synchronized externally, if required.
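One common way to provide external synchronization is to guard every call on the shared instance with a single lock. The sketch below shows that pattern with hypothetical stand-in classes (`CountingExtractorSketch` and `SharedExtractorSketch` are not part of TIES); with the real class, the guarded calls would be to methods such as `buildFeatures`.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** Hypothetical stand-in for an extractor with unsynchronized mutable state. */
final class CountingExtractorSketch {
    private int calls; // plain field: concurrent updates could be lost without a lock

    /** Consumes the reader and returns the number of characters read. */
    int countChars(Reader reader) throws IOException {
        calls++;
        int count = 0;
        while (reader.read() != -1) {
            count++;
        }
        return count;
    }

    int calls() {
        return calls;
    }
}

/** Wrapper that serializes all access to one shared extractor instance. */
final class SharedExtractorSketch {
    private final CountingExtractorSketch extractor = new CountingExtractorSketch();
    private final Object lock = new Object();

    int countChars(String text) throws IOException {
        synchronized (lock) { // only one thread at a time may use the extractor
            return extractor.countChars(new StringReader(text));
        }
    }

    int calls() {
        synchronized (lock) { // reads take the same lock to see the latest value
            return extractor.calls();
        }
    }
}
```

An alternative that avoids locking altogether is to give each thread its own extractor instance, at the cost of constructing one per thread.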
Constructor Summary | |
---|---|
TokenizingExtractor(TiesConfiguration conf, String suffix) | Creates a new instance. |
Method Summary | |
---|---|
FeatureVector | buildFeatures(Reader reader) - Extracts a vector of relevant features from a text sequence. |
TextTokenizer | getTokenizer() - Returns the tokenizer used by this instance. |
String | toString() - Returns a string representation of this object. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
public TokenizingExtractor(TiesConfiguration conf, String suffix)
Parameters:
conf - used to configure this instance
suffix - optional suffix for adapting configuration keys if not null
Method Detail |
---|
public FeatureVector buildFeatures(Reader reader) throws IOException
Specified by:
buildFeatures in interface FeatureExtractor
Parameters:
reader - a reader containing the text to represent
Throws:
IOException - if an I/O error occurs while reading the input

public TextTokenizer getTokenizer()

public String toString()
Overrides:
toString in class Object