java.lang.Object
  de.fu_berlin.ties.text.TokenContainer
public class TokenContainer
A container that keeps track of the tokens in a document. Instances of this class are not thread-safe; if you want to share a single instance between different threads, you have to ensure proper synchronization.
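One way to provide the synchronization mentioned above is to route every access to the shared instance through a single lock. The following is a minimal sketch, not code from this library: `SharedContainerSketch` is a hypothetical stand-in that only mimics counting behavior, and its methods are illustrative assumptions.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedContainerSketch {
    // Hypothetical stand-in for a non-thread-safe container; NOT the TIES class.
    private final Map<String, Integer> counts = new HashMap<>();
    private int size = 0;

    // Every method is synchronized, so the instance's own monitor guards
    // both fields and no two threads can mutate them at the same time.
    public synchronized void add(String token) {
        counts.merge(token, 1, Integer::sum);
        size++;
    }

    public synchronized int getCount(String token) {
        return counts.getOrDefault(token, 0);
    }

    public synchronized int size() {
        return size;
    }

    // Two threads each add the same token 1000 times to one shared instance
    // and the method returns the resulting count for that token.
    public static int demo() {
        SharedContainerSketch shared = new SharedContainerSketch();
        Runnable work = () -> {
            for (int i = 0; i < 1000; i++) {
                shared.add("token");
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return shared.getCount("token");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With an actual TokenContainer, the same effect could be achieved by wrapping each call, or each group of calls that must be seen together (such as add(String) followed by getLast()), in a `synchronized` block on one agreed lock object.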
Constructor Summary | |
---|---|
TokenContainer(TokenizerFactory tFactory): Creates a new instance. | |
Method Summary | |
---|---|
void | add(String text): Adds text to this container. |
int | getCount(String token): Returns the cardinality of the given token in this container. |
int | getFirstTokenInLastIndex(): Returns the index of the first token of the last added string in the original text (indexing starts with 0). |
int | getFirstTokenInLastRep(): Returns the repetition of the first token of the last added string in the original text (counting starts with 0, as the first occurrence is the "0th repetition"). |
String | getLast(): Returns a trimmed and whitespace-normalized representation of the string added to this container by the last add(String) operation. |
int | getLastCount(String token): Returns the cardinality of the given token in the text added by the last add(String) operation. |
boolean | isWhitespaceAfterLast(): Whether there is whitespace after the last added string. |
boolean | isWhitespaceBeforeLast(): Whether there is whitespace before the last added string. |
boolean | lastContains(String token): Whether the text added by the last add(String) operation contains the specified token. |
Iterator | lastIterator(): Returns an iterator over the word and number tokens added by the last add(String) operation. |
int | size(): Returns the number of tokens counted by this instance (including duplicates). |
String | toString(): Returns a string representation of this object. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
public TokenContainer(TokenizerFactory tFactory)
Creates a new instance.
Parameters: tFactory - used to instantiate the employed tokenizer

Method Detail |
---|
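public void add(String text)
Adds text to this container.
Parameters: text - the text to add

public int getCount(String token)
Returns the cardinality of the given token in this container.
Parameters: token - the token to check
Returns: the cardinality, >= 0
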
public int getFirstTokenInLastIndex()
Returns the index of the first token of the last added string in the original text (indexing starts with 0).

public int getFirstTokenInLastRep()
Returns the repetition of the first token of the last added string in the original text (counting starts with 0, as the first occurrence is the "0th repetition").

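The index and repetition values described above can be illustrated with a small self-contained sketch. It is not the library's implementation: `RepetitionSketch` is a hypothetical class, it splits on whitespace instead of using a TokenizerFactory, and reporting -1 for text without tokens is an assumption made only for this example.

```java
import java.util.HashMap;
import java.util.Map;

public class RepetitionSketch {
    private final Map<String, Integer> counts = new HashMap<>();
    private int size = 0;
    private int firstTokenInLastIndex = -1;
    private int firstTokenInLastRep = -1;

    // Hypothetical add: tokenizes on whitespace (assumption, not the real tokenizer).
    public void add(String text) {
        // Assumption: text without tokens yields -1 for both values.
        firstTokenInLastIndex = -1;
        firstTokenInLastRep = -1;
        boolean first = true;
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // splitting an empty string yields one empty element
            }
            if (first) {
                // Index: number of tokens seen before this string (0-based).
                firstTokenInLastIndex = size;
                // Repetition: how often this token occurred before (0 = first occurrence).
                firstTokenInLastRep = counts.getOrDefault(token, 0);
                first = false;
            }
            counts.merge(token, 1, Integer::sum);
            size++;
        }
    }

    public int getFirstTokenInLastIndex() {
        return firstTokenInLastIndex;
    }

    public int getFirstTokenInLastRep() {
        return firstTokenInLastRep;
    }

    public static void main(String[] args) {
        RepetitionSketch c = new RepetitionSketch();
        c.add("the cat");
        c.add("sat on the mat");
        c.add("the end");
        // "the" is the 7th token overall (index 6) and has occurred twice before.
        System.out.println(c.getFirstTokenInLastIndex() + " " + c.getFirstTokenInLastRep());
    }
}
```
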
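public int getLastCount(String token)
Returns the cardinality of the given token in the text added by the last add(String) operation.
Parameters: token - the token to check
Returns: the cardinality, >= 0
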
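public String getLast()
Returns a trimmed and whitespace-normalized representation of the string added to this container by the last add(String) operation. Leading and trailing whitespace is removed; each internal whitespace sequence is converted into a single space character.
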
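The normalization described for getLast() can be approximated in plain Java as follows. This is only a sketch of the described behavior, assuming that a run of internal whitespace collapses to one space; `WhitespaceNormalizeSketch` and `normalize` are hypothetical names, and the real class may treat some characters differently.

```java
public class WhitespaceNormalizeSketch {
    // Collapses every run of whitespace to one space, then removes the
    // (at most one) space left at each end of the string.
    public static String normalize(String text) {
        return text.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        System.out.println("[" + normalize("  Hello \t\n world  ") + "]");
    }
}
```
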
public boolean isWhitespaceAfterLast()
Whether there is whitespace after the last added string.
Returns: true iff there is whitespace after/at the end of the string

public boolean isWhitespaceBeforeLast()
Whether there is whitespace before the last added string.
Returns: true iff there is whitespace before/at the start of the string

public boolean lastContains(String token)
Whether the text added by the last add(String) operation contains the specified token.
Parameters: token - the token to check
Returns: true iff the specified argument is contained as a word or number token in the last added string

public Iterator lastIterator()
Returns an iterator over the word and number tokens added by the last add(String) operation. The iterator contains each token only once (no matter how often it occurred in the last string); the tokens are iterated in no particular order.

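The "each token only once, in no particular order" contract is what iterating over the key set of a hash-based count map would give. The sketch below illustrates that idea under stated assumptions: `LastIteratorSketch` and `countTokens` are hypothetical, tokens are obtained by splitting on whitespace, and nothing here is taken from the library's source.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class LastIteratorSketch {
    // Counts the tokens of one added string (hypothetical whitespace tokenizer).
    public static Map<String, Integer> countTokens(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> last = countTokens("to be or not to be");
        // Each distinct token appears once; HashMap gives no ordering guarantee.
        Iterator<String> it = last.keySet().iterator();
        while (it.hasNext()) {
            String token = it.next();
            System.out.println(token + " occurs " + last.get(token) + " time(s)");
        }
    }
}
```

A caller that needs a stable order would have to sort the tokens itself, since the iteration order should not be relied upon.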
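public int size()
Returns the number of tokens counted by this instance (including duplicates).

public String toString()
Returns a string representation of this object.
Overrides: toString in class Object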