java.lang.Object
  de.fu_berlin.ties.filter.PredictionRewriter2
public class PredictionRewriter2
A variant of the prediction rewriter that uses predictions from another process (e.g. named entities) to provide additional semantic information. This variant does not modify the element structure of the document, but stores the predictions as XML attributes.
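The idea of recording a prediction as an attribute rather than as a new wrapper element can be sketched with the JDK's own DOM API. This is only an illustration under stated assumptions: the attribute name `pred`, the sample element `speaker`, and the helper names are hypothetical, the XML library is a stand-in rather than the one this class is built on, and how the class actually encodes per-token predictions in the attribute value is not documented here.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class AttributeTaggingSketch {
    /** Hypothetical attribute name; the real one is the value of ATTRIB_PRED. */
    static final String PRED_ATTRIBUTE = "pred";

    /** Records a prediction on an existing element without adding or removing child nodes. */
    static void tag(Element element, String prediction) {
        element.setAttribute(PRED_ATTRIBUTE, prediction);
    }

    /** Builds a one-element sample document: <speaker>Ada Lovelace</speaker>. */
    static Document buildSample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element speaker = doc.createElement("speaker");
            speaker.setTextContent("Ada Lovelace");
            doc.appendChild(speaker);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no XML parser available", e);
        }
    }

    public static void main(String[] args) {
        Element speaker = buildSample().getDocumentElement();
        int childrenBefore = speaker.getChildNodes().getLength();
        tag(speaker, "person");
        // The prediction is readable from the attribute, and the child node count is unchanged.
        System.out.println(speaker.getAttribute(PRED_ATTRIBUTE));
        System.out.println(speaker.getChildNodes().getLength() == childrenBefore);
    }
}
```

Because only an attribute is added, downstream code that walks the element tree sees the same structure as before the rewrite.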
You should generally use this class instead of PredictionRewriter, since it usually gives better results.
Instances of this class are not thread-safe and must not be used to
process multiple documents in parallel.
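One common way to respect a restriction like this while still processing documents concurrently is to give each worker thread its own instance. The sketch below assumes a hypothetical stand-in class, `StatefulRewriter`, instead of the real rewriter (which needs a TiesConfiguration to construct); only the per-thread pattern is the point.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerThreadRewriterSketch {
    /** Hypothetical stand-in for a rewriter that keeps mutable per-document state. */
    static final class StatefulRewriter {
        private final StringBuilder buffer = new StringBuilder();

        String rewrite(String document) {
            buffer.setLength(0); // reset state left over from the previous document
            buffer.append(document).append(" [rewritten]");
            return buffer.toString();
        }
    }

    /** Each thread lazily creates its own rewriter, so no instance is shared. */
    private static final ThreadLocal<StatefulRewriter> REWRITER =
            ThreadLocal.withInitial(StatefulRewriter::new);

    /** Rewrites all documents on a fixed pool; results keep the input order. */
    static List<String> rewriteAll(List<String> documents, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String doc : documents) {
                Callable<String> task = () -> REWRITER.get().rewrite(doc);
                pending.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : pending) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while rewriting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("rewriting failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Creating a fresh instance per document would work as well; a per-thread instance simply avoids repeating any expensive initialization.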
Field Summary | |
---|---|
static String | ATTRIB_PRED: Name of the attribute to add. |
static String | CONFIG_PRED_NONE: Configuration key: "None" marker to use for tokens that do not belong to any prediction -- if empty or missing, these tokens are not tagged. |
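The "none" marker rule described for CONFIG_PRED_NONE can be read as a small decision function. The following is a minimal sketch of that reading, not the class's actual code: the method name `labelFor` and the use of `Optional` are assumptions made for illustration.

```java
import java.util.Optional;

public class NoneMarkerSketch {
    /**
     * Chooses the label to store for a token.
     *
     * @param prediction the predicted class, or null if the token belongs to no prediction
     * @param noneMarker the configured "none" marker; may be null or empty
     * @return the label to store, or an empty Optional if the token should stay untagged
     */
    static Optional<String> labelFor(String prediction, String noneMarker) {
        if (prediction != null) {
            return Optional.of(prediction); // token is part of a prediction
        }
        if (noneMarker == null || noneMarker.isEmpty()) {
            return Optional.empty(); // no marker configured: leave the token untagged
        }
        return Optional.of(noneMarker); // tag unpredicted tokens with the marker
    }

    public static void main(String[] args) {
        System.out.println(labelFor("person", "O"));
        System.out.println(labelFor(null, "O"));
        System.out.println(labelFor(null, ""));
    }
}
```

Under this reading, leaving the marker unset keeps the output smaller, while setting it makes every token carry an explicit label.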
Constructor Summary | |
---|---|
PredictionRewriter2(String fileExtension, String[] predictionClasses, String myNoneMarker, TokenizerFactory factory, TiesConfiguration conf) | Creates a new instance. |
PredictionRewriter2(TiesConfiguration conf) | Creates a new instance. |
Method Summary | |
---|---|
void | processToken(Element element, String left, TokenDetails details, String right, ContextMap context): Processes a token in an XML element, optionally modifying the element or the document it is part of. |
Document | rewrite(Document document, File filename): Rewrites a document. |
String | toString(): Returns a string representation of this object. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
public static final String ATTRIB_PRED
public static final String CONFIG_PRED_NONE
Constructor Detail |
---|
public PredictionRewriter2(TiesConfiguration conf) throws ProcessingException
Parameters:
conf - used to configure this instance; must not be null
Throws:
ProcessingException - if an error occurs while initializing the combination strategies

public PredictionRewriter2(String fileExtension, String[] predictionClasses, String myNoneMarker, TokenizerFactory factory, TiesConfiguration conf) throws ProcessingException
Parameters:
fileExtension - extension of the files containing predictions
predictionClasses - names of the prediction classes to use -- if empty array, all are used
myNoneMarker - "none" marker to use for tokens that do not belong to any prediction -- if empty or null, these tokens are not tagged
factory - used to instantiate tokenizers
conf - used to configure this instance; must not be null
Throws:
ProcessingException - if an error occurs while initializing the combination strategies

Method Detail |
---|
public void processToken(Element element, String left, TokenDetails details, String right, ContextMap context) throws IOException
Specified by:
processToken in interface TokenProcessor
Parameters:
element - the element containing the token
left - the textual contents of the element to the left of the token (in case of mixed contents, only up to the last preceding child element, if any)
details - details about the token to process
right - the textual contents of the element to the right of the token (in case of mixed contents, only up to the next following child element, if any)
context - a map of objects that are made available for processing
Throws:
IOException - if an I/O error occurs

public Document rewrite(Document document, File filename) throws IOException, ProcessingException
Specified by:
rewrite in interface DocumentRewriter
Parameters:
document - the document to modify
filename - the name of the document
Returns:
the document passed in
Throws:
IOException - if an I/O error occurs
ProcessingException - if an error occurs during rewriting

public String toString()
Overrides:
toString in class Object