de.fu_berlin.ties.io
Class Externalize

java.lang.Object
  extended by de.fu_berlin.ties.ConfigurableProcessor
      extended by de.fu_berlin.ties.TextProcessor
          extended by de.fu_berlin.ties.io.Externalize
All Implemented Interfaces:
Processor

public class Externalize
extends TextProcessor

Externalizes the contents of a file in DSV format (or any other FieldContainer). For each entry, the contents of one specified field (whose name is read from the CONFIG_KEY configuration parameter) are stored in an external file. In the output DSV file, that field then holds the base name of the external file (without file extension) instead of the original content.

Base name and extension of the external files are determined from the input file. For example, if the input file is named file.data and contains 87 entries, 87 externalized files named file01.data, file02.data, ..., file87.data will be created (the number of leading zeros is determined as required to ensure that all file names have the same length). Entries are skipped (but still counted for numbering purposes) if the value of the specified field is empty or missing.
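The naming rule described above can be illustrated with a minimal sketch. It is not taken from the TIES sources: the class ExternalNameSketch and the method externalName are hypothetical names, and the sketch assumes that the extension is everything from the last dot of the input file name onwards.

```java
final class ExternalNameSketch {
    private ExternalNameSketch() {
    }

    /**
     * Builds the name of the index-th external file (1-based) for an input
     * file with the given local name and total number of entries.
     * Hypothetical helper, not part of the TIES API.
     */
    static String externalName(String localName, int index, int total) {
        int dot = localName.lastIndexOf('.');
        // Assumption: the extension starts at the last dot, if there is one.
        String base = (dot < 0) ? localName : localName.substring(0, dot);
        String ext = (dot < 0) ? "" : localName.substring(dot);
        // Pad with leading zeros so that all names have the same length.
        int width = String.valueOf(total).length();
        return base + String.format("%0" + width + "d", index) + ext;
    }

    public static void main(String[] args) {
        System.out.println(externalName("file.data", 1, 87));  // file01.data
        System.out.println(externalName("file.data", 87, 87)); // file87.data
    }
}
```

Because skipped entries are still counted, the index passed to such a helper would be the position of the entry in the input, not the number of files written so far.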

Instances of this class are thread-safe.

Version:
$Revision: 1.7 $, $Date: 2006/10/21 16:04:22 $, $Author: siefkes $
Author:
Christian Siefkes

Field Summary
static String CONFIG_KEY
          Configuration key: The name of the field whose contents to externalize: "externalize.key".
 
Fields inherited from class de.fu_berlin.ties.TextProcessor
CONFIG_POST, KEY_DIRECTORY, KEY_LOCAL_NAME, KEY_OUT_DIRECTORY, KEY_URL
 
Constructor Summary
Externalize(String outExt)
          Creates a new instance, using the standard configuration.
Externalize(String outExt, TiesConfiguration conf)
          Creates a new instance.
 
Method Summary
protected  void doProcess(Reader reader, Writer writer, ContextMap context)
          Processes the contents of a reader, writing a modified version to a writer. This implementation delegates to externalize(FieldContainer, File, String, String), using DSV format for input and output.
 void externalize(FieldContainer container, File directory, String localName, String charset)
          Externalizes the contents of a field container.
 void externalize(FieldContainer container, File directory, String localName, String charset, String key)
          Externalizes the contents of a field container.
 
Methods inherited from class de.fu_berlin.ties.TextProcessor
getOutFileExt, process, process, process, process, process, process, toString
 
Methods inherited from class de.fu_berlin.ties.ConfigurableProcessor
getConfig
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

CONFIG_KEY

public static final String CONFIG_KEY
Configuration key: The name of the field whose contents to externalize: "externalize.key".

See Also:
Constant Field Values

Constructor Detail

Externalize

public Externalize(String outExt)
Creates a new instance, using the standard configuration.

Parameters:
outExt - the extension to use for output files

Externalize

public Externalize(String outExt,
                   TiesConfiguration conf)
Creates a new instance.

Parameters:
outExt - the extension to use for output files
conf - used to configure this instance; if null, the standard configuration is used

Method Detail

externalize

public void externalize(FieldContainer container,
                        File directory,
                        String localName,
                        String charset)
                 throws IOException
Externalizes the contents of a field container. This method delegates to externalize(FieldContainer, File, String, String, String), determining the name of the field to externalize from the CONFIG_KEY configuration parameter.

Parameters:
container - the container to externalize; will be modified by replacing the values stored in the key field with the base names (without extension) of the newly created external files containing them
directory - the directory to use for storing the externalized files; if null, the working directory is used
localName - the name of the input file, used to determine the names of externalized files
charset - the character set to use for the external files; if null, the default charset of the current platform is used
Throws:
IOException - if an I/O error occurs while writing the external files

externalize

public void externalize(FieldContainer container,
                        File directory,
                        String localName,
                        String charset,
                        String key)
                 throws IOException
Externalizes the contents of a field container.

Parameters:
container - the container to externalize; will be modified by replacing the values stored in the key field with the base names (without extension) of the newly created external files containing them
directory - the directory to use for storing the externalized files; if null, the working directory is used
localName - the name of the input file, used to determine the names of externalized files
charset - the character set to use for the external files; if null, the default charset of the current platform is used
key - the name of the field to externalize
Throws:
IOException - if an I/O error occurs while writing the external files
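The documented behavior of this method can be approximated by the following self-contained sketch. It is an illustration under stated assumptions, not the actual implementation: the container is modelled as a mutable List of Maps instead of a FieldContainer, java.nio.file.Path stands in for File, and the class name ExternalizeSketch and the returned count are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;

final class ExternalizeSketch {
    private ExternalizeSketch() {
    }

    /**
     * Writes the value of the key field of each entry to an external file and
     * replaces it with the base name of that file. Returns the number of files
     * written (the return value is an addition of this sketch).
     */
    static int externalize(List<Map<String, String>> entries, Path directory,
            String localName, Charset charset, String key) throws IOException {
        // A null directory means the working directory; a null charset means
        // the platform default, as documented for the parameters above.
        Path dir = (directory != null) ? directory : Paths.get("");
        Charset cs = (charset != null) ? charset : Charset.defaultCharset();

        int dot = localName.lastIndexOf('.');
        String base = (dot < 0) ? localName : localName.substring(0, dot);
        String ext = (dot < 0) ? "" : localName.substring(dot);
        int width = String.valueOf(entries.size()).length();

        int written = 0;
        for (int i = 0; i < entries.size(); i++) {
            Map<String, String> entry = entries.get(i);
            String value = entry.get(key);
            if (value == null || value.isEmpty()) {
                continue; // skipped, but still counted for numbering
            }
            String baseName = base + String.format("%0" + width + "d", i + 1);
            Files.write(dir.resolve(baseName + ext), value.getBytes(cs));
            entry.put(key, baseName); // store the base name without extension
            written++;
        }
        return written;
    }
}
```

With three entries read from file.data, of which the second has an empty key field, this sketch writes file1.data and file3.data and leaves the second entry unchanged.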

doProcess

protected void doProcess(Reader reader,
                         Writer writer,
                         ContextMap context)
                  throws IOException,
                         ProcessingException
Processes the contents of a reader, writing a modified version to a writer. This implementation delegates to externalize(FieldContainer, File, String, String), using DSV format for input and output.

Specified by:
doProcess in class TextProcessor
Parameters:
reader - reader containing the text to process; should not be closed by this method
writer - the writer to write the processed text to; might be flushed but not closed by this method; if this method does not use the writer, the underlying file will be deleted afterwards
context - a map of objects that are made available for processing; when called from the implemented process methods in this class, it will contain mappings from IOUtils.KEY_LOCAL_CHARSET to the character set of the output writer; from TextProcessor.KEY_OUT_DIRECTORY to the output directory (File); from ContentType.KEY_MIME_TYPE to the document's MIME type; from TextProcessor.KEY_LOCAL_NAME to the local name (String); and either from TextProcessor.KEY_DIRECTORY to the input directory (File, in case of a local file) or from TextProcessor.KEY_URL to the URL (otherwise) of the processed document
Throws:
IOException - if an I/O error occurs
ProcessingException - if an error occurs during processing


Copyright © 2003-2007 Christian Siefkes. All Rights Reserved.