java.lang.Object
  de.fu_berlin.ties.text.TextUtils
public final class TextUtils
A static class that provides utility constants and methods for working with texts and regular expressions. No instances of this class can be created; only the static members should be used.
Field Summary | |
---|---|
static String |
LINE_SEPARATOR
The line separator on the current operating system ("\n" on Unix). |
static String |
NEWLINE_ALTERNATIVES
Regex fragment listing the newline alternatives used by different systems: "\r\n" (Windows), "\n" (Unix) or "\r" (Mac). |
static Pattern |
NEWLINE_PATTERN
A regular expression matching a single newline (built by enclosing NEWLINE_ALTERNATIVES in a non-capturing group). |
static Pattern |
NEWLINES_PATTERN
A regular expression matching newlines, including surrounding whitespace. |
static Pattern |
PUNCTUATION_PATTERN
A simple regular expression for strings that contain only punctuation characters. |
static Pattern |
PUNCTUATION_SYMBOL_PATTERN
A simple regular expression for strings that contain only punctuation and symbol characters. |
static Pattern |
SINGLE_LINE_WS
A regular expression matching a non-line-breaking whitespace character (character class containing space and tab). |
static Pattern |
WHITESPACE_PATTERN
A simple regular expression for whitespace. |
Method Summary | |
---|---|
static int |
countFirst(String str,
char ch)
Counts how often a character is repeated at the beginning of a string. |
static int |
countLast(String str,
char ch)
Counts how often a character is repeated at the end of a string. |
static void |
ensurePrintableName(String string)
Checks that a string is a printable name, meaning it has at least one character and does not contain any whitespace. |
static String |
joinAlternatives(String[] alternatives)
Helper method for building a regular expression Pattern by
combining several alternatives. |
static String |
multipleReplaceAll(CharSequence input,
Map replacements)
Performs multiple replace-all operations on a text. |
static String |
normalize(String input)
Normalizes the whitespace in a string, replacing all internal whitespace sequences with a single space character and trimming any leading and trailing whitespace. |
static boolean |
punctuation(CharSequence text)
Checks whether a string contains only punctuation characters. |
static boolean |
punctuationOrSymbol(CharSequence text)
Checks whether a string contains only punctuation and symbol characters. |
static String |
replaceAll(String input,
Matcher matcher,
String replacement)
Replaces each substring of the input matched by the
given pattern matcher with the given
replacement. |
static String |
replaceAll(String input,
Pattern pattern,
String replacement)
Replaces each substring of the input that matches the given
Pattern with the given replacement. |
static String |
shorten(String input)
Delegates to shorten(String, int, int), showing up to
24 characters at the start and the end of the shortened
string. |
static String |
shorten(String input,
int numChars)
Delegates to shorten(String, int, int), using the same number
of characters at the start and the end of the shortened string. |
static String |
shorten(String input,
int startChars,
int endChars)
Shortens a string, inserting an ellipsis ("...") in the middle if the string is too long. |
static String[] |
splitLines(CharSequence input)
Splits a text into an array of lines. |
static String[] |
splitLinesExact(CharSequence input)
Splits a text into an array of lines, without trimming lines or discarding empty lines. |
static String[] |
splitString(String input)
Splits a string around whitespace. |
static String[] |
splitString(String input,
int splitMaximum)
Splits a string around whitespace. |
static String[] |
splitString(String input,
Pattern whitespacePattern,
int splitMaximum)
Splits a string around whitespace. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final String LINE_SEPARATOR
public static final String NEWLINE_ALTERNATIVES
public static final Pattern SINGLE_LINE_WS
public static final Pattern NEWLINE_PATTERN
A regular expression matching a single newline (built by enclosing NEWLINE_ALTERNATIVES in a non-capturing group).
public static final Pattern NEWLINES_PATTERN
public static final Pattern PUNCTUATION_PATTERN
public static final Pattern PUNCTUATION_SYMBOL_PATTERN
public static final Pattern WHITESPACE_PATTERN
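The actual values of these constants are not shown on this page. The following is a minimal sketch of how such constants might be defined with the JDK alone. The class name `PatternSketch` and every regular expression in it are assumptions derived from the field descriptions, not the library's real definitions.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for the documented constants; all values are assumed.
final class PatternSketch {
    // "\r\n" is listed first so a Windows line break is matched as one unit.
    static final String NEWLINE_ALTERNATIVES = "\r\n|\n|\r";

    // The alternatives enclosed in a non-capturing group.
    static final Pattern NEWLINE_PATTERN =
            Pattern.compile("(?:" + NEWLINE_ALTERNATIVES + ")");

    // One or more newlines together with the whitespace around them.
    static final Pattern NEWLINES_PATTERN =
            Pattern.compile("\\s*(?:" + NEWLINE_ALTERNATIVES + ")\\s*");

    // Character class containing space and tab.
    static final Pattern SINGLE_LINE_WS = Pattern.compile("[ \t]");

    // Unicode general categories P (punctuation) and S (symbol).
    static final Pattern PUNCTUATION_PATTERN = Pattern.compile("\\p{P}+");
    static final Pattern PUNCTUATION_SYMBOL_PATTERN =
            Pattern.compile("[\\p{P}\\p{S}]+");
}
```

Under these assumptions, `PatternSketch.NEWLINE_PATTERN.split("one\r\ntwo\rthree\nfour")` yields four elements, and `PatternSketch.PUNCTUATION_PATTERN.matcher("$").matches()` is false because `$` is a symbol rather than a punctuation character.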
Method Detail |
---|
public static int countFirst(String str, char ch)
Parameters:
str - the string to check
ch - the character to count
public static int countLast(String str, char ch)
Parameters:
str - the string to check
ch - the character to count
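The counting behaviour of countFirst and countLast can be illustrated with a short sketch. It is written against the JDK only and is not the library's implementation; the class name `CountSketch` is hypothetical, and a non-null input string is assumed.

```java
// Hypothetical re-implementation for illustration; assumes str is not null.
final class CountSketch {
    /** Counts how often ch is repeated at the beginning of str. */
    static int countFirst(String str, char ch) {
        int count = 0;
        while (count < str.length() && str.charAt(count) == ch) {
            count++;
        }
        return count;
    }

    /** Counts how often ch is repeated at the end of str. */
    static int countLast(String str, char ch) {
        int count = 0;
        while (count < str.length()
                && str.charAt(str.length() - 1 - count) == ch) {
            count++;
        }
        return count;
    }
}
```

In this sketch, `CountSketch.countFirst("aaab", 'a')` gives 3 and `CountSketch.countLast("baa", 'a')` gives 2. A string made up entirely of the character is counted in full by both methods.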
public static void ensurePrintableName(String string) throws IllegalArgumentException
Parameters:
string - the string to check
Throws:
IllegalArgumentException - if the given string is null or empty or contains whitespace
public static String joinAlternatives(String[] alternatives)
Helper method for building a regular expression Pattern by combining several alternatives.
Parameters:
alternatives - the alternatives to combine
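This page does not say whether joinAlternatives quotes the alternatives or wraps them in a group. The sketch below assumes the simplest reading: the alternatives are joined with `|` and left unquoted, and the caller adds a non-capturing group before compiling. The class `JoinAlternativesSketch` and the helper `toPattern` are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical sketch; the real method may quote or group differently.
final class JoinAlternativesSketch {
    /** Joins the alternatives with the regex alternation operator. */
    static String joinAlternatives(String[] alternatives) {
        return String.join("|", alternatives);
    }

    /** Wraps the joined alternatives in a non-capturing group and compiles them. */
    static Pattern toPattern(String[] alternatives) {
        return Pattern.compile("(?:" + joinAlternatives(alternatives) + ")");
    }
}
```

Because nothing is quoted here, each alternative is interpreted as a regular expression fragment, so literal text containing metacharacters would need `Pattern.quote` first.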
public static String multipleReplaceAll(CharSequence input, Map replacements)
Parameters:
input - the character sequence to perform the replacements on
replacements - a mapping of regular expression Patterns to replacement Strings
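One plausible reading of multipleReplaceAll is that each mapping is applied in turn to the result of the previous one. The sketch below assumes exactly that, and it assumes the map's iteration order decides the order of application. It uses a generic `Map<Pattern, String>` where the documented signature shows a raw `Map`, and the class name `MultipleReplaceSketch` is hypothetical.

```java
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch: applies every replacement sequentially, in map order.
final class MultipleReplaceSketch {
    static String multipleReplaceAll(CharSequence input,
                                     Map<Pattern, String> replacements) {
        String result = input.toString();
        for (Map.Entry<Pattern, String> entry : replacements.entrySet()) {
            // Each step works on the output of the previous step.
            result = entry.getKey().matcher(result).replaceAll(entry.getValue());
        }
        return result;
    }
}
```

With this approach a `LinkedHashMap` keeps the replacements in insertion order, which matters whenever one replacement can produce text that a later pattern matches.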
public static String normalize(String input)
Parameters:
input - the string to normalize
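The normalization described for this method can be approximated with two JDK calls. This is a sketch under the assumption that "whitespace" means what `\s` matches in `java.util.regex`; the class name `NormalizeSketch` is hypothetical.

```java
// Hypothetical sketch: trim the ends, then collapse internal whitespace runs.
final class NormalizeSketch {
    static String normalize(String input) {
        return input.trim().replaceAll("\\s+", " ");
    }
}
```

For example, `NormalizeSketch.normalize("  a \t\n b  ")` returns `"a b"`, and a string consisting only of whitespace becomes the empty string.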
public static String replaceAll(String input, Matcher matcher, String replacement)
Replaces each substring of the input matched by the given pattern matcher with the given replacement. See Matcher.replaceAll(java.lang.String) for details of the replacement process and special characters in the replacement string.
This method only returns a new string if there is at least one match to replace. Otherwise the reference to the input object is returned. Thus you can use the == operator to find out whether replacements have been made; it is not necessary to use String.equals(java.lang.Object). When there is nothing to replace, it might be more efficient than Matcher.replaceAll(java.lang.String) (and certainly than String.replaceAll(java.lang.String, java.lang.String)), because (as of JDK 1.4.2) these methods always create and return new objects.
Matchers are stateful and not thread-safe. It is not necessary to Matcher.reset() the matcher prior to calling this method, but you should reset it if you want to use it in other matching operations afterwards.
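The contract described above (return the very same input object when nothing matches, and reset the matcher internally) can be sketched with the standard `Matcher` API. This is an illustration rather than the library's code, and the class name `ReplaceAllSketch` is hypothetical.

```java
import java.util.regex.Matcher;

// Hypothetical sketch of the "same object if nothing matched" contract.
final class ReplaceAllSketch {
    static String replaceAll(String input, Matcher matcher, String replacement) {
        matcher.reset(input);       // reuse the matcher on the new input
        if (!matcher.find()) {
            return input;           // no match: hand back the same reference
        }
        StringBuffer result = new StringBuffer();
        do {
            matcher.appendReplacement(result, replacement);
        } while (matcher.find());
        matcher.appendTail(result);
        return result.toString();
    }
}
```

A caller can then detect changes with a reference comparison, for example `String out = ReplaceAllSketch.replaceAll(in, m, "#"); boolean changed = (out != in);`, where `m` is a matcher created once from the compiled pattern and reused across calls on a single thread.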
Parameters:
input - the string to process
matcher - a matcher on the pattern
replacement - the replacement string
Returns:
a new string with the replacements applied, or the input string if no replacements were made
public static String shorten(String input, int startChars, int endChars)
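Shortens a string, inserting an ellipsis ("...") in the middle if the string is too long. If the input String isn't longer than startChars + endChars + 3, it is returned unchanged. Otherwise the result consists of the first startChars characters of the input String, followed by "..." (an ellipsis) and the last endChars characters of the String.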
This method is similar to StringUtils.abbreviate(String, int), but the ellipsis is inserted in the middle of the string, not at the end.
Parameters:
input - the input string
startChars - the number of characters to include before the ellipsis
endChars - the number of characters to include after the ellipsis
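The length rule for the three-argument shorten method translates almost directly into code. The sketch below follows the documented threshold of startChars + endChars + 3; the class name `ShortenSketch` is hypothetical, and a non-null input with non-negative character counts is assumed.

```java
// Hypothetical sketch; assumes input != null and non-negative counts.
final class ShortenSketch {
    static String shorten(String input, int startChars, int endChars) {
        // Short enough: the ellipsis would not save any characters.
        if (input.length() <= startChars + endChars + 3) {
            return input;
        }
        return input.substring(0, startChars)
                + "..."
                + input.substring(input.length() - endChars);
    }
}
```

For example, `ShortenSketch.shorten("abcdefghijklmnopqrstuvwxyz", 3, 2)` returns `"abc...yz"`, while the eight-character string `"abcdefgh"` is returned unchanged with the same arguments because 3 + 2 + 3 = 8.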
public static String shorten(String input, int numChars)
Delegates to shorten(String, int, int), using the same number of characters at the start and the end of the shortened string.
Parameters:
input - the input string
numChars - the number of characters to use for both the startChars and the endChars parameter
public static boolean punctuation(CharSequence text)
Parameters:
text - the text to check
Returns:
true iff the text contains one or more punctuation characters and no other characters
public static boolean punctuationOrSymbol(CharSequence text)
Parameters:
text - the text to check
Returns:
true iff the text contains one or more punctuation or symbol characters and no other characters
public static String shorten(String input)
Delegates to shorten(String, int, int), showing up to 24 characters at the start and the end of the shortened string.
Parameters:
input - the input string
public static String replaceAll(String input, Pattern pattern, String replacement)
Replaces each substring of the input that matches the given Pattern with the given replacement. See Matcher.replaceAll(java.lang.String) for details of the replacement process and special characters in the replacement string.
This method only returns a new string if there is at least one match to replace. Otherwise the reference to the input object is returned. Thus you can use the == operator to find out whether replacements have been made; it is not necessary to use String.equals(java.lang.Object).
This method is thread-safe since pattern objects are stateless. On the other hand, it needs to create a new Matcher object, so replaceAll(String, Matcher, String) is more efficient for multiple replacements on the same pattern.
Parameters:
input - the string to process
pattern - the regular expression Pattern to replace
replacement - the replacement string
Returns:
a new string with the replacements applied, or the input string if no replacements were made
public static String[] splitLines(CharSequence input)
Parameters:
input - the text to split
public static String[] splitLinesExact(CharSequence input)
Parameters:
input - the text to split
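The difference between splitLines and splitLinesExact can be made concrete with a sketch. It assumes that splitLines trims every line and drops empty ones, which is only implied by the description of splitLinesExact, and that lines are separated by "\r\n", "\n" or "\r". The class name `SplitLinesSketch` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch of the two line-splitting variants.
final class SplitLinesSketch {
    private static final Pattern NEWLINE = Pattern.compile("(?:\r\n|\n|\r)");

    /** Keeps every line exactly as it is, including empty ones. */
    static String[] splitLinesExact(CharSequence input) {
        // A negative limit keeps trailing empty strings.
        return NEWLINE.split(input, -1);
    }

    /** Trims each line and discards lines that end up empty (assumed behaviour). */
    static String[] splitLines(CharSequence input) {
        List<String> lines = new ArrayList<>();
        for (String line : NEWLINE.split(input)) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                lines.add(trimmed);
            }
        }
        return lines.toArray(new String[0]);
    }
}
```

For the input `" a \n\n b"`, the exact variant of this sketch returns the three elements `" a "`, `""` and `" b"`, whereas the trimming variant returns `"a"` and `"b"`.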
public static String[] splitString(String input)
Parameters:
input - the string to split
public static String[] splitString(String input, int splitMaximum)
Splits a string around whitespace, keeping at most splitMaximum subsequences. If splitting results in more subsequences, only the last splitMaximum are kept, while the other ones are discarded.
This implementation splits around the WHITESPACE_PATTERN.
Parameters:
input - the string to split
splitMaximum - the maximum number of subsequences to keep; or -1 if all subsequences should be kept
Returns:
an array of the resulting subsequences, containing at most splitMaximum elements
public static String[] splitString(String input, Pattern whitespacePattern, int splitMaximum)
Splits a string around whitespace, keeping at most splitMaximum subsequences. If splitting results in more subsequences, only the last splitMaximum are kept, while the other ones are discarded.
Parameters:
input - the string to split
whitespacePattern - the pattern around which to split
splitMaximum - the maximum number of subsequences to keep; or -1 if all subsequences should be kept
Returns:
an array of the resulting subsequences, containing at most splitMaximum elements
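The "keep only the last splitMaximum subsequences" rule is easy to misread as a regular split limit, so a sketch may help. It assumes that any negative splitMaximum behaves like -1 and that a pattern such as `\s+` plays the role of WHITESPACE_PATTERN; the class name `SplitStringSketch` is hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical sketch: split first, then keep only the trailing elements.
final class SplitStringSketch {
    static String[] splitString(String input, Pattern whitespacePattern,
                                int splitMaximum) {
        String[] parts = whitespacePattern.split(input);
        if (splitMaximum < 0 || parts.length <= splitMaximum) {
            return parts; // nothing to discard
        }
        // Keep the last splitMaximum subsequences, drop the leading ones.
        return Arrays.copyOfRange(parts, parts.length - splitMaximum, parts.length);
    }
}
```

With `Pattern.compile("\\s+")` as the pattern, splitting `"a b  c d"` with a splitMaximum of 2 keeps `"c"` and `"d"`, while a splitMaximum of -1 keeps all four elements. This differs from the limit argument of `String.split`, which keeps the leading elements and leaves the remainder unsplit.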