public class UnicodeProperties
extends java.lang.Object
Modifier and Type | Class and Description |
---|---|
static class | UnicodeProperties.UnsupportedUnicodeVersionException |
Modifier and Type | Field and Description |
---|---|
private IntCharSet[] | caselessMatches |
private java.lang.String | caselessMatchPartitions |
private int | caselessMatchPartitionSize |
private static java.lang.String | DEFAULT_UNICODE_VERSION |
private int | maximumCodePoint |
private java.util.Map<java.lang.String,IntCharSet> | propertyValueIntervals |
static java.lang.String | UNICODE_VERSIONS: Constant UNICODE_VERSIONS="1.1, 1.1.5, 2, 2.0, 2.0.14, 2.1, 2.1.9,"{trunked} |
private static java.util.regex.Pattern | WORD_SEP_PATTERN |
Constructor and Description |
---|
UnicodeProperties(): Unpacks the Unicode data corresponding to the default Unicode version: "9.0". |
UnicodeProperties(java.lang.String version): Unpacks the Unicode data corresponding to the given version. |
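The constructors above reject versions that are not supported. A minimal sketch of that kind of check follows; it is not the class's actual implementation. The names `VersionCheckSketch`, `UnsupportedVersionException`, `KNOWN_VERSIONS`, `isSupported`, and `select` are hypothetical, and the version list contains only the entries visible in the truncated `UNICODE_VERSIONS` constant plus the documented default "9.0".

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

class VersionCheckSketch {
  /** Hypothetical stand-in for UnicodeProperties.UnsupportedUnicodeVersionException. */
  static class UnsupportedVersionException extends Exception {
    UnsupportedVersionException(String message) {
      super(message);
    }
  }

  // Assumption: only the versions visible in the truncated constant, plus the
  // documented default "9.0". The real list is longer.
  static final String KNOWN_VERSIONS = "1.1, 1.1.5, 2, 2.0, 2.0.14, 2.1, 2.1.9, 9.0";

  // Split the comma-separated list once into a lookup set.
  static final Set<String> SUPPORTED =
      new LinkedHashSet<>(Arrays.asList(KNOWN_VERSIONS.split("\\s*,\\s*")));

  /** Returns true if the given version string is in the known list. */
  static boolean isSupported(String version) {
    return version != null && SUPPORTED.contains(version.trim());
  }

  /** Returns the trimmed version, or throws if it is not in the known list. */
  static String select(String version) throws UnsupportedVersionException {
    if (!isSupported(version)) {
      throw new UnsupportedVersionException("Unsupported Unicode version: " + version);
    }
    return version.trim();
  }

  public static void main(String[] args) {
    try {
      System.out.println("Selected: " + select("2.1"));
      select("0.9"); // not in the list, so this throws
    } catch (UnsupportedVersionException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Using a checked exception here mirrors the `throws` clauses on the documented constructors, so callers have to handle an unsupported version explicitly.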
Modifier and Type | Method and Description |
---|---|
private void | bind(java.lang.String[] propertyValues, java.lang.String[] intervals, java.lang.String[] propertyValueAliases, int maximumCodePoint, java.lang.String caselessMatchPartitions, int caselessMatchPartitionSize): Unpacks data for the selected Unicode version, populating propertyValueIntervals. |
private void | bindInvariantIntervals(): Adds intervals for \p{ASCII} and \p{Any} to propertyValueIntervals. |
IntCharSet | getCaselessMatches(int c): Returns a set of character intervals representing all characters that are case-insensitively equivalent to the given character, including the given character itself. |
IntCharSet | getIntCharSet(java.lang.String propertyValue): Returns the character interval set associated with the given property value for the selected Unicode version. |
int | getMaximumCodePoint(): Returns the maximum code point for the selected Unicode version. |
java.util.Set<java.lang.String> | getPropertyValues(): Returns the set of all properties, property values, and their aliases supported by the specified Unicode version. |
private void | init(java.lang.String version): Based on the given version, selects and binds the corresponding Unicode data to facilitate mappings from property values to character intervals. |
private void | initCaselessMatches(): Unpacks the caseless match data. |
private java.lang.String | normalize(java.lang.String identifier): Normalizes the given identifier by lowercasing it; removing whitespace, underscores, hyphens, and parentheses; and substituting '=' for every ':'. |
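The caseless match methods above describe packed partition data with a fixed record length that is unpacked lazily on first lookup. The sketch below illustrates that idea under stated assumptions: records are stored as consecutive code points in a string, and U+0000 pads records shorter than the record length. Neither detail is confirmed by this page, and `CaselessMatchSketch` uses plain `Set<Integer>` values rather than `IntCharSet`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class CaselessMatchSketch {
  private final String packedPartitions; // assumed: fixed-length records of code points
  private final int partitionSize;       // record length (max members per partition)
  private Map<Integer, Set<Integer>> matches; // stays null until the first lookup

  CaselessMatchSketch(String packedPartitions, int partitionSize) {
    this.packedPartitions = packedPartitions;
    this.partitionSize = partitionSize;
  }

  /** Returns all code points caselessly equivalent to c, including c itself. */
  Set<Integer> getCaselessMatches(int c) {
    if (matches == null) {
      initCaselessMatches(); // lazy initialization on first call
    }
    Set<Integer> partition = matches.get(c);
    if (partition == null) {
      // A character with no recorded partition matches only itself.
      partition = new TreeSet<>();
      partition.add(c);
    }
    return partition;
  }

  /** Unpacks every record and maps each member to its shared partition set. */
  private void initCaselessMatches() {
    matches = new HashMap<>();
    int index = 0;
    while (index < packedPartitions.length()) {
      Set<Integer> partition = new TreeSet<>();
      for (int n = 0; n < partitionSize && index < packedPartitions.length(); n++) {
        int cp = packedPartitions.codePointAt(index);
        index += Character.charCount(cp);
        if (cp != 0) {
          partition.add(cp); // assumed: U+0000 is padding, not a member
        }
      }
      for (int member : partition) {
        matches.put(member, partition);
      }
    }
  }

  public static void main(String[] args) {
    // Three records of length 3: {K, k, KELVIN SIGN}, {S, s, LONG S}, {A, a, padding}.
    CaselessMatchSketch sketch = new CaselessMatchSketch("Kk\u212ASs\u017FAa\0", 3);
    System.out.println(sketch.getCaselessMatches('k'));
  }
}
```

Every member of a partition points at the same set object, so a lookup is a single map access regardless of which member is queried.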
public static final java.lang.String UNICODE_VERSIONS
UNICODE_VERSIONS="1.1, 1.1.5, 2, 2.0, 2.0.14, 2.1, 2.1.9,"{trunked}
private static final java.lang.String DEFAULT_UNICODE_VERSION
private static final java.util.regex.Pattern WORD_SEP_PATTERN
private int maximumCodePoint
private java.util.Map<java.lang.String,IntCharSet> propertyValueIntervals
private java.lang.String caselessMatchPartitions
private int caselessMatchPartitionSize
private IntCharSet[] caselessMatches
public UnicodeProperties() throws UnicodeProperties.UnsupportedUnicodeVersionException
UnicodeProperties.UnsupportedUnicodeVersionException - if the default version is not supported.
public UnicodeProperties(java.lang.String version) throws UnicodeProperties.UnsupportedUnicodeVersionException
version - The Unicode version for which to unpack data.
UnicodeProperties.UnsupportedUnicodeVersionException - if the given version is not supported.
public int getMaximumCodePoint()
public IntCharSet getIntCharSet(java.lang.String propertyValue)
propertyValue - The Unicode property or property value (or alias for one of these) for which to return the corresponding character intervals.
public java.util.Set<java.lang.String> getPropertyValues()
public IntCharSet getCaselessMatches(int c)
The first call to this method lazily initializes the backing data.
c - The character for which to return case-insensitive equivalents.
private void initCaselessMatches()
Called from getCaselessMatches(int) to lazily initialize.
private void init(java.lang.String version) throws UnicodeProperties.UnsupportedUnicodeVersionException
version - The Unicode version for which to bind data.
UnicodeProperties.UnsupportedUnicodeVersionException - if the given version is not supported.
private void bind(java.lang.String[] propertyValues, java.lang.String[] intervals, java.lang.String[] propertyValueAliases, int maximumCodePoint, java.lang.String caselessMatchPartitions, int caselessMatchPartitionSize)
Unpacks data for the selected Unicode version, populating propertyValueIntervals.
propertyValues - The list of property values, in the same order as the packed data corresponding to them in the given intervals, for the selected Unicode version.
intervals - The packed character intervals corresponding to and in the same order as the given propertyValues, for the selected Unicode version.
propertyValueAliases - Key/value pairs mapping property value aliases to property values, for the selected Unicode version.
maximumCodePoint - The maximum code point for the selected Unicode version.
caselessMatchPartitions - The packed caseless match partition data for the selected Unicode version.
caselessMatchPartitionSize - The partition data record length (the maximum number of elements in a caseless match partition) for the selected Unicode version.
private void bindInvariantIntervals()
Adds intervals for \p{ASCII} and \p{Any} to propertyValueIntervals.
private java.lang.String normalize(java.lang.String identifier)
identifier - The identifier to normalize.
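The normalization rules described for normalize (lowercase; strip whitespace, underscores, hyphens, and parentheses; replace ':' with '=') can be sketched as follows. This is an illustration of those rules, not the class's actual code: the class name `NormalizeSketch`, the exact regular expression, the use of `Locale.ENGLISH`, and the null handling are all assumptions.

```java
import java.util.Locale;
import java.util.regex.Pattern;

class NormalizeSketch {
  // Assumed separator pattern: hyphen, underscore, any whitespace, parentheses.
  private static final Pattern WORD_SEP = Pattern.compile("[-_\\s()]");

  /** Applies the documented normalization steps to the given identifier. */
  static String normalize(String identifier) {
    if (identifier == null) {
      return null; // assumption: null passes through unchanged
    }
    // Fixed locale so lowercasing does not depend on the default locale.
    String lower = identifier.toLowerCase(Locale.ENGLISH);
    // Drop separator characters, then substitute '=' for every ':'.
    return WORD_SEP.matcher(lower).replaceAll("").replace(':', '=');
  }

  public static void main(String[] args) {
    System.out.println(normalize("General_Category: Lu")); // prints "generalcategory=lu"
  }
}
```

Normalizing both the stored keys and the lookup argument this way lets spellings such as `General_Category:Lu` and `generalcategory=lu` resolve to the same entry.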