BCP 47 Extensions

The Java SE 7 release conforms to the IETF BCP 47 standard, which supports adding extensions to a Locale. Any single character can be used to denote an extension, but there are two predefined extension codes: 'u' specifies a Unicode locale extension, and 'x' specifies a private use extension.

Unicode locale extensions are defined by the Unicode Common Locale Data Repository (CLDR) project. They are used to specify information that is non-language-specific such as calendars or currency. A private use extension may be used to specify any other information, such as platform (for example, Windows, UNIX, or Linux), or release information (for example, 6u23 or JDK 7).

An extension is specified as a key/value pair, where the key is a single character (typically 'u' or 'x'). A well-formed value has the following format:


In this format:

SUBTAG = [0-9a-zA-Z]{2,8}    For key='u'
SUBTAG = [0-9a-zA-Z]{1,8}    For key='x'

Note that a single-character value is allowed for the private use extension. However, there is a 2-character minimum for values in the Unicode locale extension.

Extension strings are case-insensitive, but the Locale class maps all keys and values to lowercase.

The getExtensionKeys() method returns the set of extension keys, if any, for a Locale. The getExtension(key) method returns the value string for the specified key, if any.

Unicode Locale Extensions

As previously mentioned, a Unicode locale extension is specified by the 'u' key code or the UNICODE_LOCALE_EXTENSION constant. The value itself is also specified by a key/type pair. Legal values are defined in the Key/Type Definitions table on the Unicode website. A key code is specified by two alphabetic characters. The following table lists the Unicode locale extension keys:

Key Code Description
ca calendar algorithm
co collation type
ka collation parameters
cu currency type
nu number type
va common variant type


Specifying a Unicode locale extension, such as number format, does not guarantee that the locale services for the underlying platform will honor that request.

The following table shows some examples of key/type pairs for a Unicode locale extension.

Key/Type pair Description
ca-buddhist Thai Buddhist calendar
co-pinyin Pinyin ordering for Latin
cu-usd U.S. dollars
nu-jpanfin Japanese financial numerals
tz-aldav Europe/Andorra

The following string represents the German language locale for the country of Germany using a phonebook style of ordering for the Linux platform. This example also contains an attribute named "email".


The following Locale methods can be used to access information about the Unicode locale extensionss. These methods are described using the previous German locale example.

  • getUnicodeLocaleKeys() – Returns the Unicode locale key codes or an empty set if the locale has none. For the German example, this would return a set containing the single string "co".
  • getUnicodeLocaleType(String) – Returns the Unicode locale type associated with the Unicode locale key code. Invoking getUnicodeLocaleType("co") for the German example would return the string "phonebk".
  • getUnicodeLocaleAttributes() – Returns the set of Unicode locale attributes associated with this locale, if any. In the German example, this would return a set containing the single string "email".

Private Use Extensions

The private use extension, specified by the 'x' key code or the PRIVATE_USE_EXTENSION constant, can be anything, as long as the value is well formed.

The following are examples of possible private use extensions: