My name
is
Jon Skeet

Java: can java.lang.Character be used for characters outside Basic Multilingual Plane?

As I understand java keeps string in uft16 which for every code points uses either 16 (for BMP) or 32 bits. But I am not sure if class Character can be used for keeping code point which need 32 bits. Reading http://docs.oracle.com/javase/7/docs/api/java/lang/Character.html didn't help. So can it?

No, char and Character can't represent a code point outside the BMP. There's no specific type for this, but all the Java APIs just use int to refer to code points specifically as opposed to UTF-16 code units.

If you look at all the codePoint* methods in java.lang.Character, such as codePointAt(char[], int, int) you'll see they use int.

In my experience, very little code (including my own) correctly takes account of this, instead assuming that it's reasonable to talk about the length of a string as being the number of UTF-16 code units in it. Having said that, "length" is a pretty hard-to-pin-down concept for strings, in that it doesn't mean the number of displayed glyphs, and different normalization forms of logically-equivalent text can consist of different numbers of code points...

See more on this question at Stackoverflow