Opening an InputStreamReader in the middle of a UTF-8 stream

I am using a seekable InputStream which returns the stream to me at a specific position. The underlying data in the stream is encoded with UTF-8. I want to open this stream using InputStreamReader and read one character at a time.

Here is my code snippet:

inputStream.seek(position-1);
InputStreamReader reader = new InputStreamReader(inputStream, "UTF-8");

The problem is that position-1 could be pointing to the middle of a multi-byte UTF-8 sequence. How can I detect that and make sure the reader starts at the beginning of a UTF-8 encoded sequence? Thanks in advance.

Jon Skeet

Assuming you can reposition the stream whenever you want, you can simply skip bytes while the top two bits are "10". Those are UTF-8 continuation bytes (10xxxxxx), which never start a sequence. So something like:

// InputStream doesn't actually have a seek method, but I'll assume you're using
// a subclass which does...
inputStream.seek(position);
while (true) {
    int nextByte = inputStream.read();
    // Stop at end of stream or at the first byte that is not a continuation byte (10xxxxxx)
    if (nextByte == -1 || (nextByte & 0xc0) != 0x80) {
        break;
    }
    position++;
}
// Undo the last read, effectively
inputStream.seek(position);
InputStreamReader reader = new InputStreamReader(inputStream, StandardCharsets.UTF_8);
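To check the bit test in isolation, here is a minimal sketch that applies the same idea to a plain byte array instead of a seekable stream. The class name Utf8Align and the method alignToSequenceStart are hypothetical names chosen for illustration, and the sketch assumes the data is valid UTF-8.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Align {
    // Hypothetical helper: returns the first index >= position whose byte is
    // not a UTF-8 continuation byte (10xxxxxx), or data.length if none is left.
    static int alignToSequenceStart(byte[] data, int position) {
        int i = position;
        // A byte is sign-extended to int, but masking with 0xC0 keeps only
        // the top two bits, so the comparison with 0x80 is still correct.
        while (i < data.length && (data[i] & 0xC0) == 0x80) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        // "a", the euro sign (three bytes in UTF-8), "b": 61 E2 82 AC 62
        byte[] data = "a\u20acb".getBytes(StandardCharsets.UTF_8);
        for (int p = 0; p <= data.length; p++) {
            System.out.println(p + " -> " + alignToSequenceStart(data, p));
        }
    }
}
```

Positions 2 and 3 fall inside the three-byte euro sequence, so both move forward to index 4, while positions that already sit on a lead byte or an ASCII byte are returned unchanged.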
