C# FileStream.Read having problems with StreamReader.EndOfStream

As the title says, I found a problem. A little backstory first: we have a file.txt that looks like this:

aaaabb
ccccddd
eeeefffffff

There are many ways to read this text line-by-line, one of which is this:

StreamReader sr = new StreamReader("file.txt");
while(!sr.EndOfStream)
{
    string s = sr.ReadLine();
}
sr.Close();

This works: s gets each line in turn. Now I need the first 4 letters of each line as bytes and the rest as a string. After looking things up and experimenting a little, I found that the easiest way is this:

FileStream fs = new FileStream("file.txt", FileMode.Open);
StreamReader sr = new StreamReader(fs);
byte[] arr = new byte[4];
fs.Read(arr, 0, 4);
string s = sr.ReadLine();
sr.Close();
fs.Close();

This works too: arr contains the first 4 letters as bytes and the rest of the line is saved in s. It handles only a single line, though. If we add the while loop:

FileStream fs = new FileStream("file.txt", FileMode.Open);
StreamReader sr = new StreamReader(fs);
while(!sr.EndOfStream)
{
    byte[] arr = new byte[4];
    fs.Read(arr, 0, 4);
    string s = sr.ReadLine();
} 
sr.Close();
fs.Close();

Now there's a problem. Now arr doesn't get anything and s reads the whole line including the first 4 letters. Even stranger, if I use while(true) (and, I assume, any condition other than the one in the example), then it works as intended: 4 characters as bytes and the rest as a string, and this is the same for every line.

So my questions are: what am I missing? Why is this happening? How do I solve it? Or is it possible that this is a bug?

Jon Skeet

Now there's a problem. Now arr doesn't get anything and s reads the whole line including the first 4 letters.

Yes, that seems very plausible. StreamReader maintains a buffer - when you ask it to read a line of text, it may well read more from the stream than that single line, using that buffered data when it's next asked for information.
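
The buffering can be observed directly. In the loop from the question, the EndOfStream check is itself a read: when the reader's buffer is empty, it has to pull data from the underlying stream to find out whether anything is left. The following is a minimal sketch of that effect, not a definitive test. It uses a MemoryStream holding the sample text in place of file.txt, and the `BufferDemo` class and `PositionsAroundEndOfStream` helper are names made up for illustration.

```csharp
using System;
using System.IO;
using System.Text;

public static class BufferDemo
{
    // Hypothetical helper: records the underlying stream's position
    // before and after StreamReader.EndOfStream is evaluated.
    public static (long Before, long After, bool AtEnd) PositionsAroundEndOfStream(Stream stream)
    {
        using (StreamReader sr = new StreamReader(stream))
        {
            long before = stream.Position;

            // With an empty buffer, this check makes the reader fill its
            // buffer from the stream, which advances the stream's position.
            bool atEnd = sr.EndOfStream;

            long after = stream.Position;
            return (before, after, atEnd);
        }
    }

    public static void Main()
    {
        // Same content as file.txt in the question (assumed ASCII, "\n" line endings).
        byte[] data = Encoding.ASCII.GetBytes("aaaabb\nccccddd\neeeefffffff\n");

        var (before, after, atEnd) = PositionsAroundEndOfStream(new MemoryStream(data));
        Console.WriteLine($"before: {before}, after: {after}, EndOfStream: {atEnd}");
    }
}
```

Because the sample is far smaller than the reader's buffer, the whole content is consumed by that one check, so a later Read on the underlying stream has nothing left to return.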

Fundamentally, I would strongly advise against reading directly from the stream that the StreamReader is reading from. It's going to be very fiddly to get right even where it's possible, and in some cases the API may just not let you do what you want.

If you want to remove the first four characters from each line, it would be much simpler to read the whole line, and then use Substring.
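
A minimal sketch of that approach might look like the following. It assumes the file is plain ASCII, so each of the first four characters maps to exactly one byte. The `LineSplitter` class and `Split` method are hypothetical names, and `Main` writes the sample content to a temporary file only so that the example runs on its own.

```csharp
using System;
using System.IO;
using System.Text;

public static class LineSplitter
{
    // Hypothetical helper: splits a line into its first four characters
    // (as ASCII bytes) and the remainder of the line (as a string).
    public static (byte[] Prefix, string Rest) Split(string line)
    {
        // Guard against lines shorter than four characters.
        int prefixLength = Math.Min(4, line.Length);
        byte[] prefix = Encoding.ASCII.GetBytes(line.Substring(0, prefixLength));
        string rest = line.Substring(prefixLength);
        return (prefix, rest);
    }

    public static void Main()
    {
        // Stand-in for file.txt from the question.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "aaaabb\nccccddd\neeeefffffff\n");
        try
        {
            // Only the StreamReader touches the file; nothing reads the
            // underlying stream behind its back.
            using (StreamReader sr = new StreamReader(path))
            {
                string line;
                while ((line = sr.ReadLine()) != null)
                {
                    var (arr, s) = Split(line);
                    Console.WriteLine($"{BitConverter.ToString(arr)} | {s}");
                }
            }
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Looping on the return value of ReadLine instead of on EndOfStream is a stylistic choice here. Either works once the reader is the only thing consuming the stream.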

See more on this question at Stackoverflow