XML getting wrong UTF encoding

I am trying to encode my xml text using utf-8 with the code below. For some reason I am getting utf-16 instead of utf-8. Any reason why please?

        StringWriter writer = new StringWriter();
        xdoc.Save(writer);
        writer.Flush();
        string xml = writer.ToString();
        byte[] bytes = Encoding.UTF8.GetBytes(xml);
        System.IO.File.WriteAllBytes(pathDesktop + "\\22CRE002.XPO", bytes);
Jon Skeet

StringWriter itself "advertises" (via the TextWriter.Encoding property) an encoding of UTF-16, so the XmlWriter detects that and modifies the XML declaration accordingly. You are actually writing out the data as UTF-8 - it's just that the XML file itself will claim (incorrectly) that it's UTF-16, leading to all kinds of errors.

Your options are:

  • Use a subclass of StringWriter which advertises a different encoding by overriding the Encoding property
  • Just bypass the StringWriter entirely and write straight to the file.
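For the first option, a minimal sketch might look like the following. The names Utf8StringWriter, Utf8StringWriterDemo and ToUtf8DeclaredXml are made up for illustration, and the sketch assumes xdoc is an XDocument (XmlDocument also has a Save(TextWriter) overload, so the same idea should carry over).

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Linq;

// Hypothetical helper: a StringWriter that advertises UTF-8 instead of UTF-16.
// Only the Encoding property changes; the text is still buffered in memory.
public sealed class Utf8StringWriter : StringWriter
{
    public override Encoding Encoding => Encoding.UTF8;
}

public static class Utf8StringWriterDemo
{
    // Saves the document through the UTF-8-advertising writer and returns the text.
    public static string ToUtf8DeclaredXml(XDocument xdoc)
    {
        using (var writer = new Utf8StringWriter())
        {
            xdoc.Save(writer);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        // Assumption: a tiny stand-in document, not the asker's real data.
        var xdoc = new XDocument(new XElement("root", "caf\u00e9"));
        string xml = ToUtf8DeclaredXml(xdoc);

        // The declaration should now name utf-8, matching the bytes produced
        // by Encoding.UTF8.GetBytes(xml) in the original code.
        Console.WriteLine(xml);
    }
}
```

With this in place, the rest of the original code (Encoding.UTF8.GetBytes followed by File.WriteAllBytes) can stay as it is, because the declaration and the bytes on disk agree.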

Personally I'd go with the second option:

using (var writer = XmlWriter.Create(Path.Combine(pathDesktop, "22CRE002.XPO")))
{
    xdoc.Save(writer);
}

Why buffer it all in memory first? Note that XmlWriter will already default to UTF-8 if you don't specify an encoding.
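If you would rather state the encoding explicitly than rely on the default, something along these lines should work. It is only a sketch: ExplicitEncodingDemo and SaveAsUtf8 are illustrative names, xdoc is assumed to be an XDocument, and the choice to omit the byte order mark and to indent the output is mine rather than anything the question requires.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

public static class ExplicitEncodingDemo
{
    // Writes the document straight to a file with an explicitly chosen encoding.
    public static void SaveAsUtf8(XDocument xdoc, string path)
    {
        var settings = new XmlWriterSettings
        {
            // UTF-8 without a byte order mark (an assumption; pass true to emit one).
            Encoding = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false),
            Indent = true
        };

        using (var writer = XmlWriter.Create(path, settings))
        {
            xdoc.Save(writer);
        }
    }

    public static void Main()
    {
        // Assumption: a stand-in document and a temp-folder path for the demo.
        var xdoc = new XDocument(new XElement("root", "caf\u00e9"));
        string path = Path.Combine(Path.GetTempPath(), "22CRE002.XPO");

        SaveAsUtf8(xdoc, path);
        Console.WriteLine(File.ReadAllText(path, Encoding.UTF8));
    }
}
```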


See more on this question at Stack Overflow