Heyho,
I want to convert byte data, which can be anything, to a String. My question is whether it is "secure" to decode the byte data as UTF-8, for example:
String s1 = new String(data, "UTF-8");
or by using base64:
String s2 = Base64.encodeToString(data, false); //migbase64
I'm just afraid that using the first method has negative side effects. I mean both variants work p̶e̶r̶f̶e̶c̶t̶l̶y̶, but s1 can contain any character of the UTF-8 charset, while s2 only uses "readable" characters. I'm just not sure if it's really necessary to use base64. Basically I just need to create a String, send it over the network and receive it again. (There is no other way in my situation :/)
The question is only about negative side effects, not if it's possible!
You should absolutely use base64 or possibly hex. (Either will work; base64 is more compact but harder for humans to read.)
You claim "both variants work perfectly" but that's actually not true. If you use the first approach and data is not actually a valid UTF-8 sequence, you will lose data. You're not trying to convert UTF-8-encoded text into a String, so don't write code which tries to do so.
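To make the data loss concrete, here is a minimal sketch. The class name Utf8LossDemo and the helper roundTripUtf8 are made up for illustration; the sketch only assumes that the String(byte[], Charset) constructor substitutes a replacement character for malformed input, which is what its documentation describes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the original question or answer.
public class Utf8LossDemo {

    // Decodes arbitrary bytes as if they were UTF-8 text, then encodes
    // the resulting String back to bytes.
    static byte[] roundTripUtf8(byte[] data) {
        String s = new String(data, StandardCharsets.UTF_8);
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xFF and 0xFE can never appear in well-formed UTF-8.
        byte[] data = {(byte) 0xFF, (byte) 0xFE, 0x41};
        byte[] back = roundTripUtf8(data);

        // The invalid bytes were replaced during decoding, so the
        // original bytes cannot be recovered from the String.
        System.out.println("round trip intact: " + Arrays.equals(data, back));
    }
}
```

Input that happens to be valid UTF-8 (plain ASCII, for instance) survives the round trip, which is probably why the approach appears to work in casual testing.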
Using ISO-8859-1 as an encoding will preserve all the data - but in very many cases the string that is returned will not be easily transported across other protocols. It may very well contain unprintable control characters, for example.
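A small sketch of both halves of that statement, with the hypothetical names Latin1Demo, roundTripLatin1 and hasControlChars: every byte value maps to exactly one char in ISO-8859-1, so the round trip is lossless, but the resulting String can be full of control characters.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the original question or answer.
public class Latin1Demo {

    // ISO-8859-1 maps each of the 256 byte values to one char and back.
    static byte[] roundTripLatin1(byte[] data) {
        String s = new String(data, StandardCharsets.ISO_8859_1);
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Reports whether the decoded String contains ISO control characters
    // (U+0000..U+001F and U+007F..U+009F).
    static boolean hasControlChars(byte[] data) {
        String s = new String(data, StandardCharsets.ISO_8859_1);
        return s.chars().anyMatch(Character::isISOControl);
    }

    public static void main(String[] args) {
        byte[] data = {0x00, 0x1B, (byte) 0x85, (byte) 0xFF};
        System.out.println("round trip intact: "
                + Arrays.equals(data, roundTripLatin1(data)));
        System.out.println("contains control characters: "
                + hasControlChars(data));
    }
}
```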
Only use the String(byte[], String) constructor when you've got inherently textual data, which you happen to have in an encoded form (where the encoding is specified as the second argument). For anything else - music, video, images, encrypted or compressed data, just for example - you should use an approach which treats the incoming data as "arbitrary binary data" and finds a textual encoding of it... which is precisely what base64 and hex do.
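As a rough sketch of the base64 and hex approaches: the question uses migbase64, but this example swaps in java.util.Base64 (available since Java 8) so that it has no external dependency. The class name BinaryToText and its four helper methods are invented for illustration.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper class; java.util.Base64 stands in for migbase64.
public class BinaryToText {

    static String toBase64(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    static byte[] fromBase64(String text) {
        return Base64.getDecoder().decode(text);
    }

    // Two lowercase hex digits per byte.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    // Assumes an even-length string of hex digits.
    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0xFF, (byte) 0xFE, 0x41};

        String b64 = toBase64(data);
        String hex = toHex(data);

        // Both encodings produce only printable ASCII and are reversible.
        System.out.println(b64 + " -> " + Arrays.equals(data, fromBase64(b64)));
        System.out.println(hex + " -> " + Arrays.equals(data, fromHex(hex)));
    }
}
```

Base64 uses four characters for every three bytes, whereas hex uses two characters per byte, which is the compactness trade-off mentioned at the top of the answer.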