How to convert (Java) files with different encodings to the same encoding?

I'm working on a large Java web application in Eclipse whose files use a mix of encodings: some are UTF-8, others Cp1252, and still others ISO-8859-1, with no pattern distinguishing JSPs, Java source files, or CSS. However, I do know the encoding of each file.

I'm converting the project to Maven, and this is a great occasion to convert all of the files to UTF-8.
Of course I don't want to lose a single character, so a blind, fully automated conversion won't do.

How should I go about it? Is there a tool that can help me ensure I don't lose any special character?
The webapp is in Italian, so there could be many accented letters, especially in the JSPs (HTML entities probably haven't been used everywhere).

The project is in Eclipse, but I can use an external editor if that would make the conversion easier.

Jon Skeet

It's very easy to write code to convert encodings - although I'd expect there are tools to do it anyway. Simply:

  • Create one FileInputStream to the existing file, and wrap it in an InputStreamReader with the appropriate encoding
  • Create one FileOutputStream to the new file, and wrap it in an OutputStreamWriter with the appropriate encoding
  • Loop over the reader, reading characters into a buffer and writing out the contents of that buffer (just as many characters as you read) until you've read the whole file
  • Close all resources (automatic with a try-with-resources block)
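The steps above can be sketched as follows. The class name, helper name, and the demo file names are illustrative, not part of the answer; the key points are the charset-aware stream wrappers and writing exactly as many characters as each read returned:

```java
import java.io.*;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingConverter {

    // Read `source` in `from`, write `target` in `to`, character by character.
    static void convertFile(File source, Charset from,
                            File target, Charset to) throws IOException {
        try (Reader reader = new InputStreamReader(new FileInputStream(source), from);
             Writer writer = new OutputStreamWriter(new FileOutputStream(target), to)) {
            char[] buffer = new char[8192];
            int read;
            // Write only as many characters as were actually read on each pass.
            while ((read = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, read);
            }
        } // try-with-resources closes reader and writer, even on failure
    }

    public static void main(String[] args) throws IOException {
        // Demo: write a small Cp1252 file with Italian accents, then convert it.
        File src = File.createTempFile("demo-cp1252", ".txt");
        File dst = File.createTempFile("demo-utf8", ".txt");
        try (Writer w = new OutputStreamWriter(new FileOutputStream(src),
                                               Charset.forName("Cp1252"))) {
            w.write("perché città");
        }
        convertFile(src, Charset.forName("Cp1252"), dst, StandardCharsets.UTF_8);
    }
}
```

Since the source encoding varies per file, you would call `convertFile` with the charset you have recorded for each file rather than one global setting.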

The first two steps are simpler with Files.newBufferedReader and Files.newBufferedWriter, too.
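A sketch of the same loop using those NIO factory methods (the class and method names here are illustrative): `Files.newBufferedReader` and `Files.newBufferedWriter` take a `Path` and a `Charset` directly, so the manual stream wrapping disappears. A useful side effect for this task is that `Files.newBufferedReader` throws an exception on malformed input instead of silently substituting replacement characters, so a file read with the wrong encoding is more likely to fail loudly than to lose characters:

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class NioEncodingConverter {

    // Same copy loop as before, but the charsets are passed straight to NIO.
    static void convert(Path source, Charset from,
                        Path target, Charset to) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(source, from);
             BufferedWriter writer = Files.newBufferedWriter(target, to)) {
            char[] buffer = new char[8192];
            int read;
            while ((read = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, read);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Demo round trip: ISO-8859-1 in, UTF-8 out.
        Path src = Files.createTempFile("demo-latin1", ".txt");
        Path dst = Files.createTempFile("demo-utf8", ".txt");
        Files.write(src, "più così".getBytes(StandardCharsets.ISO_8859_1));
        convert(src, StandardCharsets.ISO_8859_1, dst, StandardCharsets.UTF_8);
    }
}
```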


See more on this question at Stackoverflow