Is there any approach to convert a large XML file (500+ MB) from 'Windows-1252' encoding to 'UTF-8' encoding in Java?
Sure:
Open a FileInputStream wrapped in an InputStreamReader with the Windows-1252 encoding for the input.
Open a FileOutputStream wrapped in an OutputStreamWriter with the UTF-8 encoding for the output.
Repeatedly read into a char array and write out however much has been read:
char[] buffer = new char[16 * 1024];
int charsRead;
// read() returns -1 at the end of the stream, which ends the loop
while ((charsRead = input.read(buffer)) > 0) {
    output.write(buffer, 0, charsRead);
}
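For completeness, here is a minimal sketch of how the pieces above might fit together in one class. The class name EncodingConverter and the method convert are hypothetical names chosen for illustration, and the sketch assumes the JVM provides the "windows-1252" charset (it is not one of the charsets every Java platform is required to support, although common JDKs ship it).

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodingConverter {
    // Hypothetical helper: decodes `source` as Windows-1252 and writes it to `target` as UTF-8.
    static void convert(File source, File target) throws IOException {
        // Assumption: this charset is available on the running JVM.
        Charset windows1252 = Charset.forName("windows-1252");
        try (Reader input = new InputStreamReader(new FileInputStream(source), windows1252);
             Writer output = new OutputStreamWriter(new FileOutputStream(target), StandardCharsets.UTF_8)) {
            char[] buffer = new char[16 * 1024];
            int charsRead;
            // Copy one buffer at a time; only the buffer is held in memory.
            while ((charsRead = input.read(buffer)) > 0) {
                output.write(buffer, 0, charsRead);
            }
        } // try-with-resources flushes and closes both streams
    }

    public static void main(String[] args) throws IOException {
        // Usage: java EncodingConverter <input file> <output file>
        convert(new File(args[0]), new File(args[1]));
    }
}
```

Writing to a separate output file rather than overwriting the input keeps the original intact if something goes wrong part-way through.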
Note that as it's XML, you may well need to manually change the XML declaration as well, as it probably currently declares the encoding as Windows-1252.
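One way to handle the declaration, sketched here under some assumptions: the declaration, if present, sits at the very start of the decoded text, so a small regex can swap the encoding value before the first chunk is written out. The class XmlDeclarationFixer and the method declareUtf8 are hypothetical names, not part of any library, and the sketch assumes the whole declaration is contained in the string passed in.

```java
import java.util.regex.Pattern;

final class XmlDeclarationFixer {
    // Matches the encoding pseudo-attribute inside an XML declaration at the start of the text.
    // Group 1 is everything up to the opening quote, group 2 is the quote character itself.
    private static final Pattern ENCODING = Pattern.compile(
            "^(<\\?xml[^>]*?encoding\\s*=\\s*)([\"'])[^\"']*\\2");

    // Returns `head` with the declared encoding replaced by UTF-8,
    // or `head` unchanged if there is no declaration with an encoding.
    static String declareUtf8(String head) {
        return ENCODING.matcher(head).replaceFirst("$1$2UTF-8$2");
    }
}
```

For example, declareUtf8 could be applied to the text from the first read before writing it, leaving every later chunk untouched. If the file has no XML declaration at all, nothing needs changing, since UTF-8 is what a parser assumes in the absence of other information.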
The fact that this works on a streaming basis means you don't need to worry about the size of the file: only up to 16K characters are held in memory at a time.