I want to be able to read a document from database and extract some data out of it. I am able to do the second part but I'm having issue with doing it efficiently. Is there a more efficient way of reading BLOBs without writing them locally first? As you can see I save the data into a document on my hard drive which is terribly slow.
final String url = "connectioninfo";
final String username = "user";
final String password = "password";
try {
Connection conn = DriverManager.getConnection(url, username, password);
String sql = "SELECT document_id, file_name, data FROM documents WHERE file_name like '%.doc'";
PreparedStatement stmt = conn.prepareStatement(sql);
ResultSet resultSet = stmt.executeQuery();
while (resultSet.next()) {
String da_document_id = resultSet.getString(1);
String file_name = resultSet.getString(2);
File data = new File("c:\\databaseDoc.doc");
FileOutputStream fos = new FileOutputStream(data);
byte[] buffer = new byte[1];
InputStream is = resultSet.getBinaryStream(3);
try {
while (is.read(buffer) > 0) {
fos.write(buffer);
}
fos.close();
} catch (IOException e) {
e.printStackTrace();
}
System.out.println("da_document_id= " + da_document_id);
System.out.println("file_name= " + file_name);
}
conn.close();
} catch (SQLException e) {
e.printStackTrace();
} catch (FileNotFoundException e) {
e.printStackTrace();
}
Part of the reason that it's slow is that you're reading one byte at a time. That's never a good idea. If you're using Java 9, I'd also suggest using InputStream.transferTo(OutputStream)
to make things rather simpler:
But no, you don't need to write to a FileOutputStream
- you could copy the data into a ByteArrayOutputStream
instead, at which point you can easily create a byte[]
:
ByteArrayOutputStream output = new ByteArrayOutputStream();
try (InputStream input = resultSet.getBinaryStream(3)) {
input.transferTo(output);
}
byte[] data = output.toByteArray();
Alternatively, if the way you want to use the data is something that already accepts an InputStream
, just use the stream returned by getBinaryStream
. Make sure you know when it's okay to close the stream though - some usages may read lazily.
See more on this question at Stackoverflow