Null pointer when trying to do rewriteBatchedStatements for MySQL and Java

I am trying to do batch inserts into MySQL at very high rates. I wanted to try the rewriteBatchedStatements config option, as I have read it can significantly improve performance. When I add the option, however, I get the following exception:
java.lang.NullPointerException
at com.mysql.jdbc.PreparedStatement.computeMaxParameterSetSizeAndBatchSize(PreparedStatement.java:1694)
at com.mysql.jdbc.PreparedStatement.computeBatchSize(PreparedStatement.java:1651)
at com.mysql.jdbc.PreparedStatement.executeBatchedInserts(PreparedStatement.java:1515)
at com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1272)
at com.zaxxer.hikari.proxy.StatementProxy.executeBatch(StatementProxy.java:116)
at com.zaxxer.hikari.proxy.PreparedStatementJavassistProxy.executeBatch(PreparedStatementJavassistProxy.java)
This is my code that does the inserts:
try (Connection connection = DBUtil.getInstance().getConnection();
     PreparedStatement preparedStatement = connection.prepareStatement(query)) {
    connection.setAutoCommit(false);
    for (TransactionBatch batch : batches) {
        try {
            preparedStatement.setString(1, batch.getDeviceID());
            preparedStatement.setBinaryStream(2, new ByteArrayInputStream(dataArray));
            preparedStatement.addBatch();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    preparedStatement.executeBatch();
} catch (Exception e) {
    e.printStackTrace();
}
This is my JDBC URL:
jdbc:mysql://url:port/tableName?user=username&password=password&useServerPrepStmts=false&rewriteBatchedStatements=true
Also I am using HikariCP as my connection pool.
EDIT: Update - it looks like the problem relates to having a VARBINARY(10000) column in the table

The solution was to stop using:
preparedStatement.setBinaryStream(2, inputStream)
and instead use:
preparedStatement.setBytes(2, byteArray)
To rewrite the batch, the driver needs to calculate the total parameter size up front, which it cannot do for an input stream of unknown length. It is working great now and my write speeds are awesome!
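For illustration, a minimal sketch of the working loop, assuming TransactionBatch exposes its payload as a byte array (getData() here is a hypothetical accessor standing in for however the bytes are produced):

try (Connection connection = DBUtil.getInstance().getConnection();
     PreparedStatement preparedStatement = connection.prepareStatement(query)) {
    connection.setAutoCommit(false);
    for (TransactionBatch batch : batches) {
        preparedStatement.setString(1, batch.getDeviceID());
        // setBytes hands the driver the exact length up front, which batch
        // rewriting needs; a stream of unknown size does not provide that
        preparedStatement.setBytes(2, batch.getData());
        preparedStatement.addBatch();
    }
    preparedStatement.executeBatch();
    connection.commit(); // autoCommit is off, so commit explicitly
}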

Related

try-with-resources and catching Exceptions which are both thrown during normal execution and autoClose

I have been pondering the following code for a while:
try (Connection conn = dbHandler.getConnection()) {
    // do something with the connection
    conn.commit();
} catch (SQLException e) {
    System.err.println("Too bad!");
    e.printStackTrace();
}
Consider that my application needs to be resilient to certain SQL locking/deadlock situations, so I need to do some specific handling inside the catch block. However, I cannot easily identify whether the exception was thrown during the "do something" part of my code or during the auto-close. And of course, each JDBC driver throws slightly different exceptions, and if you throw JDBI into the mix, it gets even more complicated. So you can't really rely on different catch clauses to precisely identify when the exception was thrown.
The only solution I could come up with is this code:
boolean finishedProcessing = false;
try (Connection conn = dbHandler.getConnection()) {
    // do something with the connection
    conn.commit();
    finishedProcessing = true;
} catch (SQLException e) {
    if (finishedProcessing) {
        // this is an exception during auto-close
        System.err.println("Too bad!");
        e.printStackTrace();
    } else {
        // handle rollback cases, retries, etc.
        System.err.println("SQL recovery performed.");
        throw e;
    }
}
Now the same issue comes up with IOExceptions for File* operations and probably in many other cases.
To be honest, I am just hoping I missed something completely obvious and I appreciate any insight from the Java experts out here.
As stated above, the only solution I found so far is introducing state variables into the code, which doesn't really make sense to me.
I am fully aware that exceptions from the auto-close are suppressed when the try block itself throws an exception; however, as stated above, the same kind of exception can be thrown directly, arrive suppressed, or both.
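Incidentally, the suppressed exceptions are at least inspectable. A minimal sketch (Java 7+; note that when only close() fails, the catch receives the close exception itself and the suppressed array is empty):

try (Connection conn = dbHandler.getConnection()) {
    // do something with the connection
    conn.commit();
} catch (SQLException e) {
    // if the body threw first, any close() failure is attached here
    for (Throwable suppressed : e.getSuppressed()) {
        System.err.println("Suppressed during auto-close: " + suppressed);
    }
    e.printStackTrace();
}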
Don't use try-with-resources; use an explicit finally block instead.
Connection conn = null;
try {
    conn = dbHandler.getConnection();
    // do something with the connection
    conn.commit();
} catch (SQLException e) {
    // Handle rollback cases, retries, etc.
    System.err.println("SQL recovery performed.");
    e.printStackTrace();
} finally {
    if (conn != null) {
        try {
            conn.close();
        } catch (SQLException xSql) {
            System.out.println("Failed to close database connection.");
            xSql.printStackTrace();
        }
    }
}
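Alternatively, a nested try keeps try-with-resources and still separates the two cases: the inner catch consumes body exceptions, so anything reaching the outer catch can only have come from the implicit close(). A minimal sketch:

try (Connection conn = dbHandler.getConnection()) {
    try {
        // do something with the connection
        conn.commit();
    } catch (SQLException e) {
        // thrown by the body: rollback cases, retries, etc.
        System.err.println("SQL recovery performed.");
    }
} catch (SQLException e) {
    // the body's exceptions were consumed above,
    // so this can only come from the implicit close()
    System.err.println("Too bad!");
    e.printStackTrace();
}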

How not to fail Hadoop MapReduce job for one database insert failure?

I'm writing a MapReduce job to mine webserver logs. The input is from text files; output goes to a MySQL database. The problem is, if one record fails to insert for whatever reason, such as data exceeding the column size, the whole job fails and nothing gets written to the database. Is there a way to make sure the good records are still persisted? I guess one way would be to validate the data, but that couples the client with the database schema too much for my taste.
I'm not posting the code because this is not particularly a code issue.
Edit:
Reducer:
protected void reduce(SkippableLogRecord rec,
        Iterable<NullWritable> values, Context context) {
    String path = rec.getPath().toString();
    path = path.substring(0, Math.min(path.length(), 100));
    try {
        context.write(new DBRecord(rec), NullWritable.get());
        LOGGER.info("Wrote record {}.", path);
    } catch (IOException | InterruptedException e) {
        LOGGER.error("There was a problem when writing out {}.", path, e);
    }
}
Log:
15/03/01 14:35:06 WARN mapred.LocalJobRunner: job_local279539641_0001
java.lang.Exception: java.io.IOException: Data truncation: Data too long for column 'filename' at row 1
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
Caused by: java.io.IOException: Data truncation: Data too long for column 'filename' at row 1
at org.apache.hadoop.mapreduce.lib.db.DBOutputFormat$DBRecordWriter.close(DBOutputFormat.java:103)
at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.close(ReduceTask.java:550)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:629)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
15/03/01 14:35:06 INFO mapred.LocalJobRunner: reduce > reduce
15/03/01 14:35:07 INFO mapreduce.Job: Job job_local279539641_0001 failed with state FAILED due to: NA
Answering my own question and looking at this SO post, I see that the database write is done in a batch and, on SQLException, the transaction is rolled back, which explains my problem. I guess I'll just have to make the DB columns big enough, or validate first. I could also create a custom DBOutputFormat/DBRecordWriter, but unless I insert one record at a time, there will always be a risk of one bad record causing the whole batch to roll back. Here is the close() method of DBOutputFormat's record writer that does the batch write:
public void close(TaskAttemptContext context) throws IOException {
    try {
        LOG.warn("Executing statement:" + statement);
        statement.executeBatch();
        connection.commit();
    } catch (SQLException e) {
        try {
            connection.rollback();
        } catch (SQLException ex) {
            LOG.warn(StringUtils.stringifyException(ex));
        }
        throw new IOException(e.getMessage());
    } finally {
        try {
            statement.close();
            connection.close();
        } catch (SQLException ex) {
            throw new IOException(ex.getMessage());
        }
    }
}
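For illustration, a hypothetical lenient writer along those lines could execute and commit each record individually, so one bad row is logged and skipped instead of rolling back the whole batch (LenientRecordWriter and its constructor wiring are made up for this sketch, not Hadoop API):

import java.io.IOException;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.db.DBWritable;

public class LenientRecordWriter<K extends DBWritable, V> extends RecordWriter<K, V> {
    private final Connection connection;
    private final PreparedStatement statement;

    public LenientRecordWriter(Connection connection, PreparedStatement statement)
            throws SQLException {
        this.connection = connection;
        this.statement = statement;
        connection.setAutoCommit(false);
    }

    @Override
    public void write(K key, V value) throws IOException {
        try {
            key.write(statement);  // bind this record's fields
            statement.execute();   // no addBatch(): insert right away
            connection.commit();   // each good row is persisted on its own
        } catch (SQLException e) {
            // one bad row (e.g. a truncation error) only costs this record
            try {
                connection.rollback();
            } catch (SQLException ignored) {
            }
        }
    }

    @Override
    public void close(TaskAttemptContext context) throws IOException {
        try {
            statement.close();
            connection.close();
        } catch (SQLException e) {
            throw new IOException(e);
        }
    }
}

Per-row commits are of course far slower than a single batch, so this trade-off only makes sense when losing the whole batch is worse than the throughput hit.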

calling ExecuteReader() from catch block

I have the following code, and when an exception happens, the ExecuteReader() in the catch block hangs the app.
My question is: why the hang? Can I not, in general, perform a query inside the catch block if a query exception has happened?
try {
    // some SQL queries
}
catch (SqlException odbcEx) {
    // do some queries with IDbCommand::ExecuteReader()
}
catch (Exception ex) {
    // Handle generic ones here.
}
Thanks,
ExecuteReader() keeps hold of your SQL connection. What you want to do is wrap a using statement around it. Also, you can't perform a SQL query because you have essentially errored and lost scope of your SQL connection variable. If you want, you can do further SQL in the exception block by instantiating a new instance of your reader and connection, but ideally close your existing connection before doing so. If you use a DataTable you won't keep hold of the SQL connection, which is perhaps something to look at.
For example:
using (var conn = new SqlConnection("ConnectionString"))
{
    try {
        // some SQL queries
    }
    catch (SqlException odbcEx) {
        // do some queries with IDbCommand::ExecuteReader()
    }
    catch (Exception ex) {
        // Handle generic ones here.
    }
    finally {
        conn.Close();
    }
}
This way you are disposing of your connection and not keeping hold of it.
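The same advice translates to JDBC, the language used elsewhere on this page: don't reuse the failed connection in the catch block, open a fresh one. A minimal sketch (dataSource and the trivial queries are placeholders):

try (Connection conn = dataSource.getConnection();
     PreparedStatement ps = conn.prepareStatement("SELECT 1")) {
    try (ResultSet rs = ps.executeQuery()) {
        // process results
    }
} catch (SQLException primary) {
    // don't reuse the failed connection; open a fresh one for follow-up work
    try (Connection recovery = dataSource.getConnection();
         PreparedStatement ps = recovery.prepareStatement("SELECT 1");
         ResultSet rs = ps.executeQuery()) {
        // diagnostic or recovery queries on a known-good connection
    } catch (SQLException secondary) {
        secondary.printStackTrace();
    }
}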

unexpected java.lang.NoClassDefFoundError when calling DriverManager.getConnection()

I am writing a Java application which needs to insert some data to MySQL database through JDBC. Here's the related code:
public JDBCDecoder() {
    try {
        Class.forName("com.mysql.jdbc.Driver");
        System.out.println("Loaded MySQL JDBC driver");
    } catch (ClassNotFoundException e) {
        System.out.println("Exception attempting to load MySQL JDBC driver");
    }
    String url = "jdbc:mysql://localhost/db";
    Properties props = new Properties();
    props.put("user", "root");
    props.put("password", "root");
    try {
        conn = DriverManager.getConnection(url, props);
        conn.setAutoCommit(false);
    } catch (SQLException e) {
        Throwables.propagate(e);
    }
    // ...
}
Here's the error stack trace that I got after trying to run the code:
java.lang.NoClassDefFoundError: Could not initialize class oracle.jdbc.OracleDriver
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:249)
at java.sql.DriverManager.getCallerClass(DriverManager.java:477)
at java.sql.DriverManager.getConnection(DriverManager.java:576)
at java.sql.DriverManager.getConnection(DriverManager.java:154)
at exportclient.JDBCExportClient$JDBCDecoder.<init>(JDBCExportClient.java:179)
at exportclient.JDBCExportClient.constructExportDecoder(JDBCExportClient.java:604)
at export.processors.GuestProcessor$1.run(GuestProcessor.java:113)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at utils.CoreUtils$1$1.run(CoreUtils.java:259)
at java.lang.Thread.run(Thread.java:680)
which seems weird to me because: 1) I am not trying to connect to an Oracle database; 2) I actually do have ojdbc6.jar (which contains oracle.jdbc.OracleDriver) in my classpath. So I am completely clueless as to why this error would happen.
Any suggestion will be appreciated. Thanks in advance!
See if this has something to do with your problem: "As part of its initialization, the DriverManager class will attempt to load the driver classes referenced in the "jdbc.drivers" system property."
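A quick way to test that theory is to inspect the property before anything touches JDBC (the property name comes from the DriverManager documentation; the workaround below is illustrative):

// DriverManager's static initializer loads every class listed in
// the jdbc.drivers system property, so inspect it before first use
String drivers = System.getProperty("jdbc.drivers");
System.out.println("jdbc.drivers = " + drivers);
if (drivers != null && drivers.contains("oracle.jdbc.OracleDriver")) {
    // illustrative workaround: stop DriverManager from loading it at all
    System.clearProperty("jdbc.drivers");
}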

hibernate: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException

When saving an object to the database using Hibernate, it sometimes fails because certain fields in the object exceed the maximum varchar length defined in the database.
Therefore I am using the following approach:
Attempt to save
If I get a DataException, I truncate the fields in the object to the max length specified in the DB definition, then try to save again.
However, in the second save after truncation, I'm getting the following exception:
hibernate: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Cannot add or update a child row: a foreign key constraint fails
Here's the relevant code, do you notice anything wrong with it?
public static void saveLenientObject(Object rec) {
    try {
        save2(rec);
    } catch (org.hibernate.exception.DataException e) {
        e.printStackTrace();
        try {
            saveLenientObject(rec, e); // truncate and retry once
        } catch (Exception retryFailure) {
            retryFailure.printStackTrace();
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}

private static void saveLenientObject(Object rec, DataException e) throws Exception {
    Util.truncateObject(rec);
    System.out.println("after truncation");
    save2(rec);
}
public static void save2(Object obj) throws Exception {
    try {
        beginTransaction();
        getSession().save(obj);
        commitTransaction();
    } catch (Exception e) {
        e.printStackTrace();
        rollbackTransaction();
        throw e;
    } finally {
        closeSession();
    }
}
All Hibernate exceptions (except for NonUniqueResultException) are irrecoverable. If you get an exception you should close the session and open another one for further operations.
See also:
13.2.3. Exception handling
The Hibernate documentation is quite clear that once an exception is thrown, the session is left in an inconsistent state and is not safe to use for further operations. I suspect that what you're getting here is that the session is left halfway through saving your first attempt, so bad things happen.
Fundamentally, you should not rely on database errors to check the length of your fields; instead, you should pre-validate this in Java code. If you know the lengths well enough to truncate, then I suggest you simply call your truncate util every time.
Alternatively use Hibernate Validator to declaratively validate the objects before saving.
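For illustration, a minimal sketch of that approach with the standard Bean Validation API (the Customer entity and its field are made up):

import java.util.Set;
import javax.validation.ConstraintViolation;
import javax.validation.Validation;
import javax.validation.Validator;
import javax.validation.constraints.Size;

public class PreValidateExample {

    // Hypothetical entity: @Size mirrors the VARCHAR(255) column definition
    public static class Customer {
        @Size(max = 255)
        public String name;
    }

    private static final Validator VALIDATOR =
            Validation.buildDefaultValidatorFactory().getValidator();

    // Returns true only if the object satisfies its declared constraints,
    // so the session never sees a row the database would reject
    public static boolean isValid(Object entity) {
        Set<ConstraintViolation<Object>> violations = VALIDATOR.validate(entity);
        return violations.isEmpty();
    }
}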