jdbc batch different statements - mysql

Now I've been searching this all over, i get the same answer. What I want is to have different statements batched under one variable using jdbc in java. So far what I get is batching statements that have the same pattern, e.g, INSERT INTO table VALUES('?','?'). This can be done using a preparedstatement. But I have tried to batch different types of statements using java.sql.Statement and they executed well. for example an update and an insert under one statement, commit once. But now the problem with java.sql.Statement is that it does now do what preparedStatement does, what people call escaping. Again the problem with preparedStatement is it only batches statements of the same pattern, as in, you can't update and insert. it has to be one of the two.
So now I thought I would use java.sql.Statement, but is there a library that does what preparedStatement does,String escaping to avoid Sql injection. Also, if I am mistakening batching with another terminology that I may not know, rather correct me and tell me what I am wanting to do is called, that is, to execute multiple different statements under one java.sql.Statement.
One last thing, when batching i realized there is no validation of syntax, which I wouldn't want, all errors are checked during executing, this might also fall under a library that can validate Sql.

Whatever you have mentioned is correct.
You can batch similar set of statements and can get executed at once. But as far as my knowledge there is no library in java which groups or batches different kinds of statements together and gets executed.
The last thing I want to tell is that the sql statement will be compiled only once when you are using the PreparedStatement object, if any errors in the sql statement, will be thrown, otherwise the statement will gets executed. If the same statement is sent to the database again with different values, the statement will not be compiled and simply executed by the database server.

Since you're looking for a library to do this kind of thing, yes, jOOQ can do it for you via its BatchedConnection, and you don't even have to use jOOQ's DSL to access this feature, though it works with the DSL as well. The following code snippet illustrates how this works.
Let's assume you have this logic:
// This is your original JDBC connection
try (Connection connection = ds.getConnection()) {
doSomethingWith(connection);
}
// And then:
void doSomethingWith(Connection c) {
try (PreparedStatement s = c.prepareStatement("INSERT INTO t (a, b) VALUES (?, ?)")) {
s.setInt(1, 1);
s.setInt(1, 2);
s.executeUpdate();
}
try (PreparedStatement s = c.prepareStatement("INSERT INTO t (a, b) VALUES (?, ?)")) {
s.setInt(1, 3);
s.setInt(1, 4);
s.executeUpdate();
}
try (PreparedStatement s = c.prepareStatement("INSERT INTO u (x) VALUES (?)")) {
s.setInt(1, 1);
s.executeUpdate();
}
try (PreparedStatement s = c.prepareStatement("INSERT INTO u (x) VALUES (?)")) {
s.setInt(1, 2);
s.executeUpdate();
}
}
Now, instead of re-writing your code, you can simply wrap it with jOOQ glue code:
// This is your original JDBC connection
try (Connection connection = ds.getConnection()) {
// Now wrap that with jOOQ and turn it into a "BatchedConnection":
DSL.using(connection).batched(c -> {
// Retrieve the augmented connection again from jOOQ and run your original logic:
c.dsl().connection(connection2 -> {
doSomethingWith(connection2);
});
});
}
Now, whatever your method doSomethingWith() is doing with a JDBC Connection, it is now getting batched as good as possible, i.e. the first two inserts are batched together, and so are the third and fourth one.

Related

SELECT...FOR UPDATE using Anorm

I'm trying to write a SQL SELECT...FOR UPDATE using Anorm in Play so that I can have multiple threads interact with the same database, but it's throwing an issue.
The code is:
db.withConnection { implicit connection: Connection =>
SQL"""
start transaction;
select * from push_messages where vendorId=$vendorId for update;
UPDATE push_messages set stageOne=$first, stageTwo=$second, stageThree=$third,
stageFour=$fourth, stageFive=$fifth, highestStage=$highestStage, iOSTotal=$iOSTotal,
androidTotal=$androidTotal, iOSRunningCount=$iOSRunningCount, androidRunningCount=$androidRunningCount,
problem=$newProblem, iOSComplete=$iOSCompleted, androidComplete=$newAndroidComplete,
totalStageThrees=$totalStageThrees, totalStageFours=$totalStageFours, expectedTotals=$expectedTotals,
startTime=$startTime, date=$date, topics=$topics, androidFailures=$androidFailures, iOSFailures=$iOSFailures where vendorId=$vendorId;
commit;
""".execute
}
But, it doesn't seem to like the use of .execute on the select statement. Is there a good way to break this up to do the select...for update so that I can use either execute() or executeUpdate?
Any and all help would be appreciate. Thanks.
As most JDBC base library, Anorm is using PreparedStatement to safely interact with the DB, so you should not pass it such multi statement string, but only a single statement to each SQL call.
Moreover, about start transaction, you'd better use the JDBC way for it (e.g. using Play DB DB.withTransaction { ... }).

Does Statement.RETURN_GENERATED_KEYS generate any extra round trip to fetch the newly created identifier?

JDBC allows us to fetch the value of a primary key that is automatically generated by the database (e.g. IDENTITY, AUTO_INCREMENT) using the following syntax:
PreparedStatement ps= connection.prepareStatement(
"INSERT INTO post (title) VALUES (?)",
Statement.RETURN_GENERATED_KEYS
);
while (resultSet.next()) {
LOGGER.info("Generated identifier: {}", resultSet.getLong(1));
}
I'm interested if the Oracle, SQL Server, postgresQL, or MySQL driver uses a separate round trip to fetch the identifier, or there is a single round trip which executes the insert and fetches the ResultSet automatically.
It depends on the database and driver.
Although you didn't ask for it, I will answer for Firebird ;). In Firebird/Jaybird the retrieval itself doesn't require extra roundtrips, but using Statement.RETURN_GENERATED_KEYS or the integer array version will require three extra roundtrips (prepare, execute, fetch) to determine the columns to request (I still need to build a form of caching for it). Using the version with a String array will not require extra roundtrips (I would love to have RETURNING * like in PostgreSQL...).
In PostgreSQL with PgJDBC there is no extra round-trip to fetch generated keys.
It sends a Parse/Describe/Bind/Execute message series followed by a Sync, then reads the results including the returned result-set. There's only one client/server round-trip required because the protocol pipelines requests.
However sometimes batches that can otherwise be streamed to the server may be broken up into smaller chunks or run one by on if generated keys are requested. To avoid this, use the String[] array form where you name the columns you want returned and name only columns of fixed-width data types like integer. This only matters for batches, and it's a due to a design problem in PgJDBC.
(I posted a patch to add batch pipelining support in libpq that doesn't have that limitation, it'll do one client/server round trip for arbitrary sized batches with arbitrary-sized results, including returning keys.)
MySQL receives the generated key(s) automatically in the OK packet of the protocol in response to executing a statement. There is no communication overhead when requesting generated keys.
In my opinion even for such a trivial thing a single approach working in all database systems will fail.
The only pragmatic solution is (in analogy to Hibernate) to find the best working solution for each target RDBMS (and
call it a dialect of your one for all solution:)
Here the information for Oracle
I'm using a sequence to generate key, same behavior is observed for IDENTITY column.
create table auto_pk
(id number,
pad varchar2(100));
This works and use only one roundtrip
def stmt = con.prepareStatement("insert into auto_pk values(auto_pk_seq.nextval, 'XXX')",
Statement.RETURN_GENERATED_KEYS)
def rowCount = stmt.executeUpdate()
def generatedKeys = stmt.getGeneratedKeys()
if (null != generatedKeys && generatedKeys.next()) {
def id = generatedKeys.getString(1);
But unfortunately you get ROWID as a result - not the generated key
How is it implemented internally? You can see it if you activate a 10046 trace (BTW this is also the best way to see
how many roundtrips were performed)
PARSING IN CURSOR
insert into auto_pk values(auto_pk_seq.nextval, 'XXX')
RETURNING ROWID INTO :1
END OF STMT
So you see the JDBC Standard 3.0 is implemented, but you don't get a requested result. Under the cover is used the
RETURNING clause.
The right approach to get the generated key in Oracle is therefore:
def stmt = con.prepareStatement("insert into auto_pk values(auto_pk_seq.nextval, 'XXX') returning id into ?")
stmt.registerReturnParameter(1, Types.INTEGER);
def rowCount = stmt.executeUpdate()
def generatedKeys = stmt.getReturnResultSet()
if (null != generatedKeys && generatedKeys.next()) {
def id = generatedKeys.getLong(1);
}
Note:
Oracle Release 12.1.0.2.0
To activate the 10046 trace use
con.createStatement().execute "alter session set events '10046 trace name context forever, level 12'"
con.createStatement().execute "ALTER SESSION SET tracefile_identifier = my_identifier"
Depending on frameworks or libraries to do things that are perfectly possible in plain SQL is bad design IMHO, especially when working against a defined DBMS. (The Statement.RETURN_GENERATED_KEYS is relatively innocuous, although it apparently does raise a question for you, but where frameworks are built on separate entities and doing all sorts of joins and filters in code or have custom-built transaction isolation logic things get inefficient and messy very quickly.)
Why not simply:
PreparedStatement ps= connection.prepareStatement(
"INSERT INTO post (title) VALUES (?) RETURNING id");
Single trip, defined result.

how to insert data into mysql database using spark java framework

I am new to the spark java framework, how to insert the values into mysql database using sparkjava framework?
Assuming you have already read the data you want to insert into an RDD, You can use the following code to insert the records into database.
rdd.foreach(new VoidFunction<String>() {
#Override
public void call(String s) throws Exception {
//You Code to parse the String and insert the values into MYSQL.
}
});
To build on Shivanand's answer, you really want to use a foreachPartition as opposed to foreach. With foreach, you will be opening a db connection for every element, as opposed to once per partition. This is beneficial for a couple reasons, but most importantly its going to take sometime to open the connection. That overhead will then be carried over for every element. You will also be trying to open a lot of connections, and will probably make one of the admins pretty mad when they see potentially millions of db connection requests.
If you have already read the file, then use the foreachpartition function. It will help you to insert it into MySQL.
Take Example-
rdd.foreachpartition(new VoidFunctioin<String> x)
{
//Here make a connection enter code here
public void call(Iterator<String> x)
{
Connection c=(Connection)DriverManager.getConnection("your conncetion name,localhost,dbname");
while(x.hasNext())
{
String it=x.next();
PreparedStatement ps=c.preparestatement("Insert into table_name (coln_name) values ("?")");
ps.execute();
}
}
});
It will insert it into MySQL.

Spring Batch: ItemProcessor query Database?

I have a scenario where I need to parse flat files and process those records into mysql database inserts (schema already exists).
I'm using the FlatFileItemReader to parse the files and a JdbcCursorItemWriter to insert in the database.
I'm also using an ItemProcessor to convert any column values or skip records that I don't want.
My problem is, some of those inserts need to have a foreign key to some other table that already has data into it.
So I was thinking to do a select to retrieve the ID and update the pojo, inside the ItemProcessor logic.
Is this the best way to do it? I can consider alternatives as I'm just beginning to write all this.
Thanks!
The ItemProcessor in a Spring Batch step is commonly used for enrichment of data and querying a db for something like that is common.
For the record, another option would be to use a sub select in your insert statement to get the foreign key value as the record is being inserted. This may be a bit more performant give it removes the additional db hit.
for the batch process - if you require any where you can call use below method anywhere in batch using your batch listeners
well the below piece of code which I wrote , worked for me --
In you Main class - load your application context in a static variable - APP_CONTEXT
If you are not using XML based approach - then get the dataSource by auto-wiring it and then you can use below code -
Connection conn = null;
PreparedStatement pstmt= null;
try {
DataSource dataSource = (DataSource) Main.APP_CONTEXT
.getBean("dataSource");
conn = dataSource.getConnection();
pstmt = conn.prepareStatement(" your SQL query to insert ");
pstmtMstr.executeQuery();
} catch (Exception e) {
}finally{
if(pstmt!=null){
pstmt.close();
}if(conn!=null){
conn.close();
}
}

Generic SQL for update / insert

I'm writing a DB layer which talks to MS SQL Server, MySQL & Oracle. I need an operation which can update an existing row if it contains certain data, otherwise insert a new row; All in one SQL operation.
Essentially I need to save over existing data if it exists, or add it if it doesn't
Conceptually this is the same as upsert except it only needs to work on a single table. I'm trying to make sure I don't need to delete then insert as this has a performance impact.
Is there generic SQL to do this or do I need vendor specific solutions?
Thanks.
You need vendor specific SQL as MySQL (unlike MS and Oracle) doesn't support MERGE
http://en.wikipedia.org/wiki/Merge_(SQL)
I suspect that sooner rather than later, you're going to need a vendor specific implementation of your DB layer - SQL portability is pretty much a myth as soon as you do anything even slightly advanced.
I am pretty sure this is going to be vendor specific. For SQL Server, you can accomplish this using the MERGE statement.
If you are using SQL Server 2008, use Merge Statement. But keep in mind that if your Insert part has some condition involve, then it cannot be used. In which case you need to write your own way for accomplishing this. And in your case it has to be since you are involving MySQL which does not have a Merge Statement.
Why are you not using an ORM layer (like Entity Framework) for this purpose?
Just some pseudo code(in C#)
public int SaveTask(tblTaskActivity task, bool isInsert)
{
int result = 0;
using (var tmsEntities = new TMSEntities())
{
if (isInsert) //for insert
{
tmsEntities.AddTotblTaskActivities(task);
result = tmsEntities.SaveChanges();
}
else //for update
{
var taskActivity = tmsEntities.tblTaskActivities.Where(i => i.TaskID == task.TaskID).FirstOrDefault();
taskActivity.Priority = task.Priority;
taskActivity.ActualTime = task.ActualTime;
result = tmsEntities.SaveChanges();
}
}
return result;
}
In MySQL you have something similar to merge:
insert ... on duplicate key update ...
MySQL Reference - Insert on duplicate key update