Best way to perform insert/delete batch spring-jdbc/mysql - mysql

First, I will describe what I want to do, and then ask my questions.
I need to do the following:
1. List all rows matching some conditions.
2. Run some tests (e.g. check that the row wasn't already inserted); if the tests pass, insert the row into another database.
3. Delete the row (whether it passed the tests or not).
The following is my implementation:
List<MyObject> toAdd = new ArrayList<MyObject>();
for (MyObject obj : list) {
    // keep only the rows that have not been inserted yet
    if (notYetInserted(obj)) {
        toAdd.add(obj);
    }
}
myObjectDAO.insertList(toAdd);
myObjectDAO.deleteList(list);
The service method is marked transactional.
The DAO methods deleteList and insertList are very similar, so I will only show the insert method here.
public void insertList(final List<MyObject> list) {
    String sql = "INSERT INTO table_test " +
            "(col_id, col2, col3, col4) VALUES (?, ?, ?, ?)";
    List<Object[]> paramList = new ArrayList<Object[]>();
    for (MyObject myObject : list) {
        paramList.add(new Object[] {myObject.getColId(),
                myObject.getCol2(), myObject.getCol3(), myObject.getCol4()}
        );
    }
    simpleJdbcTemplate.batchUpdate(sql, paramList);
}
I am not sure this is the best way to perform such operations. I read here that calling update inside a loop may slow down the system (especially in my case, with about 100K inserts/deletes at a time). I wonder whether these additional loops inside the DAO will slow my system down even more, and what would happen if a problem occurred repeatedly while processing the batch. (I also thought about moving the test from the service to the DAO, to have only one loop plus an additional test, but I don't know whether that is a good idea.) So I would like your advice. Thanks a lot.
PS: if you need more details, feel free to ask!

This is not necessarily a bad approach, but you are right, it might be really slow. If I were to do a process like this that inserted or deleted this many rows I would probably do it all in a stored procedure. My code would just execute the proc and the proc would handle the list and the enumeration through it (as well as the inserts and deletes).
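If the work stays in Java rather than moving to a stored procedure, one common way to keep a batch of about 100K rows manageable is to split it into fixed-size chunks and call the DAO once per chunk. The following is only a minimal sketch: the class name BatchChunks, the partition helper and the chunk size of 1,000 are assumptions made for illustration, not part of the question's code.

```java
import java.util.ArrayList;
import java.util.List;

class BatchChunks {

    /** Splits items into consecutive sublists of at most chunkSize elements. */
    static <T> List<List<T>> partition(List<T> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            // copy the view so each chunk is independent of the source list
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            ids.add(i);
        }
        List<List<Integer>> chunks = partition(ids, 1_000);
        // each chunk would be handed to one insertList/deleteList call
        System.out.println(chunks.size() + " chunks"); // prints "100 chunks"
    }
}
```

In the service method, each chunk could then be passed to insertList and deleteList in turn, so no single batchUpdate call has to carry all 100K parameter arrays.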

Related

Changes done in a hibernate session in one method are not visible to select query using criteria within the same session

I have a table A with many columns including a column "c".
In a method, I update the value of "c" for row "r1" to "c1" and in one of the subsequent methods (still running in the same thread), I try to read all rows with value of "c" equal to "c1" using hibernate's criteria.
The code snippet is shown below:
@Transactional
public void updateA(long id, long c1)
{
    Session currentSession = sessionFactory.getCurrentSession();
    A a1 = (A) currentSession.get(A.class.getName(), id);
    a1.setC(c1);
    currentSession.saveOrUpdate(a1);
}
@Transactional
public List<A> getAllAsForGivenC(long c1)
{
    Criteria criteria = sessionFactory.getCurrentSession().createCriteria(A.class.getName());
    // compare against the c1 parameter, not the string literal "c1"
    Criterion cValue = Restrictions.eq("c", c1);
    criteria.add(cValue);
    return criteria.list();
}
But when the method getAllAsForGivenC executes, "r1" row is not returned. Both methods run in the same thread and use same hibernate session. Why is getAllAsForGivenC not able to see the row updated in updateA()? What am I doing wrong?
P.S: I run this on MySQL DB (if that matters)
Thanks in advance,
Shobhana
Call session.flush() between your method calls and then try again.
e.g.
updateA(1L, 2L);
// flush pending changes to the database
session.flush();
getAllAsForGivenC(2L);
--Update--
As the documentation says, the flush process occurs by default at the following points:
- before some query executions
- from org.hibernate.Transaction.commit()
- from Session.flush()
Except when you explicitly flush(), there are absolutely no guarantees about when the Session executes the JDBC calls, only the order in which they are executed.
Flushing does not happen before every query! Remember, the purpose of the Hibernate session is to minimize the number of writes to the database, so it will avoid flushing to the database if it thinks that it isn’t needed.
It would have been more intuitive if the framework authors had chosen to name the default mode, FlushMode.AUTO, FlushMode.SOMETIMES instead.
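The effect of an unflushed change can be illustrated with a toy model. The sketch below is not Hibernate's API: ToySession, its two maps and findIdsByC are hypothetical names invented here, and the model assumes that queries read only what has already been written to the database.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a write-behind session; NOT Hibernate's API. */
class ToySession {
    private final Map<Long, Long> database = new HashMap<>(); // written rows: id -> c
    private final Map<Long, Long> pending = new HashMap<>();  // changes held in memory

    /** Records an update in memory only. */
    void update(long id, long c) {
        pending.put(id, c);
    }

    /** Writes all pending changes to the "database". */
    void flush() {
        database.putAll(pending);
        pending.clear();
    }

    /** Queries the "database" only, so unflushed changes are invisible. */
    List<Long> findIdsByC(long c) {
        List<Long> ids = new ArrayList<>();
        for (Map.Entry<Long, Long> row : database.entrySet()) {
            if (row.getValue() == c) {
                ids.add(row.getKey());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        ToySession session = new ToySession();
        session.update(1L, 2L);
        System.out.println(session.findIdsByC(2L)); // prints []
        session.flush();
        System.out.println(session.findIdsByC(2L)); // prints [1]
    }
}
```

The real Session is more sophisticated (it may flush on its own before some queries), but the explicit flush() plays the same role as in this model.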

After issuing a DbQuery with an insert, how can I get the generated id, so I can add it back to the DAO?

It's more than just insert, really. If I already have a partially loaded DAO, how can I load the rest of it?
What I'm planning to do is run a select query and then use BeanCopy. I'd rather have the result set mapper set the properties on the DAO directly.
Ok, let me try to answer this. For all generated values (like auto-generated IDs) you can use the following flow:
q = DbEntitySql.insert(foo).query();
// ... or any other way to get DbQuery
q.setGeneratedColumns("ID");
q.executeUpdate();
DbOomUtil.populateGeneratedKeys(dao, q);
Basically, for each query/DAO you need to specify which fields are autogenerated. Currently there is no annotation for doing so; we are trying to keep the number of annotations as small as possible. We are working on making this more automatic.
Now, for populating the DAO: I would not use BeanCopy. Simply load a new DAO instance and ditch the old one. After you execute the full select query you will have the fully loaded DAO, and you can just continue with it.

LINQ-to-SQL performance question

I am getting an IQueryable from my database and then getting another IQueryable from that first one; that is, I am filtering the first one.
My question is: does this affect performance? How many times will the code call the database? Thank you.
Code:
DataContext _dc = new DataContext();
IQueryable offers =
    (from o in _dc.Offers
     select o);
IQueryable filtered =
    (from o in offers
     select new { ... } );
return View(filtered);
The code you have given will never call the database since you're never using the results of the query in any code.
IQueryable collections aren't filled until you iterate through them...and you're not iterating through anything in that code sample (ah, the beauty of lazy initialization).
That also means the two statements do not each cost a database round trip: filtered is composed on top of offers, so when it is eventually iterated the provider translates the combined expression into a single query, and the extra filtering step adds no performance cost of its own.
SO is not a replacement for developer tools. There are many good free tools able to tell you exactly what this code translates into and how it works. Use Reflector on this method and look at what code is generated and reason for yourself what is going on from there.
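Deferred execution of this kind is not specific to LINQ. As a rough analogy only (Java streams are not LINQ to SQL and never talk to a database), the sketch below counts how many elements are actually evaluated before and after the pipeline is consumed. DeferredDemo, buildQuery and the sample data are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DeferredDemo {
    // counts how many source elements have actually been evaluated
    static final AtomicInteger evaluations = new AtomicInteger();

    /** Builds a filtered pipeline; nothing is evaluated at this point. */
    static Stream<String> buildQuery(List<String> offers) {
        return offers.stream()
                .peek(o -> evaluations.incrementAndGet())
                .filter(o -> o.startsWith("A"));
    }

    public static void main(String[] args) {
        Stream<String> query = buildQuery(Arrays.asList("Apple", "Banana", "Avocado"));
        System.out.println(evaluations.get()); // prints 0: only the pipeline was built

        // the terminal operation is what finally pulls the elements through
        List<String> result = query.collect(Collectors.toList());
        System.out.println(evaluations.get()); // prints 3
        System.out.println(result);            // prints [Apple, Avocado]
    }
}
```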

How to find number of rows affected in LINQ to SQL?

Does anybody know how to find the number of rows affected AFTER I have submitted changes to the data context in LINQ to SQL?
At first I was doing something like this:
Using db As New MyDataContext()
    db.Users.Attach(modifiedUser, True)
    db.SubmitChanges()
    Dim rowsUpdated As Integer = db.GetChangeSet().Updates.Count
End Using
I have since figured out that this doesn't work and that
db.GetChangeSet().Updates.Count
only tells you how many updates there will be BEFORE you call SubmitChanges().
Is there any way to find out how many rows have actually been affected?
L2S issues individual insert/update/delete statements for each row affected, so counting the entities in the GetChangeSet results before calling SubmitChanges will give you the correct 'rows affected' numbers*.
If any row cannot be updated due to a change conflict or similar, you'll get an exception during SubmitChanges, and the transaction will be rolled back.
(* = ...with one exception: if you have any updatable views with instead-of triggers, you could potentially have a situation where the instead-of trigger hits multiple underlying rows for every row updated. But that is a bit of an edge case... :) )
I have not worked on LINQ to SQL. But, I think it might not be possible.
The reason that comes to mind is: you could do updates to multiple entities before calling SubmitChanges. So, which "records affected" you are looking for won't be known, I guess.
Because a number of different operations could potentially be committed, it's highly unlikely that you'll be able to get back that kind of information.
The SubmitChanges() command will commit inserts, updates and deletes and as far as I can tell there's no way to retrieve the number of rows affected for each (# rows deleted/updated/inserted etc). All you can do is see what is going to be committed, as you've already discovered.
If you had one operation in particular you wanted to perform you could use the ExecuteCommand() method which returns the affected row count.
Add this extension method to your app:
/// <summary>
/// Saves all changes made in this context to the underlying database.
/// </summary>
/// <returns>The number of inserts, updates and deletes that were submitted.</returns>
/// <exception cref="System.InvalidOperationException">
/// Thrown when submitting the changes fails.
/// </exception>
public static int SaveChanges(this System.Data.Linq.DataContext context)
{
    try
    {
        // Count the pending changes before they are submitted.
        var before = context.GetChangeSet();
        int count1 = before.Inserts.Count + before.Updates.Count + before.Deletes.Count;
        context.SubmitChanges();
        // Count whatever is still pending after the submit (normally nothing).
        var after = context.GetChangeSet();
        int count2 = after.Inserts.Count + after.Updates.Count + after.Deletes.Count;
        return count1 - count2;
    }
    catch (Exception e)
    {
        // Keep the original exception as InnerException so the stack trace is not lost.
        throw new InvalidOperationException(e.Message, e);
    }
}
Make the calls in the sequence below and you will get the count:
Dim rowsUpdated As Integer = db.GetChangeSet().Updates.Count
db.SubmitChanges()

Linq to Sql: Can the DataContext instance return collections that include pending changes?

Consider the following code block:
using (PlayersDataContext context = new PlayersDataContext())
{
    Console.WriteLine(context.Players.Count()); // will output 'x'
    context.Players.InsertOnSubmit(new Player {FirstName = "Vince", LastName = "Young"});
    Console.WriteLine(context.Players.Count()); // will also output 'x'; but I'd like to output 'x' + 1
}
Given that I haven't called
context.SubmitChanges();
the application will output the same player count both before and after the InsertOnSubmit statement.
My two questions:
Can the DataContext instance return collections that include pending changes?
Or must I reconcile the DataContext instance with context.GetChangeSet()?
Sure, use:
context.GetChangeSet()
and for more granularity, there are members for Inserts, Updates, and Deletes.
EDIT: I understand your new question now. Yes, if you wanted to include changes in the collection, you would have to somehow combine the collections returned by GetChangeSet() and your existing collections.