jOOQ batchInsert().execute() - configuration

My process looks like this:
select some data, 50 rows per select,
do something with the data (set some values),
transform each row into an object of another table,
call batchInsert(myListOfRecords).execute().
My problem is how to control when the data is inserted. In my current setup, data is only inserted at the end of my loop. This is a problem because I want to process much more data than I do in my tests, so the process would eventually fail with an OutOfMemoryError. Where should I define the maximum amount of data in a batch before the insert is executed?

The important thing here is to not fetch all the rows you want to process into memory in one go. When using jOOQ, this is done using ResultQuery.fetchLazy() (possibly along with ResultQuery.fetchSize(int)). You can then fetch the next 50 rows using Cursor.fetchNext(50) and proceed with your insertion as follows:
try (Cursor<?> cursor = ctx
        .select(...)
        .from(...)
        .fetchSize(50)
        .fetchLazy()) {
    Result<?> batch;
    for (;;) {
        batch = cursor.fetchNext(50);
        if (batch.isEmpty())
            break;
        // Do your stuff here
        // Do your insertion here; the batch is only sent once execute() is called
        ctx.batchInsert(...).execute();
    }
}
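The same fetch-a-chunk, insert-a-chunk loop can be sketched without jOOQ to show why memory stays bounded. This is only an illustration under stated assumptions: the `Iterator` stands in for the lazy `Cursor`, the `Consumer` stands in for `ctx.batchInsert(...).execute()`, and `ChunkedBatcher` and `processInChunks` are hypothetical names, not part of jOOQ.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

final class ChunkedBatcher {

    /**
     * Drains rows in chunks of at most chunkSize and hands each chunk to sink.
     * Only one chunk is referenced by this method at any time.
     * Returns the number of chunks that were flushed.
     */
    static <T> int processInChunks(Iterator<T> rows, int chunkSize, Consumer<List<T>> sink) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int flushed = 0;
        List<T> chunk = new ArrayList<>(chunkSize);
        while (rows.hasNext()) {
            chunk.add(rows.next());
            if (chunk.size() == chunkSize) {
                sink.accept(chunk);                 // stand-in for the batch insert
                flushed++;
                chunk = new ArrayList<>(chunkSize); // start a fresh chunk
            }
        }
        if (!chunk.isEmpty()) {                     // flush the last, partial chunk
            sink.accept(chunk);
            flushed++;
        }
        return flushed;
    }

    public static void main(String[] args) {
        List<Integer> source = new ArrayList<>();
        for (int i = 0; i < 120; i++) {
            source.add(i);
        }
        int chunks = processInChunks(source.iterator(), 50,
                chunk -> System.out.println("inserting " + chunk.size() + " rows"));
        System.out.println(chunks + " chunks"); // 120 rows in chunks of 50 -> 3 chunks
    }
}
```

The point of the design is that the insert happens inside the loop, once per chunk, so no list ever grows beyond the chunk size.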

How to make sure the database is updated only at a fixed interval

I am working on a Spring application with Java 8.
I have a function that generates labels (PDF generation) asynchronously.
It contains a loop that usually runs more than 1000 times, generating more than 1000 PDF labels.
After every iteration we update the database to save the status: initially numberOfgeneratedCount=0 is saved, and after each label we increment the variable and update the table.
It is not necessary to save the incremented count to the database after every iteration; we only need to update it at fixed intervals, to reduce the load on the database.
Currently my code looks like this:
// Label is a database model class; labeldb is a variable of that type
// commonDao.saveLabelToDb saves a Label object
int numberOfgeneratedCount = 0;
labeldb.setProcessedOrderCount(numberOfgeneratedCount);
commonDao.saveLabelToDb(labeldb);
for (Order order : orders) {
    generated = true;
    try {
        // pdf generation code
    } catch (Exception e) {
        // catch block here
        generated = false;
    }
    if (generated) {
        numberOfgeneratedCount++;
        labeldb.setProcessedOrderCount(numberOfgeneratedCount);
        commonDao.saveLabelToDb(labeldb);
    }
}
To improve performance, we need to update the database only at intervals of 10 seconds. Any help would be appreciated.
I have done this using the following code. I am not sure whether this is a good solution; could someone please improve it using built-in functions?
int numberOfgeneratedCount = 0;
labeldb.setProcessedOrderCount(numberOfgeneratedCount);
commonDao.saveLabelToDb(labeldb);
// System.nanoTime() measures elapsed time and, unlike a second-of-day value, does not wrap at midnight
long lastSaveNanos = System.nanoTime();
for (Order order : orders) {
    generated = true;
    try {
        // pdf generation code
    } catch (Exception e) {
        // catch block here
        generated = false;
    }
    if (generated) {
        numberOfgeneratedCount++;
        labeldb.setProcessedOrderCount(numberOfgeneratedCount);
        long nowNanos = System.nanoTime();
        if (nowNanos - lastSaveNanos > 10_000_000_000L) { // more than 10 seconds
            lastSaveNanos = nowNanos;
            commonDao.saveLabelToDb(labeldb);
        }
    }
}
// Persist the final count, which the last interval check may have skipped
commonDao.saveLabelToDb(labeldb);
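The interval check can also be pulled out of the loop into a small helper. What follows is a minimal sketch, not a built-in Spring or JDK facility: `SaveThrottle` and `shouldSave` are hypothetical names, and the clock is passed in as a `LongSupplier` (an assumption made so the behaviour can be checked with a fake clock; in production it would be `System::nanoTime`).

```java
import java.util.function.LongSupplier;

final class SaveThrottle {
    private final long intervalNanos;
    private final LongSupplier clock;
    private long lastSave;

    /** The interval starts counting from the moment the throttle is created. */
    SaveThrottle(long intervalNanos, LongSupplier clock) {
        this.intervalNanos = intervalNanos;
        this.clock = clock;
        this.lastSave = clock.getAsLong();
    }

    /** Returns true, and restarts the interval, once at least intervalNanos have elapsed. */
    boolean shouldSave() {
        long now = clock.getAsLong();
        if (now - lastSave >= intervalNanos) {
            lastSave = now;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        long[] fakeNow = {0L};                       // fake clock, advanced by hand
        SaveThrottle throttle = new SaveThrottle(10, () -> fakeNow[0]);
        for (int i = 1; i <= 25; i++) {
            fakeNow[0] = i;
            if (throttle.shouldSave()) {
                System.out.println("save at t=" + i); // prints for t=10 and t=20
            }
        }
    }
}
```

In the label loop this would be used as `if (throttle.shouldSave()) commonDao.saveLabelToDb(labeldb);`, created with `new SaveThrottle(TimeUnit.SECONDS.toNanos(10), System::nanoTime)`, followed by one unconditional save after the loop so the final count is not lost.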

How to save the results of an optimal power flow in MATPOWER for multiple runs?

I am using MATPOWER for optimal power flow of the IEEE 30-bus system. I am changing the real power generation of a particular bus 12 times and want to save the result each time. However, only the result of the last run is kept in the result struct. My code is:
P=xlsread('C:\Users\User\Documents\MATLAB\output\sp.xlsx');
for h=1:12
    P(h);
    mpc.gen(NG,PG)=P(h);
    mpopt = mpoption('pf.alg', 'NR', 'verbose', 1, 'out.all', 0);
    results= runopf(mpc,mpopt);
end
You could store the result struct you get from each opf run in a struct array like so:
for h=1:12
    P(h);
    mpc.gen(NG,PG)=P(h);
    mpopt = mpoption('pf.alg', 'NR', 'verbose', 1, 'out.all', 0);
    results(h) = runopf(mpc,mpopt);
end
Addressing the results should then be possible by calling e.g. results(3).branch, or whichever field you want to evaluate.

EF6 Transaction

I have a scenario where I keep a running total of user deposits.
I am trying to implement a concurrency mechanism that ensures two concurrent operations cannot take place at the same time.
I could have used optimistic concurrency, but it seems it won't do the job in my case.
In my case, a new deposit transaction depends on the previous one, so I will have one read and one write in the database.
As I understand it, I should do something like this:
public DepositTransaction DepositAdd(int userId, decimal ammount)
{
    using (var cx = this.Database.GetDbContext())
    {
        using (var trx = cx.Database.BeginTransaction(System.Data.IsolationLevel.RepeatableRead))
        {
            try
            {
                //here the last deposit ammount is read and new created with same context
                var transaction = this.DepositTransaction(userId, ammount, SharedLib.BalanceTransactionType.Deposit, cx);
                trx.Commit();
                return transaction;
            }
            catch
            {
                trx.Rollback();
                throw;
            }
        }
    }
}
I spawn multiple threads that call the function, but it seems they are not able to see the last data committed by the previous call, nor does the function block and wait for the previous thread to complete.
After digging deeper into the SQL Server documentation, I found that the correct isolation level to achieve this is Serializable.

Possible multiple enumeration of IEnumerable when counting and skipping

I'm preparing data for a datatable in Linq2Sql.
This code is highlighted with a 'Possible multiple enumeration of IEnumerable' warning (in ReSharper):
// filtered is an IEnumerable or an IQueryable
var total = filtered.Count();
var displayed = filtered
.Skip(param.iDisplayStart)
.Take(param.iDisplayLength).ToList();
And I am 100% sure ReSharper is right.
How do I rewrite this to avoid the warning?
To clarify, I get that I can put a ToList on the end of filtered to only do one query to the database, e.g.
var filteredAndRun = filtered.ToList();
var total = filteredAndRun.Count();
var displayed = filteredAndRun
.Skip(param.iDisplayStart)
.Take(param.iDisplayLength).ToList();
but this brings back a ton more data than I want to transport over the network.
I'm expecting that I can't have my cake and eat it too. :(
It sounds like you're more concerned with multiple enumeration of IQueryable<T> rather than IEnumerable<T>.
However, in your case, it doesn't matter.
The Count call should translate to a simple and very fast SQL count query. It's only the second query that actually brings back any records.
If it is an IEnumerable<T> then the data is in memory and it'll be super fast in any case.
I'd keep your code exactly the same as it is and only worry about performance tuning when you discover you have a significant performance issue. :-)
You could also do something like
var total = 0;
var displayed = new List<T>(); // T is the element type of filtered
var displayStop = param.iDisplayStart + param.iDisplayLength;
foreach (var element in filtered) {
    // total is the zero-based index of the current element at this point
    if (total >= param.iDisplayStart && total < displayStop)
        displayed.Add(element);
    ++total;
}
That's a sketch rather than tested code, but the algorithm gets you the count with only a single iteration, and you have the list of displayed items at the end. Note that when filtered is an IQueryable<T>, the foreach still pulls every row from the database.
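The single-pass idea can be pinned down with a small, checkable version. The sketch below is written in Java purely for illustration (the question itself is about C#); `SinglePassPager`, `Page` and `countAndPage` are hypothetical names. It uses a zero-based index so that the selected range matches what Skip(start).Take(length) would return.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SinglePassPager {

    /** Holds the total number of elements seen and the requested page. */
    static final class Page<T> {
        final int total;
        final List<T> items;

        Page(int total, List<T> items) {
            this.total = total;
            this.items = items;
        }
    }

    /** Walks source exactly once, counting all elements and keeping those in [start, start + length). */
    static <T> Page<T> countAndPage(Iterable<T> source, int start, int length) {
        List<T> items = new ArrayList<>();
        long stop = (long) start + length; // long arithmetic avoids int overflow
        int index = 0;                     // zero-based index of the current element
        for (T element : source) {
            if (index >= start && index < stop) {
                items.add(element);
            }
            index++;
        }
        return new Page<>(index, items);   // after the loop, index equals the total count
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7, 8, 9);
        Page<Integer> page = countAndPage(data, 3, 4);
        System.out.println(page.total + " total, page = " + page.items); // 10 total, page = [3, 4, 5, 6]
    }
}
```

The trade-off is the same as in the loop above: one pass over the source, at the cost of visiting every element on the client side.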

Matlab: Abort function call with Ctrl+C but KEEP return value

I have a function in matlab with something like this:
function [ out ] = myFunc(arg1, arg2)
times = [];
for i = 1:arg1
tic
% do some long calculations
times = [times; toc];
end
% Return
out = times;
end
I want to abort the running function but keep the values of times that have already been recorded. How can I do that? When I press Ctrl+C, I simply lose them, because times is only a local variable that is deleted when the function leaves scope...
Thanks!
Simplest solution would be to turn it from a function to a script, where times would no longer be a local variable.
The more elegant solution would be to save the times variable to a .mat file within the loop. Depending on the time per iteration, you could do this on every loop, or once every ten loops, etc.
Couldn't you use persistent variables to solve your problem, e.g.
function [ out ] = myFunc(arg1, arg2)
persistent times
if nargin == 0
out = times;
return;
end;
times = [];
for i = 1:arg1
tic
% do some long calculations
times = [times; toc];
end
% Return
out = times;
end
I'm not sure whether persistent variables are cleared upon Ctrl-C, but I don't think it should be the case. What this should do: if you supply arguments, it will run as before. When you omit all arguments however, the last value of times should be returned.
onCleanup functions still fire in the presence of CTRL-C; however, I don't think that's really going to help, because it will be hard for you to connect the value you want to the onCleanup function handle (there are some tricky variable lifetime issues here). You may have more luck using a MATLAB handle object to track your value. For example:
x = containers.Map(); x('Value') = [];
myFcn(x); % updates x('Value')
% CTRL-C
x('Value') % contains latest value
Another possible solution is to use the assignin function to send the data to your workspace on each iteration, e.g.
function [ out ] = myFunc(arg1, arg2)
times = [];
for i = 1:arg1
tic
% do some long calculations
times = [times; toc];
% copy the variable to the base workspace
assignin('base', 'thelasttimes', times)
end
% Return
out = times;
end