I am working on a Spring application with Java 8.
I have a function that generates labels (PDF generation) asynchronously.
It contains a loop that usually runs more than 1000 times, so it generates more than 1000 PDF labels.
After each iteration we need to update the database to save the status: initially it saves numberOfgeneratedCount = 0, then after each label we increment the variable and update the table.
It is not necessary to save this incremented count to the database at the end of every iteration; we only need to update the database at fixed intervals, to reduce the load from database inserts.
Currently my code looks like this:
// Label is a database model class; labeldb is an instance of it
// commonDao.saveLabelToDb saves a Label object
int numberOfgeneratedCount = 0;
labeldb.setProcessedOrderCount(numberOfgeneratedCount);
commonDao.saveLabelToDb(labeldb);
for (Order order : orders) {
    boolean generated = true;
    try {
        // PDF generation code
    } catch (Exception e) {
        // exception handling here
        generated = false;
    }
    if (generated) {
        numberOfgeneratedCount++;
        labeldb.setProcessedOrderCount(numberOfgeneratedCount);
        commonDao.saveLabelToDb(labeldb);
    }
}
To improve performance, we want to update the database only at an interval of 10 seconds. Any help would be appreciated.
I have done this using the following code. I am not sure whether this is a good solution; could someone improve it, perhaps using built-in functions?
int numberOfgeneratedCount = 0;
labeldb.setProcessedOrderCount(numberOfgeneratedCount);
commonDao.saveLabelToDb(labeldb);
int nowSecs = LocalTime.now().toSecondOfDay();
int lastSecs = nowSecs;
for (Order order : orders) {
    nowSecs = LocalTime.now().toSecondOfDay();
    boolean generated = true;
    try {
        // PDF generation code
    } catch (Exception e) {
        // exception handling here
        generated = false;
    }
    if (generated) {
        numberOfgeneratedCount++;
        labeldb.setProcessedOrderCount(numberOfgeneratedCount);
        if (nowSecs - lastSecs > 10) {
            lastSecs = nowSecs;
            commonDao.saveLabelToDb(labeldb);
        }
    }
}
My process looks like:
select some data, 50 rows per select,
do something with the data (set some values),
transform each row into an object of another table,
call batchInsert(myListOfRecords).execute()
My problem is how to decide when the data should be inserted. In my current setup the data is only inserted at the end of my loop. This is a problem for me because I want to process much more data than I do in my tests, so with this approach my process will end with an OutOfMemory exception. Where should I define the maximum amount of data per batch insert call?
The important thing here is to not fetch all the rows you want to process into memory in one go. When using jOOQ, this is done using ResultQuery.fetchLazy() (possibly along with ResultQuery.fetchSize(int)). You can then fetch the next 50 rows using Cursor.fetchNext(50) and proceed with your insertion as follows:
try (Cursor<?> cursor = ctx
        .select(...)
        .from(...)
        .fetchSize(50)
        .fetchLazy()) {
    Result<?> batch;
    for (;;) {
        batch = cursor.fetchNext(50);
        if (batch.isEmpty())
            break;

        // Do your stuff here
        // Do your insertion here
        ctx.batchInsert(...).execute();
    }
}
I have an SSIS package which is trying to read data from a text file. The issue I am facing is that the text file doesn't have very straightforward data; it has special characters which are creating trouble.
For example, right after the header row, there's a row full of hyphens, something like -----------------------------------------------------------------------------------------
SSIS reads this as the first value of the first column, which causes it to fail. How do I get rid of this without actually removing the row from the file itself?
Also, later in the file there are some unwanted rows which I would like to ignore. The format of the file is something like this:
Header
Data
Random Rows
Same header row as above
Data
and so on.....
I would like to know if there's a way to handle this with a Script Task, or in any other way before or while the Flat File Source component executes, without actually making changes to the original file.
I don't know of any way to filter these rows on input using the Flat File Source component, but you can definitely do some filtering if you read the file in with a Script Component.
If you add a reference to Microsoft.VisualBasic, you can use the function below to read your CSV into a DataTable:
// Requires: using System.Data; using Microsoft.VisualBasic.FileIO;
public static DataTable ReadInDataFromCSV(string fileName, string delimiter)
{
    DataTable dtOutput = new DataTable();

    //How many lines to read in. 0 for unlimited
    int numberOfLines = 0;

    using (TextFieldParser parser = new TextFieldParser(fileName))
    {
        parser.TextFieldType = FieldType.Delimited;
        parser.SetDelimiters(delimiter);

        //Are column names in the first row?
        bool columnNamesInFirstRow = true;

        int rowCounter = 0;
        string[] currentRow;

        while (!parser.EndOfData && rowCounter <= numberOfLines)
        {
            try
            {
                currentRow = parser.ReadFields();

                /*****************************
                Add some kind of logic here to skip over rows you don't
                want to read in
                *****************************/

                if (columnNamesInFirstRow == true)
                {
                    foreach (string column in currentRow)
                    {
                        dtOutput.Columns.Add(column);
                    }
                    columnNamesInFirstRow = false;
                }
                else
                {
                    DataRow dr = dtOutput.NewRow();
                    dr.ItemArray = currentRow;
                    dtOutput.Rows.Add(dr);
                }
            }
            catch (Exception e)
            {
                Console.WriteLine(e.Message);
            }

            rowCounter += (numberOfLines == 0) ? 0 : 1;
        }
    }

    return dtOutput;
}
By default, the above code will read a flat file into a DataTable by calling something like:
DataTable myInputData = ReadInDataFromCSV(@"Path to file", ",");
If you modify the comment I added inside the try/catch, you can filter out the rows you aren't interested in. For example, to skip the rows with hyphens, you can add a simple check like:
if (currentRow.Length > 0 && currentRow[0].StartsWith("-----"))
{
    continue;
}
else
{
    //If/else statement from the original code that adds the data to a DataRow and then adds it to the DataTable
}
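Similarly, since the question mentions that the same header row appears again later in the file, a check along these lines could skip those duplicates. This is only a sketch, assuming the column names were captured from the first row as in the function above and that System.Linq is referenced:
// Hypothetical check: skip any row that merely repeats the header already
// captured in dtOutput (requires using System.Linq;).
bool isRepeatedHeader =
    !columnNamesInFirstRow
    && currentRow.Length == dtOutput.Columns.Count
    && currentRow.SequenceEqual(
           dtOutput.Columns.Cast<DataColumn>().Select(c => c.ColumnName));

if (isRepeatedHeader)
{
    continue;
}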
Then you can simply add more similar checks to include/not include certain rows in your file. Good luck!
I am transferring some data from one table to another using SSIS with EzAPI. How can I get the number of rows that were transferred?
My setup is as follows
EzPackage package = new EzPackage();
EzOleDbConnectionManager srcConn;
EzOleDbSource src;
EzOleDbConnectionManager destConn;
EzOleDbDestination dest;
EzDataFlow dataFlow;
destConn = new EzOleDbConnectionManager(package); //set connection string
srcConn = new EzOleDbConnectionManager(package);
dataFlow = new EzDataFlow(package);
src = Activator.CreateInstance(typeof(EzOleDbSource), new object[] { dataFlow }) as EzOleDbSource;
src.Connection = srcConn;
src.SqlCommand = odbcImport.Query;
dest = Activator.CreateInstance(typeof(EzOleDbDestination), new object[] { dataFlow }) as EzOleDbDestination;
dest.Connection = destConn;
dest.AttachTo(src, 0, 0);
dest.AccessMode = AccessMode.AM_OPENROWSET_FASTLOAD;
DTSExecResult result = package.Execute();
Where in this can I add something to get the number of rows? This is for all versions of SQL Server 2008 R2 and up.
The quick answer is that the Row Count Transformation isn't included out of the box. I had a brief post about that: Row Count with EzAPI
I downloaded the source project from CodePlex and then edited EzComponents.cs (in EzAPI\src) and added the following code
[CompID("{150E6007-7C6A-4CC3-8FF3-FC73783A972E}")]
public class EzRowCountTransform : EzComponent
{
    public EzRowCountTransform(EzDataFlow dataFlow) : base(dataFlow) { }
    public EzRowCountTransform(EzDataFlow parent, IDTSComponentMetaData100 meta) : base(parent, meta) { }

    public string VariableName
    {
        get { return (string)Meta.CustomPropertyCollection["VariableName"].Value; }
        set { Comp.SetComponentProperty("VariableName", value); }
    }
}
The component id above is only for 2008.
For 2012, it's going to be E26997D8C-70DA-42B2-8208-A19CE3A9FE41. I don't have a 2012 installation at the moment to confirm I didn't transpose a value there, but drop a Row Count component onto a data flow, right-click, and look at the properties; the component/class id is what that value needs to be. Similar story if you're dealing with 2005.
So, once you have the ability to use EzRowCountTransform, you can simply patch it into your existing script.
// Create an instance of our transform
EzRowCountTransform newRC = null;
// Create a variable to use it
Variable newRCVariable = null;
newRCVariable = package.Variables.Add("RowCountNew", false, "User", 0);
// ...
src.SqlCommand = odbcImport.Query;
// New code here too
newRC = new EzRowCountTransform(dataFlow);
newRC.AttachTo(src);
newRC.Name = "RC New Rows";
newRC.VariableName = newRCVariable.QualifiedName;
// Continue old code
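To actually read the count back, presumably the destination is re-attached so the data flows through the new transform, and the variable is inspected after execution; a rough sketch reusing the names from the snippets above might be:
// Route the flow through the row count transform instead of straight from source to destination
dest.AttachTo(newRC, 0, 0);
dest.AccessMode = AccessMode.AM_OPENROWSET_FASTLOAD;

// After execution, the SSIS variable holds the number of rows that passed through
DTSExecResult result = package.Execute();
Console.WriteLine("Rows transferred: " + newRCVariable.Value);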
I have a presentation on various approaches I've used over time and what I like/don't like about them. Type more, click less: a programmatic approach to building SSIS. It contains sample code for creating the EzRowCountTransform and usage.
I have something similar to the code below in LINQPad, using C# Statements. My goal is to get the actual SQL INSERT statements, not to actually update the database.
I can easily delete the data after it is inserted with this small sample, but I will need this for a larger push of data. I hope I have missed something simple in either L2S or LINQPad.
Is there an easier way to retrieve the SQL INSERT statements?
var e1 = new MyEntity(){ Text = "First" };
var e2 = new MyEntity(){ Text = "Second" };
MyEntities.InsertOnSubmit(e1);
MyEntities.InsertOnSubmit(e2);
SubmitChanges();
A quick-n-dirty way is to wrap everything in a transaction scope that is never committed:
using (TransactionScope ts = new TransactionScope())
{
    var e1 = new MyEntity() { Text = "First" };
    var e2 = new MyEntity() { Text = "Second" };

    MyEntities.InsertOnSubmit(e1);
    MyEntities.InsertOnSubmit(e2);

    SubmitChanges();

    // Deliberately not committing the transaction.
}
This works well for small volumes. If the data volume is large and the database uses the full recovery model, transaction log growth might become a problem.
When we did the samples for "LINQ in Action", we used the following method which gets the scheduled changes from the context:
// Requires: using System.Reflection; using System.Data.Linq;
public String GetChangeText(System.Data.Linq.DataContext context)
{
    MethodInfo mi = typeof(DataContext).GetMethod("GetChangeText",
        BindingFlags.NonPublic | BindingFlags.Instance);
    return mi.Invoke(context, null).ToString();
}
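In the LINQPad scenario from the question, where the query itself derives from the typed DataContext, usage would presumably look something like the following sketch, dumping the pending statements instead of submitting them:
var e1 = new MyEntity() { Text = "First" };
var e2 = new MyEntity() { Text = "Second" };
MyEntities.InsertOnSubmit(e1);
MyEntities.InsertOnSubmit(e2);

// 'this' is the typed DataContext in a LINQPad query, so the pending
// INSERT statements can be inspected without calling SubmitChanges().
GetChangeText(this).Dump();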
If you want to see this in action, download the samples in LINQPad (see http://www.thinqlinq.com/Default/LINQ-In-Action-Samples-available-in-LINQPad.aspx) and check out chapter 6 example 6.29.
I have a need to select a set of records which contain an IsLocked field. I then need to immediately update the IsLocked value from false to true, within a transaction, such that other programs do not fetch those already-fetched records for processing, but can fetch other, unlocked records.
Here's the code I have so far. Is this correct? And how do I do the update? Do I visit each record in a foreach, update the value, and then call SubmitChanges()? It seems that when I run the code below, I lose the collection associated with emails, and thus cannot do the processing I need to do. Does closing the transaction early result in losing the records that were loaded?
To focus the question: how does one load and update records in a transaction, close the transaction so as not to hold locks any longer than necessary, process the records, and then save subsequent changes back to the database?
using (ForcuraDaemonDataContext ctx = new ForcuraDaemonDataContext(props.EmailLogConnectionString))
{
    System.Data.Common.DbTransaction trans = null;
    IQueryable<Email> emails = null;

    try
    {
        // get unlocked & unsent emails, then immediately lock the set for processing
        ctx.Connection.Open();
        trans = ctx.Connection.BeginTransaction(IsolationLevel.ReadCommitted);
        ctx.Transaction = trans;

        emails = ctx.Emails.Where(e => !(e.IsLocked || e.IsSent));

        /// ???

        ctx.SubmitChanges();
        trans.Commit();
    }
    catch (Exception ex)
    {
        if (trans != null)
            trans.Rollback();

        eventLog.WriteEntry("Error. Could not lock and load emails.", EventLogEntryType.Information);
    }
    finally
    {
        if (ctx.Connection.State == ConnectionState.Open)
            ctx.Connection.Close();
    }

    // more stuff on the emails here
}
Please see this question for an answer to a similar, simpler form of the problem.
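For illustration, a minimal sketch of the load-lock-process flow described above, assuming the same Email entity, context, and IsLocked/IsSent flags as in the question. Note the ToList() call, which materializes the rows so they remain usable after the transaction is committed:
List<Email> emails;

using (var ctx = new ForcuraDaemonDataContext(props.EmailLogConnectionString))
{
    ctx.Connection.Open();

    using (var trans = ctx.Connection.BeginTransaction(IsolationLevel.ReadCommitted))
    {
        ctx.Transaction = trans;

        // Materialize the rows so they survive after the transaction completes.
        emails = ctx.Emails.Where(e => !(e.IsLocked || e.IsSent)).ToList();

        // Flag them as locked so other processes skip them.
        foreach (Email email in emails)
            email.IsLocked = true;

        ctx.SubmitChanges();
        trans.Commit();

        // Detach the completed transaction before any further SubmitChanges calls.
        ctx.Transaction = null;
    }

    // Process the emails here, outside the transaction...
    foreach (Email email in emails)
    {
        // ... send / handle the email, set IsSent, etc.
    }

    // ...then persist the post-processing changes.
    ctx.SubmitChanges();
}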