Why do I get a DuplicateKeyException in Linq2Sql after I checked for the key's existence? - linq-to-sql

I have a program that adds a lot of new data to a database using Linq2SQL.
In order to avoid DuplicateKeyExceptions, I check for the existence of the key, before trying to add a new value into the database.
As of now, I can't provide an isolated test-case, but I have simplified the code as much as possible.
// newValue is created outside of this function, with data read from a file
// The code is supposed to either add new values to the database, or update existing ones
var entryWithSamePrimaryKey = db.Values.FirstOrDefault(row => row.TimestampUtc == newValue.TimestampUtc && row.MeterID == newValue.MeterID);
if (entryWithSamePrimaryKey == null)
{
    db.Values.InsertOnSubmit(newValue);
    db.SubmitChanges();
}
else if (entryWithSamePrimaryKey.VALUE != newValue.VALUE)
{
    db.Values.DeleteOnSubmit(entryWithSamePrimaryKey);
    db.SubmitChanges();
    db.Values.InsertOnSubmit(newValue);
    db.SubmitChanges();
}
Strangely enough, when I look at the exceptions in the application log to see which items cause trouble, I am unable to find ANY of them in the database.
I suspect this happens within the update code, so that the items get removed from the database, but not added again.
I will update my code to deliver more information, and then update this post accordingly.

If the error is generated in the update block, you can merge the objects instead of deleting entryWithSamePrimaryKey: copy the property values from newValue onto the existing entity and then save the changes.
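A minimal sketch of that merge approach, reusing the identifiers from the question and assuming VALUE is the only column that can change (any other mutable columns would be copied the same way):

var entryWithSamePrimaryKey = db.Values.FirstOrDefault(row => row.TimestampUtc == newValue.TimestampUtc && row.MeterID == newValue.MeterID);
if (entryWithSamePrimaryKey == null)
{
    db.Values.InsertOnSubmit(newValue);
}
else if (entryWithSamePrimaryKey.VALUE != newValue.VALUE)
{
    // Copy the changed column onto the already-tracked entity instead of
    // deleting and re-inserting; LINQ to SQL then emits a single UPDATE.
    entryWithSamePrimaryKey.VALUE = newValue.VALUE;
}
db.SubmitChanges();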

Related

Updating a single row in TaffyDB

I currently have a database set up within an HTML page, and my requirement is to update a single row within the application.
I could refresh the database with "fresh" data, but that would require too much time.
I had a look at
dbSports().update("aName", object.aname);
However, it seems to update all the records in my database instead of just one. Are there any answers to this particular issue?
The documentation on the matter is missing a major chunk of information, but the topic is covered in a presentation by the library's author (http://www.slideshare.net/typicaljoe/better-data-management-using-taffydb-1357773) [slide 30].
The query needs to point at the object you want to update, and the editing happens from there, i.e.
var obj = dbObject({
    Id: value.id
}).update(function() {
    this.aName = object.aname;
    return this;
});
Here the query selects the row by its ID, the update function operates on that match, and the callback sets the value the application needs to change.
You first have to find the matching record, then update it:
yourDB({"ID": recordID}).update({
    "col1": val1,
    "col2": val2,
    "col3": val3
});

Prevent Duplicate headers in flat file destination - SSIS

I need some help.
I am exporting some data from an OLE DB source to a .csv file. I don't want the headers to appear twice in the destination. If I uncheck the "Column names in the first data row" property, the headers don't get populated on the first execution either.
Output as of now.
Col1,Col2
A,B
Col1,Col2
C,D
How can I make the package run in such a way that the headers are inserted when the file is empty, and on subsequent executions only the data is written, without headers?
There was a similar thread, but I wasn't able to apply the solution because I couldn't work out how to use expressions to get the row count of the destination itself. That thread is quite old, so I created a new question.
Your help is deeply appreciated.
-Akshay
Perhaps I'm missing something, but this works for me; I am not hitting the read-only trouble with ColumnNamesInFirstDataRow.
I created a package-level variable named AddHeader, type Boolean, and set it to True. I added a Flat File Connection Manager, named FFCM, and configured it to use a CSV output with 2 columns: HeadCount (int) and AddHeader (boolean). In the properties for the Connection Manager, I added an Expression for the property ColumnNamesInFirstDataRow and assigned it a value of @[User::AddHeader]
I added a Script Task to test the size of the file, with read/write access to the variable AddHeader. I then used this script to determine whether the file was empty. If your definition of "empty" is a file containing only a header row, adjust the length check in the if statement to match.
public void Main()
{
    string path = Dts.Connections["FFCM"].ConnectionString;
    System.IO.FileInfo stats = null;
    try
    {
        stats = new System.IO.FileInfo(path);
        // Checking Length isn't bulletproof based on how the disk is configured,
        // but it should be good enough.
        // http://stackoverflow.com/questions/3750590/get-size-of-file-on-disk
        if (stats != null && stats.Length != 0)
        {
            this.Dts.Variables["AddHeader"].Value = false;
        }
    }
    catch
    {
        // No harm, no foul.
    }
    Dts.TaskResult = (int)ScriptResults.Success;
}
I looped through twice to ensure I'd generate the append scenario
I deleted my file and ran the package and only had a header once.
The property that controls whether the column names are included in the output file is ColumnNamesInFirstDataRow. This is a read-only property.
One way to achieve what you are trying to do would be to have two data flow tasks on the control flow surface, preceded by a script task. These two data flow tasks would be identical except that they refer to two different flat file connection managers, and the only difference between those would be the value of ColumnNamesInFirstDataRow: one true, the other false.
Use the script task to decide whether this is the first run or a subsequent one. Persist this information and check it within the script; you can either keep a separate table for it or infer it from some log table.
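A hypothetical sketch of such a check, assuming an ADO.NET connection manager named ControlDB, a control table dbo.PackageRunLog, and a Boolean variable AddHeader as in the answer above - the connection, table, and package names are illustrative, not part of the original answer:

public void Main()
{
    // ADO.NET connection managers hand back the underlying SqlConnection.
    var conn = (System.Data.SqlClient.SqlConnection)
        Dts.Connections["ControlDB"].AcquireConnection(Dts.Transaction);
    using (var cmd = new System.Data.SqlClient.SqlCommand(
        "SELECT COUNT(*) FROM dbo.PackageRunLog WHERE PackageName = @name", conn))
    {
        // "ExportCsv" stands in for this package's name.
        cmd.Parameters.AddWithValue("@name", "ExportCsv");
        // No logged runs yet means this is the first run, so write the header.
        Dts.Variables["AddHeader"].Value = (int)cmd.ExecuteScalar() == 0;
    }
    Dts.Connections["ControlDB"].ReleaseConnection(conn);
    Dts.TaskResult = (int)ScriptResults.Success;
}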
The following solution worked for me; you can also try it.
Create three variables.
IsHeaderRequired
RowCount
TargetFilePath
Get the source row count using an Execute SQL Task and save it in the RowCount variable.
Add a Script Task, giving it TargetFilePath and RowCount as read-only variables and IsHeaderRequired as a read/write variable.
Edit the script and add the following lines of code.
string targetFilePath = Dts.Variables["TargetFilePath"].Value.ToString();
int rowCount = (int)Dts.Variables["RowCount"].Value;
System.IO.FileInfo targetFileInfo = new System.IO.FileInfo(targetFilePath);
if (rowCount > 0)
{
    // Guard against a missing file as well as an empty one;
    // FileInfo.Length throws if the file does not exist yet.
    if (!targetFileInfo.Exists || targetFileInfo.Length == 0)
    {
        Dts.Variables["IsHeaderRequired"].Value = true;
    }
    else
    {
        Dts.Variables["IsHeaderRequired"].Value = false;
    }
}
Dts.TaskResult = (int)ScriptResults.Success;
Connect your Script Task to your data flow task.
Click the flat file connection manager (i.e. your target file) and go to its properties. In the Expressions property, add the following, as shown in the screenshot:
Map ConnectionString to the variable TargetFilePath.
Map ColumnNamesInFirstDataRow to IsHeaderRequired.
(Screenshots omitted: the expression on the flat file connection manager, and the final package layout.)
Hope this helps
A solution:
First, add an SSIS integer variable in the scope of the Foreach Loop or higher - I'll call this RowCount - and make its default value negative (this is important!). Next, add a Row Count transformation to your Data Flow, and assign the result to the RowCount SSIS variable we just made. Third, select your Connection Manager (don't double-click it) and open the Properties window (F4). Find the Expressions property, select it, and hit the ellipsis (...) button. Select the ColumnNamesInFirstDataRow property, and use an expression like this:
@[User::RowCount] < 0
Now, when your package starts, RowCount has a static value of -1 or another negative number. When the data flow runs for the first time in your loop, the ColumnNamesInFirstDataRow property will have a value of TRUE. When the first data flow completes, the row count (even if it's zero) is written to the RowCount variable. On the second iteration of the loop, the Connection Manager is then reconfigured NOT to write column names...

Do views immediately reflect data changes in their underlying tables?

I have a view ObjectDisplay that is composed of two relevant tables: Object and State. State represents the state of an Object, and the view pulls some of the details from the most recent State for each Object.
On the page that is displaying this information, a user can enter some comments, which creates a new State. After creating the new State, I immediately pull the Object from ObjectDisplay and send it back to be dropped into a partial view and replace the Object in the grid on the page.
// Add new State.
db.States.Add(new State()
{
    ObjectId = objectId,
    Comments = comments,
    UserName = username
});
// Save the changes (executes all of the above).
db.SaveChanges();
// Return the new Object information.
return db.Objects.Single(c => c.ObjectId == objectId);
According to my db trace, the Single call occurs about 70 ms after the SaveChanges call, and it occurs on the same SPID.
Now for the issue: the database defaults the value of RecordDate in State to GETUTCDATE() - I don't provide the date myself. What I'm seeing is that the Object returned still carries the old State's information. When I refresh the page, all the correct information is there, but the initial call returns the wrong information from the database/EF.
So.. what could be wrong? Could the view not be updating quickly enough? Could something be going on with EF? I don't really know where to start looking.
If you've previously loaded the same Object entity in the same DbContext, EF will return the cached instance with the stale values, and ignore the values returned from SQL.
The simplest solution is to reload the entity before returning it:
var result = db.Objects.Single(c => c.ObjectId == objectId);
db.Entry(result).Reload();
return result;
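If the entity is only handed back for display, a no-tracking query is another option - a sketch assuming the EF 4.1+ DbContext API, which the db.Entry call above suggests:

// AsNoTracking materializes fresh values from the database instead of
// returning the instance cached in the context's identity map.
return db.Objects.AsNoTracking().Single(c => c.ObjectId == objectId);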
This is indeed odd. In SQL Server, views are not persisted by default and therefore reflect changes in the underlying data right away. You can create a clustered index on a view, which effectively persists the query, but in that case the data is updated synchronously, so you should still see the change right away.
If you are working with snapshot isolation, your changes might not be visible to other SPIDs right away; but as you are on the same SPID and not using snapshot isolation, this can't be the culprit either.
The only thing left at this point is the application layer. Are you actually using the result of the Single call higher up in the call stack, or does it get lost somewhere? I assume that a refresh of the page uses a different code path, which would explain why it works there.

INSERT and UPDATE the same row in the same TRANSACTION? (MySQL)

So here's my problem:
I have an article submission form with an optional image upload field.
When the user submits the form - this is roughly what happens:
if ($this->view->form->isValid($_POST)) {
    $db->beginTransaction();
    try {
        // Save content of POST to the Article table.
        if (!$this->_saveArticle($_POST)) {
            return;
        }
        // Resize and save the image, using the ID generated by the step above.
        if (!$this->_saveImage($_FILES)) {
            $db->rollback();
            return;
        }
        // Update the record if the image was successfully generated.
        if (!$this->_updateArticle()) {
            $db->rollback();
            return;
        }
        $db->commit();
    } catch (Exception $e) {
        $db->rollback();
    }
}
All Models are saved using mappers, which automate "UPSERT" functionality by checking for the existence of a surrogate key:
public function save($Model) {
    // Insert when no surrogate key is present yet; otherwise update.
    if (is_null($Model->id_article)) {
        $Mapper->insert($Model->getFields());
        return;
    }
    $Mapper->update($Model->getFields(), $Model->getIdentity());
}
The article table has a composite UNIQUE index on ID, Title, and URL. In addition, I generate a UID that is assigned to the Model's ID field prior to insert (instead of auto-incrementing).
When I try to execute this, it runs fine for the first article inserted into the table, but subsequent calls (with radically different input) trigger a DUPLICATE KEY error. MySQL throws back the ID generated in condition 1 (_saveArticle) and complains that the key already exists...
I've dumped out the Model fields (and the condition state - i.e. insert | update) and they proceed as expected (pseudo):
inserting!
id = null
title = something
content = something
image = null
updating!
id = 1234123412341234
title = something
content = something else
image = 1234123412341234.jpg
This row data is not present in the database.
I figure this could be one of a few things:
1: I'm loading a secondary DB adapter on user login, allowing them to interface with several sites from one login - this might be confusing the transaction somehow
2: It's a bug of some description in the Zend transaction implementation (possibly triggered by 1)
3: I need to replace the save() with an INSERT ... ON DUPLICATE
4: I should restructure the submission process, or generate a name for the image that isn't dependent on the UID of the previously inserted row.
Still hunting, but I was wondering if anyone else has encountered this kind of issue or could point me in the direction of a solution.
Best, SWK
OK - just for the record, this is entirely possible. The problem was in my application architecture. I was catching exceptions in the mapper classes that handle persistence, then querying them to return boolean states and interrupt the process. That in turn broke out of the try/catch block and prevented the insert/update from completing correctly.
To summarise: yes, you CAN insert and update the same row in a single transaction. I've ticked community wiki to cancel the rep out.

DLINQ- Entities being inserted without .InsertOnSubmit(...)?

I ran into an interesting problem while using DLINQ. When I instantiate an entity, calling .SubmitChanges() on the DataContext will insert a new row into the database - without having ever called .Insert[All]OnSubmit(...).
//Code sample:
Data.NetServices _netServices = new Data.NetServices(_connString);
Data.ProductOption[] test = new Data.ProductOption[]
{
    new Data.ProductOption
    {
        Name = "TEST1",
        // Notice the assignment here
        ProductOptionCategory = _netServices.ProductOptionCategory.First(poc => poc.Name == "laminate")
    }
};
_netServices.SubmitChanges();
Running the code above will insert a new row in the database. I noticed this effect while writing an app to parse an XML file and populate some tables. I noticed there were 1000+ inserts when I was only expecting around 50 or so - then I finally isolated this behavior.
How can I prevent these objects from being persisted implicitly?
Thanks,
-Charles
Think of the relationship as having two sides. When you set one side, the other side needs to be updated, so in the case above, as well as setting the ProductOptionCategory, you are effectively adding the new object to the ProductOptions collection on the laminate ProductOptionCategory side.
The workaround, as you have already discovered, is to set the underlying foreign key instead; then LINQ to SQL will not track the object in the usual way and will require an explicit indication that it should persist the object.
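A minimal sketch of that workaround, assuming the entity exposes the foreign-key column alongside the association - the ProductOptionCategoryId and Id property names are illustrative, and your generated entity may differ:

var category = _netServices.ProductOptionCategory.First(poc => poc.Name == "laminate");
var option = new Data.ProductOption
{
    Name = "TEST1",
    // Assign the key value rather than the entity, so the DataContext
    // does not start tracking the new object through the association.
    ProductOptionCategoryId = category.Id
};
// Nothing is persisted until the object is explicitly queued:
_netServices.ProductOption.InsertOnSubmit(option);
_netServices.SubmitChanges();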
Of course the best solution for performance would be to determine from the source data which objects you don't want to add and never create the instance in the first place.