Entity Framework 4.2 - How to realize TPT inheritance with a database-generated primary key value?

I want to use the EF (4.2) in the following scenario:
There exists a database already (so I chose the database-first approach) and it is a SQL Anywhere DB.
I want to use persistence-ignorant business objects, so I use the DbContext Template to generate POCO classes from the EDM.
There is one simple inheritance hierarchy among my entities: an abstract base entity and two concrete derived entities.
In the database there is one table for each type of the inheritance hierarchy (Table-Per-Type Strategy).
Each of these three tables has a primary key column (Id, type:integer), and the association of a concrete entity to the base entity is done by having the same Id in both tables (that means that the primary key (Id) of the concrete type tables is at the same time a foreign key to the base table; a pretty common approach I think).
I had to define the inheritance manually in the designer, since the EDM wizard does not automatically recognize that I want an inheritance association between the described entities.
Until this point there wasn't any bigger problem. Now to the issue at hand:
There is a restriction for the database I use: primary key values have to be generated by the database, using a database function.
I want to call this function in a before-insert-trigger defined on the base-table.
To let Entity Framework know that a value is generated by the database, I set the StoreGeneratedPattern property of the Id property of the base entity to Identity (as I understand it, this is the way to tell EF to read back the generated value after inserting a new instance of an entity).
When I create a new instance of a derived entity, add it to the corresponding DbSet of the DbContext and call SaveChanges on the context, a DbUpdateException is thrown, stating that a foreign key constraint is violated.
By checking the request-log of the DB, I see that the base entity got inserted in the base table, but on inserting the row in the derived table, the above mentioned error occurs, because it obviously doesn't use the newly generated Id of the new entry in the base table.
Since I don't think there is much I can do about that at the database level, the question is whether the EDM or DbContext can be configured (or modified) to insert the base row first, then take the generated Id and use it for the insertion of the derived row.
I know there are several ways to avoid this situation (not using inheritance, using a stored procedure to insert a new derived entity, or calling the ID-generating database function before inserting and setting the Id property on the entity myself), but at the moment the behavior described above would be preferable, so I want to make sure I am not overlooking something before settling on a "plan B".
Any suggestions on this topic are much appreciated,
Thanks in advance.
Here is the code of the trigger:
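ALTER TRIGGER "TRG_GENERATE_ID" before insert order 1 on
BASE_TABLE
referencing new as NewEntry
for each row
begin
    -- NewID holds the key returned by the ID-generating function
    declare NewID integer;
    set NewID = F_GET_NEW_ID('BASE_TABLE', NewEntry.SOME_OTHER_ID);
    -- overwrite the ID of the row that is about to be inserted
    set NewEntry.ID = NewID
end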
The function "F_GET_NEW_ID" is called in the trigger to generate the new ID for a new entry in the base table. It has two parameters:
"Tablename" -> the name of the table for which a new ID should be generated,
and a second parameter that takes the value of a standard column present in all tables of the database (it is required to generate the new ID).
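The insert order the question asks for (base row first, then reuse the generated Id for the derived row) can be sketched outside of EF. This is only an illustration under stated assumptions: it uses Python's sqlite3 module instead of SQL Anywhere, an AUTOINCREMENT column stands in for the trigger calling F_GET_NEW_ID, and the names derived_table, derived_value, make_schema and insert_derived are hypothetical.

```python
import sqlite3


def make_schema(conn):
    """Create a base table and a derived table that share the same key (TPT layout)."""
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
    conn.executescript("""
        CREATE TABLE base_table (
            id INTEGER PRIMARY KEY AUTOINCREMENT,  -- stand-in for the trigger-generated ID
            some_other_id INTEGER NOT NULL
        );
        CREATE TABLE derived_table (
            id INTEGER PRIMARY KEY REFERENCES base_table(id),  -- PK and FK at once
            derived_value TEXT
        );
    """)


def insert_derived(conn, some_other_id, derived_value):
    """Insert the base row, read back its generated key, and reuse it for the derived row."""
    with conn:  # both inserts run in one transaction
        cur = conn.execute(
            "INSERT INTO base_table (some_other_id) VALUES (?)", (some_other_id,)
        )
        new_id = cur.lastrowid  # the key the database generated for the base row
        conn.execute(
            "INSERT INTO derived_table (id, derived_value) VALUES (?, ?)",
            (new_id, derived_value),
        )
    return new_id


conn = sqlite3.connect(":memory:")
make_schema(conn)
new_id = insert_derived(conn, 7, "example")
```

The foreign key error from the question corresponds to the second INSERT being sent with a key other than new_id.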

Related

Is JpaRepository.save() suitable for entities with auto generated IDs if we want to UPSERT them?

I'm facing a problem with duplicate records in a MySQL database when I'm upserting using JpaRepository.saveAll() entities which are already present in the database. I want new entities to be inserted, existing ones to be updated(if there are changes to any of the properties), otherwise no action is expected.
The entity class's id property is annotated with @GeneratedValue(strategy = GenerationType.IDENTITY) and the id column in the corresponding table in MySQL has auto-increment enabled. I'm pointing that out because JpaRepository.save(), which is invoked for each entity in saveAll(), checks by id whether the entity is already present in the database.
Here is where in my opinion the contradiction between save(), when used for updating, and auto-generation of IDs occurs: You can't update existing records because all of the entities passed into saveAll() will have newly generated IDs and thus the check in save() will always say that they are not present in the database.
Is my logic correct?
The only solution to the problem that I can think of is to create a custom query that compares the records in the database with the newly passed entities by the values of another column whose values are unique. I can't compare them by id because I will encounter the same problem as in save().
Is that good enough and are there any other solutions?
Depending on how you look at it, you are either wrong or right.
What you describe in terms of behaviour is correct: If you pass in an entity with id set to null to save you will always create a new row in the database and never perform an update. In that sense you are correct. That behaviour is independent of how the id gets generated for new entities.
But the id of an entity defines its identity. If two entities have the same id, they represent the same row in the database and thus the same logical entity, and the behaviour of save is exactly that of an upsert. This is also independent of how the id gets generated for new entities.
If you want an upsert based on a different column (or columns) you need to write custom code, for example using an actual upsert statement in a query annotation. Alternatively you can try to load the entity by the columns in question and if you succeed set its values as desired and otherwise create a new entity and save that.
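The second alternative (load by the unique column, update if found, otherwise insert) can be sketched as follows. This is a minimal illustration using Python's sqlite3 module rather than JPA and MySQL; the person table, its email and name columns, and the upsert_by_email function are hypothetical names chosen for the example.

```python
import sqlite3


def upsert_by_email(conn, email, name):
    """Upsert keyed on the unique email column; returns what was done."""
    row = conn.execute(
        "SELECT id, name FROM person WHERE email = ?", (email,)
    ).fetchone()
    with conn:  # commit on success, roll back on error
        if row is None:
            # No row with this business key yet: let the database generate the id.
            conn.execute(
                "INSERT INTO person (email, name) VALUES (?, ?)", (email, name)
            )
            return "inserted"
        if row[1] != name:
            # Row exists and a property changed: update it through its existing id.
            conn.execute("UPDATE person SET name = ? WHERE id = ?", (name, row[0]))
            return "updated"
    return "unchanged"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " email TEXT NOT NULL UNIQUE,"
    " name TEXT NOT NULL)"
)
results = [
    upsert_by_email(conn, "a@example.com", "Ann"),
    upsert_by_email(conn, "a@example.com", "Ann"),
    upsert_by_email(conn, "a@example.com", "Anna"),
]
# results is ['inserted', 'unchanged', 'updated'] and the table holds one row
```

The generated id never takes part in the comparison; it is only looked up so that the update targets the existing row.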

Why can't I create a 1 to many relationship without a primary key?

I've been messing with the Design view of my DBML class for hours now. I have one class, call it A, that has a 1 to many relationship with B, C, D, and E. In the generated code I can see that Class A has generated
private EntitySet<BB> _bb;
private EntitySet<CC> _cc;
private EntitySet<EE> _ee;
But it hasn't generated one for D. Finally, for giggles, I added a primary key to D (all the other classes had one), and NOW it's generating an EntitySet<DD> _dd. But why is this? I don't need that table to have a specified primary key.
I assume you are using LINQ to SQL because of the .dbml files. LINQ to SQL (and Entity Framework to some degree) struggles with tables that do not contain primary keys. Specifically, the table needs a primary key to implement INotifyPropertyChanged (the interface that tracks changes for a specific identity... how can an entity be tracked if it does not have a primary key?). A good example of why this is needed can be found here.
https://social.msdn.microsoft.com/Forums/en-US/f3b216d2-fa06-49a1-a901-11702e80b38c/linq-to-sql-table-doesnt-have-primary-key?forum=linqtosql
As a follow-up, is there a specific reason why the table does not have a primary key? Does it not represent an entity in your data model? If it is a "lookup" table, perhaps you can wrap the functionality in a stored procedure and then call the stored procedure via LINQ to SQL.

Problem with hibernate trigger-generated ids (MySQL)

I'm using before and after insert triggers to generate ids (primary key) of the form "ID_NAME-000001" in several tables. At the moment, the Hibernate generator class of these POJOs is set to "assigned": a random string is assigned to the object to be persisted, and when Hibernate inserts it, the trigger assigns the correct id value.
The problem with this approach is that I'm unable to retrieve the persisted object because the id only exists in the database, not in the object I just saved.
I guess I need to create a custom generator class that could retrieve the id value assigned by the trigger. I've seen an example of this for oracle (https://forum.hibernate.org/viewtopic.php?f=1&t=973262) but I haven't been able to create something similar for MySQL. Any ideas?
Thanks,
update:
It seems that this is a common and yet unsolved problem. I ended up creating a new column to serve as a unique key so that I could use the select generator class.
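The idea behind that workaround (insert with a known unique key, then select the trigger-assigned id back through it) can be sketched as follows. This is a rough illustration, not Hibernate code: it uses Python's sqlite3 module instead of MySQL, and because an SQLite trigger cannot rewrite the row being inserted, an AFTER INSERT trigger fills a unique id column instead of the primary key. The item table, the seq and lookup_key columns, and save_item are hypothetical names.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (
        seq INTEGER PRIMARY KEY AUTOINCREMENT,  -- running number used by the trigger
        id TEXT UNIQUE,                         -- filled in by the trigger
        lookup_key TEXT NOT NULL UNIQUE,        -- extra unique column known to the client
        name TEXT
    );
    CREATE TRIGGER trg_item_id AFTER INSERT ON item
    BEGIN
        UPDATE item SET id = printf('ID_NAME-%06d', NEW.seq) WHERE seq = NEW.seq;
    END;
""")


def save_item(conn, name):
    """Insert a row, then read the trigger-assigned id back via the unique lookup key."""
    lookup_key = str(uuid.uuid4())  # the client knows this value before the insert
    with conn:
        conn.execute(
            "INSERT INTO item (lookup_key, name) VALUES (?, ?)", (lookup_key, name)
        )
        (generated_id,) = conn.execute(
            "SELECT id FROM item WHERE lookup_key = ?", (lookup_key,)
        ).fetchone()
    return generated_id


first_id = save_item(conn, "widget")  # "ID_NAME-000001"
```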
I hope this won't spark a holy war over whether to use surrogate keys or not, but it's time to open the conversation here.
Another approach would be to use a generated key as a surrogate key and add a new field for your trigger-assigned id. The surrogate key is the primary key, and you still have the logically named key (such as "ID_NAME-000001" in your example). So your database rows will have two keys, with the surrogate key (which could be a UUID, GUID, or running number) as the primary key.
This approach is usually preferable because it adapts better to new requirements.
Say you have these rows, one using a surrogate key and one using the generated id as a natural key:
Surrogate key:
id: "2FE6E772-CDD7-4ACD-9506-04670D57AA7F", logical_id: "ID_NAME-000001", ...
Natural key:
id: "ID_NAME-000001", ...
When a new requirement later needs the logical_id to be editable, auditable (was it changed, who changed it, and when) or transferable, having the logical_id as the primary key will put you in trouble. Usually you cannot change your primary key. That is a serious disadvantage when you already have lots of data in your database and have to migrate it because of the new requirement.
With the surrogate key solution it is easy; you just need to add a new row:
id: "2FE6E772-CDD7-4ACD-9506-04670D57AA7F", logical_id: "ID_NAME-000001", valid: "F", ...
id: "0A33BF97-666A-494C-B37D-A3CE86D0A047", logical_id: "ID_NAME-000001", valid: "T", ...
MySQL doesn't support sequences (IMO auto-increment isn't comparable to a sequence), whereas PostgreSQL does, and its sequences work like Oracle's. I guess that's why it's difficult to port the solution from Oracle to MySQL.
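The surrogate-key layout from this answer, with the valid flag shown in the example rows, can be sketched like this. It is a minimal illustration using Python's sqlite3 module; the record table name and the add_version function are hypothetical, and only the id, logical_id and valid columns come from the answer.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE record (
        id TEXT PRIMARY KEY,                             -- surrogate key (UUID)
        logical_id TEXT NOT NULL,                        -- e.g. ID_NAME-000001
        valid TEXT NOT NULL CHECK (valid IN ('T', 'F'))  -- is this the current row?
    )
""")


def add_version(conn, logical_id):
    """Mark the current row for logical_id as invalid and insert a new valid row."""
    new_id = str(uuid.uuid4()).upper()
    with conn:
        conn.execute(
            "UPDATE record SET valid = 'F' WHERE logical_id = ? AND valid = 'T'",
            (logical_id,),
        )
        conn.execute(
            "INSERT INTO record (id, logical_id, valid) VALUES (?, ?, 'T')",
            (new_id, logical_id),
        )
    return new_id


first = add_version(conn, "ID_NAME-000001")
second = add_version(conn, "ID_NAME-000001")
rows = conn.execute("SELECT id, valid FROM record ORDER BY valid").fetchall()
# rows holds the old row flagged 'F' followed by the new row flagged 'T'
```

Because the primary key is the UUID, the logical_id can appear on several rows or be changed without touching any foreign key that points at id.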

Relational tables design problem

How do you retain historical relational data if rows are changed? In this example, users are allowed to edit the rows in the Property table at any time. Tests can have any number of properties. If they edit the field 'Name' in the Property table, or drop a row in the Property table, Test rows might no longer reflect the conditions at the time of the test. Would you change the design of the Test table by adding a property names column, and dropping the TestProperty mapping table? The property names column would have to be something like a delimited list of strings. How is this problem usually handled?
3 tables:
Test:
TestId AUTONUMBER,
Name CHAR,
TestDate DATE
Property:
PropertyId AUTONUMBER,
Name CHAR
TestProperty: (maps properties to tests)
TestId
PropertyId
I do not think the question has been answered fully.
If they edit the field 'Name' in the Property table ... Would you change the design of the Test table by adding a property names column, and dropping the TestProperty mapping table?
Definitely not. That would add massive duplication for no purpose.
If your requirement is to maintain the integrity of the data values (in Property) at the time of the Test, the correct (database) method is to implement a History table. That should be an exact copy of the source table, plus one item: a TIMESTAMP or DATETIME column is added to the PK.
Property
PropertyId AUTONUMBER,
Name CHAR
CONSTRAINT PRIMARY KEY CLUSTERED UC_PK (PropertyId)
PropertyHistory
PropertyId INT,
AuditedDtm DATETIME,
Name CHAR
CONSTRAINT PRIMARY KEY CLUSTERED UC_PK (PropertyId, AuditedDtm)
For this to be meaningful and usable, the Test table needs a timestamp as well, to identify which version of PropertyHistory to reference:
TestProperty
TestId
PropertyId
TestDtm DATETIME
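How the three tables work together can be sketched as follows. This is only an illustration under assumptions the answer does not spell out: it uses Python's sqlite3 module, timestamps are ISO-formatted strings, and PropertyHistory is assumed to receive the old version of a row at the moment the row is changed. The functions rename_property, name_at and property_names_for_test are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Property (
        PropertyId INTEGER PRIMARY KEY AUTOINCREMENT,
        Name TEXT NOT NULL
    );
    CREATE TABLE PropertyHistory (
        PropertyId INTEGER NOT NULL,
        AuditedDtm TEXT NOT NULL,
        Name TEXT NOT NULL,
        PRIMARY KEY (PropertyId, AuditedDtm)
    );
    CREATE TABLE TestProperty (
        TestId INTEGER NOT NULL,
        PropertyId INTEGER NOT NULL,
        TestDtm TEXT NOT NULL
    );
""")


def rename_property(conn, property_id, new_name, changed_at):
    """Copy the current row into the history table, then apply the change."""
    with conn:
        conn.execute(
            "INSERT INTO PropertyHistory (PropertyId, AuditedDtm, Name) "
            "SELECT PropertyId, ?, Name FROM Property WHERE PropertyId = ?",
            (changed_at, property_id),
        )
        conn.execute(
            "UPDATE Property SET Name = ? WHERE PropertyId = ?",
            (new_name, property_id),
        )


def name_at(conn, property_id, test_dtm):
    """Return the property name that was in effect at test_dtm."""
    # The oldest version archived after the test is the one the test saw.
    row = conn.execute(
        "SELECT Name FROM PropertyHistory "
        "WHERE PropertyId = ? AND AuditedDtm > ? ORDER BY AuditedDtm LIMIT 1",
        (property_id, test_dtm),
    ).fetchone()
    if row is None:  # nothing changed since the test: the current row applies
        row = conn.execute(
            "SELECT Name FROM Property WHERE PropertyId = ?", (property_id,)
        ).fetchone()
    return row[0]


def property_names_for_test(conn, test_id):
    """Resolve every property of a test to the name it had when the test ran."""
    rows = conn.execute(
        "SELECT PropertyId, TestDtm FROM TestProperty WHERE TestId = ?", (test_id,)
    ).fetchall()
    return [name_at(conn, pid, dtm) for pid, dtm in rows]


with conn:
    conn.execute("INSERT INTO Property (Name) VALUES ('Voltage')")
    conn.execute("INSERT INTO TestProperty VALUES (1, 1, '2020-01-01 10:00:00')")
rename_property(conn, 1, "Voltage (V)", "2020-02-01 09:00:00")
with conn:
    conn.execute("INSERT INTO TestProperty VALUES (2, 1, '2020-03-01 10:00:00')")

old_names = property_names_for_test(conn, 1)  # ['Voltage']
new_names = property_names_for_test(conn, 2)  # ['Voltage (V)']
```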
The property names column would have to be something like a delimited list of strings.
That would break basic design rules as well as Database Normalisation rules, and prevent you from performing ordinary Relational operations on it. Never store more than one data value in a single column.
... or drop a row in the Property table
Deletion is something different again. If it is a "database" then it has Integrity. Therefore you cannot delete a parent row if it has child rows in some other table (and you can delete it if it does not have children). This is usually implemented as a "soft delete": an indicator such as IsObsolete is added. This is referenced in the various SELECTs to exclude the row from being used (to add new children), but the row remains available as the parent for existing children.
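The soft-delete rule above can be sketched as follows. This is a minimal illustration using Python's sqlite3 module with foreign key enforcement switched on; the IsObsolete column comes from the answer, while the sample rows and the delete_property and selectable_properties functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE Property (
        PropertyId INTEGER PRIMARY KEY AUTOINCREMENT,
        Name TEXT NOT NULL,
        IsObsolete INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE TestProperty (
        TestId INTEGER NOT NULL,
        PropertyId INTEGER NOT NULL REFERENCES Property(PropertyId),
        PRIMARY KEY (TestId, PropertyId)
    );
""")


def delete_property(conn, property_id):
    """Hard-delete a property without children; otherwise flag it as obsolete."""
    try:
        with conn:
            conn.execute("DELETE FROM Property WHERE PropertyId = ?", (property_id,))
        return "deleted"
    except sqlite3.IntegrityError:
        # Child rows still reference the property, so keep it but hide it.
        with conn:
            conn.execute(
                "UPDATE Property SET IsObsolete = 1 WHERE PropertyId = ?",
                (property_id,),
            )
        return "obsoleted"


def selectable_properties(conn):
    """Names that may still be attached to new tests."""
    rows = conn.execute(
        "SELECT Name FROM Property WHERE IsObsolete = 0 ORDER BY PropertyId"
    )
    return [name for (name,) in rows]


with conn:
    conn.executemany(
        "INSERT INTO Property (Name) VALUES (?)",
        [("Temp",), ("Humidity",), ("Pressure",)],
    )
    conn.execute("INSERT INTO TestProperty (TestId, PropertyId) VALUES (1, 1)")

outcomes = [delete_property(conn, 1), delete_property(conn, 2)]
# outcomes is ['obsoleted', 'deleted']; only 'Pressure' remains selectable
```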
If you want to retain property relations even when the property no longer exists, make it so that properties aren't actually deleted, and add a flag that denotes whether the property is currently active. If a property's name is changed, create a new property with the new name and set the old property to inactive.
If you do this, you'll have to create some way of garbage collecting the inactive properties.
I'd never make a single column into a field that imitates a one-to-many relationship with a comma-delimited list. Otherwise, you defeat the purpose of a relational database.
Seems like you're using Test as both a template for a particular instance of a test and the test itself. Maybe every time a user performs a test according to the specification in Test, create a row in, say, TestRun? This would preserve the particular properties, and if the entries in Property change later, then subsequent TestRuns would reflect the new changes.

When are LinqToSql Entity Id's assigned?

Are they assigned at SubmitChanges, or when a new object is created? If the latter, I would imagine there would be collisions?
If the id field is an autogenerated (identity/guid) field, the id is assigned when the record is inserted into the database. LINQ to SQL does a select after the insert to get the assigned value and updates it in the object. There are no collisions using identity columns as long as you don't turn on identity insert. If the id is not autogenerated, then you are responsible for creating the id and ensuring that there aren't collisions.
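The timing described in this answer (no id on the new object, id present once the row has been inserted) can be sketched as follows. This is not LINQ to SQL; it is a small illustration with Python's sqlite3 module, and the Customer class, the customer table and the submit function are hypothetical.

```python
import sqlite3


class Customer:
    """Plain object whose id stays None until the row has been inserted."""

    def __init__(self, name):
        self.name = name
        self.id = None


def submit(conn, customer):
    """Insert the row and copy the database-generated id back into the object."""
    with conn:
        cur = conn.execute(
            "INSERT INTO customer (name) VALUES (?)", (customer.name,)
        )
        customer.id = cur.lastrowid  # assigned by the database, not by the client
    return customer


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT NOT NULL)"
)
alice = Customer("Alice")
bob = Customer("Bob")
id_before = alice.id  # None: nothing has been sent to the database yet
submit(conn, alice)
submit(conn, bob)
```

Since the database hands out every id, two new objects cannot receive the same value.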