Entity framework code first custom Id - entity-framework-4.1

I need to have a custom Id when creating a model. For example, these are my constraints:
8 digits.
Based on the constraints, an Id must begin with certain numbers.
How do I make sure I generate no duplicates? I'm using a repository pattern, so my save method looks like:
public User SaveUser(User user);

You can be sure that you will not generate duplicates because the Id is the primary key, which must be unique. Once you try to save an entity with a duplicate Id you will get an exception.
The generation algorithm depends on many other factors, including where you want to generate the Id, how you want to derive a new Id, how complex the generation logic is, how the constraints change, and one more important thing: can you have gaps in the sequence of subsequent Ids (for example, if you roll back a transaction with an already generated Id)? You will have to find your own mechanism based on these requirements.
I have done this a few times. I used a separate table for sequences storing the last used number for each sequence type (constraint), plus a stored procedure that generates the next number, stores it, and returns it to the application, plus intensive locking / restrictive transaction isolation levels. A sketch of that approach is shown below.
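A minimal sketch of that idea, shown in MySQL syntax for illustration (the table, procedure, and column names are hypothetical, not from the original answer; the same idea can be expressed as a T-SQL procedure on SQL Server):

-- One row per sequence type (constraint), holding the last number handed out
CREATE TABLE IdSequence (
    SequenceType VARCHAR(50) NOT NULL PRIMARY KEY,
    LastValue    BIGINT      NOT NULL
);

DELIMITER //
CREATE PROCEDURE GetNextId(IN seqType VARCHAR(50), OUT nextId BIGINT)
BEGIN
    START TRANSACTION;
    -- Lock the sequence row so concurrent callers cannot read the same value
    SELECT LastValue INTO nextId
      FROM IdSequence
     WHERE SequenceType = seqType
       FOR UPDATE;

    SET nextId = nextId + 1;

    UPDATE IdSequence
       SET LastValue = nextId
     WHERE SequenceType = seqType;
    COMMIT;
END //
DELIMITER ;

The application would call the procedure inside its save path and combine the returned number with the constraint-specific prefix to build the final 8-digit Id.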

Related

Is JpaRepository.save() suitable for entities with auto generated IDs if we want to UPSERT them?

I'm facing a problem with duplicate records in a MySQL database when I upsert entities that are already present in the database using JpaRepository.saveAll(). I want new entities to be inserted and existing ones to be updated (if there are changes to any of the properties); otherwise no action is expected.
The entity class's id property is annotated with @GeneratedValue(strategy = GenerationType.IDENTITY) and the id column in the corresponding MySQL table has auto-increment enabled. I'm pointing that out because JpaRepository.save(), which is invoked for each entity in saveAll(), checks by id whether the entity is already present in the database.
Here, in my opinion, is where the contradiction between save() (when used for updating) and auto-generation of IDs occurs: you can't update existing records because all of the entities passed into saveAll() will have newly generated IDs, and thus the check in save() will always say that they are not present in the database.
Is my logic correct?
The only solution to the problem that I can think of is to create a custom query that compares the records in the database with the newly passed entities by the values of another column whose values are unique. I can't compare them by id because I will encounter the same problem as in save().
Is that good enough and are there any other solutions?
Depending on how you look at it, you are either wrong or right.
What you describe in terms of behaviour is correct: if you pass an entity with its id set to null into save, you will always create a new row in the database and never perform an update. In that sense you are correct. That behaviour is independent of how the id gets generated for new entities.
But the id of an entity defines its identity. If two entities have the same id, they represent the same row in the database, thus the same logical entity, and the behaviour of save is exactly that of an upsert. This is also independent of how the id gets generated for new entities.
If you want an upsert based on a different column (or columns) you need to write custom code, for example using an actual upsert statement in a query annotation (see the sketch below). Alternatively you can try to load the entity by the columns in question; if you succeed, set its values as desired, otherwise create a new entity and save that.
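A minimal sketch of the upsert-statement idea for MySQL, assuming a hypothetical users table with a unique email column (table and column names are illustrative, not from the original question); this is the kind of statement you could place behind a Spring Data native query method annotated with @Query(nativeQuery = true) and @Modifying:

-- Insert the row, or update it if a row with the same unique email already exists
INSERT INTO users (email, display_name)
VALUES ('alice@example.com', 'Alice')
ON DUPLICATE KEY UPDATE
    display_name = VALUES(display_name);

Note that ON DUPLICATE KEY UPDATE fires on any unique key violation, so the column you want to upsert on needs a unique constraint for this to behave as intended.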

Migrate data from MySQL with auto-increment Ids to the Google Datastore?

I am trying to migrate some data from MySQL to the Datastore. I have a table called User with auto-increment primary keys (Bigint(20)). Now I want to move the data from the User table to the Datastore.
My plan was to let the Datastore generate new Ids for the migrated users and for all new users created after the migration is done. However, we have many services (notifications, urls etc.) that depend on the old ids, so I want to keep the old ids for the migrated users. How can I guarantee that newly generated ids won't collide with the migrated ids?
Record the maximum and minimum ids before migrating. Migrate all the sql rows to datastore entities, setting entity.key.id = sql.row.id.
To prevent new datastore ids from colliding with the old ones, always call AllocateIds() to allocate new ids. In C#, the code looks like this:
Key key;
Key incompleteKey = _db.CreateKeyFactory("Task").CreateIncompleteKey();
do
{
    key = _db.AllocateId(incompleteKey);
} while (key.Path[0].Id >= minOldId && key.Path[0].Id <= maxOldId);
// Use new key for new entity.
In reality, you are more likely to win the lottery than to see a key collide, so it won't cost anything more to check against the range of old ids.
You cannot hint/tell the Datastore to reserve specific IDs. So, if you manually set IDs when inserting existing data and later have the Datastore assign an ID, it may pick an ID that you have already used. Depending on the operation you are using (e.g. INSERT or UPSERT), the operation may fail or overwrite the existing entity.
You need to come up with a migration plan to map existing IDs to Datastore IDs. Depending on the number of tables you have and the complexity of the relations between them, this could become a time-consuming project, but you should still be able to do it.
Let's take a simple example and assume you have two tables:
USER (USER_ID is primary key)
USER_DATA (USER_ID is foreign key)
You could add another column to USER (or use some other mapping) to map the USER_ID to a DATASTORE_ID. Here, you call Datastore's allocateID method for the Kind you want to use and store the returned ID in the new column.
Now you can move the USER data to Cloud Datastore, ignoring the MySQL USER_ID and instead using the ID from the new column.
To migrate the data from USER_DATA, do a join between the two tables and push the data using the Datastore ID; a sketch is shown below.
Also, note that using sequential IDs (referred to as monotonically increasing values) could cause performance issues with Datastore. So you probably want to use IDs that are generated by the Datastore.
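A minimal SQL sketch of that mapping-column step, assuming the simple USER / USER_DATA schema above (the DATASTORE_ID column name is illustrative):

-- Hold the pre-allocated Datastore ID alongside each existing user
ALTER TABLE USER ADD COLUMN DATASTORE_ID BIGINT NULL;

-- After allocating an ID for each row via the Datastore API and storing it here,
-- export USER_DATA together with the new Datastore ID of its owner:
SELECT u.DATASTORE_ID, d.*
  FROM USER_DATA d
  JOIN USER u ON u.USER_ID = d.USER_ID;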

How to achieve working with autogenerated composite keys in Hibernate + MySql - MariaDB

I've been reading some articles about the usage of composite keys in MySQL and found that a composite key can't include an auto_increment id column. However, I'm interested in a similar feature. Let me explain:
Using MariaDB 10 (InnoDB) and Hibernate 3.6.9
I want to make some of my application's table fields translatable. I think a single table for translations should be enough. This table has a composite key consisting of an int value that acts as the key for the translation and the locale value for the concrete text. The same combination of id and locale can't appear twice.
So this is how the model should look:
I don't want the translations to be loaded with each of the random entities as a Collection; I'm thinking a method like String translationFor(Integer id, Locale loc) could do it for my current locale. However, when I save a Set of translations I want to assign them the same id. Let's take this case:
Spanish: Cuchara
English: Spoon
The table should look like this:
id locale translation
1 es Cuchara
1 en Spoon
But I can't tell MySQL to use a composite id with an auto_increment column. So I suppose I should assign it manually, performing these steps:
Build the Translation entities with the locale values
Begin a transaction in Hibernate session
Retrieve the last id value in the translations table
Assign it manually to the entities
Save them
Commit the transaction
Is this the proper way to do it? Am I doing it atomically?
I assume you are planning on having multiple tables needing the translation of 'spoon'? If so, let me move your focus away from id.
The translation table needs PRIMARY KEY(code, locale), where code is what you have as some_translatable_value in random_table_1.
code could be the string (perhaps abbreviated) in your favorite language. Note that if you later change the phrasing of the text (to "silver spoon"), do not go back and change code; it can stay the same ("spoon").
I do not know whether you can achieve this in Hibernate; I am not fluent in that. (I tend to avoid 3rd party packages; they tend to get in the way.) If Hibernate forces you to have an AUTO_INCREMENT id on each table, so be it. It will be a harmless waste. You should then declare the pair (code, locale) as unique (in order to get the desired index).
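A minimal sketch of that schema in MySQL/MariaDB syntax (the table and column names are illustrative):

-- Translations keyed by (code, locale); no auto_increment column needed
CREATE TABLE translation (
    code        VARCHAR(50)  NOT NULL,  -- e.g. 'spoon'; stays stable even if the wording changes
    locale      CHAR(5)      NOT NULL,  -- e.g. 'es', 'en'
    translation VARCHAR(255) NOT NULL,
    PRIMARY KEY (code, locale)
);

INSERT INTO translation (code, locale, translation) VALUES
    ('spoon', 'es', 'Cuchara'),
    ('spoon', 'en', 'Spoon');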

Linking Database Tables Standard Practice

I am looking for the standard way to handle the following Database Situation.
Two Database Tables - One called Part, one called Return. In Part we have information about Part Number, Cost, Received Date, etc.
Return is for if that part is being returned to the vendor. It will have Return Tracking Number, Shipped Date, and If Credited.
A Part can only have one Return but may have none if Part is not returned to vendor.
The 3 options I see are:
Put both Part and Return in the same table, but I do not like this idea; the table will get too large.
Create a field in the Part table to reference the Id of the Return record it is related to. My concern here is that there could be free-floating Return records not attached to a Part.
Create a field in the Return table to reference the Id of the Part record it is related to, making the PartId field unique so I cannot duplicate a Part Id.
Is there any advantage or disadvantage to using #2 or #3 (or I guess #1 if that is a viable option)?
UPDATE:
I should have mentioned that in reality these tables will be much bigger, and in the application I will be viewing Returns and Parts information in separate views.
Basically you have 2 entities, Part and Return, with a 1-1 relationship where one entity is optional.
In this case you should create a table for each entity (i.e. 2 tables) and use the PK of Part as a reference in the Return table. That is the standard way to represent relationships of this kind.
Solution 3, with the exception that you do not need a unique constraint on part_id; just make it the PK (which is almost the same). A sketch of that layout is shown below.
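A minimal sketch in MySQL (column names are illustrative; Return is a reserved word in MySQL, hence the backticks):

CREATE TABLE Part (
    part_id       BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    part_number   VARCHAR(50) NOT NULL,
    cost          DECIMAL(10,2),
    received_date DATE
);

CREATE TABLE `Return` (
    part_id         BIGINT NOT NULL PRIMARY KEY,   -- the PK doubles as the 1-1 link to Part
    tracking_number VARCHAR(50),
    shipped_date    DATE,
    credited        BOOLEAN,
    FOREIGN KEY (part_id) REFERENCES Part (part_id)
);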

database storing multiple types of data, but need unique ids globally

A while ago, I asked about how to implement a REST API. I have since made headway with that, but am trying to fit my brain around another idea.
In my API, I will have multiple types of data, such as people, events, news, etc.
Now, with REST, everything should have a unique id. This id, I take it, should be unique to the whole system, and not just to each type of data.
For instance, there should not be a person with id #1 and a news item with id #1. Ultimately, these two things would be given different ids altogether: person #1 with unique id #1 and news item #1 with unique id #2, since #1 was taken by the person.
In a database, I know that you can create primary keys that automatically increment. The problem is, usually you have a table for each data "type", and if you set the auto increment for each table individually, you will get "duplicate" ids (yes, the ids are still unique in their own table, but not across the whole DB).
Is there an easy way to do this? For instance, can all of these tables be set to work off of one incrementer (the only way I could think of to put it), or would it require creating a table that holds these global ids and ties them to a table and the unique id in that table?
You could use a GUID; they will be unique everywhere (for all intents and purposes, anyway).
http://en.wikipedia.org/wiki/Globally_unique_identifier
+1 for UUIDs (note that GUID is a particular Microsoft implementation of a UUID standard)
There is a built-in function uuid() for generating a UUID as text. You could prefix it with the table name so that you can easily recognize it later.
Each call to uuid() will generate a fresh value (as text). So with the prefixing method above, the INSERT query may look like this:
INSERT INTO my_table VALUES (CONCAT('my_table-', UUID()), ...)
And don't forget to make this column a varchar of large enough size, and of course create an index for it.
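A minimal sketch of such a table in MySQL (the table name, extra column, and sizes are illustrative):

CREATE TABLE my_table (
    id   VARCHAR(64)  NOT NULL,  -- room for the 'my_table-' prefix plus the 36-character UUID
    data VARCHAR(255),
    PRIMARY KEY (id)             -- the primary key also provides the index on the id column
);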
Now, with REST, everything should have a unique id. This id, I take it, should be unique to the whole system, and not just to each type of data.
That's simply not true. Every resource needs to have a unique identifier, yes, but in an HTTP system, for example, that means a unique URI. /people/1 and /news/1 are unique URI's. There is no benefit (and in fact quite a lot of pain, as you are discovering) from constraining the system such that /news/1 has to instead be /news/0983240-2309843-234802/ in order to avoid conflict.