Many-to-many and hierarchical database design - mysql

I'm designing a database and have run into some problems:
I have a Document entity consisting of many fields. Mostly I want to use the same version of that document, but sometimes users should be able to customize some of the fields for a specific usage of the document. Fields that are not customized should use the value of the parent document as a default. Changes to the parent document should propagate to every non-customized field of a specific usage.
I have a table Document:
Document:
id | field1| field2 | field3| parentDocumentId
For parent documents, parentDocumentId is NULL. Customized usages of a document store the new value for each customized field and NULL for every field that is not customized.
Is this a good design?
Furthermore, the Document entity has a many-to-many relationship with another entity, Course. My problem is that when I look at a Document row with parentDocumentId != null and no courses, I can't determine whether the relationship has not been customized by the user yet (so I should use the parent document's courses) or whether the user customized it and the document simply has no courses. How can I solve that?
Thanks
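One possible approach to both points, as a minimal sketch rather than a definitive design. It uses Python's sqlite3 module as a stand-in for MySQL. The snake_case column names, the courses_customized flag and the document_course join table are assumptions made for illustration; none of them appear in the question. NULL fields are resolved against the parent with COALESCE at read time, and the flag records whether a usage owns its course list, even when that list is empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE document (
    id INTEGER PRIMARY KEY,
    field1 TEXT,
    field2 TEXT,
    parent_document_id INTEGER REFERENCES document(id),
    -- hypothetical flag: 1 means this row owns its course list, even if empty
    courses_customized INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE document_course (
    document_id INTEGER NOT NULL REFERENCES document(id),
    course_id INTEGER NOT NULL,
    PRIMARY KEY (document_id, course_id)
);
INSERT INTO document VALUES (1, 'parent f1', 'parent f2', NULL, 1);
INSERT INTO document VALUES (2, 'custom f1', NULL, 1, 0);  -- inherits courses
INSERT INTO document VALUES (3, NULL, NULL, 1, 1);         -- customized: no courses
INSERT INTO document_course VALUES (1, 10), (1, 11);
""")

def effective_fields(doc_id):
    """Return (field1, field2), falling back to the parent for NULL fields."""
    return conn.execute(
        """
        SELECT COALESCE(d.field1, p.field1), COALESCE(d.field2, p.field2)
        FROM document d
        LEFT JOIN document p ON p.id = d.parent_document_id
        WHERE d.id = ?
        """,
        (doc_id,),
    ).fetchone()

def effective_courses(doc_id):
    """Return the document's own courses if customized, else the parent's."""
    parent_id, customized = conn.execute(
        "SELECT parent_document_id, courses_customized FROM document WHERE id = ?",
        (doc_id,),
    ).fetchone()
    owner = doc_id if customized or parent_id is None else parent_id
    rows = conn.execute(
        "SELECT course_id FROM document_course WHERE document_id = ? ORDER BY course_id",
        (owner,),
    ).fetchall()
    return [r[0] for r in rows]
```

Because the fallback happens when the row is read, changes to the parent show up in every non-customized field without copying data. One limitation of using NULL as the "not customized" marker is that a usage cannot deliberately override a field with NULL; a per-field flag would be needed for that.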

Related

Is JpaRepository.save() suitable for entities with auto generated IDs if we want to UPSERT them?

I'm facing a problem with duplicate records in a MySQL database when I use JpaRepository.saveAll() to upsert entities that are already present in the database. I want new entities to be inserted and existing ones to be updated (if any of their properties changed); otherwise no action is expected.
The entity class's id property is annotated with @GeneratedValue(strategy = GenerationType.IDENTITY) and the id column in the corresponding MySQL table has auto-increment enabled. I'm pointing that out because JpaRepository.save(), which is invoked for each entity in saveAll(), checks by id whether the entity is already present in the database.
Here is where in my opinion the contradiction between save(), when used for updating, and auto-generation of IDs occurs: You can't update existing records because all of the entities passed into saveAll() will have newly generated IDs and thus the check in save() will always say that they are not present in the database.
Is my logic correct?
The only solution to the problem that I can think of is to create a custom query that compares the records in the database with the newly passed entities by the values of another column whose values are unique. I can't compare them by id because I will encounter the same problem as in save().
Is that good enough and are there any other solutions?
Depending on how you look at it, you are either wrong or right.
What you describe in terms of behaviour is correct: If you pass in an entity with id set to null to save you will always create a new row in the database and never perform an update. In that sense you are correct. That behaviour is independent of how the id gets generated for new entities.
But the id of an entity defines its identity. If two entities have the same id, they represent the same row in the database and thus the same logical entity, and the behaviour of save is exactly that of an upsert. This is also independent of how the id gets generated for new entities.
If you want an upsert based on a different column (or columns) you need to write custom code, for example using an actual upsert statement in a query annotation. Alternatively you can try to load the entity by the columns in question and if you succeed set its values as desired and otherwise create a new entity and save that.
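The second option (load by the unique column, then update or insert) can be sketched as follows. This is Python with sqlite3, not Spring Data code; it only illustrates the logic. The customer table, its email column as the unique key, and the function name upsert_by_email are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " email TEXT NOT NULL UNIQUE,"   # hypothetical unique business key
    " name TEXT NOT NULL)"
)

def upsert_by_email(email, name):
    """Update the row with this email if it exists, otherwise insert a new one.

    Returns the id of the affected row.
    """
    row = conn.execute(
        "SELECT id, name FROM customer WHERE email = ?", (email,)
    ).fetchone()
    if row is None:
        cur = conn.execute(
            "INSERT INTO customer (email, name) VALUES (?, ?)", (email, name)
        )
        return cur.lastrowid
    existing_id, existing_name = row
    if existing_name != name:  # only write when something actually changed
        conn.execute("UPDATE customer SET name = ? WHERE id = ?", (name, existing_id))
    return existing_id

first_id = upsert_by_email("ann@example.com", "Ann")
second_id = upsert_by_email("ann@example.com", "Ann B.")
row_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

Both calls end up on the same row, so the generated id never has to be known by the caller. Note that the select-then-write sequence is not atomic; the UNIQUE constraint is what ultimately prevents duplicates if two writers race.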

Symfony 4 - How to dynamically add field in an entity?

I want to have a form where I can add new fields (columns) to a specific entity. Is there a function for this?
Kind regards
Adding a whole column to the table through an HTML form is a weird use case.
If you want to stick to the ORM way of managing the persisted data, you'll have to dynamically add properties to existing entities, which might be a sign of bad schema design.
What I would guess you probably need is an automated way to add this column to your Entity. In such a case I would use the maker bundle.
Supposing that your Entity is called Employee, all you have to do is to type in the following command:
bin/console make:entity
When you are asked for the Entity name, enter Employee. The command will tell you that this entity already exists and ask whether you want to extend it with new fields, and there you go.

How to handle MySQL too much tables trouble?

I'm building an application where users can import their applications. For every application a user imports, he has the option to import its users. He can also add custom fields to the table where those users are stored. When I proposed a solution that creates new tables for every user's application, my boss told me that's not a good solution because there would be too many tables (2000 tables for 1000 applications). Now I'm wondering what the optimal solution to this MySQL resource problem is.
You have a single core users table with the most common fields (id, username, password, email, application id, etc.), where the application id distinguishes between users of different applications.
You create a property definition table, where you have application id, property id and property name fields as a minimum (if other tables are extendable as well, then you should have a field identifying the table the property relates to).
You will also need a properties table to hold the actual property values. If multiple tables are extendable, then you may consider having multiple properties tables. The properties table will have application id, user id, property id and property value fields as a minimum. The property value field should have a general data type, such as varchar, because you can store almost all data types as text.
This way you can avoid creating a separate user table for each of the applications. The drawback is that it will be more complicated to retrieve and edit user data.
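A minimal sketch of the three tables described above, using Python's sqlite3 module in place of MySQL. The table names, the sample properties (shoe_size, team) and the load_user helper are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    application_id INTEGER NOT NULL,
    username TEXT NOT NULL
);
CREATE TABLE property_definition (
    property_id INTEGER PRIMARY KEY,
    application_id INTEGER NOT NULL,
    property_name TEXT NOT NULL,
    UNIQUE (application_id, property_name)
);
CREATE TABLE user_property (
    application_id INTEGER NOT NULL,
    user_id INTEGER NOT NULL REFERENCES users(id),
    property_id INTEGER NOT NULL REFERENCES property_definition(property_id),
    property_value TEXT,
    PRIMARY KEY (user_id, property_id)
);
INSERT INTO users VALUES (1, 100, 'alice'), (2, 200, 'bob');
INSERT INTO property_definition VALUES (1, 100, 'shoe_size'), (2, 200, 'team');
INSERT INTO user_property VALUES (100, 1, 1, '38'), (200, 2, 2, 'blue');
""")

def load_user(user_id):
    """Return the core columns plus custom properties as one dict."""
    username, app_id = conn.execute(
        "SELECT username, application_id FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    props = conn.execute(
        """
        SELECT d.property_name, p.property_value
        FROM user_property p
        JOIN property_definition d ON d.property_id = p.property_id
        WHERE p.user_id = ?
        """,
        (user_id,),
    ).fetchall()
    return {"username": username, "application_id": app_id, **dict(props)}
```

The extra join in load_user is the retrieval overhead mentioned above, and every custom value comes back as text (here '38'), so the application has to convert it.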

Database Design - Creating Profile based on User Type

I have a requirement to show Different fields to Different User Types.
For Example, Admin UserType, show the form with 10 attributes
Super UserType, show the form with 2 attributes
Normal UserType, show the form with 2 attributes
How do I design the database tables in such a way that the UserType and its attributes are dynamic?
Raja K
I imagine there are some common attributes among the users, right? You might approach this by "supertyping" the tables. First create a base table with the common attributes. Something like this:
Users
----------
ID (PK)
Username
AccountCreatedDate
etc.
Any user account would have a record in this table. Depending on whether or not it makes ongoing operations simpler you might even include a flag in the table indicating the user type.
You might then add additional tables for the other user types, where their PK is also a FK to this base table. Something like this:
AdminUsers
----------
ID (PK, FK to Users)
etc.
That would contain the attributes specific to an admin user. Another table would contain attributes specific to a super user. And so on. An added benefit here is that a single user can have multiple roles and be interpreted in multiple ways depending on the use case. And you can have some simple compiled views in the database which make querying the table structure easier.
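As a rough sketch of the supertype layout with a view on top, here is a version using Python's sqlite3 module. The audit_level column and the view name admin_user_view are hypothetical; the answer does not say which attributes an admin has.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    username TEXT NOT NULL
);
CREATE TABLE admin_users (
    id INTEGER PRIMARY KEY REFERENCES users(id),  -- PK is also the FK
    audit_level INTEGER NOT NULL                  -- hypothetical admin-only attribute
);
-- The view hides the join so callers can query admins like a single table.
CREATE VIEW admin_user_view AS
    SELECT u.id, u.username, a.audit_level
    FROM users u
    JOIN admin_users a ON a.id = u.id;
INSERT INTO users VALUES (1, 'root'), (2, 'guest');
INSERT INTO admin_users VALUES (1, 3);
""")

admins = conn.execute(
    "SELECT username, audit_level FROM admin_user_view ORDER BY id"
).fetchall()
```

Only users with a row in admin_users appear in the view, and the same user id could also have a row in another subtype table.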
This would work well for a static set of user types. If that set is going to change often during normal application usage (that is, if one of the operations of the application is that people can add user types) then you wouldn't want a rigid schema.
In cases like that you might treat the fields as meta-attributes on a generic table of user properties. So you might have your base table again:
Users
----------
ID (PK)
Username
UserType
etc.
And then you might have a generic table of properties:
UserProperties
----------
ID (PK)
UserID (FK to Users)
PropertyName
PropertyValue
This is more dynamic, but it has some drawbacks that come to mind:
You can't maintain data types in the database. Everything becomes "stringly typed" and it's up to the application to interpret the types correctly. This will result in a ton of defensive programming code in the application.
You can't maintain the schema in the database. Things like required properties would need to be maintained by the application, the database couldn't guarantee it. So the potential for dirtier data is higher.
It's more difficult to query and report on this structure.
So there are pros and cons either way, whichever approach you take is up to you and the needs of the system you're building.

Implementing custom fields with ALTER TABLE

We are currently thinking about different ways to implement custom fields for our web application. Users should be able to define custom fields for certain entities and fill in/view this data (and possibly query the data later on).
I understand that there are different ways to implement custom fields (e.g. using a name/value table or using alter table etc.) and we are currently favoring using ALTER TABLE to dynamically add new user fields to the database.
After browsing through other related SO topics, I couldn't find any big drawbacks of this solution. In contrast, having the option to query the data in a fast way (e.g. by directly using SQL's WHERE clause) is a big advantage for us.
Are there any drawbacks you could think of by implementing custom fields this way? We are talking about a web application that is used by up to 100 users at the same time (not concurrent requests..) and can use both MySQL and MS SQL Server databases.
Just as an update, we decided to add new columns via ALTER TABLE to the existing database table to implement custom fields. After some research and tests, this looks like the best solution for most database engines. A separate table with meta information about the custom fields provides the needed information to manage, query and work with the custom fields.
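For illustration, the combination of ALTER TABLE and a metadata table might look roughly like this. It is a sketch using Python's sqlite3 module, not the asker's actual implementation; the contact table, the custom_field_meta layout, the cf_ column prefix and the add_custom_field function are all assumptions.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contact (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE custom_field_meta (
    table_name TEXT NOT NULL,
    column_name TEXT NOT NULL,
    label TEXT NOT NULL,
    sql_type TEXT NOT NULL,
    PRIMARY KEY (table_name, column_name)
);
""")

ALLOWED_TYPES = {"INTEGER", "REAL", "TEXT"}
IDENTIFIER = re.compile(r"[a-z][a-z0-9_]{0,29}")

def add_custom_field(column_name, label, sql_type):
    """Add a column to contact and record it in the metadata table."""
    # Identifiers cannot be bound as parameters, so validate them strictly
    # before interpolating them into the DDL statement.
    if not IDENTIFIER.fullmatch(column_name):
        raise ValueError(f"invalid column name: {column_name!r}")
    if sql_type not in ALLOWED_TYPES:
        raise ValueError(f"unsupported type: {sql_type!r}")
    conn.execute(f"ALTER TABLE contact ADD COLUMN cf_{column_name} {sql_type}")
    conn.execute(
        "INSERT INTO custom_field_meta VALUES ('contact', ?, ?, ?)",
        (f"cf_{column_name}", label, sql_type),
    )

add_custom_field("birthday_year", "Year of birth", "INTEGER")
conn.execute("INSERT INTO contact (name, cf_birthday_year) VALUES ('Ann', 1990)")
matches = conn.execute(
    "SELECT name FROM contact WHERE cf_birthday_year < 2000"
).fetchall()
```

The custom field can then be filtered with a plain WHERE clause, which is the advantage named in the question. The strict validation matters because user-supplied names end up inside a DDL statement.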
The first drawback I see is that you need to grant your application service with ALTER rights.
This implies that your security model needs careful attention as the application will be able to not only add fields but to drop and rename them as well and create some tables (at least for MySQL).
Secondly, how would you distinguish which fields belong to which user? Or can the fields created by user A be accessed by user B?
Note that the number of columns may also grow significantly. If each of the 100 users adds 2 fields, we are already talking about 200 extra columns.
Personally, I would use one of the two approaches or a mix of them:
Using a serialized field
I would add one text field to the table in which I would store a serialized dictionary or dictionaries:
{
user_1: {key1: val1, key2: val2, ...},
user_2: {key1: val1, key2: val2, ...},
...
}
The drawback is that the values are not easily searchable.
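A small sketch of the serialized-field idea, assuming JSON as the serialization format and a hypothetical item table with a custom_fields text column (Python with sqlite3):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT, custom_fields TEXT)"
)

def save_custom_fields(item_id, fields_by_user):
    """Serialize the per-user dictionaries into the single text column."""
    conn.execute(
        "UPDATE item SET custom_fields = ? WHERE id = ?",
        (json.dumps(fields_by_user), item_id),
    )

def find_items(user, key, value):
    """Searching means deserializing every row in application code."""
    hits = []
    for item_id, raw in conn.execute("SELECT id, custom_fields FROM item ORDER BY id"):
        fields = json.loads(raw) if raw else {}
        if fields.get(user, {}).get(key) == value:
            hits.append(item_id)
    return hits

conn.execute("INSERT INTO item (id, name) VALUES (1, 'first'), (2, 'second')")
save_custom_fields(1, {"user_1": {"color": "red"}, "user_2": {"color": "blue"}})
save_custom_fields(2, {"user_1": {"color": "green"}})
```

find_items has to read and parse every row, which is what "not easily searchable" means in practice.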
Using a multi-type name/value table
fields table:
id: int
user_id: int
field_name: varchar(100)
type: enum('INT', 'REAL', 'STRING')
values table:
field_id: int
row_id: int # the main table row id
int_value: int
float_value: float
text_value: text
Of course, it requires a join and is a bit more complicated to implement but far more generic and, if indexed properly, quite efficient.
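The two tables above could be sketched like this with Python's sqlite3 module. The values table is renamed field_values here because VALUES is a reserved word in SQL; the id column on fields, the index, and the set_value helper are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fields (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    field_name TEXT NOT NULL,
    type TEXT NOT NULL CHECK (type IN ('INT', 'REAL', 'STRING'))
);
CREATE TABLE field_values (
    field_id INTEGER NOT NULL REFERENCES fields(id),
    row_id INTEGER NOT NULL,          -- id of the row in the main table
    int_value INTEGER,
    float_value REAL,
    text_value TEXT,
    PRIMARY KEY (field_id, row_id)
);
CREATE INDEX idx_values_int ON field_values (field_id, int_value);
""")

# Map each declared type to the column that stores it.
VALUE_COLUMN = {"INT": "int_value", "REAL": "float_value", "STRING": "text_value"}

def set_value(field_id, row_id, value):
    """Store value in the column matching the field's declared type."""
    (field_type,) = conn.execute(
        "SELECT type FROM fields WHERE id = ?", (field_id,)
    ).fetchone()
    column = VALUE_COLUMN[field_type]  # taken from a fixed mapping, safe to interpolate
    conn.execute(
        f"INSERT OR REPLACE INTO field_values (field_id, row_id, {column}) VALUES (?, ?, ?)",
        (field_id, row_id, value),
    )

conn.execute("INSERT INTO fields VALUES (1, 7, 'age', 'INT')")
set_value(1, 101, 30)
set_value(1, 102, 45)
older_than_40 = conn.execute(
    "SELECT row_id FROM field_values WHERE field_id = 1 AND int_value > 40"
).fetchall()
```

Because integers live in a real integer column, the comparison int_value > 40 is numeric and can use the index, which a single text value column would not allow.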
I see nothing wrong with adding new custom fields to the database table.
With this approach, the specific/most appropriate type can be used i.e. need an int field? define it as int. Whereas with a name/value type table, you'd be storing multiple data types as one type (nvarchar probably) - unless you complete that name/value table with multiple columns of different types and populate the appropriate one but that is a bit horrible.
Also, adding new columns makes it easier to query/no need to involve a join to a new name/value table.
It may not feel as generic, but I feel that's better than having a "one-size fits all" name/value table.
From an SQL Server point of view (2005 onwards)....
An alternative would be to create one "custom data" field of type XML. This would be truly generic and would require neither field creation nor a separate name/value table. It also has the benefit that not all records have to have the same custom data (i.e. the one field is common, but what it contains doesn't have to be). I'm not 100% sure about the performance impact, but XML data can be indexed.