We're building a new piece of software for our company to manage our inventory.
The goal for the tool is to be customizable by the customer.
My part is mostly on the DB side. We have chosen MariaDB as our DB engine, and while we are working with the rather static functionality of a relational DB, we want to realize a rather dynamic solution.
Our chief programmer has explained to me the basics of the concept I shall implement into our DB:
We want a table which basically just consists of other tables.
Let's call it "maintable".
Maintable shall then reference its "attributes", which are the other tables.
For example, maintable references "Workstations".
"Workstations" then contains attributes like CPU, RAM, Drives, PSU etc..
And now comes the part which I didn't completely understand. The actual VALUES to these attributes in "Workstations" shall not be inserted into "Workstations". Instead, they are packed into another (junction?) table.
The reason for this approach is that the customer shall be able to customize the DB to his needs.
When the customer wants to add another attribute, he shall be able to do so. For example, if a new PSU now requires another attribute for an additional serial number, then the customer shall be able to simply create this new attribute in the front-end input form and then persist it to the DB.
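To check my understanding, here is roughly how I picture the layout so far (all table and column names are my own guesses). From what I've read, this resembles what is usually called the "entity-attribute-value" (EAV) pattern:

-- The "table of tables" and the attributes it references
CREATE TABLE maintable (
    tableId   INT AUTO_INCREMENT PRIMARY KEY,
    tableName VARCHAR(64) NOT NULL            -- e.g. 'Workstations'
);

CREATE TABLE attribute (
    attributeId   INT AUTO_INCREMENT PRIMARY KEY,
    tableId       INT NOT NULL,               -- which "table" it belongs to
    attributeName VARCHAR(64) NOT NULL,       -- e.g. 'CPU', 'RAM', 'PSU serial'
    FOREIGN KEY (tableId) REFERENCES maintable (tableId)
);

-- The values live in their own table, so adding a new attribute needs no DDL
CREATE TABLE attribute_value (
    itemId      INT NOT NULL,                 -- the concrete workstation
    attributeId INT NOT NULL,
    value       VARCHAR(255),
    PRIMARY KEY (itemId, attributeId),
    FOREIGN KEY (attributeId) REFERENCES attribute (attributeId)
);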
If someone could point me to good tutorials explaining this type of DB concept, I would be glad as well! :-)
In my current design, I have app_group, student and group_article:
To structurally ensure that a group_article is only associated with a student from that same group, the foreign keys "publisher" and "app_group" are taken from the join entity group_member (1), as opposed to having them issued from student and app_group individually. This way, someone with the right to insert new records into the database cannot introduce incoherent data, such as adding an article written by a student who isn't even in that group, which would be poor design. Now, I want to generalize this approach to multiple students or multiple groups. I now have group_message, group_message_in and group_message_out, which form an inheritance chain (group_message is the base, which is an abstract entity in Symfony, and both group_message_in and group_message_out extend it):
Initially, I was planning to embed the group foreign key on the base class (group_message) and have the sender/recipient (respectively on group_message_out and group_message_in) be taken from student directly:
However, this will leave the database vulnerable to incoherence as in the first example, e.g. a student from group A can be associated with a message that targets a student from group B, which is not desirable (only students from the same group can exchange a group_message).
I'm well aware that I can mitigate this risk in code, but I want a solution similar to (1), and to know whether it is achievable with Doctrine, since MySQL itself might have ways of solving such a problem that aren't supported by Doctrine.
A relational solution to your problem would look something like this:
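A minimal sketch in MySQL DDL (the names are illustrative and the tables are pared down to their keys):

CREATE TABLE app_group (
    groupName VARCHAR(32) PRIMARY KEY
);

CREATE TABLE student (
    studentId INT,
    groupName VARCHAR(32) NOT NULL,
    PRIMARY KEY (studentId, groupName),
    FOREIGN KEY (groupName) REFERENCES app_group (groupName)
);

CREATE TABLE group_message (
    messageId INT,
    groupName VARCHAR(32) NOT NULL,
    PRIMARY KEY (messageId, groupName),
    FOREIGN KEY (groupName) REFERENCES app_group (groupName)
);

-- Both composite FKs share the groupName column, so a message and its
-- sender can never disagree about the group.
CREATE TABLE group_message_out (
    messageId INT,
    groupName VARCHAR(32) NOT NULL,
    senderId  INT NOT NULL,
    PRIMARY KEY (messageId, groupName),
    FOREIGN KEY (messageId, groupName) REFERENCES group_message (messageId, groupName),
    FOREIGN KEY (senderId, groupName)  REFERENCES student (studentId, groupName)
);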
The integrity that you seek would be achieved by the PK-FK relationships and by assigning a student to a group using the groupName columns.
Your question then becomes something like "How can I use Doctrine to do the same thing?"
To the best of my knowledge Doctrine uses a set of PHP libraries to create what its proponents call a "persistence layer" that stores what it calls "Entities". With Doctrine, the term "Entity" is a synonym for "Class" in the OO paradigm.
In other words Doctrine stores classes in the data layer.
And now we can see the problem.
A relational schema is a structure of relations which is a completely different kind of artefact than a collection of classes.
The OO/Relational divide has been called an "impedance mismatch". Unfortunately this term obscures more than it reveals.
To quote from the Wikipedia article: "There have been some attempts at building object-oriented database management systems (OODBMS) that would avoid the impedance mismatch problem. They have been less successful in practice than relational databases however, partly due to the limitations of OO principles as a basis for a data model."
I suggest that you also review Ted Neward's article "The Vietnam of Computer Science."
This new answer shows the object-role model, the relational schema that it generates, and the logic that is implied by the new constraint (shown by the red arrow).
The object-role model.
This is the logic that is asserted by the fact type "Student(.id) is a member of Group(.name)".
Now as the domain expert, you can read this verbalization and tell me whether it is True or False in your domain.
Please note that all I did as the modeler, was to change the constraint (shown by the red arrow) and the ORM tool called NORMA generated the new verbalization that you see here.
When the domain expert agrees that the model conforms to the requirements then it takes a few seconds to generate the SQL DDL that can then be used to create a new database schema in an RDBMS.
This question aims to find the cleanest and "best" way to handle this kind of problem.
I've read many questions about how to handle inheritance in SQL; I like the Table Per Type model best and would like to use it. The problem with it is that you have to know which type you are going to query in order to do the proper join.
Let's say we have three tables: Son, Daughter, and Child.
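A minimal sketch of that layout (randomColumn and whatEver stand in for the sub-type-specific columns):

-- Table Per Type: each sub-type table shares its primary key with child.
CREATE TABLE child (
    id  INT PRIMARY KEY,
    age INT
);

CREATE TABLE son (
    id           INT PRIMARY KEY,
    randomColumn VARCHAR(100),   -- son-specific data
    FOREIGN KEY (id) REFERENCES child (id)
);

CREATE TABLE daughter (
    id       INT PRIMARY KEY,
    whatEver VARCHAR(100),       -- daughter-specific data
    FOREIGN KEY (id) REFERENCES child (id)
);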
This works very well if you, for example, want to query all daughters. You can simply join the child table and get all the information.
What I'm trying to do is to query a Child by ID and get the associated subclass information. What I could do is add a column Type to the child and select the associated data with a second select, but that does not seem very nice. Another way would be to join all sub tables, but that doesn't seem that nice either.
Is there an inheritance model to solve this kind of problem in a clean, nice and performant way?
I'm using MySQL, by the way.
Given your detailed definition in the comment with the use case
The Server gets the http request domain.com/randomID.
it becomes apparent that you have a single ID at hand for which you want to retrieve the attributes of the derived entities. For your case, I would recommend the LEFT JOIN approach:
SELECT age,
       son.id IS NOT NULL AS isSon,
       randomColumn,
       daughter.id IS NOT NULL AS isDaughter,
       whatEver
FROM child
LEFT JOIN son ON child.id = son.id
LEFT JOIN daughter ON child.id = daughter.id
WHERE child.id = @yourRandomId   -- placeholder for the requested ID
This approach, BTW, stays very close to your current database design and thus you would not have to change much. Yet, you are able to benefit from the storage savings that the improved data model provides.
Besides that, I do not see many alternatives:
You have different columns with different datatypes (especially looking at your use case), so it is not possible to reduce the number of columns by combining some of them.
Introducing a type attribute was already rejected in your question, as was sending separate SELECT statements.
In the comment you state that you are looking for something like Map<ID, Child> in MySQL. Please note that this Java-ish expression is a compile-time construct which gets instantiated at runtime with the corresponding type of the instance. SQL does not know the difference between runtime and compile time, so there is no need for such a generic expression. Finally, please note that in your Java program you would also need to analyse (by introspection or use of instanceof) which type your value instance has -- and that is likewise a "single-record" activity you need to perform.
I'm fairly new to Tridion and I have to implement functionality that will allow a content editor to create a component and assign multiple date ranges (available dates) to it. These will need to be queried from the broker to provide a search functionality.
Originally, this only required a single start and end date, and so they were implemented as individual metadata fields.
I am proposing to use an embedded schema within the schema's 'available dates' metadata field to allow multiple start and end dates to be assigned.
However, as the field now allows multiple values, the data is stored in the broker as comma-separated values in the 'KEY_STRING_VALUE' column, rather than as date values in the 'KEY_DATE_VALUE' column as it was when only single start and end values were allowed.
e.g.
KEY_NAME | KEY_STRING_VALUE
end_date | 2012-04-30T13:41:00, 2012-06-30T13:41:00
start_date | 2012-04-21T13:41:00, 2012-06-01T13:41:00
This is now causing issues with my broker querying as I can no longer use simple query logic to retrieve the items I require for the search based on the dates.
Before I start writing C# logic to parse these comma-separated dates and search based on those, I was wondering if anyone has had similar requirements/experiences in the past and has implemented this differently, to reduce the amount of parsing code required and to use broker querying to complete the search.
I'm developing this on Tridion 2009 but using the 5.3 Broker (for legacy reasons) so the query currently looks like this (for the single start/end dates):
query.SetCustomMetaQuery("(KEY_NAME='end_date' AND KEY_DATE_VALUE>'" + startDateStr + "') AND (ITEM_ID IN (SELECT ITEM_ID FROM CUSTOM_META WHERE KEY_NAME='start_date' AND KEY_DATE_VALUE<'" + endDateStr + "'))");
Any help is greatly appreciated.
Just wanted to come back and give some details on how I finally approached this should anyone else face the same scenario.
I proposed the set number of fields to the client (as suggested by Miguel) but the client wasn't happy with that level of restriction.
Therefore, I ended up implementing the embeddable schema containing the start and end dates which gave most flexibility. However, limitations in the Broker API meant that I had to access the Broker DB directly - not ideal, but the client has agreed to the approach to get the functionality required. Obviously this would need to be revisited should any upgrades be made in the future.
All the processing of dates and the available periods were done in C# which means the performance of the solution is actually pretty good.
One thing that I did discover that caused some issues was that if you have multiple values for the field using the embedded schema (i.e. in this case, multiple start and end dates), then the metadata is stored in the KEY_STRING_VALUE column of the CUSTOM_META table. However, if you only have a single value in the field (i.e. one start and end date), then these are stored as dates in the KEY_DATE_VALUE column, just as if you'd used single fields rather than an embeddable schema. It seems a sensible approach for Tridion to take, but it makes writing the queries and the parsing code slightly more complicated!
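A simplified sketch of the kind of direct query against the CUSTOM_META table this involves (the table and column names appear above; everything else is illustrative):

-- Multi-valued dates arrive in KEY_STRING_VALUE (comma separated),
-- single values in KEY_DATE_VALUE; the C# side parses whichever is set.
SELECT ITEM_ID, KEY_NAME, KEY_DATE_VALUE, KEY_STRING_VALUE
FROM CUSTOM_META
WHERE KEY_NAME IN ('start_date', 'end_date');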
This is a complex scenario, as you will have to go through all the DCPs and parse those strings to determine whether they match the search criteria.
There is a way you could convert that metadata (comma-separated) into single values in the broker, but the names of the fields need to be different: Range1, Range2, ...., RangeN.
You can do that with a deployer extension, where you change the XML structure of the package and convert each of those strings into different values (1, 2, ..., n).
This extension can take some time if you are not familiar with deployer extensions, and it doesn't solve your scenario 100%.
The problem with this is that you still have to apply several conditions to retrieve those values, and there is always a limit you have to set (versus the user, who can add as many values as they want).
Sample:
query.SetCustomMetaQuery("(KEY_NAME='end_date1' ...
query.SetCustomMetaQuery("(KEY_NAME='end_date2' ...
query.SetCustomMetaQuery("(KEY_NAME='end_date3' ...
query.SetCustomMetaQuery("(KEY_NAME='end_date4' ...
Probably the fastest and easiest way to achieve that is, instead of using a multi-value field, to use different fields. I understand that this is not the most generic scenario and there are business-requirement implications, but it can simplify the development.
My previous comments are in the context of using only the Broker API, but you can take advantage of a search engine if one is part of your architecture.
You can index the Broker Database and massage the data.
Using the Search Engine API you can extract the IDs of the Components/Component Templates and then use the Broker API to retrieve the proper information.
I'm a software developer. I love to code, but I hate databases... Currently, I'm creating a website on which a user will be allowed to mark an entity as liked (like in FB), tag it and comment.
I get stuck on the database table design for this functionality. The solution is trivial if we only need this for one type of thing (e.g. photos). But I need to enable it for 5 different things (for now; I also assume that this number can grow as the whole service grows).
I found some similar questions here, but none of them have a satisfying answer, so I'm asking this question again.
The question is how to properly, efficiently, and flexibly design the database so that it can store comments for different tables, likes for different tables, and tags for them. A design pattern as an answer would be best ;)
Detailed description:
I have a table User with some user data, and 3 more tables: Photo with photographs, Articles with articles, Places with places. I want to enable any logged user to:
comment on any of those 3 tables
mark any of them as liked
tag any of them with some tag
I also want to count the number of likes for every element and the number of times that particular tag was used.
1st approach:
a) For tags, I will create a table Tag [TagId, tagName, tagCounter], then I will create many-to-many relationship tables: Photo_has_tags, Place_has_tag, Article_has_tag (sketched after this list).
b) The same goes for comments.
c) I will create tables LikedPhotos [idUser, idPhoto], LikedArticles [idUser, idArticle], LikedPlace [idUser, idPlace]. The number of likes will be calculated by queries (which, I assume, is bad). And...
I really don't like this design for the last part; it smells bad to me ;)
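For illustration, a minimal sketch of the tag tables from this first approach (column types are placeholders; the other relationship tables are analogous):

CREATE TABLE Tag (
    TagId      INT AUTO_INCREMENT PRIMARY KEY,
    tagName    VARCHAR(64) NOT NULL,
    tagCounter INT NOT NULL DEFAULT 0
);

-- One many-to-many table per taggable entity
CREATE TABLE Photo_has_tags (
    idPhoto INT NOT NULL,
    TagId   INT NOT NULL,
    PRIMARY KEY (idPhoto, TagId),
    FOREIGN KEY (TagId) REFERENCES Tag (TagId)
);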
2nd approach:
I will create a table ElementType [idType, TypeName == some table name] which will be populated by the administrator (me) with the names of tables that can be liked, commented or tagged. Then I will create tables:
a) LikedElement [idLike, idUser, idElementType, idLikedElement], and the same for Comments and Tags with the proper columns for each. Now, when I want to mark a photo as liked, I will insert:
SET @typeId = (SELECT idType FROM ElementType WHERE TypeName = 'Photo');
INSERT INTO LikedElement (idUser, idElementType, idLikedElement)
VALUES (@userId, @typeId, @photoId);
and for places:
SET @typeId = (SELECT idType FROM ElementType WHERE TypeName = 'Place');
INSERT INTO LikedElement (idUser, idElementType, idLikedElement)
VALUES (@userId, @typeId, @placeId);
and so on... I think that the second approach is better, but I also feel like something is missing in this design...
Lastly, I also wonder where the best place is to store the counter for how many times an element was liked. I can think of only two ways:
in the element (Photo/Article/Place) table
by SELECT COUNT().
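In SQL terms the two options would be something like this (likeCount is a hypothetical column I would have to add):

-- Option 1: maintain a denormalized counter whenever a like is inserted
UPDATE Photo SET likeCount = likeCount + 1 WHERE idPhoto = @photoId;

-- Option 2: count on demand
SELECT COUNT(*) FROM LikedElement
WHERE idElementType = @photoTypeId AND idLikedElement = @photoId;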
I hope that my explanation of the issue is more thorough now.
The most extensible solution is to have just one "base" table (connected to "likes", tags and comments), and "inherit" all other tables from it. Adding a new kind of entity involves just adding a new "inherited" table - it then automatically plugs into the whole like/tag/comment machinery.
The entity-relationship term for this is "category" (see the ERwin Methods Guide, section "Subtype Relationships"). The category symbol is:
Assuming a user can like multiple entities, the same tag can be used for more than one entity, but a comment is entity-specific, your model could look like this:
BTW, there are roughly 3 ways to implement the "ER category":
All types in one table.
All concrete types in separate tables.
All concrete and abstract types in separate tables.
Unless you have very stringent performance requirements, the third approach is probably the best (meaning the physical tables match 1:1 the entities in the diagram above).
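As a rough sketch of that third option in MySQL DDL (all names are illustrative, not prescriptive):

-- One base table; likes, tags and comments all reference it, so a new
-- entity kind plugs in without touching that machinery.
CREATE TABLE entity (
    entityId INT AUTO_INCREMENT PRIMARY KEY
);

CREATE TABLE photo (
    entityId INT PRIMARY KEY,
    url      VARCHAR(255) NOT NULL,
    FOREIGN KEY (entityId) REFERENCES entity (entityId)
);

CREATE TABLE article (
    entityId INT PRIMARY KEY,
    body     TEXT,
    FOREIGN KEY (entityId) REFERENCES entity (entityId)
);

CREATE TABLE user_likes_entity (
    userId   INT NOT NULL,
    entityId INT NOT NULL,
    PRIMARY KEY (userId, entityId),
    FOREIGN KEY (entityId) REFERENCES entity (entityId)
);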
Since you "hate" databases, why are you trying to implement one? Instead, solicit help from someone who loves and breathes this stuff.
Otherwise, learn to love your database. A well-designed database simplifies programming and engineering the site, and smooths its continuing operation. Even an experienced DB designer will not have complete and perfect foresight: some schema changes down the road will be needed as usage patterns emerge or requirements change.
If this is a one-man project, program the database interface into simple operations using stored procedures: add_user, update_user, add_comment, add_like, upload_photo, list_comments, etc. Do not embed the schema into even one line of code. In this manner, the database schema can be changed without affecting any code: only the stored procedures should know about the schema.
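For illustration, a minimal sketch of one such procedure in MySQL; the table and its columns are assumptions, not an actual schema:

DELIMITER //

-- Callers only ever see add_comment(); if the comment table's layout
-- changes later, only this procedure needs updating.
CREATE PROCEDURE add_comment (
    IN p_user_id   INT,
    IN p_entity_id INT,
    IN p_body      TEXT
)
BEGIN
    INSERT INTO comment (user_id, entity_id, body, created_at)
    VALUES (p_user_id, p_entity_id, p_body, NOW());
END //

DELIMITER ;

-- Usage: CALL add_comment(42, 7, 'Nice photo!');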
You may have to refactor the schema several times. This is normal. Don't worry about getting it perfect the first time. Just make it functional enough to prototype an initial design. If you have the luxury of time, use it some, and then delete the schema and do it again. It is always better the second time.
This is a general idea; please don't pay much attention to the field name styling, but more to the relations and structure.
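A sketch of the actions table that the queries below assume (the field names come from those queries; the types and the last two columns are my assumptions):

CREATE TABLE actions (
    id         INT AUTO_INCREMENT PRIMARY KEY,
    id_Stuff   INT         NOT NULL,  -- id of the photo/article/place acted on
    typeStuff  VARCHAR(20) NOT NULL,  -- 'photo', 'article', 'place', ...
    typeAction VARCHAR(20) NOT NULL,  -- 'comment', 'like', 'tag'
    id_User    INT         NOT NULL,  -- who performed the action (assumed)
    data       TEXT                   -- comment body or tag text (assumed)
);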
This pseudocode will get all the comments for the photo with ID 5:
SELECT * FROM actions
WHERE actions.id_Stuff = 5
  AND actions.typeStuff = 'photo'
  AND actions.typeAction = 'comment'
This pseudocode will get all the likes, or the users who liked the photo with ID 5
(you may use COUNT() to just get the number of likes):
SELECT * FROM actions
WHERE actions.id_Stuff = 5
  AND actions.typeStuff = 'photo'
  AND actions.typeAction = 'like'
As far as I understand, several tables are required, with many-to-many relations between them.
A table which stores user data such as name, surname, and birth date, with an identity field.
A table which stores data types. These types may be photos, shares, or links. Each type must have a unique table, so there is a relation between their individual tables and this table.
Each different data type has its own table: for example, status updates, photos, links.
The last table is for the many-to-many relation, storing an id, user id, data type, and data id.
Look at the access patterns you are going to need. Do any of them seem to be made particularly difficult or inefficient by one design choice or the other?
If not, favour the one that requires fewer tables.
In this case:
Add comment: you either pick a particular many-to-many table or insert into a common table with a known specific identifier for what is being liked. I think client code will be slightly simpler in your second case.
Find comments for item: here it seems using a common table is slightly easier - we just have a single query parameterised by the type of entity (see the sketch after this list).
Find comments by a person about one kind of thing: a simple query in either case.
Find all comments by a person about all things: this seems a little gnarly either way.
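For instance, that single query could look like this, with a common comment table shaped like the question's LikedElement (CommentElement and its columns are hypothetical names):

SELECT c.*
FROM CommentElement c
JOIN ElementType t ON t.idType = c.idElementType
WHERE t.TypeName = 'Photo'          -- the entity type is just a parameter
  AND c.idCommentedElement = 5;     -- the item's id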
I think your "discriminated" approach, option 2, yields simpler queries in some cases and doesn't seem much worse in the others, so I'd go with it.
Consider using a table per entity for comments and the like. More tables allow better sharding and scaling, and it's not a problem to manage many similar tables in any framework I know.
One day you'll need to optimize reads from such a structure. You can easily create aggregating tables over the base ones and lose a bit on writes.
One big table with a type dictionary may become unmanageable one day.
Definitely go with the second approach, where you have one table and store the element type for each row; it will give you a lot more flexibility. Basically, when something can logically be done with fewer tables, it is almost always better to go with fewer tables. One advantage that comes to mind for your particular case: suppose you want to delete all liked elements of a certain user. With the first approach you need to issue one query per element type, but with the second approach it can be done with only one query. Or consider adding a new element type: with the first approach this involves creating a new table, but with the second approach you don't need to change anything...
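For instance, with the question's LikedElement table that deletion is a single statement:

-- Removes this user's likes across photos, articles, places, etc. at once
DELETE FROM LikedElement WHERE idUser = 42;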