Best way for 1 to 1 relationship - relational-database

I'm building a social-network-style website where users can write multiple descriptions of things on the site. I created a ResourceVersions table that holds all the actual content in the database, and I link it to another object in the database, where the relationship and type of content are stored. This way I can query the ResourceVersions table to see the most recently changed/added items. For example:
A "Topic" has a "User" which points to a "ResourceVersion" for the description of the user.
OR:
A "Topic" has a "Trick" which points to a "Resource" which has a "ResourceVersion" with a description of the trick.
OR:
A "Trick" has a "UserTrick" which has a "UserVideoClip" which points to a "ResourceVersion" for the description of that clip.
The question now is: what is the best way to set up the 1:1 relationship between Resources/ResourceVersions and the other tables in the database? Right now the relationship table points towards Resources/ResourceVersions, but should it be the other way around? Or does this not matter?
The point of entry is usually the ResourceVersion (at least when loading multiple items, on the wall for example). So when I query the ResourceVersion table and join the table that points to it, the database has to search on the foreign key. If I turn it around, it searches on the primary key.
Is it true that this will change the speed?

The answer was yes, it works better to search for primary keys.
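A minimal sketch of the "turned around" layout, using Python's built-in sqlite3 module only so that it runs standalone (the question does not name a database engine). The table `user_description`, its columns and the `recent_descriptions` helper are hypothetical names, not from the original schema. The owning table reuses the ResourceVersion id as its own primary key, so the join from the ResourceVersion side is a primary-key lookup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE resource_version (
    id INTEGER PRIMARY KEY,
    body TEXT NOT NULL,
    changed_at TEXT NOT NULL
);
-- Hypothetical owning table: it shares its primary key with resource_version,
-- so the 1:1 join from resource_version is a primary-key lookup.
CREATE TABLE user_description (
    resource_version_id INTEGER PRIMARY KEY REFERENCES resource_version(id),
    user_name TEXT NOT NULL
);
-- Supports "most recently changed first" queries on the entry table.
CREATE INDEX idx_resource_version_changed_at ON resource_version(changed_at);
""")
conn.execute("INSERT INTO resource_version VALUES (1, 'About Alice', '2024-01-01')")
conn.execute("INSERT INTO resource_version VALUES (2, 'About Bob', '2024-02-01')")
conn.execute("INSERT INTO user_description VALUES (1, 'alice')")
conn.execute("INSERT INTO user_description VALUES (2, 'bob')")

def recent_descriptions(limit):
    """Return (body, user_name) pairs, most recently changed first."""
    return conn.execute("""
        SELECT rv.body, ud.user_name
        FROM resource_version AS rv
        JOIN user_description AS ud ON ud.resource_version_id = rv.id
        ORDER BY rv.changed_at DESC
        LIMIT ?
    """, (limit,)).fetchall()
```

If the foreign key stays on the other side instead, putting an index on that foreign key column should also let the join use an index lookup rather than a scan, so it is worth measuring both layouts on real data.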

Related

SQL many-to-many relation 3 ways

Hej all,
Let's say I have 4 tables named "user", "office", "product" and "event", and another table named "document". The same document can be assigned to one or many users, offices, products and events, so we need a many-to-many relationship. I can see 3 ways to do that:
- A table named "user_document", plus others named "office_document", "product_document" and "event_document". Each has a "document_id" field that is a foreign key to the document id, and a second field ("user_id" for user_document) that is a foreign key to the user id (and so on for office, product and event, of course).
OR
- A table named "document_ownership" with these fields: "document_id", "user_id", "office_id", "product_id" and "event_id". Here document_id must be NOT NULL, and one or more of the other fields can be NULL. For example, if I assign the same document to a user and a product, I will have a row where document_id, user_id and product_id are not NULL.
OR
- A table named "document_ownership" with these fields: "document_id", "relation_type" and "relation_id". Here relation_type is, for example, a string (representing the relation table name) or a foreign key pointing to an additional table, named for example "relationtype", which holds strings like "user" (id=1), "office" (id=2), "product" (id=3) and "event" (id=4) (again representing the relation table name). relation_id is the id of the row in the table named by relation_type.
My question is: what are the pros and cons of these 3 approaches, and which one is best practice?
Thanks in advance for your advice,
Michal
This question is not really answerable as asked. A purist would say that approach 1 is correct, but it is not always that simple. Think of it like this: your database design should express the relationships between the data and what the data means. So each of your approaches implies several things about the nature of the data.
Approach 1 says that user, office, product and event are important, and oh yeah they can have documents. Maybe.
Approach 2 says that documents are important, and we need to track what each document relates to. So the document is the key thing and everything else is annotated around that.
Approach 3 is more complicated and technical and does not really give an idea of how you want the data to be used.
In all cases the data is the same. The design just tells the story of how the data should be used.
Sorry to wax lyrical. Just my $0.02.
In a conceptual data model (Merise) view you have:
Document-0,n---------0,n-User
Document-0,n---------0,n-Event
...
This is the logical view.
When you transform this into the physical data view, you end up with one more table for each relation.
So the 1st solution is the way to go if you want to apply best practice in data modeling.
As for the two other solutions, which break some normal forms:
The second solution is a total no-go. You will have a lot of NULL values everywhere and will struggle to do even basic statistics because of that.
The third solution looks like a plate of spaghetti, but it will broadly work and is, in my view, a good alternative, IF you can live with the loss of integrity constraints.
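To make the "loss of integrity constraints" concrete, here is a small sketch contrasting approach 1 with approach 3. It uses Python's built-in sqlite3 module only so that it runs standalone; column types and the `try_insert` helper are assumptions for illustration. With a junction table, the database rejects a link to a user that does not exist. With the generic table, relation_id cannot be declared as a foreign key, so an orphan link is silently accepted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE document (id INTEGER PRIMARY KEY, title TEXT NOT NULL);

-- Approach 1: one junction table per relation, with real foreign keys.
CREATE TABLE user_document (
    user_id INTEGER NOT NULL REFERENCES user(id),
    document_id INTEGER NOT NULL REFERENCES document(id),
    PRIMARY KEY (user_id, document_id)
);

-- Approach 3: generic ownership table. relation_id cannot be a foreign key
-- because its target table depends on relation_type.
CREATE TABLE document_ownership (
    document_id INTEGER NOT NULL REFERENCES document(id),
    relation_type TEXT NOT NULL,
    relation_id INTEGER NOT NULL,
    PRIMARY KEY (document_id, relation_type, relation_id)
);
""")
conn.execute("INSERT INTO user VALUES (1, 'michal')")
conn.execute("INSERT INTO document VALUES (10, 'contract')")

def try_insert(sql, params):
    """Return True if the insert is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

# Valid link: user 1 exists.
ok_valid = try_insert("INSERT INTO user_document VALUES (?, ?)", (1, 10))
# Orphan link: user 999 does not exist, so the foreign key rejects it.
ok_orphan = try_insert("INSERT INTO user_document VALUES (?, ?)", (999, 10))
# The same orphan in the generic table is accepted: nothing checks relation_id.
ok_generic_orphan = try_insert(
    "INSERT INTO document_ownership VALUES (?, ?, ?)", (10, "user", 999))
```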

Optimising a database with two separate category tables

I have a database that provides all the data storage for a website. It stores articles in a knowledge base, and services for internal and end-user access.
Both articles and services are stored in categories which can have an indefinite amount of parent categories by self-referencing. It is possible to add multiple categories to either via the connecting table.
It needs to be possible to find the categories of a service or an article, including all the way up the category-parent tree. It also needs to be possible to find the services or articles of a category. Of course, a category can't have both.
Is this an optimal way of doing this? It doesn't feel right, and I'd welcome alternate ideas.
EDIT: Does this way usually work? The categories all have roughly the same content, just a name and description and perhaps an image.
The primary keys of category_service and category_article should include both fields in the respective tables (if a category can have more than one service or article). Also, do you really need a VARCHAR(45) type indicator? I recommend a short ENUM instead.
Otherwise, the basic design in the second diagram looks good. I suggest you add a closure table for efficiently querying recursive hierarchies.
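A closure table could look roughly like the sketch below. It uses Python's built-in sqlite3 module only so that it runs standalone; the table `category_closure`, its columns and the helper functions are hypothetical names, and the category columns are guesses based on the question. The idea is to store one row for every (ancestor, descendant) pair, so "all the way up the tree" becomes a single non-recursive query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (
    id INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES category(id),
    name TEXT NOT NULL
);
-- Closure table: one row per (ancestor, descendant) pair, including self-pairs.
CREATE TABLE category_closure (
    ancestor_id INTEGER NOT NULL REFERENCES category(id),
    descendant_id INTEGER NOT NULL REFERENCES category(id),
    depth INTEGER NOT NULL,
    PRIMARY KEY (ancestor_id, descendant_id)
);
""")

def add_category(cat_id, name, parent_id=None):
    """Insert a category and maintain its closure rows."""
    conn.execute("INSERT INTO category VALUES (?, ?, ?)", (cat_id, parent_id, name))
    # Self-pair at depth 0.
    conn.execute("INSERT INTO category_closure VALUES (?, ?, 0)", (cat_id, cat_id))
    if parent_id is not None:
        # Every ancestor of the parent (and the parent itself) is also an
        # ancestor of the new category, one level further away.
        conn.execute("""
            INSERT INTO category_closure (ancestor_id, descendant_id, depth)
            SELECT ancestor_id, ?, depth + 1
            FROM category_closure
            WHERE descendant_id = ?
        """, (cat_id, parent_id))

def ancestors(cat_id):
    """Return ancestor names, nearest parent first, without recursion."""
    rows = conn.execute("""
        SELECT c.name
        FROM category_closure AS cc
        JOIN category AS c ON c.id = cc.ancestor_id
        WHERE cc.descendant_id = ? AND cc.depth > 0
        ORDER BY cc.depth
    """, (cat_id,)).fetchall()
    return [name for (name,) in rows]

add_category(1, "Hardware")
add_category(2, "Printers", parent_id=1)
add_category(3, "Laser printers", parent_id=2)
```

The trade-off is extra rows and extra work on insert (moving a subtree needs its closure rows rebuilt) in exchange for cheap reads.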
If you want to enforce consistency between the category type and the records in category_article/category_service, you can duplicate the type indicator in those tables and include it in the foreign key constraint. Yes, doing so feels redundant, but it's effective. Resist the temptation to combine these two tables; mixing values from different domains in a single column usually leads to more difficulties.
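A sketch of the duplicated type indicator, again using Python's built-in sqlite3 module only so that it runs standalone. The column names (`type`, `category_type`) and the sample rows are assumptions, since the original diagrams are not available. The composite foreign key targets (id, type) on category, and a CHECK pins category_type to 'article', so an article can never be filed under a service category.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE category (
    id INTEGER PRIMARY KEY,
    type TEXT NOT NULL CHECK (type IN ('article', 'service')),
    name TEXT NOT NULL,
    UNIQUE (id, type)  -- target for the composite foreign key below
);
CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE category_article (
    category_id INTEGER NOT NULL,
    -- Duplicated type indicator, pinned to the one value allowed here.
    category_type TEXT NOT NULL CHECK (category_type = 'article'),
    article_id INTEGER NOT NULL REFERENCES article(id),
    PRIMARY KEY (category_id, article_id),
    FOREIGN KEY (category_id, category_type) REFERENCES category(id, type)
);
INSERT INTO category VALUES (1, 'article', 'How-tos');
INSERT INTO category VALUES (2, 'service', 'Hosting');
INSERT INTO article VALUES (100, 'Resetting a password');
""")

def link_article(category_id, article_id):
    """Return True if the link is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO category_article VALUES (?, 'article', ?)",
            (category_id, article_id))
        return True
    except sqlite3.IntegrityError:
        return False

linked_to_article_category = link_article(1, 100)
# Category 2 is a service category, so (2, 'article') has no parent row.
linked_to_service_category = link_article(2, 100)
```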

How bad are duplicate tables or fields left NULL when designing a database?

I am currently designing a DB for a website. It's quite simple, but I will have many entries and I don't want to end up with a weak design. Basically, my problem is whether to create two tables with the same structure, or to create just one and leave some text fields NULL.
I have an area table, but I also need a sub-area. Both will have their own set of images, but only some data will be shared between area and sub-area, and a sub-area might have many long text fields that an area won't have.
So what I did was create a table named area with a boolean field that tells me whether the row is a sub-area, plus a nullable foreign key to itself that points to the parent area when the row is a sub-area. The images table has a foreign key to area (because both areas and sub-areas can have many images).
My problem now is that I have an area-information table (it is going to have quite a lot of fields that I won't always use, so I don't want to load it for nothing). That table has a one-to-one relation to the area table, but some of its fields are specific to sub-areas only. Since I don't have a sub-area-only table, I thought about leaving them NULL in the schema. The fields are TEXT, and I don't know whether this is a big mistake or a sound decision, considering that I don't want to overload the server with queries (there will be plenty of data, and so plenty of traffic).
Any idea? Thanks.
Here is one of several ways you could start to approach this. Note that, despite people running around talking about sixth normal form, practical database design is as much art as science.
Based on what I could glean from your question:
InfoGroups
    InfoGroupID (PK)
    InfoType (unique key with InfoGroupID?)
    Info
Areas
    AreaID (PK)
    any other attributes solely down to area
    InfoGroupID (FK to InfoGroups)
SubAreas
    SubAreaID (PK)
    any other attributes solely down to sub-area
    AreaID (FK to Areas)
    InfoGroupID (FK to InfoGroups)
You could go further if Infos are common to areas/sub-areas: make InfoGroups a many-to-many and have an Info table...
Whether InfoType is a magic number, an enum, a string or an FK to an InfoTypes table is another set of options.
If the only difference between Areas and SubAreas were the link, you could go for a self-referential table, though I personally wouldn't unless sub-areas had further sub-areas.
I don't see this being too expensive off the bat, but I don't know your needs. Data-wise it's simple-ish, neat and efficient, and it's way better than a shed load of ambiguous NULL columns.

Refactoring a One-to-many relation to a Many-to-Many in MySQL: How to formulate the query?

In the initial 'version' of the application that I'm working on, a design consideration wasn't taken into account - no one thought of it.
However, it seems that the original one-to-many relation needs to be refactored into a many-to-many. My question is how best to do this? I'm using MySQL for persistence.
Populating the relationship table will be a one-time effort, so I'd rather go with a simple query or a stored procedure (I'm not well versed in the latter) than write Java/JDBC-based logic to do it (I know I can and it's not too difficult, but that's not what I want).
So here's an example of the relation:
|VirtualWhiteBoard| -1------*- |Post|
A virtual white board can have many posts. The new functionality is: 1 post should belong to multiple white boards if the user chooses to 'duplicate' current white board (not thought of before)
The schema looks like this:
VirtualWhiteBoard (wallName, projectName,dateOfCreation,..., Primary_Key(wallName, projectName));
Post(post_id, wallName,postData,..., Primary_Key(post_id), Foreign_Key(wallName, projectName));
The virtual white board has a composite primary key (wallName, projectName) and each post has a post_id as primary key
Question: Take the primary keys from VirtualWhiteBoard and Post and add them to the new relation 'has_Post':
|VirtualWhiteBoard| -1------*- |has_Post| -*------1- |Post|
The goal is to keep the previous relationships intact and then drop the foreign key column wallName from Post.
How best to achieve this? Would a query suffice, or would stored procedures be required?
(Although I can do this in the 'application', I'd prefer to do it this way, since such refactorings are bound to arise and I don't want unnecessary Java code lying around that will need to be maintained. I would also personally like to have such a skill too :)
Create your has_Post table with two columns post_id and wallName and populate it with this query:
INSERT INTO has_Post(post_id, wallName) SELECT post_id, wallName FROM Post
Then delete the wallName column from Post table.
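The whole migration can be sketched end to end as below, run through Python's built-in sqlite3 module only so that it is self-contained. One assumption is labeled loudly: because VirtualWhiteBoard has a composite key (wallName, projectName), this sketch assumes Post also carries a projectName column (the question's schema elides it behind "...") and copies both halves of the key into has_Post. Column types and sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE VirtualWhiteBoard (
    wallName TEXT NOT NULL,
    projectName TEXT NOT NULL,
    PRIMARY KEY (wallName, projectName)
);
-- Assumption: Post carries both halves of the composite foreign key.
CREATE TABLE Post (
    post_id INTEGER PRIMARY KEY,
    wallName TEXT NOT NULL,
    projectName TEXT NOT NULL,
    postData TEXT,
    FOREIGN KEY (wallName, projectName)
        REFERENCES VirtualWhiteBoard(wallName, projectName)
);
INSERT INTO VirtualWhiteBoard VALUES ('ideas', 'alpha'), ('ideas', 'beta');
INSERT INTO Post VALUES (1, 'ideas', 'alpha', 'first'),
                        (2, 'ideas', 'beta', 'second');

-- New junction table for the many-to-many relation.
CREATE TABLE has_Post (
    post_id INTEGER NOT NULL REFERENCES Post(post_id),
    wallName TEXT NOT NULL,
    projectName TEXT NOT NULL,
    PRIMARY KEY (post_id, wallName, projectName),
    FOREIGN KEY (wallName, projectName)
        REFERENCES VirtualWhiteBoard(wallName, projectName)
);

-- One-time copy of the existing one-to-many links.
INSERT INTO has_Post (post_id, wallName, projectName)
SELECT post_id, wallName, projectName FROM Post;
""")

links = conn.execute(
    "SELECT post_id, wallName, projectName FROM has_Post ORDER BY post_id"
).fetchall()
```

A plain INSERT ... SELECT is enough here, so no stored procedure is needed. In MySQL, the foreign key constraint on Post would typically have to be dropped before the columns themselves can be dropped.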

Best way to link a table to 2 different keys?

I'm designing a MySQL DB and I'm having the following issue:
Say I have a wall_posts table. Walls can belong to either an event or a user.
Hence the wall_posts table must reference either event_id or user_id (foreign key constraint).
What is the best way to build such a relationship, considering I must always be able to know who the walls belong to ... ?
I've been considering using 2 tables, such as event_wall_posts and user_wall_posts so one got an event_id field and the other a user_id one, but I believe there must be something much better than such a redundant workaround ...
Is there a way to use an intermediate table to link the wall_posts to either an event_id or a user_id ?
Thanks in advance,
Edit: it seems there is no single clear design for this and both approaches seem okay, so
which one will be the fastest if there is a lot of data?
Is it preferable to have 2 separate tables (so queries might be faster, since each table will hold half as much data), or is it still preferable to take the more OO approach, with a single wall_posts table referencing a wall table (and then both users and events will have a unique wall_id)?
Why is it redundant? You won't write code twice to handle them, you will use the same code, just change the name of the table in the SQL.
Another reason to take this approach is that some time in the future you will discover you need new different fields for each entity.
What you're talking about is called an exclusive arc and it's not a good practice. It's hard to enforce referential integrity. You're probably better off using, in an object sense, a common supertype.
That can be modelled in a couple of ways:
A supertype table called, say, wall. Two subtype tables (user_wall and event_wall) that link to a user and event respectively as the owner. The wall_posts table links to the supertype table; or
Putting both entity types into one table and having a type column. That way you're not linking to two separate tables.
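The first option (a supertype table with two subtype tables) could be sketched as follows, using Python's built-in sqlite3 module only so that it runs standalone. The tables `users` and `events`, all column names and the `wall_owner` helper are assumptions for illustration. Posts reference only the supertype `wall`, so every foreign key is a normal, enforceable one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE events (id INTEGER PRIMARY KEY, title TEXT NOT NULL);

-- Supertype: every wall, whoever owns it.
CREATE TABLE wall (id INTEGER PRIMARY KEY);

-- Subtypes: each links one wall to its owner.
CREATE TABLE user_wall (
    wall_id INTEGER PRIMARY KEY REFERENCES wall(id),
    user_id INTEGER NOT NULL UNIQUE REFERENCES users(id)
);
CREATE TABLE event_wall (
    wall_id INTEGER PRIMARY KEY REFERENCES wall(id),
    event_id INTEGER NOT NULL UNIQUE REFERENCES events(id)
);

-- Posts reference only the supertype.
CREATE TABLE wall_posts (
    id INTEGER PRIMARY KEY,
    wall_id INTEGER NOT NULL REFERENCES wall(id),
    body TEXT NOT NULL
);

INSERT INTO users VALUES (1, 'ann');
INSERT INTO events VALUES (1, 'launch');
INSERT INTO wall VALUES (1), (2);
INSERT INTO user_wall VALUES (1, 1);
INSERT INTO event_wall VALUES (2, 1);
INSERT INTO wall_posts VALUES (1, 1, 'hi'), (2, 2, 'welcome');
""")

def wall_owner(post_id):
    """Return (owner_kind, owner_name) for the wall a post belongs to."""
    return conn.execute("""
        SELECT CASE WHEN uw.wall_id IS NOT NULL THEN 'user' ELSE 'event' END,
               COALESCE(u.name, e.title)
        FROM wall_posts AS p
        LEFT JOIN user_wall AS uw ON uw.wall_id = p.wall_id
        LEFT JOIN users AS u ON u.id = uw.user_id
        LEFT JOIN event_wall AS ew ON ew.wall_id = p.wall_id
        LEFT JOIN events AS e ON e.id = ew.event_id
        WHERE p.id = ?
    """, (post_id,)).fetchone()
```

Note that nothing in this sketch stops a wall from having both a user_wall and an event_wall row, or neither; that would need an extra constraint or application logic.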
Go for the simplest solution: add both an event_id and a user_id column to the wall_posts table. Use constraints to enforce that one of them is null, and the other is not.
Anything more complex smells like overnormalization to me :)
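The two-nullable-columns approach with a constraint could look like the sketch below, run through Python's built-in sqlite3 module only so that it is self-contained. The tables `users` and `events` and the `try_post` helper are assumptions for illustration. The CHECK requires exactly one of user_id and event_id to be set. As far as I know, MySQL parses CHECK constraints in older versions but only enforces them from 8.0.16 on, so check your version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE events (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE wall_posts (
    id INTEGER PRIMARY KEY,
    user_id INTEGER REFERENCES users(id),
    event_id INTEGER REFERENCES events(id),
    body TEXT NOT NULL,
    -- Exactly one owner: one column is NULL and the other is not.
    CHECK ((user_id IS NULL) <> (event_id IS NULL))
);
INSERT INTO users VALUES (1, 'ann');
INSERT INTO events VALUES (1, 'launch');
""")

def try_post(user_id, event_id, body):
    """Return True if the post is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO wall_posts (user_id, event_id, body) VALUES (?, ?, ?)",
            (user_id, event_id, body))
        return True
    except sqlite3.IntegrityError:
        return False

user_only = try_post(1, None, "on a user's wall")
event_only = try_post(None, 1, "on an event's wall")
both_owners = try_post(1, 1, "two owners")
no_owner = try_post(None, None, "no owner")
```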
A classical approach to this problem is:
Create a table called wall_container and keep properties common to both users and events in it
Reference both users and events to wall_container
Reference wall_posts to wall_container
However, this is not very efficient, and nothing guarantees that wall_container holds only records that are either a user or an event.
SQL is not particularly good in handling multiple inheritance.
Your wall and event have their own unique IDs, right? Then there is no need for another table. Give the wall_posts table an origin attribute that points to the owning record, whether it is an event's or a user's.
If the wall and event may have the same ID, then make a table with three attributes: origin (primary key), ID number and type. The ID number is whatever you set, type defines what kind of entity the ID represents, and origin is a new ID that you generate, perhaps by adding a different prefix. Where IDs collide, the origin table will also help you immensely with things other than wall posts.