I'd be grateful if someone could advise me what the correct way is to approach this:
I have a parent table, with primary key master_id.
I have five child tables associated with the master_id, in a 1-to-n relationship, that record semantically different types of data (and wouldn't lend themselves to abstraction).
The only common fields in each child table are master_id (foreign key), created_by (user_id), and created_time (timestamp).
The aim is to publish each child row associated with a master_id in chronological order on a webpage (imagine a forum-posts-style display), with the PHP building each "post" (i.e. row) slightly differently depending on the child table (and hence the fields of data available to it).
Am I right in thinking there's no easy way to query this regardless of the table structure (and that ordering would be best done in PHP)? Is there any advantage to vertically partitioning the 3 common fields out into a single table, combined with a field indicating which child table the row came from?
This is a really difficult problem to solve, one I've worked on myself. Unfortunately, I don't have a good answer. You could use UNIONs and various types of JOINs, but you're potentially going to hurt your database's performance badly. Plus, you run into issues like, say, wanting the 30 most recent chronological entries. You'd have to query every child table for 30 entries (possibly 150 entries in all) and sort through them in your code, since you don't know which child table(s) hold the most recent 30. 150 rows just to pull 30, blech.
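As a rough sketch of that UNION approach (the child table names and the idea of tagging each row with its source are my assumptions, not the asker's schema), the query would look something like this:

SELECT *
FROM (
    (SELECT master_id, created_by, created_time, 'child1' AS source_table
       FROM child1 ORDER BY created_time DESC LIMIT 30)
    UNION ALL
    (SELECT master_id, created_by, created_time, 'child2' AS source_table
       FROM child2 ORDER BY created_time DESC LIMIT 30)
    -- ...repeat for the remaining child tables...
) AS recent
ORDER BY created_time DESC
LIMIT 30;

PHP can then use source_table to decide which child table to fetch the full row from, but you're still reading up to 150 rows to display 30.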
Honestly, the best way I've found to implement such a thing is to have a table, possibly your master table, have columns dedicated specifically to what you want to show from your child table. For example, have columns in your master table that are something like, master_id, created_by, created_time, and notification_text (if you're trying to implement something like Facebook's timeline, for example). You could set up triggers on the child tables so that when you insert data into one of them, it automatically populates the data in the master table as well. Then when you're displaying the timeline, you just query the master table without bothering with the child tables at all.
Your schema would look something like:
Table master
- master_id
- created_by
- created_time
- notification_text
Table child1
- id
- master_id
- data1 (used to generate notification_text in master)
Table child2
- id
- master_id
- data21 (collectively used to generate notification_text in master)
- data22 (collectively used to generate notification_text in master)
- data23 (collectively used to generate notification_text in master)
...
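A minimal sketch of the trigger idea, assuming MySQL and the column names from the schema above (how notification_text is actually built from data1 is up to you):

CREATE TRIGGER child1_after_insert
AFTER INSERT ON child1
FOR EACH ROW
UPDATE master
   SET notification_text = NEW.data1   -- derive the display text however suits your app
 WHERE master_id = NEW.master_id;

You'd create a similar trigger per child table, each composing notification_text from that table's own columns.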
What about using a star model? It's commonly used to build data warehouses, but you could try something similar in your solution. If you are not familiar with the star model, you can see more here.
Hope it helps.
Scenario: Multiple Types to a single type; one to many.
So for example:
parent multiple type: students table, suppliers table, customers table, hotels table
child single type: banking details
So a student may have multiple banking details, as can a supplier, etc etc.
Layout Option 1 students table (id) + students_banking_details (student_id) table with the appropriate id relationship, repeat per parent type.
Layout Option 2 students table (+others) + banking_details table. banking_details would have a parent_id column for linking and a parent_type field for determining what the parent is (student / supplier / customers etc).
Layout Option 3 students table (+others) + banking_details table. Then I would create another association table per parent type (eg: students_banking_details) for the linking of student_id and banking_details_id.
Layout Option 4 students table (+others) + banking_details table. banking_details would have a column for each parent type, ie: student_id, supplier_id, customers_id - etc.
Other? Your input...
My thoughts on each of these:
Option 1: Multiple tables of the same type of information seems wrong. If I want to change what gets stored about banking details, that's also several tables I have to change as opposed to one.
Option 2: Seems like the most viable option. Apparently this doesn't maintain 'referential integrity', though. I don't know how important that is to me if I'm just going to be cleaning up children programmatically when I delete the parents?
Option 3: Same as (2) except with an extra table per type, so my logic tells me this would be slower than (2), with more tables and the same outcome.
Option 4: Seems dirty to me, with a bunch of null fields in the banking_details table.
Before going any further: if you do decide on a design for storing banking details which lacks referential integrity, please tell me who's going to be running it so I can never, ever do business with them. It's that important. Constraints in your application logic may be followed, but things happen: exceptions, interruptions, inconsistencies that are later reflected in the data because there aren't meaningful safeguards. Constraints in your schema design must be followed. Much safer, and banking data is something to be as safe as possible with.
You're correct in identifying #1 as suboptimal; an account is an account, no matter who owns it. #2 is out because referential integrity is non-negotiable. #3 is, strictly speaking, the most viable approach, although if you know you're never going to need to worry about expanding the number of entities who might have banking details, you could get away with #4 and a CHECK constraint to ensure that each row only has a value for one of the four foreign keys -- but you're using MySQL, which ignores CHECK constraints, so go with #3.
Index your foreign keys and performance will be fine. Views are nice to avoid boilerplate JOINs if you have a need to do that.
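A minimal sketch of layout option 3 for one parent type (the column names are assumptions, not the asker's):

CREATE TABLE students (
    student_id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    name       VARCHAR(100) NOT NULL
);

CREATE TABLE banking_details (
    banking_details_id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    account_number     VARCHAR(34) NOT NULL
);

CREATE TABLE students_banking_details (
    student_id         INT UNSIGNED NOT NULL,
    banking_details_id INT UNSIGNED NOT NULL,
    PRIMARY KEY (student_id, banking_details_id),
    FOREIGN KEY (student_id)         REFERENCES students (student_id),
    FOREIGN KEY (banking_details_id) REFERENCES banking_details (banking_details_id)
);

Repeat the association table for suppliers, customers and hotels; the foreign keys are what give you the referential integrity discussed above.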
When someone registers on my site they get entries in 3 different tables.
However, for reasons unknown to me, the IDs in some tables end up higher than in others.
So someone will register and get these IDs:
Table 1 - ID 12
Table 2 - ID 15
Table 3 - ID 13
I rely on these IDs being the same for various table joins, so for every user that signs up I have to manually go in and change the IDs!
I'm not sure what to do, it's really tedious. Should I just wipe the databases and restart?
It sounds like all three of your tables are set to auto-increment, and you are trying to imply a foreign-key relationship between them using their primary keys. This can work in the short-term, but if records are manually inserted, or if you have a scenario where not all three tables have data inserted, it will throw things off. You can reset the auto increment value to whatever you'd like, but this is only a temporary fix.
If this is the case, you should identify which of these is really the "master" table in relation to the others. Then ask yourself, is it really necessary to split this data into three tables, when in fact it all relates 1:1? And finally, if it is necessary to do so, then a best-practice you should consider would be to declare separate fields in the child tables and define these as having explicit foreign-key relationships to the master table.
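A minimal sketch of that last suggestion, with assumed table and column names: only the master table auto-increments, and the other tables carry an explicit foreign key to it instead of relying on three auto-increment counters staying in step.

CREATE TABLE users (
    user_id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    email   VARCHAR(255) NOT NULL
);

CREATE TABLE profiles (
    user_id INT UNSIGNED PRIMARY KEY,
    bio     TEXT,
    FOREIGN KEY (user_id) REFERENCES users (user_id)
);

CREATE TABLE settings (
    user_id INT UNSIGNED PRIMARY KEY,
    theme   VARCHAR(50),
    FOREIGN KEY (user_id) REFERENCES users (user_id)
);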
I've only just registered so I can't comment, but it would be helpful if you could show the "create table" SQL statements.
If you know what you are doing and really want it to work this way, then you can set the auto-increment value to a specific number, like this: ALTER TABLE Persons AUTO_INCREMENT=100.
If you don't really know what you are doing, you had better read more about using databases, primary keys, relationship tables and more.
I would guess this is a semi-common question but I can't find it in the list of past questions. I have a set of tables for products which need to share a primary key index. Assume something like the following:
product1_table:
id,
name,
category,
...other fields
product2_table:
id,
name,
category,
...other fields
product_to_category_table:
product_id,
category_id
Clearly it would be useful to have a shared index between the two product tables. Note, the idea of keeping them separate is because they have largely different sets of fields beyond the basics; however, they share a common categorization.
UPDATE:
A lot of people have suggested table inheritance (or gen-spec). This is an option I'm aware of, but given that in other database systems I could share a sequence between tables, I was hoping MySQL had a similar solution. I shall assume it doesn't, based on the responses. I guess I'll have to go with table inheritance... Thank you all.
It's not really common, no. There is no native way to share a primary key. What I might do in your situation is this:
product_table
id
name
category
general_fields...
product_type1_table:
id
product_id
product_type1_fields...
product_type2_table:
id
product_id
product_type2_fields...
product_to_category_table:
product_id
category_id
That is, there is one master product table that has entries for all products and has the fields that generalize between the types, and type-specific tables with foreign keys into the master product table, which have the type-specific data.
A better design is to put the common columns in one products table, and the special columns in two separate tables. Use the product_id as the primary key in all three tables, but in the two special tables it is, in addition, a foreign key back to the main products table.
This simplifies the basic product search for ids and names by category.
Note, also that your design allows each product to be in one category at most.
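A minimal sketch of that shared-key layout (the field names beyond the basics are assumptions): product_id is the primary key of products, and in each specialty table it is both the primary key and a foreign key back to products.

CREATE TABLE products (
    product_id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    name       VARCHAR(100) NOT NULL
);

CREATE TABLE product1_details (
    product_id INT UNSIGNED PRIMARY KEY,
    extra_a    VARCHAR(100),
    FOREIGN KEY (product_id) REFERENCES products (product_id)
);

CREATE TABLE product2_details (
    product_id INT UNSIGNED PRIMARY KEY,
    extra_b    VARCHAR(100),
    FOREIGN KEY (product_id) REFERENCES products (product_id)
);

product_to_category_table can keep pointing at products.product_id, regardless of which specialty table a product belongs to.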
It seems you are looking for table inheritance.
You could use a common table product with attributes common to both product1 and product2, plus a type attribute which could be either "product2" or "product1"
Then tables product1 and product2 would have all their specific attributes and a reference to the parent table product.
product:
id,
name,
category,
type
product1_table:
id,
#product_id,
product1_specific_fields
product2_table:
id,
#product_id,
product2_specific_fields
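Querying this layout is a join filtered on the type column; product1_specific_fields below stands in for whatever the real columns are:

SELECT p.id, p.name, p.category, c.product1_specific_fields
  FROM product AS p
  JOIN product1_table AS c ON c.product_id = p.id
 WHERE p.type = 'product1';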
First let me state that I agree with everything that Chaos, Larry and Phil have said.
But if you insist on another way...
There are two reasons for your shared PK: one, uniqueness across the two tables, and two, to complete referential integrity.
I'm not sure exactly what "sequence" features auto_increment columns support. It seems like there is a system setting to define the increment-by value, but nothing per column.
What I would do in Oracle is just share the same sequence between the two tables. Another technique would be to set a STEP value of 2 in the auto_increment and start one at 1 and the other at 2. Either way, you're generating unique values between them.
You could create a third table that has nothing but the PK column. This column could also provide the autonumbering if there's no way of creating a skipping autonumber within one server. Then on each of your data tables you'd add CRUD triggers. An insert into either data table would first initiate an insert into the pseudo index table (and return the ID for use in the local table). Likewise a delete from the local table would initiate a delete from the pseudo index table. Any child tables which need to point to a parent point to this pseudo index table.
Note this will need to be a per-row trigger and will slow down CRUD on these tables. But tables like "product" tend NOT to have a very high rate of DML in the first place. Anyone who complains about the "performance impact" is not considering scale.
Please note, this is provided as a functioning alternative and not my recommendation as the best way.
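A sketch of that pseudo index table in MySQL (the table and trigger names are assumptions): the shared table hands out the numbers, and a BEFORE INSERT trigger on each data table grabs the next one.

CREATE TABLE shared_ids (
    id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY
);

DELIMITER //
CREATE TRIGGER product1_before_insert
BEFORE INSERT ON product1_table
FOR EACH ROW
BEGIN
    INSERT INTO shared_ids () VALUES ();   -- burn a new id in the shared table
    SET NEW.id = LAST_INSERT_ID();         -- use it as this row's primary key
END//
DELIMITER ;

You would repeat the trigger for product2_table, and add matching DELETE triggers if you want the shared table cleaned up as well.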
You can't "share" a primary key.
Without knowing all the details, my best advice is to combine the tables into a single product table. Having optional fields that are populated for some products and not others is not necessarily a bad design.
Another option is to have a sort of inheritance model, where you have a single product table, and then two product "subtype" tables, which reference the main product table and have their own specialized set of fields. Querying this model is more painful than a single table IMHO, which is why I see it as the less-desirable option.
Your explanation is a little vague, but from my basic understanding I would be tempted to do this:
The product table contains common fields
product
-------
product_id
name
...
the product_extra1 table and the product_extra2 table contain different fields
These tables have a one-to-one relationship enforced between product.product_id and
product_extra1.product_id, etc. Enforce the one-to-one relationship by setting the product_id in the foreign key tables (product_extra1, etc.) to be unique, using a unique constraint.
You will need to decide on the business rules as to how this data is populated.
product_extra1
---------------
product_id
extra_field1
extra_field2
....
product_extra2
---------------
product_id
different_extra_field1
different_extra_field2
....
Based on what you have above, the product_to_category table is an intersecting table (1 to many / many to 1), which would imply that each product can be related to many categories.
This can now stay the same.
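A sketch of the unique-constraint enforcement described above, assuming the product and extra-field tables sketched in this answer:

CREATE TABLE product_extra1 (
    product_id   INT UNSIGNED NOT NULL UNIQUE,
    extra_field1 VARCHAR(100),
    extra_field2 VARCHAR(100),
    FOREIGN KEY (product_id) REFERENCES product (product_id)
);

The UNIQUE on product_id means each product can have at most one product_extra1 row, which is the one-to-one rule.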
This is yet another case of gen-spec.
See previous discussion
I have a database with multiple tables.
These tables are all related to a single name, for example:
Table 1 contains the name of the person, joined date, position, salary, etc.
Table 2 contains the name of the person, current projects, finished, assigned, etc.
Table 3 contains the name of the person, time sheets, in, out, etc.
Table 4 contains the name of the person, personal details, skill set, previous experience, etc.
All tables contain more than 50,000 names and their details.
So my question is: all tables contain information related to a name, say Jose20856, and this name is the unique index of all 4 tables. When I search for Jose20856, all four tables will give results and output to a front-end software/HTML.
So do I need to keep multiple tables, or combine them into a single table?
If so:
CASE 1
Single table -> What are the advantages? Will results be faster? What about system resource usage?
CASE 2
Multiple tables -> What are the advantages? Will results be faster? What about system resource usage?
As I am new to MySQL, I would like your valuable opinion on how to move ahead.
You can combine these into a single table, but only if it makes sense. It's hard to tell if the relationships in your tables are one-to-one or one-to-many, but they seem to be one-to-many; e.g. a single employee from table 1 should be able to have multiple projects, skills, and time sheets in the other tables. These are all one-to-many relationships.
So, keep the multiple table design. You also should consider using an integer-based primary key for the employee rather than the name. Use this pkey as the fkey in your other tables and you'll see performance improvement. (Also consider the amount of work you need to do if and when you want to change the name. You have to change all the names in all the tables. If you use a surrogate key, the int pkey, as suggested above, you only have to update a single row.)
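A short sketch of the surrogate-key suggestion (table and column names assumed): the name lives in one place, and the other tables reference an integer employee_id.

CREATE TABLE employees (
    employee_id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    name        VARCHAR(100) NOT NULL
);

CREATE TABLE time_sheets (
    time_sheet_id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    employee_id   INT UNSIGNED NOT NULL,
    clock_in      DATETIME,
    clock_out     DATETIME,
    FOREIGN KEY (employee_id) REFERENCES employees (employee_id)
);

Renaming the person then touches a single row in employees, and the integer foreign key joins faster than repeating the name in every table.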
Read on the web about database normalization.
E.g. http://en.wikipedia.org/wiki/Database_normalization
I think you can even add more tables to it. It all depends on the data and the relations.
Table1 = users incl. userdata
Table2 = Projects (if multiple users work on the same project)
Table3 = Linking user to projects (if multiple users work on the same project)
Table4 = Time spent? Contains the links to the user and to the project.
I think your table 4 can be merged into table 1, because it also contains data specific to one user.
There is probably more you can do, but as already stated, it all depends on the data and the relations.
What we're talking about here is vertical table partitioning (as opposed to horizontal table partitioning). It is a valid database design pattern, which can be useful in these cases:
There are too many columns to fit into one table. That's pretty obvious.
There are columns which are accessed relatively often, and some that are accessed relatively rarely. For example, if you very often need to display the columns joined date, position, salary and very rarely the columns personal details, skill set, previous experience, then it makes sense to move the latter to a separate table, as it will (probably) improve performance when accessing the most commonly used ones. In MySQL this is especially true for TEXT and BLOB columns, since they're stored apart from the rest of the fields, so accessing them takes more time.
There are NULLable columns, where the majority of rows are NULL. Once again, if a column is mostly null, moving it to a separate table will let you reduce the size of your 'main' table and improve performance. The new table should not allow null values and should have entries only for rows where a value is set. This way you reduce the amount of storage/memory resources as well.
MySQL specific - you might want to move some of your columns from an InnoDB table to MyISAM so that you can use full-text indexing, while still being able to use some of the features InnoDB provides. It's not a good design generally speaking, though - it's better to use a full-text search engine like Sphinx.
Last but not least. I'd suggest using a numeric field as a key joining all these tables, not a string.
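A sketch of such a vertical split (it reuses the assumed employees table from the earlier sketch): the frequently read columns stay in employees, and the rarely read, often-NULL ones move to a side table that only has rows where the data exists.

CREATE TABLE employee_profile (
    employee_id         INT UNSIGNED PRIMARY KEY,
    skill_set           TEXT NOT NULL,
    previous_experience TEXT NOT NULL,
    FOREIGN KEY (employee_id) REFERENCES employees (employee_id)
);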
Additional reading about MySQL partitioning (a bit outdated, since MySQL 5.5 added some new features)
So imagine you have multiple tables in your database, each with its own structure and each with a PRIMARY KEY of its own.
Now you want to have a Favorites table so that users can add items as favorites. Since there are multiple tables, the first thing that comes to mind is to create one Favorites table per table:
Say you have a table called Posts with PRIMARY KEY (post_id) and you create a Post_Favorites with PRIMARY KEY (user_id, post_id)
This would probably be the simplest solution, but could it be possible to have one Favorites table joining across multiple tables?
I've thought of the following as a possible solution:
Create a new table called Master with primary key (master_id). Add triggers on all tables in your database, on insert, to generate a new master_id and write it along with the row in your table. Let's also consider that we record in the Master table where each master_id has been used (i.e. in which table).
Now you can have one Favorites table with PRIMARY KEY (user_id, master_id)
You can select from the Favorites table and join with each individual table on the master_id and get the favorites per table. But would it be possible to get all the favorites with one query (maybe not a query, but a stored procedure)?
Do you think that this is a stupid approach? Since you will perform one query per table what are you gaining by having a single table?
What are your thoughts on the matter?
One way would be to sub-type all possible tables to a generic super-type (Entity) and then link user preferences to that super-type.
I think you're on the right track, but a table-based inheritance approach would be great here:
Create a table master_ids, with just one column: an int-identity primary key field called master_id.
On your other tables, (users as an example), change the user_id column from being an int-identity primary key to being just an int primary key. Next, make user_id a foreign key to master_ids.master_id.
This largely preserves data integrity. The only place you can trip up is if you have a master_id = 1, with a user_id = 1 and a post_id = 1. For a given master_id, you should have only one entry across all tables; in this scenario you have no way of knowing whether master_id 1 refers to the user or to the post. A way to make sure this doesn't happen is to add a second column to the master_ids table, a type_id column. Type_id 1 can refer to users, type_id 2 can refer to posts, etc. Then you are pretty much good.
Code "gymnastics" may be a bit necessary for inserts. If you're using a good ORM, it shouldn't be a problem. If not, stored procs for inserts are the way to go. But you're having your cake and eating it too.
I'm not sure I really understand the alternative you propose.
But in general, when given the choice of 1) "more tables" or 2) "a mega-table supported by a bunch of fancy code work"... your interests are best served by more tables without the code gymnastics.
A red flag was "Add triggers on all tables in your database"; each trigger fire is a performance hit of its own.
The database designers have built in all kinds of technology to optimize tables/indexes, much of it behind the scenes without you knowing it. Just sit back and enjoy the ride.
Try these for inspiration: Database Answers (no affiliation to me).
An alternative to your approach might be to have the favorites table as user_id, object_id, object_type. When inserting into the favorites table, just insert the type of the favorite. However, I don't see a simple query being able to work with your approach or mine. One way to go about it might be to use UNION and get one combined resultset, and then identify what type of record it is based on the type. Another thing you can do is turn the UNION query into a MySQL VIEW and simply query that VIEW.
The benefit of using a single table for favorites is simplicity, which some might consider to be against the database normalization rules. But on the upside, you don't have to create so many favorites tables, and you can add anything to favorites easily by just coming up with a new object_type identifier.
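A sketch of that single favorites table plus a UNION view; posts and photos here are hypothetical favoritable tables standing in for whatever you actually have:

CREATE TABLE favorites (
    user_id     INT UNSIGNED NOT NULL,
    object_id   INT UNSIGNED NOT NULL,
    object_type VARCHAR(20)  NOT NULL,    -- e.g. 'post', 'photo'
    PRIMARY KEY (user_id, object_id, object_type)
);

CREATE VIEW favorite_items AS
    SELECT f.user_id, f.object_id, f.object_type, p.title AS label
      FROM favorites AS f
      JOIN posts     AS p ON f.object_type = 'post' AND p.post_id = f.object_id
    UNION ALL
    SELECT f.user_id, f.object_id, f.object_type, ph.caption AS label
      FROM favorites AS f
      JOIN photos    AS ph ON f.object_type = 'photo' AND ph.photo_id = f.object_id;

Querying favorite_items for a user then returns all their favorites in one resultset, with object_type telling you which table each row came from.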
It sounds like you have an is-a type relationship that needs to be modeled. All of the items that can be favourited are a type of "item". It sounds like you are on the right track, but I wouldn't use triggers. What could be the right answer, if I have understood correctly, is to pull all the common fields into a single table called items (master is a poor name; master of what?). This should include all the common data that would be needed when you fetch a user's favourite items; I'd expect fields like item_id (primary key), item_type and human_readable_name, and maybe some metadata about when the item was created, modified, etc. Each of your specific item types would have its own table containing data specific to that item type, with an item_id field that has a foreign key relationship to the items table. Then you'd wrap each item type in its own insertion, update and selection SPs (i.e. InsertItemCheese, UpdateItemMonkey, SelectItemCarKeys). The favourites table would then work as you describe, but you only need to select from the items table. If your app needs the specific data for each item type, it would have to be queried for each item (caching is your friend here).
If MySQL supports SPs with multiple result sets you could write one that outputs all the items as a result set, then a result set for each item type if you need all the specific item data in one go. For most cases I would not expect you to need all the data all the time.
Keep in mind that not EVERY use of a PK column needs a constraint. For example a logging table. Even though a logging table has a copy of the PK column from the table being logged, you can't build a constraint.
What would be the worst possible case. You insert a record for Oprah's TV show into the favorites table and then next year you delete the Oprah Show from the list of TV shows but don't delete that ID from the Favorites table? Will that break anything? Probably not. When you join favorites to TV shows that record will fall out of the result set.
There are a couple of ways to share values for PKs. Oracle has the advantage of sequences. If you don't have those, you can add a "step" to your autonumber fields. There's always a risk, though.
Say you think you'll never have more than 10 tables of "things which could be favored". Then start your PKs at 0 for the first table, incrementing by 10, at 1 for the second table, incrementing by 10, at 2 for the third... and so on. That will guarantee that all the values will be unique across those 10 tables. The risk is that a future requirement will add table 11. You can always 'pad' your guesstimate.
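In MySQL the step and offset are session/server variables rather than per-table settings, so a hedged sketch of this idea looks like setting them just before inserting into a given table (tv_shows and movies here are hypothetical):

CREATE TABLE tv_shows (id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY, title VARCHAR(100));
CREATE TABLE movies   (id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY, title VARCHAR(100));

SET SESSION auto_increment_increment = 10;  -- step shared by all ids generated in this session
SET SESSION auto_increment_offset    = 1;   -- tv_shows gets 1, 11, 21, ...
INSERT INTO tv_shows (title) VALUES ('Example Show');

SET SESSION auto_increment_offset    = 2;   -- movies gets 2, 12, 22, ...
INSERT INTO movies (title) VALUES ('Example Movie');

Every connection has to play by the same rules, which is part of the risk mentioned above.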