This should be a simple question I think, but is it OK to have NULL foreign keys?
I guess to elaborate, let's say I'm making a database for users and different types of users require different data sets... what would be the best practice to design this database?
This was my thought, as a rough example (am I correct or way off?):
"users":
id | type (i.e. '1' for basic, '2' for advanced) | basic_id (nullable foreign key) | advanced_id (nullable foreign key) | email | name | address | phone (etc.)
"users_basic":
id | user_id (foreign key) | (other data only required for basic users)
"users_advanced":
id | user_id (foreign key) | (other data only required for advanced users)
I get the feeling it's bad design because there's no way to get all the data in one query without first checking what type of user it is, but I really don't like the idea of having ONE table with a ton of NULL data. What is the best way to design this?
Of course it is fine to have NULL foreign keys.
In your case, though, I'd be inclined to do one of two things. If there really aren't very many columns for the basic and advanced users, you can just include them in the users table. This would be the typical approach.
Otherwise, you can declare user_id as the primary key in all three tables, and still have a foreign key relationship from the secondary tables (users_basic and users_advanced) to the primary (users). Enforcing that each user appears in only one of the two secondary tables is tricky in MySQL and probably not worth doing.
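For what it's worth, here is a minimal sketch of the shared-primary-key layout described above. It runs against an in-memory SQLite database through Python's sqlite3 module purely so it can be executed as-is; MySQL DDL would differ in details. The columns basic_note and advanced_note are hypothetical stand-ins for the type-specific data.

```python
import sqlite3

# In-memory SQLite database, used here only so the sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    type    INTEGER NOT NULL,        -- 1 = basic, 2 = advanced
    email   TEXT NOT NULL
);
CREATE TABLE users_basic (
    user_id    INTEGER PRIMARY KEY REFERENCES users(user_id),
    basic_note TEXT                  -- hypothetical basic-only column
);
CREATE TABLE users_advanced (
    user_id       INTEGER PRIMARY KEY REFERENCES users(user_id),
    advanced_note TEXT               -- hypothetical advanced-only column
);
""")
conn.execute("INSERT INTO users VALUES (1, 1, 'a@example.com')")
conn.execute("INSERT INTO users VALUES (2, 2, 'b@example.com')")
conn.execute("INSERT INTO users_basic VALUES (1, 'basic data')")
conn.execute("INSERT INTO users_advanced VALUES (2, 'advanced data')")

# One query returns every user; columns of the other subtype come back NULL.
rows = conn.execute("""
    SELECT u.user_id, u.type, b.basic_note, a.advanced_note
    FROM users u
    LEFT JOIN users_basic b    ON b.user_id = u.user_id
    LEFT JOIN users_advanced a ON a.user_id = u.user_id
    ORDER BY u.user_id
""").fetchall()

# A subtype row without a matching users row is rejected by the foreign key.
try:
    conn.execute("INSERT INTO users_basic VALUES (99, 'orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The two LEFT JOINs address the concern about needing to check the user type before querying: the type column is still there, but a single SELECT can fetch everything.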
Is it possible to have two foreign keys in one table that both reference the same table?
For example, the post table has user_id and username from the user table:
--------------------------------
| table user  | table post     |
|-------------|----------------|
| user_id     | post_id        |
| username    | post_title     |
| password    | post_content   |
| email       | user_id FK     |
|             | username FK    |
--------------------------------
Technically fine.
But keep in mind that the structure listed above could lead to some strange situations.
If a user is allowed to change his own username (whether that is advisable is a separate discussion), you could end up in one of two scenarios: either the user cannot change his name, because doing so would break foreign key integrity with a post that refers to it, or a user could try to hijack another user's posts by changing his name to match.
All these problems can be prevented easily enough, but as a general rule I think it is better to stick to a single foreign key and, generally, to use a number (like user_id) instead of text.
@Alan mentions the idea of two fields in a table referencing the same field of another table; that is quite common and generally OK.
Yes, it is possible to use two foreign keys in the same table.
But in your case you may not need username as a foreign key, because user_id can be used to select the username.
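A rough sketch of the single-foreign-key approach suggested above, again using Python's sqlite3 with an in-memory database only to keep it runnable (the sample rows are made up). The post table stores user_id alone, and the username is looked up with a join, so a rename never touches post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE user (
    user_id  INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE
);
CREATE TABLE post (
    post_id    INTEGER PRIMARY KEY,
    post_title TEXT NOT NULL,
    user_id    INTEGER NOT NULL REFERENCES user(user_id)  -- the only FK
);
""")
conn.execute("INSERT INTO user VALUES (1, 'alice')")
conn.execute("INSERT INTO post VALUES (10, 'Hello', 1)")

# Renaming the user touches only the user table; no post rows change.
conn.execute("UPDATE user SET username = 'alice2' WHERE user_id = 1")

# The current username is always available through the join.
author = conn.execute("""
    SELECT u.username
    FROM post p
    JOIN user u ON u.user_id = p.user_id
    WHERE p.post_id = 10
""").fetchone()[0]
```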
Let's assume I have a very large database with tons of tables in it.
Certain of these tables contain datasets to be connected to each other like
table: album
table: artist
--> connected by table: album_artist
table: company
table: product
--> connected by table: company_product
The tables album_artist and company_product each contain 3 columns: a primary key plus albumID/artistID and companyID/productID, respectively.
Is it a good practice to do something like an "assoc" table which is made up like
---------------------------------------------------------
| id int(11) primary | leftID | assocType | rightID |
|---------------------------------------------------------|
| 1 | 10 | company:product | 4 |
| 2 | 6 | company:product | 5 |
| 3 | 4 | album:artist | 10 |
---------------------------------------------------------
I'm not sure if this is the way to go or if there's anything else than creating multiple connection tables?!
No, it is not a good practice. It is a terrible practice, because referential integrity goes out the window. Referential integrity is the guarantee provided by the RDBMS that a foreign key in one row refers to a valid row in another table. In order for the database to be able to enforce referential integrity, each referring column must refer to one and only one referred column of one and only one referred table.
No, no, a thousand times no. Don't overthink your many-to-many relationships. Just keep them simple. There's nothing to gain and a lot to lose by trying to consolidate all your relationships in a single table.
If you have a many-to-many relationship between, say, guitarist and drummer, then you need a guitarist_drummer table with two columns in it: guitarist_id and drummer_id. That table's primary key should comprise both columns. And you should have another index that's made of the two columns in the opposite order. Don't add a third column with an autoincrementing id to those join tables. That's a waste, and it allows duplicated pairs in those tables, which is generally confusing.
People who took the RDBMS class in school will immediately recognize how these tables work. That's good, because it means you don't have to be the only programmer on this project for the rest of your life.
Pro tip: Use the same column name everywhere. Make your guitarist table contain a primary key called guitarist_id rather than id. It makes your relationship tables easier to understand. And, if you use a reverse engineering tool like SQL Developer, that tool will have an easier time with your schema.
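The join table described above might look roughly like this. It is a sketch run through Python's sqlite3 on an in-memory database for convenience; the index name guitarist_drummer_rev and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE guitarist (guitarist_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE drummer   (drummer_id   INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE guitarist_drummer (
    guitarist_id INTEGER NOT NULL REFERENCES guitarist(guitarist_id),
    drummer_id   INTEGER NOT NULL REFERENCES drummer(drummer_id),
    PRIMARY KEY (guitarist_id, drummer_id)     -- no separate id column
);
-- Second index with the columns reversed, for lookups by drummer.
CREATE INDEX guitarist_drummer_rev
    ON guitarist_drummer (drummer_id, guitarist_id);
""")
conn.execute("INSERT INTO guitarist VALUES (1, 'Gina')")
conn.execute("INSERT INTO drummer VALUES (7, 'Dave')")
conn.execute("INSERT INTO guitarist_drummer VALUES (1, 7)")


def rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# The composite primary key refuses a duplicated pair.
duplicate_rejected = rejected("INSERT INTO guitarist_drummer VALUES (1, 7)")
# The foreign key refuses a pair that points at a missing drummer.
dangling_rejected = rejected("INSERT INTO guitarist_drummer VALUES (1, 99)")
```

Both rejections are exactly what a generic "assoc" table cannot give you, since its leftID and rightID columns cannot be declared as foreign keys to any one table.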
The answer is that it depends on the situation. In your case, and in most others, no, it does not make sense. A link table does make sense for a many-to-many relationship, where the constraints can be enforced by the link table with foreign keys and a unique constraint. Probably the best use case is numerous tables pointing to a single table: each table can have its own link table with indexes on it. This is beneficial if one of the tables is large and you need to fetch the linked records separately.
I'm refactoring a DB structure and have a little problem.
This DB has various tables with the same structure, like:
People -> People_contacts
Activities -> Activities_contacts
Now I want to create only one Contacts table, and use an ENUM() to distinguish the nature of the parent (for search requirements and data reversibility).
The structure will be:
People -> Contacts[People]
Activities -> Contacts[Activities]
But now I need to add a foreign key that, based on the ENUM property, points to one of two different tables...
How can I achieve this? Is there a way, or is it better to keep the old tables?
Why not use a view? If People_contacts and Activities_contacts have exactly the same structure, you can try this:
create view `test` as select *, 'People' as Type from `People_contacts` union select *, 'Activities' as Type from `Activities_contacts`;
and then select what you want from the view:
select * from `test` where Type = 'People' and .....
and the result should look like this:
+----+------+ +--------+
| ID | Data |...| Type |
+----+------+ +--------+
| 1 | foo |...| People |
| 2 | foo |...| People |
+----+------+ +--------+
You cannot have a declared foreign key, pointing to one table or another depending on a field.
You can do a few things, but none of them are really clean.
1. You can have the integer field and the enum, but not declare the field as a foreign key. You will have to implement all the logic yourself, and it will be harder to maintain and harder to decouple the database from the program.
2. You can have two nullable foreign keys (people_id and activity_id) and forget the enum field. If one FK is null, the other holds the real relation. This is better, since you declare the foreign keys as usual and the model is stronger.
3. If you prefer to keep your contact table clean, you can have a relation table where you put this dirty stuff. In this table you store the contact_id and the id of the activity or the person, as explained in option 1 or 2.
But in any case, you are probably overcomplicating this, and you don't need the foreign key in the contact table at all. I would bet you will always access the people or activities table first, so you can probably change those tables and add a contact_id foreign key to each. In the contact table you then just need to add the id primary key, if you don't have it already, and delete the ENUM field and the foreign keys, since you don't really need them.
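As an illustration of the two-nullable-foreign-keys option above, here is a sketch using Python's sqlite3 on an in-memory database. The CHECK constraint that requires exactly one of the two keys to be set is my addition, not part of the answer, and the value column is hypothetical. As far as I know, MySQL only enforces CHECK constraints from 8.0.16 onward; older versions parse and ignore them, so there you would need a trigger or application logic instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE people     (people_id   INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE activities (activity_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE contacts (
    contact_id  INTEGER PRIMARY KEY,
    people_id   INTEGER REFERENCES people(people_id),        -- nullable FK
    activity_id INTEGER REFERENCES activities(activity_id),  -- nullable FK
    value       TEXT NOT NULL,                               -- hypothetical column
    -- Assumption: a contact belongs to exactly one parent.
    CHECK ((people_id IS NOT NULL AND activity_id IS NULL)
        OR (people_id IS NULL AND activity_id IS NOT NULL))
);
""")
conn.execute("INSERT INTO people VALUES (1, 'Ann')")
conn.execute("INSERT INTO activities VALUES (1, 'Chess club')")
conn.execute("INSERT INTO contacts VALUES (1, 1, NULL, 'ann@example.com')")
conn.execute("INSERT INTO contacts VALUES (2, NULL, 1, 'club@example.com')")


def rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


neither_rejected = rejected("INSERT INTO contacts VALUES (3, NULL, NULL, 'x')")
both_rejected = rejected("INSERT INTO contacts VALUES (4, 1, 1, 'x')")

# No ENUM needed: the non-null key tells you which parent a contact has.
people_contacts = [row[0] for row in conn.execute(
    "SELECT value FROM contacts WHERE people_id IS NOT NULL ORDER BY contact_id"
)]
```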
I've created a database with three tables in it:
Restaurant
restaurant_id (autoincrement, PK)
Owner
owner_id (autoincrement, PK)
restaurant_id (FK to Restaurant)
Deal
deal_id (autoincrement)
owner_id (FK to Owner)
restaurant_id (FK to Restaurant)
(PK: deal_id, owner_id, restaurant_id)
There can be many owners for each restaurant. I chose two foreign keys for Deal so I can reference the deal by either the owner or the restaurant. The deal table would have three primary keys, two being foreign keys. And it would have two one-to-many relationships pointing to it. All of my foreign keys are primary keys and I don't know if I'll regret doing it like this later on down the road. Does this design make sense, and seem good for what I'm trying to achieve?
Edit: What I really need to be able to accomplish here is this: when an owner is logged in and viewing their account, I want them to be able to see and edit all the deals that are associated with that particular restaurant. And because there can be more than one owner per restaurant, I need to be able to perform a query something like: select * from deals where restaurant_id = restaurant_id. In other words, if I'm an owner and I'm logged in, I need to be able to make a query that gets all of the deals related not just to me, the owner, but to all of the owners associated with this restaurant.
You're having some trouble with terminology.
A table can only ever have one primary key. It is not possible to create a table with two different primary keys. You can create a table with two different unique indexes (which are much like a primary key) but only one primary key can exist.
What you're asking about is whether you should have a composite or compound primary key; a primary key using more than one column.
Your design is okay, but as written you probably have no need for the column deal_id. It seems to me that restaurant_id and owner_id together are enough to uniquely identify a row in Deal. (This may not be true if one owner can have two different ownership stakes in a single restaurant as the result of recapitalization or buying out another owner, but you don't mention anything like that in your problem statement).
In this case, deal_id is largely wasted storage. There might be an argument to be made for using the deal_id column if you have many tables that have foreign keys pointing to Deal, or if you have instances in which you want to display to the user Deals for multiple restaurants and owners at the same time.
If one of those arguments sways you to adopt the deal_id column, then it, and only it, should be the primary key. There would be nothing added by including the other two columns since the autoincrement value itself would be unique.
If you have a unique field, that should be the PK; here that would be the autoincremented field.
In this specific case, adding more fields to this key gives you nothing at all; it actually somewhat impacts performance (don't ask me how much, you would have to benchmark it).
If you want two foreign keys in the Deal table, one for the restaurant and one for the owner, the logic is that a table could exist in a deal even without an owner, or an owner could exist in a deal even without identifying the table. You could still identify the table, because it is used as a foreign key in the Owner table. But if you are going to put values in every column you defined as a foreign key, I think that is redundant. I'm not sure how you will use the Deal table later on, but from its name I take it to record whether a restaurant table is reserved by a customer. As designed, you can already identify which table they reserved without making the table a foreign key in Deal, because the Owner table already holds it as a foreign key. You just have to be careful when defining relationships between your tables and avoid redundancy as much as possible. :)
I think it is not best.
First of all, the Deal table PK should be the deal_id. There is no reason to add additional columns to it, and if you did want to refer to the deal_id in another table, you'd have to include the restaurant_id and owner_id, which is not good. Whether deal_id should also be the clustered index (a.k.a. index organized on this column) depends on the data access pattern. Will deal_id values most often be used for lookup, or will you primarily be looking deals up by owner_id or restaurant_id?
Also, using two separate FKs the way you have described it (as far as I can tell!) would allow a deal to have an owner and restaurant combination that is not valid (combining an owner that does not belong to that restaurant). In the Deal table, instead of one FK to Owner and one FK to Restaurant, if you must have both columns, there should be a composite FK to only the Owner table on (OwnerID, RestaurantID) with a corresponding unique key in the Owner table to allow this link up.
However, with such a simple table structure I don't really see the problem in leaving RestaurantID out of the Deal table, since the OwnerID always fully implies the RestaurantID. Obviously your deals cannot be linked only with the restaurant, because that would imply a 1:M relationship on Deal:Owner. The cost of searching based on Restaurant through the Owner table shouldn't really be that bad.
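To make the composite foreign key idea above concrete, here is a sketch using Python's sqlite3 on an in-memory database. The title column and the sample rows are made up; deal_id is the sole primary key, and the pair (owner_id, restaurant_id) in deal must match an existing row in owner.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE restaurant (restaurant_id INTEGER PRIMARY KEY);
CREATE TABLE owner (
    owner_id      INTEGER PRIMARY KEY,
    restaurant_id INTEGER NOT NULL REFERENCES restaurant(restaurant_id),
    UNIQUE (owner_id, restaurant_id)     -- target of the composite FK
);
CREATE TABLE deal (
    deal_id       INTEGER PRIMARY KEY,   -- the only primary key
    owner_id      INTEGER NOT NULL,
    restaurant_id INTEGER NOT NULL,
    title         TEXT NOT NULL,         -- hypothetical column
    FOREIGN KEY (owner_id, restaurant_id)
        REFERENCES owner (owner_id, restaurant_id)
);
""")
conn.executemany("INSERT INTO restaurant VALUES (?)", [(1,), (2,)])
# Owners 1 and 2 belong to restaurant 1; owner 3 belongs to restaurant 2.
conn.executemany("INSERT INTO owner VALUES (?, ?)", [(1, 1), (2, 1), (3, 2)])
conn.executemany(
    "INSERT INTO deal VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 'Lunch special'), (2, 2, 1, 'Happy hour'), (3, 3, 2, 'Brunch')],
)

# Owner 3 does not belong to restaurant 1, so this pair is refused.
try:
    conn.execute("INSERT INTO deal VALUES (4, 3, 1, 'Mismatch')")
    mismatch_rejected = False
except sqlite3.IntegrityError:
    mismatch_rejected = True

# All deals for restaurant 1, regardless of which owner created them.
direct = [row[0] for row in conn.execute(
    "SELECT deal_id FROM deal WHERE restaurant_id = ? ORDER BY deal_id", (1,)
)]

# Same result by going through owner, which is what you would do if
# restaurant_id were left out of deal altogether.
via_owner = [row[0] for row in conn.execute("""
    SELECT d.deal_id
    FROM deal d
    JOIN owner o ON o.owner_id = d.owner_id
    WHERE o.restaurant_id = ?
    ORDER BY d.deal_id
""", (1,))]
```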
It's not wrong; it works. But it's not recommended.
Autoincrement primary keys work without foreign keys (or master keys).
In some databases, you cannot use several fields as a single primary key.
Compound (composite) primary keys are more difficult to handle in a query.
Compound Primary Key Query Example:
SELECT
D.*
FROM
Restaurant AS R,
Owner AS O,
Deal AS D
WHERE
(1=1) AND
(D.RestaurantKey = R.RestaurantKey) AND
(D.OwnerKey = O.OwnerKey)
Versus
Single Primary Key Query Example:
SELECT
D.*
FROM
Owner AS O,
Deal AS D
WHERE
(D.OwnerKey = O.OwnerKey)
Sometimes you have to change the foreign key value of a record so that it points to a different record. For example, your customers have already ordered, the deal record is registered, and they decide to change from one restaurant table to another. The data must then be updated in the "Owner" and "Deal" tables.
+-----------+-------------+
| OwnerKey | OwnerName |
+-----------+-------------+
| 1 | Anne Smith |
+-----------+-------------+
| 2 | John Connor |
+-----------+-------------+
| 3 | Mike Doe |
+-----------+-------------+
+-----------+-------------+-------------+
| OwnerKey | DealKey | Food |
+-----------+-------------+-------------+
| 1 | 1 | Hamburger |
+-----------+-------------+-------------+
| 2 | 2 | Hot-Dog |
+-----------+-------------+-------------+
| 3 | 3 | Hamburger |
+-----------+-------------+-------------+
| 1 | 3 | Soda |
+-----------+-------------+-------------+
| 2 | 1 | Apple Pie |
+-----------+-------------+-------------+
| 3 | 3 | Chips |
+-----------+-------------+-------------+
If you use compound primary keys, you have to create a new record for "Owner" and new records for "Deals", copy the other fields, and delete the previous records.
If you use single keys, you just have to change the foreign key in the table, without inserting or deleting records.
Cheers.
I've seen a lot of discussion regarding this, and I'm just looking for your suggestions. Basically, what I'm using is PHP and MySQL. I have a users table which goes:
users
------------------------------
uid(pk) | username | password
------------------------------
12 | user1 | hashedpw
------------------------------
and another table which stores updates by the user
updates
--------------------------------------------
uid | date | content
--------------------------------------------
12 | 2011-11-17 08:21:01 | updated profile
12 | 2011-11-17 11:42:01 | created group
--------------------------------------------
The user's profile page will show the 5 most recent updates of a user. The questions are:
1. For the updates table, would it be possible to set both uid and date as a composite primary key, with uid referencing uid from users?
2. OR would it be better to just create another column in updates which auto-increments and will be used as the primary key (while uid will be an FK to uid in users)?
Your idea (under 1.) rests on the assumption that a user can never do two "updates" within one second. That is very poor design. You never know what functions you will implement in the future, but chances are that some day 1 click leads to 2 actions and therefore 2 lines in this table.
I say "updates" quoted because I see this more as a logging table. And who knows what you may want to log somewhere in the future.
As for unusual primary keys: don't do it, it almost always comes right back in your face and you have to do a lot of work to add a proper autoincremented key afterwards.
It depends on the requirement but a third possibility is that you could make the key (uid, date, content). You could still add a surrogate key as well but in that case you would presumably want to implement both keys - a composite and a surrogate - not just one. Don't make the mistake of thinking you have to make an either/or choice.
Whether it is useful to add the surrogate or not depends on how it's being used - don't add a surrogate unless or until you need it. In any case uid I would assume to be a foreign key referencing the users table.
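For illustration, a sketch of the surrogate-key variant discussed above, using Python's sqlite3 on an in-memory database so it runs as-is. The update_id column name, the updates_uid_date index, and the extra sample rows are my own assumptions; in MySQL the column would use AUTO_INCREMENT instead. Two rows deliberately share the same second, which a (uid, date) primary key would refuse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE users (
    uid      INTEGER PRIMARY KEY,
    username TEXT NOT NULL
);
CREATE TABLE updates (
    update_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
    uid       INTEGER NOT NULL REFERENCES users(uid),
    date      TEXT NOT NULL,
    content   TEXT NOT NULL
);
-- Supports the "latest updates for one user" lookup.
CREATE INDEX updates_uid_date ON updates (uid, date);
""")
conn.execute("INSERT INTO users VALUES (12, 'user1')")
conn.executemany(
    "INSERT INTO updates (uid, date, content) VALUES (12, ?, ?)",
    [
        ("2011-11-17 08:21:01", "updated profile"),
        ("2011-11-17 11:42:01", "created group"),
        ("2011-11-17 11:42:01", "joined group"),  # same second as the row above
        ("2011-11-18 09:00:00", "posted photo"),
        ("2011-11-18 10:00:00", "changed avatar"),
        ("2011-11-19 12:00:00", "left group"),
    ],
)

# The 5 most recent updates; update_id breaks ties within the same second.
recent = [row[0] for row in conn.execute("""
    SELECT content
    FROM updates
    WHERE uid = ?
    ORDER BY date DESC, update_id DESC
    LIMIT 5
""", (12,))]
```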