Foreign key constraints and bridging tables


I am currently working with 6 tables: users, categories, videogames, videogames_categories_bridge, users_favorites, users_dislikes. I am trying to lay out the tables in the best manner possible to show video game preferences for a user (see the example below). However, I am getting a foreign key constraint error when creating the tables. How could I achieve the result below (if possible) with my current table schema? Also, is there a way to avoid both values (favorite and dislike) being marked true for the same game? SQLFIDDLE
Example: show all video game preferences for user_id 569723
game_id  category_id  game_name                      category_name  favorite  dislike
-------  -----------  -----------------------------  -------------  --------  -------
840832   1000         'counter-strike'               fps            1         NULL
779343   1000         'call of duty modern warfare'  fps            1         NULL
684244   2000         'minecraft'                    adventure      NULL      NULL
983565   2000         'assassin\'s creed syndicate'  adventure      NULL      NULL
858168   3000         'need for speed - rivals'      racing         NULL      NULL
819837   4000         'mortal kombat x'              fighting       NULL      NULL
634266   5000         'street fighter v'             fighting       NULL      NULL

You have some problems with your foreign keys and tables in general:
The referenced ("destination") column of a foreign key has to be indexed so InnoDB can quickly check whether the value exists. For instance, user_id in your users table is only the second column of the primary key; it has to be the first column of some index.
In one case (videogames_categories_bridge.category_id) you try to reference the same column in the same table, which does not make sense.
The primary keys in users and categories contain the name AND the id at the same time, so they do not enforce much. Usually the ID alone is the right target for a foreign key. The way you defined it, the same id could exist for multiple different names.
http://sqlfiddle.com/#!9/9e24b - the FKs modified to work
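A minimal sketch of the points above, not the schema from the fiddle: every foreign key points at a single indexed id column, and one preference table takes the place of users_favorites and users_dislikes, so a game cannot be both a favorite and a dislike for the same user. The table user_game_preferences, its preference column, the user_name column and the sample user name are hypothetical, and Python's built-in sqlite3 module stands in for MySQL/InnoDB so the example can run anywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE users      (user_id INTEGER PRIMARY KEY, user_name TEXT NOT NULL);
CREATE TABLE categories (category_id INTEGER PRIMARY KEY, category_name TEXT NOT NULL);
CREATE TABLE videogames (game_id INTEGER PRIMARY KEY, game_name TEXT NOT NULL);
CREATE TABLE videogames_categories_bridge (
    game_id     INTEGER NOT NULL REFERENCES videogames (game_id),
    category_id INTEGER NOT NULL REFERENCES categories (category_id),
    PRIMARY KEY (game_id, category_id)
);
-- Hypothetical replacement for users_favorites and users_dislikes:
-- one row per (user, game), so the two states are mutually exclusive.
CREATE TABLE user_game_preferences (
    user_id    INTEGER NOT NULL REFERENCES users (user_id),
    game_id    INTEGER NOT NULL REFERENCES videogames (game_id),
    preference TEXT NOT NULL CHECK (preference IN ('favorite', 'dislike')),
    PRIMARY KEY (user_id, game_id)
);
INSERT INTO users VALUES (569723, 'example_user');
INSERT INTO categories VALUES (1000, 'fps'), (2000, 'adventure');
INSERT INTO videogames VALUES (840832, 'counter-strike'), (684244, 'minecraft');
INSERT INTO videogames_categories_bridge VALUES (840832, 1000), (684244, 2000);
INSERT INTO user_game_preferences VALUES (569723, 840832, 'favorite');
""")

# All games with the user's preference; games without a preference show NULL.
rows = conn.execute("""
    SELECT g.game_id, c.category_id, g.game_name, c.category_name,
           CASE WHEN p.preference = 'favorite' THEN 1 END AS favorite,
           CASE WHEN p.preference = 'dislike'  THEN 1 END AS dislike
    FROM videogames g
    JOIN videogames_categories_bridge b ON b.game_id = g.game_id
    JOIN categories c ON c.category_id = b.category_id
    LEFT JOIN user_game_preferences p
           ON p.game_id = g.game_id AND p.user_id = ?
    ORDER BY c.category_id, g.game_id
""", (569723,)).fetchall()

# A second preference for the same user and game violates the primary key.
try:
    conn.execute("INSERT INTO user_game_preferences VALUES (569723, 840832, 'dislike')")
    second_preference_rejected = False
except sqlite3.IntegrityError:
    second_preference_rejected = True
```

Keeping two separate tables would also work, but then the "not both" rule has to be enforced by a trigger or by application code rather than by a key.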

Related

Database design: When not to use foreign keys?

I am unsure what is the rule of thumb of when to use foreign keys and when it's better to insert an "unreferenced" value in regard of disk space needed, performance etc.
Let's say I have three tables:
Table 1: itemGroup (used to populate a dropdown menu with items)
ID title
1 Active/Inactive Options
2 Car Brands
3 Ratings
4 Languages
Table 2: item (the itemID will be the actual value in the dropdown and used as a foreign key )
itemID listID title
1000 1 active
1001 1 inactive
1002 2 Porsche
1003 2 Audi
1004 3 1-Star Rating
1005 3 2-Star Rating
1006 4 en
1007 4 de
Table 3: exampleTable
ID car rating active language
So my question is whether I should insert foreign keys in table 3 using the itemID, or whether it would make more sense to use 1/0 for active/inactive and, let's say, 1,2,3,4,5 as an integer for the rating. I guess for the car it's quite self-explanatory that foreign keys are better, but in some cases it's hard to decide: my "item" table can be very big, so the itemID has more digits than the actual "value" it refers to. In a big database this will at some point make a difference in space, and I guess also in performance, because with foreign keys I need to make joins.
UPDATE:
I added the field "language" as maybe here the issue gets illustrated better. So if I'd store a language foreign key (e.g. "1006"):
I need to store over and over a 4-digit int in my exampleTable, instead of just a 2-character varchar
I can't do an easy query like "SELECT * from exampleTable WHERE language='en'"
Why would it be better to use a foreign key here?

MySQL two-column table as primary key

I have an extremely simple idea: a table that keeps user "achievements". And it is as simple as that:
user_id | achievement_id
1 | 1
1 | 2
1 | 5
2 | 2
2 | 3
All I need is the user id and the id of an achievement the user has already earned. All I need to SELECT is SELECT achievement_id WHERE user_id=x. So there is no need for an artificial autoincrement column that I'll never use and whose contents I'll never know. But setting a primary key is required, so the question is: is it a good idea to make such a 2-column table and set both columns as a multi-column primary key? I already have a set of 3-column tables where 2 columns form the primary key, because it is logical... Well, logical for me, but for the database?

These types of tables are common for n-n relationships, multivalued attributes, and weak entities. It depends a lot on your modeling, but yes, it is a good solution in some cases. The primary key is usually the combination of the columns; in your case it would be (user_id, achievement_id).

Yes, since the rule for such a set of n keys is: "I only want one record with this set (a,b) of keys".
-> Therefore you won't be able to add "Mario, achievement1" twice.
The primary key will then be (PlayerID, AchievementID).
If you want to add some information about this achievement (for example, when the player got it), simply use (PlayerID, AchievementID, Date) with (PlayerID, AchievementID) as the primary key.
I hope this helps.
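As a rough illustration of the composite key described in these answers, here is a small sketch using Python's built-in sqlite3 module in place of MySQL. The table name user_achievements and the earned_on column are assumptions made for the example; the sample rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE user_achievements (
        user_id        INTEGER NOT NULL,
        achievement_id INTEGER NOT NULL,
        earned_on      TEXT,  -- hypothetical extra attribute, not part of the key
        PRIMARY KEY (user_id, achievement_id)
    )
""")
conn.executemany(
    "INSERT INTO user_achievements (user_id, achievement_id) VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 5), (2, 2), (2, 3)],
)

# The only lookup the question needs: achievements for one user.
achievements_of_user_1 = [
    row[0]
    for row in conn.execute(
        "SELECT achievement_id FROM user_achievements"
        " WHERE user_id = ? ORDER BY achievement_id",
        (1,),
    )
]

# Granting the same achievement twice violates the composite primary key.
try:
    conn.execute("INSERT INTO user_achievements (user_id, achievement_id) VALUES (1, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Because user_id is the leading column of the key, the lookup by user_id can use the primary key index without any additional index.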

Allow/require only one record with common FK to have "primary" flag

Firstly, I apologise if this is a dupe - I suspect it may be but I can't find it.
Say I have a table of companies:
id | company_name
----+--------------
1 | Someone
2 | Someone else
...and a table of contacts:
id | company_id | contact_name | is_primary
----+------------+--------------+------------
1 | 1 | Tom | 1
2 | 2 | Dick | 1
3 | 1 | Harry | 0
4 | 1 | Bob | 0
Is it possible to set up the contacts table in such a way that it requires that one and only one record has the is_primary flag set for each common company_id?
So if I tried to do:
UPDATE contacts
SET is_primary = 1
WHERE id = 4
...the query would fail, because Tom (id = 1) is already flagged as the primary contact for company_id = 1. Or even better, would it be possible to construct a trigger so that the query would succeed, but Tom's is_primary flag would be cleared by the same operation?
I am not too bothered about checking whether company_id exists in the companies table, my PHP code would already have performed this check before I got to this stage (although if there is a way to do this in the same operation it would be nice, I suppose).
When I initially thought about this I thought "that will be easy, I'll just add a unique index across the company_id and is_primary columns" but obviously that won't work as it would restrict me to one primary and one non-primary contact - any attempt to add a third contact would fail. But I can't help feeling there would be a way to configure a unique index that gives me the minimum functionality I require - to reject an attempt to add a second primary contact, or reject an attempt to leave a company with no primary contact.
I am aware that I could just add a primary_contact field to the companies table with an FK to the contacts table but it feels messy. I don't like the idea of both tables having an FK to the other - it seems to me that the one table should rely on the other, not both tables relying on each other. I guess I just think that over time there is more chance of something going wrong.
To sum up:
How can I restrict the contacts table so that one and only one record with a given company_id has the is_primary flag set?
Anyone have any thoughts on whether two tables having FKs to each other is a good/bad idea?

Circular references between tables are indeed messy. See this (decade-old) article: SQL By Design: The Circular Reference
The cleanest way to make such a constraint is to add another table:
Company_PrimaryContact
----------------------
company_id
contact_id
PRIMARY KEY (company_id)
FOREIGN KEY (company_id, contact_id)
REFERENCES Contact (company_id, id)
This will also require a UNIQUE constraint in table Contact on (company_id, id)
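The outline above can be sketched as follows, assuming Python's built-in sqlite3 module as a stand-in for MySQL/InnoDB. The lower-case table name company_primary_contact and the helper try_change are hypothetical; the sample rows come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE companies (id INTEGER PRIMARY KEY, company_name TEXT NOT NULL);
CREATE TABLE contacts (
    id           INTEGER PRIMARY KEY,
    company_id   INTEGER NOT NULL REFERENCES companies (id),
    contact_name TEXT NOT NULL,
    UNIQUE (company_id, id)  -- target of the composite foreign key below
);
CREATE TABLE company_primary_contact (
    company_id INTEGER PRIMARY KEY,  -- at most one primary contact per company
    contact_id INTEGER NOT NULL,
    FOREIGN KEY (company_id, contact_id) REFERENCES contacts (company_id, id)
);
INSERT INTO companies VALUES (1, 'Someone'), (2, 'Someone else');
INSERT INTO contacts VALUES (1, 1, 'Tom'), (2, 2, 'Dick'), (3, 1, 'Harry'), (4, 1, 'Bob');
INSERT INTO company_primary_contact VALUES (1, 1), (2, 2);
""")

def try_change(sql, params):
    """Run one statement; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

# Company 1 already has Tom, so a second primary contact is rejected.
second_primary_accepted = try_change(
    "INSERT INTO company_primary_contact VALUES (?, ?)", (1, 4))

# Switching the primary contact to Bob is a single-row update.
switched_to_bob = try_change(
    "UPDATE company_primary_contact SET contact_id = ? WHERE company_id = ?", (4, 1))

# Dick belongs to company 2, so the composite FK rejects him for company 1.
wrong_company_accepted = try_change(
    "UPDATE company_primary_contact SET contact_id = ? WHERE company_id = ?", (2, 1))

primary_name = conn.execute("""
    SELECT c.contact_name
    FROM company_primary_contact p
    JOIN contacts c ON c.company_id = p.company_id AND c.id = p.contact_id
    WHERE p.company_id = 1
""").fetchone()[0]
```

Note that this guarantees at most one primary contact per company. A company with no row in company_primary_contact is still possible, so the "at least one" half of the requirement would need a separate check.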

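You could just run a query beforehand that clears the flag:
UPDATE contacts SET is_primary = 0 WHERE company_id = .....
or even set and clear the flag in one statement (written as a self-join, because MySQL rejects a subquery that selects from the table being updated):
UPDATE contacts AS c
JOIN contacts AS target ON target.company_id = c.company_id
SET c.is_primary = IF(c.id = [USERID], 1, 0)
WHERE target.id = [USERID];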
Just putting an alternative out there. Personally I'd probably go with the FK approach instead of this type of workaround, i.e. have a primary_user_id field in the companies table.
EDIT: method without relying on a contacts.is_primary field
Alternative method: first, remove is_primary from contacts. Second, add a "primary_contact_id" INT field to companies. Third, when changing the primary contact, just change that primary_contact_id. This prevents any possibility of there being more than 1 primary contact at any time, and all without the need for triggers etc. in the background.
This option would work fine in any engine, as it's simply updating an INT field. Any reliance on FKs etc. could be added or removed as required, but at its simplest it's just changing the value of an INT field.
This option is viable as long as you need one and precisely one link from companies to contacts flagging a primary.

Multiple foreign keys per record in sql?

I'm creating an application (using PHP / Codeigniter / MYSQL) for tracking volunteers at events. I'd like multiple volunteers to be able to sign on to each event. I plan on doing this using a table called signup which looks something like this:
TABLE SIGNUP
============
VolunteerId EventId
----------- -------
12 223
13 223
15 223
12 235
13 235
19 235
Both columns are foreign keys (to the primary keys of the volunteer table and event table respectively).
Is there a better way to do this?
Should I use a compound-key as the primary key?

Honestly, I don't see a problem with the way you've set it up. Tables like this are commonly used to establish many-to-many relationships between different objects. I'm doing something similar in a table that references counties and cities in a given state. (Some cities span multiple counties.)
Database design best practices state that you should declare a primary key for a table. You don't have to do this; you can technically declare a table without a primary key. However, note that many DB engines will simply create a primary key for you behind the scenes if you don't specifically declare one; this may not be ideal for every situation (and generally isn't). Specifying a primary key of your choice is good for database optimization and organization.
Because of this, I'd say that you might as well use a compound key as the primary key of your many-to-many table instead of creating a separate index column. In this situation, it will satisfy the table requirements (as a db engine will make a primary key for you regardless), and it will prevent multiple occurrences of the same pair, which won't do you any good in a many-to-many reference table.
Short answer: go with the compound primary key, primary key(VolunteerID, EventID). You shouldn't go wrong.

One use for a compound UNIQUE key would be to prevent the same volunteer/event pair from appearing twice in the table. There's no need for a primary key for this.

A good discussion on why compound primary keys should be avoided: What are the down sides of using a composite/compound primary key?

Given the table you've described you have three choices
1 - lunchmeat317
SIGNUP
-------
VolunteerId (PK)
EventId (PK)
2 - Ted Hopp
SIGNUP
-------
VolunteerId (AK1)
EventId (AK1)
3 - ic3b3rg
SIGNUP
-------
SignUpID (PK)
VolunteerId (AK1)
EventId (AK1)
As Thomas pointed out, the main difference between 1 and 2 is that a UNIQUE constraint doesn't stop the following:
VolunteerId EventId
----------- -------
null null
null null
However, if these fields don't allow NULLs to begin with (and they shouldn't), then they're exactly the same.
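The NULL behaviour described here can be checked with a short sketch. It uses Python's built-in sqlite3 module rather than MySQL (both allow repeated NULLs under a UNIQUE constraint), and the table names signup_nullable and signup_not_null are made up for the comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Choice 2 with nullable columns: UNIQUE alone
CREATE TABLE signup_nullable (
    VolunteerId INTEGER,
    EventId     INTEGER,
    UNIQUE (VolunteerId, EventId)
);
-- Choice 2 with NOT NULL columns: behaves like a key
CREATE TABLE signup_not_null (
    VolunteerId INTEGER NOT NULL,
    EventId     INTEGER NOT NULL,
    UNIQUE (VolunteerId, EventId)
);
""")

# NULLs never compare equal, so UNIQUE accepts the same "empty" pair twice.
conn.execute("INSERT INTO signup_nullable VALUES (NULL, NULL)")
conn.execute("INSERT INTO signup_nullable VALUES (NULL, NULL)")
null_row_count = conn.execute("SELECT COUNT(*) FROM signup_nullable").fetchone()[0]

# With NOT NULL on both columns the same insert is rejected outright.
try:
    conn.execute("INSERT INTO signup_not_null VALUES (NULL, NULL)")
    null_pair_rejected = False
except sqlite3.IntegrityError:
    null_pair_rejected = True
```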
You could also add, as ic3b3rg suggests, a surrogate key (SignUpID). But as CJ Date notes (and I'm paraphrasing), introducing an artificial, surrogate, nonvolatile key will often be a good idea, but since it's often difficult to determine volatility, there's no formal way to know when you really need it.
That said, as long as this table is ...
Tracking that volunteers have signed up for events
There won't be any other attributes that have a functional or join dependency to R(VolunteerId, EventID)
... then, in the immortal words of Yogi Berra, "When you come to a fork in the road, take it." Meaning all three choices are valid, and the choice probably won't impact your system one way or another.
Personally this is how I typically do it.
SIGNUP
-------
SignUpID (PK)
VolunteerId (AK1) (Not Null)
EventId (AK1) (Not Null)

Database many-to-many intermediate tables: extra fields

I have created a 'shops' and a 'customers' table and an intermediate table customers_shops. Every shop has a site_url web address, except that some customers use an alternative url to access the shop's site (this url is unique to a particular customer).
In the intermediate table below, I have added an additional field, shop_site_url. My understanding is that this is in 2nd normal form, as the shop_site_url field is unique to a particular customer and shop (and therefore won't be duplicated for different customers/shops). Also, since it depends on customer and shop, I think this is in 3rd normal form. I'm just not used to using the 'mapping' table (customers_shops) to contain additional fields. Does the design below make sense, or should I reserve intermediate tables purely as a way to convert many-to-many relationships to one-to-many?
######
customers
######
id INT(11) NOT NULL PRIMARY KEY
name VARCHAR(80) NOT NULL
######
shops
######
id INT(11) NOT NULL PRIMARY KEY
site_url TEXT
######
customers_shops
######
id INT(11) NOT NULL PRIMARY KEY
customer_id INT(11) NOT NULL
shop_id INT(11) NOT NULL
shop_site_url TEXT //added for a specific url for customer
Thanks

What you are calling an "intermediate" table is not a special type of table. There is only one kind of table and the same design principles ought to be applicable to all.

Well, let's create the table, insert some sample data, and look at the results.
id cust_id shop_id shop_site_url
--
1 1000 2000 NULL
2 1000 2000 http://here-an-url.com
3 1000 2000 http://there-an-url.com
4 1000 2000 http://everywhere-an-url-url.com
5 1001 2000 NULL
6 1001 2000 http://here-an-url.com
7 1001 2000 http://there-an-url.com
8 1001 2000 http://everywhere-an-url-url.com
Hmm. That doesn't look good. Let's ignore the alternative URL for a minute. To create a table that resolves a m:n relationship, you need a constraint on the columns that make up the m:n relationship.
create table customers_shops (
customer_id integer not null references customers (customer_id),
shop_id integer not null references shops (shop_id),
primary key (customer_id, shop_id)
);
(I dropped the "id" column, because it tends to obscure what's going on. You can add it later, if you like.)
Insert some sample data . . . then
select customer_id as cust_id, shop_id
from customers_shops;
cust_id shop_id
--
1000 2000
1001 2000
1000 2001
1001 2001
That's closer. You should have only one row for each combination of customer and shop in this kind of table. (This is useful data even without the url.) Now what do we do about the alternative URLs? That depends on a couple of things.
Do customers access the sites through
only one URL, or might they use more
than one?
If the answer is "only one", then you can add a column to this table for the URL, and make that column unique. It's a candidate key for this table.
If the answer is "more than one--at the very least the site url and the alternative url", then you need to make more decisions about constraints, because altering this table to allow multiple urls for each combination of customer and shop cuts across the grain of this requirement:
the shop_site_url field is unique to a
particular customer and shop
(therefore won't be duplicated for
different customers/shops)
Essentially, I'm asking you to decide what this table means--to define the table's predicate. For example, these two different predicates lead to different table structures.
customer 'n' has visited the web site
for shop 'm' using url 's'
customer 'n' is allowed to visit the
web site for shop 'm' using alternate
url 's'

Your schema does indeed make sense, as shop_site_url is an attribute of the relationship itself. You might want to give it a more meaningful name in order to distinguish it from shops.site_url.
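A minimal sketch of that idea, assuming Python's built-in sqlite3 module instead of MySQL: the URL lives on the relationship row, is unique as suggested earlier for the "only one URL" case, and a query falls back to the shop's own site_url when no alternative is set. The column name alternative_url, the example URLs, and the COALESCE fallback are assumptions for illustration, not part of the original design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE shops     (id INTEGER PRIMARY KEY, site_url TEXT);
CREATE TABLE customers_shops (
    customer_id     INTEGER NOT NULL REFERENCES customers (id),
    shop_id         INTEGER NOT NULL REFERENCES shops (id),
    alternative_url TEXT UNIQUE,  -- hypothetical name for shop_site_url
    PRIMARY KEY (customer_id, shop_id)  -- one row per customer/shop pair
);
INSERT INTO customers VALUES (1000, 'first customer'), (1001, 'second customer');
INSERT INTO shops VALUES (2000, 'http://shop.example');
INSERT INTO customers_shops VALUES
    (1000, 2000, 'http://alt.example'),
    (1001, 2000, NULL);
""")

# Use the customer-specific URL when present, otherwise the shop's own URL.
effective_urls = conn.execute("""
    SELECT cs.customer_id, COALESCE(cs.alternative_url, s.site_url)
    FROM customers_shops cs
    JOIN shops s ON s.id = cs.shop_id
    ORDER BY cs.customer_id
""").fetchall()
```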

Where else would you put this information? It's not an attribute of a shop, and it's not an attribute of a customer. You could put this in a separate table, if you wanted to avoid having a NULLable column, but you'd end up having to have a reference to your intermediate table from this new table, which probably would look even weirder to you.

Relationships can have attributes, just like entities can have attributes.
Entity attributes go into columns in entity tables. Relationship attributes, at least for many-to-many relationships, go in relationship tables.
It sounds as though, in general, URL is determined by the combination of shop and customer. So I would put it in the shop-customer table. The fact that many shops have only one URL suggests that there is a fifth normal form that is more subtle than this. But I'm too lazy to work it out.
