This question already has answers here:
Should each and every table have a primary key?
(16 answers)
Closed 5 years ago.
I am studying Computer Science, and recently in the exam, I had the following question:
“Write the SQL query creating this ER model.”
The model has three (3) tables: person, library, and the third table contains two foreign keys linked respectively on person's id and library's id.
In the middle table (the foreign keys one), it wasn't indicated if the foreign keys were making the primary so I asked him to make sure, and he told me this table had no primary key.
That left me confused, and refused to explain more, and just said “it works without a primary key, we saw this in class.”
Because the model was a capture from phpmyadmin's designer view and because he didn't really justified. I feel suspicious toward this and I think he tried it, and that MySQL let him do it but that is wrong. Can someone explain this better? I'm totally fine with being wrong, I just want to know.
To make it a bit clearer, he precised that there was no primary key at all and not just the two foreign key together like I thought.
The "middle" table is usually referred to as a junction table. In this case, it represents relationships between people and libraries. When your teacher mentioned that this table has no primary key, it could be that the table has no formal primary key defined. Of course, internally MySQL needs to assign a unique ID to each record to internally keep track of the table. But there may not be an explicit primary key. However, most likely there is a composite primary key in this table composed of the combination of the people and libraries columns. And, also most likely, this pair of values is unique in each row, unless a given relationship could be repeated. If such repetitions exist, then the table is not normalized.
Related
I started to design a database that tracks system events by following some online tutorials, and some easy examples start by assigning auto-incrementing IDs as primary keys. I looked at my database, I don't really need IDs. Out of all my columns, the timestamp and device ID are the two columns that together identifies an unique event.
What my program does right now is to pull some events from system log in the past x minutes and insert these events to the database. However, I could be going too much into the past that the events overlap with what's already in the database. As I mentioned before, timestamp and device ID are the two fields that uniquely identify an event. My question is, should I use these two fields as my primary key and use "Insert ignore" from now on so I can avoid having duplicate records?
It is a good practise to never have your business values as table's primary key and always to use synthetic, e.g. autoincrement, values for this. You will make your life easier in the future when business requirements change :)
We are currently struggling with exactly this situation. Have a column with business values as a primary key for 2 years and now painfully introducing an autoincrement one.
You may need to use foreign key from other table to this in the future to link some rows between two tables. It is easier with one-column primary key.
But if you don't need it now - no need to create column special for index. Table can be altered in future to add such column with autoincrement and move primary key to it.
I know this question has been asked a lot but my example seems different.
I have two entities: Doctor and Client, and a many-to-many relationship between them to create the entity Appointment, which has, say "appointment_date_time" for an attribute.
I'm using the foreign keys from Doctor and Client to create a composite primary key in Appointment, but since there can be many appointments between the same doctor and person, should the "date_time" also be included as part of the primary key so there's no duplicates? Or would the two foreign keys be enough to query off of?
Thanks!
Your PRIMARY KEY needs to always be unique, so including the datetime would make an usable composite PRIMARY KEY that would (probably, unless you could have multiple appointments at the same time for 2 different purposes, which might happen) be unique.
However this is unlikely the best approach practically speaking, as if they move the appointment then this time will change (or maybe changed the doctor). Then you have no way to reference the appointment statically, say for example if you associated some extra data to it during creation or had to reference it as an audit entry. It also means any references to it that you do create would need to store all 3 columns.
As such I would simply look to create an auto incrementing primary key in this case, and simply index on both doctor and client for fast searches.
A friend of mine just sent me an image of his new api database design.
When I saw it, I noticed that his user table had three primary ids.
I actually thought this wouldn't be possible.
It got me thinking... Is it okay to do this? As long as each column is unique?
I can't seem to find a reason not to do this, except the id is not primary if there are more than one.
Is this a bad database design? And why?
There should be only one column(s) designated as the PRIMARY KEY per table and most DB's will disallow usage of multiple PRIMARY KEYS. Note that a PRIMARY KEY can span multiple columns. Use UNIQUE for other column(s) that require unique values. UNIQUE keys can also be used in foreign key relationships.
When using a foreign key in a table, is it good form to change the name of the key for that table to make it clear what function the key performs in the table, or is it good form to retain the original name, to make it clear that it is a foreign key?
Example:
a table keeps track of users, the primary key is user_id
a second table stores articles on the website and keeps track of the author with the foreign key user_id.
In the context of the second table it would make more sense to call the foreign key author. In the context of the whole database it would make more sense to call the foreign key user_id
Is there a general convention that deals with this situation, or is that what comments are for?
Well, if you have a movie table you wouldn't want columns called person_id and person_id, but rather producer and director, or perhaps producer_id and director_id, or maybe producer_person_id and director_person_id.
I know movies can have multiple directors and multiple producers; this was just an example. Any case in which a table has two foreign keys to the same table will show you that you cannot in principle stick completely to a convention of using only the table name in the column name. You can use both (as in the producer_person_id example) but that leads to long column names.
Don't use comments. No one reads them. Okay that was just snark, perhaps, but in general favor descriptive names to comments!
Aside from the two-foreign-key issue, I'm not really aware of any univerally accepted convention.
It is conventional to know the database schema's modelling and designing. Whatever makes sense to the database administrator. Business logic is not concerned with how the database is named, only the results. For the database administrator if it make more sense to rename the foreign key author_id to refer to user_id of another table then do so and notate it in some documents that T2.author_id must exist in T1.user_id. When transitioning from modelling to designing the database (which is where you are now) it would make sense to just keep it simple, but you can change the foreign key names so long as you can remember them (and document them as well).
Note to Mod: I read through about a dozen posts that seemed to pertain to this issue, but none of them answered my question. Please do not flag this post for deletion; this is not a duplicate question.
I am building a database for a web-gallery that will contain many-to-many relationships. For example, tags and images. Obviously, to accomplish this a third, link, table will be created. I can see a use for having a primary key column in the tags table and the images table, but I can't imagine a use for it in the links table. It would just take up server space. So, I'm thinking of just not having a primary key column in the links table. Does MySQL allow this? Or, would there be any compelling reason to have a primary key in the links table? Thanks.
Link Table:
+--------------+---------+-----------+
| primary key? | tag ids | image ids |
+--------------+---------+-----------+
Clarification
Will not having a primary key in a table break the database?
There is no requirement that you have a primary key.
However, there is also no requirement that a primary key be only one field. In this case you might declare your primary key to be (tag_id, image_id).
You've got a question in reply to another post that gives me the idea that maybe you're thinking you should concatenate the two fields to make the primary key. Don't. Define the key as
alter table link add primary key (tag_id, image_id);
Do NOT say
alter table link add primary key (tag_id + image_id);
(I think "+" is the concatenation operator in MySQL. It's been a while. The SQL standard is "&" but MySQL uses that for something else.)
There's a big difference between the two, namely, in the first case, 25,34 and 253,4 are two different values, while in the second case they both get turned into 2534.
Will you always go from tag to image, or will you also want to go from image to tag? If you need to go in both directions, then you should create two indexes, or a primary key and an index, with the fields in both directions. Like:
create index link_tag_image on link(tag_id, image_id);
create index link_image_tag on link(image_id, tag_id);
If you make only the first (for example), then consider this query:
select tag.name
from image
join link on image.image_id=link.imagae_id
join tag on tag.tag_id=link.tag_id
where image.foo='bar'
This seems plausbile enough: find all the tags that match images that meet a certain condition. But without the second index, this query could take a very long time, because the db will have to read the entire link table sequentially to find all the records with a given image_id.
There is no need for primary key in the link table. Although a compound key is a good idea. Uniqueness can be achieved by using UNIQUE ( tag_ids, image_ids)
Yes, your primary key should be a compound/composite key of tag_id and image_id, i.e. PRIMARY KEY (tag_id, image_id). There's no need for an extra autoincrement column in this case.
When working with MySQL Workbench it's highly advisable because without a primary key it won't allow any access to your tables other than read only, which is a pain when trying to test your database. Although it does seem wasteful to have a PK that is never going to be referenced in a relationship.