Quite often I encounter a situation like this:
table `user_adress`
+-----------+---------+-------------+-----------+
| adress_id | user_id | adress_type | adress    |
+-----------+---------+-------------+-----------+
|         1 |       1 | home        | adressXXX |
|         2 |       2 | home        | adressXXX |
|         3 |       3 | home        | adressXXX |
|         4 |       1 | work        | adressXXX |
|         5 |       2 | work        | adressXXX |
|         6 |       1 | second_home | adressXXX |
+-----------+---------+-------------+-----------+
When I want to use it, I run queries like this:
SELECT `adress` FROM `user_adress` WHERE `user_id`=1;
This seems quite normal, but the thing is that I have a "useless" adress_id column that has no other purpose than to be an auto-incrementing primary key, just for the sake of having a primary key in a MySQL table. I never use or need this number. So I figured that I should not use a primary key in this table at all: remove adress_id entirely and set an INDEX (not unique) on the user_id column. That seems good to me, or am I wrong?
I have some doubts, because everywhere I read I see advice that every table should, or even must, have a primary key. But why? Perhaps my database is badly designed if I allowed this to happen, but looking at my extremely simple example table, I can't imagine how this could be the case in every situation, especially in such simple cases. I have definitely misunderstood some simple, basic rules about creating tables and indexing them properly. Where is the hole in my thinking?
Purely based on your table structure, I would say that your primary key is incorrect.
Instead, it looks like your primary key should be:
PRIMARY KEY (user_id, address_type)
You are correct that every table should have a primary key ideally, but primary keys can be over multiple fields.
It is still sometimes easier to have a simple auto-incrementing id as your primary key. If a table has no primary key and no suitable unique NOT NULL index, the InnoDB storage engine will actually do this secretly in a hidden column.
Maybe in your limited example it's not needed, but in a lot of real-world cases it can just make it easier to work with the data. In that sense I would say that having an artificial auto-incrementing primary key is not a best practice from an academic standpoint, but it can be a good idea from a 'real world, operational, and MySQL admin' perspective.
There are also ORM systems out there that simply require this (bad as that is).
As is evident in your data, the primary key allows direct access to a single row without any problem or ambiguity (especially for deletes or updates).
This is specifically the purpose of a primary key.
Given that you may need to join this table to other tables by user_id, an index (not unique) on user_id
create index myidx on mytable(user_id)
is really useful for faster joins, because it allows direct access to only the rows related to a single user_id.
It's true that a relational database table needs a primary key.
But it all comes down to the definition of a primary key. A primary key is NOT necessarily a single integer column that auto-increments.
A primary key is any column or set of multiple columns that can uniquely identify every row. In your case, the combination of user_id and address_type can do this (as Evert posted already).
So if you make your table like this:
CREATE TABLE user_address (
user_id INT NOT NULL,
address_type varchar(10) NOT NULL,
address TEXT NOT NULL,
PRIMARY KEY (user_id, address_type)
);
Then you can update or delete one specific row at a time like this:
UPDATE user_address SET ...
WHERE user_id = ? AND address_type = ?;
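Here is a minimal sketch of that composite key in action. It uses Python's built-in sqlite3 module as a stand-in for MySQL, purely so it runs without a server (that substitution and the sample rows are my assumptions, not part of the question). The key rejects a second 'home' address for the same user, and an UPDATE that names both key columns touches exactly one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE user_address (
        user_id      INTEGER NOT NULL,
        address_type TEXT    NOT NULL,
        address      TEXT    NOT NULL,
        PRIMARY KEY (user_id, address_type)
    )
""")
# Invented sample rows, for illustration only.
conn.executemany(
    "INSERT INTO user_address VALUES (?, ?, ?)",
    [(1, "home", "addr A"), (1, "work", "addr B"), (2, "home", "addr C")],
)

# A second 'home' row for user 1 violates the composite primary key.
try:
    conn.execute("INSERT INTO user_address VALUES (1, 'home', 'addr D')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Naming both key columns identifies exactly one row.
cur = conn.execute(
    "UPDATE user_address SET address = ? WHERE user_id = ? AND address_type = ?",
    ("addr B2", 1, "work"),
)
rows_updated = cur.rowcount

# The original lookup by user_id alone still works, because user_id is the
# leading column of the primary key.
addresses = [r[0] for r in conn.execute(
    "SELECT address FROM user_address WHERE user_id = 1 ORDER BY address_type"
)]
print(duplicate_rejected, rows_updated, addresses)
```

Because user_id is the first column of the key, the lookup from the question does not need a separate index on user_id.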
Some people feel that it's more convenient to enforce a convention that every table should have a single integer column as its primary key. They may even insist that the column must be called id for the sake of consistency.
There's some advantage in consistency, but on the other hand, it's kind of brainless to insist on that convention even when it's not helpful.
In my web application, the user can define documents and give them a unique name that identifies that document and a friendly name that a human will use to refer to the document. Take the following table schema as an example:
| id | name | friendly_name |
-----------------------------------------------
| 2 | invoice-2 | Invoice 2 |
In this example I've used the id column as the primary key, which is an auto incrementing number. Since there's already a natural ID for documents (name) I could also do this:
| name | friendly_name |
--------------------------------------
| invoice-2 | Invoice 2 |
In this example, name is the primary key of the document. We've eliminated the id field as it's essentially just a duplicate of name, since every document in the table must have a unique name anyway.
This would also mean that when I refer to a document from a foreign key relationship I'd have to call it document_name rather than document_id.
What's the best practice regarding this? Theoretically it's entirely possible for me to use a VARCHAR for the primary key, but does it come with any downsides such as performance overhead?
There are two schools of thought on this topic.
There are some who hold strongly to the belief that using a "natural key" as the primary key for an entity table is desirable, because it has significant advantages over a surrogate key.
There are others who believe that a "surrogate" key can provide some desirable properties which a "natural" key may not.
Let's summarize some of the most important and desirable properties of a primary key:
minimal - fewest possible number of attributes
simple - native datatypes, ideally a single column
available - the value will always be available when the entity is created
unique - absolutely no duplicates, no two rows will ever have the same value
anonymous - carries no hidden "meaningful" information
immutable - once assigned, it will never be modified
(There are some other properties that can be listed, but some of those can be derived from the properties above: not null, can be indexed, etc.)
I break the two schools of thought regarding "natural" and "surrogate" keys as the "best" primary keys into two camps:
1) Those who have been badly burned by an earlier decision to elect a natural key as the primary key, and
2) Those who have not yet been burned by that decision.
Of course you can.
create table sometbl(
`name` varchar(250) NOT NULL PRIMARY KEY,
`friendly_name` varchar(400)
);
The time to access an integer key or a varchar key (unless it's too long) is practically the same. Even if there is a difference, it won't be your main bottleneck. As long as a column is declared as a key, MySQL can access it very fast.
In my view, an auto-incrementing integer is not a real primary key. It's just a serial number for the row. When you look at the real object you'll see it doesn't have any serial number. So the primary key should be based on those real properties.
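To make the natural-key option concrete, here is a minimal sketch, again using Python's built-in sqlite3 as a stand-in for MySQL. The document_revision table and the new name 'invoice-0002' are hypothetical, invented for this example. It shows the foreign key column being called document_name, and what happens to referencing rows when the name (the key) is changed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("""
    CREATE TABLE document (
        name          TEXT NOT NULL PRIMARY KEY,
        friendly_name TEXT
    )
""")
# Hypothetical child table that refers to a document by its name.
conn.execute("""
    CREATE TABLE document_revision (
        document_name TEXT    NOT NULL
            REFERENCES document(name) ON UPDATE CASCADE,
        revision      INTEGER NOT NULL,
        PRIMARY KEY (document_name, revision)
    )
""")
conn.execute("INSERT INTO document VALUES ('invoice-2', 'Invoice 2')")
conn.execute("INSERT INTO document_revision VALUES ('invoice-2', 1)")

# Renaming the document changes its key, so every referencing row has to
# change as well; ON UPDATE CASCADE does that for us here.
conn.execute("UPDATE document SET name = 'invoice-0002' WHERE name = 'invoice-2'")
child_names = [
    r[0] for r in conn.execute("SELECT document_name FROM document_revision")
]
print(child_names)
```

This is the "immutable" property from the list above in practice: a natural key works fine as long as it never changes, and a rename has to be propagated to every table that references it.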
I'm sure this is simple stuff to many of you, so I hope you can help easily.
If I have a MySQL table on the "many" side of a "one to many" relationship - like this:
Create Table MyTable(
ThisTableId int auto_increment not null,
ForeignKey int not null,
Information text
)
Since this table would always be used via a join using ForeignKey, it would seem useful to make ForeignKey a clustered index so that foreign keys would always be sorted adjacently for the same source record. However, ForeignKey is not unique, so I gather that it is either not possible or bad practice to make this a clustered index? If I try and make a composite primary key using (ForeignKey, ThisTableId) to achieve both the useful clustering and uniqueness, then there is an error "There can only be one auto column and it must be defined as a key".
I think perhaps I am approaching this incorrectly, in which case, what would be the best way to index the above table for maximum speed?
InnoDB requires that if you have an auto-increment column, it must be the first column in a key.
So you can't define the primary key as (ForeignKey, ThisTableId) -- if ThisTableId is auto-increment.
You could do it if ThisTableId were just a regular column (not auto-increment), but then you would be responsible for assigning a value that is at least unique among other rows with the same value in ForeignKey.
One method I have seen used is to make the column BIGINT UNSIGNED, and use a BEFORE INSERT trigger to assign the column a value from the function UUID_SHORT().
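Another way to assign the value yourself is a per-parent counter. A minimal sketch, assuming Python's built-in sqlite3 as a stand-in for MySQL; the insert_child helper and the sample values are hypothetical. Under concurrent writers you would also need locking or a retry on duplicate-key errors, which this sketch does not attempt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE MyTable (
        ForeignKey  INTEGER NOT NULL,
        ThisTableId INTEGER NOT NULL,   -- regular column, not auto-increment
        Information TEXT,
        PRIMARY KEY (ForeignKey, ThisTableId)
    )
""")

def insert_child(conn, foreign_key, information):
    """Insert a row, numbering it 1, 2, 3, ... within its ForeignKey group."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            """
            INSERT INTO MyTable (ForeignKey, ThisTableId, Information)
            SELECT ?, COALESCE(MAX(ThisTableId), 0) + 1, ?
            FROM MyTable
            WHERE ForeignKey = ?
            """,
            (foreign_key, information, foreign_key),
        )

for fk, info in [(123, "a"), (123, "b"), (234, "c"), (123, "d")]:
    insert_child(conn, fk, info)

rows = conn.execute(
    "SELECT ForeignKey, ThisTableId FROM MyTable ORDER BY ForeignKey, ThisTableId"
).fetchall()
print(rows)
```

The values are only unique within one ForeignKey, which is all the composite primary key requires.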
@ypercube correctly points out another solution: The InnoDB rule is that the auto-increment column should be the first column of some key, and if you create a normal secondary key, that's sufficient. This allows you to create a table like the following:
CREATE TABLE `MyTable` (
`ForeignKey` int(11) NOT NULL,
`ThisTableId` int(11) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`ForeignKey`,`ThisTableId`),
KEY (`ThisTableId`)
) ENGINE=InnoDB;
And the auto-increment works as expected:
mysql> INSERT INTO MyTable (ForeignKey) VALUES (123), (234), (345), (456);
mysql> select * from MyTable;
+------------+-------------+
| ForeignKey | ThisTableId |
+------------+-------------+
| 123 | 1 |
| 234 | 2 |
| 345 | 3 |
| 456 | 4 |
+------------+-------------+
Straight into this one. I have a table for a sort of "like" feature. This table naturally has the following:
Name | Type | Attributes | (Comment)
Post ID | int | index | ID of the post which was "Liked"
Topic ID | int | index | ID of the topic which contains the "Liked" post
Member ID | int | index | ID of the member who "Liked" the post
Date | bigint | index | Date/time of "Like"
As you can see, there's no primary key. This seems natural. The only functions which need performing are the INSERT (for "Like"), DELETE (for "Unlike") and searching for likes in order of most recent by the post or member who gave them.
Each entry will obviously be very 'UNIQUE' - as only one like is needed per person per post. There seems absolutely no need for a unique primary index, as if duplicates occur (somehow) I will want to DELETE them all, not just one with a particular ID. Same with insertion, no one can like the same thing twice. And these "likes" will only ever be selected using the indexes from other tables.
Yet, phpMyAdmin now forbids me from any manual editing, copying or deleting. This is also fine, but prompted me to further look up the logistics of not having a primary key. When I found a Stack Overflow answer, the general opinion was that it's "very rare" to not need a primary key.
So, either I've found one of these very rare moments, or it's not that rare at all. My scenario seems quite simple and common, so there should be a more definite answer. Everything seems natural this way, I will never ever need to actually use a primary key. Therefore, I'd think it'd be simpler not to have one. Are there any really mysterious (and somewhat magical) ways of MySQL I'm overlooking? Or am I safe to leave out a useless auto-incrementing primary ID key (which could reach its limit way before any of the currently used IDs would, anyway) at least until the time I find a use for it (never)?
You've said that Post ID and Member ID define the uniqueness of a row (and that Topic ID is secondary, included only for convenience).
So, why not have a primary key on (Post ID, Member ID)? If you already have UNIQUEness constraints on them, then this is not a big leap.
CREATE TABLE `Likes` (
`PostID` INT UNSIGNED NOT NULL,
`TopicID` INT UNSIGNED NOT NULL,
`MemberID` INT UNSIGNED NOT NULL,
`Date` DATETIME NOT NULL,
PRIMARY KEY (`PostID`, `MemberID`),
FOREIGN KEY (`PostID`) REFERENCES `Posts` (`ID`) ON DELETE CASCADE,
FOREIGN KEY (`MemberID`) REFERENCES `Members` (`ID`) ON DELETE CASCADE
) Engine=InnoDB;
(I don't know enough about TopicID to suggest key constraints for it, but you may wish to add some.)
Certainly adding an arbitrary auto-incrementing field is pointless, but that doesn't mean that you can't have a meaningful primary key.
As an aside, I'd consider removing the TopicID field; if you have your foreign keys set up properly then it should be trivial to do post<->topic lookup without it, and in this instance you're duplicating data and violating the relational model!
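A minimal sketch of how "Like" and "Unlike" behave with that composite key, using Python's built-in sqlite3 as a stand-in for MySQL. The like and unlike helpers and the sample dates are hypothetical, TopicID is left out as suggested above, and the foreign keys are omitted to keep the example self-contained. In MySQL the duplicate-tolerant insert would be INSERT IGNORE rather than SQLite's INSERT OR IGNORE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Likes (
        PostID   INTEGER NOT NULL,
        MemberID INTEGER NOT NULL,
        Date     TEXT    NOT NULL,
        PRIMARY KEY (PostID, MemberID)
    )
""")

def like(conn, post_id, member_id, when):
    """Record a like; a repeat like from the same member is silently ignored."""
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO Likes (PostID, MemberID, Date) VALUES (?, ?, ?)",
            (post_id, member_id, when),
        )

def unlike(conn, post_id, member_id):
    """Remove a like and return the number of rows deleted (0 or 1)."""
    with conn:
        cur = conn.execute(
            "DELETE FROM Likes WHERE PostID = ? AND MemberID = ?",
            (post_id, member_id),
        )
        return cur.rowcount

like(conn, 10, 7, "2024-01-01 12:00:00")
like(conn, 10, 7, "2024-01-02 09:00:00")  # same member, same post: ignored
like(conn, 10, 8, "2024-01-03 08:00:00")
count_before = conn.execute(
    "SELECT COUNT(*) FROM Likes WHERE PostID = 10"
).fetchone()[0]

removed = unlike(conn, 10, 7)
remaining_members = [r[0] for r in conn.execute(
    "SELECT MemberID FROM Likes WHERE PostID = 10 ORDER BY Date DESC"
)]
print(count_before, removed, remaining_members)
```

With the key in place, duplicates cannot occur in the first place, so there is never a set of duplicate rows to clean up.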
I have 3 tables: users, pages and users_pages
Users Table
+----+------+-----
| id | name | ...
+----+------+-----
Pages Table
+----+------+-----
| id | name | ...
+----+------+-----
users_pages table, which records which user is admin of which page.
+---------+---------+
| user_id | page_id |
+---------+---------+
| 1 | 1 | // means, user 1 is admin of page 1
+---------+---------+
In the users_pages table, the combination of user_id and page_id is a compound key (the primary key).
Is it possible to define user_id and page_id as foreign keys while they both together are the primary key?
Yes, absolutely. You haven't mentioned which relational database you are using, but this is common practice, and allowable in all relational databases I know of.
My attempt at an additional explanation:-
Primary and foreign keys are more like 'theoretical' things than hard physical things. When looking at the nuts and bolts, I find it useful to think only of indexes and constraints, not of 'keys' as such.
Thinking this way, a 'primary key' is actually a combination of two separate things:
A unique constraint. This checks for and refuses any attempts to create duplicates.
An index based on the field. This just makes it much faster to retrieve the record if you use that field to look it up (select * from table where pkey = 'x').
A 'foreign key' in practice is just a constraint, not much different from the unique constraint. It checks that the records exist in the other table, and refuses any attempt to create records with no corresponding entries in the referred-to table.
There is no reason why you can't have multiple constraints on the same field (that it is both unique and exists in another table), and whatever indexes are on the table in no way prevent you from adding any constraint you like. Therefore there is no problem with a field being part of a primary key and also having a foreign key constraint.
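A minimal sketch of both constraints working on the same columns, using Python's built-in sqlite3 as a stand-in since the question doesn't name a database. The try_insert helper and the sample rows are hypothetical. The primary key refuses a duplicate pair, and the foreign key refuses a user that doesn't exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE pages (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE users_pages (
        user_id INTEGER NOT NULL REFERENCES users(id),
        page_id INTEGER NOT NULL REFERENCES pages(id),
        PRIMARY KEY (user_id, page_id)
    );
    INSERT INTO users VALUES (1, 'alice');
    INSERT INTO pages VALUES (1, 'home');
    INSERT INTO pages VALUES (2, 'about');
    INSERT INTO users_pages VALUES (1, 1);
""")

def try_insert(user_id, page_id):
    """Return True if the row was accepted, False if a constraint refused it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO users_pages VALUES (?, ?)", (user_id, page_id)
            )
        return True
    except sqlite3.IntegrityError:
        return False

valid_ok = try_insert(1, 2)       # new pair, both sides exist
duplicate_ok = try_insert(1, 1)   # same pair again: primary key refuses it
orphan_ok = try_insert(2, 1)      # user 2 does not exist: foreign key refuses it
print(valid_ok, duplicate_ok, orphan_ok)
```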
I have a movie table and want to store alternative titles. I'll be storing the alternative titles/aliases in another table. I'm not sure what is the best primary key to use though.
I will have a movie_id INT field, and an alias varchar(255) field. Should the primary key be on both fields (since one movie can have more than one alias)? Should I add another field for the primary key instead, for example alias_id that just auto increments, but this serves no purpose otherwise. Or does this table need a primary key? Maybe it should just have a unique index on the alias and no primary key is needed?
Your movie_id could be your only PK and auto-incrementing. Then make a FK movie_id in your alternative alias table to match the alt. name with its original title.
movie_id | Title
--------------------
1 | "Jaws"
2 | "Star Trek"
3 | "Matrix 3"
movie_id | Alt_Title
------------------------
1 | "Death Shark"
1 | "Tales of the Deep"
3 | "Neo is Uber"
1 | "Another Jaws Title"
When you make an insert into the alt name table, you will have to make a join on the original title, and pull its movie_id to insert with.
Put the primary key on the id field in all cases, because the storage is efficient and lookups are quick. If you want to enforce uniqueness on other fields, use a unique index on them in addition to the primary key. Primary keys are NOT NULL and unique by definition.
To answer your question directly, you want to make the movie_id a foreign key in your alt_title table. Then the simplest thing is probably to create a separate alt_title_id field to be the primary key of the alt_title table. I wouldn't make the title the primary key, because it's awfully long and cumbersome to make a good key.
I'm not sure what you're doing with this data, but my impulse would be to create a single table to hold both primary and alternate titles, and then just have a flag to identify the primary. Assuming you have a bunch of other data about each movie, pull the title out of the basic movie record into a separate table. If you put them in one table, then if you want to search by either primary or alternate title, you just say
select whatever
from movie_title
join movie using (movie_id)
where title='Java Forever'
If you want to search by just primary title for some reason, fine, you write
select whatever
from movie_title
join movie using (movie_id)
where title='Java Forever' and is_primary=true
With two tables, if you want to search by primary title, sure, it's easy. But if you want to search by primary or alternate, you need a union, which is slow and painful. If the query is complex, joining on several other tables or pulling out a bunch of fields, all that extra complexity has to be written twice, once in each half of the union.
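A minimal sketch of the single-table layout, using Python's built-in sqlite3 as a stand-in for MySQL. The flag column name is_primary, the find_movies helper, and the choice of (movie_id, title) as the key are my assumptions; the titles are borrowed from the example in the other answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie (movie_id INTEGER PRIMARY KEY);
    CREATE TABLE movie_title (
        movie_id   INTEGER NOT NULL REFERENCES movie(movie_id),
        title      TEXT    NOT NULL,
        is_primary INTEGER NOT NULL DEFAULT 0,  -- 1 marks the main title
        PRIMARY KEY (movie_id, title)
    );
    INSERT INTO movie VALUES (1);
    INSERT INTO movie VALUES (3);
    INSERT INTO movie_title VALUES (1, 'Jaws', 1);
    INSERT INTO movie_title VALUES (1, 'Death Shark', 0);
    INSERT INTO movie_title VALUES (1, 'Tales of the Deep', 0);
    INSERT INTO movie_title VALUES (3, 'Matrix 3', 1);
    INSERT INTO movie_title VALUES (3, 'Neo is Uber', 0);
""")

def find_movies(title, primary_only=False):
    """Return movie ids whose primary or alternate title matches exactly."""
    sql = (
        "SELECT movie_id FROM movie_title "
        "JOIN movie USING (movie_id) WHERE title = ?"
    )
    if primary_only:
        sql += " AND is_primary = 1"
    return [r[0] for r in conn.execute(sql, (title,))]

by_any_title = find_movies("Death Shark")
by_primary_only = find_movies("Death Shark", primary_only=True)
main_title_hit = find_movies("Jaws", primary_only=True)
print(by_any_title, by_primary_only, main_title_hit)
```

Both searches are the same query with one optional extra condition, so nothing has to be written twice.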