I always see MySQL database primary keys as integers. Is that because primary keys must be integers, or because of ease of use when setting auto_increment on the column?
I am wondering just in case I want my primary key to be a varchar in the future.
You can use varchar as well as long as you make sure that each one is unique. This however isn't ideal (see article link below for more info).
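What you are describing is called a natural key. A primary key that is auto-incremented and managed by the RDBMS is called a surrogate key, which is the generally preferred approach. If you want the database to auto-increment the key for you, the column has to be numeric, which in practice means an integer.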
Learn more:
Surrogate Keys vs Natural Keys for Primary Key?
Why I prefer surrogate keys instead of natural keys in database design
Why Integers Make Good Primary Keys
It's often easier to use an integer for indexing, in comparison to a string or composite key, because it lends itself well to treating results (conceptually or in practice) as an array. Depending on the database implementation, integers may also be faster to access, sort, or compare, and the integer type usually offers additional features like auto-incrementing that aren't available for other data types. How would you go about auto-incrementing a composite key, for example?
MySQL has this to say about the primary key:
The primary key for a table represents the column or set of columns that you use in your most vital queries. It has an associated index, for fast query performance. Query performance benefits from the NOT NULL optimization, because it cannot include any NULL values.
SQL allows any non-null, unique column (or set of columns) to be used as a primary key. So if you don't care about auto-incrementing, you can usually build your primary key on any column or set of columns that is UNIQUE and NOT NULL.
Consider Your Application's Expectations
While not a hard requirement, some frameworks optimize for integer primary keys. For example, Ruby on Rails facilitates the use of an auto-incrementing primary key by default; you have to deliberately work against the convention if you want to use something different.
That doesn't mean one should or shouldn't use integers as primary keys. It just means the choice of primary key is driven in part by your underlying database system and the queries you expect to run against it, and in part by the applications you expect to use to retrieve the data. All of those things should be taken into consideration when considering candidate keys.
A primary key is unique. With an int, that condition is easy to satisfy using auto-increment. If you make it a char, you have to come up with your own way to keep it unique whenever you add a row.
No, the primary key does not have to be an integer; it's just very common that it is. As an example, we have user IDs here that can have leading zeroes and so must be stored in a varchar field. That field is used as the primary key in our Employee table.
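To illustrate the leading-zero point, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example runs anywhere), and the employee table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table keyed by a text user ID (TEXT stands in for VARCHAR).
conn.execute(
    "CREATE TABLE employee (user_id TEXT NOT NULL PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("INSERT INTO employee VALUES (?, ?)", ("00042", "Ada"))

# The text key comes back exactly as it was stored.
stored_id = conn.execute("SELECT user_id FROM employee").fetchone()[0]

# An integer representation cannot keep the leading zeroes.
as_integer = int(stored_id)

print(repr(stored_id), as_integer)
```

The same distinction should hold for a VARCHAR versus INT column in MySQL: the string keeps its formatting, the number does not.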
An integer primary key is easier to manage and makes its index more efficient.
As you know, keys are indexed automatically, and the indexes are stored as B-trees. Integer comparisons are cheap, which makes traversing such an index fast.
There is no requirement that a key be an int; you can declare it as a varchar too.
Basically, a primary key needs to fulfill only two conditions: it has to be a NOT NULL column and it has to be unique. Any type of column that respects these two conditions can be set as a primary key. If the primary key spans multiple columns, then all of those columns need to be NOT NULL.
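The two conditions can be checked with a small sketch. It uses Python's sqlite3 module as a stand-in for MySQL, and the country table is hypothetical. Note one assumption specific to the stand-in: SQLite needs NOT NULL spelled out on a text primary key, whereas MySQL makes primary key columns NOT NULL implicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A non-integer primary key: unique and (explicitly) NOT NULL.
conn.execute(
    "CREATE TABLE country (code TEXT NOT NULL PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("INSERT INTO country VALUES ('DE', 'Germany')")


def try_insert(code, name):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO country VALUES (?, ?)", (code, name))
        return True
    except sqlite3.IntegrityError:
        return False


accepted_new = try_insert("FR", "France")             # new key value
accepted_duplicate = try_insert("DE", "Deutschland")  # violates uniqueness
accepted_null = try_insert(None, "Nowhere")           # violates NOT NULL

row_count = conn.execute("SELECT COUNT(*) FROM country").fetchone()[0]
print(accepted_new, accepted_duplicate, accepted_null, row_count)
```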
While you can use other column types as primary keys, integers are the easiest to manage and are typically among the fastest to index.
I'm fairly new to SQL programming, and I have a table that has two columns: UUID (a unique 32-character hex string) and permission (just a permission for the system I'm using, like 'admin' or 'blacklist').
I'm not the greatest on the concept of the primary key, so I was wondering if it's valid to use the UUID as my primary key, considering it is 100% unique without whitespace and I am already using it to look up rows in my queries. I'm not joining the table with anything and more than likely never will, so is it valid to use the UUID as my primary key without any unforeseen consequences?
It is valid to use a UUID as a primary key. It meets the two conditions required of a primary key:
It is unique.
It is never NULL.
However, it is a bad idea. Why? MySQL's default storage engine, InnoDB, automatically clusters the data by the primary key. That is, the rows are physically stored in primary key order. UUIDs are not sequential, so inserts can land anywhere in the table, requiring existing data to be moved.
I would recommend a simple auto-incremented primary key, with the UUID declared as unique.
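A minimal sketch of that recommendation follows. The table and column names are hypothetical, the MySQL DDL is included only as a reference string and is not executed, and the runnable part uses Python's sqlite3 module as a stand-in for MySQL.

```python
import sqlite3
import uuid

# MySQL form of the idea (not executed here; names are illustrative).
MYSQL_DDL = """
CREATE TABLE permissions (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    uuid CHAR(32) NOT NULL UNIQUE,
    permission VARCHAR(32) NOT NULL
)
"""

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE permissions (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        uuid TEXT NOT NULL UNIQUE,
        permission TEXT NOT NULL
    )
""")


def add_permission(permission, hex_uuid=None):
    """Insert a row (with a fresh random UUID by default); return (id, uuid)."""
    hex_uuid = hex_uuid or uuid.uuid4().hex  # 32 hex characters, no dashes
    cur = conn.execute(
        "INSERT INTO permissions (uuid, permission) VALUES (?, ?)",
        (hex_uuid, permission),
    )
    return cur.lastrowid, hex_uuid


rows = [add_permission(p) for p in ("admin", "blacklist", "admin")]
ids = [row_id for row_id, _ in rows]  # sequential, so new rows append at the end

# The UNIQUE constraint still protects the UUID column.
try:
    add_permission("admin", hex_uuid=rows[0][1])
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(ids, duplicate_rejected)
```

Lookups by UUID still use an index (the one backing the UNIQUE constraint), while new rows are appended in id order.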
I am working on a banking application. During the database design I heard that the primary key should be auto-increment, but it would be easier to use the account number, which is a mixture of letters and digits, as the primary key for the account table. Is there any problem with using account_number as the primary key?
A primary key is by no means required to use the auto_increment property - it just needs to be a unique, not-null, identifier, so the account number would do just fine.
There are, however, two additional considerations worth keeping in mind:
Numbers generally take up less space than strings, which usually makes their indexes smaller, and thus faster. If you have a complicated schema and use a lot of joins on the primary key in your queries, this difference may be noticeable.
Some DBAs advocate the practice of having primary (and other) keys that are separate from your data. That way, if you one day change your account identifiers (e.g., the bank acquires another bank and has to incorporate its clients into its system), you'll just have to update some data, and not all your keys.
There's no problem with it, and you don't need auto_increment (but it's handy). A primary key just needs to be a unique, non-null value that distinguishes its row from all other rows.
A primary key should be unique but does not necessarily need to be auto_increment. As long as account_number is unique, there shouldn't be any problem.
Well, the primary key does not need to be auto-increment if we know it is always unique. We set the primary key to auto-increment because it is easy to handle. As you are writing banking software, the bank account number is always unique for every account. Therefore you can use that account number as a primary key. That's not a problem.
As long as the account number is unique, immutable and never NULL, it's not a problem to use it as a primary key.
If you aren't sure about the immutability, it's probably not a good idea. Things change, and updating a primary key is generally best avoided because you also need to update every foreign key that references it.
If you're not sure, a surrogate primary key along with a UNIQUE index on the account number is probably better.
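The surrogate-key-plus-UNIQUE layout can be sketched like this. The account and transfer tables and the account numbers are made up for illustration, and Python's sqlite3 module is used as a stand-in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
    CREATE TABLE account (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        account_number TEXT NOT NULL UNIQUE
    );
    CREATE TABLE transfer (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        account_id INTEGER NOT NULL REFERENCES account(id),
        amount_cents INTEGER NOT NULL
    );
""")

account_id = conn.execute(
    "INSERT INTO account (account_number) VALUES ('AB1001')"
).lastrowid
conn.executemany(
    "INSERT INTO transfer (account_id, amount_cents) VALUES (?, ?)",
    [(account_id, 1000), (account_id, 500)],
)

# The bank renumbers the account: one row changes, no foreign key is touched.
conn.execute(
    "UPDATE account SET account_number = 'XY77AB1001' WHERE id = ?", (account_id,)
)

balance_row = conn.execute("""
    SELECT a.account_number, SUM(t.amount_cents)
    FROM account AS a JOIN transfer AS t ON t.account_id = a.id
    GROUP BY a.id
""").fetchone()
referenced_ids = [r[0] for r in conn.execute("SELECT DISTINCT account_id FROM transfer")]

print(balance_row, referenced_ids)
```

The transfers still point at the same surrogate id after the renumbering, which is the whole appeal of this design.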
I'm putting together a new database and I have a few tables that contain temp data.
e.g.: user requests to change password - a token is stored and then later removed.
Currently I have a primary key on these tables that will auto-increment from 1 upwards.
AUTO_INCREMENT = 1;
I don't really see any use for this primary key... I will never reference it and it will just get larger.
Should tables like this have a primary key or not?
Short answer: yes.
Long answer:
You need your table to be joinable on something.
If you want your table to be clustered, you need some kind of a primary key.
If your table design does not need a primary key, rethink your design: most probably, you are missing something. Why keep identical records?
In MySQL, the InnoDB storage engine always builds a clustered index: if you don't declare a PRIMARY KEY (and there is no UNIQUE index on NOT NULL columns it can use instead), it generates one on a hidden row ID column that you don't have access to.
Note that a PRIMARY KEY can be composite.
If you have a many-to-many link table, you create the PRIMARY KEY on
all fields involved in the link. Thus you ensure that you don't have
two or more records describing one link.
Besides the logical consistency issues, most RDBMS engines will
benefit from including these fields in a UNIQUE index.
And since any PRIMARY KEY involves creating a UNIQUE index, you should
declare it and get both logical consistency and performance.
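As a rough illustration of the composite key on a link table, here is a sketch using Python's sqlite3 module as a stand-in for MySQL. The enrollment table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Many-to-many link table: the primary key covers every column in the link.
conn.execute("""
    CREATE TABLE enrollment (
        student_id INTEGER NOT NULL,
        course_id INTEGER NOT NULL,
        PRIMARY KEY (student_id, course_id)
    )
""")


def link(student_id, course_id):
    """Return True if the link was stored, False if it already existed."""
    try:
        conn.execute("INSERT INTO enrollment VALUES (?, ?)", (student_id, course_id))
        return True
    except sqlite3.IntegrityError:
        return False


results = [link(1, 10), link(1, 11), link(2, 10), link(1, 10)]
link_count = conn.execute("SELECT COUNT(*) FROM enrollment").fetchone()[0]

print(results, link_count)
```

Each column may repeat on its own; only the combination has to be unique, so the last call is the only one rejected.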
There is already an SO thread with the same discussion.
Some people still share your opinion. Have a look here.
My personal opinion is that you should have primary keys, to identify each row and make it unique. Which key you choose depends on your program logic: it can be an auto-increment column, a composite key, or whatever else fits.
It's obvious that we already have another unique piece of information about each user, and that is the username. So why do we need another unique thing for each user? Why should we also have an id for each user? What would happen if we omitted the id column?
Even if your username is unique, there are a few advantages to having an extra id column instead of using the varchar as your primary key.
Some people prefer to use an integer column as the primary key, to serve as a surrogate key that never needs to change, even if other columns are subject to change. Although there's nothing preventing a natural primary key from being changeable too, you'd have to use cascading foreign key constraints to ensure that the foreign keys in related tables are updated in sync with any such change.
The primary key being a 32-bit integer instead of a varchar can save space. The saving is multiplied because every other table that references your user table has to store a foreign key column of the same type, so the choice between an int and a varchar there can be a good reason on its own.
Inserting into the primary key index is a little bit more efficient if you add new rows to the end of the index, compared to wedging them into the middle of the index. Indexes in MySQL tables are usually B+Tree data structures, and you can study these to understand how they perform.
Some application frameworks prefer the convention that every table in your database has a primary key column called id, instead of using natural keys or compound keys. Following such conventions can make certain programming tasks simpler.
None of these issues are deal-breakers. And there are also advantages to using natural keys:
If you look up rows by username more often than you search by id, it can be better to choose the username as the primary key, and take advantage of the index-organized storage of InnoDB. Make your primary lookup column be the primary key, if possible, because primary key lookups are more efficient in InnoDB (you should be using InnoDB in MySQL).
As you noticed, if you already have a unique constraint on username, it seems a waste of storage to keep an extra id column you don't need.
Using a natural key means that foreign keys contain a human-readable value, instead of an arbitrary integer id. This allows queries to use the foreign key value without having to join back to the parent table for the "real" value.
The point is that there's no rule that covers 100% of cases. I often recommend that you should keep your options open, and use natural keys, compound keys, and surrogate keys even in a single database.
I cover some issues of surrogate keys in the chapter "ID Required" in my book SQL Antipatterns Volume 1: Avoiding the Pitfalls of Database Programming.
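To make the cascading foreign key point concrete, here is a minimal sketch with a natural key. It uses Python's sqlite3 module as a stand-in for MySQL (InnoDB also supports ON UPDATE CASCADE), and the users and posts tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
    CREATE TABLE users (username TEXT NOT NULL PRIMARY KEY);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        author TEXT NOT NULL REFERENCES users(username) ON UPDATE CASCADE,
        title TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO users VALUES ('alice')")
conn.executemany(
    "INSERT INTO posts (author, title) VALUES (?, ?)",
    [("alice", "First post"), ("alice", "Second post")],
)

# Changing the natural key: the cascade rewrites every referencing row.
conn.execute("UPDATE users SET username = 'alice_w' WHERE username = 'alice'")

# The foreign key is human-readable, so no join back to users is needed.
authors = [row[0] for row in conn.execute("SELECT author FROM posts ORDER BY id")]
print(authors)
```

The trade-off is visible here: the rename touched every post, where a surrogate key would have changed a single row.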
This identifier is known as a Surrogate Key. The page I linked lists both the advantages and disadvantages.
In practice, I have found them to be advantageous because even superkey data can change over time (i.e. a user's email address may change and thus any corresponding relations must change), but a surrogate key never needs to change for the data it identifies because its value is meaningless to the relation.
It's also nice from a JOIN standpoint because it can be an integer with a smaller key length than a varchar.
I can say that in practice I prefer to use them. I have been bitten too many times by having multiple-column primary keys or a data-representative superkey used across tables having to become non-unique later due to changing requirements during development, and that is not a situation you want to deal with.
In my opinion, every table should have a unique, auto-incremented id.
Here are some practical reasons. If you have duplicate rows, you can readily determine which row to delete. If you want to know the order in which rows were inserted, you have that information in the id. As for users, there's more than one "John Smith" in the world. An id provides a key for foreign references.
Finally, just about anything that might describe a user -- a name, an address, a telephone number, an email address -- could change over time.
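The duplicate-row argument can be sketched as follows, using Python's sqlite3 module as a stand-in for MySQL. The contacts table and its data are invented for the example. One caveat, as far as I know: MySQL does not accept a DELETE whose subquery reads the same table directly, so there the subquery would have to be wrapped in a derived table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE contacts (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT NOT NULL,
        email TEXT NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO contacts (name, email) VALUES (?, ?)",
    [
        ("John Smith", "john@example.com"),
        ("John Smith", "john@example.com"),   # accidental duplicate
        ("John Smith", "smith@example.org"),  # a different John Smith
    ],
)

# Keep the earliest row of each (name, email) group; the id tells them apart.
conn.execute("""
    DELETE FROM contacts
    WHERE id NOT IN (SELECT MIN(id) FROM contacts GROUP BY name, email)
""")

remaining = conn.execute("SELECT id, email FROM contacts ORDER BY id").fetchall()
print(remaining)
```

Without the id column, the two identical rows could not be addressed separately in a WHERE clause.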
In MySQL we have:
1: index fields, 2: unique fields, and 3: PK fields.
An index makes a field quick to look up.
Unique means a value can appear in only one row of the table.
PK = index + unique (and not null).
A table may have lots of unique fields, like
username, passport code, or email.
But you still need a field like ID that is both unique and indexed (= PK). First, it always refers to one thing and never changes; second, it is unique; and third, it is simple (because it is usually a number).
One reason to have a numeric id is that an index on it is leaner than one on a text field, reducing index size and the processing time required to look up a specific user. It also takes fewer bytes to store when another table references a user (as relational databases do).
Will it ever happen that we design a table that doesn't need a primary key?
No.
The primary key does a lot of stuff behind-the-scenes, even if your application never uses it.
For example: clustering improves efficiency (because heap tables are a mess).
Not to mention, if ANYONE ever has to do something on your table that requires pulling a specific row and you don't have a primary key, you are the bad guy.
Yes.
If you have a table that will always be fetched completely, and is referred to by no other tables, such as some kind of standalone settings or configuration table, then there is no point in having a primary key. Some could even argue that adding a PK in this situation misrepresents how such a table is normally used.
It is rare, and probably when it is most often done it is done wrongly, but they do exist, and such instances can be valid.
Depends.
What is primary key / unique key?
In relational database design, a unique key can uniquely identify each row in a table, and is closely related to the Superkey concept. A unique key comprises a single column or a set of columns. No two distinct rows in a table can have the same value (or combination of values) in those columns if NULL values are not used. Depending on its design, a table may have arbitrarily many unique keys but at most one primary key.
So, when you don't have to differentiate (uniquely identify) each row, you don't have to use a primary key.
For example, a big log table without a primary key can take up somewhat less space and be faster to insert into.
A primary key is not mandatory, but it is not good practice to create tables without one. The DBMS automatically creates an index on the PK, but you can also make another column unique and index it. For example, the user_name column in a users table is usually made unique and indexed, so you may choose to skip the PK there. It is still a bad idea, though, because the PK can be used as a foreign key target for referential integrity.
In general, you should almost always have a PK in a table unless you have a very strong reason to justify not having one.
Link tables (in many-to-many relationships) may not have a primary key. But I personally like to have a PK in those tables as well.