Aside from the convenient auto-increment and UNIQUE features, does the PK actually speed up the index?
Will the speed be the same whether it's a non-PKed indexed INT or PKed (same column, two different tests)? If I had the same column on the same table on the same system, will it be faster if a UNIQUE INT column with an index also has PK enabled? Does PK make the index it coexists with faster?
Please, actual results only with system stats if you could be so kind.
The primary key for a table represents the column or set of columns that you use in your most vital queries. It has an associated index, for fast query performance. Query performance benefits from the NOT NULL optimization, because it cannot include any NULL values. With the InnoDB storage engine, the table data is physically organized to do ultra-fast lookups and sorts based on the primary key column or columns.
If your table is big and important, but does not have an obvious column or set of columns to use as a primary key, you might create a separate column with auto-increment values to use as the primary key. These unique IDs can serve as pointers to corresponding rows in other tables when you join tables using foreign keys.
See also: http://www.dbasquare.com/2012/04/04/how-important-a-primary-key-can-be-for-mysql-performance/ and http://www.w3schools.com/sql/sql_primarykey.asp
Rows in a base table are uniquely identified by the value of the primary key defined for the table. The primary key for a table is composed of the values of one or more columns.
Primary keys are automatically indexed to facilitate effective information retrieval.
The primary key index is the most effective access path for the table.
Other columns or combinations of columns may be defined as a secondary index to improve performance in data retrieval. Secondary indexes are defined on a table after it has been created (using the CREATE INDEX statement).
A secondary index may be useful when a search is regularly performed on a non-keyed column in a table with many rows; defining an index on that column may speed up the search. The index does not affect the search result, only the speed of the search.
It should be noted, however, that indexes create an overhead for update, delete and insert operations because the index must also be updated.
Indexes are internal structures which cannot be explicitly accessed by the user once created. An index will be used if the internal query optimization process determines it will improve the efficiency of a search.
SQL queries are automatically optimized when they are internally prepared for execution. The optimization process determines the most effective way to execute each query, which may or may not involve using an applicable index.
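To see the optimizer choosing (or not choosing) an index, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in, since it runs without a server; MySQL's EXPLAIN output and storage layout differ, and the table and index names (posts, idx_posts_title) are made up for illustration.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the optimizer's plan text for a query (last column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


def plans_before_and_after_index():
    """Compare plans for a non-keyed search, an indexed search, and a PK lookup."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
    conn.executemany(
        "INSERT INTO posts (title, body) VALUES (?, ?)",
        [(f"title-{i}", "x") for i in range(1000)],
    )
    by_title = "SELECT body FROM posts WHERE title = ?"

    # No index on title yet: the only option is to scan the whole table.
    before = query_plan(conn, by_title, ("title-500",))

    # Hypothetical secondary index on the searched column.
    conn.execute("CREATE INDEX idx_posts_title ON posts (title)")
    after = query_plan(conn, by_title, ("title-500",))

    # Lookup through the primary key needs no extra index.
    by_pk = query_plan(conn, "SELECT body FROM posts WHERE id = ?", (500,))

    conn.close()
    return before, after, by_pk


if __name__ == "__main__":
    for label, plan in zip(("before", "after", "by_pk"), plans_before_and_after_index()):
        print(label, "->", plan)
```

The exact wording of the plan text varies between SQLite versions, but the first plan should be a full scan, the second should name the new index, and the third should go through the primary key.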
Let's say you have a Users table and a Posts table.
Users
    id
    name
    email
Posts
    id
    contents
    user_id
If I add an index to "user_id" in the Posts table and set it as NOT NULL, can I expect the same effect as a foreign key?
I know that I can set user_id to any number, whereas a foreign key forces you to use a valid id. Let's assume that user_id is valid. Is there any performance benefit when we set a foreign key?
The main benefit of foreign keys is that they enforce data consistency, meaning that they keep the database clean. In other words, keys are indexes that have integrity rules applied to prevent corruption of data.
An index is a data structure built on columns of a table to speed up searches for records based on the values of the indexed columns. In other words, you gain search speed in exchange for insert/delete speed and storage.
Is there any performance benefit when we set foreign_key?
In performance terms, you will see no improvement.
Foreign keys will impact INSERT, UPDATE and DELETE statements because of the data checking rules, but keep in mind that your data will be consistent.
In MySQL, defining a foreign key constraint automatically creates an index, unless it can use an index that already exists. That is, if you create an index and subsequently add a foreign key on the same column(s), MySQL does not create an extra index just for the foreign key.
If you run a query that needs that index, it doesn't matter if you created the index yourself or if the index was created as a side-effect of adding the foreign key. Either way, the index can help the query. The performance benefit is the same.
If you run a query that does not need that index, then there's no benefit to having index either way.
You didn't describe any specific SQL query, so there's no way for us to guess whether the index is needed.
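The difference between "indexed NOT NULL column" and "foreign key" can be sketched as follows. This is only an illustration using Python's built-in sqlite3 module as a stand-in for MySQL (SQLite needs PRAGMA foreign_keys = ON, which MySQL does not); the helper name try_orphan_insert and the index name are assumptions, while the Users/Posts columns follow the question.

```python
import sqlite3


def try_orphan_insert(with_foreign_key):
    """Insert a post whose user_id matches no user; report what the database does."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enforcement is off by default
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

    fk_clause = ", FOREIGN KEY (user_id) REFERENCES users (id)" if with_foreign_key else ""
    conn.execute(
        "CREATE TABLE posts (id INTEGER PRIMARY KEY, contents TEXT, "
        "user_id INTEGER NOT NULL" + fk_clause + ")"
    )
    # The index exists in both variants, so lookups by user_id are equally fast.
    conn.execute("CREATE INDEX idx_posts_user_id ON posts (user_id)")
    conn.execute("INSERT INTO users (id, name, email) VALUES (1, 'ann', 'ann@example.com')")

    try:
        # There is no user with id 999.
        conn.execute("INSERT INTO posts (contents, user_id) VALUES ('hello', 999)")
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"
    finally:
        conn.close()


if __name__ == "__main__":
    print("index only:       ", try_orphan_insert(with_foreign_key=False))
    print("index + constraint:", try_orphan_insert(with_foreign_key=True))
```

With only the index, the orphan row is stored; with the constraint, the insert fails. The index is what helps queries, the constraint is what protects the data.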
I have a table that uses 2 foreign key fields and a date field.
Is it common to have a table use 3 or more fields as a primary key? And are there any disadvantages to doing this?
--
My 3 tables are employees, training, and emp_training. The employees table holds employee data. Training table holds different training courses. And I am designing the emp_training table to be the fields EmployeeID (FK), TrainingID (FK), OnDate.
An employee can do multiple training courses, and can do the same training course multiple times. But they cannot do the same training course more than once on the same day.
Which is better to implement:
Option A - Make all 3 fields a primary key
Option B - Add an autonumber PK field, and use a query to find any potential duplicates.
I've created many tables before using 2 fields as a primary key, but never 3, so I'm curious if there is any disadvantage to proceeding with option A.
It's worth mentioning that in SQL Server the PK is by default the one and only clustered index, but you are allowed to create a non-clustered PK as well.
You may define a new clustered index which is not the PK. "Primary Key" is just a name actually...
The most important question is: which columns participate in the clustered key and (this matters most) do they have an implicit sort order? Also very important: are there many update operations that change the content of the participating columns?
You must be aware that a clustered key defines the physical order of the rows on disk. In other words, the clustered key is the table itself; you can think of it as an index with all columns included. If your leading column is a GUID (worst case), inserts will not arrive in key order. This leads to severe fragmentation (up to 99.99%).
If the clustered index follows the time of insert or a running number (best case), new rows are appended in key order and inserts cause practically no fragmentation.
What makes things worse: If there is a clustered key (whether it's called PK or not), it will be used as lookup key for other indexes.
So in many cases it is best to use a running number as the clustered key plus a non-clustered multi-column index, which is much faster to rebuild than if it were the clustered one.
All indexes will profit from this!
My advice for you:
Option C: a running number as PK and additionally a unique multi-column-key to ensure data integrity. No need to use own logic here...
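A rough sketch of Option C, using Python's built-in sqlite3 module as a stand-in (SQL Server and MySQL syntax for the auto-number column differs, e.g. IDENTITY or AUTO_INCREMENT). The column names employee_id, training_id and on_date and the helper record_training are assumptions based on the question's EmployeeID, TrainingID and OnDate.

```python
import sqlite3


def build_emp_training(conn):
    """Surrogate running-number PK plus a unique key over the three natural columns."""
    conn.execute(
        """
        CREATE TABLE emp_training (
            id          INTEGER PRIMARY KEY AUTOINCREMENT,
            employee_id INTEGER NOT NULL,
            training_id INTEGER NOT NULL,
            on_date     TEXT    NOT NULL,
            UNIQUE (employee_id, training_id, on_date)
        )
        """
    )


def record_training(conn, employee_id, training_id, on_date):
    """Return True if the row was stored, False if the unique key rejected it."""
    try:
        conn.execute(
            "INSERT INTO emp_training (employee_id, training_id, on_date) VALUES (?, ?, ?)",
            (employee_id, training_id, on_date),
        )
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_emp_training(conn)
    print(record_training(conn, 1, 10, "2024-01-05"))  # first attempt
    print(record_training(conn, 1, 10, "2024-01-05"))  # same course, same day
    print(record_training(conn, 1, 10, "2024-02-01"))  # same course, another day
```

The database itself refuses the duplicate, so no separate "find potential duplicates" query (as in Option B) is needed.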
Yes, you can have a poor strategy of choosing too many columns for your composite Primary Key (PK) if a better strategy could be employed for uniqueness via secondary indexes.
Remember that the PK is special. There is only 1 physical / clustered ordering of your data. Changes to the data via Inserts and Updates (and the incumbent shuffling) have overhead there that would not exist if uniqueness were maintained in a secondary index.
So the following can have not-so-insignificant differences:
1. A primary key with 5 composite columns
vs.
2. A primary key with 1 or 2 columns, plus secondary indexes that maintain uniqueness, if thought through well
The former mandates movement of data between data pages to maintain the clustered index (the PK). Which might suggest why so often one sees:
(
id int auto_increment primary key,
...
)
in table designs.
Performance with Index Width:
The PK with 1 or 2 columns (the second case above) is narrow; the 5-column composite PK (the first case) can be quite wide. Wider keys propagating to child relationships will slow performance and concurrency.
Cases of FK compositions:
Special cases of compositions of foreign keys simply cannot be achieved without the use of a single column index, preferably the PK, as seen in this recent Answer of mine.
I don't think there is any problem with creating a table with a composite PK; such tables are often needed in larger databases. There is no real problem with a table whose two FKs, together with the OnDate field, form the PK. Both ways are valid.
Good luck!
If you define a primary key on more than one column, it is a composite primary key. For example,
CREATE TABLE employee (
    training VARCHAR(10),
    emp_training VARCHAR(20),
    OnDate INTEGER,
    PRIMARY KEY (training, emp_training, OnDate)
);
The combination of training, emp_training and OnDate must then be unique across rows, and none of these columns can be NULL.
As already stated you can have a single primary key which consists of multiple columns.
If the question was how to make the columns primary keys separately, that's not possible. However, you can create 1 primary key and add two unique keys.
How does MySQL create indexes for a partitioned table? For example, if I create 5 HASH partitions by ID, does it:
Create 1 global index over all data, used by all 5 partitions
Create 5 partitioned indexes, each holding the subset of data in its partition
Create 5 indexes, each holding all data, one per partition
Thanks
There is no "global" index for a partitioned table in MySQL.
The only indexes you can put on a partitioned table end up being separate indexes on each partition. Each partition is effectively an independent table.
HASH partitioning is virtually useless; do you have a particular use for which you think it might be beneficial?
Addenda...
The size of the index is similar to that of a table.
Since there are no "global" indexes, you cannot have a UNIQUE key unless it includes the column(s) of the "partition key". Nor can you use FOREIGN KEYs.
There is no type of index that spans more than one table.
Partitioned tables in MySQL support only local indexes.
What does that mean? Every partition of the table stores its own B-tree for each index. This can slow down searches if the partition key is not part of the index. Also, for a unique key constraint, you need to include the partition key in the unique key.
Unlike MySQL, Oracle also has the concept of global indexes. Global indexes are very hard to manage.
I am not sure how helpful MySQL partitions are without global indexes.
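What "local indexes only" implies can be illustrated with a toy model. This is plain Python, not MySQL: the class HashPartitionedTable, the email column and the modulo placement are all assumptions made for the sketch, and real partitions hold B-trees rather than dicts.

```python
class HashPartitionedTable:
    """Toy model of a table HASH-partitioned by id, with one local email index per partition."""

    def __init__(self, partitions=5):
        self.rows = [dict() for _ in range(partitions)]         # per partition: id -> row
        self.email_index = [dict() for _ in range(partitions)]  # per partition: email -> set of ids

    def _partition_for(self, row_id):
        # Simple modulo placement, chosen here only for illustration.
        return row_id % len(self.rows)

    def insert(self, row_id, email):
        p = self._partition_for(row_id)
        self.rows[p][row_id] = {"id": row_id, "email": email}
        self.email_index[p].setdefault(email, set()).add(row_id)

    def find_by_id(self, row_id):
        """Partition key known: exactly one partition is probed."""
        p = self._partition_for(row_id)
        return self.rows[p].get(row_id), 1

    def find_by_email(self, email):
        """Partition key unknown: every partition's local index must be probed."""
        hits, probed = [], 0
        for p, index in enumerate(self.email_index):
            probed += 1
            hits.extend(self.rows[p][row_id] for row_id in index.get(email, ()))
        return sorted(hits, key=lambda row: row["id"]), probed


if __name__ == "__main__":
    table = HashPartitionedTable(partitions=5)
    table.insert(1, "a@example.com")
    table.insert(7, "a@example.com")  # same email, different partition
    print(table.find_by_id(7))
    print(table.find_by_email("a@example.com"))
```

The two rows with the same email land in different partitions, and neither local index can see the other. That is the reason a unique key has to include the partition key: uniqueness can only be checked inside one partition.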
MySQL warning: primary key and index on the same field.
Theory books use these two terms to explain indices, but in practice, when I try to create an index on a field that is also the primary key, MySQL generates a warning, although the index is created.
Could anyone explain?
A primary key already implies an index on the set of columns that make up the key, therefore a second (separate) index is redundant:
The primary key for a table represents the column or set of columns
that you use in your most vital queries. It has an associated index,
for fast query performance.
So by creating an explicit index you gain nothing; on the contrary, you saddle the database with the responsibility of maintaining two separate indexes.
MySQL automatically places an index on primary key fields. Adding your own index for that field is therefore unnecessary.
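The redundancy is easy to observe. The sketch below uses Python's built-in sqlite3 module as a stand-in, so it shows SQLite's behaviour, not MySQL's (SQLite gives no warning, and in MySQL you would look at SHOW INDEX instead); the table codes and the index idx_codes_code are invented for the example.

```python
import sqlite3


def index_names(conn, table):
    """List the names of all indexes SQLite keeps for a table."""
    return sorted(row[1] for row in conn.execute(f"PRAGMA index_list({table})"))


def indexes_after_redundant_create():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE codes (code TEXT PRIMARY KEY, label TEXT)")

    # The primary key alone already produced an index.
    before = index_names(conn, "codes")

    # An explicit index on the same column adds a second structure to maintain.
    conn.execute("CREATE INDEX idx_codes_code ON codes (code)")
    after = index_names(conn, "codes")

    conn.close()
    return before, after


if __name__ == "__main__":
    before, after = indexes_after_redundant_create()
    print("after CREATE TABLE:", before)
    print("after CREATE INDEX:", after)
```

Before the explicit CREATE INDEX there is already one automatically created index for the primary key; afterwards there are two indexes covering the same column, and every write has to update both.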
I want to create a database with 3 tables. One for posts and one for tags and one that links posts to tags with the post_id and tag_id functioning as foreign key references.
Can you explain what an Index would be in this scenario and how it differs from a Foreign Key and how that impacts my database design?
An index on a table is a data structure that makes random access to the rows fast and efficient. It helps to optimize the internal organization of a table as well.
A foreign key is simply a pointer to a corresponding column in another table that forms a referential constraint between the two tables.
An index is added as a fast look up for data in the table.
An index can have constraints, in that the column or columns that are used to make the index might have to be unique (unique: only one row in the database is returned for that index, or non-unique: multiple rows can be returned). The primary key for the table is a unique index, and usually only has one column.
A foreign key is a value in a table that references a unique index in another table. It is used as a way to relate two tables together. For example, a child table can look up the one parent row via its column that is a unique index in the parent table.
You'll have foreign keys in the third table. Indexes are not necessary, you need them if you have lots of data where you want to find something by Id quickly. Maybe you'll want an index on posts primary key, but DBMS will probably create it automatically.
Index is a redundant data structure which speeds up some queries.
Foreign key, for practical matters, is a way to make sure that you have no invalid pointers between the rows in your tables (in your case, from the relationship table to posts and tags)
Question: Can you explain what an Index would be in this scenario and how it differs from a Foreign Key and how that impacts my database design?
Your foreign keys in this case are the two columns in your Posts_Tags table. Each foreign key column must contain a value from the main table it references; in this case, the Posts and Tags tables.
Posts_Tags->PostID must be a value contained in Posts->PostID
Posts_Tags->TagID must be a value contained in Tags->TagID
Think of an index as a column that has been given increased speed and efficiency for querying/searching values from it, at the cost of increased size of your database. Generally, primary keys are indexed, as are other columns that are queried or searched often on your website; in your case, probably the name of a post (Posts->PostName).
In your case, indexes will have little impact on your design (they are nice to have for speed and efficiency), but your foreign keys are very important to avoid data corruption (having values in them that don't match a post and/or tag).
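Put together, the three tables could look roughly like this. It is a sketch using Python's built-in sqlite3 module as a stand-in for MySQL; the lower-case names (posts_tags, post_name, tag_name), the two extra indexes and the helper posts_with_tag are assumptions chosen to mirror Posts->PostID, Tags->TagID and Posts->PostName above.

```python
import sqlite3


def build_blog_schema(conn):
    """Posts, tags, and a link table whose two columns are foreign keys."""
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enforcement is off by default
    conn.executescript(
        """
        CREATE TABLE posts (post_id INTEGER PRIMARY KEY, post_name TEXT NOT NULL);
        CREATE TABLE tags  (tag_id  INTEGER PRIMARY KEY, tag_name  TEXT NOT NULL);
        CREATE TABLE posts_tags (
            post_id INTEGER NOT NULL REFERENCES posts (post_id),
            tag_id  INTEGER NOT NULL REFERENCES tags (tag_id),
            PRIMARY KEY (post_id, tag_id)  -- each pairing stored at most once
        );
        -- Optional indexes: they only affect speed, not which rows are valid.
        CREATE INDEX idx_posts_tags_tag ON posts_tags (tag_id);
        CREATE INDEX idx_posts_name ON posts (post_name);
        """
    )


def posts_with_tag(conn, tag_name):
    """Return the names of all posts carrying the given tag, in alphabetical order."""
    rows = conn.execute(
        "SELECT p.post_name FROM posts p "
        "JOIN posts_tags pt ON pt.post_id = p.post_id "
        "JOIN tags t ON t.tag_id = pt.tag_id "
        "WHERE t.tag_name = ? ORDER BY p.post_name",
        (tag_name,),
    ).fetchall()
    return [row[0] for row in rows]
```

Dropping the two CREATE INDEX lines would leave every query result unchanged; dropping the REFERENCES clauses would allow rows in posts_tags that point at posts or tags that do not exist.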
You describe a very common database construct; it's called a "many-to-many relation".
Indexes shouldn't impact this schema at all. In fact, indexes shouldn't impact any schema. Indexes are a trade-off between space and time: indexes specify that you're willing to use extra storage space, in exchange for faster searches through the database.
Wikipedia has an excellent article about what database indexes are: Index (database)
To use foreign keys in MySQL, both sides of the reference need an index. For example, if you want the field a_id on table b to reference the id field on table a, a.id must already be indexed (a primary key counts) and b.a_id needs an index too; current MySQL versions create the index on b.a_id automatically if it does not exist.
Update: here you can read more about it: http://dev.mysql.com/doc/refman/5.1/en/innodb-foreign-key-constraints.html