MySQL 2 columns index - mysql

I have a unique index on 2 columns: name (varchar) and add_date (date).
If I add non-unique indexes on name and on add_date, will that increase SELECT speed or not?
Taken separately, name and add_date may not be unique.
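Update: SHOW COLUMNS in the MySQL console says:
+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| name  | varchar(10) | NO   | MUL | NULL    |       |
| time  | date        | NO   | MUL | NULL    |       |
+-------+-------------+------+-----+---------+-------+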

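It will speed things up if you are selecting on each column separately, but make sure you actually need it: every extra index takes space and slows down writes.
Note that if the unique index is defined as (name, add_date), MySQL can already use it for lookups on name alone, so only the separate index on add_date is likely to help.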
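A minimal sketch of how this could be checked. The table name `items` and the index names are hypothetical, and the sketch assumes the unique index is declared as (name, add_date); compare the `key` column of the two EXPLAIN outputs before and after adding the index.

```sql
-- Hypothetical table; only the two columns from the question are shown.
CREATE TABLE items (
  name     VARCHAR(10) NOT NULL,
  add_date DATE        NOT NULL,
  UNIQUE KEY uq_name_date (name, add_date)  -- assumed column order
);

-- A filter on name alone can use the leftmost prefix of uq_name_date.
EXPLAIN SELECT * FROM items WHERE name = 'abc';

-- A filter on add_date alone cannot use that prefix,
-- so a separate index is the one worth testing.
ALTER TABLE items ADD INDEX idx_add_date (add_date);
EXPLAIN SELECT * FROM items WHERE add_date = '2020-01-01';
```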

Related

Does the foreign key slow down the join query?

I have two databases test & test2. Both have the same tables(employees & salaries) and both have the same records. test2 database uses a foreign key and test database doesn't.
test structure
test.employees
+--------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------+-------------+------+-----+---------+-------+
| emp_id | int(11) | NO | PRI | NULL | |
| name | varchar(30) | YES | | NULL | |
+--------+-------------+------+-----+---------+-------+
test.salaries
+--------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------+---------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| salary | int(11) | YES | | NULL | |
| emp_id | int(11) | NO | | NULL | |
+--------+---------+------+-----+---------+----------------+
test2 structure
test2.employees
+--------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------+-------------+------+-----+---------+-------+
| emp_id | int(11) | NO | PRI | NULL | |
| name | varchar(30) | YES | | NULL | |
+--------+-------------+------+-----+---------+-------+
test2.salaries
+--------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------+---------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| salary | int(11) | YES | | NULL | |
| emp_id | int(11) | NO | MUL | NULL | |
+--------+---------+------+-----+---------+----------------+
I run the same join query on both databases:
select * from employees inner join salaries on employees.emp_id=salaries.emp_id;
This is the output I get from the test database, which doesn't contain a foreign key:
2844047 rows in set (3.25 sec)
This is the output I get from the test2 database, which contains a foreign key:
2844047 rows in set (17.21 sec)
So does the foreign key slow down the join query?
Your empirical evidence suggests that in at least one case it does. So, if we believe your numbers, the answer is clearly "yes" -- and I assume you have ruled out other potential causes such as locks on the table or resource competition (actually the difference is pretty big). I presume that you want to know why.
In most databases, declaring a foreign key is about relational integrity. It would have no effect on the optimization of queries. The join conditions in the query would redundantly cover the same information.
However, MySQL does a bit more when a foreign key is declared. A foreign key declaration automatically creates an index on the columns being used. This is not standard behavior -- I'm not even sure if any other database does this.
Normally, an index would benefit performance. In this case, the extra index gives the optimizer more choices on how to approach the query, and for whatever reason it is picking a suboptimal execution plan.
You should be able to look at the explain plans and see a difference. The issue is that the optimizer has chosen the wrong plan. I would say that this is uncommon and should not dissuade you from using proper foreign key declarations in your databases.
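A rough sketch of that comparison, using the database and table names from the question. The index name `emp_id` in the last statement is an assumption; take the real name from the SHOW INDEX output.

```sql
-- See which index the foreign key declaration created on test2.salaries.
SHOW INDEX FROM test2.salaries;

-- Compare the plans chosen for the two databases.
EXPLAIN SELECT *
FROM test.employees e
INNER JOIN test.salaries s ON e.emp_id = s.emp_id;

EXPLAIN SELECT *
FROM test2.employees e
INNER JOIN test2.salaries s ON e.emp_id = s.emp_id;

-- Experiment: ask the optimizer to skip the foreign key index.
-- `emp_id` is an assumed index name, not confirmed by the question.
EXPLAIN SELECT *
FROM test2.employees e
INNER JOIN test2.salaries s IGNORE INDEX (emp_id) ON e.emp_id = s.emp_id;
```

If the plan with the hint matches the plan from test, that would support the idea that the extra index is what changes the optimizer's choice.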

Simple heavily-indexed table slow query in MySQL

I am having trouble with a particular query being slow. Everything is heavily indexed, and some similar queries work fine and use the indexes, but this query is still extremely slow. I cannot understand why, so maybe somebody can help.
Just for the prerequisites: the write speed of the underlying table does not matter. The table contains ~3.5 million entries, but I guess MySQL should handle that just fine.
The slow query takes about 2s:
SELECT DISTINCT t.`tag_3` FROM `image_tags` t
WHERE t.`type` = 1 AND t.`category` LIKE "00%" AND tag_1 = "0"
-- DESCRIBE OUTPUT
-- The used index thirdtag is just an index defined as (type, category, tag_1, tag_3)
-- The actual result is 201 rows
+----+-------------+-------+-------+-----------------+----------+---------+------+---------+-------------------------------------------+
| id | select_type | table | type  | possible_keys   | key      | key_len | ref  | rows    | Extra                                     |
+----+-------------+-------+-------+-----------------+----------+---------+------+---------+-------------------------------------------+
| 1  | SIMPLE      | t     | range | [... A LOT ...] | thirdtag | 31      | NULL | 1652861 | Using where; Using index; Using temporary |
+----+-------------+-------+-------+-----------------+----------+---------+------+---------+-------------------------------------------+
The only thing that stands out is the enormous number of rows involved. If you compare with the 2 fast queries I attached at the end of this question, it is literally the only difference (at least from the first one). So most probably that's the problem. But that's how the data is given to me, so I need to work with it. I thought that if the columns were covered by the index, MySQL could handle the data just fine.
Does anybody have a suggestion for how to optimize the query? Could I use different indexes that suit the query better?
For comparison, these 2 similar queries are blazing fast:
-- just a longer category string, resulting in fewer results
SELECT DISTINCT t.`tag_3` FROM `image_tags` t
WHERE t.`type` = 1 AND t.`category` LIKE "0000%" AND tag_1 = "0"
-- an additional WHERE condition
SELECT DISTINCT t.`tag_3` FROM `image_tags` t
WHERE t.`type` = 1 AND t.`category` LIKE "00%" AND tag_1 = "0" and tag_2 = ""
The table (it has a lot of indexes probably too long to paste).
+----------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| image | char(8) | NO | MUL | NULL | |
| category | varchar(6) | YES | MUL | NULL | |
| type | tinyint(1) | NO | MUL | NULL | |
| tag_1 | char(3) | NO | MUL | NULL | |
| tag_2 | char(3) | NO | MUL | NULL | |
| tag_3 | char(3) | NO | MUL | NULL | |
| tag_4 | char(3) | NO | MUL | NULL | |
| tag_5 | char(3) | NO | MUL | NULL | |
| tag_6 | char(3) | NO | MUL | NULL | |
+----------+------------------+------+-----+---------+----------------+
Please provide SHOW CREATE TABLE; it is more descriptive than DESCRIBE. In particular, I cannot see what indexes you have.
As my index cookbook explains, start the index with any fields that are compared with '=', then you get one chance to add a 'range' comparison. Your category condition is a range, so
WHERE t.`type` = 1 AND t.`category` LIKE "00%" AND tag_1 = "0"
does not get past category in
INDEX(type, category, tag_1, tag_3)
For your 3 queries, these are the best indexes:
INDEX(type, tag_1, category)
INDEX(type, tag_1, category)
INDEX(type, tag_1, tag_2, category)
category should be last; the other columns can be in any order. Perhaps one of your existing indexes already handled the 3rd case?
"it has a lot of indexes probably too long to paste"
Probably most of them are unused. Keep in mind that INDEX(a) is unnecessary if you also have INDEX(a,b).
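A sketch of how the first suggestion might be applied and checked. The index name `type_tag1_category` is hypothetical; the table, columns and query come from the question.

```sql
-- Equality columns first (type, tag_1), the range column (category) last.
ALTER TABLE image_tags
  ADD INDEX type_tag1_category (type, tag_1, category);

-- Re-run the plan for the slow query and compare the `key` and `rows`
-- columns with the original output.
EXPLAIN
SELECT DISTINCT t.tag_3
FROM image_tags t
WHERE t.type = 1
  AND t.category LIKE '00%'
  AND t.tag_1 = '0';

-- List all existing indexes to spot redundant ones,
-- e.g. INDEX(a) next to INDEX(a, b).
SHOW INDEX FROM image_tags;
```

Appending tag_3 as a final column, i.e. (type, tag_1, category, tag_3), might let the query be answered from the index alone; that variant is an assumption to test with EXPLAIN, not part of the original advice.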

Database schema: Key/Value table or all keys in one record

I guess that this is somewhat of a philosophical question. I need to collect pathology results for a group of patients and store them in a database. In the past I have used a very simple table structure (simplified):
+-------------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+--------------+------+-----+---------+-------+
| ID | bigint(20) | NO | PRI | NULL | |
| Updated | datetime | NO | PRI | NULL | |
| PatientId | varchar(255) | NO | | NULL | |
| Name | varchar(255) | NO | | NULL | |
| Value | varchar(255) | NO | | NULL | |
+-------------------+--------------+------+-----+---------+-------+
More often in schema design I see:
+-------------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+--------------+------+-----+---------+-------+
| ID | bigint(20) | NO | PRI | NULL | |
| PatientId | varchar(255) | NO | | NULL | |
| Ph_Value | varchar(255) | NO | | NULL | |
| K_Value | varchar(255) | NO | | NULL | |
| Ca_Value | varchar(255) | NO | | NULL | |
| Ph_Value_updated | datetime | NO | | NULL | |
| K_Value_updated | datetime | NO | | NULL | |
| Ca_Value_updated | datetime | NO | | NULL | |
+-------------------+--------------+------+-----+---------+-------+
It seems to me that the first design is much more flexible, expandable etc. However, I do wonder about performance hits when the records run to the millions.
The issue with the second is that there may be a couple of hundred fields that occasionally need to be recorded.
I would be really interested to get comments / advice / guidance on this.
You are absolutely right, the first schema is a lot more flexible: you can add new keys on a live database without changing the schema. However, flexibility is usually bought with time and/or space. In this case, it's both: you need more space to store all keys for the same patient because the identifying columns are repeated N times, and the joins or orderings required to bring the fields together take time.
There is no reason to pay for flexibility unless you need it. If most of your queries need most of the columns, the second design is the most economical. However, if most of your queries ask for a single column, the flexibility may be worth the CPU time and the database space.
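To make the cost of "getting the fields together" concrete, here is a sketch of a query that turns the key/value rows of the first design back into one row per patient. The table name `pathology_results` is hypothetical; the column names and the keys Ph, K and Ca are taken from the two schemas in the question.

```sql
-- One output row per patient, one column per key.
-- Caveat: MAX() here just picks one value per key (the largest string);
-- it ignores Updated, so it is not "the latest result" when a key
-- has several rows for the same patient.
SELECT PatientId,
       MAX(CASE WHEN Name = 'Ph' THEN Value END) AS Ph_Value,
       MAX(CASE WHEN Name = 'K'  THEN Value END) AS K_Value,
       MAX(CASE WHEN Name = 'Ca' THEN Value END) AS Ca_Value
FROM pathology_results
GROUP BY PatientId;
```

With the second design the same result is a plain SELECT of three columns, which is the trade-off described above.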
In my opinion, if the set of name/value pairs won't change much, the second option is much better in terms of space and number of rows.
You can also optimize the first schema by putting the names in a separate table and storing a name_id instead of repeating the same name many times.
Another possible schema is a patient table plus one table per value, each containing patient_id and value, where the table name is the name of that value.
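A minimal sketch of the name_id variant. The table names `result_names` and `results`, the name_id type and the VARCHAR(64) length are all assumptions made for illustration; the remaining columns follow the first schema in the question.

```sql
-- Lookup table: each test name is stored once.
CREATE TABLE result_names (
  name_id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name    VARCHAR(64) NOT NULL UNIQUE   -- length is an assumption
);

-- Key/value table referencing the lookup table instead of repeating the name.
CREATE TABLE results (
  ID        BIGINT       NOT NULL,
  Updated   DATETIME     NOT NULL,
  PatientId VARCHAR(255) NOT NULL,
  name_id   SMALLINT UNSIGNED NOT NULL,
  Value     VARCHAR(255) NOT NULL,
  PRIMARY KEY (ID, Updated),
  FOREIGN KEY (name_id) REFERENCES result_names (name_id)
);

-- Reading values back needs one extra join.
SELECT r.PatientId, n.name, r.Value, r.Updated
FROM results r
JOIN result_names n ON n.name_id = r.name_id
WHERE n.name = 'Ph';
```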

MySQL Alter table add column that is unique together with another column

I have such a table:
+---------+--------------+------+-----+-------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+---------+--------------+------+-----+-------------------+-----------------------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| url | varchar(255) | YES | UNI | NULL | |
| ts | timestamp | NO | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| content | longblob | YES | | NULL | |
| source | varchar(255) | YES | | NULL | |
| state | int(11) | NO | | 0 | |
+---------+--------------+------+-----+-------------------+-----------------------------+
I'd like id to stay the only PRIMARY KEY, and I'd like to add a field "VERSION" that is unique in combination with url.
What I want is for the pair (url, version) to be unique together, but not each column separately. How can I do that? Should I just add the version field, alter url so it's no longer unique, and then add the constraint?
Thanks in advance!
If what you're looking for is to store multiple versions of the same URL together in the table, then yes, what you need to do is:
1. Drop the unique constraint on url.
2. Add a non-unique column version (assumed to be an integer here).
3. Create a unique constraint or unique index on (url, version). In MySQL a unique constraint is implemented as a unique index, so either form gives you the same result.
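The three steps could look roughly like this. The table name `pages`, the index name `url_version` and the default value are hypothetical, and the sketch assumes the existing unique index is named `url` (check with SHOW INDEX first).

```sql
-- Find the actual name of the unique index on url.
SHOW INDEX FROM pages;

ALTER TABLE pages
  DROP INDEX url,                                -- step 1: assumed index name
  ADD COLUMN version INT NOT NULL DEFAULT 1,     -- step 2: integer version
  ADD UNIQUE INDEX url_version (url, version);   -- step 3: unique as a pair
```

Existing rows all receive version 1 here, so they remain unique as (url, version) pairs because url was unique before the change.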

In mysql can I have a composite primary key composed of an auto increment and another field? Also, please critique my "mysql partitioning" logic

I am experimenting with MySQL partitioning (splitting the table up to help it scale better), and I am having a problem with the keys on the table. First, I am using a Python threaded comments module; here is the schema:
+-----------------+------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+------------------+------+-----+---------+-------+
| content_type_id | int(11) | NO | MUL | NULL | |
| object_id | int(10) unsigned | NO | | NULL | |
| parent_id | int(11) | YES | MUL | NULL | |
| user_id | int(11) | NO | MUL | NULL | |
| date_submitted | datetime | NO | | NULL | |
| date_modified | datetime | NO | | NULL | |
| date_approved | datetime | YES | | NULL | |
| comment | longtext | NO | | NULL | |
| markup | int(11) | YES | | NULL | |
| is_public | tinyint(1) | NO | | NULL | |
| is_approved | tinyint(1) | NO | | NULL | |
| ip_address | char(15) | YES | | NULL | |
| id | int(11) | YES | | NULL | |
+-----------------+------------------+------+-----+---------+-------+
Note: I have modified this table by dropping the id column (the primary key by default) and re-adding it.
Essentially, I want to have id AND content_type_id together as my primary key. I also want id to auto-increment. Is this possible?
Second question. Since I am just learning about mysql partitioning, I am wondering if my partitioning logic is sound. There are 67 different content_types, and some (maybe all) of those content types allow comments to be made on them. My plan is to partition based on the type of object that is being commented on. For instance, the images will be commented on a lot, so I put any content type pertaining to images into one partition, and another content type that can be commented on is "blog entries", so there is a separate partition for that, and so on and so on. This will allow me to spread these partitions possibly to dedicated machines as the load grows. How is my understanding of this concept so far?
Thanks so much!
Since id will be auto-incremented, it can be the primary key all by itself. Adding content_type_id to the primary key would not gain you anything with regard to the uniqueness of the key.
If you want an index on the 2 columns for faster lookups, add a secondary index on them instead of trying to put both into the primary key. Be aware that enforcing uniqueness on the 2 columns would be a waste, since id is already guaranteed to be unique by itself, so a regular (non-unique) index makes more sense if one is needed.
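One point worth checking against the answer above: MySQL requires every unique key of a partitioned table, including the primary key, to contain all columns used in the partitioning expression. So for uniqueness alone id is enough, but to partition by content_type_id the composite key is needed. A minimal sketch, assuming InnoDB; the table name `comments_demo`, the reduced column list, the partition names and the content type values are all hypothetical.

```sql
CREATE TABLE comments_demo (
  id              INT          NOT NULL AUTO_INCREMENT,
  content_type_id INT          NOT NULL,
  object_id       INT UNSIGNED NOT NULL,
  comment         LONGTEXT     NOT NULL,
  -- id comes first so AUTO_INCREMENT works; content_type_id is included
  -- because the partitioning expression below uses it.
  PRIMARY KEY (id, content_type_id)
)
PARTITION BY LIST (content_type_id) (
  PARTITION p_images VALUES IN (1, 2, 3),   -- hypothetical image-related types
  PARTITION p_blog   VALUES IN (4),         -- hypothetical blog entry type
  PARTITION p_other  VALUES IN (5, 6, 7)    -- every allowed value must be listed
);
```

All partitions of such a table live on the same MySQL server; spreading data across dedicated machines is a separate technique (sharding) that partitioning alone does not provide.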