I have a table where id is primary key.
CREATE TABLE t1 (
id INT NOT NULL AUTO_INCREMENT,
col1 VARCHAR(45) NULL,
PRIMARY KEY (id));
I have another table, t2, which is joined to t1 as follows:
t2 LEFT JOIN t1 ON CONCAT("USER_", t1.id) = t2.user_id
I want to create an index which has CONCAT("USER_", t1.id) values indexed in any order.
I tried
ALTER TABLE t1 ADD INDEX ((CONCAT('user_',id) DESC);
but it gives an error.
I followed the official MySQL documentation.
Note: I do not want to create a new CONCAT("user_", id) column.
https://dev.mysql.com/doc/refman/8.0/en/create-index.html#create-index-column-prefixes
InnoDB supports secondary indexes on virtual generated columns.
https://dev.mysql.com/doc/refman/5.7/en/create-table-secondary-indexes.html
In MySQL 5.7 and later you can add a generated column and then index that column.
Here is an example that extracts the integer from the string so the join can use an index:
CREATE TABLE myusers (
id mediumint(8) unsigned NOT NULL auto_increment
, name varchar(255) default NULL,
PRIMARY KEY (`id`)
) AUTO_INCREMENT=1
;
INSERT INTO myusers (`name`) VALUES ('Imelda'),('Hamish'),('Brandon'),('Amity'),('Jillian'),('Lionel'),('Faith'),('Dai'),('Reed'),('Molly');
CREATE TABLE mytable (
id mediumint(8) unsigned NOT NULL auto_increment
, user_id VARCHAR(20)
, ex_user_id integer GENERATED ALWAYS AS (0+substring(user_id,6,20))
, password varchar(255)
, PRIMARY KEY (`id`)
, INDEX idx_ex_user_id (ex_user_id)
) AUTO_INCREMENT=1
;
INSERT INTO mytable (`user_id`,`password`) VALUES
('user_1','PYX68BIC9RD')
,('user_2','LPY07EIN0UA')
,('user_3','UGC24TKI3JL')
,('user_4','YQU18ALB8YA')
,('user_5','DEL56AGR6AD')
,('user_6','YQN87UOB0PO')
,('user_7','CPC15JFU6MC')
,('user_8','MWC40ZWD2EE')
,('user_9','HEB34QQH0UM')
,('user_10','GVP36PLP5PW')
;
select
*
from myusers
inner join mytable on myusers.id = mytable.ex_user_id
;
id | name | id | user_id | ex_user_id | password
-: | :------ | -: | :------ | ---------: | :----------
1 | Imelda | 1 | user_1 | 1 | PYX68BIC9RD
2 | Hamish | 2 | user_2 | 2 | LPY07EIN0UA
3 | Brandon | 3 | user_3 | 3 | UGC24TKI3JL
4 | Amity | 4 | user_4 | 4 | YQU18ALB8YA
5 | Jillian | 5 | user_5 | 5 | DEL56AGR6AD
6 | Lionel | 6 | user_6 | 6 | YQN87UOB0PO
7 | Faith | 7 | user_7 | 7 | CPC15JFU6MC
8 | Dai | 8 | user_8 | 8 | MWC40ZWD2EE
9 | Reed | 9 | user_9 | 9 | HEB34QQH0UM
10 | Molly | 10 | user_10 | 10 | GVP36PLP5PW
explain select
*
from myusers
inner join mytable on myusers.id = mytable.ex_user_id
;
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra
-: | :---------- | :------ | :--------- | :--- | :------------- | :------------- | :------ | :------------------------------------- | ---: | -------: | :----------
1 | SIMPLE | myusers | null | ALL | PRIMARY | null | null | null | 10 | 100.00 | null
1 | SIMPLE | mytable | null | ref | idx_ex_user_id | idx_ex_user_id | 5 | fiddle_HNTHMETRTFAHHKBIGWZM.myusers.id | 1 | 100.00 | Using where
Note that the conversion of user_id from string to integer is "implicit":
To cast a string to a number, you normally need do nothing other than use the string value in numeric context:
https://dev.mysql.com/doc/refman/5.7/en/create-table-secondary-indexes.html
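If the aim is only to let the join use an index, another option is to move the string handling to the t2 side so that t1 is looked up by its primary key. The following is a minimal sketch, assuming every t2.user_id has the form 'USER_' followed by digits (a 5-character prefix). The index name idx_t2_user_num is hypothetical, and the functional index needs MySQL 8.0.13 or later. As far as I know, MySQL rejects generated columns and functional key parts that refer to an AUTO_INCREMENT column, so CONCAT('USER_', t1.id) cannot be indexed directly on t1.

```sql
-- Join on the integer extracted from t2.user_id, so t1 can be read via PRIMARY KEY (id).
-- SUBSTRING(..., 6) skips the 5-character 'USER_' prefix (assumed format).
SELECT t2.*, t1.col1
FROM t2
LEFT JOIN t1
       ON t1.id = CAST(SUBSTRING(t2.user_id, 6) AS UNSIGNED);

-- Optional, MySQL 8.0.13+: functional index on the t2 side (hypothetical name).
-- Only helps queries that filter or join t2 by exactly this expression.
ALTER TABLE t2
  ADD INDEX idx_t2_user_num ((CAST(SUBSTRING(user_id, 6) AS UNSIGNED)));
```

Values of user_id that do not match the assumed format would convert to 0 (or raise an error in strict mode), so it is worth checking the data first.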
Recently I ran into a strange problem. Here is my test table structure:
CREATE TABLE `index_test` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`a` varchar(64) NOT NULL DEFAULT '',
`card_no` bigint(20) NOT NULL,
`card_no2` bigint(20) NOT NULL,
`optype` int(11) NOT NULL,
`optype2` int(11) NOT NULL,
`create_time` datetime NOT NULL DEFAULT '2000-01-01 00:00:00',
`_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `idx_a` (`a`),
KEY `idx_card_no` (`card_no`),
KEY `idx_card_no2` (`card_no2`),
KEY `idx_optype` (`optype`),
KEY `idx_optype2` (`optype2`)
) ENGINE=InnoDB AUTO_INCREMENT=10000 DEFAULT CHARSET=utf8;
There are 5 main columns: a is varchar, card_no and card_no2 are bigint, and optype and optype2 are int.
In my experience, the MySQL optimizer prefers indexes on columns with high cardinality, small data types, and no NULLs, but when I ran EXPLAIN on some queries a few things surprised me. Here is the procedure I used to load the test data:
DELIMITER ;;
CREATE DEFINER=`xx`@`%` PROCEDURE `simple_insert`( )
BEGIN
DECLARE counter BIGINT DEFAULT 0;
my_loop: LOOP
SET counter=counter+1;
IF counter=10000 THEN
LEAVE my_loop;
END IF;
INSERT INTO `index_test` (`a`,`card_no`,`card_no2`,`optype`,`optype2`, `create_time`) VALUES (replace(uuid(), '-', ''),counter,counter%180, counter,counter%180,current_timestamp);
END LOOP my_loop;
END;;
DELIMITER ;
This inserts roughly 10,000 rows. First, I ran the statistics query:
select * from information_schema.statistics where table_schema = 'test' and table_name = 'index_test';
output
+---------------+--------------+------------+------------+--------------+--------------+--------------+-------------+-----------+-------------+----------+--------+----------+------------+---------+---------------+
| TABLE_CATALOG | TABLE_SCHEMA | TABLE_NAME | NON_UNIQUE | INDEX_SCHEMA | INDEX_NAME | SEQ_IN_INDEX | COLUMN_NAME | COLLATION | CARDINALITY | SUB_PART | PACKED | NULLABLE | INDEX_TYPE | COMMENT | INDEX_COMMENT |
+---------------+--------------+------------+------------+--------------+--------------+--------------+-------------+-----------+-------------+----------+--------+----------+------------+---------+---------------+
| def | test | index_test | 0 | test | PRIMARY | 1 | id | A | 10089 | NULL | NULL | | BTREE | | |
| def | test | index_test | 1 | test | idx_a | 1 | a | A | 9999 | NULL | NULL | | BTREE | | |
| def | test | index_test | 1 | test | idx_card_no | 1 | card_no | A | 9999 | NULL | NULL | | BTREE | | |
| def | test | index_test | 1 | test | idx_card_no2 | 1 | card_no2 | A | 180 | NULL | NULL | | BTREE | | |
| def | test | index_test | 1 | test | idx_optype | 1 | optype | A | 9999 | NULL | NULL | | BTREE | | |
| def | test | index_test | 1 | test | idx_optype2 | 1 | optype2 | A | 180 | NULL | NULL | | BTREE | | |
+---------------+--------------+------------+------------+--------------+--------------+--------------+-------------+-----------+-------------+----------+--------+----------+------------+---------+---------------+
step 2:
explain select * from index_test where optype=9600 and a= 'e095af180f4911ea8d907036bd142a99';
output:
+----+-------------+------------+------------+------+------------------+-------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+------+------------------+-------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | index_test | NULL | ref | idx_a,idx_optype | idx_a | 194 | const | 1 | 5.00 | Using where |
+----+-------------+------------+------------+------+------------------+-------+---------+-------+------+----------+-------------+
In my experience, a varchar(64) value takes more space than an int, so I expected the index on the int column to be used here.
step3:
explain select * from index_test where optype=9600 and card_no = 9600;
output
+----+-------------+------------+------------+------+------------------------+-------------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+------+------------------------+-------------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | index_test | NULL | ref | idx_card_no,idx_optype | idx_card_no | 8 | const | 1 | 5.00 | Using where |
+----+-------------+------------+------------+------+------------------------+-------------+---------+-------+------+----------+-------------+
So the question is: why does the MySQL query optimizer prefer the bigint column over the int column? Can anyone explain this or point me to official documentation about it? Thanks.
By the way, my test environment is macOS (10.14.6) x64 and the MySQL server version is 5.7.26.
I don't think INT vs BIGINT is the issue. First, let me mention better indexes:
For
where optype=9600 and a= 'e095af180f4911ea8d907036bd142a99'
Either of these "composite" indexes would be optimal, and better than what you have:
INDEX(optype, a)
INDEX(a, optype)
For
where optype=9600
and card_no = 9600
and a= 'e095af180f4911ea8d907036bd142a99'
any index starting with those 3 columns is optimal; any 2 would be "good", and single-column indexes would be poor, but better than no index.
The optimizer may be making probes to see which of your 3 poor indexes is best.
I can't explain why it did not list a as a "Possible key".
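For reference, the composite indexes described above could be added as shown below. The index names are made up for illustration, and re-running EXPLAIN afterwards should show whether the optimizer actually picks them.

```sql
-- Two-column index for: WHERE optype = ... AND a = ...
ALTER TABLE index_test
  ADD INDEX idx_optype_a (optype, a);

-- Three-column index for: WHERE optype = ... AND card_no = ... AND a = ...
ALTER TABLE index_test
  ADD INDEX idx_optype_card_no_a (optype, card_no, a);

-- Check which key is chosen now.
EXPLAIN SELECT * FROM index_test
WHERE optype = 9600 AND a = 'e095af180f4911ea8d907036bd142a99';
```

On the key_len values in the question: 194 for idx_a is 64 characters times 3 bytes (utf8) plus 2 length bytes, and 8 for idx_card_no is the size of a bigint.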
I have the following ugly query, which runs okay but not great, on my local machine (1.4 secs, running v5.7). On the server I'm using, which is running an older version of MySQL (v5.5), the query just hangs. It seems to get caught on "Copying to tmp table":
SELECT
SQL_CALC_FOUND_ROWS
DISTINCT p.parcel_number,
p.street_number,
p.street_name,
p.site_address_city_state,
p.number_of_units,
p.number_of_stories,
p.bedrooms,
p.bathrooms,
p.lot_area_sqft,
p.cost_per_sq_ft,
p.year_built,
p.sales_date,
p.sales_price,
p.id
FROM (
SELECT APN, property_case_detail_id FROM property_inspection AS pi
GROUP BY APN, property_case_detail_id
HAVING
COUNT(IF(status='Resolved Date', 1, NULL)) = 0
) as open_cases
JOIN property AS p
ON p.parcel_number = open_cases.APN
LIMIT 0, 1000;
mysql> show processlist;
+-------+-------------+-----------+--------------+---------+------+----------------------+------------------------------------------------------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-------+-------------+-----------+--------------+---------+------+----------------------+------------------------------------------------------------------------------------------------------+
| 21120 | headsupcity | localhost | lead_housing | Query | 21 | Copying to tmp table | SELECT
SQL_CALC_FOUND_ROWS
DISTINCT p.parcel_number,
p.street_numbe |
| 21121 | headsupcity | localhost | lead_housing | Query | 0 | NULL | show processlist |
+-------+-------------+-----------+--------------+---------+------+----------------------+------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
The EXPLAIN output differs between my local machine and the server. I assume the only reason the query runs at all on my local machine is the key that is automatically created on the derived table:
Explain (local):
+----+-------------+------------+------------+------+---------------+-------------+---------+------------------------------+---------+----------+---------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+------+---------------+-------------+---------+------------------------------+---------+----------+---------------------------------+
| 1 | PRIMARY | p | NULL | ALL | NULL | NULL | NULL | NULL | 40319 | 100.00 | Using temporary |
| 1 | PRIMARY | <derived2> | NULL | ref | <auto_key0> | <auto_key0> | 8 | lead_housing.p.parcel_number | 40 | 100.00 | NULL |
| 2 | DERIVED | pi | NULL | ALL | NULL | NULL | NULL | NULL | 1623978 | 100.00 | Using temporary; Using filesort |
+----+-------------+------------+------------+------+---------------+-------------+---------+------------------------------+---------+----------+---------------------------------+
Explain (server):
+----+-------------+------------+------+---------------+------+---------+------+---------+------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+------+---------------+------+---------+------+---------+------------------------------------------+
| 1 | PRIMARY | p | ALL | NULL | NULL | NULL | NULL | 41369 | Using temporary |
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 122948 | Using where; Distinct; Using join buffer |
| 2 | DERIVED | pi | ALL | NULL | NULL | NULL | NULL | 1718586 | Using temporary; Using filesort |
+----+-------------+------------+------+---------------+------+---------+------+---------+------------------------------------------+
Schemas:
mysql> explain property_inspection;
+-------------------------+--------------+------+-----+-------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------------+--------------+------+-----+-------------------+-----------------------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| lblCaseNo | int(11) | NO | MUL | NULL | |
| APN | bigint(10) | NO | MUL | NULL | |
| date | varchar(50) | NO | | NULL | |
| status | varchar(500) | NO | | NULL | |
| property_case_detail_id | int(11) | YES | MUL | NULL | |
| case_type_id | int(11) | YES | MUL | NULL | |
| date_modified | timestamp | NO | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| update_status | tinyint(1) | YES | | 1 | |
| created_date | datetime | NO | | NULL | |
+-------------------------+--------------+------+-----+-------------------+-----------------------------+
10 rows in set (0.02 sec)
mysql> explain property; (not all columns, but you get the gist)
+----------------------------+--------------+------+-----+-------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+----------------------------+--------------+------+-----+-------------------+-----------------------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| parcel_number | bigint(10) | NO | | 0 | |
| date_modified | timestamp | NO | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| created_date | datetime | NO | | NULL | |
+----------------------------+--------------+------+-----+-------------------+-----------------------------+
Variables that might be relevant:
tmp_table_size: 16777216
innodb_buffer_pool_size: 8589934592
Any ideas on how to optimize this, and any idea why the explains are so different?
Since this is where the two optimizers differ, let's try to optimize the derived table:
FROM (
SELECT APN, property_case_detail_id FROM property_inspection AS pi
GROUP BY APN, property_case_detail_id
HAVING
COUNT(IF(status='Resolved Date', 1, NULL)) = 0
) as open_cases
Give this a try:
SELECT ...
FROM property AS p
WHERE NOT EXISTS ( SELECT 1 FROM property_inspection
WHERE status = 'Resolved Date'
AND p.parcel_number = APN )
ORDER BY ??? -- without this, the `LIMIT` is unpredictable
LIMIT 0, 1000;
or...
SELECT ...
FROM property AS p
LEFT JOIN property_inspection AS pi
ON p.parcel_number = pi.APN
AND pi.status = 'Resolved Date' -- must be in ON; in WHERE it would discard the unmatched rows
WHERE pi.APN IS NULL
ORDER BY ??? -- without this, the `LIMIT` is unpredictable
LIMIT 0, 1000;
Index:
property_inspection: INDEX(status, APN) -- in either order
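A sketch of that index as DDL, written against the columns shown in the schema output above (APN and status on property_inspection). The index name and the prefix length are assumptions: a prefix is used because status is varchar(500), which may exceed the index key length limit on older InnoDB configurations, and 20 characters is assumed to be enough to tell status values apart.

```sql
-- Hypothetical index name; status(20) indexes only the first 20 characters.
ALTER TABLE property_inspection
  ADD INDEX idx_apn_status (APN, status(20));

-- Re-check the plan of the rewritten query afterwards, for example:
EXPLAIN
SELECT p.id
FROM property AS p
WHERE NOT EXISTS ( SELECT 1 FROM property_inspection
                   WHERE status = 'Resolved Date'
                     AND APN = p.parcel_number );
```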
MySQL 5.5 and 5.7 are quite different, and the latter has a better optimizer, so it is no surprise that the EXPLAIN plans differ.
Please post the output of SHOW CREATE TABLE property; and SHOW CREATE TABLE property_inspection; since that will show the indexes on your tables.
Your sub-query is the issue.
- The server tries to process 1.6M rows with no index, grouping everything.
- HAVING is quite an expensive operation, so it is best avoided, especially in sub-queries.
- Grouping is a bad idea in this case. You do not need the aggregation/counting; you only need to check whether a row with the 'Resolved Date' status exists.
Based on the information provided I'd recommend:
- Alter table property_inspection to reduce the length of the status column.
- Add an index on that column. Use a covering index (APN, property_case_detail_id, status) if possible (in this column order).
- Change the query to something like this:
SELECT
SQL_CALC_FOUND_ROWS
DISTINCT p.parcel_number,
...
p.id
FROM
property_inspection AS `pi1`
INNER JOIN property AS p ON (
p.parcel_number = `pi1`.APN
)
LEFT JOIN (
SELECT
`pi2`.property_case_detail_id
, `pi2`.APN
FROM
property_inspection AS `pi2`
WHERE
`status` = 'Resolved Date'
) AS exclude ON (
exclude.APN = `pi1`.APN
AND exclude.property_case_detail_id = `pi1`.property_case_detail_id
)
WHERE
exclude.APN IS NULL
LIMIT
0, 1000;
The following query has been running for 5 hours so far:
INSERT $LINEITEM_PUBLIC SELECT *
FROM LINEITEM
WHERE L_PARTKEY IN ( SELECT P_PARTKEY FROM $PART_PUBLIC )
AND L_SUPPKEY IN ( SELECT S_SUPPKEY FROM $SUPPLIER_PUBLIC )
AND L_ORDERKEY IN ( SELECT O_ORDERKEY FROM $ORDERS_PUBLIC );
I added all required indexes but nothing seems to be helping. The Query Explain Plan prints the following:
+----+-------------+------------------+------------+--------+--------------------------------+-------------+---------+--------------------------------+----------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------------+------------+--------+--------------------------------+-------------+---------+--------------------------------+----------+----------+-------------+
| 1 | INSERT | $LINEITEM_PUBLIC | NULL | ALL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
| 1 | SIMPLE | $ORDERS_PUBLIC | NULL | index | PRIMARY | O_ORDERDATE | 3 | NULL | 12826617 | 100.00 | Using index |
| 1 | SIMPLE | LINEITEM | NULL | ref | PRIMARY,LINEITEM_FK2,L_SUPPKEY | PRIMARY | 4 | TPCH.$ORDERS_PUBLIC.O_ORDERKEY | 3 | 100.00 | NULL |
| 1 | SIMPLE | $SUPPLIER_PUBLIC | NULL | eq_ref | PRIMARY | PRIMARY | 4 | TPCH.LINEITEM.L_SUPPKEY | 1 | 100.00 | Using index |
| 1 | SIMPLE | $PART_PUBLIC | NULL | eq_ref | PRIMARY | PRIMARY | 4 | TPCH.LINEITEM.L_PARTKEY | 1 | 100.00 | Using index |
+----+-------------+------------------+------------+--------+--------------------------------+-------------+---------+--------------------------------+----------+----------+-------------+
Any recommendations on how this query can be optimized?
Update:
The size of the tables in the previous query is as follows:
LINEITEM: 60M records
$ORDERS_PUBLIC: 13M records
$SUPPLIER_PUBLIC: 92K records
$PART_PUBLIC: 2M records
Make sure there is an index starting with O_ORDERKEY.
IN (SELECT ...) may be optimized poorly (depending on version); try this:
INSERT $LINEITEM_PUBLIC
SELECT l.*
FROM LINEITEM AS l
WHERE EXISTS( SELECT * FROM $PART_PUBLIC WHERE P_PARTKEY = L_PARTKEY )
AND EXISTS( SELECT * FROM $SUPPLIER_PUBLIC WHERE S_SUPPKEY = L_SUPPKEY )
AND EXISTS( SELECT * FROM $ORDERS_PUBLIC WHERE O_ORDERKEY = L_ORDERKEY );
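If the three key columns are unique in their tables, plain joins return the same rows and are another form worth timing. The EXPLAIN above shows eq_ref lookups on PRIMARY for the part and supplier tables; that O_ORDERKEY is the primary key of the orders table is my assumption. A sketch under that assumption, keeping the $NAME placeholders exactly as in the question:

```sql
-- Equivalent to the IN / EXISTS forms only when P_PARTKEY, S_SUPPKEY and
-- O_ORDERKEY are unique; otherwise the joins would duplicate LINEITEM rows.
INSERT INTO $LINEITEM_PUBLIC
SELECT l.*
FROM LINEITEM AS l
JOIN $PART_PUBLIC     AS p ON p.P_PARTKEY  = l.L_PARTKEY
JOIN $SUPPLIER_PUBLIC AS s ON s.S_SUPPKEY  = l.L_SUPPKEY
JOIN $ORDERS_PUBLIC   AS o ON o.O_ORDERKEY = l.L_ORDERKEY;
```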
I'm new to SQL. Assuming this is the right way to create a join table between my two main entities, do I have to hardcode the insert data for the join table? How do I query from my join table? I'm using SQL Fiddle, so I'm not sure whether the links to my foreign keys are being created correctly.
CREATE TABLE Organization(
`Organization_id` int NOT NULL,
PRIMARY KEY(`Organization_id`)
);
CREATE TABLE QuestionBank(
`Question_id` int NOT NULL,
`Question_text` VARCHAR(255) NOT NULL,
PRIMARY KEY(`Question_id`)
);
CREATE TABLE OrganizationQuestion(
`OrganizationQuestion_id` int NOT NULL,
`Question_id` int NOT NULL,
`Organization_id` int NOT NULL,
PRIMARY KEY(`OrganizationQuestion_id`),
FOREIGN KEY(`Question_id`) REFERENCES QuestionBank(`Question_id`),
FOREIGN KEY(`Organization_id`) REFERENCES Organization(`Organization_id`)
);
INSERT INTO Organization(`Organization_id`) VALUES(1);
INSERT INTO QuestionBank(`Question_id`, `Question_text`) VALUES(1, 'How did he perform?');
INSERT INTO OrganizationQuestion(`OrganizationQuestion_id`, `Question_id`, `Organization_id`)
VALUES(1, 1, 1);
This is your join:
select oq.OrganizationQuestion_id,
oq.Question_id,
oq.Organization_id,
o.Organization_id,
qb.Question_id,
qb.Question_text
from OrganizationQuestion oq
join Organization o
on o.Organization_id = oq.Organization_id
join QuestionBank qb
on qb.Question_id = oq.Question_id
+-------------------------+-------------+-----------------+-----------------+-------------+---------------------+
| OrganizationQuestion_id | Question_id | Organization_id | Organization_id | Question_id | Question_text |
+-------------------------+-------------+-----------------+-----------------+-------------+---------------------+
| 1 | 1 | 1 | 1 | 1 | How did he perform? |
+-------------------------+-------------+-----------------+-----------------+-------------+---------------------+
Which is not terribly interesting because you made almost everything a 1.
Output with EXPLAIN:
+----+-------------+-------+--------+-----------------------------+---------+---------+---------------------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-----------------------------+---------+---------+---------------------------------+------+-------------+
| 1 | SIMPLE | oq | ALL | Question_id,Organization_id | NULL | NULL | NULL | 1 | NULL |
| 1 | SIMPLE | o | eq_ref | PRIMARY | PRIMARY | 4 | so_gibberish.oq.Organization_id | 1 | Using index |
| 1 | SIMPLE | qb | eq_ref | PRIMARY | PRIMARY | 4 | so_gibberish.oq.Question_id | 1 | NULL |
+----+-------------+-------+--------+-----------------------------+---------+---------+---------------------------------+------+-------------+
Remember that keys (indexes) are often not used in queries on small tables: using the key can take longer than simply scanning the table.
To view indexes on a table:
mysql> show indexes from OrganizationQuestion;
+----------------------+------------+-----------------+--------------+-------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------------------+------------+-----------------+--------------+-------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| organizationquestion | 0 | PRIMARY | 1 | OrganizationQuestion_id | A | 1 | NULL | NULL | | BTREE | | |
| organizationquestion | 1 | Question_id | 1 | Question_id | A | 1 | NULL | NULL | | BTREE | | |
| organizationquestion | 1 | Organization_id | 1 | Organization_id | A | 1 | NULL | NULL | | BTREE | | |
+----------------------+------------+-----------------+--------------+-------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
See the MySQL Manual pages for SHOW INDEX and EXPLAIN
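To see the join table do some work, it helps to add a few more rows. The values below are invented for illustration. Rows in the join table are inserted like any others, one row per (organization, question) pair, and the foreign keys reject pairs that point at missing parents.

```sql
-- Illustrative data only.
INSERT INTO Organization(`Organization_id`) VALUES (2);
INSERT INTO QuestionBank(`Question_id`, `Question_text`)
VALUES (2, 'Was the work delivered on time?');

-- Link question 2 to organization 1, and question 1 to organization 2.
INSERT INTO OrganizationQuestion(`OrganizationQuestion_id`, `Question_id`, `Organization_id`)
VALUES (2, 2, 1), (3, 1, 2);

-- All questions that belong to organization 1.
SELECT qb.Question_id, qb.Question_text
FROM OrganizationQuestion oq
JOIN QuestionBank qb ON qb.Question_id = oq.Question_id
WHERE oq.Organization_id = 1;
```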
Edit 2: a completely different question
To disallow certain content during an INSERT (say, two FK ids in one table being the same, such as a Mail 'sender' and 'recipient'):
See the following Answer for generating a
signal sqlstate '45000';
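A minimal sketch of how such a check might look as a BEFORE INSERT trigger. The table Mail and its columns sender_id and recipient_id are hypothetical names chosen to match the example in the text, and the trigger name is made up as well.

```sql
-- Hypothetical table: Mail(sender_id, recipient_id, ...).
DELIMITER ;;
CREATE TRIGGER mail_before_insert
BEFORE INSERT ON Mail
FOR EACH ROW
BEGIN
    -- Reject rows where both foreign keys point at the same id.
    IF NEW.sender_id = NEW.recipient_id THEN
        SIGNAL SQLSTATE '45000'
            SET MESSAGE_TEXT = 'sender and recipient must differ';
    END IF;
END;;
DELIMITER ;
```

If either column is NULL the comparison is not true, so the row is allowed; an UPDATE would need a matching BEFORE UPDATE trigger.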
I wish to reduce the time it takes to query data in a view.
My tables have the following structure:
The Rings table contains individual rings. Each ring has a unique combination of ID_RingType and Number, and also an ID, which is used as a foreign key elsewhere.
-- RINGS
CREATE TABLE `Rings` (
ID INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
ID_RingType CHAR(2) NOT NULL,
Number MEDIUMINT UNSIGNED NOT NULL,
ID_RingStatus TINYINT DEFAULT 1,
ID_User INT(11),
DateLastChange TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
FOREIGN KEY (ID_RingType) REFERENCES RingType(Code),
FOREIGN KEY (ID_RingStatus) REFERENCES RingStatus(ID),
FOREIGN KEY (ID_User) REFERENCES `848-cso`.`Users`(UID)
);
-- create index on triple ID_User, ID_RingType, Number
CREATE INDEX idx_rings ON `Rings` (ID_User, ID_RingType, Number);
CREATE INDEX idx_rings_overview ON `Rings` (ID_RingType, Number, ID_RingStatus);
CREATE INDEX idx_rings_numbers ON `Rings` (ID_RingStatus, ID_User, ID_RingType, Number);
RingStatus contains only 4 values and their meanings
-- RING STATUS
CREATE TABLE `RingStatus` (
ID TINYINT NOT NULL PRIMARY KEY,
Name VARCHAR(20) UNIQUE COLLATE utf8_czech_ci,
NameEng VARCHAR(20)
);
RingType is identified by a two-letter Code
-- RING TYPE
CREATE TABLE `RingType` (
Code CHAR(2) NOT NULL PRIMARY KEY,
Material VARCHAR(30) COLLATE utf8_czech_ci,
Radius DOUBLE UNSIGNED,
MaxVal MEDIUMINT UNSIGNED NOT NULL
);
Moreover, I use the following function:
/*
Function returns tinyint(1) specifying, whether ring was assigned
*/
CREATE FUNCTION fn_isRingAssigned (idRingStatus TINYINT)
RETURNS TINYINT(1) DETERMINISTIC
RETURN IF(idRingStatus = 1,1,2);
The query I am trying to optimize is stored in the following VIEW:
/*
View finds contiguous ranges of rings grouped by type, radius and status
*/
ALTER VIEW vw_rings_overview AS SELECT
a.ID_RingType,
rt.Radius,
fn_isRingAssigned(a.ID_RingStatus) AS status,
rs.Name,
a.Number AS min,
MIN(b.Number) AS max
FROM
RingStatus AS rs, Rings AS a
JOIN RingType AS rt ON a.ID_RingType = rt.Code
JOIN Rings AS b
ON a.ID_RingType = b.ID_RingType
AND fn_isRingAssigned(a.ID_RingStatus) = fn_isRingAssigned(b.ID_RingStatus)
AND a.Number <= b.Number
WHERE NOT EXISTS
( SELECT 1
FROM Rings AS c
WHERE c.ID_RingType = a.ID_RingType
AND fn_isRingAssigned(c.ID_RingStatus) = fn_isRingAssigned(a.ID_RingStatus)
AND c.Number = a.Number - 1
)
AND NOT EXISTS
( SELECT 1
FROM Rings AS d
WHERE d.ID_RingType = b.ID_RingType
AND fn_isRingAssigned(d.ID_RingStatus) = fn_isRingAssigned(b.ID_RingStatus)
AND d.Number = b.Number + 1
)
AND fn_isRingAssigned(a.ID_RingStatus) = rs.ID
GROUP BY
a.ID_RingType,
fn_isRingAssigned(a.ID_RingStatus),
a.Number
ORDER BY
a.ID_RingType,
a.Number;
The data in the Rings table looks as follows:
+----+-------------+--------+---------------+---------+---------------------+
| ID | ID_RingType | Number | ID_RingStatus | ID_User | DateLastChange |
+----+-------------+--------+---------------+---------+---------------------+
| 1 | A | 1 | 4 | 2 | 2015-12-02 19:02:50 |
| 2 | A | 2 | 4 | 2 | 2015-12-02 19:02:56 |
| 3 | A | 3 | 4 | 2 | 2015-12-02 19:22:29 |
| 4 | A | 4 | 4 | 2 | 2015-12-21 20:32:24 |
| 5 | A | 5 | 4 | 2 | 2015-12-21 20:52:08 |
| 6 | A | 6 | 4 | 2 | 2015-12-21 20:52:22 |
| 7 | A | 7 | 1 | 2 | 2015-12-02 19:00:23 |
| 8 | A | 8 | 1 | 2 | 2015-12-02 19:00:23 |
| 9 | A | 9 | 1 | 2 | 2015-12-02 19:00:23 |
| 10 | A | 10 | 1 | 2 | 2015-12-02 19:00:23 |
+----+-------------+--------+---------------+---------+---------------------+
And results of the query look like this:
mysql> select * from vw_rings_overview;
+-------------+--------+--------+----------------+-----+-------+
| ID_RingType | Radius | status | Name | min | max |
+-------------+--------+--------+----------------+-----+-------+
| A | 20 | 2 | Assigned | 1 | 6 |
| A | 20 | 1 | Not assigned | 7 | 10 |
+-------------+--------+--------+----------------+-----+-------+
What the view does is find contiguous ranges of rings that have the same ring type, status and radius.
The Rings table currently contains fewer than 30,000 rows, and querying takes approx. 2 seconds. It is expected to contain a few million rows, so I wish to optimize the design of the tables, indexes and view.
Here is the result of EXPLAIN:
mysql> explain select * from vw_rings_overview;
+----+--------------------+------------+--------+--------------------+--------------------+---------+-----------------------------+-------+-----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+------------+--------+--------------------+--------------------+---------+-----------------------------+-------+-----------------------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 19 | |
| 2 | DERIVED | a | index | idx_rings_overview | idx_rings_overview | 7 | NULL | 25173 | Using where; Using index; Using temporary; Using filesort |
| 2 | DERIVED | rt | eq_ref | PRIMARY | PRIMARY | 2 | 848-avi2.a.ID_RingType | 1 | |
| 2 | DERIVED | rs | eq_ref | PRIMARY | PRIMARY | 1 | func | 1 | Using where |
| 2 | DERIVED | b | ref | idx_rings_overview | idx_rings_overview | 2 | 848-avi2.rt.Code | 1573 | Using where; Using index |
| 4 | DEPENDENT SUBQUERY | d | ref | idx_rings_overview | idx_rings_overview | 5 | 848-avi2.b.ID_RingType,func | 1 | Using where; Using index |
| 3 | DEPENDENT SUBQUERY | c | ref | idx_rings_overview | idx_rings_overview | 5 | 848-avi2.a.ID_RingType,func | 1 | Using where; Using index |
+----+--------------------+------------+--------+--------------------+--------------------+---------+-----------------------------+-------+-----------------------------------------------------------+
Here are some sample data: http://sqlfiddle.com/#!9/b8b489/1
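One possible direction, offered as a sketch rather than a drop-in replacement: on MySQL 8.0 or later (an assumption; the EXPLAIN format above suggests an older server), the contiguous ranges can be computed in a single pass with a window function instead of the self-join and the two NOT EXISTS subqueries. The assigned/not-assigned test from fn_isRingAssigned is inlined as an IF expression, the joins to RingType and RingStatus are left out for brevity, and the aliases assigned, grp, range_start and range_end are made up here.

```sql
-- Assumes MySQL 8.0+ (window functions) and unique (ID_RingType, Number) pairs.
SELECT ID_RingType,
       assigned,
       MIN(Number) AS range_start,
       MAX(Number) AS range_end
FROM (
    SELECT ID_RingType,
           Number,
           IF(ID_RingStatus = 1, 1, 2) AS assigned,
           -- Within one (type, assigned) partition, Number minus its rank
           -- stays constant for as long as the numbers are consecutive.
           CAST(Number AS SIGNED)
             - CAST(ROW_NUMBER() OVER (
                        PARTITION BY ID_RingType, IF(ID_RingStatus = 1, 1, 2)
                        ORDER BY Number) AS SIGNED) AS grp
    FROM Rings
) AS r
GROUP BY ID_RingType, assigned, grp
ORDER BY ID_RingType, range_start;
```

On the ten sample rows shown above this groups numbers 1 to 6 and 7 to 10 into two ranges, matching the view's output. The casts avoid unsigned subtraction errors, since Number is MEDIUMINT UNSIGNED.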