Can't improve the performance of a query - MySQL

I have the following 2 tables:
CREATE TABLE table1 (
ID INT(11) NOT NULL AUTO_INCREMENT,
AccountID INT NOT NULL,
Type VARCHAR(50) NOT NULL,
ValidForBilling BOOLEAN NULL DEFAULT false,
MerchantCreationTime TIMESTAMP NOT NULL,
PRIMARY KEY (ID),
UNIQUE KEY (OrderID, Type)
);
with the index:
INDEX accID_type_merchCreatTime_vfb (AccountID, Type, MerchantCreationTime, ValidForBilling);
CREATE TABLE table2 (
OrderID INT NOT NULL,
AccountID INT NOT NULL,
LineType VARCHAR(256) NOT NULL,
CreationDate TIMESTAMP NOT NULL,
CalculatedAmount NUMERIC(4,4) NULL,
table1ID INT(11) NOT NULL
);
I'm running the following query:
SELECT COALESCE(SUM(CalculatedAmount), 0.0) AS CalculatedAmount
FROM table2
INNER JOIN table1 ON table1.ID = table2.table1ID
WHERE table1.ValidForBilling is TRUE
AND table1.AccountID = 388
AND table1.Type = 'TPG_DISCOUNT'
AND table1.MerchantCreationTime >= '2018-11-01T05:00:00'
AND table1.MerchantCreationTime < '2018-12-01T05:00:00';
And it takes about 2 minutes to complete.
I did EXPLAIN in order to try and improve the query performance and got the following output:
+----+-------------+------------------+------------+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------+---------+----------------------+-------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------------+------------+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------+---------+----------------------+-------+----------+--------------------------+
| 1 | SIMPLE | table1 | NULL | range | PRIMARY,i_fo_merchant_time_account,FO_AccountID_MerchantCreationTime,FO_AccountID_ExecutionTime,FO_AccountID_Type_ExecutionTime,FO_AccountID_Type_MerchantCreationTime,accID_type_merchCreatTime_vfb | accID_type_merchCreatTime_vfb | 61 | NULL | 71276 | 100.00 | Using where; Using index |
| 1 | SIMPLE | table2 | NULL | eq_ref | table1ID,i_oc_fo_id | table1ID | 4 | finance.table1.ID | 1 | 100.00 | NULL |
+----+-------------+------------------+------------+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------+---------+----------------------+-------+----------+--------------------------+
I see that I scan 71276 rows in table1 and I can't seem to make this number lower.
Is there an index I can create to improve this query performance?

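Move ValidForBilling before MerchantCreationTime in accID_type_merchCreatTime_vfb. In a composite index, columns compared by equality (ref lookups such as ValidForBilling = TRUE) need to come before the column used for a range condition, because MySQL cannot use index columns that follow the first range column to narrow the search.
For table2, there seems to be a table1ID index already; appending CalculatedAmount to it lets the SUM be computed from the index alone:
CREATE INDEX tbl1IDCalcAmount ON table2 (table1ID, CalculatedAmount);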
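As a rough sketch of the index reordering described above: the new index name accID_type_vfb_merchCreatTime is made up for illustration, and the example assumes nothing else relies on the old index. The filter is also written as an equality here, on the assumption that ValidForBilling only ever holds 0 or 1; whether IS TRUE is optimized the same way may depend on the MySQL version.

```sql
-- Hypothetical index name: equality columns first, the range column last.
ALTER TABLE table1
  DROP INDEX accID_type_merchCreatTime_vfb,
  ADD INDEX accID_type_vfb_merchCreatTime
    (AccountID, Type, ValidForBilling, MerchantCreationTime);

-- Same query as in the question, with the boolean filter as an equality.
-- Assumes ValidForBilling only stores 0/1, so "= TRUE" matches "IS TRUE".
SELECT COALESCE(SUM(table2.CalculatedAmount), 0.0) AS CalculatedAmount
FROM table2
INNER JOIN table1 ON table1.ID = table2.table1ID
WHERE table1.AccountID = 388
  AND table1.Type = 'TPG_DISCOUNT'
  AND table1.ValidForBilling = TRUE
  AND table1.MerchantCreationTime >= '2018-11-01T05:00:00'
  AND table1.MerchantCreationTime <  '2018-12-01T05:00:00';
```

Re-running EXPLAIN afterwards should show whether the estimated row count for table1 actually drops.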

Related

mysql table join with max()

I have a problem with a query using JOIN and MAX/MIN. For example:
SELECT Min(a.date), Max(a.date)
FROM a
INNER JOIN b ON b.ID = a.ID AND b.cID = 5
Is it possible to add an index or change this query so that it performs better?
Below is the result of EXPLAIN:
+----+-------------+----------+------+-----------------+-----+---------+-----------+--------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+-----------------+-----+---------+-----------+--------+-----------------------+
| 1 | SIMPLE | b | ref | PRIMARY,cID | cID | 5 | const | 680648 | Using index |
| 1 | SIMPLE | a | ref | ID | ID | 5 | base.b.ID | 1 | Using index condition |
+----+-------------+----------+------+-----------------+-----+---------+-----------+--------+-----------------------+
Sorry, I have not posted the complete table definitions, since that could cause a lot of confusion.
CREATE TABLE `a` (
`ID` int(11) NOT NULL,
`date` datetime DEFAULT NULL,
PRIMARY KEY (`ID`),
KEY `date` (`date`)
);
CREATE TABLE `b` (
`bID` int(11) NOT NULL,
`ID` int(11) NOT NULL,
`cID` int(11) DEFAULT NULL,
PRIMARY KEY (`bID`),
KEY `cID` (`cID`)
);
On table b, add INDEX(cID, ID).
That is a "covering" index for this query, so it will probably get through the 680648 rows faster. It should replace the current KEY(cID).
Key_len for b is 5. That disagrees with the table definition; something got simplified too much.
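A minimal sketch of that index swap, assuming the existing key is named cID as in the posted definition; the new name cID_ID is only illustrative:

```sql
-- Replace the single-column key with a composite one that also carries ID,
-- so the join column can be read from the index without touching the rows.
ALTER TABLE b
  DROP INDEX cID,
  ADD INDEX cID_ID (cID, ID);

-- Check the plan again afterwards.
EXPLAIN
SELECT MIN(a.date), MAX(a.date)
FROM a
INNER JOIN b ON b.ID = a.ID AND b.cID = 5;
```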

Eliminating Sub query in MySQL

I have a table in MySQL as below.
CREATE TABLE `myTable` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`employeeNumber` int(11) DEFAULT NULL,
`approveDate` date DEFAULT NULL,
`documentNumber` varchar(50) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `INDEX_1` (`documentNumber`)
) ENGINE=InnoDB;
I want to write a query that tells me whether a documentNumber has been approved by all employeeNumbers, by only some of them, or by none of them.
I wrote the query below.
SELECT T1.documentNumber,
(CASE WHEN T2.currentNum = '0' THEN '1' WHEN T2.currentNum < T2.totalNum THEN '2' ELSE '3' END) AS approveStatusNumber
FROM myTable AS T1 LEFT JOIN
(SELECT documentNumber, COUNT(*) AS totalNum, SUM(CASE WHEN approveDate IS NOT NULL THEN '1' ELSE '0' END) AS currentNum
FROM myTable GROUP by documentNumber) AS T2 ON T1.documentNumber = T2.documentNumber
GROUP BY T1.documentNumber;
This SQL works, but it is very slow.
I ran EXPLAIN on it; the result is below.
+----+-------------+------------+-------+---------------+---------+---------+------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+---------+---------+------+------+----------------------------------------------+
| 1 | PRIMARY | T1 | range | INDEX_1 | INDEX_1 | 153 | NULL | 27 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 5517 | |
| 2 | DERIVED | myTable | index | NULL | INDEX_1 | 153 | NULL | 5948 | Using where |
+----+-------------+------------+-------+---------------+---------+---------+------+------+----------------------------------------------+
I think I have to eliminate the subquery to improve my query.
How can I do the same thing without a subquery? Or is there another way to improve it?
The expression to return currentNum could be more succinctly expressed (in MySQL) as
SUM(approveDate IS NOT NULL)
And there's no need for an inline view. This will return an equivalent result:
SELECT t.documentNumber
, CASE
WHEN SUM(t.approveDate IS NOT NULL) = 0
THEN '1'
WHEN SUM(t.approveDate IS NOT NULL) < COUNT(*)
THEN '2'
ELSE '3'
END AS approveStatusNumber
FROM myTable t
GROUP BY t.documentNumber
ORDER BY t.documentNumber
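The SUM(approveDate IS NOT NULL) shorthand works because MySQL evaluates a boolean expression to the integer 1 or 0, so summing it counts the rows where the condition holds. A small illustration with made-up literal values:

```sql
-- In MySQL a comparison yields 1 (true) or 0 (false).
SELECT (NULL IS NOT NULL)         AS unapproved_row,  -- 0
       ('2019-01-15' IS NOT NULL) AS approved_row;    -- 1
```

This is MySQL-specific behaviour; other databases generally need the CASE expression from the question.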

Horrible MySQL index behavior with a simplest IN statement

I have found that MySQL (Win 7 64, 5.6.14) does not use an index properly when the value list of an IN clause comes from a subquery. The USER table contains 900k records.
If I use the IN (_SOME_TABLE_OUTPUT_) syntax, I get a full scan of all 900k users and the query runs forever.
If I use the IN ('CONCRETE','VALUES') syntax, the index is used correctly.
How can I make MySQL finally use the index?
1st case:
explain SELECT gu.id FROM USER gu WHERE gu.uuid in
(select '11b6a540-0dc5-44e0-877d-b3b83f331231' union
select '11b6a540-0dc5-44e0-877d-b3b83f331232');
+----+--------------------+------------+-------+---------------+------+---------+------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+------------+-------+---------------+------+---------+------+--------+--------------------------+
| 1 | PRIMARY | gu | index | NULL | uuid | 257 | NULL | 829930 | Using where; Using index |
| 2 | DEPENDENT SUBQUERY | NULL | NULL | NULL | NULL | NULL | NULL | NULL | No tables used |
| 3 | DEPENDENT UNION | NULL | NULL | NULL | NULL | NULL | NULL | NULL | No tables used |
| NULL | UNION RESULT | <union2,3> | ALL | NULL | NULL | NULL | NULL | NULL | Using temporary |
+----+--------------------+------------+-------+---------------+------+---------+------+--------+--------------------------+
2nd case:
explain SELECT gu.id FROM USER gu WHERE gu.uuid in
('11b6a540-0dc5-44e0-877d-b3b83f331231');
+----+-------------+-------+------+---------------+------+---------+-------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+-------+------+--------------------------+
| 1 | SIMPLE | gu | ref | uuid | uuid | 257 | const | 1 | Using where; Using index |
+----+-------------+-------+------+---------------+------+---------+-------+------+--------------------------+
Table structure:
CREATE TABLE `USER` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`version` bigint(20) NOT NULL,
`email` varchar(255) DEFAULT NULL,
`uuid` varchar(255) NOT NULL,
`partner_id` bigint(20) NOT NULL,
`password` varchar(255) DEFAULT NULL,
`date_created` datetime DEFAULT NULL,
`last_updated` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `unique-email` (`partner_id`,`email`),
KEY `uuid` (`uuid`),
CONSTRAINT `fk_USER_partner` FOREIGN KEY (`partner_id`) REFERENCES `partner` (`id`) ON DELETE CASCADE,
CONSTRAINT `FKB2D9FEBE725C505E` FOREIGN KEY (`partner_id`) REFERENCES `partner` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3315452 DEFAULT CHARSET=latin1
FORCE INDEX and USE INDEX statements don't change anything.
Demonstration SQLfiddle: http://sqlfiddle.com/#!2/c607e1/2
I have faced this problem before. In my case, one table had a single column set to UTF-8 while the other tables were latin1, and no matter what I did, MySQL refused to use any index. The problem is described quite well in the blog post "Slow queries in MySQL due to collation problems". Once you fix the character set mismatch, I believe either form of the query will work.
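One way to check for such a mismatch, sketched here on the assumption that the current default database is the one holding the USER table, is to compare the column's character set and collation with what the connection uses for string literals:

```sql
-- Character set and collation of the indexed column.
SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'USER'
  AND COLUMN_NAME = 'uuid';

-- Character set and collation applied to literals sent by the client.
SHOW VARIABLES LIKE 'character_set_connection';
SHOW VARIABLES LIKE 'collation_connection';
```

If the two differ (for example latin1 on the column and utf8 on the connection), that would be consistent with the behaviour described above, though it does not prove it is the cause.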
An inner join on your virtual table might give you better performance. Try something along these lines.
SELECT gu.id
FROM USER gu
INNER JOIN (
select '11b6a540-0dc5-44e0-877d-b3b83f331231' uuid
union all
select '11b6a540-0dc5-44e0-877d-b3b83f331232') ids
on gu.uuid = ids.uuid;

MySQL: Optimizing a query with NOT EXISTS a subquery

Let me just say, first of all, that I'm not a MySQL guru; while I use it adequately, I don't know a lot of details about it. In a system I just inherited, I've got this query:
SELECT DISTINCT profile2.f3
FROM node AS profile
JOIN node AS profile2
ON ( profile.f1 = profile2.f1 )
WHERE profile.f2 = "aString"
AND profile.f3 = "anotherString"
AND profile2.f2 = "aThirdString"
AND NOT EXISTS (SELECT profile3.f1
FROM node AS profile3
WHERE profile3.f1 = profile.f1
AND profile3.f2 = "yetAnotherString") ;
SHOW CREATE TABLE gives:
CREATE TABLE `node` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`graph` varchar(100) CHARACTER SET latin1 DEFAULT NULL,
`f1` varchar(200) NOT NULL,
`f2` varchar(200) NOT NULL,
`f3` mediumtext NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `nodeindex` (`graph`(20),`f1`(100),`f2`(100),`f3`(100)),
KEY `ix_node_f1` (`f1`),
KEY `ix_node_graph` (`graph`),
KEY `ix_node_f3` (`f3`(255)),
KEY `ix_node_f2` (`f2`),
KEY `node_po` (`f2`,`f3`(130)),
KEY `node_so` (`f1`,`f3`(130)),
KEY `node_sp` (`f1`,`f2`(130)),
FULLTEXT KEY `node_search` (`f3`)
) ENGINE=MyISAM AUTO_INCREMENT=455854703 DEFAULT CHARSET=utf8
EXPLAIN EXTENDED gives:
+----+--------------------+----------+------+--------------------------------------------------------------------------------------+---------+---------+-----------------------------------+-------+----------+------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+----------+------+--------------------------------------------------------------------------------------+---------+---------+-----------------------------------+-------+----------+------------------------------+
| 1 | PRIMARY | profile | ref | ix_node_f1,ix_node_f3,ix_node_f2,node_po,node_so,node_sp,node_search | node_po | 994 | const,const | 49084 | 100.00 | Using where; Using temporary |
| 1 | PRIMARY | profile2 | ref | ix_node_f1,ix_node_f2,node_po,node_so,node_sp | node_sp | 994 | sumazi_prdf.profile.f1,const | 1 | 100.00 | Using where |
| 2 | DEPENDENT SUBQUERY | profile3 | ref | ix_node_f1,ix_node_f2,node_po,node_so,node_sp | node_sp | 994 | sumazi_prdf.profile.f1,const | 1 | 100.00 | Using where |
+----+--------------------+----------+------+--------------------------------------------------------------------------------------+---------+---------+-----------------------------------+-------+----------+------------------------------+
As I say, I'm not an RDBMS guru, but my intuition suggests that the performance of this query could be substantially improved. Any suggestions?
You can try the following, which should be relatively faster, or you can rewrite it with joins. Since f1 is declared NOT NULL, NOT IN returns the same rows as NOT EXISTS here.
SELECT DISTINCT profile2.f3
FROM node AS profile
JOIN node AS profile2
ON ( profile.f1 = profile2.f1 )
WHERE profile.f2 = "aString"
AND profile.f3 = "anotherString"
AND profile2.f2 = "aThirdString"
AND profile.f1 NOT IN (SELECT profile3.f1
FROM node AS profile3
WHERE profile3.f2 = "yetAnotherString") ;
LEFT JOIN ... WHERE ... IS NULL tends to be faster than NOT EXISTS in MySQL; in other RDBMSs, it tends to be the other way round. Try:
SELECT DISTINCT profile2.f3
FROM node AS profile
JOIN node AS profile2 ON profile.f1 = profile2.f1
LEFT JOIN node AS profile3 ON profile.f1 = profile3.f1
AND profile3.f2 = "yetAnotherString"
WHERE profile.f2 = "aString"
AND profile.f3 = "anotherString"
AND profile2.f2 = "aThirdString"
AND profile3.f1 IS NULL

MySQL fixing index so possible keys is not null on left join

This question follows on from the problem posted here. When I run EXPLAIN on my query
SELECT u_id, SUM(counts.s_count * tablename.weighted) AS total FROM tablename
LEFT JOIN (SELECT a_id, s_count FROM tablename WHERE u_id = 1) counts
ON tablename.a_id = counts.a_id
GROUP BY u_id ORDER BY total DESC LIMIT 0,100;
I get the response
+----+-------------+--------------------+-------+---------------+-----------+---------+------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------+-------+---------------+-----------+---------+------+--------+----------------------------------------------+
| 1 | PRIMARY | tablename | index | NULL | a_id | 3 | NULL | 7222350 | Using index; Using temporary; Using filesort |
| 1 | PRIMARY | [derived2] | ALL | NULL | NULL | NULL | NULL | 37 | |
| 2 | DERIVED | tablename | ref | PRIMARY | PRIMARY | 4 | | 37 | Using index |
+----+-------------+--------------------+-------+---------------+-----------+---------+------+--------+----------------------------------------------+
the table is created with
CREATE TABLE IF NOT EXISTS tablename (
u_id INT NOT NULL,
a_id MEDIUMINT NOT NULL,
s_count MEDIUMINT NOT NULL,
weighted FLOAT NOT NULL,
INDEX (a_id),
PRIMARY KEY (u_id,a_id)
)ENGINE=INNODB;
How can I change the index or the query so that it makes more effective use of the key? Once the table grows to 7 million rows, the query takes about 30 seconds.
Edit: the table can be created with dummy data using:
CREATE TABLE IF NOT EXISTS tablename ( u_id INT NOT NULL, a_id MEDIUMINT NOT NULL,s_count MEDIUMINT NOT NULL, weighted FLOAT NOT NULL,INDEX (a_id), PRIMARY KEY (u_id,a_id))ENGINE=INNODB;
INSERT INTO tablename (u_id,a_id,s_count,weighted ) VALUES (1,1,17,0.0521472392638),(1,2,80,0.245398773006),(1,3,2,0.00613496932515),(1,4,1,0.00306748466258),(1,5,1,0.00306748466258),(1,6,20,0.0613496932515),(1,7,3,0.00920245398773),(1,8,100,0.306748466258),(1,9,100,0.306748466258),(1,10,2,0.00613496932515),(2,1,1,0.00327868852459),(2,2,1,0.00327868852459),(2,3,100,0.327868852459),(2,4,200,0.655737704918),(2,5,1,0.00327868852459),(2,6,1,0.00327868852459),(2,7,0,0.0),(2,8,0,0.0),(2,9,0,0.0),(2,10,1,0.00327868852459),(3,1,15,0.172413793103),(3,2,40,0.459770114943),(3,3,0,0.0),(3,4,0,0.0),(3,5,0,0.0),(3,6,10,0.114942528736),(3,7,1,0.0114942528736),(3,8,20,0.229885057471),(3,9,0,0.0),(3,10,1,0.0114942528736);
You can hardly force MySQL to use an index for a join against the result of a subquery, but you can try to speed up the grouping with a covering index (an index that holds all the columns the query needs, so the rows it references do not have to be fetched).
Try adding a composite index on (u_id, a_id, weighted).
You will probably also need to give MySQL a hint to use that index (named Index_3 here):
SELECT u_id, SUM(counts.s_count * tablename.weighted) AS total
FROM tablename USE INDEX(Index_3)
LEFT JOIN (SELECT a_id, s_count FROM tablename WHERE u_id = 1) counts
ON tablename.a_id = counts.a_id
GROUP BY u_id ORDER BY total DESC LIMIT 0,100;
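For completeness, the composite index referenced by the hint could be created roughly like this. The name Index_3 is taken from the USE INDEX hint above; everything else is a sketch and should be checked with EXPLAIN on real data:

```sql
-- Composite index covering the grouping column, the join column and the
-- weight, so the outer scan can be served from the index.
ALTER TABLE tablename
  ADD INDEX Index_3 (u_id, a_id, weighted);

-- Confirm that the outer tablename scan now reports key = Index_3.
EXPLAIN
SELECT u_id, SUM(counts.s_count * tablename.weighted) AS total
FROM tablename USE INDEX (Index_3)
LEFT JOIN (SELECT a_id, s_count FROM tablename WHERE u_id = 1) counts
  ON tablename.a_id = counts.a_id
GROUP BY u_id ORDER BY total DESC LIMIT 0,100;
```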