I have a Link table with from_uid and to_uid (both indexed) and I want to filter out certain ids. So I do:
SELECT l.uid
FROM Link l
JOIN filter_ids t1 ON l.from_uid = t1.id
JOIN filter_ids t2 ON l.to_uid = t2.id
Now for some reason this is unexpectedly slow :( whereas each individual join is very fast. Is it unable to use the indexes properly?
EXPLAIN tells me:
id  select_type  table  type   possible_keys    key       key_len  ref   rows   Extra
1   SIMPLE       t1     index  NULL             PRIMARY   34       NULL  12205  Using index
1   SIMPLE       l      ref    from_uid,to_uid  from_uid  96       func  6      Using where
1   SIMPLE       t2     index  NULL             PRIMARY   34       NULL  12205  Using where; Using index; Using join buffer
No idea if it'll help but try:
select l.uid
from Link l
where l.from_uid in (select id from filter_ids)
and l.to_uid in (select id from filter_ids)
Maybe it'll make better use of the indexes.
The EXPLAIN tells you that the join actually starts from the t1 table. That means you need to add a new index on Link (or, better, extend the current from_uid index):
(from_uid, to_uid, uid)
or if uid is the primary key, just:
(from_uid, to_uid)
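As a minimal sketch of that suggestion, assuming uid is the primary key of Link and the table uses InnoDB (so every secondary index already carries uid), the composite index could be created as follows. The index name idx_link_from_to is hypothetical; the existing single-column index is called from_uid according to the key column of the EXPLAIN output.

```sql
-- Hypothetical index name; column order follows the suggestion above.
ALTER TABLE Link
    ADD INDEX idx_link_from_to (from_uid, to_uid);

-- The old single-column index is a prefix of the new one, so it is
-- probably redundant. Drop it only after checking the new plan.
-- ALTER TABLE Link DROP INDEX from_uid;

-- Re-check the plan: ideally l shows "Using index" on the new key.
EXPLAIN
SELECT l.uid
FROM Link l
JOIN filter_ids t1 ON l.from_uid = t1.id
JOIN filter_ids t2 ON l.to_uid = t2.id;
```

With both columns in one index, the to_uid value for each matching from_uid can be read from the index itself instead of from the table rows.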
UPD
What you are describing is strange. You can try running just:
SELECT STRAIGHT_JOIN l.uid
FROM Link l
JOIN filter_ids t1 ON l.from_uid = t1.id
JOIN filter_ids t2 ON l.to_uid = t2.id
In MySQL, I have a simple join between 2 tables. Something like
select a.id, SUM(b.qty) from a inner join b on a.id=b.id
where a.id=12345
group by a.id
It runs fine as a plain query. But when I keep the query
select a.id, SUM(b.qty) from a inner join b on a.id=b.id
group by a.id
in a view called view_ab, the view takes an enormous amount of time when I run the following query on it:
select * from view_ab where id = 12345
Both tables are large. I am unable to figure out the reason for such a drop in performance. Please help me resolve this performance issue.
EDIT:
This is the view SQL
CREATE VIEW view_ab AS
SELECT
r.drid AS drid,
SUM(s.return_qty) AS return_qty
FROM tbl_deliveryroute r
INNER JOIN tbl_deliveryroute_sku s ON r.drid = s.drid
GROUP BY r.drid;
This is the query
SELECT
r.drid AS drid,
SUM(s.return_qty) AS return_qty
FROM tbl_deliveryroute r
INNER JOIN tbl_deliveryroute_sku s ON r.drid = s.drid
WHERE r.drid=12718651
GROUP BY r.drid;
This is the query on the VIEW
SELECT * FROM view_ab WHERE drid=12718651;
Execution plan of the view
EXPLAIN EXTENDED SELECT * FROM view_ab WHERE drid=12718651;
id  select_type  table  partitions  type    possible_keys                   key      key_len  ref                 rows      filtered  Extra
1   PRIMARY             (NULL)      ref                                              4        const               10        100.00    (NULL)
2   DERIVED      s      (NULL)      ALL     idx_tbl_deliverroute_sku_drid   (NULL)   (NULL)   (NULL)              15060913  100.00    USING TEMPORARY; USING filesort
2   DERIVED      r      (NULL)      eq_ref  PRIMARY,FK_tbl_deliveryroute_1  PRIMARY  4        humdemotest.s.drid  1         100.00    USING INDEX
EXPLAIN EXTENDED SELECT
r.drid AS drid,
SUM(s.return_qty) AS return_qty
FROM tbl_deliveryroute r
INNER JOIN tbl_deliveryroute_sku s ON r.drid = s.drid
WHERE r.drid=12718651
GROUP BY r.drid;
id  select_type  table  partitions  type   possible_keys                  key                            key_len  ref    rows  filtered  Extra
1   SIMPLE       r      (NULL)      const  PRIMARY                        PRIMARY                        4        const  1     100.00    USING INDEX
1   SIMPLE       s      (NULL)      ref    idx_tbl_deliverroute_sku_drid  idx_tbl_deliverroute_sku_drid  4        const  22    100.00    (NULL)
From what I am seeing, you don't even need the join: you are joining A to B on the same key column, and that key already exists in table B, so just query B and group by it. Also, I would make sure there is an index on the route ID column (drid) of tbl_deliveryroute_sku.
SELECT
s.drid,
sum( s.return_qty ) Return_Qty
from
tbl_DeliveryRoute_Sku s
where
s.drID = 12718651
group by
s.drID;
Since you are only selecting the key and the sum, you don't even need the other table. If you needed columns from the first table other than the key, then yes, you would need the join. You could even simplify a step further, since you are only querying a single key ID:
SELECT
sum( s.return_qty ) Return_Qty
from
tbl_DeliveryRoute_Sku s
where
s.drID = 12718651;
The reason the view is slow is simple. You are executing:
SELECT *
FROM view_ab
WHERE drid = 12718651;
What you want to execute is:
select a.id, SUM(b.qty)
from a inner join
b
on a.id = b.id
where a.id = 12345
group by a.id;
What is actually being executed is:
select ab.*
from (select a.id, SUM(b.qty)
from a inner join
b
on a.id = b.id
group by a.id
) ab
where ab.id = 12345;
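That is, the entire aggregation is performed first, and only then is the where applied. What you want is for the predicate to be pushed down into the view's query (MySQL calls this merging). MySQL cannot merge a view that contains an aggregate function or GROUP BY, so it materializes the view as a temporary table instead. You can review the documentation on this subject.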
One solution would seem to be rephrasing the query as a correlated subquery:
select a.id,
(select sum(b.qty) from b where b.id = a.id) as qty
from a
where a.id = 12345;
Alas, subqueries in the select have the same effect, so this doesn't work.
I don't know of a solution using a view. You can avoid using views for this. The ultimate solution would be to implement a trigger to store the summarized results in another table -- effectively materializing the view.
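A rough sketch of that trigger-based approach, under stated assumptions: the summary table name deliveryroute_return_totals, the trigger names, and the column types are all hypothetical, drid is assumed to be NOT NULL in tbl_deliveryroute_sku, and every drid there is assumed to exist in tbl_deliveryroute (so the join in the view adds nothing). The DELIMITER lines are mysql client directives rather than SQL.

```sql
-- Hypothetical summary table holding one total per drid.
CREATE TABLE deliveryroute_return_totals (
    drid       INT NOT NULL PRIMARY KEY,
    return_qty DECIMAL(20,4) NOT NULL DEFAULT 0
);

-- Seed it once from the existing rows (ideally while writes are paused).
INSERT INTO deliveryroute_return_totals (drid, return_qty)
SELECT s.drid, COALESCE(SUM(s.return_qty), 0)
FROM tbl_deliveryroute_sku s
GROUP BY s.drid;

DELIMITER //

-- New detail row: add its quantity to the total, creating the row if needed.
CREATE TRIGGER trg_sku_after_insert
AFTER INSERT ON tbl_deliveryroute_sku
FOR EACH ROW
BEGIN
    INSERT INTO deliveryroute_return_totals (drid, return_qty)
    VALUES (NEW.drid, COALESCE(NEW.return_qty, 0))
    ON DUPLICATE KEY UPDATE
        return_qty = return_qty + COALESCE(NEW.return_qty, 0);
END//

-- Changed detail row: remove the old contribution, then add the new one.
CREATE TRIGGER trg_sku_after_update
AFTER UPDATE ON tbl_deliveryroute_sku
FOR EACH ROW
BEGIN
    UPDATE deliveryroute_return_totals
       SET return_qty = return_qty - COALESCE(OLD.return_qty, 0)
     WHERE drid = OLD.drid;
    INSERT INTO deliveryroute_return_totals (drid, return_qty)
    VALUES (NEW.drid, COALESCE(NEW.return_qty, 0))
    ON DUPLICATE KEY UPDATE
        return_qty = return_qty + COALESCE(NEW.return_qty, 0);
END//

-- Deleted detail row: remove its contribution.
CREATE TRIGGER trg_sku_after_delete
AFTER DELETE ON tbl_deliveryroute_sku
FOR EACH ROW
BEGIN
    UPDATE deliveryroute_return_totals
       SET return_qty = return_qty - COALESCE(OLD.return_qty, 0)
     WHERE drid = OLD.drid;
END//

DELIMITER ;

-- The lookup then becomes a primary-key read.
SELECT drid, return_qty
FROM deliveryroute_return_totals
WHERE drid = 12718651;
```

One difference from the view worth keeping in mind: a drid whose detail rows have all been deleted stays in the summary table with a total of 0, whereas the view would not return it at all.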
I have the following query:
SELECT *
FROM s
JOIN b ON s.borrowerId = b.id
JOIN (
SELECT MIN(id) AS id
FROM tbl
WHERE dealId IS NULL
GROUP BY borrowerId, created
) s2 ON s.id = s2.id
Is there a simple way to optimize this so that I can do the JOIN directly and utilize indexes?
UPDATE
The created field is part of the GROUP BY clause because, due to the limitations of our version of MySQL and the ORM being used, it is possible to have multiple records with the same created timestamp value. As a result, I need to find the first record for each combination of borrowerId and created.
Typically I might attempt something like this:
SELECT *
FROM s
INNER JOIN b ON s.borrowerId = b.id
LEFT OUTER JOIN s2
ON s.borrowerId = s2.borrowerId
AND s.created = s2.created
AND s.id <> s2.id
AND s.id < s2.id
WHERE s2.id IS NULL
AND s.dealId IS NULL;
But I'm not sure if that works 100% the way I want.
EXPLAIN from MySQL outputs the following:
id  select_type  table       type    possible_keys                    key      key_len  ref    rows    Extra
1   PRIMARY      b           ALL     NULL                             NULL     NULL     NULL   129690
1   PRIMARY      <derived2>  ALL     NULL                             NULL     NULL     NULL   317751  Using join buffer
1   PRIMARY      s           eq_ref  PRIMARY,borrowerId_2,borrowerId  PRIMARY  4        s2.id  1       Using where
2   DERIVED      statuses    ref     dealId                           dealId   5               183987  Using where; Using temporary; Using filesort
As you can see, it has to query a massive number of records to build the subquery data set and when joining to the derived subquery, no indexes are found and so no indexes are used.
The first query needs this composite index:
INDEX(borrowerId, created, id)
Note that MySQL rarely uses two indexes for one SELECT, but a composite index is often very handy.
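For illustration, the composite index could be added as sketched below. The table is written as tbl because that is the name used in the subquery (the EXPLAIN output calls it statuses), and both index names are hypothetical. The second statement is a variant that is not part of the suggestion above: since the subquery filters on dealId IS NULL, leading with dealId may let one index serve both the filter and the grouping, which is worth comparing with EXPLAIN rather than taking on faith.

```sql
-- Composite index as suggested (index name is hypothetical).
ALTER TABLE tbl
    ADD INDEX idx_borrower_created_id (borrowerId, created, id);

-- Assumed variant to compare: put the filtered column first.
ALTER TABLE tbl
    ADD INDEX idx_deal_borrower_created_id (dealId, borrowerId, created, id);

-- Check which index the derived query picks and whether
-- "Using temporary; Using filesort" disappears.
EXPLAIN
SELECT MIN(id) AS id
FROM tbl
WHERE dealId IS NULL
GROUP BY borrowerId, created;
```

Keep whichever index the optimizer actually uses and drop the other, since every extra index slows down writes.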
The second query seems grossly inefficient.
Please provide SHOW CREATE TABLE for each table.
I want to optimize this query. Here are some statistics for the tables involved.
The products and products_categories tables each have around 500,000 records, but the category mentioned below has only about 1,600 of them. I have created slots for these 1,600 records. Every product can have a minimum of 1 slot and a maximum of 10 slots. The slot table has around 300,000 records and can also contain slots that have already expired. I want the products that are going to expire soon to come first, with the rest of the products after them.
I have created an index on the end_time column, but because I use a conditional operator on it, the index is not used in this query. Kindly tell me the best way to optimize this.
EXPLAIN
SELECT
xcart_products.*
FROM xcart_products
INNER JOIN xcart_products_categories
ON xcart_products_categories.productid = xcart_products.productid
LEFT JOIN (SELECT
t1.*
FROM bvira_megahour_time_slot t1
LEFT OUTER JOIN bvira_megahour_time_slot t2
ON (t1.product_id = t2.product_id
AND t1.end_time > t2.end_time
AND t1.end_time > NOW())
WHERE t2.product_id IS NULL) as bvira_megahour_time_slot
ON bvira_megahour_time_slot.product_id = xcart_products.productid
WHERE xcart_products_categories.categoryid = '4410'
AND xcart_products.saleid = 2
GROUP BY xcart_products.productid
Below is the result of explain query.
id  select_type  table                      type    possible_keys                     key           key_len  ref                                                 rows    Extra
1   PRIMARY      xcart_products_categories  ref     PRIMARY,cpm,productid,orderby,pm  cpm           4        const                                               1523    Using index; Using temporary; Using filesort
1   PRIMARY      xcart_products             eq_ref  PRIMARY,saleid                    PRIMARY       4        wwwbvira_xcart.xcart_products_categories.productid  1       Using where
1   PRIMARY      <derived2>                 ALL     NULL                              NULL          NULL     NULL                                                77215
2   DERIVED      t1                         ALL     NULL                              NULL          NULL     NULL                                                398907
2   DERIVED      t2                         ref     i_product_id,i_end_time           i_product_id  4        wwwbvira_xcart.t1.product_id                        4       Using where; Not exists
I have rewritten your query as follows:
EXPLAIN
SELECT p.*
FROM xcart_products p
INNER JOIN xcart_products_categories c
ON c.productid = p.productid
LEFT JOIN (
SELECT t1.*
FROM bvira_megahour_time_slot t1
LEFT JOIN bvira_megahour_time_slot t2
ON ( t1.product_id = t2.product_id
AND t1.end_time > t2.end_time
AND t1.end_time > NOW()
)
WHERE t2.product_id IS NULL
) AS bvira_megahour_time_slot
ON bvira_megahour_time_slot.product_id = p.productid
WHERE c.categoryid = '4410'
AND p.saleid = 2
GROUP BY p.productid
Please make sure that you have the following compound (multi-column) indexes:
bvira_megahour_time_slot: (product_id, end_time)
xcart_products: (productid, saleid)
xcart_products_categories: (productid, categoryid)
or (categoryid, productid)
With these indexes, it should work much better.
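A possible way to create them is sketched below. The index names are hypothetical, and the column names saleid and categoryid are taken from the query itself. The EXPLAIN output already lists indexes named cpm and pm on xcart_products_categories, so one of them may cover the same columns; it seems worth checking before adding a duplicate.

```sql
-- See what already exists before adding anything.
SHOW INDEX FROM xcart_products_categories;
SHOW INDEX FROM bvira_megahour_time_slot;

-- Hypothetical index names; columns as listed above.
ALTER TABLE bvira_megahour_time_slot
    ADD INDEX idx_slot_product_end (product_id, end_time);

ALTER TABLE xcart_products
    ADD INDEX idx_products_id_sale (productid, saleid);

-- Only if no existing index already starts with (categoryid, productid).
ALTER TABLE xcart_products_categories
    ADD INDEX idx_pc_category_product (categoryid, productid);
```

After adding them, running the EXPLAIN again should show whether t1 in the derived table still does a full scan (type ALL).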
I have a problem where I'm trying to use a LEFT JOIN in my query, but it doesn't return anything. I have tried both an INNER and a RIGHT JOIN, which work.
But the problem with INNER and RIGHT is that they skip the rows where the join condition isn't matched, which I don't want.
And as the title also says, the CPU load is at 100% while the query with the LEFT JOIN is running.
My query looks like this
SELECT t1.id, t2.name, t1.name guid, t1.timestamp, t1.server_id, t1.message
FROM chat t1
LEFT JOIN players t2 ON t2.guid = t1.name
ORDER by t1.id DESC
LIMIT 0,20;
Result from EXPLAIN:
id  select_type  table  type  possible_keys  key   key_len  ref   rows    Extra
1   SIMPLE       t1     ALL   NULL           NULL  NULL     NULL  37302   Using temporary; Using filesort
1   SIMPLE       t2     ALL   NULL           NULL  NULL     NULL  610715
I hold a set of nodes in one MySQL table (table1) and a set of edges in another (table2). Nodes have primary keys, and edges reference them as "foreign keys".
**table1**
id label
1 node1
2 node2
3 node3
**table2**
FK_first  FK_second  rel
1         3          guardian
2         1          guardian
1         3          times
I know the db design is not perfect, but it's simple...
Now I want the number of 'rel' entries for every node, so I run a query like:
SELECT
label,
COUNT( rel ) as freq
FROM
`table1`
LEFT JOIN table2 ON (id=FK_first OR id=FK_second)
GROUP BY label
ORDER BY freq DESC
I have about 1000 nodes and 2000 edges. With ON (id=FK_first) alone, the query is much faster (<1 sec). The query with ON (id=FK_first OR id=FK_second) needs about 6 sec, which is very slow.
I would appreciate some comments to speed this up a bit :-)
LEFT JOIN table2 ON (id=FK_first OR id=FK_second) ~6 sec
LEFT JOIN table2 ON (id=FK_first) ~0.16 sec
LEFT JOIN table2 ON (id=FK_second) ~0.16 sec
LEFT JOIN table2 ON id IN (FK_first,FK_second) ~6 sec
EXPLAIN 1:
id  select_type  table   type  possible_keys                  key   key_len  ref   rows  Extra
1   SIMPLE       table1  ALL   NULL                           NULL  NULL     NULL  2571  Using temporary; Using filesort
1   SIMPLE       table2  ALL   FK_first,FK_second,FK_first_2  NULL  NULL     NULL  3858
EXPLAIN 2:
id  select_type  table   type   possible_keys        key         key_len  ref        rows  Extra
1   SIMPLE       table1  index  NULL                 PRIMARY     2        NULL       2571  Using index; Using temporary; Using filesort
1   SIMPLE       table2  ref    FK_first,FK_first_2  FK_first_2  4        table1.id  1
Try doing two joins and moving the "OR" into the COUNT() function.
For every row of table1, this joins table2 once on FK_first, then again on FK_second (skipping edges that are already joined to that row via FK_first). Then, in the COUNT, we count only rows where either join's rel column is non-null:
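SELECT
label,
-- COALESCE is non-null when either join matched; with the default
-- sql_mode, || is a logical OR and yields NULL for 'text' || NULL
COUNT( COALESCE(table2A.rel, table2B.rel) ) as freq
FROM
`table1`
LEFT JOIN
table2 as table2A
ON id=table2A.FK_first
LEFT JOIN
table2 as table2B
ON id=table2B.FK_second
-- skip self-loops, which the first join already matched
AND table2B.FK_first != table2B.FK_second
GROUP BY label
ORDER BY freq DESC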
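One caveat with the two-join form: when a node has matches through both FK_first and FK_second, every FK_first match is paired with every FK_second match, so the count can differ from what the OR version returns. A different technique that avoids this, sketched here as an alternative rather than as the approach described above, is to list each edge once per endpoint with UNION ALL and join that list on a single column. The aliases t, e and node_id are made up for the example.

```sql
SELECT
    t.label,
    COUNT(e.rel) AS freq
FROM table1 AS t
LEFT JOIN (
    -- one row per edge for its first endpoint
    SELECT FK_first AS node_id, rel
    FROM table2
    UNION ALL
    -- one row per edge for its second endpoint, skipping self-loops
    -- so that such an edge is not counted twice for the same node
    SELECT FK_second AS node_id, rel
    FROM table2
    WHERE FK_second <> FK_first
) AS e ON e.node_id = t.id
GROUP BY t.label
ORDER BY freq DESC;
```

With the three sample edges from the question, this gives node1 = 3, node3 = 2 and node2 = 1, the same counts as the OR join. Each half of the UNION ALL can use its own index on table2, although how the join against the derived table performs depends on the MySQL version, so it is worth checking with EXPLAIN.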