Slow query, should I index or use another solution? - mysql

I have this very slow query. It counts the products that have certain specifications. Is indexing the solution, or are there other options?
select
count(DISTINCT if(ps10.specification in ('Meisje'),p.products_id,NULL)) as count1,
count(DISTINCT if(ps10.specification in ('Jongen'),p.products_id,NULL)) as count2,
count(DISTINCT if(ps10.specification in ('Unisex'),p.products_id,NULL)) as count3
from (products p)
join (products_to_categories p2c)
on (p.products_id = p2c.products_id)
left join (specials s)
on (p.products_id = s.products_id)
left join (products_attributes pa)
on (p.products_id = pa.products_id)
left join (products_options_values pv)
on (pa.options_values_id = pv.products_options_values_id)
left join (products_stock ps)
on (p.products_id=ps.products_id and pv.products_options_values_id = ps.products_options_values_id2)
INNER JOIN products_specifications ps10
ON p.products_id = ps10.products_id
INNER JOIN products_specifications ps17
ON p.products_id = ps17.products_id
where p.products_status = '1'
and ps.products_stock_quantity>0
and p2c.categories_id in (2,54,60,82,109,115,116,118,53,58,104,55,101,75,56,64,66,67,68,69,70,71,84,103,114,80,92,99,93,94,95,97,106)
AND ps10.specifications_id = '10'
AND ps10.language_id = '1'
AND ps17.specification in ('Babyslofjes')
AND ps17.specifications_id = '17'
AND ps17.language_id = '1'
Running EXPLAIN on this query gives me these results:
+----+-------------+-------+--------+-------------------------------------+-------------------------------------+---------+------------------------------------------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-------------------------------------+-------------------------------------+---------+------------------------------------------+-------+--------------------------+
| 1 | SIMPLE | ps | ALL | idx_products_stock_attributes | NULL | NULL | NULL | 16216 | Using where |
| 1 | SIMPLE | p | eq_ref | PRIMARY | PRIMARY | 4 | kikleding.ps.products_id | 1 | Using where |
| 1 | SIMPLE | s | ref | idx_specials_products_id | idx_specials_products_id | 4 | kikleding.p.products_id | 1 | Using index |
| 1 | SIMPLE | p2c | ref | PRIMARY | PRIMARY | 4 | kikleding.ps.products_id | 1 | Using where; Using index |
| 1 | SIMPLE | pv | ref | PRIMARY | PRIMARY | 4 | kikleding.ps.products_options_values_id2 | 1 | Using where; Using index |
| 1 | SIMPLE | ps10 | ref | products_id | products_id | 12 | kikleding.p.products_id,const,const | 1 | Using where |
| 1 | SIMPLE | ps17 | ref | products_id | products_id | 12 | kikleding.ps.products_id,const,const | 1 | Using where |
| 1 | SIMPLE | pa | ref | idx_products_attributes_products_id | idx_products_attributes_products_id | 4 | kikleding.p2c.products_id | 6 | Using where |
+----+-------------+-------+--------+-------------------------------------+-------------------------------------+---------+------------------------------------------+-------+--------------------------+
Changed the left joins to inner joins like this:
select
count(DISTINCT if(ps10.specification in ('Meisje'),p.products_id,NULL)) as count1,
count(DISTINCT if(ps10.specification in ('Jongen'),p.products_id,NULL)) as count2,
count(DISTINCT if(ps10.specification in ('Unisex'),p.products_id,NULL)) as count3
from (products p)
inner join (products_to_categories p2c)
on (p.products_id = p2c.products_id)
left join (specials s)
on (p.products_id = s.products_id)
inner join (products_attributes pa)
on (p.products_id = pa.products_id)
inner join (products_options_values pv)
on (pa.options_values_id = pv.products_options_values_id)
inner join (products_stock ps)
on (p.products_id=ps.products_id and pv.products_options_values_id = ps.products_options_values_id2)
INNER JOIN products_specifications ps10
ON p.products_id = ps10.products_id
INNER JOIN products_specifications ps17
ON p.products_id = ps17.products_id
where p.products_status = '1'
and ps.products_stock_quantity>0
and p2c.categories_id in (2,54,60,82,109,115,116,118,53,58,104,55,101,75,56,64,66,67,68,69,70,71,84,103,114,80,92,99,93,94,95,97,106)
AND ps10.specifications_id = '10'
AND ps10.language_id = '1'
AND ps17.specification in ('Babyslofjes')
AND ps17.specifications_id = '17'
AND ps17.language_id = '1'
I indexed ps.products_id.
It's a little bit faster, thank you for the comments, but the query is still very slow.

Since p.products_id is used in most of the joins, first make sure that column is indexed in the products table. Then index pv.products_options_values_id, and try indexing the other columns you use in the inner joins. Also try moving the WHERE conditions into the join conditions, especially for the inner joins.

I would go for a slightly modified query to put the conditions into the join from the where parts, and from the sample I'd think you could get rid of the Specials table as well.
select
count(distinct if(ps10.specification in ('Meisje'), p.products_id, null)) as count1,
count(distinct if(ps10.specification in ('Jongen'), p.products_id, null)) as count2,
count(distinct if(ps10.specification in ('Unisex'), p.products_id, null)) as count3
from (products p)
inner join (products_to_categories p2c)
on (p.products_id = p2c.products_id)
inner join (products_attributes pa)
on (p.products_id = pa.products_id)
inner join (products_options_values pv)
on (pa.options_values_id = pv.products_options_values_id)
inner join (products_stock ps)
on (p.products_id=ps.products_id and pv.products_options_values_id = ps.products_options_values_id2 and ps.products_stock_quantity > 0)
inner join products_specifications ps10
ON p.products_id = ps10.products_id and ps10.language_id = '1' and ps10.specifications_id = '10'
inner join products_specifications ps17
ON p.products_id = ps17.products_id and ps17.language_id = '1' and ps17.specifications_id = '17'
where p.products_status = '1'
and p2c.categories_id in (2,54,60,82,109,115,116,118,53,58,104,55,101,75,56,64,66,67,68,69,70,71,84,103,114,80,92,99,93,94,95,97,106)
and ps17.specification in ('Babyslofjes')
As for the indexes, I'd check the following to be available:
products / products_id (most probably is)
products_to_categories / products_id+categories_id (most probably also)
products_attributes / products_id+options_values_id
products_options_values / products_options_values_id
products_specifications / products_id+language_id+specifications_id
From the table names I suspect this is an OS/XTcommerce database. I'll try to get my hands on one in a few hours and give a more detailed opinion. I just don't remember products_stock and products_specifications; those are both tables, not views, right?
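One way to check whether a composite index from the list above is actually picked up is to ask the query planner. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the column types and the index name idx_ps_lookup are assumptions, and on the real database the equivalent check would be EXPLAIN on the full query.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; the column types are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products_specifications (
    products_id       INTEGER,
    language_id       INTEGER,
    specifications_id INTEGER,
    specification     TEXT
);
-- Hypothetical index name; the column order follows the checklist.
CREATE INDEX idx_ps_lookup
    ON products_specifications (products_id, language_id, specifications_id);
""")

# Ask the planner how it would resolve the lookup used in the join condition.
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT specification
    FROM products_specifications
    WHERE products_id = 1 AND language_id = 1 AND specifications_id = 10
""").fetchall()

# The last column of each plan row is a human-readable description.
plan_details = [row[-1] for row in plan]
print(plan_details)
```

If the index name does not show up in the plan, the column order or the predicates probably do not match the way the join filters the table.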

Related

Mysql: Select rows from a join table where 'in' and 'not in' criteria are used

I have 3 tables like below:
Table media:
+------------------------+
| media_id | media_name |
+------------------------+
| 1 | item1 |
| 2 | item2 |
| 3 | item3 |
+------------------------+
Join Table mediatag:
+--------------------+
| media_id | tag_id |
+--------------------+
| 1 | 1 |
| 1 | 2 |
| 2 | 1 |
| 3 | 1 |
| 3 | 3 |
+--------------------+
Table tag:
+--------------------+
| tag_id | tag_name |
+--------------------+
| 1 | blue |
| 2 | red |
| 3 | white |
| 4 | green |
+--------------------+
I want to retrieve all media that have the 'blue' and 'white' tags, excluding any media that have the 'red' tag.
So in my example, the result must be: item2, item3
I tried this query, but item1 is still displayed:
SELECT m.media_id, media_name FROM media AS m
INNER JOIN mediatag AS mag ON m.media_id = mag.media_id
WHERE tag_id = '1' OR tag_id = '3' AND tag_id !='2';
How can I do this?
Group your data and select only those groups having the conditions you mention
SELECT m.media_id, m.media_name
FROM media AS m
INNER JOIN mediatag AS mag ON m.media_id = mag.media_id
GROUP BY m.media_id, m.media_name
HAVING sum(tag_id in (1,3)) > 0
AND sum(tag_id = 2) = 0
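A minimal sketch of the grouping approach above, run against the sample rows from the question. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption, made so the example is self-contained); both engines evaluate a comparison to 0 or 1, so SUM() counts the matching rows in each group. The second condition is a stricter variant, not part of the answer, that requires both tags to be present.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE media (media_id INTEGER PRIMARY KEY, media_name TEXT);
CREATE TABLE mediatag (media_id INTEGER, tag_id INTEGER);
INSERT INTO media VALUES (1, 'item1'), (2, 'item2'), (3, 'item3');
INSERT INTO mediatag VALUES (1, 1), (1, 2), (2, 1), (3, 1), (3, 3);
""")

# One row per media item; the HAVING clause inspects all of its tags at once.
base_query = """
    SELECT m.media_name
    FROM media AS m
    INNER JOIN mediatag AS mt ON m.media_id = mt.media_id
    GROUP BY m.media_id, m.media_name
    HAVING {condition}
    ORDER BY m.media_id
"""

def names(condition):
    """Run the grouped query with the given HAVING condition."""
    return [row[0] for row in conn.execute(base_query.format(condition=condition))]

# Blue (1) or white (3), but never red (2).
either_tag = names("SUM(mt.tag_id IN (1, 3)) > 0 AND SUM(mt.tag_id = 2) = 0")

# Blue and white together, but never red.
both_tags = names(
    "SUM(mt.tag_id = 1) > 0 AND SUM(mt.tag_id = 3) > 0 AND SUM(mt.tag_id = 2) = 0"
)

print(either_tag)  # ['item2', 'item3']
print(both_tags)   # ['item3']
```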
From your desired result, it seems like you want that to actually be blue OR white without red. You can use similar logic but change it to use an OR:
SELECT m.media_id, m.media_name
FROM media AS m
INNER JOIN mediatag AS mt
ON m.media_id = mt.media_id
GROUP BY m.media_id, m.media_name
HAVING (sum(mt.tag_id = 1) > 0 OR sum(mt.tag_id = 3) > 0)
AND sum(mt.tag_id = 2) = 0;
If you didn't want to use the conditional logic in the HAVING clause, you could also write this as a NOT EXISTS query and get the same result:
SELECT DISTINCT m.media_id, m.media_name
FROM media AS m
INNER JOIN mediatag AS mt
ON m.media_id = mt.media_id
WHERE mt.tag_id in (1, 3)
and not exists (SELECT 1
FROM mediatag mt2
WHERE m.media_id = mt2.media_id
and mt2.tag_id = 2);
I would do a left outer join on a sub select of media that matches the rows you want to exclude, then in the where say media_id IS NULL
SELECT *
FROM media AS a
INNER JOIN mediatag AS b ON a.media_id = b.media_id
INNER JOIN tag c ON b.tag_id = c.tag_id AND c.tag_name = 'blue'
LEFT OUTER JOIN (
SELECT a.media_id
FROM media AS a
INNER JOIN mediatag AS b ON a.media_id = b.media_id
INNER JOIN tag c ON b.tag_id = c.tag_id AND c.tag_name = 'red'
) d ON a.media_id = d.media_id
WHERE d.media_id IS NULL;
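The left-join exclusion pattern described above can be tried on the question's sample data. This is a sketch under assumptions: Python's built-in sqlite3 module stands in for MySQL, the aliases are renamed for readability, and only the media name is selected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE media (media_id INTEGER PRIMARY KEY, media_name TEXT);
CREATE TABLE mediatag (media_id INTEGER, tag_id INTEGER);
CREATE TABLE tag (tag_id INTEGER PRIMARY KEY, tag_name TEXT);
INSERT INTO media VALUES (1, 'item1'), (2, 'item2'), (3, 'item3');
INSERT INTO mediatag VALUES (1, 1), (1, 2), (2, 1), (3, 1), (3, 3);
INSERT INTO tag VALUES (1, 'blue'), (2, 'red'), (3, 'white'), (4, 'green');
""")

# Keep media tagged 'blue', then left join the set of media tagged 'red'.
# Rows where the left join found nothing (NULL) are the ones without 'red'.
rows = conn.execute("""
    SELECT m.media_name
    FROM media AS m
    INNER JOIN mediatag AS mt ON m.media_id = mt.media_id
    INNER JOIN tag AS t ON mt.tag_id = t.tag_id AND t.tag_name = 'blue'
    LEFT OUTER JOIN (
        SELECT mt2.media_id
        FROM mediatag AS mt2
        INNER JOIN tag AS t2 ON mt2.tag_id = t2.tag_id AND t2.tag_name = 'red'
    ) AS excluded ON m.media_id = excluded.media_id
    WHERE excluded.media_id IS NULL
    ORDER BY m.media_id
""").fetchall()

without_red = [row[0] for row in rows]
print(without_red)  # ['item2', 'item3']
```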

mysql left join take rows together

I have 3 tables: invoice, person and payement.
I want a list of invoices with the client name (from person) and the sum and dates of the payments (from payement).
First I wrote this statement:
SELECT V.id, V.datum, V.amount, P.name AS 'client',
(SELECT SUM(B.amount) FROM payement AS B WHERE B.invoiceId = V.id) AS 'payed',
(SELECT GROUP_CONCAT(B.datum SEPARATOR ',') FROM payement AS B WHERE B.invoiceId = V.id) AS 'date payement'
FROM invoice AS V
JOIN person AS P ON (V.clientId = P.id)
WHERE YEAR(V.datum) = '2015'
ORDER BY V.datum;
This gives what I want (e.g. a transaction of 1000 on 4 Sept and one of 2400 on 10 Sept), but it runs very slowly when I have a lot of invoices.
+------+-----------+--------+--------+-------+---------------------+
| id | datum | amount | client | payed | date payement |
+------+-----------+--------+--------+-------+---------------------+
| 75 |2015-09-10 | 3400 |Sommers | 3400 |2015-09-04,2015-09-10|
+------+-----------+--------+--------+-------+---------------------+
So I tried another statement.
SELECT V.id, V.datum, V.amount, P.name AS 'client', B.amount AS 'payed', B.datum 'date payement'
FROM invoice AS V
JOIN person AS P ON (V.clientId = P.id)
LEFT JOIN payement AS B ON B.invoiceId = V.id
WHERE YEAR(V.datum) = '2015'
ORDER BY V.datum;
But this gives me 2 rows for 1 invoice when it is paid with 2 transactions.
Can I solve this with SQL, or is it better to solve it in my application (in Java)?
When an invoice has been paid with 2 payments, which details do you wish to use? The first payment or the second?
Assuming that you want the total payment amount and the latest payment date:
SELECT V.id,
V.datum,
V.amount,
P.name AS 'client',
SUM(B.amount) AS 'payed',
MAX(B.datum) AS 'date payement'
FROM invoice AS V
JOIN person AS P ON (V.clientId = P.id)
LEFT OUTER JOIN payement AS B ON B.invoiceId = V.id
WHERE YEAR(V.datum) = '2015'
GROUP BY V.id,
V.datum,
V.amount,
P.name
ORDER BY V.datum
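To see what the grouped query returns, here is a small sketch using the two payments mentioned in the question. Several things are assumptions: Python's built-in sqlite3 module stands in for MySQL, the table definitions are guessed from the column names, invoice 76 is an invented unpaid invoice, and the YEAR() filter is written as a date range because SQLite has no YEAR() function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE invoice (id INTEGER PRIMARY KEY, datum TEXT, amount INTEGER, clientId INTEGER);
CREATE TABLE payement (id INTEGER PRIMARY KEY, invoiceId INTEGER, datum TEXT, amount INTEGER);
INSERT INTO person VALUES (1, 'Sommers');
-- Invoice 76 is hypothetical: it has no payments yet.
INSERT INTO invoice VALUES (75, '2015-09-10', 3400, 1), (76, '2015-09-12', 500, 1);
INSERT INTO payement VALUES (1, 75, '2015-09-04', 1000), (2, 75, '2015-09-10', 2400);
""")

# One row per invoice; the aggregates collapse its payments.
rows = conn.execute("""
    SELECT V.id, V.amount, P.name AS client,
           SUM(B.amount) AS payed,
           COUNT(B.id)   AS payments,
           MAX(B.datum)  AS last_payment
    FROM invoice AS V
    JOIN person AS P ON V.clientId = P.id
    LEFT JOIN payement AS B ON B.invoiceId = V.id
    WHERE V.datum >= '2015-01-01' AND V.datum < '2016-01-01'
    GROUP BY V.id, V.datum, V.amount, P.name
    ORDER BY V.datum
""").fetchall()

for row in rows:
    print(row)
```

The unpaid invoice still appears because of the LEFT JOIN, with NULL for the sum and the last payment date and 0 for the payment count.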
I don't use phpmyadmin.
mysql> EXPLAIN SELECT V.factuurnr,
-> V.datum,
-> V.somexcl,
-> P.naam AS 'client',
-> SUM(B.bedrag) AS 'payed',
-> GROUP_CONCAT(DATE_FORMAT(B.datum,'%d/%m/%y') SEPARATOR ',') AS 'date payement'
-> FROM verkoop AS V
-> JOIN persoon AS P ON (V.klantId = P.id)
-> LEFT JOIN betaling AS B ON B.docId = V.id
-> WHERE YEAR(V.datum) = '2015' and month(V.datum)=9
-> GROUP BY V.factuurnr,
-> V.datum,
-> V.somexcl,
-> P.naam
-> ORDER BY factuurnr;
+----+-------------+-------+--------+---------------+---------+---------+----------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+----------------+------+----------------------------------------------+
| 1 | SIMPLE | V | ALL | NULL | NULL | NULL | NULL | 1576 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | P | eq_ref | PRIMARY | PRIMARY | 4 | meta.V.klantId | 1 | Using where |
| 1 | SIMPLE | B | ALL | NULL | NULL | NULL | NULL | 3291 | |
+----+-------------+-------+--------+---------------+---------+---------+----------------+------+----------------------------------------------+
3 rows in set (0.00 sec)

MySQL Relational Division Query Performance

I have students that are associated many-to-many with groups via a join table groups_students. Each group has a group_type, which can either be a permission_group or not (boolean on group_types table).
I also have users, which are also associated many-to-many with groups via groups_users.
I want to return all students for which a particular user is associated with ALL the student's permission groups.
I've been led to believe this requires relational division and here's where I am with it:
SELECT DISTINCT gs.student_id
FROM groups_students AS gs
INNER JOIN groups ON groups.id = gs.group_id
INNER JOIN groups_users gu ON gu.group_id = groups.id
INNER JOIN group_types ON group_types.id = groups.group_type_id
WHERE group_types.permission_group = 1
AND gu.user_id = 37
AND NOT EXISTS (
SELECT * FROM groups_students AS gs2
WHERE gs2.student_id = gs.student_id
AND NOT EXISTS (
SELECT gu2.group_id
FROM groups_users AS gu2
WHERE gu2.group_id = gs2.group_id AND gu2.user_id = gu.user_id
)
)
This works, but on my live database with over 20,000 rows in groups_students, it takes over 3 seconds.
Can I make it faster? I read about doing relational division with COUNT but I couldn't relate it to my scenario. Am I able to make cheap gains to bring this query well under half a second execution time or am I looking at a major restructure?
Edit - English language description: Students belong to classes (groups), and users have permission to view certain classes. I need to know the students for which a particular user has permission to view all of the student's (permission) classes.
EXPLAIN for the slow query:
+----+--------------------+-------------+--------+--------------------------------------------------------------+--------------------------------------------------+---------+-----------------------------+------+--------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------------+--------+--------------------------------------------------------------+--------------------------------------------------+---------+-----------------------------+------+--------------------------------+
| 1 | PRIMARY | gu | ref | index_groups_users_on_user_id,index_groups_users_on_group_id | index_groups_users_on_user_id | 5 | const | 1181 | Using where; Using temporary |
| 1 | PRIMARY | groups | eq_ref | PRIMARY | PRIMARY | 4 | my_db.gu.group_id | 1 | |
| 1 | PRIMARY | group_types | ALL | PRIMARY | NULL | NULL | NULL | 3 | Using where; Using join buffer |
| 1 | PRIMARY | gs | ref | index_groups_students_on_group_id_and_student_id | index_groups_students_on_group_id_and_student_id | 4 | my_db.groups.id | 9 | Using where; Using index |
| 2 | DEPENDENT SUBQUERY | gs2 | ref | index_groups_students_on_student_id_and_group_id | index_groups_students_on_student_id_and_group_id | 4 | my_db.gs.student_id | 8 | Using where; Using index |
| 3 | DEPENDENT SUBQUERY | gu2 | ref | index_groups_users_on_user_id,index_groups_users_on_group_id | index_groups_users_on_group_id | 5 | my_db.gs2.group_id | 99 | Using where |
+----+--------------------+-------------+--------+--------------------------------------------------------------+--------------------------------------------------+---------+-----------------------------+------+--------------------------------+
"I want to return all students for which a particular user is associated with ALL the student's permission groups."
I don't really follow your query; it seems so complicated for this purpose. Instead, I think of it as follows:
Generate all students and their permissions
Generate all permissions for user 37
(outer) Join these together on permissions
Be sure that all permissions for a particular student are in the u37 group
The resulting query is:
select student_id
from (SELECT gs.student_id, g.id as group_id
FROM groups_students gs INNER JOIN
groups g
ON g.id = gs.group_id INNER JOIN
groups_users gu
ON gu.group_id = g.id INNER JOIN
group_types gt
ON gt.id = g.group_type_id
where gt.permission_group = 1
) s left outer join
(select g.id as group_id
from groups_users gu INNER JOIN
groups g
on gu.group_id = g.id INNER JOIN
group_types gt
ON gt.id = g.group_type_id
where gu.user_id = 37 and gt.permission_group = 1
) u37
on s.group_id = u37.group_id
group by s.student_id
having count(*) = count(u37.group_id);
Note: You can do this without the subqueries. Despite their overhead, I think they make the query much more understandable.
A simpler version of Gordon's idea...
SELECT gs.student_id
FROM groups_students gs
JOIN groups g
ON g.id = gs.group_id
JOIN group_types gt
ON gt.id = g.group_type_id
LEFT
JOIN groups_users gu
ON gu.group_id = gs.group_id
AND gu.user_id = 37
WHERE gt.permission_group
GROUP
BY student_id
HAVING COUNT(student_id) = COUNT(user_id)
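The counting trick in the query above can be checked on a tiny dataset. This is a sketch, not the poster's schema: Python's built-in sqlite3 module stands in for MySQL, the rows are invented, and the table name groups is quoted because it can clash with a keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript('''
CREATE TABLE group_types (id INTEGER PRIMARY KEY, permission_group INTEGER);
CREATE TABLE "groups" (id INTEGER PRIMARY KEY, group_type_id INTEGER);
CREATE TABLE groups_students (group_id INTEGER, student_id INTEGER);
CREATE TABLE groups_users (group_id INTEGER, user_id INTEGER);

-- Type 1 is a permission group type, type 2 is not.
INSERT INTO group_types VALUES (1, 1), (2, 0);
INSERT INTO "groups" VALUES (1, 1), (2, 1), (3, 2);
-- Student 100: groups 1 and 2. Student 101: groups 1 and 3. Student 102: group 2.
INSERT INTO groups_students VALUES (1, 100), (2, 100), (1, 101), (3, 101), (2, 102);
-- User 37 can only see group 1.
INSERT INTO groups_users VALUES (1, 37);
''')

# COUNT(*) counts every permission group of the student;
# COUNT(gu.user_id) counts only those the user is also a member of.
rows = conn.execute('''
    SELECT gs.student_id
    FROM groups_students AS gs
    JOIN "groups" AS g ON g.id = gs.group_id
    JOIN group_types AS gt ON gt.id = g.group_type_id
    LEFT JOIN groups_users AS gu
           ON gu.group_id = gs.group_id AND gu.user_id = 37
    WHERE gt.permission_group = 1
    GROUP BY gs.student_id
    HAVING COUNT(*) = COUNT(gu.user_id)
    ORDER BY gs.student_id
''').fetchall()

allowed_students = [row[0] for row in rows]
print(allowed_students)  # [101]
```

Student 101 qualifies because group 3 is not a permission group and is filtered out before counting. Student 100 fails because user 37 is missing from group 2.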
I don't understand why you use subqueries. They are generally slow and should be avoided if possible. Maybe I do not understand your requirements correctly, but I would come up with something like this:
SELECT DISTINCT gs.student_id
FROM groups_students AS gs
INNER JOIN groups ON groups.id = gs.group_id
INNER JOIN groups_users gu ON gu.group_id = groups.id
INNER JOIN group_types ON group_types.id = groups.group_type_id
LEFT JOIN groups_students AS gs2 ON gs2.student_id = gs.student_id
LEFT JOIN groups_users AS gu2 ON gu2.group_id = gs2.group_id AND gu2.user_id = gu.user_id
WHERE group_types.permission_group = 1
AND gu.user_id = 37
AND gs2.student_id IS NULL
AND gu2.group_id IS NULL
You can force something to not exist by using a left join and checking that a column of the right table (use the primary key) is NULL.

How to select items on table based on 3 relations?

I have 3 tables:
Categories
| id | name
| 1 | Samsung
| 2 | Apple
Products
| id | category_id | name
| 1 | 1 | Galaxy S4
| 2 | 1 | Galaxy S3
| 3 | 1 | SHG-G600
| 4 | 3 | Lumia 920
Tags
| id | product_id | name | type
| 1 | 1 | smart-phone | phoneType
| 2 | 2 | smart-phone | phoneType
| 3 | 3 | normal-cell | phoneType
| 4 | 1 | red | phoneColor
I'm trying to find a way to select all Samsung devices which have 'smart-phone' as 'phoneType' and 'red' as 'phoneColor'.
This is what I have so far:
SELECT *
FROM `products`
INNER JOIN `product_tag` ON `product_tag`.`product_id` = `products`.`id`
INNER JOIN `tags` ON `tags`.`id` = `products`.`id`
WHERE (
`tags`.`type` = 'phoneType'
AND `tags`.`name` = 'smart-phone'
)
OR (
`tags`.`type` = 'phoneColor'
AND `tags`.`name` = 'red'
)
)
This did not work as is (without selecting category).
I also didn't know how to join categories and add where categories.id = 1.
You can do this by putting the logic in the having clause. For your example code:
SELECT p.*
FROM `products` p join
`product_tag` pt
ON pt.`product_id` = p.`id` join
`tags` t
ON t.`id` = p.`id`
group by p.id
having sum(t.`type` = 'caseMaterial' AND t.name = 'leather') > 0 and
sum(t.`type` = 'caseFor' AND t.`name` = 'iphone-5') > 0;
However, I'm not quite sure how this relates to the tables at the beginning of the question. Your code sample and data layout are not consistent.
I extended the solution of @Gordon Linoff by adding the category join.
SELECT p.*
FROM `products` p join
`categories` c
ON c.`id` = p.`category_id` join
`product_tag` pt
ON pt.`product_id` = p.`id` join
`tags` t
ON t.`id` = pt.`tag_id`
where c.id = 1
group by p.id
having sum(t.`type` = 'phoneType' AND t.name = 'smart-phone') > 0 and
sum(t.`type` = 'phoneColor' AND t.`name` = 'red') > 0
This is working now. Thanks to @Gordon Linoff.
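For reference, a runnable sketch of that final query. The question's table listing and its SQL disagree about the schema, so the layout here (a product_tag link table with product_id and tag_id) is an assumption based on the working query, and Python's built-in sqlite3 module stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE products (id INTEGER PRIMARY KEY, category_id INTEGER, name TEXT);
CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT, type TEXT);
-- Assumed link table between products and tags.
CREATE TABLE product_tag (product_id INTEGER, tag_id INTEGER);

INSERT INTO categories VALUES (1, 'Samsung'), (2, 'Apple');
INSERT INTO products VALUES
    (1, 1, 'Galaxy S4'), (2, 1, 'Galaxy S3'), (3, 1, 'SHG-G600'), (4, 3, 'Lumia 920');
INSERT INTO tags VALUES
    (1, 'smart-phone', 'phoneType'), (2, 'normal-cell', 'phoneType'), (3, 'red', 'phoneColor');
INSERT INTO product_tag VALUES (1, 1), (1, 3), (2, 1), (3, 2);
""")

# Each SUM() counts how many of the product's tags satisfy one requirement;
# a product qualifies only when every requirement is met at least once.
rows = conn.execute("""
    SELECT p.id, p.name
    FROM products AS p
    JOIN categories AS c ON c.id = p.category_id
    JOIN product_tag AS pt ON pt.product_id = p.id
    JOIN tags AS t ON t.id = pt.tag_id
    WHERE c.id = 1
    GROUP BY p.id, p.name
    HAVING SUM(t.type = 'phoneType' AND t.name = 'smart-phone') > 0
       AND SUM(t.type = 'phoneColor' AND t.name = 'red') > 0
""").fetchall()

print(rows)  # [(1, 'Galaxy S4')]
```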

join SQL query problem

I have the following query, which returns the product and the lowest sell price found, along with the quantity for that sell price. Everything works perfectly until I want to get a product that does not have any prices in the product_price table. How can I make it return the product data with NULLs for sellPrice and quantity?
SELECT p.*, MIN(pp.sellPrice) as sellPrice, pp.quantity FROM `product` as p
LEFT JOIN `product_price_group` as ppg ON ppg.productId = p.`id`
LEFT JOIN `product_price` as pp ON pp.priceGroupId = ppg.`id`
WHERE p.`id` = 1 AND p.`active` = 1
Output for a product that has a price available:
+----+--------------+--------+--------------+--------------+-----------+----------+
| id | name | active | sortSequence | creationDate | sellPrice | quantity |
+----+--------------+--------+--------------+--------------+-----------+----------+
| 1 | product_id_1 | 1 | 1 | 1287481220 | 22.00 | 10 |
+----+--------------+--------+--------------+--------------+-----------+----------+
Output for a product that does not have pricing available:
+----+------+--------+--------------+--------------+-----------+----------+
| id | name | active | sortSequence | creationDate | sellPrice | quantity |
+----+------+--------+--------------+--------------+-----------+----------+
| NULL | NULL | NULL | NULL | NULL | NULL | NULL |
+----+------+--------+--------------+--------------+-----------+----------+
Desired output:
+----+--------------+--------+--------------+--------------+-----------+----------+
| id | name | active | sortSequence | creationDate | sellPrice | quantity |
+----+--------------+--------+--------------+--------------+-----------+----------+
| 2 | product_id_2 | 1 | 1 | 1287481220 | NULL | NULL |
+----+--------------+--------+--------------+--------------+-----------+----------+
Update
It seems that I was selecting on product items that don't exist! Very stupid.
What about using LEFT OUTER JOIN for product_price table?
SELECT p.*, MIN(pp.sellPrice) as sellPrice, pp.quantity FROM `product` as p
LEFT JOIN `product_price_group` as ppg ON ppg.productId = p.`id`
LEFT OUTER JOIN `product_price` as pp ON pp.priceGroupId = ppg.`id`
WHERE p.`id` = 1 AND p.`active` = 1
Is this what you want?
UPDATE: Revision - Like others say, LEFT JOIN = LEFT (OUTER) JOIN so it will not help you in this case...
I MAY be incorrect, but my understanding of a LEFT JOIN has always been the table reference in the equality test as written in the SQL statement... which is in addition, how I write queries... Start with the table I'm expecting FIRST (left), joined to the OTHER (right) table second... Keep the join condition ALSO respective of that relationship...
select from x left join y where x.fld = y.fld
instead of
select from x left join y where y.fld = x.fld
So I would adjust your query as follows
SELECT
p.*,
MIN(pp.sellPrice) as sellPrice,
pp.quantity
FROM
product as p
LEFT JOIN product_price_group as ppg
ON p.id = ppg.productId
LEFT JOIN product_price as pp
ON ppg.id = pp.priceGroupId
WHERE
p.id = 1
AND p.active = 1
Additionally, you can wrap your min() and quantity in IFNULL( field, 0 ) to show actual zero values instead of NULLs.
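A small sketch of how the left joins and IFNULL behave for the three cases in this thread: a product with prices, a product without prices, and a product id that does not exist. It rests on assumptions: Python's built-in sqlite3 module stands in for MySQL, the rows are invented, only a few columns are kept, and a GROUP BY is added (it is not in the original query) so that a missing product yields no row at all instead of a row of NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, active INTEGER);
CREATE TABLE product_price_group (id INTEGER PRIMARY KEY, productId INTEGER);
CREATE TABLE product_price (
    id INTEGER PRIMARY KEY, priceGroupId INTEGER, sellPrice REAL, quantity INTEGER
);
INSERT INTO product VALUES (1, 'product_id_1', 1), (2, 'product_id_2', 1);
INSERT INTO product_price_group VALUES (1, 1);
INSERT INTO product_price VALUES (1, 1, 22.0, 10), (2, 1, 25.0, 5);
""")

query = """
    SELECT p.id, p.name,
           MIN(pp.sellPrice)            AS sellPrice,
           IFNULL(MIN(pp.sellPrice), 0) AS sellPriceOrZero
    FROM product AS p
    LEFT JOIN product_price_group AS ppg ON ppg.productId = p.id
    LEFT JOIN product_price AS pp ON pp.priceGroupId = ppg.id
    WHERE p.id = ? AND p.active = 1
    GROUP BY p.id, p.name
"""

priced = conn.execute(query, (1,)).fetchall()    # product with prices
unpriced = conn.execute(query, (2,)).fetchall()  # product without prices
missing = conn.execute(query, (99,)).fetchall()  # product id that does not exist

print(priced)
print(unpriced)
print(missing)
```

The product without prices keeps its own columns and gets NULL (or 0 through IFNULL) for the price, which matches the desired output in the question.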