To keep things simple, let's say I have a table representing a pretty simple user feed.
There are two "key" columns in my feed table:
object_id: the ID of an asset, e.g. a comment, a post, etc.
entity_type_id: basically a reference to another table in my DB.
The "children" tables may have some attributes in common, e.g. is_hidden, is_deleted and is_locked (however, not every table has all of them).
Now, I'd like to implement a filter that selects feed items based on the values of these three attributes.
What have I done so far?
SELECT `f`.*
FROM `feed` `f`
WHERE 1
-- !!! Other filters go here. ---
AND
(
--
-- !!! Filter by status
--
( -- "Locked" (not all children tables have this column)
(
`f`.`entity_type_id` = 1 AND `f`.`object_id` IN ( SELECT `fb_comment_id` FROM `comments` WHERE `is_locked` = 1 AND `fb_page_id` IN('0123456789') )
)
OR
(
`f`.`entity_type_id` = 4 AND `f`.`object_id` IN ( SELECT `fb_post_id` FROM `posts` WHERE `is_locked` = 1 AND `fb_page_id` IN('0123456789') )
)
)
OR
( -- "Hidden" (not all children tables have this column)
(
`f`.`entity_type_id` = 1 AND `f`.`object_id` IN ( SELECT `fb_comment_id` FROM `comments` WHERE `is_hidden` = 1 AND `fb_page_id` IN('0123456789') )
)
OR
(
`f`.`entity_type_id` = 4 AND `f`.`object_id` IN ( SELECT `fb_post_id` FROM `posts` WHERE `is_hidden` = 1 AND `fb_page_id` IN('0123456789') )
)
)
OR
(
-- "Deleted"
(
`f`.`entity_type_id` = 1 AND `f`.`object_id` IN ( SELECT `fb_comment_id` FROM `comments` WHERE `is_deleted` = 1 AND `fb_page_id` IN ('0123456789') )
)
OR
(
`f`.`entity_type_id` = 3 AND `f`.`object_id` IN ( SELECT `insta_comment_id` FROM `instagram_comments` WHERE `is_deleted` = 1 AND `insta_profile_id` IN ('9876543210') )
)
OR
(
`f`.`entity_type_id` = 4 AND `f`.`object_id` IN ( SELECT `fb_post_id` FROM `posts` WHERE `is_deleted` = 1 AND `fb_page_id` IN ('0123456789') )
)
OR
(
`f`.`entity_type_id` = 5 AND `f`.`object_id` IN ( SELECT `insta_post_id` FROM `instagram_posts` WHERE `is_deleted` = 1 AND `insta_profile_id` IN ('9876543210') )
)
)
)
As you can see, I'm using subqueries, but I was wondering: is there a better way to write such queries?
I don't know if it's better, but I'd create a subquery that unions the necessary flag fields from your child tables and then just do a regular join to get the flag fields. If a flag field is not present for one of the tables, it can just be false.
Something like:
SELECT `f`.*
FROM `feed` `f`
JOIN
(
SELECT
1 AS `entity_type_id`
, fb_comment_id AS `object_id`
, is_locked
, is_hidden
, is_deleted
FROM
comments
UNION ALL
SELECT
4 AS `entity_type_id`
, fb_post_id AS `object_id`
, is_locked
, is_hidden
, is_deleted
FROM
posts
UNION ALL
SELECT
3 AS `entity_type_id`
, insta_comment_id AS `object_id`
, 0 AS is_locked
, 0 AS is_hidden
, is_deleted
FROM
instagram_comments
UNION ALL
SELECT
5 AS `entity_type_id`
, insta_post_id AS `object_id`
, 0 AS is_locked
, 0 AS is_hidden
, is_deleted
FROM
instagram_posts
) AS flag_summary ON (
flag_summary.entity_type_id = f.entity_type_id
AND flag_summary.object_id = f.object_id
)
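To show how the status filter could sit on top of that join, here is a minimal runnable sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, only two of the child tables, and made-up sample rows, so the data and the reduced schema are assumptions for illustration, not the real tables.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE feed (id INTEGER PRIMARY KEY, entity_type_id INTEGER, object_id TEXT);
CREATE TABLE comments (fb_comment_id TEXT, is_locked INTEGER, is_hidden INTEGER, is_deleted INTEGER);
CREATE TABLE instagram_comments (insta_comment_id TEXT, is_deleted INTEGER);

-- Hypothetical sample rows.
INSERT INTO feed VALUES (1, 1, 'c1'), (2, 1, 'c2'), (3, 3, 'i1'), (4, 3, 'i2');
INSERT INTO comments VALUES ('c1', 1, 0, 0), ('c2', 0, 0, 0);
INSERT INTO instagram_comments VALUES ('i1', 0), ('i2', 1);
""")

# Join the feed to the unioned flag columns, then filter on the flags once.
# Tables without a given flag report it as 0 (false).
rows = conn.execute("""
SELECT f.id
FROM feed f
JOIN (
    SELECT 1 AS entity_type_id, fb_comment_id AS object_id,
           is_locked, is_hidden, is_deleted
    FROM comments
    UNION ALL
    SELECT 3, insta_comment_id, 0, 0, is_deleted
    FROM instagram_comments
) AS flag_summary
  ON flag_summary.entity_type_id = f.entity_type_id
 AND flag_summary.object_id = f.object_id
WHERE flag_summary.is_locked = 1
   OR flag_summary.is_hidden = 1
   OR flag_summary.is_deleted = 1
ORDER BY f.id
""").fetchall()

matching_ids = [row[0] for row in rows]
print(matching_ids)
```

The page/profile restrictions from the original query (fb_page_id, insta_profile_id) are left out here; they would go inside each branch of the union.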
Some tips:
Try to use INNER JOIN instead of subqueries in the WHERE clause. For example, build one derived table that combines all the tables used in the subqueries and apply your filters to it. Do not forget to check your column types with PROCEDURE ANALYSE (deprecated in MySQL 5.7 and removed in 8.0) and to add indexes.
Avoid SELECT *; list the columns you actually need.
Run EXPLAIN on the query to see where it can be improved.
My MySQL query takes 25 seconds to run every time. I split the query up and found that it runs fine without one of the WHERE conditions. The condition causing the problem is:
eshop_products.id IN
(SELECT `product-id`
FROM eshop_productCombinations
WHERE eshop_productCombinations.recomended = 1
GROUP BY `product-id`)
Without this condition the query takes 0.019 sec to run. BUT when I execute this SELECT on its own, it takes only 0.026 sec:
SELECT `product-id`
FROM eshop_productCombinations
WHERE eshop_productCombinations.recomended = 1
GROUP BY `product-id`
Does anyone have any idea what's wrong with my main query? Thank you.
Here's the full query (although I don't think it'd be useful for anybody):
SELECT
CAST(
SUBSTRING_INDEX(
GROUP_CONCAT(
price_with_vat ORDER BY IF(eshop_products_cache.`stock` > 0, 1, 0) DESC,
IF(
eshop_products.`type_default_price`=2,eshop_products_cache.`price_with_vat`,
if(
eshop_products.`type_default_price`=0,eshop_products_cache.`default`,null
)
) DESC,
IF(eshop_products.`type_default_price`=1,eshop_products_cache.`price`, null) ASC
),
",",
1
) AS DECIMAL(10,2)
) AS `price_with_vat`,
SUBSTRING_INDEX(
GROUP_CONCAT(
eshop_products_cache.combination_id ORDER BY IF(eshop_products_cache.`stock` > 0, 1, 0) DESC,
IF(
eshop_products.`type_default_price`=2,
eshop_products_cache.`price_with_vat`,
if(
eshop_products.`type_default_price`=0,
eshop_products_cache.`default`,
null
)
) DESC,
IF(eshop_products.`type_default_price`=1,eshop_products_cache.`price`, null)
ASC
),
",",
1
) AS `combination_id`,
if( eshop_products.id in ('5993', '6144', '6663', '5120', '5376', '5632', '5888', '6400', '6656', '5121', '5377', '5633'), 1, 0) AS new
FROM `eshop_products` LEFT JOIN `eshop_products_cache` ON eshop_products_cache.product_id=eshop_products.`id` WHERE
(
(
(
eshop_products.stockType = 2 AND eshop_products_cache.stock > 0
)
OR eshop_products.stockType <> 2
)
)
AND
(
price_with_vat > 0
)
AND
(
eshop_products.recomended = 1
OR
eshop_products.id IN (
SELECT `product-id` FROM eshop_productCombinations WHERE eshop_productCombinations.recomended = 1 GROUP BY `product-id`
)
)
AND
(
eshop_products.active = '1'
)
AND (dateStartPublish <= NOW() OR dateStartPublish IS NULL)
AND (dateStopPublish >= NOW() OR dateStopPublish IS NULL)
GROUP BY `eshop_products`.`id`, `eshop_products_cache`.`product_id` ORDER BY RAND() ASC LIMIT 5
As suggested by Anthony, the subquery should be replaced with the code below:
EXISTS (
SELECT 1 FROM eshop_productCombinations
WHERE eshop_productCombinations.recomended = 1
AND eshop_productCombinations.`product-id` = eshop_products.id )
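For anyone who wants to check that the EXISTS form selects the same products as the IN form, here is a small sketch. It runs on Python's built-in sqlite3 module rather than MySQL, with a cut-down schema and made-up rows (both are assumptions), so it only illustrates that the two conditions are equivalent, not the speed difference.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE eshop_products (id INTEGER PRIMARY KEY, recomended INTEGER);
CREATE TABLE eshop_productCombinations (`product-id` INTEGER, recomended INTEGER);

-- Hypothetical sample rows.
INSERT INTO eshop_products VALUES (1, 0), (2, 0), (3, 1), (4, 0);
INSERT INTO eshop_productCombinations VALUES (1, 1), (1, 1), (2, 0), (4, 1);
""")

# Original condition: uncorrelated IN subquery.
in_ids = [row[0] for row in conn.execute("""
SELECT p.id
FROM eshop_products p
WHERE p.recomended = 1
   OR p.id IN (
        SELECT pc.`product-id`
        FROM eshop_productCombinations pc
        WHERE pc.recomended = 1
        GROUP BY pc.`product-id`
   )
ORDER BY p.id
""")]

# Suggested replacement: correlated EXISTS subquery.
exists_ids = [row[0] for row in conn.execute("""
SELECT p.id
FROM eshop_products p
WHERE p.recomended = 1
   OR EXISTS (
        SELECT 1
        FROM eshop_productCombinations pc
        WHERE pc.recomended = 1
          AND pc.`product-id` = p.id
   )
ORDER BY p.id
""")]

print(in_ids, exists_ids)
```

Note that the column name has to stay in backticks in both forms; without them, product-id is parsed as product minus id.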
I have a table similar to this simplified version:
CREATE TABLE `accounts` (
`id` int(11) NOT NULL,
`account_type_id` int(10) NOT NULL,
`type` varchar(10) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `accounts` VALUES (1,1,'single'),(2,1,'single'),(3,1,'single'),(4,1,'single'),(5,1,'single'),(6,1,'single'),(7,1,'single'),(8,1,'single'),(9,1,'single'),(10,2,'single'),(11,2,'single'),(12,2,'single'),(13,2,'single'),(14,2,'single'),(15,2,'single'),(16,2,'single'),(17,2,'single'),(18,2,'single'),(19,2,'single'),(20,2,'single'),(21,1,'joint'),(22,1,'joint'),(23,1,'joint'),(24,1,'joint'),(25,1,'joint'),(26,1,'joint'),(27,1,'joint'),(28,1,'joint'),(29,1,'joint'),(30,1,'joint'),(31,2,'joint'),(32,2,'joint'),(33,2,'joint'),(34,2,'joint'),(35,2,'joint'),(36,2,'joint'),(37,2,'joint'),(38,2,'joint'),(39,2,'joint'),(40,2,'joint'),(41,3,'single'),(42,3,'single'),(43,3,'single'),(44,3,'single'),(45,3,'single'),(46,3,'single'),(47,3,'single'),(48,3,'single'),(49,3,'single'),(50,3,'single'),(51,3,'single'),(52,3,'single'),(53,3,'single'),(54,3,'single'),(55,3,'single'),(56,3,'single'),(57,3,'single'),(58,3,'single'),(59,3,'single'),(60,3,'single'),(61,3,'joint'),(62,3,'joint'),(63,3,'joint'),(64,3,'joint'),(65,3,'joint'),(66,3,'joint'),(67,3,'joint'),(68,3,'joint'),(69,3,'joint'),(70,3,'joint'),(71,3,'joint'),(72,3,'joint'),(73,3,'joint'),(74,3,'joint'),(75,3,'joint'),(76,3,'joint'),(77,3,'joint'),(78,3,'joint'),(79,3,'joint'),(80,3,'joint');
I want to keep:
random 5x type = single, account_type_id = 1 or 2
random 5x type = joint, account_type_id = 1 or 2
random 5x type = single, account_type_id = 3
random 5x type = joint, account_type_id = 3
My approach was to get the ids of 5 records matching each of the above, and then delete everything else.
(SELECT id FROM accounts WHERE account_type_id IN (1, 2) AND `type` = 'single' ORDER BY RAND() LIMIT 5)
UNION
(SELECT id FROM accounts WHERE account_type_id IN (1, 2) AND `type` = 'joint' ORDER BY RAND() LIMIT 5)
UNION
(SELECT id FROM accounts WHERE account_type_id = 3 AND `type` = 'single' ORDER BY RAND() LIMIT 5)
UNION
(SELECT id FROM accounts WHERE account_type_id = 3 AND `type` = 'joint' ORDER BY RAND() LIMIT 5)
This correctly returns 5 ids of each required type. However, if I try and use that resultset directly in a WHERE id NOT IN (...) then I get an error (I've replaced DELETE with SELECT for the example):
SELECT * FROM accounts WHERE id NOT IN(
(SELECT id FROM accounts WHERE account_type_id IN (1, 2) AND `type` = 'single' ORDER BY RAND() LIMIT 5)
UNION
(SELECT id FROM accounts WHERE account_type_id IN (1, 2) AND `type` = 'joint' ORDER BY RAND() LIMIT 5)
UNION
(SELECT id FROM accounts WHERE account_type_id = 3 AND `type` = 'single' ORDER BY RAND() LIMIT 5)
UNION
(SELECT id FROM accounts WHERE account_type_id = 3 AND `type` = 'joint' ORDER BY RAND() LIMIT 5)
);
Error Code: 1064. You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'UNION (SELECT id FROM accounts WHERE account_type_id IN (1, 2) AND `type` = 'j' at line 3
If I then add an intermediary subquery as follows:
SELECT * FROM accounts WHERE id NOT IN(
SELECT a.id FROM (
(SELECT id FROM accounts WHERE account_type_id IN (1, 2) AND `type` = 'single' ORDER BY RAND() LIMIT 5)
UNION
(SELECT id FROM accounts WHERE account_type_id IN (1, 2) AND `type` = 'joint' ORDER BY RAND() LIMIT 5)
UNION
(SELECT id FROM accounts WHERE account_type_id = 3 AND `type` = 'single' ORDER BY RAND() LIMIT 5)
UNION
(SELECT id FROM accounts WHERE account_type_id = 3 AND `type` = 'joint' ORDER BY RAND() LIMIT 5)
) a
);
I get the result I want... please could someone explain why the extra query is necessary?
NOT IN expects a set of values, for example:
id NOT IN (1, 2, 3, 4, 5, ...)
In your first query, NOT IN is followed directly by the parenthesized UNION queries, and the parser does not treat that as a set of values, hence the syntax error.
If you add the extra subquery that selects a.id, its result is already a single set of ids,
so NOT IN (those ids) returns the right result.
I hope you see what I mean.
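Here is a rough runnable sketch of the "keep 5 random rows per group, delete the rest" idea with the unions wrapped in one derived table. It uses Python's built-in sqlite3 module instead of MySQL, so some details are assumptions: RANDOM() replaces RAND(), and each limited SELECT sits in its own subquery because SQLite does not accept parenthesized SELECTs inside a compound query. The rows follow the same distribution as the sample data in the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, account_type_id INTEGER NOT NULL, type TEXT NOT NULL)"
)

# Same id ranges, account types and kinds as the INSERT in the question.
groups = [
    (range(1, 10), 1, "single"),
    (range(10, 21), 2, "single"),
    (range(21, 31), 1, "joint"),
    (range(31, 41), 2, "joint"),
    (range(41, 61), 3, "single"),
    (range(61, 81), 3, "joint"),
]
for ids, account_type_id, kind in groups:
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?, ?)",
        [(i, account_type_id, kind) for i in ids],
    )

# The outer "SELECT a.id FROM (...) a" turns the four unioned picks into
# one set of ids that NOT IN can consume.
conn.execute("""
DELETE FROM accounts WHERE id NOT IN (
    SELECT a.id FROM (
        SELECT id FROM (SELECT id FROM accounts
                        WHERE account_type_id IN (1, 2) AND type = 'single'
                        ORDER BY RANDOM() LIMIT 5)
        UNION ALL
        SELECT id FROM (SELECT id FROM accounts
                        WHERE account_type_id IN (1, 2) AND type = 'joint'
                        ORDER BY RANDOM() LIMIT 5)
        UNION ALL
        SELECT id FROM (SELECT id FROM accounts
                        WHERE account_type_id = 3 AND type = 'single'
                        ORDER BY RANDOM() LIMIT 5)
        UNION ALL
        SELECT id FROM (SELECT id FROM accounts
                        WHERE account_type_id = 3 AND type = 'joint'
                        ORDER BY RANDOM() LIMIT 5)
    ) AS a
)
""")

# Count what is left per bucket to confirm that 5 rows of each kind survive.
kept = conn.execute("""
SELECT CASE WHEN account_type_id IN (1, 2) THEN '1-2' ELSE '3' END AS bucket,
       type, COUNT(*)
FROM accounts
GROUP BY bucket, type
ORDER BY bucket, type
""").fetchall()
print(kept)
```

Which ids survive changes on every run because of the random ordering; only the per-group counts are stable.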