This query is processing and running but it is completely ignoring the NOT IN section
SELECT * FROM `offers` as `o` WHERE `o`.country_iso = '$country_iso' AND `o`.`id`
not in (select distinct(offer_id) from aff_disabled_offers
where offer_id = 'o.id' and user_id = '1') ORDER by rand() LIMIT 7
Maybe your "not in" query returns nothing.
Shouldn't the
where offer_id='o.id'
Be
where offer_id=o.id
?
guido has the answer... it looks like you meant to create a correlated subquery. 'o.id' is being seen as a literal.
SOME CAUTIONS:
You usually want some sort of guarantee that the subquery in the NOT IN predicate does NOT return a NULL value. If you don't have that guarantee enforced from the database, adding a WHERE/HAVING return_expr IS NOT NULL in the subquery is sufficient to give you that guarantee.
That correlated subquery is going to eat your lunch, performance wise, on large sets. As will that ORDER BY rand().
Generally, an anti-join pattern turns out to be much more efficient on large sets:
SELECT o.*
FROM offers o
LEFT
JOIN aff_disabled_offers d
ON d.user_id = '1'
AND d.offer_id = o.id
WHERE d.offer_id IS NULL
AND o.country_iso = '$country_iso'
ORDER BY rand()
LIMIT 7
Related
I am struggling with a mysql problem.
I have two exact same queries, just the item_id at the end is different, but they return different execution plans when I execute them with analyze/explain.
This results in a huge difference of time needed to return a result.
The query is something like
explain select `orders`.*,
(select count(*) from `items`
inner join `orders_items` on `items`.`id` = `orders_items`.`item_id`
where `orders`.`id` = `orders_items`.`order_id`
and `items`.`deleted_at` is null) as `items_count`,
(select count(*) from `returns`
where `orders`.`id` = `returns`.`order_id`
and `returns`.`deleted_at` is null) as `returns_count`,
(select count(*) from `shippings`
where `orders`.`id` = `shippings`.`order_id`
and `shippings`.`deleted_at` is null) as `shippings_count`,
(select count(*) from `orders` as `laravel_reserved_2`
where `orders`.`id` = `laravel_reserved_2`.`recurred_from_id`
and `laravel_reserved_2`.`deleted_at` is null) as `recurred_orders_count`,
(select COALESCE(SUM(orders_items.amount), 0) from `items`
inner join `orders_items` on `items`.`id` = `orders_items`.`item_id`
where `orders`.`id` = `orders_items`.`order_id`
and `items`.`deleted_at` is null) as `items_sum_orders_itemsamount`,
`orders`.*,
`orders_items`.`item_id` as `pivot_item_id`,
`orders_items`.`order_id` as `pivot_order_id`,
`orders_items`.`amount` as `pivot_amount`,
`orders_items`.`id` as `pivot_id`
from `orders`
inner join `orders_items` on `orders`.`id` = `orders_items`.`order_id`
where `orders_items`.`item_id` = 497
and `import_finished` = 1
and `orders`.`deleted_at` is null
order by `id` desc limit 50 offset 0;
As you can see it is a laravel/eloquent query.
This is the execution plan for the query above:
But when I change the item_id at the end it return the following execution plan
It is absolutely random. 30% of the item_id's return the faster one and 70% return the slower one and I have no idea why. The related data is almost the same for every item we have in our database.
I also flushed the query cache to see if this was causing the different exec plans but no success.
I am googlin' since 4 hours but I can't find anything about this exact problem.
Can someone of you guys tell me why this hapens?
Thanks in advance!!
Edit 01/21/2023 - 19:04:
Seems like mysql don't like to order by columns which are not defined in the related where clause, in this case the pivot table orders_items.
I just replaced the
order by id
with
order by orders_items.order_id
This results in a 10 times faster query.
A query using different execution plans just because of a different parameter can have several reasons. The simplest explanation would be the position of the used item_id in the relevant index. The position in the index may affect the cost of using the index which in turn may affect if it is used at all. (this is just an example)
It is important to note that the explain statement will give you the planned execution plan but maybe not the actually used one.
EXPLAIN ANALYZE is the command which will output the actually used execution plan for you. It may still yield different results for different parameters.
ON `orders`.`id` = `orders_items`.`order_id`
where `orders_items`.`item_id` = 497
and ???.`import_finished` = 1
and `orders`.`deleted_at` is null
order by ???.`id` desc
limit 50 offset 0;
One order maps to many order items, correct?
So, ORDER BY orders.id is different than ORDER BY orders_items.item_id.
Does item_id = 497 show up in many different orders?
So, think about which ORDER BY you really want.
Meanwhile, these may help with performance:
orders_items: INDEX(order_id, item_id)
returns: INDEX(order_id, deleted_at)
shippings: INDEX(order_id, deleted_at)
laravel_reserved_2: INDEX(recurred_from_id, deleted_at)
How can I turn this subquery in a JOIN?
I read that subqueries are slower than JOINs.
SELECT
reklamation.id,
reklamation.titel,
(
SELECT reklamation_status.status
FROM reklamation_status
WHERE reklamation_status.id_reklamation = reklamation.id
ORDER BY reklamation_status.id DESC
LIMIT 1
) as status
FROM reklamation
WHERE reklamation.aktiv=1
This should do it:
SELECT r.id, r.titel, MAX(s.id) as status
FROM reklamation r
LEFT JOIN reklamation_status s ON s.id_reklamation = r.id
WHERE r.aktiv = 1
GROUP BY r.id, r.titel
The key point here is to use aggregation to manage the cardinality between reklamation and reklamation_status. In your original code, the inline subquery uses ORDER BY reklamation_status.id DESC LIMIT 1 to return the highest id in reklamation_status that corresponds to the current reklamation. Without aggregation, we would probably get multiple records in the resultset for each reklamation (one for each corresponding reklamation_status).
Another thing is to consider is the type of the JOIN. INNER JOIN would filter out records of reklamations that do not have a reklamation_status: the original query with the inline subquery does not behave like that, so I chose LEFT JOIN. If you can guarantee that every reklamation has at least one child in reklamation_status, you can safely switch back to INNER JOIN (which might perform more efficiently).
PS:
I read that subqueries are slower than JOINs.
This is not a universal truth. It depends on many factors and cannot be told without seeing your exact use case.
Using JOIN query can be rewritten to:
SELECT reklamation.id, reklamation.titel, reklamation_status.status
FROM reklamation
JOIN reklamation_status ON reklamation_status.id_reklamation = reklamation.id
WHERE reklamation.aktiv=1
What you read is incorrect. Subqueries can be slower, faster, or the same as joins. I would write the query as:
SELECT r.id, r.titel,
(SELECT rs.status
FROM reklamation_status rs
WHERE rs.id_reklamation = r.id
ORDER BY rs.id DESC
LIMIT 1
) as status
FROM reklamation r
WHERE r.aktiv = 1;
For performance, you want an index on reklamation_status(id_reklamation, id, status).
By the way, this is a case where the subquery is probably the fastest method for expressing this query. If you attempt a JOIN, then you need some sort of aggregation to get the most recent status. That can be expensive.
SELECT LM.user_id,LM.users_lineup_id, min( LM.total_score ) AS total_score
FROM vi_lineup_master LM JOIN
vi_contest AS C
ON C.contest_unique_id = LM.contest_unique_id join
(SELECT min( total_score ) as total_score
FROM vi_lineup_master
GROUP BY group_unique_id
) as preq
ON LM.total_score = preq.total_score
WHERE LM.contest_unique_id = 'iledhSBDO' AND
C.league_contest_type = 1
GROUP BY group_unique_id
Above query is to find the loser per group of game, query return accurate result but its not responding with large data. How can I optimize this?
You can try to move your JOINs to subqueries. Also, you should pay attention on your "wrong" GROUP BY usage on the outer query. In Mysql you can group by some columns and select others not specified in the group clause without any aggregation function, but the database can't ensure what data it will return to you. For the sake of consistency of your application, wrap them in an aggregation function.
Check if this one helps:
SELECT
MIN(LM.user_id) AS user_id,
MIN(LM.users_lineup_id) AS users_lineup_id,
MIN(LM.total_score) AS total_score
FROM vi_lineup_master LM
WHERE 1=1
-- check if this "contest_unique_id" is equals
-- to 'iledhSBDO' for a "league_contest_type" valued 1
AND LM.contest_unique_id IN
(
SELECT C.contest_unique_id
FROM vi_contest AS C
WHERE 1=1
AND C.contest_unique_id = 'iledhSBDO'
AND C.league_contest_type = 1
)
-- check if this "total_score" is one of the
-- "min(total_score)" from each "group_unique_id"
AND LM.total_score IN
(
SELECT MIN(total_score)
FROM vi_lineup_master
GROUP BY group_unique_id
)
GROUP BY LM.group_unique_id
;
Also, some pieces of this query may seem redundant, but it's because I did not want to change the filters you wrote, just moved them.
Also, your query logic seems a bit strange to me, based on the tables/columns names and how you wrote it... please, check the comments in my query which reflects what I understood of your implementation.
Hope it helps.
I have SQL query with LEFT JOIN:
SELECT COUNT(stn.stocksId) AS count_stocks
FROM MedicalFacilities AS a
LEFT JOIN stocks stn ON
(stn.stocksIdMF = ( SELECT b.MedicalFacilitiesIdUser
FROM medicalfacilities AS b
WHERE b.MedicalFacilitiesIdUser = a.MedicalFacilitiesIdUser
ORDER BY stn.stocksId DESC LIMIT 1)
AND stn.stocksEndDate >= UNIX_TIMESTAMP() AND stn.stocksStartDate <= UNIX_TIMESTAMP())
These query I want to select one row from table stocks by conditions and with field equal value a.MedicalFacilitiesIdUser.
I get always count_stocks = 0 in result. But I need to get 1
The count(...) aggregate doesn't count null, so its argument matters:
COUNT(stn.stocksId)
Since stn is your right hand table, this will not count anything if the left join misses. You could use:
COUNT(*)
which counts every row, even if all its columns are null. Or a column from the left hand table (a) that is never null:
COUNT(a.ID)
Your subquery in the on looks very strange to me:
on stn.stocksIdMF = ( SELECT b.MedicalFacilitiesIdUser
FROM medicalfacilities AS b
WHERE b.MedicalFacilitiesIdUser = a.MedicalFacilitiesIdUser
ORDER BY stn.stocksId DESC LIMIT 1)
This is comparing MedicalFacilitiesIdUser to stocksIdMF. Admittedly, you have no sample data or data layouts, but the naming of the columns suggests that these are not the same thing. Perhaps you intend:
on stn.stocksIdMF = ( SELECT b.stocksId
-----------------------------^
FROM medicalfacilities AS b
WHERE b.MedicalFacilitiesIdUser = a.MedicalFacilitiesIdUser
ORDER BY b.stocksId DESC
LIMIT 1)
Also, ordering by stn.stocksid wouldn't do anything useful, because that would be coming from outside the subquery.
Your subquery seems redundant and main query is hard to read as much of the join statements could be placed in where clause. Additionally, original query might have a performance issue.
Recall WHERE is an implicit join and JOIN is an explicit join. Query optimizers
make no distinction between the two if they use same expressions but readability and maintainability is another thing to acknowledge.
Consider the revised version (notice I added a GROUP BY):
SELECT COUNT(stn.stocksId) AS count_stocks
FROM MedicalFacilities AS a
LEFT JOIN stocks stn ON stn.stocksIdMF = a.MedicalFacilitiesIdUser
WHERE stn.stocksEndDate >= UNIX_TIMESTAMP()
AND stn.stocksStartDate <= UNIX_TIMESTAMP()
GROUP BY stn.stocksId
ORDER BY stn.stocksId DESC
LIMIT 1
The following query hangs: (although subqueries perfomed separately are fine)
I don't know how to make the explain table look ok. If someone tells me, I'll clean it up.
select
sum(grades.points)) as p,
from assignments
left join grades using (assignmentID)
where gradeID IN
(select grades.gradeID
from assignments
left join grades using (assignmentID)
where ... grades.date <= '1255503600' AND grades.date >= '984902400'
group by assignmentID order by grades.date DESC);
I think the problem is with the first grades table... the type ALL with that many rows seems to be the cause.. Everything is indexed.
I uploaded the table as an image. Couldn't get the formatting right:
http://imgur.com/AjX34.png
A commenter wanted the full where clause:
explain extended select count(assignments.assignmentID) as asscount, sum(TRIM(TRAILING '-' FROM grades.points)) as p, sum(assignments.points) as t
from assignments left join grades using (assignmentID)
where gradeID IN
(select grades.gradeID from assignments left join grades using (assignmentID) left join as_types on as_types.ID = assignments.type
where assignments.classID = '7815'
and (assignments.type = 30170 )
and grades.contactID = 7141
and grades.points REGEXP '^[-]?[0-9]+[-]?'
and grades.points != '-'
and grades.points != ''
and (grades.pointsposs IS NULL or grades.pointsposs = '')
and grades.date <= '1255503600'
AND grades.date >= '984902400'
group by assignmentID
order by grades.date DESC);
See "The unbearable slowness of IN":
http://www.artfulsoftware.com/infotree/queries.php#568
Super messy, but: (thanks for everyone's help)
SELECT *
FROM grades
LEFT JOIN assignments ON grades.assignmentID = assignments.assignmentID
RIGHT JOIN (
SELECT g.gradeID
FROM assignments a
LEFT JOIN grades g
USING ( assignmentID )
WHERE a.classID = '7815'
AND (
a.type =30170
)
AND g.contactID =7141
g.points
REGEXP '^[-]?[0-9]+[-]?'
AND g.points != '-'
AND g.points != ''
AND (
g.pointsposs IS NULL
OR g.pointsposs = ''
)
AND g.date <= '1255503600'
AND g.date >= '984902400'
GROUP BY assignmentID
ORDER BY g.date DESC
) AS t1 ON t1.gradeID = grades.gradeID
Suppose you use a Real Database (ie, any database except MySQL, but I'll use Postgres as an example) to do this query :
SELECT * FROM ta WHERE aid IN (SELECT subquery)
a Real Database would look at the subquery and estimate its rowcount :
If the rowcount is small (say, less than a few millions)
It would run the subquery, then build an in-memory hash of ids, which also makes them unique, which is a feature of IN().
Then, if the number of rows pulled from ta is a small part of ta, it would use a suitable index to pull the rows. Or, if a major part of the table is selected, it would just scan it entirely, and lookup each id in the hash, which is very fast.
If however the subquery rowcount is quite large
The database would probably rewrite it as a merge JOIN, adding a Sort+Unique to the subquery.
However, you are using MySQL. In this case, it will not do any of this (it is gonna re-execute the subquery for each row of your table) so it will take 1000 years. Sorry.
If your subquery performs fine when it is executed separately, then try using a JOIN rather than IN, like this:
select count(assignments.assignmentID) as asscount, sum(TRIM(TRAILING '-' FROM grades.points)) as p, sum(assignments.points) as t
from assignments left join grades using (assignmentID)
join
(select grades.gradeID from assignments left join grades using (assignmentID) left join as_types on as_types.ID = assignments.type
where assignments.classID = '7815'
and (assignments.type = 30170 )
and grades.contactID = 7141
and grades.points REGEXP '^[-]?[0-9]+[-]?'
and grades.points != '-'
and grades.points != ''
and (grades.pointsposs IS NULL or grades.pointsposs = '')
and grades.date <= '1255503600'
AND grades.date >= '984902400'
group by assignmentID
order by grades.date DESC) using (gradeID);
There really isn't enough information to answer your question, and you've put a ... in the middle of the where clause which is weird. How big are the tables involved and what are the indexes?
Having said that, if there are too many terms in an in clause, you can see seriously degraded performance. Replace the use of in with a right join.
For starters, the table as_types in the in clause is not used. Left joining it serves no purpose so get rid of it.
That leaves the in clause having only the assignments and grades table from the outer query. Clearly the wheres the modify assignments belong in the where clause for the outer query. You should move all of the where grades=whatever into the on clause of the left join to grades.
The query is a little tough to follow, but I suspect that the subquery isn't necessary at all.
It seems like your query is basically thus:
SELECT FOO()
FROM assignments LEFT JOIN grades USING (assignmentID)
WHERE gradeID IN
(
SELECT grades.gradeID
FROM assignments LEFT JOIN grades USING (assignmentID)
WHERE your_conditions = TRUE
);
But, you're not doing anything really fancy in the where clause in the subquery.
I suspect something more like
SELECT FOO()
FROM assignments LEFT JOIN grades USING (assignmentID)
GROUP BY groupings
WHERE your_conditions_with_some_tweaks = TRUE;
would work just as well.
If I'm missing some key logic here please comment back and I'll edit/delete this post.