SQL query needs optimization

SQL query needs optimization - mysql

SELECT LM.user_id,LM.users_lineup_id, min( LM.total_score ) AS total_score
FROM vi_lineup_master LM JOIN
vi_contest AS C
ON C.contest_unique_id = LM.contest_unique_id join
(SELECT min( total_score ) as total_score
FROM vi_lineup_master
GROUP BY group_unique_id
) as preq
ON LM.total_score = preq.total_score
WHERE LM.contest_unique_id = 'iledhSBDO' AND
C.league_contest_type = 1
GROUP BY group_unique_id
The query above finds the loser in each group of a game. It returns accurate results, but it does not respond when the data set is large. How can I optimize it?

You can try moving your JOINs into subqueries. Also, pay attention to the "wrong" GROUP BY usage in the outer query. In MySQL (when the ONLY_FULL_GROUP_BY SQL mode is disabled) you can group by some columns and select others that are neither in the GROUP BY clause nor wrapped in an aggregate function, but the database cannot guarantee which row's values it will return. For the sake of consistency in your application, wrap them in an aggregate function.
Check if this one helps:
SELECT
MIN(LM.user_id) AS user_id,
MIN(LM.users_lineup_id) AS users_lineup_id,
MIN(LM.total_score) AS total_score
FROM vi_lineup_master LM
WHERE 1=1
-- check if this "contest_unique_id" is equals
-- to 'iledhSBDO' for a "league_contest_type" valued 1
AND LM.contest_unique_id IN
(
SELECT C.contest_unique_id
FROM vi_contest AS C
WHERE 1=1
AND C.contest_unique_id = 'iledhSBDO'
AND C.league_contest_type = 1
)
-- check if this "total_score" is one of the
-- "min(total_score)" from each "group_unique_id"
AND LM.total_score IN
(
SELECT MIN(total_score)
FROM vi_lineup_master
GROUP BY group_unique_id
)
GROUP BY LM.group_unique_id
;
Some pieces of this query may seem redundant, but that is because I did not want to change the filters you wrote; I just moved them.
Your query logic also seems a bit strange to me, based on the table/column names and how you wrote it. Please check the comments in my query, which reflect what I understood of your implementation.
Hope it helps.
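If the intent really is one loser per group within a single contest, one possible reading is to join on both the group and its minimum score, rather than on the score alone. The following is only a sketch under assumptions: it runs through Python's sqlite3 module so that it is self-contained, the sample rows are invented, and it assumes contest_unique_id and group_unique_id are both columns of vi_lineup_master. MySQL may plan the same SQL differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal schema and rows; only the column names come from the question.
conn.executescript("""
CREATE TABLE vi_contest (contest_unique_id TEXT, league_contest_type INTEGER);
CREATE TABLE vi_lineup_master (
    users_lineup_id INTEGER, user_id INTEGER, contest_unique_id TEXT,
    group_unique_id TEXT, total_score REAL);
INSERT INTO vi_contest VALUES ('iledhSBDO', 1), ('other', 1);
INSERT INTO vi_lineup_master VALUES
    (1, 10, 'iledhSBDO', 'g1', 50),
    (2, 11, 'iledhSBDO', 'g1', 30),
    (3, 12, 'iledhSBDO', 'g2', 70),
    (4, 13, 'iledhSBDO', 'g2', 90),
    (5, 14, 'other',     'g3', 5);
""")

# The derived table computes the minimum per group for one contest only,
# and the join matches on group AND score, so a low score in another
# group cannot produce a false match.
LOSER_PER_GROUP = """
SELECT LM.group_unique_id, LM.user_id, LM.users_lineup_id, LM.total_score
FROM vi_lineup_master AS LM
JOIN vi_contest AS C
  ON C.contest_unique_id = LM.contest_unique_id
JOIN (
    SELECT group_unique_id, MIN(total_score) AS total_score
    FROM vi_lineup_master
    WHERE contest_unique_id = ?
    GROUP BY group_unique_id
) AS preq
  ON preq.group_unique_id = LM.group_unique_id
 AND preq.total_score = LM.total_score
WHERE LM.contest_unique_id = ?
  AND C.league_contest_type = 1
ORDER BY LM.group_unique_id
"""

losers = conn.execute(LOSER_PER_GROUP, ("iledhSBDO", "iledhSBDO")).fetchall()
print(losers)
```

Note that ties on the minimum score return more than one row per group. An index covering (contest_unique_id, group_unique_id, total_score) would be a natural thing to try for the derived table, though that is untested here.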

Related

Two same mysql queries with different execution plans

I am struggling with a mysql problem.
I have two identical queries; only the item_id at the end differs, but they produce different execution plans when I run them with ANALYZE/EXPLAIN.
This results in a huge difference of time needed to return a result.
The query is something like
explain select `orders`.*,
(select count(*) from `items`
inner join `orders_items` on `items`.`id` = `orders_items`.`item_id`
where `orders`.`id` = `orders_items`.`order_id`
and `items`.`deleted_at` is null) as `items_count`,
(select count(*) from `returns`
where `orders`.`id` = `returns`.`order_id`
and `returns`.`deleted_at` is null) as `returns_count`,
(select count(*) from `shippings`
where `orders`.`id` = `shippings`.`order_id`
and `shippings`.`deleted_at` is null) as `shippings_count`,
(select count(*) from `orders` as `laravel_reserved_2`
where `orders`.`id` = `laravel_reserved_2`.`recurred_from_id`
and `laravel_reserved_2`.`deleted_at` is null) as `recurred_orders_count`,
(select COALESCE(SUM(orders_items.amount), 0) from `items`
inner join `orders_items` on `items`.`id` = `orders_items`.`item_id`
where `orders`.`id` = `orders_items`.`order_id`
and `items`.`deleted_at` is null) as `items_sum_orders_itemsamount`,
`orders`.*,
`orders_items`.`item_id` as `pivot_item_id`,
`orders_items`.`order_id` as `pivot_order_id`,
`orders_items`.`amount` as `pivot_amount`,
`orders_items`.`id` as `pivot_id`
from `orders`
inner join `orders_items` on `orders`.`id` = `orders_items`.`order_id`
where `orders_items`.`item_id` = 497
and `import_finished` = 1
and `orders`.`deleted_at` is null
order by `id` desc limit 50 offset 0;
As you can see it is a laravel/eloquent query.
This is the execution plan for the query above:
But when I change the item_id at the end, it returns the following execution plan:
It seems absolutely random. 30% of the item_ids get the faster plan and 70% get the slower one, and I have no idea why. The related data is almost the same for every item we have in our database.
I also flushed the query cache to see whether it was causing the different execution plans, but with no success.
I have been googling for 4 hours but can't find anything about this exact problem.
Can anyone tell me why this happens?
Thanks in advance!!
Edit 01/21/2023 - 19:04:
It seems MySQL doesn't like ordering by columns that do not belong to the table filtered in the WHERE clause, in this case the pivot table orders_items.
I just replaced the
order by id
with
order by orders_items.order_id
This results in a 10 times faster query.

A query can end up with different execution plans for different parameter values for several reasons. The simplest explanation would be the position of the given item_id in the relevant index: that position may affect the cost of using the index, which in turn may affect whether the index is used at all (this is just an example).
It is important to note that the EXPLAIN statement gives you the planned execution plan, which is not necessarily the one actually used.
EXPLAIN ANALYZE is the command that outputs the execution plan actually used. It may still yield different results for different parameters.

ON `orders`.`id` = `orders_items`.`order_id`
where `orders_items`.`item_id` = 497
and ???.`import_finished` = 1
and `orders`.`deleted_at` is null
order by ???.`id` desc
limit 50 offset 0;
One order maps to many order items, correct?
So, ORDER BY orders.id is different than ORDER BY orders_items.item_id.
Does item_id = 497 show up in many different orders?
So, think about which ORDER BY you really want.
Meanwhile, these may help with performance:
orders_items: INDEX(order_id, item_id)
returns: INDEX(order_id, deleted_at)
shippings: INDEX(order_id, deleted_at)
orders (aliased as laravel_reserved_2): INDEX(recurred_from_id, deleted_at)
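As a rough sketch, the suggested composite indexes could be written as plain CREATE INDEX statements. The index names and the stripped-down table definitions below are hypothetical, and the script runs against an in-memory SQLite database through Python only so that the DDL can be checked; the CREATE INDEX statements themselves use syntax MySQL also accepts. Note that laravel_reserved_2 is just an alias for orders in the query, so that index goes on orders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical minimal tables; only the columns used in the query are modelled.
CREATE TABLE orders (id INTEGER PRIMARY KEY, recurred_from_id INTEGER,
                     import_finished INTEGER, deleted_at TEXT);
CREATE TABLE orders_items (id INTEGER PRIMARY KEY, order_id INTEGER,
                           item_id INTEGER, amount INTEGER);
CREATE TABLE `returns` (id INTEGER PRIMARY KEY, order_id INTEGER, deleted_at TEXT);
CREATE TABLE shippings (id INTEGER PRIMARY KEY, order_id INTEGER, deleted_at TEXT);

-- One composite index per correlated subquery or join condition.
CREATE INDEX idx_orders_items_order_item ON orders_items (order_id, item_id);
CREATE INDEX idx_returns_order_deleted   ON `returns` (order_id, deleted_at);
CREATE INDEX idx_shippings_order_deleted ON shippings (order_id, deleted_at);
CREATE INDEX idx_orders_recurred_deleted ON orders (recurred_from_id, deleted_at);
""")

# List the indexes that now exist, to confirm the DDL was accepted.
index_names = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
)
print(index_names)
```

Whether each index actually gets used is something to confirm with EXPLAIN on the real MySQL server.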

SQL query:Having number=max(number) doesn't work

I have two tables, Writer and Book. A writer can produce many books. I want to get all the writers who have produced the maximal number of books.
My first SQL query looks like this:
SELECT Name FROM(
SELECT Writer.Name,COUNT(Book.ID) AS NUMBER FROM Writer,Book
WHERE
Writer.ID=Book.ID
GROUP BY Writer.Name
)
WHERE NUMBER=(SELECT MAX(NUMBER) FROM
(SELECT Writer.Name,COUNT(Book.ID) AS NUMBER FROM Writer,Book
WHERE Writer.ID=Book.ID
GROUP BY Writer.Name
)
)
It works. However, I think this query is too long and contains duplication. I want to make it shorter, so I tried another query:
SELECT Name FROM(
SELECT Writer.Name,COUNT(Book.ID) AS NUMBER FROM Writer,Book
WHERE
Writer.ID=Book.ID
GROUP BY Writer.Name
HAVING NUMBER = MAX(NUMBER)
)
However, this HAVING clause doesn't work, and SQLite reports an error.
I don't know why. Can anyone explain it to me? Thank you!

The HAVING clause filters the final result set (typically after a GROUP BY); it does not provide any additional grouping. Think of it as a WHERE clause that is applied after the GROUP BY.
Your HAVING NUMBER = MAX(NUMBER) implies aggregating the NUMBER values across all groups, which does not make sense within a single level of grouping (even though we all get what you want it to do).

Each query provides you with one level of aggregation, so you cannot use Max on COUNT in the same query. You need a sub-query like you did in your first query.
However, your first query can be simplified on MySQL to:
SELECT Writer.Name
FROM Writer, Book
WHERE Writer.ID = Book.ID
GROUP BY Writer.Name
HAVING COUNT(Book.ID) = (SELECT COUNT(Book.ID) AS n
FROM Writer, Book
WHERE Writer.ID = Book.ID
GROUP BY Writer.Name
ORDER BY n DESC
LIMIT 1)

In MySQL (but not SQLite), you can use variables to reduce the amount of work and make a simpler query. However, there are nuances there, because variables with group by require an extra level of subqueries:
SELECT name
FROM (SELECT t.*, (@m := if(@m = 0, NUMBER, @m)) as maxn
FROM (SELECT w.Name, COUNT(b.ID) AS NUMBER
FROM Writer w JOIN
Book b
ON w.ID = b.ID
GROUP BY w.Name
) t CROSS JOIN
(SELECT @m := 0) params
ORDER BY NUMBER desc
) t
WHERE maxn = number;

It looks like you are nesting aggregate functions, which is not allowed.
HAVING NUMBER = MAX(NUMBER) is like HAVING COUNT(Book.ID) = MAX(COUNT(Book.ID))
Nesting COUNT inside MAX seems to be the issue here
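Another way to remove the duplication the question complains about is a common table expression, which names the per-writer counts once and reuses them. This is a sketch, not something proposed in the answers above: it assumes a database with CTE support (SQLite 3.8.3+ or MySQL 8.0+), and the Book.WriterID column and the sample rows are hypothetical (the question joins on Writer.ID = Book.ID). It runs through Python's sqlite3 module so it is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: WriterID is an assumed foreign key, not from the question.
conn.executescript("""
CREATE TABLE Writer (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Book (ID INTEGER PRIMARY KEY, WriterID INTEGER, Title TEXT);
INSERT INTO Writer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO Book VALUES
    (1, 1, 'A1'), (2, 1, 'A2'),
    (3, 2, 'B1'), (4, 2, 'B2'),
    (5, 3, 'C1');
""")

# First level of aggregation (COUNT per writer) lives in the CTE;
# the second level (MAX over those counts) is a scalar subquery on it.
TOP_WRITERS = """
WITH counts AS (
    SELECT w.Name, COUNT(b.ID) AS n
    FROM Writer AS w
    JOIN Book AS b ON b.WriterID = w.ID
    GROUP BY w.Name
)
SELECT Name
FROM counts
WHERE n = (SELECT MAX(n) FROM counts)
ORDER BY Name
"""

top_writers = [row[0] for row in conn.execute(TOP_WRITERS)]
print(top_writers)
```

With the sample rows, Ann and Bob tie at two books each, so both are returned.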

Alternative to mysql WHERE IN SELECT GROUP BY when wanting max value in group by

I have the following query, which was developed from a hint found online because of a problem with a GROUP BY returning the maximum value; but it's running really slowly.
Having looked online I'm seeing that WHERE IN (SELECT.... GROUP BY) is probably the issue, but, to be honest, I'm struggling to find a way around this:
SELECT *
FROM tbl_berths a
JOIN tbl_active_trains b on a.train_uid=b.train_uid
WHERE (a.train_id, a.TimeStamp) in (
SELECT train_id, max(TimeStamp)
FROM tbl_berths
GROUP BY train_id
)
I'm thinking I possibly need a derived table, but my experience in this area is zero and it's just not working out!

You can move that into a derived table and join to it, and also select only the required columns instead of all of them (*):
SELECT a.train_uid
FROM tbl_berths a
JOIN tbl_active_trains b on a.train_uid=b.train_uid
JOIN (SELECT train_id, max(TimeStamp) as TimeStamp
FROM tbl_berths
GROUP BY train_id) T
on a.train_id = T.train_id
and a.TimeStamp = T.TimeStamp
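To see the derived-table join pick the latest row per train on a small data set, here is a minimal sketch run through Python's sqlite3 module. The berth column, the integer timestamps and the sample rows are hypothetical, and the derived table reads from tbl_berths directly because an outer alias such as a cannot be used as a table name inside it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows; train T3 has the newest timestamp but is not active.
conn.executescript("""
CREATE TABLE tbl_berths (train_id TEXT, train_uid TEXT, berth TEXT, TimeStamp INTEGER);
CREATE TABLE tbl_active_trains (train_uid TEXT);
INSERT INTO tbl_active_trains VALUES ('U1'), ('U2');
INSERT INTO tbl_berths VALUES
    ('T1', 'U1', 'B10', 100),
    ('T1', 'U1', 'B11', 200),
    ('T2', 'U2', 'B20', 150),
    ('T3', 'U3', 'B30', 300);
""")

# T holds one (train_id, latest TimeStamp) pair per train; joining back on
# both columns keeps only the most recent berth row of each active train.
LATEST_BERTH = """
SELECT a.train_id, a.berth, a.TimeStamp
FROM tbl_berths AS a
JOIN tbl_active_trains AS b ON a.train_uid = b.train_uid
JOIN (
    SELECT train_id, MAX(TimeStamp) AS TimeStamp
    FROM tbl_berths
    GROUP BY train_id
) AS T
  ON a.train_id = T.train_id
 AND a.TimeStamp = T.TimeStamp
ORDER BY a.train_id
"""

latest = conn.execute(LATEST_BERTH).fetchall()
print(latest)
```

On a large table, a composite index on tbl_berths (train_id, TimeStamp) is worth trying for both the GROUP BY and the join back, though the gain depends on the real data.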

optimize Mysql: get latest status of the sale

In the following query, I show the latest status of the sale (by stage, in this case the number 3). The query is based on a subquery in the status history of the sale:
SELECT v.id_sale,
IFNULL((
SELECT (CASE WHEN IFNULL( vec.description, '' ) = ''
THEN ve.name
ELSE vec.description
END)
FROM t_record veh
INNER JOIN t_state_campaign vec ON vec.id_state_campaign = veh.id_state_campaign
INNER JOIN t_state ve ON ve.id_state = vec.id_state
WHERE veh.id_sale = v.id_sale
AND vec.id_stage = 3
ORDER BY veh.id_record DESC
LIMIT 1
), 'x') sale_state_3
FROM t_sale v
INNER JOIN t_quarters sd ON v.id_quarters = sd.id_quarters
WHERE 1 =1
AND v.flag =1
AND v.id_quarters =4
AND EXISTS (
SELECT '1'
FROM t_record
WHERE id_sale = v.id_sale
LIMIT 1
)
The query takes 0.0057 sec and returns 1011 records.
Because I have to filter the sales by the name of the state, I would have to repeat the subquery in a WHERE clause, so I decided to rewrite the same query using joins. In this case, I use the MAX function to obtain the latest status:
SELECT
v.id_sale,
IFNULL(veh3.State3,'x') AS sale_state_3
FROM t_sale v
INNER JOIN t_quarters sd ON v.id_quarters = sd.id_quarters
LEFT JOIN (
SELECT veh.id_sale,
(CASE WHEN IFNULL(vec.description,'') = ''
THEN ve.name
ELSE vec.description END) AS State3
FROM t_record veh
INNER JOIN (
SELECT id_sale, MAX(id_record) AS max_rating
FROM(
SELECT veh.id_sale, id_record
FROM t_record veh
INNER JOIN t_state_campaign vec ON vec.id_state_campaign = veh.id_state_campaign AND vec.id_stage = 3
) m
GROUP BY id_sale
) x ON x.max_rating = veh.id_record
INNER JOIN t_state_campaign vec ON vec.id_state_campaign = veh.id_state_campaign
INNER JOIN t_state ve ON ve.id_state = vec.id_state
) veh3 ON veh3.id_sale = v.id_sale
WHERE v.flag = 1
AND v.id_quarters = 4
This query returns the same results (1011 records), but the problem is that it takes 0.0753 sec.
Reviewing the possibilities, I found the factor that makes the difference in the speed of the query:
AND EXISTS (
SELECT '1'
FROM t_record
WHERE id_sale = v.id_sale
LIMIT 1
)
If I remove this clause, both queries take the same time. Why does it work better? Is there any way to use this clause with the joins? I hope you can help.
EDIT
I will show the results of EXPLAIN for each query respectively:
q1:
q2:

Interesting, so that little statement basically determines if there is a match between t_record.id_sale and t_sale.id_sale.
Why does this make your query run faster? Because WHERE conditions are applied before the subselects in the SELECT list are evaluated, so if there is no record to go with a sale, the subselect is never processed for it. That saves you some time, which is why it works better.
Is it going to work in your join syntax? I don't really know without having your tables to test against, but you can always just append it to the end and find out. Add the keyword EXPLAIN to the beginning of your query and you will get an execution plan, which will help you optimize things. Probably the best way to get better results with your join syntax is to add some indexes to your tables.
But I ask you, is this even necessary? You have a query returning in under 8 hundredths of a second. Unless this query is being run thousands of times an hour, it is not really taxing your DB at all, and your time is probably better spent making improvements elsewhere in your application.

Query optimization

I'm having a problem with this slow query:
SELECT c.*, csc1.changed_status
FROM contract c
LEFT
JOIN contract_status_change csc1
ON csc1.contract_status_change_id =
( SELECT csc2.contract_status_change_id
FROM contract_status_change csc2
WHERE csc2.contract_id = c.contract_id
ORDER
BY csc2.date_changed DESC
LIMIT 1
)
;
I have a contract table and a contract_status_change table, which records statuses against the contract. This query joins each contract to its latest status change so that you can get its current status.
Can you please help me tidy it up?
-edit-
my apologies. I have updated the query to include selecting the actual latest status out. Sorry for the confusion!

After formatting your query for readability (consistent whitespace and capitalization, removing unnecessary backticks and parentheses, more sensible aliases):
SELECT c.*
FROM contract c
LEFT
JOIN contract_status_change csc1
ON csc1.contract_status_change_id =
( SELECT csc2.contract_status_change_id
FROM contract_status_change csc2
WHERE csc2.contract_id = c.contract_id
ORDER
BY csc2.date_changed DESC
LIMIT 1
)
;
and assuming that contract_status_change.contract_status_change_id is a unique identifier, I'm forced to conclude that your query is equivalent to this, much more efficient one:
SELECT c.*
FROM contract c
;
You say that it "is joining on the latest status with the contract so you can get its current status", but it doesn't do anything with the current status — doesn't order by it, doesn't filter by it, doesn't include it in the query results — so there's no need for that.

This should help a bit.
SELECT c.*, csc1.changed_status
FROM contract c
-- latest change date per contract
LEFT JOIN
(
SELECT contract_id, MAX(date_changed) AS max_date
FROM contract_status_change
GROUP BY contract_id
) csc2 ON csc2.contract_id = c.contract_id
-- the status row recorded at that date
LEFT JOIN contract_status_change csc1
ON csc1.contract_id = csc2.contract_id AND csc1.date_changed = csc2.max_date
