How to change this MySQL query to an efficient one - mysql

My table user contains these fields:
id,company_id,created_by,name,image
Table valet contains:
id,vid,dept_id
Table cart contains:
id,dept_id,map_id,purchase,time
To get the details I have written this MySQL query:
SELECT c.id, a.id, c.purchase, c.time
FROM user a
LEFT JOIN valet b ON a.vid = b.id
AND a.is_deleted = 0
LEFT JOIN cart c ON b.dept_id = c.dept_id
WHERE a.company_id = 18
AND a.created_by = 102
AND a.is_deleted = 0
AND c.time
IN ( SELECT MAX( time ) FROM cart WHERE dept_id = b.dept_id )
From these three tables I want to select the last updated row from cart along with the id from the user table which is mapped in the valet table.
This query works fine, but it takes almost 15 seconds to retrieve the details.
Is there any way to improve this query, or am I doing something wrong?
Any help would be appreciated.

For one thing, I can see that you’re running the subquery for each row. Depending on what the optimiser does, that may have an impact. Without a suitable index, MAX is a pretty expensive operation (there’s nothing for it but to read every row).
If you plan to use this query repeatedly, you should at least index the cart table on time, or better on (dept_id, time), since the subquery filters by dept_id. This makes it much easier to find the maximum value.
You can also compute the maximum once per dept_id in a derived table and join to it, instead of running the subquery for each row; that might help:
SELECT c.id, a.id, c.purchase, c.time
FROM
user a
LEFT JOIN valet b ON a.vid = b.id AND a.is_deleted = '0'
LEFT JOIN cart c ON b.dept_id = c.dept_id
LEFT JOIN (SELECT dept_id,max(time) as mx FROM cart GROUP BY dept_id) m on m.dept_id=c.dept_id
WHERE
a.company_id = '18'
AND a.created_by = '102'
AND a.is_deleted = '0'
AND c.time=m.mx;
Note also:
Since you’re only testing a single value (max) for c.time, you should be using = not IN.
I’m not sure why you are using strings instead of integers. I should have thought that leaving off the quotes makes more sense.
Your JOIN includes AND a.is_deleted = '0', though you make no mention of it in your table description. In any case, why is it in the JOIN and not in the WHERE clause?
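The derived-table idea above can be tried in isolation. The following is a minimal sketch, not the asker's real schema: it uses Python's built-in sqlite3 module as a stand-in for MySQL, a cut-down cart table with made-up rows, and a hypothetical index name idx_cart_dept_time. It only shows the "latest row per dept_id" join pattern.

```python
import sqlite3

# Hypothetical miniature of the cart table; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cart (id INTEGER PRIMARY KEY, dept_id INT, purchase TEXT, time TEXT);
    -- Composite index (assumed name) covering the per-department MAX(time) lookup.
    CREATE INDEX idx_cart_dept_time ON cart (dept_id, time);
    INSERT INTO cart (id, dept_id, purchase, time) VALUES
        (1, 10, 'a', '2020-01-01 09:00:00'),
        (2, 10, 'b', '2020-01-02 09:00:00'),
        (3, 20, 'c', '2020-01-01 12:00:00');
""")

# The maximum time is computed once per dept_id, then joined back to cart
# to fetch the full row that carries that maximum.
latest_rows = conn.execute("""
    SELECT c.id, c.dept_id, c.purchase, c.time
    FROM cart c
    JOIN (SELECT dept_id, MAX(time) AS mx FROM cart GROUP BY dept_id) m
      ON m.dept_id = c.dept_id AND m.mx = c.time
    ORDER BY c.dept_id
""").fetchall()

print(latest_rows)
```

If two cart rows in the same department share the same maximum time, both are returned, so a tie-breaker would be needed in that case.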

Related

SQL JOIN query needs over 15s to run

I have a pretty big SQL query to get data from multiple database tables. I use the ON condition to check that the guild_ids are always the same, and in some cases it checks for a user_id too.
That is my query:
SELECT
SUM( f.guild_id = 787672220503244800 AND f.winner_id LIKE '%841827102331240468%' ) AS guild_winner,
SUM( f.winner_id LIKE '%841827102331240468%' ) AS win_sum,
m.message_count,
r.bypass_role_id,
i.real_count,
i.total_count,
i.bonus_count,
i.left_count
FROM
guild_finished_giveaways AS f
JOIN guild_message_count AS m
JOIN guild_role_settings AS r
JOIN guild_invite_count AS i ON m.guild_id = f.guild_id
AND m.user_id = 841827102331240468
AND r.guild_id = f.guild_id
AND i.guild_id = f.guild_id
AND i.user_id = m.user_id
But it runs pretty slow, with over 15s. I can't see why it needs so long.
I figured out that if I remove the "guild_invite_count" JOIN, it's pretty fast again. Do I have some simple error here that I don't see? Or what could be the issue?
Each JOIN expression needs its own ON. Don't wait until the end for this. As it was, the server was forced to build up a cartesian product of all those tables before narrowing them down again, and I'm surprised the query ran at all (I'd expect a syntax error for missing ON clauses).
FROM guild_finished_giveaways AS f
JOIN guild_message_count AS m ON m.guild_id = f.guild_id
JOIN guild_role_settings AS r ON r.guild_id = f.guild_id
JOIN guild_invite_count AS i ON i.guild_id = f.guild_id
AND i.user_id = m.user_id
WHERE m.user_id = 841827102331240468
It's also more than a little odd to use SUM() or any other aggregate function in the same query as non-aggregated values without a GROUP BY clause.
Are you using InnoDB?
Does every table have a PRIMARY KEY?
These may help:
m: PRIMARY KEY(user_id) -- assuming that is unique in that table
f: INDEX(guild_id, winner_id)
r: INDEX(guild_id, bypass_role_id)
i: INDEX(user_id,)
It looks like some tables should not be separate -- perhaps r,i,f could be combined? (I need to see SHOW CREATE TABLE to say more.)
Do NOT have a commalist in winner_id. Instead have another table with one row per winner per game (or whatever it is a winner of). Perhaps just two columns, like a many-to-many mapping table.
Noting that the execution is likely to start with m and then go next to i, let's improve on Joel's suggestion:
FROM guild_message_count AS m
JOIN guild_invite_count AS i ON i.user_id = m.user_id
JOIN guild_finished_giveaways AS f ON f.guild_id = m.guild_id
JOIN guild_role_settings AS r ON r.guild_id = m.guild_id
WHERE m.user_id = 841827102331240468
Note that 3 tables are joined on guild_id; but only 2 = are needed.
SUM without GROUP BY sums up the entire resultset (after JOINing). But you have 6 non-aggregates, so you need to GROUP BY all 6.
But that may lead to grossly inflated sums. Maybe you need to do the aggregation just over f first since that is where you are summing. Then JOIN to the rest??
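The two suggestions above (a winners mapping table instead of a comma list, and aggregating before joining) can be sketched together. This is only an illustration under assumptions: the table giveaway_winners, its columns, and the sample ids are invented, the other tables are left out, and Python's sqlite3 module stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical mapping table: one row per winner per giveaway.
    CREATE TABLE giveaway_winners (giveaway_id INT, guild_id INT, user_id INT);
    CREATE TABLE message_count (guild_id INT, user_id INT, message_count INT);
    INSERT INTO giveaway_winners VALUES (1, 100, 7), (2, 100, 7), (3, 200, 7), (4, 100, 8);
    INSERT INTO message_count VALUES (100, 7, 42);
""")

# The sums are computed first, in a derived table that yields exactly one row,
# so joining it to the per-user row cannot multiply (inflate) them.
row = conn.execute("""
    SELECT w.guild_wins, w.total_wins, m.message_count
    FROM message_count AS m
    CROSS JOIN (
        SELECT SUM(guild_id = 100) AS guild_wins, COUNT(*) AS total_wins
        FROM giveaway_winners
        WHERE user_id = 7
    ) AS w
    WHERE m.guild_id = 100 AND m.user_id = 7
""").fetchone()

print(row)
```

Because the outer query no longer mixes aggregates with plain columns, no GROUP BY is needed there, and the equality test on user_id can use an index where LIKE '%...%' cannot.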

Mysql Join performance MongoDB, Cassandra

I have a join query which takes a lot of time to process.
SELECT
COUNT(c.id)
FROM `customers` AS `c`
LEFT JOIN `setting` AS `ssh` ON `c`.`shop_id` = `ssh`.`id`
LEFT JOIN `customer_extra` AS `cx` ON `c`.`id` = `cx`.`customer_id`
LEFT JOIN `customers_address` AS `ca` ON `ca`.`id` = `cx`.`customer_default_address_id`
LEFT JOIN `lytcustomer_tier` AS `ct` ON `cx`.`lyt_customer_tier_id` = `ct`.`id`
WHERE (c.shop_id = '12121') AND ((DATE(cx.last_email_open_date) > '2019-11-08'));
This is primarily because the table 'customers' has 2 million records.
I could go into indexing etc. But the larger point is that this 2.5 million could become a billion records one day.
I'm looking for solutions which can enhance performance.
I've given thought to
a) Horizontal scalability: distribute the MySQL table into different sections and query the count independently.
b) Using composite indexes.
c) My favourite one: just create a separate collection in MongoDB or Redis which only houses the count (the output of this query). Since the count is just one number, this will not require a huge size, aka better query performance. (The only question is how many such queries there are, because that will increase the size of the new collection.)
Try this and see if it improves performance:
SELECT
COUNT(c.id)
FROM `customers` AS `c`
INNER JOIN `customer_extra` AS `cx` ON `c`.`id` = `cx`.`customer_id`
LEFT JOIN `setting` AS `ssh` ON `c`.`shop_id` = `ssh`.`id`
LEFT JOIN `customers_address` AS `ca` ON `ca`.`id` = `cx`.`customer_default_address_id`
LEFT JOIN `lytcustomer_tier` AS `ct` ON `cx`.`lyt_customer_tier_id` = `ct`.`id`
WHERE (c.shop_id = '12121') AND ((DATE(cx.last_email_open_date) > '2019-11-08'));
As I mentioned in the comment, the condition DATE(cx.last_email_open_date) > '2019-11-08' in the WHERE clause already makes the join between customers and customer_extra behave as an INNER JOIN, so you might as well change it to INNER JOIN customer_extra AS cx ON c.id = cx.customer_id and follow it with the other LEFT JOINs.
The INNER JOIN will at least restrict the initial result to customers whose last_email_open_date value matches what has been specified.
Say COUNT(*), not COUNT(c.id)
Remove these; they slow down the query without adding anything that I can see:
LEFT JOIN `setting` AS `ssh` ON `c`.`shop_id` = `ssh`.`id`
LEFT JOIN `customers_address` AS `ca` ON `ca`.`id` = `cx`.`customer_default_address_id`
LEFT JOIN `lytcustomer_tier` AS `ct` ON `cx`.`lyt_customer_tier_id` = `ct`.`id`
DATE(...) makes that test not "sargable". This works for DATE or DATETIME; and this is much faster:
cx.last_email_open_date > '2019-11-08'
Consider whether that should be >= instead of >.
Need an index on shop_id. (Please provide SHOW CREATE TABLE.)
Don't use LEFT JOIN when JOIN would work equally well.
If customer_extra holds columns that should have been in customers, now is the time to move them in. That would let you use this composite index for even more performance:
INDEX(shop_id, last_email_open_date) -- in this order
With those changes, a billion rows in MySQL will probably not be a problem. If it is, there are still more fixes I can suggest.
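The sargability point and the > versus >= question can be checked on a toy table. This is a rough sketch only: it assumes the columns have already been merged into a single customers table as suggested, the index name idx_shop_open and the rows are invented, and Python's sqlite3 module stands in for MySQL (SQLite stores the timestamps as text here). On a DATETIME column, the bare comparison > '2019-11-08' also admits later times on 2019-11-08 itself, whereas DATE(...) > '2019-11-08' corresponds to >= '2019-11-09'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed merged layout: last_email_open_date lives in customers.
    CREATE TABLE customers (id INTEGER PRIMARY KEY, shop_id INT, last_email_open_date TEXT);
    CREATE INDEX idx_shop_open ON customers (shop_id, last_email_open_date);
    INSERT INTO customers VALUES
        (1, 12121, '2019-11-08 15:30:00'),
        (2, 12121, '2019-11-09 00:00:00'),
        (3, 12121, '2019-12-01 08:00:00'),
        (4, 99999, '2019-12-01 08:00:00');
""")

def count(where_clause):
    """Count rows of shop 12121 that satisfy the given date condition."""
    sql = "SELECT COUNT(*) FROM customers WHERE shop_id = 12121 AND " + where_clause
    return conn.execute(sql).fetchone()[0]

# Original form: the function call hides the column from the index.
wrapped = count("DATE(last_email_open_date) > '2019-11-08'")
# Same meaning, written as a plain range on the bare column.
range_equivalent = count("last_email_open_date >= '2019-11-09'")
# Bare '>' on the same literal: also counts the afternoon of 2019-11-08.
bare_greater = count("last_email_open_date > '2019-11-08'")

print(wrapped, range_equivalent, bare_greater)
```

Which of the two range forms is right depends on whether opens during 2019-11-08 should count, which is the decision the >= remark above asks for.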

Grouping method

I am working on a query with the following format:
I require all the columns from table 'A', while I only require the summed amount (sum(amount)) from table 'B'.
SELECT A.*, sum(B.CURTRXAM) as 'Current Transaction Amt'
FROM A
LEFT JOIN C
ON A.Schedule_Number = C.Schedule_Number
LEFT JOIN B
ON A.DOCNUMBR = B.DOCNUMBR
AND A.CUSTNMBR = B.CUSTNMBR
GROUP BY A
ORDER BY A.CUSTNMBR
My question is regarding the grouping statement: table A has about 12 columns and grouping by each individually is tedious. Is there a cleaner way to do this, such as:
GROUP BY A
I am not sure if a simpler way exists as I am new to SQL. I have previously investigated GROUPING_ID statements, but that's about it.
Any help on lumped methods of grouping would be helpful.
Since the docnumber is the primary key - just use the following SQL:
SELECT A.*, sum(B.CURTRXAM) as 'Current Transaction Amt'
FROM A
LEFT JOIN C
ON A.Schedule_Number = C.Schedule_Number
LEFT JOIN B
ON A.DOCNUMBR = B.DOCNUMBR
GROUP BY A.DOCNUMBR
ORDER BY A.CUSTNMBR
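Grouping by the primary key while selecting A.* relies on the engine accepting columns that are functionally dependent on the grouped key, and some engines reject that. An alternative that avoids listing A's columns at all is to sum B in a derived table first and join the result. The sketch below assumes tiny invented versions of A and B (table C is left out for brevity) and uses Python's sqlite3 module purely as a convenient SQL engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical cut-down tables; the real A has about 12 columns.
    CREATE TABLE A (DOCNUMBR TEXT PRIMARY KEY, CUSTNMBR TEXT, Schedule_Number TEXT);
    CREATE TABLE B (DOCNUMBR TEXT, CUSTNMBR TEXT, CURTRXAM REAL);
    INSERT INTO A VALUES ('D1', 'C1', 'S1'), ('D2', 'C2', 'S2');
    INSERT INTO B VALUES ('D1', 'C1', 10.0), ('D1', 'C1', 5.5);
""")

# B is aggregated on its own, so the outer query needs no GROUP BY
# and A.* can be selected without naming every column.
rows = conn.execute("""
    SELECT A.*, b_sum.total AS "Current Transaction Amt"
    FROM A
    LEFT JOIN (
        SELECT DOCNUMBR, CUSTNMBR, SUM(CURTRXAM) AS total
        FROM B
        GROUP BY DOCNUMBR, CUSTNMBR
    ) AS b_sum
      ON b_sum.DOCNUMBR = A.DOCNUMBR AND b_sum.CUSTNMBR = A.CUSTNMBR
    ORDER BY A.CUSTNMBR
""").fetchall()

print(rows)
```

Documents in A without matching rows in B still appear, with NULL (None in Python) as the amount.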

Why use letters in front of each value in MySQL query?

Why would I use letters in front of each value in my query like this?
In the database, each of these values is WITHOUT the letter in front.
SELECT c.client_id, c.client_name, c.contactperson, c.internal_comment,
IFNULL(r.region, 'Alle byer') as region, c.phone, c.email,
uu.fullname as changed_by,
(select count(p.project_id)
from projects p
where p.client_id = c.client_id and (p.is_deleted != 1 or p.is_deleted is null)
) as numProjects
FROM clients c LEFT JOIN users uu ON c.db_changed_by = uu.id
LEFT JOIN regions r ON c.region_id = r.region_id
WHERE (c.is_deleted != 1 or c.is_deleted is null)
I have tried looking it up, but I can't find it anywhere.
When in SQL you need to use more than one table for a query, you can do this:
SELECT person.name, vehicle.id FROM person, vehicle;
Or you can make it shorter and write it like this:
SELECT p.name, v.id FROM person p, vehicle v;
It only reduces the query length, which is useful for you.
By "letters in front", I assume you mean the qualifiers on the columns c., uu. and so on. They indicate the table where the column comes from. In a sense, they are part of the definition of the column.
This is your query:
SELECT c.client_id, c.client_name, c.contactperson, c.internal_comment,
IFNULL(r.region, 'Alle byer') as region, c.phone, c.email,
uu.fullname as changed_by,
(select count(p.project_id)
from projects p
where p.client_id = c.client_id and (p.is_deleted != 1 or p.is_deleted is null)
) as numProjects
FROM clients c LEFT JOIN
users uu
ON c.db_changed_by = uu.id LEFT JOIN
regions r
ON c.region_id = r.region_id
WHERE (c.is_deleted != 1 or c.is_deleted is null)
In some cases, these are needed. Consider the on clause:
ON c.region_id = r.region_id
If you leave them out, you have:
ON region_id = region_id
The SQL compiler cannot interpret this, because it does not know where region_id comes from. Is it from clients or regions? If you used this in the select, you would have the same issue -- and it makes a difference because of the left join. This is also true in the correlated subquery.
In general, it is good practice to qualify column names for several reasons:
The query is unambiguous.
You (and others) readily know where columns are coming from.
If you modify the query and add a new table/subquery, you don't have to worry about naming conflicts.
If the underlying tables are modified to have new column names that are shared with other tables, then the query will still compile.
Suppose you are accessing two tables and both have a column with the same name, say 'Id'. In the query you can easily tell those columns apart using aliases, like a.Id = b.Id if the first table has the alias 'a' and the second 'b'. Otherwise it would be very difficult to identify which column belongs to which table, especially when the tables have column names in common.
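The ambiguity described in these answers is easy to reproduce. As a small sketch, assuming two invented tables that both have a region_id column and using Python's sqlite3 module instead of MySQL, the unqualified column is rejected while the qualified version runs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical cut-down tables sharing the column name region_id.
    CREATE TABLE clients (client_id INT, client_name TEXT, region_id INT);
    CREATE TABLE regions (region_id INT, region TEXT);
    INSERT INTO clients VALUES (1, 'Acme', 5);
    INSERT INTO regions VALUES (5, 'Oslo');
""")

# Unqualified region_id in the SELECT list: the engine cannot tell which table is meant.
try:
    conn.execute(
        "SELECT region_id FROM clients c LEFT JOIN regions r ON c.region_id = r.region_id"
    )
    unqualified_error = None
except sqlite3.OperationalError as exc:
    unqualified_error = str(exc)

# Qualified with the aliases c and r: unambiguous.
qualified = conn.execute(
    "SELECT c.client_name, r.region FROM clients c "
    "LEFT JOIN regions r ON c.region_id = r.region_id"
).fetchall()

print(unqualified_error)
print(qualified)
```

MySQL reports the same situation with its own wording of the error, but the cause and the fix are the same.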

LEFT JOIN to a single row in order of criteria in MySQL

Ok, I tried to simplify my question by abstracting away the details but I'm afraid I wasn't clear and didn't meet moderator requirements. So I will post the full query with my problem in more detail and the actual query I am struggling with. If the question is still inadequate, could you please comment with specifics about what is unclear and I will do my best to clarify.
First, here is the current query that returns all assignment rows for each bed:
SELECT
beds.bed_id,
beds.bedstatus,
beds.position as bed_position,
rooms.room_id,
rooms.room,
wings.wing_id,
wings.name as wing_name,
buildings.building_id,
buildings.name as building_name,
assignments.assignment_id,
assignments.student_id,
assignments.assign_dt,
assignments.assigned_by,
assignments.assignment_status,
assignments.expected_arrival_dt as arrival_dt,
assignments.room_charge_type,
students.first_name,
students.last_name,
meal_plans.name as meal_plan_name,
room_rates.rate_name
FROM
beds
LEFT JOIN
rooms ON (beds.room_id = rooms.room_id)
LEFT JOIN
wings ON (rooms.wing_id = wings.wing_id)
LEFT JOIN
buildings ON (wings.building_id = buildings.building_id)
LEFT JOIN assignments ON
((beds.bed_id=assignments.bed_id) AND (term_id = #term_id))
LEFT JOIN
students ON (assignments.student_id = students.student_id)
LEFT JOIN
meal_plans ON (assignments.meal_plan_id = meal_plans.meal_plan_id)
LEFT JOIN
room_rates ON (assignments.room_rate_id = room_rates.room_rate_id)
WHERE
(
(rooms.room IS NOT NULL) AND
(rooms.assignable = 1) AND
(buildings.active = 1) AND
(buildings.building_id = #building_id)
)
ORDER BY rooms.room;
The problem is that there may be multiple rows in the "assignments" table for each room distinguished by the "assignment_status" field and I want a single row for each assignment. I want to determine which assignment row to select based on the value in assignment_status. That is if the assignment status is "active", I want that row, otherwise, if there is a row with status "waiting approval" then I want that row, etc...
Barmar's suggestion is given here:
LEFT JOIN (SELECT *
FROM OtherTable
WHERE <criteria>
ORDER BY CASE status
WHEN 'Active' THEN 1
WHEN 'Waiting Approval' THEN 2
WHEN 'Canceled' THEN 3
...
END
LIMIT 1) other
This was very helpful and I attempted this approach:
SELECT
beds.bed_id,
beds.bedstatus,
beds.position as bed_position,
rooms.room_id,
rooms.room,
wings.wing_id,
wings.name as wing_name,
buildings.building_id,
buildings.name as building_name,
assign.assignment_id,
assign.student_id,
assign.assign_dt,
assign.assigned_by,
assign.assignment_status,
assign.expected_arrival_dt as arrival_dt,
assign.room_charge_type,
students.first_name,
students.last_name,
meal_plans.name as meal_plan_name,
room_rates.rate_name
FROM
beds
LEFT JOIN
rooms ON (beds.room_id = rooms.room_id)
LEFT JOIN
wings ON (rooms.wing_id = wings.wing_id)
LEFT JOIN
buildings ON (wings.building_id = buildings.building_id)
LEFT JOIN (SELECT *
FROM assignments
WHERE ((assignments.bed_id = beds.bed_id) AND (term_id = #term_id))
ORDER BY CASE assignment_status
WHEN 'Active' THEN 1
WHEN 'Waiting Approval' THEN 2
WHEN 'Canceled' THEN 3
END
LIMIT 1) assign
LEFT JOIN
students ON (assign.student_id = students.student_id)
LEFT JOIN
meal_plans ON (assign.meal_plan_id = meal_plans.meal_plan_id)
LEFT JOIN
room_rates ON (assign.room_rate_id = room_rates.room_rate_id)
WHERE
(
(rooms.room IS NOT NULL) AND
(rooms.assignable = 1) AND
(buildings.active = 1) AND
(buildings.building_id = #building_id)
)
ORDER BY rooms.room;
But I realized, the problem here is that OtherTable (assignments) is joined to the parent query based on a FK:
((beds.bed_id=assignments.bed_id) AND (term_id = #term_id))
So I can't do the subselect as the beds.bed_id isn't in scope for the subselect. So as Barmar's comment indicates the join criteria needs to be outside the subselect--but I'm having trouble figuring out how to both restrict the results to a single row per room and move the join outside the subselect. I'm wondering if travelboy's suggestion to use GROUP BY may be more fruitful, but haven't been able to determine how the grouping should be done.
Let me know if I can provide additional clarification.
Original Question:
I need from Table A to do a LEFT JOIN on a SINGLE row in another table, Table B meeting certain criteria (there may be multiple or no rows in Table B that meet the criteria). If there are multiple rows I want to select which row in B to join based on the value of a field in Table B. For example, if there is a row in B with status column='Active', I want that row, if not, if there is a row with status='Waiting Approval', I want that row, if there is a row with status='Canceled', I want that row, etc... Can I do this without a sub select? With a sub select?
Use:
LEFT JOIN (SELECT *
FROM OtherTable
WHERE <criteria>
ORDER BY CASE status
WHEN 'Active' THEN 1
WHEN 'Waiting Approval' THEN 2
WHEN 'Canceled' THEN 3
...
END
LIMIT 1) other
In some cases (but not in all cases) you can do it without a sub-select. You would need to GROUP BY a unique field in table A, typically an ID. This ensures that you get at most one row from table B. However, selecting the row you want is the tricky part. You need an aggregating function such as MAX(). If the field in B is a number, that's easy to do. If not, you can apply some SQL functions on the fields in B to calculate something like a score to sort by. For example, Active could correspond to a higher value than Canceled etc. That will work without a sub-select and likely be faster on big data sets.
With a sub-select it's easy to do. You can either use Barmar's solution, or, if you only need one specific field from B, you can also put the sub-select within the SELECT clause of the outer query.
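One way to combine the two answers, and to keep the join criteria outside the sub-select, is to compute the best priority per bed in a derived table and then join assignments back on that priority. The sketch below is an assumption-laden illustration, not the asker's final query: it uses only beds and a reduced assignments table with invented rows, hard-codes term_id = 28, and runs on Python's sqlite3 module rather than MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical cut-down tables with just the columns the technique needs.
    CREATE TABLE beds (bed_id INTEGER PRIMARY KEY);
    CREATE TABLE assignments (
        assignment_id INTEGER PRIMARY KEY, bed_id INT, term_id INT, assignment_status TEXT
    );
    INSERT INTO beds VALUES (1), (2), (3);
    INSERT INTO assignments VALUES
        (10, 1, 28, 'Canceled'),
        (11, 1, 28, 'Active'),
        (12, 2, 28, 'Waiting Approval'),
        (13, 2, 28, 'Canceled');
""")

# pick: the lowest (best) priority number per bed for the term.
# a:    the assignment row whose status maps to that priority.
rows = conn.execute("""
    SELECT beds.bed_id, a.assignment_id, a.assignment_status
    FROM beds
    LEFT JOIN (
        SELECT bed_id,
               MIN(CASE assignment_status
                       WHEN 'Active' THEN 1
                       WHEN 'Waiting Approval' THEN 2
                       WHEN 'Canceled' THEN 3
                   END) AS best
        FROM assignments
        WHERE term_id = 28
        GROUP BY bed_id
    ) AS pick ON pick.bed_id = beds.bed_id
    LEFT JOIN assignments AS a
           ON a.bed_id = pick.bed_id
          AND a.term_id = 28
          AND CASE a.assignment_status
                  WHEN 'Active' THEN 1
                  WHEN 'Waiting Approval' THEN 2
                  WHEN 'Canceled' THEN 3
              END = pick.best
    ORDER BY beds.bed_id
""").fetchall()

print(rows)
```

A bed with no assignment still appears, with NULLs in the assignment columns. If a bed can have two rows with the same status in one term, both come back, so an extra tie-breaker (for example the latest assign_dt) would be needed.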
I need to follow up with some additional testing to make sure this is accomplishing my goal--but I think I've done this using travelboy's suggestion of a group by query combined with barmar's case logic (wish I could split the answer). Here's the query:
SELECT
beds.bed_id,
beds.bedstatus,
beds.position as bed_position,
rooms.room_id,
rooms.room,
wings.wing_id,
wings.name as wing_name,
buildings.building_id,
buildings.name as building_name,
assignments.assignment_id,
assignments.student_id,
assignments.assign_dt,
assignments.assigned_by,
assignments.assignment_status,
assignments.expected_arrival_dt as arrival_dt,
assignments.room_charge_type,
MIN(CASE assignments.assignment_status
WHEN 'Active' THEN 1
WHEN 'Waiting Approval' THEN 2
WHEN 'Canceled' THEN 3
END),
students.first_name,
students.last_name,
meal_plans.name as meal_plan_name,
room_rates.rate_name
FROM
beds
LEFT JOIN
rooms ON (beds.room_id = rooms.room_id)
LEFT JOIN
wings ON (rooms.wing_id = wings.wing_id)
LEFT JOIN
buildings ON (wings.building_id = buildings.building_id)
LEFT JOIN assignments
ON ((assignments.bed_id=beds.bed_id) AND (term_id = 28))
LEFT JOIN
students ON (assignments.student_id = students.student_id)
LEFT JOIN
meal_plans ON (assignments.meal_plan_id = meal_plans.meal_plan_id)
LEFT JOIN
room_rates ON (assignments.room_rate_id = room_rates.room_rate_id)
WHERE
(
(rooms.room IS NOT NULL) AND
(rooms.assignable = 1) AND
(buildings.active = 1)
)
GROUP BY
bed_id
ORDER BY rooms.room;