MySQL DISTINCT query across 3 tables is slow on big data

There are three tables, order, order_record, and pay, with nearly 2,300,000 records.
The pay table can hold more than one record for a given order_id, so I need DISTINCT to remove repeated results.
I need to get distinct data from these three tables joined on order_id. An example query:
SELECT
DISTINCT
a.order_id, a.user_id,
b.boss_order_id,
c.pay_id
FROM `order` a
LEFT JOIN order_record b ON a.order_id = b.order_id AND b.is_delete IN (0,1)
LEFT JOIN pay c ON a.order_id = c.order_id AND c.is_delete = 0
WHERE 1=1 AND a.is_delete IN (0,1)
ORDER BY a.id DESC LIMIT 0, 10
This query takes a very long time.
Then I changed it to use GROUP BY:
SELECT
a.order_id, a.user_id,
b.boss_order_id,
c.pay_id
FROM `order` a
LEFT JOIN order_record b ON a.order_id = b.order_id AND b.is_delete IN (0,1)
LEFT JOIN pay c ON a.order_id = c.order_id AND c.is_delete = 0
WHERE 1=1 AND a.is_delete IN (0,1)
GROUP BY a.order_id
ORDER BY a.id DESC LIMIT 0, 10
This time the query takes 122 seconds.
Is there a faster way to do this?

Because you are using a LEFT JOIN, you can limit the rows of the first table before joining:
SELECT o.order_id, o.user_id, orr.boss_order_id, p.pay_id
FROM (SELECT o.*
FROM `order` o
WHERE o.is_delete IN (0, 1)
ORDER BY o.id DESC
LIMIT 10
) o LEFT JOIN
order_record orr
ON o.order_id = orr.order_id AND
orr.is_delete IN (0, 1) LEFT JOIN
pay p
ON o.order_id = p.order_id AND
p.is_delete = 0
WHERE 1=1 AND o.is_delete IN (0, 1)
GROUP BY o.order_id
ORDER BY o.id DESC
LIMIT 0, 10
You are using GROUP BY incorrectly, because you have unaggregated columns in the SELECT that are not in the GROUP BY.
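The limit-before-join idea can be tried on a small dataset. The following is only a sketch: it uses Python's sqlite3 module as a stand-in for MySQL, the table is named orders (an assumption, to avoid quoting the reserved word order), the sample rows are invented, and MIN() is one arbitrary way to pick a single value per order so that no unaggregated column is left outside the GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, order_id TEXT, user_id INTEGER, is_delete INTEGER);
CREATE TABLE order_record (order_id TEXT, boss_order_id TEXT, is_delete INTEGER);
CREATE TABLE pay (pay_id INTEGER PRIMARY KEY, order_id TEXT, is_delete INTEGER);
""")

# Invented data: 50 orders, each with one order_record row and two pay rows.
for i in range(1, 51):
    oid = f"O{i:03d}"
    conn.execute("INSERT INTO orders VALUES (?, ?, ?, 0)", (i, oid, 100 + i))
    conn.execute("INSERT INTO order_record VALUES (?, ?, 0)", (oid, f"B{i:03d}"))
    conn.execute("INSERT INTO pay (order_id, is_delete) VALUES (?, 0)", (oid,))
    conn.execute("INSERT INTO pay (order_id, is_delete) VALUES (?, 0)", (oid,))

# The subquery picks the 10 newest orders first; only those rows are joined.
rows = conn.execute("""
SELECT o.order_id, o.user_id,
       MIN(orr.boss_order_id) AS boss_order_id,
       MIN(p.pay_id) AS pay_id
FROM (SELECT id, order_id, user_id
      FROM orders
      WHERE is_delete IN (0, 1)
      ORDER BY id DESC
      LIMIT 10) AS o
LEFT JOIN order_record AS orr
       ON orr.order_id = o.order_id AND orr.is_delete IN (0, 1)
LEFT JOIN pay AS p
       ON p.order_id = o.order_id AND p.is_delete = 0
GROUP BY o.id, o.order_id, o.user_id
ORDER BY o.id DESC
""").fetchall()

for row in rows:
    print(row)
```

Each of the 10 orders comes back once even though pay holds two rows per order, because the duplicates are collapsed after the limit has already been applied.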

Another approach is to let a WHERE clause do most of the work:
select ...
from order
left join order_using using (order_id)
...
where
order.order_id < (select max(order_id) from orders order by order_id limit 10) ...
limit 10
The final LIMIT 10 is odd, though: if you drop the GROUP BY, you may get only part of an order's rows. You probably want to drop it and put the limit on the order table only. With the GROUP BY, you will get arbitrary values from tables b and c unless you use an aggregate function to tell MySQL which row's values you want.

Related

How to limit record before group by for pagination?

I have this query that uses LEFT JOIN and GROUP BY to get the SUM of a column.
SELECT
c.id,
SUM(
r.score
) AS score_sum,
SUM(
CASE WHEN r.is_active = '0' THEN r.negative ELSE 0 END
) AS negative_sum
FROM comments AS c
LEFT JOIN rates AS r ON (r.comment_id = c.id)
WHERE r.comment_id = c.id
GROUP BY c.id
DB Fiddle link:
https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=fadba795d8426f91471fa4db83845b6f
The query works, but if the comments table is large (10K records, for example), I need to implement pagination. How do I modify this query to limit the comments records first, before the GROUP BY?
In short:
Get the first 5 comments by limiting to 5
Left join the rates table
Get the SUM with GROUP BY
For example, show the sums for the first 4 comments.
Thanks
You can use a subquery such as "select c.id from comments limit N" in the FROM clause.
select c.id,
sum(r.score) as score_sum,
SUM(
CASE WHEN r.is_active = '0' THEN r.negative ELSE 0 END
) AS negative_sum
from ( select c.id from comments c limit 2) c
LEFT JOIN rates AS r ON (r.comment_id = c.id)
GROUP BY c.id;
You can add an ORDER BY in the subquery to determine the order in which the comments are selected (top N).
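To see how this behaves across pages, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. The sample rows and the page() helper are invented for illustration; the query itself follows the subquery approach above, with an ORDER BY and OFFSET added inside the subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE comments (id INTEGER PRIMARY KEY);
CREATE TABLE rates (comment_id INTEGER, score INTEGER, negative INTEGER, is_active TEXT);
INSERT INTO comments (id) VALUES (1), (2), (3), (4), (5), (6);
INSERT INTO rates VALUES
  (1, 5, 1, '0'),
  (1, 3, 2, '1'),
  (2, 4, 1, '0'),
  (2, 1, 1, '0'),
  (4, 2, 3, '1');
""")

def page(limit, offset):
    """Hypothetical helper: limit the comments first, then join and aggregate."""
    return conn.execute("""
    SELECT c.id,
           SUM(r.score) AS score_sum,
           SUM(CASE WHEN r.is_active = '0' THEN r.negative ELSE 0 END) AS negative_sum
    FROM (SELECT id FROM comments ORDER BY id LIMIT ? OFFSET ?) AS c
    LEFT JOIN rates AS r ON r.comment_id = c.id
    GROUP BY c.id
    ORDER BY c.id
    """, (limit, offset)).fetchall()

first_page = page(3, 0)
second_page = page(3, 3)
print(first_page)
print(second_page)
```

Comments without any rates still appear on their page, with a NULL score_sum, because the LEFT JOIN is not filtered by a WHERE clause on the rates table.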
Try the following:
SELECT
c.id,
SUM(
r.score
) AS score_sum,
SUM(
CASE WHEN r.is_active = '0' THEN r.negative ELSE 0 END
) AS negative_sum
FROM comments AS c
LEFT JOIN rates AS r ON (r.comment_id = c.id)
WHERE r.comment_id = c.id
GROUP BY c.id
ORDER BY c.id ASC
LIMIT 5
The rationale behind the above query is that id is the primary key (hence indexed) in your comments table. Also, your GROUP BY and ORDER BY are on the same column, id, so MySQL can use the index on id to read the first 5 rows (because of the LIMIT) and then proceed to JOIN with the other table and do the aggregation.
Give it a try! More details here: https://dev.mysql.com/doc/refman/5.7/en/order-by-optimization.html
You can confirm this by running EXPLAIN on the query.

Fetch only one record from first order by and rest from second order by

I have a query
SELECT s.*
, g.*
from tbl_section1 as s
, tbl_game as g
LEFT
JOIN tbl_game_visit_count AS gvc
ON g.game_id = gvc.game_id
where s.category_id = g.game_id
ORDER
BY g.udate DESC
, gvc.visit_count DESC
which works fine.
But I want to fetch the first record ordered by g.udate, and then the rest of the records ordered by gvc.visit_count.
Is this possible with a MySQL query?
Thanks in advance.
This should be possible with a UNION (not UNION ALL, since we don't want duplicate rows) between two queries, with the ORDER BY and LIMIT clauses inside parentheses:
SELECT q.*
FROM
(
SELECT s.*, g.*
FROM tbl_section1 as s
INNER JOIN tbl_game as g ON s.category_id = g.game_id
LEFT JOIN tbl_game_visit_count AS gvc ON g.game_id = gvc.game_id
ORDER BY g.udate DESC
LIMIT 1
) q
UNION
SELECT s.*, g.*
FROM tbl_section1 as s
INNER JOIN tbl_game as g ON s.category_id = g.game_id
LEFT JOIN tbl_game_visit_count AS gvc ON g.game_id = gvc.game_id
ORDER BY gvc.visit_count DESC;
P.S. I kept the DESC options for ORDER BY because of your original query; you can drop them if you want regular ascending order.
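For comparison, here is a different technique from the UNION above, shown only as a sketch: a single ORDER BY whose first sort key flags the row with the newest udate, followed by visit_count. It uses Python's sqlite3 module as a stand-in for MySQL, leaves out tbl_section1 for brevity, and runs on invented sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_game (game_id INTEGER PRIMARY KEY, udate TEXT);
CREATE TABLE tbl_game_visit_count (game_id INTEGER, visit_count INTEGER);
INSERT INTO tbl_game VALUES
  (1, '2024-01-01'), (2, '2024-03-01'), (3, '2024-02-01'), (4, '2024-01-15');
INSERT INTO tbl_game_visit_count VALUES (1, 50), (2, 10), (3, 90);
""")

# First sort key: 1 for the most recently updated game, 0 for all others.
# Second sort key: visit_count, which orders the remaining rows.
rows = conn.execute("""
SELECT g.game_id, g.udate, gvc.visit_count
FROM tbl_game AS g
LEFT JOIN tbl_game_visit_count AS gvc ON gvc.game_id = g.game_id
ORDER BY
  (g.game_id = (SELECT game_id FROM tbl_game ORDER BY udate DESC LIMIT 1)) DESC,
  gvc.visit_count DESC
""").fetchall()

game_ids = [r[0] for r in rows]
print(game_ids)
```

Game 2 has the newest udate and comes first despite having the lowest visit count; the others follow by visit_count, with the game that has no visit row at the end.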

MySQL Order By in Right Join Clause

I'm trying to do a right join in MySQL like so:
SELECT customers.id,customers.firstname,customers.lastname,customers.email,orders.time,orders.notes,pendings.date_updated,pendings.issue,appointments.closed,appointments.job_description,backup_plans.expiration FROM customers
RIGHT JOIN orders
ON customers.id = orders.customer_id
ORDER BY orders.time DESC LIMIT 1
RIGHT JOIN pendings
ON customers.id = pendings.customer_id
ORDER BY pendings.date_updated DESC LIMIT 1
RIGHT JOIN appointments
ON customers.id = appointments.customer_id
ORDER BY appointments.closed DESC LIMIT 1
RIGHT JOIN backup_plans
ON customers.id = backup_plans.customer_id
ORDER BY backup_plans.expiration DESC LIMIT 1
My intent is this: to select each customer's name and email, along with their most recent order, pending, appointment, and backup plan expiration. When I execute this I get a syntax error:
#1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'RIGHT JOIN pendings
ON customers.id = pendings.customer_id
ORDER BY pendings.d' at line 5
I'm unfamiliar with joins and would appreciate any help.
EDIT 1:
It seems that I need to make a subquery per DanK's suggestion like so:
SELECT customers.id,customers.firstname,customers.lastname,customers.email,orderstmp.time,orderstmp.notes FROM customers
RIGHT JOIN (
SELECT orders.time,orders.notes,orders.customer_id FROM orders ORDER BY orders.time DESC LIMIT 1
) as orderstmp ON orderstmp.customer_id = customers.id
But when I do this, I only get one row result, whereas I want all the customer information.
EDIT 2:
Per Tom H's suggestion, I've built this query:
SELECT
customers.id,
SQ_O.time,
SQ_O.notes
FROM customers
LEFT JOIN (
SELECT
customers.id,
orders.time,
orders.notes
FROM customers
LEFT JOIN orders ON orders.customer_id = customers.id
ORDER BY orders.time DESC LIMIT 1
) AS SQ_O ON SQ_O.id = customers.id
which has all blank time and notes fields
and
SELECT
customers.id,
O1.time,
O1.notes
FROM customers
LEFT JOIN orders AS O1 ON O1.customer_id = O1.id
LEFT JOIN orders AS O2 ON O2.customer_id = customers.id AND O2.time > O1.time WHERE O2.customer_id IS NULL
Which reaches max execution time. I'm guessing this is due to my lack of familiarity with what's possible in MySQL in comparison to other dialects.
I also tried Correlated subqueries like this:
SELECT
customers.firstname,
customers.lastname,
customers.email,
(
SELECT CONCAT(orders.time,': ',orders.notes)
FROM orders
WHERE orders.customer_id = customers.id
ORDER BY orders.time DESC LIMIT 1
) as last_order
FROM customers
But the "last_order" column comes up blank.
FINAL, DISAPPOINTING EDIT
After trying a number of really stellar suggestions that helped me learn SQL significantly, I decided to write a PHP script to get me what I want. The project's under a bit of a deadline so whatever works, works. Thanks everyone!
You can only have one ORDER BY clause per SELECT. You can of course use subqueries and refer to a result set as a virtual table, but ultimately a single SELECT can have only one ORDER BY.
For instance:
SELECT something
FROM table
ORDER BY something -- One order By
With a subquery as a virtual table:
SELECT something
FROM (SELECT anotherthing, something
FROM table
ORDER BY anotherthing) AS t -- this ORDER BY belongs to a separate SELECT statement; MySQL requires an alias on the derived table
ORDER BY something -- still only one ORDER BY
------EDIT--------
For assistance with your join syntax, try something like this:
SELECT -- fields go here
FROM customers
RIGHT JOIN orders ON customers.id = orders.customer_id
RIGHT JOIN pendings ON customers.id = pendings.customer_id
RIGHT JOIN appointments ON customers.id = appointments.customer_id
RIGHT JOIN backup_plans ON customers.id = backup_plans.customer_id
ORDER BY orders.time DESC, pendings.date_updated DESC, appointments.closed DESC, backup_plans.expiration DESC
LIMIT 1
Try this:
SELECT customers.id,customers.firstname,customers.lastname,customers.email,orders.time,orders.notes,pendings.date_updated,pendings.issue,appointments.closed,appointments.job_description,backup_plans.expiration FROM customers
RIGHT JOIN orders
ON customers.id = orders.customer_id
RIGHT JOIN pendings
ON customers.id = pendings.customer_id
RIGHT JOIN appointments
ON customers.id = appointments.customer_id
RIGHT JOIN backup_plans
ON customers.id = backup_plans.customer_id
ORDER BY orders.time DESC, pendings.date_updated DESC, appointments.closed DESC, backup_plans.expiration DESC LIMIT 1
You can accomplish this through subqueries or with additional JOINs. Here's an example of each. (NOTE: I use SQL Server, so it's possible that some of the syntax I'm used to isn't supported in the same way in MySQL.) I'm only doing these examples with the Orders table, but hopefully you can extend the ideas to the other tables.
Using subqueries:
SELECT
C.id,
SQ_O.time,
SQ_O.notes
FROM
Customers C
LEFT OUTER JOIN
(
SELECT
C2.id AS customer_id,
O.time,
O.notes
FROM
Customers C2
LEFT OUTER JOIN Orders O ON O.customer_id = C2.id
ORDER BY
O.time DESC LIMIT 1
) SQ_O ON SQ_O.customer_id = C.id
Using multiple JOINs:
SELECT
C.id,
O1.time,
O1.notes
FROM
Customers C
LEFT OUTER JOIN Orders O1 ON O1.customer_id = C.id
LEFT OUTER JOIN Orders O2 ON O2.customer_id = C.id AND O2.time > O1.time
WHERE
O2.customer_id IS NULL -- Basically we're excluding any rows where another order was found with a later time than O1
If exact matches in Orders.time are possible, then you'll need additional criteria to choose between them.
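The multiple-JOIN version, including one possible tie-breaker, can be sketched as follows. This uses Python's sqlite3 module as a stand-in for MySQL; the sample rows are invented, and breaking ties on the orders.id column is an assumption (the thread never shows the orders schema).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, firstname TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, time TEXT, notes TEXT);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO orders (customer_id, time, notes) VALUES
  (1, '2020-01-01 10:00:00', 'first'),
  (1, '2020-02-01 10:00:00', 'latest'),
  (2, '2020-01-15 09:00:00', 'tie a'),
  (2, '2020-01-15 09:00:00', 'tie b');
""")

# o2 looks for a "better" order than o1: a later time, or the same time
# with a higher id. Rows where no such o2 exists are the latest per customer.
rows = conn.execute("""
SELECT c.id, o1.time, o1.notes
FROM customers AS c
LEFT JOIN orders AS o1 ON o1.customer_id = c.id
LEFT JOIN orders AS o2 ON o2.customer_id = c.id
      AND (o2.time > o1.time OR (o2.time = o1.time AND o2.id > o1.id))
WHERE o2.id IS NULL
ORDER BY c.id
""").fetchall()

for row in rows:
    print(row)
```

Every customer appears exactly once: the customer with two orders at the same time gets the one with the higher id, and the customer with no orders gets NULLs.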
As long as you can rely on no customer having two most recent orders with the same time, this should work:
SELECT c.firstname, c.lastname, c.email, o.*
FROM customers AS c
LEFT JOIN (
SELECT customer_id, MAX(`time`) AS maxTime
FROM orders
GROUP BY customer_id
) AS lastO ON c.id = lastO.customer_id
LEFT JOIN orders AS o
ON lastO.customer_id = o.customer_id
AND lastO.maxTime = o.`time`
;
As long as the other tables can also be relied upon to have only one MAX value per customer, you should be able to append similar JOINs for them. The issue with a customer having several rows that share the same "last" time, date_updated, closed, etc. is that they multiply the results. For example, pairs of the same time in orders and pairs of the same date_updated in pendings for the same customer will result in 4 rows instead of two, because every "last" row for that customer in orders is paired with every "last" row in pendings.
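The effect of tied timestamps on the MAX join can be seen in a small sketch. It uses Python's sqlite3 module as a stand-in for MySQL, and the sample rows (including the deliberate tie for customer 2) are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, firstname TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, time TEXT, notes TEXT);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO orders (customer_id, time, notes) VALUES
  (1, '2020-01-01 10:00:00', 'first'),
  (1, '2020-02-01 10:00:00', 'latest'),
  (2, '2020-01-15 09:00:00', 'tie a'),
  (2, '2020-01-15 09:00:00', 'tie b');
""")

# Derived table: the latest order time per customer.
# Joining back on (customer_id, time) fetches the full order row(s).
rows = conn.execute("""
SELECT c.id, o.notes
FROM customers AS c
LEFT JOIN (SELECT customer_id, MAX(time) AS max_time
           FROM orders
           GROUP BY customer_id) AS last_o
       ON last_o.customer_id = c.id
LEFT JOIN orders AS o
       ON o.customer_id = last_o.customer_id AND o.time = last_o.max_time
ORDER BY c.id, o.id
""").fetchall()

for row in rows:
    print(row)
```

Customer 2 comes back twice because both of its orders carry the maximum time, which is the row multiplication described above.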

MySQL LEFT JOIN only one row, ordered by column without subquery

Is it possible to LEFT JOIN only one row from another table, ordered by a column (date), without using a subquery? My query is below. It works, but it's very slow.
SELECT * FROM clients c
LEFT JOIN loan l ON c.id = l.id_client AND l.id = (
SELECT id FROM loan ll
WHERE ll.id_client = c.id
ORDER BY `create_date` DESC
LIMIT 1)
GROUP BY k.id DESC
ORDER BY c.register_date DESC
LIMIT n, m; -- n and m come from pagination
Is there a way to speed it up?
I'm interpreting your question as "Get me all loan details for the most recent loan for each client".
This should work. Note the assumption, though.
SELECT *
FROM
clients c
LEFT JOIN (select id_client, Max(id) id -- this assumes that a loan with a later create_date also has a higher id
from loan
group by id_client) il
on il.id_client = c.id
left join loan l -- LEFT JOIN so that clients without loans are kept
on l.id = il.id
GROUP BY k.id DESC -- Don't know what "k" is
ORDER BY c.register_date DESC
LIMIT n, m; -- n and m come from pagination

MySQL Join with conditions on join on the fly

I need to know how to do this on the fly.
For example, I have customers in different due date statuses, and I want to select the MAX (most recent) due date in the LEFT JOIN. Currently, when the two tables are joined, it selects the oldest due date, which is not what I want.
SELECT c.customerid, i.datedue
FROM customers c
LEFT JOIN invoice i
ON i.customerid = c.customerid
WHERE i.datedue <= UNIX_TIMESTAMP()
AND c.status!='d'
GROUP BY i.customerid
ORDER BY i.datedue DESC
LIMIT 0, 1000
You need to use the MAX() function:
SELECT c.customerid, MAX(i.datedue) AS last_due
FROM customers c LEFT JOIN invoice i ON i.customerid = c.customerid
WHERE i.datedue <= UNIX_TIMESTAMP() and c.status!='d'
GROUP BY c.customerid
ORDER BY last_due DESC
LIMIT 0,1000
This will give you the maximum datedue for each customer.
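A small sketch of the MAX() approach, using Python's sqlite3 module as a stand-in for MySQL with invented sample rows. It makes one change that is an assumption about the intent: the datedue condition is moved from the WHERE clause into the ON clause, so that customers without a matching invoice are still returned by the LEFT JOIN. A fixed now value stands in for UNIX_TIMESTAMP().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customerid INTEGER PRIMARY KEY, status TEXT);
CREATE TABLE invoice (id INTEGER PRIMARY KEY, customerid INTEGER, datedue INTEGER);
INSERT INTO customers VALUES (1, 'a'), (2, 'a'), (3, 'd'), (4, 'a');
INSERT INTO invoice (customerid, datedue) VALUES
  (1, 1000), (1, 3000), (1, 9000),
  (2, 2000), (2, 2500),
  (3, 1500);
""")

now = 5000  # stand-in for UNIX_TIMESTAMP()

# The date filter sits in the ON clause, so it restricts which invoices are
# joined without discarding customers that have none.
rows = conn.execute("""
SELECT c.customerid, MAX(i.datedue) AS last_due
FROM customers AS c
LEFT JOIN invoice AS i
       ON i.customerid = c.customerid AND i.datedue <= ?
WHERE c.status != 'd'
GROUP BY c.customerid
ORDER BY last_due DESC
""", (now,)).fetchall()

for row in rows:
    print(row)
```

Customer 1's invoice due at 9000 is ignored because it lies after now, customer 3 is filtered out by status, and customer 4 is kept with a NULL last_due.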