MySQL query optimization required

I've got a slow performing query. I know using a dependent subquery is bad, but I can't think of another way to get the data I want.
Essentially, I want to flag customers who have at least 50 invoices in the past 6 months, but no invoices this month.
This is what I have currently:
select
Customer.name,
Customer.id,
Customer.latitude,
Customer.longitude
from
Customer
where
EXISTS (
SELECT
*
FROM
Invoice_Header
WHERE
Invoice_Header.inv_date BETWEEN '2011-03-02' AND '2011-10-02'
AND
Invoice_Header.account_number = Customer.account_number
HAVING COUNT(invoice_num) > 50
)
AND NOT EXISTS (
SELECT *
FROM
Invoice_Header
WHERE
Invoice_Header.inv_date > '2011-10-02'
AND
Invoice_Header.account_number = Customer.account_number
)
Group by name;
The Customer table has about 12k records; Invoice_Header has about 2 million records.
I have indexes on inv_date and account_number (in both tables).
Any suggestions for how to speed this up would be appreciated.

This should eliminate the correlated subqueries and be significantly faster:
SELECT c.name, c.id, c.latitude, c.longitude
FROM Customer c
INNER JOIN (
SELECT account_number
FROM Invoice_Header ih
WHERE ih.inv_date BETWEEN '2011-03-02' AND '2011-10-02'
GROUP BY account_number
HAVING COUNT(*) > 50
EXCEPT -- MySQL has no MINUS operator; EXCEPT needs MySQL 8.0.31 or later
SELECT DISTINCT account_number
FROM Invoice_Header ih
WHERE ih.inv_date > '2011-10-02'
) tbl
ON tbl.account_number = c.account_number

I would suggest:
SELECT
c.name,
c.id,
c.latitude,
c.longitude
FROM
Customer AS c
INNER JOIN (
SELECT account_number, count(*) AS invoice_count
FROM Invoice_Header
WHERE inv_date >= '2011-03-02' AND inv_date <= '2011-10-02'
GROUP BY account_number
) AS lsm
ON c.account_number = lsm.account_number
LEFT JOIN (
SELECT account_number, count(*) AS invoice_count
FROM Invoice_Header
WHERE inv_date > '2011-10-02'
GROUP BY account_number
) AS lm
ON c.account_number = lm.account_number
WHERE
lsm.invoice_count >= 50
AND IFNULL(lm.invoice_count, 0) = 0

select
C.name,
C.id,
C.latitude,
C.longitude,
I.account_number,
count( IF(I.inv_date>='2011-03-02' AND I.inv_date <='2011-10-02',I.inv_date,NULL )) as inv_count_6,
count( IF(I.inv_date > '2011-10-02',I.inv_date,NULL )) as inv_count_1
from Customer C
LEFT JOIN Invoice_Header I
ON C.account_number = I.account_number
WHERE I.inv_date >= '2011-03-02'
GROUP BY C.id, I.account_number
HAVING inv_count_6 >= 50 AND inv_count_1 = 0
Notes:
1. The requirement is AT LEAST 50 invoices, so the condition is >= 50, not > 50.
2. You need an index on the inv_date column.

Run your query with EXPLAIN and see whether other indexes are needed.
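The conditional-count idea from the answers above can be tried end to end on a small data set. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the sample rows are invented, and the composite index on (account_number, inv_date) is an assumption rather than something the question confirms. CASE is used instead of IF() so the same SQL should run on both engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (id INTEGER PRIMARY KEY, name TEXT, account_number TEXT);
CREATE TABLE Invoice_Header (invoice_num INTEGER PRIMARY KEY,
                             account_number TEXT, inv_date TEXT);
-- Assumed composite index (not in the original question).
CREATE INDEX idx_inv_acct_date ON Invoice_Header (account_number, inv_date);
""")
conn.executemany("INSERT INTO Customer VALUES (?, ?, ?)",
                 [(1, "Lapsed", "A1"), (2, "Active", "A2"), (3, "Small", "A3")])

rows = []
rows += [("A1", "2011-06-15")] * 50                            # 50 in the window, none after
rows += [("A2", "2011-06-15")] * 50 + [("A2", "2011-10-05")]   # also invoiced this month
rows += [("A3", "2011-06-15")] * 49                            # one short of the threshold
conn.executemany(
    "INSERT INTO Invoice_Header (account_number, inv_date) VALUES (?, ?)", rows)

# One pass over the invoices: count the six-month window and the current
# month separately, then filter on both counts.
query = """
SELECT c.id, c.name
FROM Customer AS c
JOIN (
    SELECT account_number,
           COUNT(CASE WHEN inv_date <= '2011-10-02' THEN 1 END) AS inv_count_6,
           COUNT(CASE WHEN inv_date >  '2011-10-02' THEN 1 END) AS inv_count_1
    FROM Invoice_Header
    WHERE inv_date >= '2011-03-02'
    GROUP BY account_number
) AS i ON i.account_number = c.account_number
WHERE i.inv_count_6 >= 50 AND i.inv_count_1 = 0
ORDER BY c.id
"""
flagged = conn.execute(query).fetchall()
print(flagged)
```

Only the customer with 50 invoices in the window and none afterwards should be flagged; the other two fail one condition each.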

Related

Mysql Group By get latest record with Count

This is my customer table.
I want to group by emp_id along with the count, but GROUP BY returns the 'first' record rather than the 'newest' one.
I have tried various queries, like this
SELECT id, emp_id, COUNT( * ) AS count, created_at
FROM customer c
WHERE created_at = (
SELECT MAX( created_at )
FROM customer c2
WHERE c2.emp_id = c.emp_id
)
GROUP BY emp_id
ORDER BY created_at DESC
LIMIT 0 , 30
But I cannot get the count. Please help.
Edit: this answer doesn't help to obtain count
Try joining to a subquery:
SELECT c1.id, c1.emp_id, c1.created_at, c2.cnt
FROM customer c1
INNER JOIN
(
SELECT emp_id, MAX(created_at) AS max_created_at, COUNT(*) AS cnt
FROM customer
GROUP BY emp_id
) c2
ON c1.emp_id = c2.emp_id AND c1.created_at = c2.max_created_at;
Please try this:
SELECT cust1.id, cust1.emp_id, cust1.created_at, cust2.cnt
FROM customer cust1
INNER JOIN
(
SELECT emp_id, MAX(created_at) AS max_created_at, COUNT(*) AS cnt
FROM customer
GROUP BY emp_id
) cust2
ON cust1.emp_id = cust2.emp_id AND cust1.created_at = cust2.max_created_at;
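To check that the join-to-a-grouped-subquery pattern returns both the newest row and the group count, here is a small sketch. It runs the SQL through Python's sqlite3 module as a stand-in for MySQL, and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, emp_id INTEGER, created_at TEXT)")
conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", [
    (1, 10, "2020-01-01 09:00:00"),
    (2, 10, "2020-01-03 09:00:00"),   # newest row for emp_id 10
    (3, 20, "2020-01-02 09:00:00"),   # only row for emp_id 20
    (4, 10, "2020-01-02 09:00:00"),
])

# The derived table computes MAX(created_at) and COUNT(*) per emp_id;
# joining back on both columns picks out the full newest row.
query = """
SELECT c1.id, c1.emp_id, c1.created_at, c2.cnt
FROM customer AS c1
JOIN (
    SELECT emp_id, MAX(created_at) AS max_created_at, COUNT(*) AS cnt
    FROM customer
    GROUP BY emp_id
) AS c2 ON c1.emp_id = c2.emp_id AND c1.created_at = c2.max_created_at
ORDER BY c1.created_at DESC
"""
latest = conn.execute(query).fetchall()
print(latest)
```

If two rows of the same emp_id share the maximum created_at, both come back, so a tie-breaker (for example the highest id) may be needed on real data.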

Simplify slow MySQL query

This query calculates the columns free,plus,score and total based on the COUNT of columns in subquery.
SELECT movie_title,movie_id,MAX(x.free_cnt) as free, MAX(x.plus_cnt) as plus,
(MAX(x.free_cnt) + (MAX(x.plus_cnt)*3)) AS score, (MAX(x.free_cnt) + MAX(x.plus_cnt)) AS total
FROM (
SELECT b.id as movie_id, b.movie_title as movie_title, COUNT(*) AS free_cnt, 0 as plus_cnt
FROM subtitles_request a1
LEFT JOIN movies b on a1.movie_id=b.id
JOIN users c on c.email=a1.email
WHERE c.subsc_status='0'
GROUP BY b.movie_title
UNION ALL
SELECT d.id as movie_id, d.movie_title as movie_title, 0 as free_cnt, COUNT(*) AS plus_cnt
FROM subtitles_request a2
LEFT JOIN movies d on a2.movie_id=d.id
JOIN users e on e.email=a2.email
WHERE e.subsc_status='1'
GROUP BY d.movie_title
) AS x
GROUP BY movie_title
ORDER BY total DESC
LIMIT 10
It performs slowly, and I'm wondering whether there is any way to simplify or change the query to speed it up. I can't calculate the free, plus, score and total columns outside of the query because I need to ORDER BY them. I may also incorporate a date filter.
Is there any way to simplify this query?
Try this:
SELECT b.movie_title, x.movie_id, MAX( x.free_cnt ) AS free, MAX( x.plus_cnt ) AS plus,
( MAX( x.free_cnt ) + ( MAX( x.plus_cnt ) * 3 ) ) AS score, ( MAX( x.free_cnt ) + MAX( x.plus_cnt ) ) AS total
FROM ( SELECT a.movie_id,
SUM( IF( c.subsc_status = '0', 1, 0 ) ) AS free_cnt,
SUM( IF( c.subsc_status = '1', 1, 0 ) ) AS plus_cnt
FROM subtitles_request a
JOIN users c on c.email=a.email
WHERE c.subsc_status in ('0','1')
GROUP BY a.movie_id
) AS x
LEFT JOIN movies b on x.movie_id = b.id
GROUP BY movie_title, movie_id
ORDER BY total DESC
LIMIT 10
Maybe I've simplified a bit too much. Moreover, I'm not used to grouping on only some of the non-aggregate fields, so I added movie_id to the GROUP BY, which changes your query a bit: if two films had the same name but different IDs, your original query would return only one of the IDs, and I guess (being a MySQL newbie, I really don't know) the counts would be for both films taken together.
HTH,
Set
Well, I have checked your subquery:
SELECT b.id as movie_id, b.movie_title as movie_title, COUNT(*) AS free_cnt, 0 as plus_cnt
FROM subtitles_request a1
LEFT JOIN movies b on a1.movie_id=b.id
JOIN users c on c.email=a1.email
WHERE c.subsc_status='0'
GROUP BY b.movie_title
UNION ALL
SELECT d.id as movie_id, d.movie_title as movie_title, 0 as free_cnt, COUNT(*) AS plus_cnt
FROM subtitles_request a2
LEFT JOIN movies d on a2.movie_id=d.id
JOIN users e on e.email=a2.email
WHERE e.subsc_status='1'
GROUP BY d.movie_title
The two statements on either side of "UNION ALL" can be replaced with a single statement using the condition c.subsc_status IN ('0','1'). You can then use a CASE WHEN expression in place of 0 AS free_cnt, COUNT(*) AS plus_cnt, for example SUM(CASE WHEN c.subsc_status='1' THEN 1 ELSE 0 END) AS plus_cnt. It's not a complicated SQL statement, so I don't think it should take long to run. Is there a lot of data?
As a matter of fact, I'm also a newcomer with only some experience of this. Please forgive me if it doesn't work.
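Both answers above suggest collapsing the two UNION ALL branches into one grouped pass with conditional sums. A minimal sketch of that idea follows, run through Python's sqlite3 module as a stand-in for MySQL; the table layouts are inferred from the question and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies (id INTEGER PRIMARY KEY, movie_title TEXT);
CREATE TABLE users (email TEXT PRIMARY KEY, subsc_status TEXT);
CREATE TABLE subtitles_request (movie_id INTEGER, email TEXT);
INSERT INTO movies VALUES (1, 'Alpha'), (2, 'Beta');
INSERT INTO users VALUES
  ('f1@example.com', '0'), ('f2@example.com', '0'), ('p1@example.com', '1');
INSERT INTO subtitles_request VALUES
  (1, 'f1@example.com'), (1, 'f2@example.com'), (1, 'p1@example.com'),
  (2, 'p1@example.com');
""")

# One grouped scan of the requests: each row adds 1 to free_cnt or plus_cnt
# depending on the requesting user's subscription status.
query = """
SELECT m.movie_title, x.movie_id, x.free_cnt AS free, x.plus_cnt AS plus,
       x.free_cnt + x.plus_cnt * 3 AS score,
       x.free_cnt + x.plus_cnt AS total
FROM (
    SELECT r.movie_id,
           SUM(CASE WHEN u.subsc_status = '0' THEN 1 ELSE 0 END) AS free_cnt,
           SUM(CASE WHEN u.subsc_status = '1' THEN 1 ELSE 0 END) AS plus_cnt
    FROM subtitles_request AS r
    JOIN users AS u ON u.email = r.email
    WHERE u.subsc_status IN ('0', '1')
    GROUP BY r.movie_id
) AS x
LEFT JOIN movies AS m ON m.id = x.movie_id
ORDER BY total DESC, x.movie_id
LIMIT 10
"""
ranking = conn.execute(query).fetchall()
print(ranking)
```

Grouping by movie_id inside the derived table means the outer query needs no second GROUP BY or MAX(), since each movie already has exactly one row.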

MySQL Query - SUM of COUNT from multiple tables

I have three tables:
customers: id, name
contracts_jewels: id, customer_id, paid, transferred, final_date
contracts_objects: id, customer_id, paid, transferred, final_date
As you see, the structure of the last two tables is the same.
The "paid" and the "transferred" fields contain the value 0 or 1.
What I need is to make a query which should return all the clients (no matter if they have contracts or not), and for each client:
id, name, count_contracts_all, count_contracts_active
where:
count_contracts_all would mean the sum of [SELECT COUNT( * ) FROM
contracts_jewels WHERE customer_id=3 (for example)] and [SELECT
COUNT( * ) FROM contracts_objects WHERE customer_id=3 (for example)]
count_contracts_active would mean the sum of [SELECT COUNT( * ) FROM
contracts_jewels WHERE customer_id=3 AND final_date>=Now() AND paid=0
AND transferred=0] and [SELECT COUNT( * ) FROM contracts_objects WHERE
customer_id=3 AND final_date>=Now() AND paid=0 AND transferred=0]
Any idea? Would you please help me? Thank you!
You can count the contracts separately and then just join them up to the customers:
SELECT
c.id,
COALESCE(oc.active_count,0) + COALESCE(jc.active_count,0) as count_contracts_active,
COALESCE(oc.total_count,0) + COALESCE(jc.total_count,0) as count_contracts_all
FROM customers c
LEFT JOIN (
SELECT
customer_id,
COUNT(*) as total_count,
COUNT(IF(final_date>=Now() AND paid=0 AND transferred=0,1,NULL)) as active_count
FROM contracts_jewels
GROUP BY customer_id
) as oc ON oc.customer_id = c.id
LEFT JOIN (
SELECT
customer_id,
COUNT(*) as total_count,
COUNT(IF(final_date>=Now() AND paid=0 AND transferred=0,1,NULL)) as active_count
FROM contracts_objects
GROUP BY customer_id
) as jc ON jc.customer_id = c.id
One quick solution I can think of right now is to run two queries. For the total count:
SELECT COUNT(*) FROM (
SELECT * FROM contracts_jewels WHERE customer_id=3 UNION ALL
SELECT * FROM contracts_objects WHERE customer_id=3) AS `temp_table`
and for the active count:
SELECT COUNT(*) FROM (
SELECT * FROM contracts_jewels WHERE customer_id=3 AND final_date>=Now() AND paid=0 AND transferred=0 UNION ALL
SELECT * FROM contracts_objects WHERE customer_id=3 AND final_date>=Now() AND paid=0 AND transferred=0) AS `temp_table`
You can join each of those tables twice and add their corresponding COUNTs in your result. Because the joins multiply rows, count DISTINCT ids:
SELECT
c.id,
(COUNT(DISTINCT cj1.id)+COUNT(DISTINCT co1.id)) AS count_contracts_all,
(COUNT(DISTINCT cj2.id)+COUNT(DISTINCT co2.id)) AS count_contracts_active
FROM
customers c
LEFT OUTER JOIN contracts_jewels cj1 ON c.id = cj1.customer_id
LEFT OUTER JOIN contracts_objects co1 ON c.id = co1.customer_id
LEFT OUTER JOIN contracts_jewels cj2 ON
c.id = cj2.customer_id AND
cj2.final_date >= NOW() AND
cj2.paid = 0 AND
cj2.transferred = 0
LEFT OUTER JOIN contracts_objects co2 ON
c.id = co2.customer_id AND
co2.final_date >= NOW() AND
co2.paid = 0 AND
co2.transferred = 0
GROUP BY c.id
Note: I haven't run this, but hopefully it sets you in the right direction.
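Simple solution (UNION ALL, because a plain UNION would merge branches that happen to return the same count):
SELECT SUM(c) FROM (
SELECT COUNT(1) as c FROM `tbl1` where ...
UNION ALL
SELECT COUNT(1) as c FROM tbl2 where ...
UNION ALL
SELECT COUNT(1) as c FROM tbl3 where ...
) al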
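The answers above can be combined: stack the two contract tables with UNION ALL, aggregate once per customer, then LEFT JOIN the result to customers so clients without contracts still appear. The sketch below tries this through Python's sqlite3 module as a stand-in for MySQL. The sample rows are invented, and a fixed `today` parameter replaces NOW() purely to keep the result reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE contracts_jewels (id INTEGER PRIMARY KEY, customer_id INTEGER,
                               paid INTEGER, transferred INTEGER, final_date TEXT);
CREATE TABLE contracts_objects (id INTEGER PRIMARY KEY, customer_id INTEGER,
                                paid INTEGER, transferred INTEGER, final_date TEXT);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO contracts_jewels VALUES
  (1, 1, 0, 0, '2030-01-01'),  -- active
  (2, 1, 1, 0, '2030-01-01'),  -- paid
  (3, 2, 0, 0, '2000-01-01');  -- expired
INSERT INTO contracts_objects VALUES
  (1, 1, 0, 0, '2030-01-01'),  -- active
  (2, 1, 0, 1, '2030-01-01'),  -- transferred
  (3, 1, 0, 0, '2030-06-01');  -- active
""")

# Aggregate before joining, so each customer contributes at most one row
# and the counts cannot be inflated by row multiplication.
query = """
SELECT c.id, c.name,
       COALESCE(t.total_count, 0)  AS count_contracts_all,
       COALESCE(t.active_count, 0) AS count_contracts_active
FROM customers AS c
LEFT JOIN (
    SELECT customer_id,
           COUNT(*) AS total_count,
           SUM(CASE WHEN final_date >= :today AND paid = 0 AND transferred = 0
                    THEN 1 ELSE 0 END) AS active_count
    FROM (
        SELECT customer_id, paid, transferred, final_date FROM contracts_jewels
        UNION ALL
        SELECT customer_id, paid, transferred, final_date FROM contracts_objects
    ) AS all_contracts
    GROUP BY customer_id
) AS t ON t.customer_id = c.id
ORDER BY c.id
"""
counts = conn.execute(query, {"today": "2020-01-01"}).fetchall()
print(counts)
```

In MySQL the `:today` placeholder would simply be NOW(), as in the question.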

Select the SUM of two different tables

I have an orders table which consists of the following:
id
order_total
delivery_cost
customer_id
I also have a transactions table which has:
id
amount
customer_id
status
What I'm trying to do is,
SELECT SUM(order_total + delivery_cost) FROM orders WHERE customer_id = '1'
then
SELECT SUM(amount) FROM transactions WHERE customer_id = '1' AND transaction_status = 'Paid'
Then, with that data, subtract the paid total from the order total.
I've tried different queries using JOINS, but I just can't get my head around it, for example:
SELECT SUM(OrdersTotal - TransactionTotal) as AccountBalance
FROM (
SELECT SUM(`order_total` + `delivery_cost`) FROM `orders` as OrdersTotal
UNION ALL
SELECT SUM(`amount`) FROM `transactions` WHERE `transaction_status` = 'Paid' as TransactionTotal
)
but this didn't work at all. Any help would be greatly appreciated.
The two datasets are effectively independent, but assuming it's unlikely to have transactions for a customer without orders, you can achieve your result with a LEFT JOIN rather than a full outer join. If you simply join the base tables, though, you'll likely get values from one table repeated in the interim result set (this is why Joseph B's answer is wrong when a customer has something other than a single row in each table).
SELECT ordered_value-IFNULL(paid_value,0) AS acct_balance
FROM
(
SELECT customer_id, SUM(order_total + delivery_cost) AS ordered_value
FROM orders
WHERE customer_id = '1'
GROUP BY customer_id
) AS orders
LEFT JOIN
(
SELECT customer_id, SUM(amount) AS paid_value
FROM transactions
WHERE customer_id = '1'
AND transaction_status = 'Paid'
GROUP BY customer_id
) as payments
ON orders.customer_id = payments.customer_id
(Here the GROUP BY and ON clauses are redundant since you're only looking at a single customer, but they are required for multiple customers.)
Note that while calculating a balance from a sum of transactions is technically correct, it does not scale well. For large systems a better solution (although it breaks the rules of normalization) is to maintain a unified table of transactions and store a balance for the account along with each transaction amount. Alternatively, use checkpointing.
You can just name the column in your UNION ALL query and sum on that, negating the payments so that they are subtracted:
SELECT SUM(col) as AccountBalance
FROM (SELECT SUM(`order_total` + `delivery_cost`) as col FROM `orders`
UNION ALL
SELECT -SUM(`amount`) FROM `transactions` WHERE `transaction_status` = 'Paid'
) t;
Try this query using a JOIN:
SELECT
SUM(o.order_total + o.delivery_cost) - SUM(t.amount) AS AccountBalance
FROM orders o
INNER JOIN transactions t
ON o.customer_id = t.customer_id AND o.customer_id = '1' AND t.transaction_status = 'Paid';
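This should give you the account balance for each customer. MySQL has no FULL OUTER JOIN and no two-argument ISNULL, so this version uses a LEFT JOIN from orders and COALESCE, which assumes every customer with transactions also has orders:
SELECT
o.customer_id,
o.OrdersTotal - COALESCE(t.TransactionTotal, 0) AS AccountBalance
FROM (
SELECT
customer_id,
SUM(order_total + delivery_cost) AS OrdersTotal
FROM
orders
GROUP BY
customer_id) o
LEFT JOIN (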
SELECT
customer_id,
SUM(amount) AS TransactionTotal
FROM
transactions
WHERE
transaction_status = 'Paid'
GROUP BY
customer_id) t
ON t.customer_id = o.customer_id
You can join the results of your two queries and do the calculation on them. Note that SUM without GROUP BY returns a single row, so the outer SUM in your attempt is meaningless, and UNION is not what your calculation needs.
SELECT t1.OrdersTotal - t2.TransactionTotal AS AccountBalance
FROM (
SELECT SUM(order_total + delivery_cost) OrdersTotal ,customer_id
FROM orders
WHERE customer_id = '1'
) t1
JOIN (
SELECT SUM(amount) TransactionTotal ,customer_id
FROM transactions
WHERE customer_id = '1' AND transaction_status = 'Paid'
) t2 USING(customer_id)
Since you are only concerned about a single customer, just have them listed as two different queries as the source...
select
charges.chg - paid.pay as Balance
from
( SELECT SUM(order_total + delivery_cost) chg
FROM orders
WHERE customer_id = '1' ) charges,
( SELECT SUM(amount) pay
FROM transactions
WHERE customer_id = '1' AND transaction_status = 'Paid' ) paid
Now, if you wanted to see who is outstanding across all customers, add the customer's ID to each query and apply a GROUP BY, but then change to a LEFT JOIN so you get all orders with or without payments:
select
charges.customer_id,
charges.chg - coalesce( paid.pay, 0 ) as Balance
from
( SELECT customer_id, SUM(order_total + delivery_cost) chg
FROM orders
group by customer_id ) charges
LEFT JOIN ( SELECT customer_id, SUM(amount) pay
FROM transactions
where transaction_status = 'Paid'
group by customer_id ) paid
on charges.customer_id = paid.customer_id
I think this works best if you try an inline view. In the code block below, check the GROUP BY clause: you might need to add more fields to it depending on what you're selecting in the inner SELECT statements. Try the code below:
SELECT
SUM(Totals.OrdersTotal-Totals.TransactionTotal)
FROM
(SELECT
SUM(ord.order_total + ord.delivery_cost) AS OrdersTotal
, SUM(trans.amount) AS TransactionTotal
FROM orders ord
INNER JOIN transactions trans
ON ord.customer_id = trans.customer_id
WHERE
ord.customer_id =1
AND trans.transaction_status = 'Paid'
GROUP BY
ord.customer_id
) Totals;
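The difference between joining the base tables and joining pre-aggregated totals is easy to see on a tiny data set. The sketch below uses Python's sqlite3 module as a stand-in for MySQL, with invented rows and integer amounts. It follows the question's queries in using a transaction_status column, although the question's column list calls it status.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, order_total INTEGER,
                     delivery_cost INTEGER, customer_id INTEGER);
CREATE TABLE transactions (id INTEGER PRIMARY KEY, amount INTEGER,
                           customer_id INTEGER, transaction_status TEXT);
INSERT INTO orders VALUES (1, 100, 10, 1), (2, 50, 5, 1), (3, 20, 0, 2);
INSERT INTO transactions VALUES
  (1, 60, 1, 'Paid'), (2, 40, 1, 'Paid'), (3, 99, 1, 'Failed');
""")

# Naive version: joining the base tables pairs every order with every paid
# transaction of the same customer, so both sums are inflated.
naive_sql = """
SELECT SUM(o.order_total + o.delivery_cost) - SUM(t.amount)
FROM orders AS o
JOIN transactions AS t
  ON t.customer_id = o.customer_id AND t.transaction_status = 'Paid'
WHERE o.customer_id = 1
"""
naive = conn.execute(naive_sql).fetchone()[0]

# Pre-aggregated version: one row per customer on each side of the join.
balance_sql = """
SELECT charges.customer_id, charges.chg - COALESCE(paid.pay, 0) AS balance
FROM (
    SELECT customer_id, SUM(order_total + delivery_cost) AS chg
    FROM orders
    GROUP BY customer_id
) AS charges
LEFT JOIN (
    SELECT customer_id, SUM(amount) AS pay
    FROM transactions
    WHERE transaction_status = 'Paid'
    GROUP BY customer_id
) AS paid ON paid.customer_id = charges.customer_id
ORDER BY charges.customer_id
"""
balances = conn.execute(balance_sql).fetchall()
print(naive, balances)
```

Customer 1 has two orders (165 in total) and two paid transactions (100 in total), so the true balance is 65; the naive join sees four order/payment pairs and reports double that.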

MySQL Query not displaying correctly

I am having to set up a query that retrieves the last comment made on a customer, if no one has commented on them for more than 4 weeks. I can make it work using the query below, but for some reason the comment column won't display the latest record. Instead it displays the oldest, however the date shows the newest. It may just be because I'm a noob at SQL, but what exactly am I doing wrong here?
SELECT DISTINCT
customerid, id, customername, user, MAX(date) AS 'maxdate', comment
FROM comments
WHERE customerid IN
(SELECT DISTINCT id FROM customers WHERE pastdue='1' AND hubarea='1')
AND customerid NOT IN
(SELECT DISTINCT customerid FROM comments WHERE DATEDIFF(NOW(), date) <= 27)
GROUP BY customerid
ORDER BY maxdate
The first "WHERE" clause is just ensuring that it shows only customers from a specific area, and that they are "past due enabled". The second makes sure that the customer has not been commented on within the last 27 days. It's grouped by customerid, because that is the number that is associated with each individual customer. When I get the results, everything is right except for the comment column...any ideas?
A join is usually better than a nested query, so use a join instead.
A join can increase your speed.
This query should resolve your problem:
SELECT DISTINCT
comments.customerid, comments.id, customername, user, MAX(comments.date) AS 'maxdate', comment
FROM comments inner join customers on comments.customerid = customers.id
WHERE customers.pastdue='1' AND customers.hubarea='1'
GROUP BY comments.customerid
HAVING DATEDIFF(NOW(), MAX(comments.date)) > 27
ORDER BY maxdate
I think this might do what you are trying to achieve. If you can execute it and report back whether it works, I can tweak it if needed. Logically it should work, if I have understood your problem correctly :)
SELECT X.customerid, X.maxdate, co.id, c.customername, co.user, co.comment
FROM
(SELECT customerid, MAX(date) AS 'maxdate'
FROM comments cm
INNER JOIN customers cu ON cu.id = cm.customerid
WHERE cu.pastdue='1'
AND cu.hubarea='1'
GROUP BY customerid
HAVING DATEDIFF(NOW(), MAX(cm.date)) > 27) X
INNER JOIN comments co ON X.customerid = co.customerid and X.maxdate = co.date
INNER JOIN customers c ON X.customerid = c.id
ORDER BY X.maxdate
You need to have subquery for each case.
SELECT a.*
FROM comments a
INNER JOIN
(
SELECT customerID, max(`date`) maxDate
FROM comments
GROUP BY customerID
) b ON a.customerID = b.customerID AND
a.`date` = b.maxDate
INNER JOIN
(
SELECT DISTINCT ID
FROM customers
WHERE pastdue = 1 AND hubarea = 1
) c ON c.ID = a.customerID
LEFT JOIN
(
SELECT DISTINCT customerid
FROM comments
WHERE DATEDIFF(NOW(), date) <= 27
) d ON a.customerID = d.customerID
WHERE d.customerID IS NULL
The first join gets the latest record for each customer.
The second join keeps only customers from a specific area that are "past due enabled".
The third join, which uses LEFT JOIN, matches customers that have been commented on within the last 27 days. Only rows with no such match are kept, because of the condition d.customerID IS NULL.
To make the query shorter: if the customers table already has one row per customer, you don't need a subquery on it. Join the table directly and put the conditions in the WHERE clause.
SELECT a.*
FROM comments a
INNER JOIN
(
SELECT customerID, max(`date`) maxDate
FROM comments
GROUP BY customerID
) b ON a.customerID = b.customerID AND
a.`date` = b.maxDate
INNER JOIN customers c
ON c.ID = a.customerID
LEFT JOIN
(
SELECT DISTINCT customerid
FROM comments
WHERE DATEDIFF(NOW(), date) <= 27
) d ON a.customerID = d.customerID
WHERE d.customerID IS NULL AND
c.pastdue = 1 AND
c.hubarea = 1
Several of your selected columns are not contained in either an aggregate function or the GROUP BY clause. For example, suppose you have two rows with the same customer id but different comment data. How should SQL aggregate these two rows? Standard SQL rejects the query; MySQL with ONLY_FULL_GROUP_BY disabled runs it but returns the comment from an arbitrary row of each group.
Try this:
select customerid, id, customername, user, date, comment from (
select customerid, id, customername, user, date, comment,
@rnk := IF(@current_customer = customerid, @rnk + 1, 1) AS rnk,
@current_customer := customerid
from comments
where customerid IN
(SELECT DISTINCT id FROM customers WHERE pastdue='1' AND hubarea='1')
AND customerid NOT IN
(SELECT DISTINCT customerid FROM comments WHERE DATEDIFF(NOW(), date) <= 27)
order by customerid, date desc
) ranked where rnk <= 1
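The "latest comment per customer, but only if it is older than 27 days" logic can be checked on sample data. The sketch below runs through Python's sqlite3 module as a stand-in for MySQL, with invented rows. Since SQLite has no DATEDIFF, it compares against a cutoff date computed in Python from a fixed, hypothetical `today`; in MySQL the HAVING line would be the DATEDIFF(NOW(), MAX(date)) > 27 form used in the answers.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, customername TEXT,
                        pastdue TEXT, hubarea TEXT);
CREATE TABLE comments (id INTEGER PRIMARY KEY, customerid INTEGER,
                       user TEXT, date TEXT, comment TEXT);
INSERT INTO customers VALUES
  (1, 'Acme', '1', '1'), (2, 'Bolt', '1', '1'), (3, 'Core', '0', '1');
INSERT INTO comments VALUES
  (1, 1, 'amy', '2013-01-05', 'oldest'),
  (2, 1, 'bob', '2013-02-01', 'newest'),
  (3, 2, 'amy', '2013-01-10', 'stale'),
  (4, 2, 'bob', '2013-03-25', 'recent'),
  (5, 3, 'amy', '2013-01-01', 'not past due');
""")

today = date(2013, 4, 1)                          # fixed so the run is repeatable
cutoff = (today - timedelta(days=27)).isoformat()  # comments on/after this are "recent"

# Step 1: newest comment date per customer, kept only if it predates the cutoff.
# Step 2: join back to comments to fetch the whole row for that date.
query = """
SELECT c.id, c.customername, co.user, co.date, co.comment
FROM (
    SELECT customerid, MAX(date) AS maxdate
    FROM comments
    GROUP BY customerid
    HAVING MAX(date) < :cutoff
) AS latest
JOIN comments AS co
  ON co.customerid = latest.customerid AND co.date = latest.maxdate
JOIN customers AS c ON c.id = latest.customerid
WHERE c.pastdue = '1' AND c.hubarea = '1'
ORDER BY latest.maxdate
"""
stale = conn.execute(query, {"cutoff": cutoff}).fetchall()
print(stale)
```

Here the comment text comes from the same row as the maximum date, which is exactly what the original GROUP BY query could not guarantee.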