I have a sales table with a sales column TCV and a customerKey column that references the customer table. Each row has an order_date; the other columns are irrelevant to this query.
I need to find sales for the current period and for another period, grouped by customer, for comparison. This is the query I have:
SELECT
*,
SUM(TCV) AS sales,
current_sales - SUM(TCV) AS difference
FROM
`sales`
LEFT JOIN
(SELECT
customerKey AS customerID,
SUM(TCV) AS current_sales
FROM
`sales`
WHERE `order_date` BETWEEN '2020-08-01'
AND '2020-08-31'
GROUP BY `sales`.`customerKey`) AS `current_sales`
ON `customerKey` = `customerID`
WHERE `order_date` BETWEEN '2020-09-01'
AND '2020-09-30'
GROUP BY `sales`.`customerKey`
This query runs very slowly, taking about 30 seconds, but if I run it without the join the result comes back in a second.
What could be the problem? Is it structured wrong?
Rewrite it without the join, using conditional aggregation so that the table is read only once; it should perform better:
select s.*
, current_sales - sales as difference
from
( SELECT customerKey
, sum(CASE WHEN order_date BETWEEN '2020-08-01' AND '2020-08-31' then TCV else 0 end) current_sales
, sum(CASE WHEN order_date BETWEEN '2020-09-01' AND '2020-09-30' then TCV else 0 end) sales
FROM sales
WHERE order_date BETWEEN '2020-08-01' AND '2020-09-30'
GROUP
BY sales.customerKey
) s
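To check that the conditional-aggregation rewrite returns sensible numbers, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The sample rows are made up for illustration; only the table and column names (sales, customerKey, order_date, TCV) come from the question, and the dates are assumed to be stored as ISO-formatted text.

```python
import sqlite3

# Hypothetical sample data; only the column names come from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customerKey INTEGER, order_date TEXT, TCV REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        (1, "2020-08-05", 100.0),
        (1, "2020-09-10", 40.0),
        (2, "2020-09-15", 70.0),  # customer with no August rows
        (3, "2020-08-20", 25.0),  # customer with no September rows
    ],
)

# One pass over the table: each SUM only counts rows from its own period.
query = """
SELECT customerKey,
       SUM(CASE WHEN order_date BETWEEN '2020-08-01' AND '2020-08-31'
                THEN TCV ELSE 0 END) AS current_sales,
       SUM(CASE WHEN order_date BETWEEN '2020-09-01' AND '2020-09-30'
                THEN TCV ELSE 0 END) AS sales
FROM sales
WHERE order_date BETWEEN '2020-08-01' AND '2020-09-30'
GROUP BY customerKey
ORDER BY customerKey
"""

# Append the difference in Python, mirroring "current_sales - sales".
rows = [(key, cur, other, cur - other) for key, cur, other in conn.execute(query)]
for row in rows:
    print(row)
```

Note that, unlike the original LEFT JOIN query, this version also returns customers who had sales in only one of the two months (customer 3 in the sample), with 0 for the missing period.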
Related
Use a correlated subquery to return one row per vendor, representing the vendor’s oldest invoice (the one with the earliest date) that is due within the next 2 weeks.
Each row should include these five columns:
vendor_name
invoice_number
invoice_date
invoice_due_date
invoice_total
This is what I have so far; I'm just stuck on how to handle the invoice_due_date:
SELECT vendor_name, invoice_number, invoice_date, invoice_total
FROM vendors v JOIN invoices i
WHERE invoice_date <= ( SELECT Min(invoice_date)
FROM invoices JOIN vendors ON v.vendor_id = v.vendor_id )
GROUP BY vendor_name, invoice_number, invoice_date, invoice_total;
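One possible shape for such a query, sketched against an in-memory SQLite database from Python. This is not a confirmed solution to the exercise: the schema is assumed (an invoices table carrying vendor_id plus the four invoice columns listed above), the sample rows are invented, and the reference date is passed in as a hypothetical :today parameter so the result is reproducible. The correlated subquery finds, for each vendor, the earliest invoice_date among invoices due within the next 14 days.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vendors (vendor_id INTEGER, vendor_name TEXT);
CREATE TABLE invoices (vendor_id INTEGER, invoice_number TEXT,
                       invoice_date TEXT, invoice_due_date TEXT,
                       invoice_total REAL);
INSERT INTO vendors VALUES (1, 'Acme'), (2, 'Bolt'), (3, 'Core');
INSERT INTO invoices VALUES
  (1, 'A-0', '2023-12-01', '2024-02-01',  75.0),  -- already past due
  (1, 'A-1', '2024-01-10', '2024-03-05', 100.0),  -- due in window, oldest
  (1, 'A-2', '2024-02-01', '2024-03-10', 250.0),  -- due in window, newer
  (2, 'B-1', '2024-02-15', '2024-03-14',  40.0),  -- due in window
  (3, 'C-1', '2024-02-20', '2024-04-30',  90.0);  -- due too late
""")

query = """
SELECT v.vendor_name, i.invoice_number, i.invoice_date,
       i.invoice_due_date, i.invoice_total
FROM vendors v
JOIN invoices i ON i.vendor_id = v.vendor_id
WHERE i.invoice_due_date BETWEEN :today AND date(:today, '+14 days')
  AND i.invoice_date = (
        -- correlated subquery: re-evaluated for each outer row's vendor
        SELECT MIN(i2.invoice_date)
        FROM invoices i2
        WHERE i2.vendor_id = i.vendor_id
          AND i2.invoice_due_date BETWEEN :today AND date(:today, '+14 days')
      )
ORDER BY v.vendor_name
"""

rows = conn.execute(query, {"today": "2024-03-01"}).fetchall()
for row in rows:
    print(row)
```

The date(:today, '+14 days') call is SQLite syntax; other engines spell the date arithmetic differently. If a vendor has two qualifying invoices with the same earliest invoice_date, this sketch returns both.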
Below is a SQL query problem for which I cannot work out the correct approach:
DB tables:
Employee: emp_id, emp_name
Credit: credit_id, emp_id, credit_date, credit_amount
debit: debit_id, emp_id, debit_date, debit_amount
Here, each employee can have multiple credits and debits.
Query requirement: At the end of each day, each employee has some asset ('credit till now' - 'debit till now'). We need to find the top five employees by maximum asset, along with the date on which each of them had that maximum.
I have tried the query below, but it seems I am missing something:
select Credit.emp_id, Credit.date, (Credit.income_amount - Debit.credit_amount) from
(select emp_id, sum(amount) as credit_amount
from credit) Credit
LEFT JOIN LATERAL (
select emp_id, sum(amount) as debit_amount
from debits
where debits.emp_id = Credit.emp_id and Credit.date >= debits.date
group by debits.emp_id
) Debit
ON true
Here I'm breaking the query into steps to make it more readable.
First of all, we need the total amount per day for both credit and debit, so that we can join the two on the day level with the same emp_id.
with
credit as(
select emp_id,credit_date as date,sum(credit_amount) as amount
from credit
group by 1,2),
debit as(
select emp_id,debit_date as date,sum(debit_amount) as amount
from debit
group by 1,2),
Now we need to full outer join the "credit" and "debit" subqueries:
payments as (
select distinct
case when c.emp_id is null then d.emp_id else c.emp_id end as emp_id ,
case when c.emp_id is null then d.date else c.date end as date,
case when c.emp_id is null then 0 else c.amount end as credit ,
case when d.emp_id is null then 0 else d.amount end as debit
from credit c
full outer join debit d on d.emp_id=c.emp_id and d.date=c.date
),
Now we will take the day-wise cumulative sum for credit, debit and total balance as shown below.
total_balance as(
SELECT emp_id, date,
sum(credit) OVER (PARTITION BY emp_id ORDER BY date asc) AS total_credit,
sum(debit) OVER (PARTITION BY emp_id ORDER BY date asc) AS total_debit,
(sum(credit) OVER (PARTITION BY emp_id ORDER BY date asc) -
sum(debit) OVER (PARTITION BY emp_id ORDER BY date asc)) as total_balance
FROM payments
ORDER BY emp_id, date),
Now we need to use the rank() function to rank each day by total balance (descending) within an emp_id, i.e. rank=1 is assigned to the day with the largest total balance for that emp_id. The query is shown below.
ranks as (select emp_id,date,total_balance,
rank() over (partition by emp_id order by total_balance desc) as rank
from total_balance ),
Now pick the rows having rank=1 (i.e. the maximum total_balance for each emp_id and the date on which it occurred).
Order them by total_balance descending and pick the top 5 rows:
emp_order as (select emp_id,date,total_balance
from ranks
where rank=1
order by 3 desc
limit 5)
Now pick the name from the employee table.
select eo.emp_id, e.emp_name, eo.date, eo.total_balance as balance
from emp_order eo
join Employee e on e.emp_id = eo.emp_id
order by 4 desc
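The steps above can be condensed and tried out on a small dataset. The sketch below is a variation rather than the exact query from this answer: it stacks credits and negated debits with UNION ALL instead of a full outer join (so it also runs on engines without FULL OUTER JOIN), and it uses ROW_NUMBER() instead of rank() so that ties resolve to the earliest date. The table and column names follow the question; the sample rows and employee names are invented. Window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (emp_id INTEGER, emp_name TEXT);
CREATE TABLE Credit (credit_id INTEGER, emp_id INTEGER,
                     credit_date TEXT, credit_amount REAL);
CREATE TABLE Debit (debit_id INTEGER, emp_id INTEGER,
                    debit_date TEXT, debit_amount REAL);
INSERT INTO Employee VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO Credit VALUES (1, 1, '2021-01-01', 100),
                          (2, 1, '2021-01-03', 50),
                          (3, 2, '2021-01-02', 80);
INSERT INTO Debit VALUES (1, 1, '2021-01-02', 30),
                         (2, 2, '2021-01-02', 20),
                         (3, 2, '2021-01-04', 10);
""")

query = """
WITH movements AS (          -- credits as +amount, debits as -amount
    SELECT emp_id, credit_date AS day, credit_amount AS delta FROM Credit
    UNION ALL
    SELECT emp_id, debit_date, -debit_amount FROM Debit
),
daily AS (                   -- net change per employee per day
    SELECT emp_id, day, SUM(delta) AS net
    FROM movements
    GROUP BY emp_id, day
),
running AS (                 -- end-of-day balance (cumulative sum)
    SELECT emp_id, day,
           SUM(net) OVER (PARTITION BY emp_id ORDER BY day) AS balance
    FROM daily
),
ranked AS (                  -- best day per employee gets rn = 1
    SELECT emp_id, day, balance,
           ROW_NUMBER() OVER (PARTITION BY emp_id
                              ORDER BY balance DESC, day) AS rn
    FROM running
)
SELECT e.emp_name, r.day, r.balance
FROM ranked r
JOIN Employee e ON e.emp_id = r.emp_id
WHERE r.rn = 1
ORDER BY r.balance DESC
LIMIT 5
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

In the sample data Ann peaks at 120 on 2021-01-03 (100 - 30 + 50) and Bob at 60 on 2021-01-02 (80 - 20), before his later debit lowers the balance.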
Grouping by emp_id and credit_date and summing gives the total credit for each person on each day in one record. You can do a similar thing in a subquery to subtract the debit; COALESCE covers days with no matching debit.
Select top 5 c.emp_id, c.credit_date, (sum(c.credit_amount) -
coalesce((select sum(d.debit_amount) from debit d
where c.emp_id = d.emp_id and c.credit_date = d.debit_date), 0)
) as total
from Credit c group by c.emp_id, c.credit_date order by total desc
I want to get the value of balance from two tables policies and payments. MySQL code below:
SELECT Sum(policy.premium) AS
`total`
,
(SELECT Sum(payments.amount)
FROM payments
WHERE ( payments.date_paid BETWEEN '2019-03-01' AND '2019-03-31' )) AS
`paid`
FROM `policy`
LEFT JOIN payments
ON policy.code = payments.code
WHERE ( policy.st BETWEEN '2019-03-01' AND '2019-03-31' )
AND policy.trn_type = 0
The paid column returns NULL. Also, how can I get the difference between total and paid?
You can calculate the 2 values like this:
select t.total, coalesce(t.paid, 0) paid, (t.total - coalesce(t.paid, 0)) diff from (
select
(select sum(premium) from policy where st between '2019-03-01' and '2019-03-31' and policy.trn_type=0) total,
(select sum(amount) from payments where date_paid between '2019-03-01' and '2019-03-31') paid
) t
Paid column is just this part:
SELECT SUM(payments.amount)
FROM payments
WHERE payments.date_paid BETWEEN '2019-03-01' AND '2019-03-31'
If it returns NULL, first check what you have in your data: SUM() returns NULL when no rows match the date range.
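To illustrate one way the paid column can come back NULL, here is a small sketch run against in-memory SQLite from Python. The sample rows are invented so that no payment falls inside March 2019; the table and column names follow the question. It first shows the raw SUM() result, then the two independent scalar subqueries with COALESCE applied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE policy (code TEXT, premium REAL, st TEXT, trn_type INTEGER);
CREATE TABLE payments (code TEXT, amount REAL, date_paid TEXT);
INSERT INTO policy VALUES ('P1', 500, '2019-03-10', 0),
                          ('P2', 300, '2019-03-20', 0);
-- The only payment is in April, so nothing matches the March range.
INSERT INTO payments VALUES ('P1', 200, '2019-04-02');
""")

# SUM() over zero matching rows yields NULL, which Python sees as None.
raw_paid = conn.execute("""
SELECT SUM(amount) FROM payments
WHERE date_paid BETWEEN '2019-03-01' AND '2019-03-31'
""").fetchone()[0]

# Two independent scalar subqueries; COALESCE turns the NULL into 0.
total, paid, diff = conn.execute("""
SELECT t.total, COALESCE(t.paid, 0), t.total - COALESCE(t.paid, 0)
FROM (
    SELECT (SELECT SUM(premium) FROM policy
            WHERE st BETWEEN '2019-03-01' AND '2019-03-31'
              AND trn_type = 0) AS total,
           (SELECT SUM(amount) FROM payments
            WHERE date_paid BETWEEN '2019-03-01' AND '2019-03-31') AS paid
) AS t
""").fetchone()

print(raw_paid, total, paid, diff)
```

Keeping the two sums in separate subqueries also avoids the row multiplication that a join between policy and payments would introduce when a policy has several payments.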
I currently have 2 queries that I would like to combine into 1 if possible.
I have open tickets stored in the Tickets_Open table and closed tickets in Tickets_Closed. Both tables have "Date_Requested" and "Date_Completed" columns. I need to count the number of tickets requested and completed each day.
My tickets requested count query is the following:
SELECT SUM(Count) AS TotalOpen, Date FROM(
SELECT COUNT(Ticket_Request_Code) AS Count, Date_Requested AS Date
FROM Tickets_Closed
WHERE Date_Requested >='2018-01-01 00:00:00'
GROUP BY(Date_Requested)
UNION
SELECT COUNT(Work_Request_Code) AS Count, Date_Requested AS Date
FROM Tickets_Open
WHERE Date_Requested >='2018-01-01 00:00:00'
GROUP BY(Date_Requested)
) AS t1 GROUP BY Date ORDER BY `t1`.`Date` DESC
My tickets completed count query is the following:
SELECT COUNT(Ticket_Request_Code) AS CountClosed, Date_Completed AS Date
FROM Tickets_Closed
Where Date_Completed >='2018-01-01 00:00:00'
GROUP BY(Date_Completed)
Both queries return the correct result. For open it returns with the column headings Date and TotalOpen. For close it returns with the column headings Date and CountClosed.
Is it possible to return a single result with the column headings Date, TotalOpen and CountClosed?
You can combine these as:
SELECT Date, SUM(isopen) as TotalOpen, SUM(isclose) as CountClosed
FROM ((SELECT date_requested as date, 1 as isopen, 0 as isclose
FROM Tickets_Closed
WHERE Date_Requested >= '2018-01-01'
) UNION ALL
(SELECT date_requested, 1 as isopen, 0 as isclose
FROM Tickets_Open
WHERE Date_Requested >= '2018-01-01'
) UNION ALL
(SELECT Date_Completed as date, 0 as isopen, 1 as isclose
FROM Tickets_Closed
WHERE Date_Completed >= '2018-01-01'
)
) t
GROUP BY Date
ORDER BY Date DESC;
This assumes that Ticket_Request_Code and Work_Request_Code are not NULL. If COUNT() is really being used to check for NULL values, then add the condition to the WHERE clause in each subquery.
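The UNION ALL approach can be tried on a few rows as follows. This is a sketch on in-memory SQLite driven from Python, with invented tickets; the table and column names follow the question. SQLite does not accept parentheses around the individual SELECTs of a compound query, so they are dropped here, and the flag columns are named is_open and is_closed for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Tickets_Open (Work_Request_Code TEXT,
                           Date_Requested TEXT, Date_Completed TEXT);
CREATE TABLE Tickets_Closed (Ticket_Request_Code TEXT,
                             Date_Requested TEXT, Date_Completed TEXT);
INSERT INTO Tickets_Open VALUES ('W1', '2018-02-01', NULL),
                                ('W2', '2018-02-02', NULL);
INSERT INTO Tickets_Closed VALUES ('T1', '2018-02-01', '2018-02-02'),
                                  ('T2', '2018-02-01', '2018-02-03');
""")

# Each branch emits one row per event, flagged as a request or a completion;
# the outer query then sums the flags per date.
query = """
SELECT Date, SUM(is_open) AS TotalOpen, SUM(is_closed) AS CountClosed
FROM (
    SELECT Date_Requested AS Date, 1 AS is_open, 0 AS is_closed
    FROM Tickets_Closed
    WHERE Date_Requested >= '2018-01-01'
    UNION ALL
    SELECT Date_Requested, 1, 0
    FROM Tickets_Open
    WHERE Date_Requested >= '2018-01-01'
    UNION ALL
    SELECT Date_Completed, 0, 1
    FROM Tickets_Closed
    WHERE Date_Completed >= '2018-01-01'
) AS t
GROUP BY Date
ORDER BY Date DESC
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

A date on which tickets were only completed (2018-02-03 in the sample) still appears, with TotalOpen 0, which is the case a plain join on dates would miss.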
This query uses a FULL OUTER JOIN on the dates and adds the open and closed counts together to give the TotalOpen count. It also handles the NULL values that arise on days when no tickets are closed.
WITH open AS
(
SELECT COUNT(Work_Request_Code) AS OpenCount, Date_Requested AS Date
FROM Tickets_Open
WHERE Date_Requested >='2018-01-01 00:00:00'
GROUP BY(Date_Requested)
)
, closed AS
(
SELECT COUNT(Ticket_Request_Code) AS ClosedCount, Date_Requested AS Date
FROM Tickets_Closed
WHERE Date_Requested >='2018-01-01 00:00:00'
GROUP BY(Date_Requested)
)
SELECT
COALESCE(c.Date, o.Date) AS Date
, IFNULL(o.OpenCount, 0) + IFNULL(c.ClosedCount, 0) AS TotalOpen
, IFNULL(c.ClosedCount, 0) AS CountClosed
FROM open o
FULL OUTER JOIN closed c ON o.Date = c.Date
first one:
SELECT MONTH(timestamp) AS d, COUNT(*) AS c
FROM table
WHERE YEAR(timestamp)=2012 AND Status = 1
GROUP BY MONTH(timestamp)
One of the issues I'm facing with this one is that I have to run multiple queries that use different values for Status. Is there a way to combine them into one? For example, one column would hold the counts for Status=1, another the counts for Status=2, and so on.
second one:
SELECT COUNT(*) c , MONTH(timestamp) t FROM
(
SELECT t.adminid, timestamp
FROM table1 t
LEFT JOIN admins a ON a.adminID=t.adminID
WHERE YEAR(timestamp)=2012
GROUP BY t.adminID, DATE(Timestamp)
ORDER BY timestamp DESC
) AS a
GROUP BY MONTH(timestamp)
ORDER BY MONTH(timestamp) ASC;
This is a nested query, and I'm not sure if I can improve on it. I'm running it on 2 tables, one with ~35k rows and one with ~300k rows. It takes about half a second for the first table and 4-5 seconds for the second.
These might help:
First one:
SELECT MONTH(timestamp) AS d,
sum(case when Status=1 then 1 else 0 end) as Status1Count,
sum(case when Status=2 then 1 else 0 end) as Status2Count,
sum(case when Status=3 then 1 else 0 end) as Status3Count
FROM `table`
WHERE timestamp between '2012-01-01 00:00:00' and '2012-12-31 23:59:59'
AND Status in (1,2,3)
GROUP BY MONTH(timestamp);
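A quick way to see the per-status columns in action is the sketch below, run on in-memory SQLite from Python. The table name events and its ts column are hypothetical stand-ins for `table` and timestamp, and the rows are invented. SQLite has no MONTH() function, so strftime('%m', ...) is substituted for it here; the range filter is written as a half-open interval on the raw column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ts TEXT, Status INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        ("2011-12-31 23:00:00", 1),  # outside 2012, filtered out
        ("2012-01-05 10:00:00", 1),
        ("2012-01-06 11:00:00", 2),
        ("2012-01-07 12:00:00", 1),
        ("2012-02-01 09:00:00", 3),
        ("2012-02-02 09:00:00", 4),  # status not of interest, filtered out
    ],
)

# One row per month, one counter column per status value.
query = """
SELECT CAST(strftime('%m', ts) AS INTEGER) AS month,
       SUM(CASE WHEN Status = 1 THEN 1 ELSE 0 END) AS Status1Count,
       SUM(CASE WHEN Status = 2 THEN 1 ELSE 0 END) AS Status2Count,
       SUM(CASE WHEN Status = 3 THEN 1 ELSE 0 END) AS Status3Count
FROM events
WHERE ts >= '2012-01-01 00:00:00' AND ts < '2013-01-01 00:00:00'
  AND Status IN (1, 2, 3)
GROUP BY month
ORDER BY month
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each result tuple is (month, Status1Count, Status2Count, Status3Count), so a single scan replaces one query per status value.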
Second one:
Make sure that there is an index on the timestamp column, and that you do not apply conversion functions such as MONTH(timestamp) to the indexed column in the WHERE clause. Something like:
SELECT COUNT(*) c , a.m as t FROM
(
SELECT t.adminid, timestamp, MONTH(timestamp) as m
FROM table1 t
LEFT JOIN admins a ON a.adminID=t.adminID
WHERE timestamp between '2012-01-01 00:00:00' and '2012-12-31 23:59:59'
GROUP BY t.adminID, DATE(Timestamp)
ORDER BY timestamp DESC
) AS a
GROUP BY a.m
ORDER BY a.m ASC;
The second one is a bit tricky since I do not have the data in front of me, so I can't see the DB access path!