sql for Group By and SUM with date range condition - mysql

Below is the Invoices table:
I am trying to make a sql query which gives me output based on due_date range with the sum of balance_amount group by company_id
My try:
select invoices.company_id,
SUM(invoices_cmonth.balance_amount) as cmonth,
SUM(invoices_1month.balance_amount) as 1month,
SUM(invoices_2month.balance_amount) as 2month
from `invoices`
LEFT JOIN invoices invoices_cmonth
ON (invoices.company_id = invoices_cmonth.company_id and invoices_cmonth.due_date >= '2021-11-10')
LEFT JOIN invoices invoices_1month
ON (invoices.company_id = invoices_1month.company_id and invoices_1month.due_date < '2021-11-10' and invoices_1month.due_date >= '2021-10-10')
LEFT JOIN invoices invoices_2month
ON (invoices.company_id = invoices_2month.company_id and invoices_2month.due_date < '2021-10-10' and invoices_2month.due_date >= '2021-9-10')
where invoices.`status` = 'ACTIVE'
and invoices.`balance_amount` > 0
and `invoices`.`deleted_at` is null
group by invoices.`company_id`
But it is giving me wrong figures in balance amount.

I suggest just making a single pass over the invoices table using conditional aggregation for the various time windows:
SELECT
company_id,
SUM(CASE WHEN due_date >= '2021-11-10' THEN balance_amount ELSE 0 END) AS cmonth,
SUM(CASE WHEN due_date >= '2021-10-10' AND due_date < '2021-11-10'
THEN balance_amount ELSE 0 END) AS 1month,
SUM(CASE WHEN due_date >= '2021-09-10' AND due_date < '2021-10-10'
THEN balance_amount ELSE 0 END) AS 2month
FROM invoices
WHERE
status = 'ACTIVE' AND balance_amount > 0 AND deleted_at IS NULL
GROUP BY
company_id;

Related

Query with multiple AND conditions

I have my query like below
SELECT dates,
COUNT(link_data_id) as TotalClicks,
sum(case when link_redirect_status = 1 then 1 else 0 end) AS GoodClicks,
sum(case when link_redirect_status != 1 then 1 else 0 end) AS BadClicks
FROM tbl_calendar
LEFT JOIN tbl_links_data ON dates = CAST(link_data_time AS DATE)
WHERE (`dates` BETWEEN '2021-11-28' AND DATE_ADD('2021-11-28', INTERVAL 7 DAY))
GROUP BY dates
It's giving me expected output like below
But I want add one more condition called link_order_id ='abcde', so I am trying like below
SELECT dates,
COUNT(link_data_id) as TotalClicks,
sum(case when link_redirect_status = 1 then 1 else 0 end) AS GoodClicks,
sum(case when link_redirect_status != 1 then 1 else 0 end) AS BadClicks
FROM tbl_calendar
LEFT JOIN tbl_links_data ON dates = CAST(link_data_time AS DATE)
WHERE link_order_id = 'abcde'
AND (`dates` BETWEEN '2021-11-28' AND DATE_ADD('2021-11-28', INTERVAL 7 DAY))
GROUP BY dates
But it's giving me only two rows like below
Why I am getting only two rows instead of 8 rows like first picture?
Move the criteria in the WHERE clause to the ON clause of the join:
SELECT
c.dates,
COUNT(d.link_data_id) AS TotalClicks,
SUM(CASE WHEN d.link_redirect_status = 1 THEN 1 ELSE 0 END) AS GoodClicks,
SUM(CASE WHEN d.link_redirect_status != 1 THEN 1 ELSE 0 END) AS BadClicks
FROM tbl_calendar c
LEFT JOIN tbl_links_data d
ON c.dates = CAST(d.link_data_time AS DATE) AND
d.link_order_id = 'abcde'
WHERE (c.dates BETWEEN '2021-11-28' AND DATE_ADD('2021-11-28', INTERVAL 7 DAY))
GROUP BY c.dates;

Display all data grouped by date in a particular timeframe

Currently I have 2 tables, a listing table and a logs table. With the following query I'm trying to get the listings of a product on a particular day, and it returns the right output.
with X as (
select
l.*,
(select status_from from logs where logs.refno = l.refno and logs.logtime >= '2021-10-01' order by logs.logtime limit 1) logstat
from listings l
where l.added_date < '2021-10-01'
)
, Y as (select X.*, ifnull(X.logstat, X.status) stat from X)
SELECT
status.text,
COUNT(Y.id) AS c
from status
left join Y on Y.stat = status.code
group by status.code, status.text;
This gives an output like this:
Here I've filtered the query by 1 date which in this case is 2021-10-01. Now I have 2 input forms where the user can select a from date and a to date. So I want to be able to get all the data between the date range provided. So basically if I choose a date between 2021-10-01 and 2021-10-02, it should show everything on and between that date. The output should look like:
Date
Publish
Action
Let
Sold
Draft
2021-10-01
0
3
0
1
1
2021-10-02
0
2
0
1
2
Dbfiddle: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=5e0b8d484a41ac9104d0fb002e7f9a3c
I've formatted the table to show the entries in a row wise manner with the following query:
with X as (
select l.*,
(select status_from from logs where logs.refno = l.refno and logs.logtime >= '2021-10-01' order by logs.logtime limit 1) logstat
from listings l
where l.added_date < '2021-10-01'
)
, Y as (select X.*, ifnull(X.logstat, X.status) stat20211001 from X)
SELECT
sum(case when status.text= 'Action' and Y.id is not null then 1 else 0 end) as `Action`,
sum(case when status.text= 'Draft' and Y.id is not null then 1 else 0 end) as `Draft`,
sum(case when status.text= 'Let' and Y.id is not null then 1 else 0 end) as `Let`,
sum(case when status.text= 'Sold' and Y.id is not null then 1 else 0 end) as `Sold`,
sum(case when status.text= 'Publish' and Y.id is not null then 1 else 0 end) as `Publish`
from status
left join Y on Y.stat20211001 = status.code
Output for this statement:
If you open my dbfiddle and enter date as 2021-10-01 it gives correct output and if you enter 2021-10-02 it shows correct output. Now I just want a way to show these both together. Also if it is suppose 2021-10-01 and 2021-10-05, it should show everything in middle too which means 2021-10-01, 2021-10-02, 2021-10-03, 2021-10-04 and 2021-10-05
Your listings.added_date column has the DATETIME data type. Therefore, to select a date range of 2021-10-01 to 2021-10-02 you need to do this.
WHERE added_date >= '2021-10-01'
AND added_date < '2021-10-02' + INTERVAL 1 DAY
This pulls in all the rows from midnight on 1-October, up to but not including midnight on 3-October.
If you want to aggregate your results by day, you can use GROUP BY DATE(added_date).
A sample query -- to show all days in September -- might look like this:
SELECT DATE(added_date) day,
SUM(CASE WHEN status.text= 'Action' THEN 1 ELSE 0 END) AS `Action`,
SUM(CASE WHEN status.text= 'Draft' THEN 1 ELSE 0 END) AS `Draft`,
SUM(CASE WHEN status.text= 'Let' THEN 1 ELSE 0 END) AS `Let`
FROM tbl
WHERE added_date >= '2021-09-01'
AND added_date < '2021-09-01' + INTERVAL 1 MONTH
GROUP BY DATE(added_date);
Sorry to say, I don't understand how your sample query works well enough to rewrite it with GROUP BY. But this should get you started.

Get records with difference on 2 different date ranges in single query

I have sales table and have two different date ranges.
i.e, I have total sales between (2016-12-21 - 2016-12-30) is 100 and for period (2016-12-11 - 2016-12-20) is 85.
Now the result I want is
100 (sales of 2016-12-21 - 2016-12-30), 85 (sales of 2016-12-11 - 2016-12-20), 15 (difference of both periods) through single query.
What I am thinking is
select *, (a.sales - b.sales) as diff
from (select id, sum(sales) as sales from salestable where date >= '2016-12-21' and date <= '2016-12-30') a
join (select id, sum(sales) as sales from salestable where date >= '2016-12-11' and date <= '2016-12-20') b
on a.id = b.id;
Is there any other better way to do this?
You can use conditional aggregation:
select sum(case when date >= '2016-12-21' and date <= '2016-12-30' then sales else 0
end) as sales_a,
sum(case when date >= '2016-12-11' and date <= '2016-12-20' then sales else 0
end) as sales_b,
sum(case when date >= '2016-12-21' and date <= '2016-12-30'
then sales else 0
when date >= '2016-12-11' and date <= '2016-12-20'
then -sales
else 0
end) as sales_diff
from salestable;
If you want the overall sum by id (as suggested by your inclusion of id), then add id to the select and add group by id.
You can use case to do a conditional sum like this:
select id,
sum_21_to_30,
sum_11_to_20,
sum_21_to_30 - sum_11_to_20 diff
from (select id,
sum(case when date >= '2016-12-21' and date <= '2016-12-30' then sales else 0 end) sum_21_to_30,
sum(case when date >= '2016-12-11' and date <= '2016-12-20' then sales else 0 end) sum_11_to_20
from table group by id) t;

How to make aliases sql queries with certain criteria

I Have wrote sql query something like this :
SELECT `petugas_input`,
COUNT(`petugas_input`) AS `01-MAR`,
COUNT(`petugas_input`) AS `02-MAR`,
COUNT(`petugas_input`) AS `03-MAR`
FROM `tabel_arsip`
WHERE `tgl_input_arsip`>='2016-03-01 00:00:00' AND `tgl_input_arsip`<='2016-03-01 23:59:59'
GROUP BY `petugas_input`
and its generate result like this
My question is how to add criteria to the aliases column so that it will show different value on different date. (not the same value in the date column as above)
You'd have to rely on a little complex grouping:
SELECT
`petugas_input`,
SUM(CASE WHEN DATE(tgl_input_arsip) = '2016-03-01' THEN 1 ELSE 0 END) AS `01-MAR`,
SUM(CASE WHEN DATE(tgl_input_arsip) = '2016-03-02' THEN 1 ELSE 0 END) AS `02-MAR`,
SUM(CASE WHEN DATE(tgl_input_arsip) = '2016-03-03' THEN 1 ELSE 0 END) AS `03-MAR`,
FROM `tabel_arsip`
WHERE `tgl_input_arsip`>='2016-03-01 00:00:00' AND `tgl_input_arsip`<='2016-03-01 23:59:59'
GROUP BY `petugas_input`
You should not think for these hard-coded column aliases rather make a query for each petugas_input and for each date (within the given date range) along with the count.
Something like this:
SELECT
`petugas_input`,
DATE(`tgl_input_arsip`) `date`,
COUNT(*) total
FROM `tabel_arsip`
WHERE `tgl_input_arsip`>='2016-03-01 00:00:00' AND `tgl_input_arsip`<='2016-03-01 23:59:59'
GROUP BY `petugas_input`,`date`;
And you will get the following output structure:
petugas_input date total
A yyyy-mm-dd n1
B yyyy-mm-dd n2
Try this one:
SELECT `petugas_input`,
COUNT(CASE WHEN DATE(tgl_input_arsip) = '2016-03-01' THEN petugas_input ELSE 0 END) AS `01-MAR`,
COUNT(CASE WHEN DATE(tgl_input_arsip) = '2016-03-02' THEN petugas_input ELSE 0 END) AS `02-MAR`,
COUNT(CASE WHEN DATE(tgl_input_arsip) = '2016-03-03' THEN petugas_input ELSE 0 END) AS `03-MAR`
FROM `tabel_arsip`
WHERE `tgl_input_arsip`>='2016-03-01 00:00:00' AND `tgl_input_arsip`<='2016-03-03 23:59:59'
GROUP BY `petugas_input`;
:)

SQL Query to find rows that didn't occur this month

I am trying to find the number of sellers that made a sale last month but didn't make a sale this month.
I have a query that works but I don't think its efficient and I haven't figured out how to do this for all months.
SELECT count(distinct user_id) as users
FROM transactions
WHERE MONTH(date) = 12
AND YEAR(date) = 2015
AND transactions.status = 'COMPLETED'
AND transactions.amount > 0
AND transactions.user_id NOT IN
(
SELECT distinct user_id
FROM transactions
WHERE MONTH(date) = 1
AND YEAR(date) = 2016
AND transactions.status = 'COMPLETED'
AND transactions.amount > 0
)
The structure of the table is:
+---------+------------+-------------+--------+
| user_id | date | status | amount |
+---------+------------+-------------+--------+
| 1 | 2016-01-01 | 'COMPLETED' | 1.00 |
| 2 | 2015-12-01 | 'COMPLETED' | 1.00 |
| 3 | 2015-12-01 | 'COMPLETED' | 2.00 |
| 1 | 2015-12-01 | 'COMPLETED' | 3.00 |
+---------+------------+-------------+--------+
So in this case, users with ID 2 and 3, didn't make a sale this month.
Use conditional aggregation:
SELECT count(*) as users
FROM
(
SELECT user_id
FROM transactions
-- 1st of previous month
WHERE date BETWEEN SUBDATE(SUBDATE(CURRENT_DATE, DAYOFMONTH(CURRENT_DATE)-1), interval 1 month)
-- end of current month
AND LAST_DAY(CURRENT_DATE)
AND transactions.status = 'COMPLETED'
AND transactions.amount > 0
GROUP BY user_id
-- any row from previous month
HAVING MAX(CASE WHEN date < SUBDATE(CURRENT_DATE, DAYOFMONTH(CURRENT_DATE)-1)
THEN date
END) IS NOT NULL
-- no row in current month
AND MAX(CASE WHEN date >= SUBDATE(CURRENT_DATE, DAYOFMONTH(CURRENT_DATE)-1)
THEN date
END) IS NULL
) AS dt
SUBDATE(CURRENT_DATE, DAYOFMONTH(CURRENT_DATE)-1) = first day of current month
SUBDATE(first day of current month, interval 1 month) = first day of previous month
LAST_DAY(CURRENT_DATE) = end of current month
if you want to generify it, you can use curdate() to get current month, and DATE_SUB(curdate(), INTERVAL 1 MONTH) to get last month (you will need to do some if clause for January/December though):
SELECT count(distinct user_id) as users
FROM transactions
WHERE MONTH(date) = MONTH(DATE_SUB(curdate(), INTERVAL 1 MONTH))
AND transactions.status = 'COMPLETED'
AND transactions.amount > 0
AND transactions.user_id NOT IN
(
SELECT distinct user_id
FROM transactions
WHERE MONTH(date) = MONTH(curdate())
AND transactions.status = 'COMPLETED'
AND transactions.amount > 0
)
as far as efficiency goes, I don't see a problem with this one
The following should be pretty efficient. In order to make it even more so, you would need to provide the table definition and and the EXPLAIN.
SELECT COUNT(DISTINCT user_id) users
FROM transactions t
LEFT
JOIN transactions x
ON x.user_id = t.user_id
AND x.date BETWEEN '2016-01-01' AND '2016-01-31'
AND x.status = 'COMPLETED'
AND x.amount > 0
WHERE t.date BETWEEN '2015-12-01' AND '2015-12-31'
AND t.status = 'COMPLETED'
AND t.amount > 0
AND x.user_id IS NULL;
Just some input for thought:
You could create aggregated lists of user-IDs per month, representing all the unique buyers in that month. In your application, you would then simply have to subtract the two months in question in order to get all user-IDs that have only made a sale in one of the two months.
See below for query- and post-processing-examples.
In order to make your query efficient, I would recommend at least a 2-column index for table transactions on [status, amount]. However, in order to prevent the query from having to look up data in the actual table, you could even create a 4-column index [status, amount, date, user_id], which should further improve the performance of your query.
Postgres (v9.0+, tested)
SELECT (DATE_PART('year', t.date) || '-' || DATE_PART('month', t.date)) AS d,
STRING_AGG( DISTINCT t.user_id::TEXT, ',' ) AS buyers
FROM transactions t
WHERE t.status = 'COMPLETED'
AND t.amount > 0
GROUP BY DATE_PART('year', t.date),
DATE_PART('month', t.date)
ORDER BY DATE_PART('year', t.date),
DATE_PART('month', t.date)
;
MySQL (not tested)
SELECT (YEAR(t.date) || '-' || MONTH(t.date)) AS d,
GROUP_CONCAT( DISTINCT t.user_id ) AS buyers
FROM transactions t
WHERE t.status = 'COMPLETED'
AND t.amount > 0
GROUP BY YEAR(t.date), MONTH(t.date)
ORDER BY YEAR(t.date), MONTH(t.date)
;
Ruby (example for post-processing)
db_result = ActiveRecord::Base.connection_pool.with_connection { |con| con.execute( db_query ) }
unique_buyers = db_result.map{|e|[e['d'],e['buyers'].split(',')]}.to_h
buyers_dec15_but_not_jan16 = unique_buyers['2015-12'] - unique_buyers['2016-1']
buyers_nov15_but_not_dec16 = unique_buyers['2015-11']||[] - unique_buyers['2015-12']
...(and so on)...