I have several datetime columns. In SQL Server 2008, I need to calculate, for each timestamp, how many timestamps in the same column are smaller than it.
For example: for 2016-05-01 14:24:000.00 in column DateTime1, I need to calculate how many datetime values in the DateTime1 column are smaller than it.
I also need to know, for each timestamp, how many timestamps in the same record (the same row) in columns DateTime2 and DateTime3 are smaller than it.
DateTime1 DateTime2 DateTime3
----------------------------------------------------------------------------
2016-05-01 13:24:000.00 2016-05-01 15:24:000.00 2016-05-01 16:20:000.00
2016-05-01 13:30:000.00 2016-05-01 14:21:000.00 2016-05-01 15:10:000.00
2016-05-01 14:24:000.00 2016-05-01 17:21:000.00 2016-05-01 18:10:000.00
If I understand correctly, you can use rank():
select t.*,
rank() over (order by datetime1) as dt1_rank,
rank() over (order by datetime2) as dt2_rank,
rank() over (order by datetime3) as dt3_rank
from t ;
Depending on how you want to treat tied values, you might actually want dense_rank(). Also, you might want to subtract 1 from the ranking value.
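To make the tie handling concrete, here is a minimal Python sketch of what RANK() minus 1 computes: the number of strictly smaller values in the same column. The function name count_smaller and the shortened timestamps are only illustrative, not part of the question.

```python
from bisect import bisect_left

def count_smaller(values):
    """For each value, count how many values in the same list are strictly smaller."""
    ordered = sorted(values)
    # bisect_left gives the index of the first element >= v,
    # which equals the number of elements strictly smaller than v.
    return [bisect_left(ordered, v) for v in values]

# DateTime1 column from the question (ISO-style strings sort chronologically).
datetime1 = ["2016-05-01 13:24", "2016-05-01 13:30", "2016-05-01 14:24"]
counts = count_smaller(datetime1)
print(counts)

# With ties, both 20s have exactly one smaller value, like RANK() - 1.
tied = count_smaller([10, 20, 20, 30])
print(tied)
```

With DENSE_RANK() minus 1 the last value in the tied example would get 2 instead of 3, because it counts distinct smaller values rather than smaller rows.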
Assume a table named [TestTB] with three columns: DateTime1, DateTime2 and DateTime3.
I define CountSmallerDateTime1 as "how many datetime values in the DateTime1 column are smaller than it".
I define CountSmallerDateTime2 as "how many timestamps in the same record (the same row) are smaller than the one in column DateTime2", and similarly CountSmallerDateTime3 for DateTime3.
With those definitions, this query does what you ask:
SELECT [DateTime1]
,[DateTime2]
,[DateTime3]
,(SELECT COUNT(1)
FROM [TestTB] Sub
WHERE TB.[DateTime1] >Sub.[DateTime1]) AS CountSmallerDateTime1
,(
CASE WHEN TB.[DateTime2] > TB.[DateTime1] AND TB.[DateTime2] > TB.[DateTime3] THEN
2
WHEN ( (TB.[DateTime2] <= TB.[DateTime1] AND TB.[DateTime2] > TB.[DateTime3])
OR (TB.[DateTime2] > TB.[DateTime1] AND TB.[DateTime2] <= TB.[DateTime3])) THEN
1
ELSE
0
END
) AS CountSmallerDateTime2,
(
CASE WHEN TB.[DateTime3] > TB.[DateTime1] AND TB.[DateTime3] > TB.[DateTime2] THEN
2
WHEN ( (TB.[DateTime3] <= TB.[DateTime1] AND TB.[DateTime3] > TB.[DateTime2])
OR (TB.[DateTime3] > TB.[DateTime1] AND TB.[DateTime3] <= TB.[DateTime2])) THEN
1
ELSE
0
END
) AS CountSmallerDateTime3 FROM [TestTB] TB
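The CASE expressions above count, for one column, how many of the other two columns in the same row hold a smaller value. A rough Python sketch of that same-row count, assuming a row is represented as a dict (the helper name count_smaller_in_row is hypothetical):

```python
def count_smaller_in_row(row, column):
    """Count the other columns in this row whose value is strictly smaller."""
    target = row[column]
    return sum(1 for name, value in row.items()
               if name != column and value < target)

# Second sample row from the question, as ISO-style strings.
row = {
    "DateTime1": "2016-05-01 13:30",
    "DateTime2": "2016-05-01 14:21",
    "DateTime3": "2016-05-01 15:10",
}
same_row = {col: count_smaller_in_row(row, col) for col in row}
print(same_row)
```

This generalises to any number of columns, whereas the CASE form has to be rewritten by hand when a column is added.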
;WITH CTE(DATE1, DATE2, DATE3,RN)
AS
(
SELECT CONVERT(DATETIME , '2016-05-01 13:24:000.00'), CONVERT(DATETIME,'2016-05-01 15:24:000.00'), CONVERT(DATETIME,'2016-05-01 16:20:000.00'),1
UNION ALL
SELECT CONVERT(DATETIME , '2016-05-01 13:30:000.00'), CONVERT(DATETIME,'2016-05-01 14:21:000.00'), CONVERT(DATETIME,'2016-05-01 15:10:000.00'),2
UNION ALL
SELECT CONVERT(DATETIME , '2016-05-01 14:24:000.00'), CONVERT(DATETIME,'2016-05-01 17:21:000.00'), CONVERT(DATETIME,'2016-05-01 18:10:000.00'),3
)
SELECT RANK() OVER (ORDER BY DATE1) -1 AS SAME_COLUMN_DATE1
, RANK() OVER (ORDER BY DATE2) -1 AS SAME_COLUMN_DATE2
, RANK() OVER (ORDER BY DATE3) -1 AS SAME_COLUMN_DATE3
, CASE WHEN RN=1 AND DATE1< DATE2 AND DATE1<DATE3 THEN 0
WHEN RN=1 AND DATE1< DATE2 AND DATE1>DATE3 THEN 1
WHEN RN=1 AND DATE1> DATE2 AND DATE1<DATE3 THEN 1
ELSE 2
END SAME_ROW_1
, CASE WHEN RN=2 AND DATE2< DATE1 AND DATE2<DATE3 THEN 0
WHEN RN=2 AND DATE2< DATE1 AND DATE2>DATE3 THEN 1
WHEN RN=2 AND DATE2> DATE1 AND DATE2<DATE3 THEN 1
ELSE 2
END SAME_ROW_2
, CASE WHEN RN=3 AND DATE3< DATE1 AND DATE3<DATE2 THEN 0
WHEN RN=3 AND DATE3< DATE1 AND DATE3>DATE2 THEN 1
WHEN RN=3 AND DATE3> DATE1 AND DATE3<DATE2 THEN 1
ELSE 2
END SAME_ROW_3
FROM CTE ORDER BY RN
How do you rewrite this code correctly in Snowflake?
select account_code, date,
sum(box_revenue_recognition_amount) as box_revenue_recognition_amount
, sum(case when box_flg = 1 then box_sku_quantity end) as box_sku_quantity
, sum(box_revenue_recognition_refund_amount) as box_revenue_recognition_refund_amount
, sum(box_discount_amount) as box_discount_amount
, sum(box_shipping_amount) as box_shipping_amount
, sum(box_cogs) as box_cogs
, max(invoice_number) as invoice_number
, max(order_number) as order_number
, min(box_refund_date) as box_refund_date
, first (case when order_season_rank = 1 then box_type end) as box_type
, first (case when order_season_rank = 1 then box_order_season end) as box_order_season
, first (case when order_season_rank = 1 then box_product_name end) as box_product_name
, first (case when order_season_rank = 1 then box_coupon_code end) as box_coupon_code
, first (case when order_season_rank = 1 then revenue_recognition_reason end) as revenue_recognition_reason
from dedupe_sub_user_day
group by account_code, date
I have tried to apply the window rule as explained in the first_value Snowflake documentation, to no avail: I get the SQL compilation error: ... is not a valid group by expression
select account_code, date,
first_value(case when order_season_rank = 1 then box_type end) over (order by box_type ) as box_type,
first_value(case when order_season_rank = 1 then box_order_season end) over (order by box_order_season ) as box_order_season,
first_value(case when order_season_rank = 1 then box_product_name end) over (order by box_product_name ) as box_product_name,
first_value(case when order_season_rank = 1 then box_coupon_code end) over (order by box_coupon_code ) as box_coupon_code,
first_value(case when order_season_rank = 1 then revenue_recognition_reason end) over (order by revenue_recognition_reason ) as revenue_recognition_reason
, sum(box_revenue_recognition_amount) as box_revenue_recognition_amount
, sum(case when box_flg = 1 then box_sku_quantity end) as box_sku_quantity
, sum(box_revenue_recognition_refund_amount) as box_revenue_recognition_refund_amount
, sum(box_discount_amount) as box_discount_amount
, sum(box_shipping_amount) as box_shipping_amount
, sum(box_cogs) as box_cogs
, max(invoice_number) as invoice_number
, max(order_number) as order_number
, min(box_refund_date) as box_refund_date
from dedupe_sub_user_day
group by 1,2
FIRST_VALUE is not an aggregate function but a window function, so you get an error when you use it together with GROUP BY. If you want to use it with a GROUP BY, put an ANY_VALUE around it.
Here is some data, in a CTE, that I will use below:
with data(id, seq, val) as (
select * from values
(1, 1, 10),
(1, 2, 11),
(1, 3, 12),
(1, 4, 13),
(2, 1, 20),
(2, 2, 21),
(2, 3, 22)
)
So to show FIRST_VALUE is a window function we can just use it
select *
,first_value(val)over(partition by id order by seq) as first_val
from data
ID |SEQ |VAL |FIRST_VAL
1 |1 |10 |10
1 |2 |11 |10
1 |3 |12 |10
1 |4 |13 |10
2 |1 |20 |20
2 |2 |21 |20
2 |3 |22 |20
So if we GROUP BY id, we have to wrap the FIRST_VALUE result in an aggregate function to avoid an error. Since the values within a group are all equal, ANY_VALUE is a good pick, and it seems it needs to be in another layer of SQL:
select id
,count(*) as count
,any_value(first_val) as first_val
from (
select *
,first_value(val)over(partition by id order by seq) as first_val
from data
)
group by 1
order by 1;
ID |COUNT |FIRST_VAL
1 |4 |10
2 |3 |20
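The two layers (first compute the first value per partition, then collapse each group to one row) can be sketched outside SQL as well. This is only an illustration in Python using the same sample data; it is not how Snowflake evaluates the query.

```python
from itertools import groupby
from operator import itemgetter

# (id, seq, val) rows, same as the CTE above.
data = [
    (1, 1, 10), (1, 2, 11), (1, 3, 12), (1, 4, 13),
    (2, 1, 20), (2, 2, 21), (2, 3, 22),
]

summary = []
# Sorting by (id, seq, val) puts the lowest seq first inside each id group.
for key, rows in groupby(sorted(data), key=itemgetter(0)):
    rows = list(rows)
    first_val = rows[0][2]          # plays the role of FIRST_VALUE(val)
    summary.append((key, len(rows), first_val))

print(summary)
```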
MAX can also be useful when combined with ROW_NUMBER() to pick the best value:
select id
,count(*) as count
,max(first_val) as first_val
from (
select *
,row_number() over (partition by id order by seq) as rn
,iff(rn=1, val, null) as first_val
from data
)
group by 1
order by 1;
This is almost more complex than the ANY_VALUE solution, though I suspect it would perform better. If the two are in the same order of magnitude for performance, I would always choose whichever is more readable to you and your team over a small performance gain.
With the way you've written your case statement, it leads me to believe that there is only one row with order_season_rank = 1 when grouping by account_code and date.
If that is true, then you can use several of Snowflake's aggregate functions and you will get what you want. Rather than trying to get the first value, you could use min, max, any_value, mode (or really any aggregate function that will ignore nulls) to return the only non-null value in the aggregation.
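As a small check of that idea, the sketch below runs MAX over a CASE expression in SQLite through Python's sqlite3 module. The rows are made up for illustration, the table is cut down to a few of the question's columns, and SQLite stands in for Snowflake here, so treat it as a sketch of the pattern rather than a Snowflake test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE dedupe_sub_user_day (
        account_code TEXT,
        date TEXT,
        order_season_rank INTEGER,
        box_type TEXT,
        box_revenue_recognition_amount REAL
    )
""")
# Hypothetical rows: exactly one row per group has order_season_rank = 1.
conn.executemany(
    "INSERT INTO dedupe_sub_user_day VALUES (?, ?, ?, ?, ?)",
    [
        ("A1", "2021-01-01", 1, "standard", 30.0),
        ("A1", "2021-01-01", 2, "premium", 20.0),
        ("B2", "2021-01-01", 1, "premium", 15.0),
    ],
)

# The CASE yields NULL for every other row, and MAX ignores NULLs,
# so it returns the single non-null value in each group.
picked = conn.execute("""
    SELECT account_code, date,
           SUM(box_revenue_recognition_amount) AS box_revenue_recognition_amount,
           MAX(CASE WHEN order_season_rank = 1 THEN box_type END) AS box_type
    FROM dedupe_sub_user_day
    GROUP BY account_code, date
    ORDER BY account_code
""").fetchall()
print(picked)
```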
first(): this link suggests FIRST is only supported by MS Access, but you've tagged the question with MySQL and Snowflake. Could you confirm which DBMS you are using?
Moving the first_value() call outside the aggregation seems to work fine.
I have this query:
SELECT
vcl.id,
vcl.batch_id,
vcl.type,
vcl.amount,
vcl.date
FROM vrcorporateledger vcl
LEFT JOIN payroll_list pl ON pl.id = vcl.batch_id
which gives the following output:
Whenever the type column contains "CREDIT", I want to increase the running balance by the value in the amount column; whenever it contains "DEBIT", I want to decrease the accumulated balance by that value, after grouping by the batch_id column. So the expected result is:
1000-2+5-4-49=950.
If possible, I also want to create a "balance" column that shows the resulting balance at each step.
expected output like:
WITH cte AS (
SELECT type,
SUM(amount) OVER (PARTITION BY CASE type WHEN 'CREDIT' THEN RAND()
WHEN 'DEBIT' THEN batchID
ELSE 0 END ) amount,
MIN(`date`) OVER (PARTITION BY CASE type WHEN 'CREDIT' THEN RAND()
WHEN 'DEBIT' THEN batchID
ELSE 0 END ) `date`,
SUM(CASE type WHEN 'CREDIT' THEN amount
WHEN 'DEBIT' THEN -amount
ELSE 0 END) OVER (ORDER BY `date`) balance,
batchID,
LEAD(batchID) OVER (ORDER BY `date`) next_batchID
FROM source_data
)
SELECT type,
amount,
balance,
`date`
FROM cte
WHERE CASE WHEN batchID = next_batchID THEN 0 ELSE 1 END
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=75255728f6d64a91a2ebf62edc2d0a0b
I think you're looking for SQL Window functions. They basically allow you to do an aggregate "over a partition".
On a side note: this is a really bad way of calculating a running balance.
I would strongly suggest storing the balance in a separate column at runtime. This should allow you to:
have a strict check even when rows are changed or deleted
keep normal speed when you have millions of records
If your MySQL version is 8 or above, you can use a common table expression with window functions, as below:
Schema (MySQL v8.0)
create table vrcorporateledger (id int,batch_id int,type varchar(10),amount float,Tdate timestamp);
insert into vrcorporateledger values (1,null,'CREDIT',1000,'2021/03/04 06:19:00');
insert into vrcorporateledger values (2,1,'DEBIT',1,'2021/03/04 07:00:19');
insert into vrcorporateledger values (3,1,'DEBIT',1,'2021/03/04 07:00:25');
insert into vrcorporateledger values (4,null,'CREDIT',5,'2021/03/05 06:19:00');
insert into vrcorporateledger values (5,2,'DEBIT',1,'2021/03/04 08:58:10');
insert into vrcorporateledger values (6,2,'DEBIT',3,'2021/03/04 08:58:16');
insert into vrcorporateledger values (7,null,'DEBIT',49,'2021/03/04 16:42:33');
Query #1
WITH cte AS (
SELECT id,type,
(case when batch_id is null then (case when type='DEBIT' then -amount else amount end) else
SUM(case when type='DEBIT' then -amount else amount end) OVER (PARTITION BY batch_id)end) amount,
(case when batch_id is null then Tdate else
MIN(Tdate) OVER (PARTITION BY batch_id ) end) Trandate,
batch_id,
LEAD(batch_id) OVER (ORDER BY id) next_batch
FROM vrcorporateledger
)
SELECT type,
amount,
sum(amount)over(order by id) running_balance,
Trandate date
FROM cte
WHERE batch_id is null or batch_id =next_batch
order by id;
type |amount |date |running_balance
CREDIT |1000 |2021-03-04 06:19:00 |1000
DEBIT |-2 |2021-03-04 07:00:19 |998
CREDIT |5 |2021-03-05 06:19:00 |1003
DEBIT |-4 |2021-03-04 08:58:10 |999
DEBIT |-49 |2021-03-04 16:42:33 |950
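The same arithmetic can be traced step by step in a short Python sketch: collapse each batch into one signed amount, keep unbatched rows as they are, then accumulate. The tuple layout and the helper name signed are assumptions made for the illustration; the rows are the ones inserted in the schema above.

```python
from itertools import accumulate

# (id, batch_id, type, amount), in id order as in the schema above.
ledger = [
    (1, None, "CREDIT", 1000),
    (2, 1, "DEBIT", 1),
    (3, 1, "DEBIT", 1),
    (4, None, "CREDIT", 5),
    (5, 2, "DEBIT", 1),
    (6, 2, "DEBIT", 3),
    (7, None, "DEBIT", 49),
]

def signed(kind, amount):
    """Credits add to the balance, debits subtract from it."""
    return amount if kind == "CREDIT" else -amount

# Rows sharing a batch_id are summed; rows without one stay separate.
# A dict keeps insertion order, so groups appear in order of first id.
totals = {}
for row_id, batch_id, kind, amount in ledger:
    key = ("batch", batch_id) if batch_id is not None else ("row", row_id)
    totals[key] = totals.get(key, 0) + signed(kind, amount)

amounts = list(totals.values())
balances = list(accumulate(amounts))
print(amounts)
print(balances)
```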
Consider the following schema:
CREATE TABLE `Result` (
`startDate` date NOT NULL,
`description` varchar(45) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL,
`value` decimal(15,4) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
INSERT INTO `Result`
(`startDate`,
`description`,
`value`)
VALUES
('2020-09-01' ,'Allowance' ,4000),
('2020-09-01' ,'Salary' ,1500),
('2020-10-01' ,'Allowance' ,2000),
('2020-10-01' ,'Salary' ,3000),
('2020-10-01' ,'Deduction' ,-200);
Given a date, the result should show the total per description for that start date and for the previous date (month), plus the difference between the two. So if October is the selected month, the query should return:
description SeptemberTotal OctoberTotal Variance
Allowance 4000 2000 -2000
Salary 1500 3000 1500
Deduction 0 -200 -200
My attempt using a union & a pivot,
SELECT @selectDate:='2020-10-01'; -- set desired date
SELECT
t.month,
t.description,
Gross
from (
SELECT
DATE_FORMAT(pi.startDate, '%b/%y') AS 'Month',
SUM(pi.value) AS gross,
description
FROM
Result pi
WHERE
pi.startDate = DATE_SUB(@selectDate, INTERVAL 1 MONTH) -- select previous month
GROUP BY description
UNION SELECT
DATE_FORMAT(pi.startDate, '%b/%y') AS 'Month',
SUM(pi.value) AS gross,
description
FROM
Result pi
WHERE
pi.startDate = @selectDate
GROUP BY description) t
GROUP BY t.Month,t.description
;
which gives the result as,
Month description Gross
Sep/20 Allowance 4000
Sep/20 Salary 1500
Oct/20 Allowance 2000
Oct/20 Salary 3000
Oct/20 Deduction -200
which is not exactly what the requirement is. I have tried a pivot query as well, that too is not showing the output as required.
SET @m1 := '2020-09-01';
SET @m2 := '2020-10-01';
SELECT Result.Description,
COALESCE(SUM(CASE WHEN Result.startDate = @m1 THEN value END), 0) Total1,
COALESCE(SUM(CASE WHEN Result.startDate = @m2 THEN value END), 0) Total2,
COALESCE(SUM(CASE WHEN Result.startDate = @m2 THEN value END), 0) -
COALESCE(SUM(CASE WHEN Result.startDate = @m1 THEN value END), 0) Variance
FROM ( SELECT @m1 startDate UNION ALL SELECT @m2 ) baseDates
LEFT JOIN Result USING (startDate)
GROUP BY Result.Description
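To see the conditional-aggregation pattern produce the expected table, here is a small sketch that loads the question's sample rows into SQLite via Python's sqlite3 module. SQLite is used only because it is easy to run; the named parameters :prev and :cur are stand-ins for the MySQL user variables, and the WHERE filter replaces the baseDates join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Result (startDate TEXT, description TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO Result VALUES (?, ?, ?)",
    [
        ("2020-09-01", "Allowance", 4000),
        ("2020-09-01", "Salary", 1500),
        ("2020-10-01", "Allowance", 2000),
        ("2020-10-01", "Salary", 3000),
        ("2020-10-01", "Deduction", -200),
    ],
)

# One SUM(CASE ...) per month; COALESCE turns "no rows that month" into 0.
pivot = conn.execute(
    """
    SELECT description,
           COALESCE(SUM(CASE WHEN startDate = :prev THEN value END), 0) AS prev_total,
           COALESCE(SUM(CASE WHEN startDate = :cur THEN value END), 0) AS cur_total,
           COALESCE(SUM(CASE WHEN startDate = :cur THEN value END), 0)
             - COALESCE(SUM(CASE WHEN startDate = :prev THEN value END), 0) AS variance
    FROM Result
    WHERE startDate IN (:prev, :cur)
    GROUP BY description
    ORDER BY description
    """,
    {"prev": "2020-09-01", "cur": "2020-10-01"},
).fetchall()
print(pivot)
```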
select
description,
sum(if(startDate between '2020-11-01 00:00:00' and '2020-11-30 23:59:59' ,value,0)) 1st,
sum(if(startDate between '2020-12-01 00:00:00' and '2020-12-31 23:59:59',value,0)) 2nd,
-- variance = later month minus earlier month, as in the expected output
sum(if(startDate between '2020-12-01 00:00:00' and '2020-12-31 23:59:59' ,value,0))-sum(if(startDate between '2020-11-01 00:00:00' and '2020-11-30 23:59:59',value,0)) Variance
from
Result
where startDate between '2020-11-01 00:00:00' and '2020-12-31 23:59:59' group by description
I need to compute running totals between two dates in my SQL Server table and update the records at the same time. My data is shown below, ordered by date, voucher_no:
DATE        VOUCHER_NO  OPEN_BAL  DEBITS  CREDITS  CLOS_BAL
-------------------------------------------------------------------
10/10/2017  1           100       10               110
12/10/2017  2           110               5        105
13/10/2017  3           105       20               125
Now, if I insert a record with voucher_no 4 on 12/10/2017, the output should look like this:
DATE        VOUCHER_NO  OPEN_BAL  DEBITS  CREDITS  CLOS_BAL
------------------------------------------------------------------
10/10/2017  1           100       10               110
12/10/2017  2           110               5        105
12/10/2017  4           105       4                109
13/10/2017  3           109       20               129
I have seen several examples that compute running totals up to a certain date, but none that do so between two dates, or from a particular date to the end of the file.
You should consider changing your database structure. I think it would be better to keep only DATE, VOUCHER_NO, DEBITS and CREDITS in the table, and create a view to calculate the balances. That way you do not have to update the table after each insert. The table would then look like this:
create table myTable (
DATE date
, VOUCHER_NO int
, DEBITS int
, CREDITS int
)
insert into myTable values
('20171010', 1, 10, null),( '20171012', 2, null, 5)
, ('20171013', 3, 20, null), ('20171012', 4, 4, null)
And the view will be:
;with cte as (
select
DATE, VOUCHER_NO, DEBITS, CREDITS, bal = isnull(DEBITS, CREDITS) * case when DEBITS is null then -1 else 1 end
, rn = row_number() over (order by DATE, VOUCHER_NO)
from
myTable
)
select
a.DATE, a.VOUCHER_NO, a.DEBITS, a.CREDITS
, OPEN_BAL = sum(b.bal + case when b.rn = 1 then 100 else 0 end) - a.bal
, CLOS_BAL = sum(b.bal + case when b.rn = 1 then 100 else 0 end)
from
cte a
join cte b on a.rn >= b.rn
group by a.DATE, a.VOUCHER_NO, a.rn, a.bal, a.DEBITS, a.CREDITS
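The idea behind the view (store only debits and credits, derive the balances in date and voucher order) can be sketched in a few lines of Python. The opening balance of 100 is the same assumption the query makes; the function name with_balances is hypothetical.

```python
# (date, voucher_no, debits, credits), same rows as the insert above.
vouchers = [
    ("2017-10-10", 1, 10, None),
    ("2017-10-12", 2, None, 5),
    ("2017-10-13", 3, 20, None),
    ("2017-10-12", 4, 4, None),
]

def with_balances(rows, opening=100):
    """Return (date, voucher_no, open_bal, clos_bal) in date, voucher order."""
    result = []
    balance = opening
    for date, voucher_no, debit, credit in sorted(rows, key=lambda r: (r[0], r[1])):
        open_bal = balance
        # Following the sample data, debits raise the balance and credits lower it.
        balance += (debit or 0) - (credit or 0)
        result.append((date, voucher_no, open_bal, balance))
    return result

statement = with_balances(vouchers)
for line in statement:
    print(line)
```

Because nothing derived is stored, inserting voucher 4 on an earlier date needs no update: the next read simply recomputes every later balance.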
Here's another solution if you cannot change your database structure. In this case you must run the update statement after every insert. In both solutions I assume that the initial balance is 100 when recalculating.
create table myTable (
DATE date
, VOUCHER_NO int
, OPEN_BAL int
, DEBITS int
, CREDITS int
, CLOS_BAL int
)
insert into myTable values
('20171010', 1, 100, 10, null, 110)
,( '20171012', 2, 110, null, 5, 105)
, ('20171013', 3, 105, 20, null, 125)
, ('20171012', 4, null, 4, null, null)
;with cte as (
select
DATE, VOUCHER_NO, DEBITS, CREDITS, bal = isnull(DEBITS, CREDITS) * case when DEBITS is null then -1 else 1 end
, rn = row_number() over (order by DATE, VOUCHER_NO)
from
myTable
)
, cte2 as (
select
a.DATE, a.VOUCHER_NO
, OPEN_BAL = sum(b.bal + case when b.rn = 1 then 100 else 0 end) - a.bal
, CLOS_BAL = sum(b.bal + case when b.rn = 1 then 100 else 0 end)
from
cte a
join cte b on a.rn >= b.rn
group by a.DATE, a.VOUCHER_NO, a.rn, a.bal
)
update a
set a.OPEN_BAL = b.OPEN_BAL, a.CLOS_BAL = b.CLOS_BAL
from
myTable a
join cte2 b on a.DATE = b.DATE and a.VOUCHER_NO = b.VOUCHER_NO
I have a table like this. Now I want to show the totals for the different statuses on the same date in a single row.
What should be the query?
Expected Result
Created | Total1 | Total2 | Total3
2017-02-28 | 1 | 1 | 2
You could use SUM with CASE WHEN for each status, and GROUP BY:
select
created
, sum( case when story_status ='Draft' then total else 0 end ) as Draft_count
, sum( case when story_status ='Private' then total else 0 end ) as Private_count
, sum( case when story_status ='Published' then total else 0 end ) as Published_count
from my_table
group by created
This will give you one row per date created, with columns for each story_status:
SELECT
`created`,
SUM(if(`story_status` = 'Draft',`total`,0)) as `Total Draft`,
SUM(if(`story_status` = 'Private',`total`,0)) as `Total Private`,
SUM(if(`story_status` = 'Published',`total`,0)) as `Total Published`
FROM `my_table`
GROUP BY `created`
ORDER BY `created`
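Both answers use the same conditional-aggregation pattern. A quick way to try it is the sketch below, which runs the CASE version in SQLite through Python's sqlite3 module. The question's source table is not shown, so the three rows here are invented purely to reproduce the expected result row, and my_table is a placeholder name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (created TEXT, story_status TEXT, total INTEGER)")
# Hypothetical rows chosen to match the expected output in the question.
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        ("2017-02-28", "Draft", 1),
        ("2017-02-28", "Private", 1),
        ("2017-02-28", "Published", 2),
    ],
)

# One conditional SUM per status turns rows into columns.
per_day = conn.execute("""
    SELECT created,
           SUM(CASE WHEN story_status = 'Draft' THEN total ELSE 0 END) AS draft_count,
           SUM(CASE WHEN story_status = 'Private' THEN total ELSE 0 END) AS private_count,
           SUM(CASE WHEN story_status = 'Published' THEN total ELSE 0 END) AS published_count
    FROM my_table
    GROUP BY created
    ORDER BY created
""").fetchall()
print(per_day)
```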