I want to assign the rank to 'drug_name' as per the order of 'svcdate' for each 'patient_id' in a dataset. (here, to describe the issue I'm only showing one patient_id in the image)
select patient_id
,svcdate
,drug_name
,dense_rank() over(partition by patient_id order by first_date) as rank
from (
select *
,first_value(svcdate) over (
partition by patient_id, drug_name
order by svcdate) as first_date
from table
)
order by 1,2;
With this query I'm getting the following output,
Although, I want something like this (as shown in image below)
Please help me understand what I'm missing out in the query and how to address this issue.
Thanks!!
using this CTE for the data:
with data(patient_id, svcdate, drug_name) as (
select * from values
(110, '2018-08-09'::date, 'TRANEXAMIC ACID'),
(110, '2020-05-28'::date, 'TAKHZYRO'),
(110, '2020-06-10'::date, 'ICATIBANT'),
(110, '2020-06-24'::date, 'TAKHZYRO'),
(110, '2020-07-22'::date, 'TAKHZYRO'),
(110, '2020-07-24'::date, 'ICATIBANT'),
(110, '2020-08-31'::date, 'ICATIBANT'),
(110, '2020-08-31'::date, 'TAKHZYRO')
)
And using CONDITIONAL_CHANGE_EVENT gives you what you want:
select patient_id
,svcdate
,drug_name
,CONDITIONAL_CHANGE_EVENT( drug_name ) OVER (
PARTITION BY patient_id ORDER BY svcdate )+1 as rank
from data
order by 1,2;
gives:
PATIENT_ID  SVCDATE     DRUG_NAME        RANK
110         2018-08-09  TRANEXAMIC ACID  1
110         2020-05-28  TAKHZYRO         2
110         2020-06-10  ICATIBANT        3
110         2020-06-24  TAKHZYRO         4
110         2020-07-22  TAKHZYRO         4
110         2020-07-24  ICATIBANT        5
110         2020-08-31  ICATIBANT        5
110         2020-08-31  TAKHZYRO         6
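We can use the LAG window function in a subquery to fetch the previous drug_name for each row, then use a conditional running SUM window function to count the changes and build the rank column.
select patient_id
,svcdate
,drug_name
-- running count of rows whose drug differs from the previous row's drug;
-- ordered by svcdate, because first_date does not exist in this subquery
-- (the first row has no previous drug, so the count starts at 0; add 1 if ranks should start at 1)
,SUM(CASE WHEN prev_drug_name <> drug_name THEN 1 ELSE 0 END) over(partition by patient_id order by svcdate) as rank
from (
select *,LAG(drug_name) OVER(partition by patient_id ORDER BY svcdate) prev_drug_name
from table
)
order by 1,2;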
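The LAG-plus-running-sum idea can be tried out locally. The following is a minimal sketch, assuming Python's built-in sqlite3 module (SQLite 3.25 or later for window functions) as a stand-in for the original database; the table name rx and the alias rnk are made up for illustration. It adds drug_name as a tie-breaker to both ORDER BY clauses so the two rows dated 2020-08-31 are ordered deterministically, and adds 1 so the numbering starts at 1.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rx (patient_id INTEGER, svcdate TEXT, drug_name TEXT)")
conn.executemany(
    "INSERT INTO rx VALUES (?, ?, ?)",
    [
        (110, "2018-08-09", "TRANEXAMIC ACID"),
        (110, "2020-05-28", "TAKHZYRO"),
        (110, "2020-06-10", "ICATIBANT"),
        (110, "2020-06-24", "TAKHZYRO"),
        (110, "2020-07-22", "TAKHZYRO"),
        (110, "2020-07-24", "ICATIBANT"),
        (110, "2020-08-31", "ICATIBANT"),
        (110, "2020-08-31", "TAKHZYRO"),
    ],
)

query = """
SELECT patient_id,
       svcdate,
       drug_name,
       -- 1 + number of drug changes seen so far within the patient
       1 + SUM(CASE WHEN prev_drug_name <> drug_name THEN 1 ELSE 0 END)
             OVER (PARTITION BY patient_id ORDER BY svcdate, drug_name) AS rnk
FROM (
    SELECT rx.*,
           -- previous drug for the same patient; NULL on the first row
           LAG(drug_name) OVER (PARTITION BY patient_id
                                ORDER BY svcdate, drug_name) AS prev_drug_name
    FROM rx
)
ORDER BY patient_id, svcdate, drug_name
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On the first row the comparison with NULL is not true, so the CASE falls through to 0 and the rank becomes 1.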
I have a system that stores the data only when they are changed. So, the dataset looks like below.
data_type_id  data_value  inserted_at
2             240         2022-01-19 17:20:52
1             30          2022-01-19 17:20:47
2             239         2022-01-19 17:20:42
1             29          2022-01-19 17:20:42
My data arrives every 5 seconds. So for every 5-second step, whether or not a row with that timestamp exists, I need the result to carry the previous value forward.
Because I store only the values that changed, the complete dataset should look like this:
data_type_id  data_value  inserted_at
2             240         2022-01-19 17:20:52
1             30          2022-01-19 17:20:52
2             239         2022-01-19 17:20:47
1             30          2022-01-19 17:20:47
2             239         2022-01-19 17:20:42
1             29          2022-01-19 17:20:42
I don't want to insert into my table, I just want to retrieve the data like this on the SELECT statement.
Is there any way I can create this query?
PS. I have many data_types hence when the OP makes a query, it usually gets around a million rows.
EDIT:
Server version: 10.3.27-MariaDB-0+deb10u1 (Debian 10)
The user determines the date-time range of the SELECT, so there is no fixed time window.
As @Akina mentioned, there are sometimes gaps between the inserted_at values. The difference might be ~4 seconds or ~6 seconds instead of exactly 5 seconds. Since this does not happen often, it is okay to generate the rows while ignoring it.
With the help of a query that gets you all the combinations of data_type_id and the 5-second moments you need, you can achieve the result you need using a subquery that gets you the closest data_value:
with recursive u as
(select '2022-01-19 17:20:42' as d
union all
select DATE_ADD(d, interval 5 second) from u
where d < '2022-01-19 17:20:52'),
v as
(select * from u cross join (select distinct data_type_id from table_name) t)
select v.data_type_id,
(select data_value from table_name where inserted_at <= d and data_type_id = v.data_type_id
order by inserted_at desc limit 1) as data_value,
d as inserted_at
from v
Fiddle
You can replace the recursive CTE with any query that gets you all the 5-second moments you need.
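The recursive-CTE approach above can be checked on the sample data. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for MariaDB: SQLite's datetime(d, '+5 seconds') replaces DATE_ADD, and the table name readings and CTE name ticks are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for MariaDB (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (data_type_id INTEGER, data_value INTEGER, inserted_at TEXT)"
)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        (2, 240, "2022-01-19 17:20:52"),
        (1, 30, "2022-01-19 17:20:47"),
        (2, 239, "2022-01-19 17:20:42"),
        (1, 29, "2022-01-19 17:20:42"),
    ],
)

query = """
WITH RECURSIVE ticks(d) AS (
    -- one row per 5-second moment in the requested range
    SELECT '2022-01-19 17:20:42'
    UNION ALL
    SELECT datetime(d, '+5 seconds') FROM ticks
    WHERE d < '2022-01-19 17:20:52'
)
SELECT t.data_type_id,
       -- latest stored value at or before this moment (carried forward)
       (SELECT r.data_value
        FROM readings r
        WHERE r.data_type_id = t.data_type_id
          AND r.inserted_at <= ticks.d
        ORDER BY r.inserted_at DESC
        LIMIT 1) AS data_value,
       ticks.d AS inserted_at
FROM ticks
CROSS JOIN (SELECT DISTINCT data_type_id FROM readings) t
ORDER BY ticks.d, t.data_type_id
"""

filled = conn.execute(query).fetchall()
for row in filled:
    print(row)
```

The sketch compares timestamps as 'YYYY-MM-DD HH:MM:SS' strings, which sort chronologically; with around a million rows, an index on (data_type_id, inserted_at) would likely matter for the correlated subquery.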
WITH RECURSIVE
cte1 AS ( SELECT @start_datetime dt
UNION ALL
SELECT dt + INTERVAL 5 SECOND FROM cte1 WHERE dt < @end_datetime),
cte2 AS ( SELECT *,
ROW_NUMBER() OVER (PARTITION BY test.data_type_id, cte1.dt
ORDER BY test.inserted_at DESC) rn
FROM cte1
LEFT JOIN test ON FIND_IN_SET(test.data_type_id, @data_type_ids)
AND cte1.dt >= test.inserted_at )
SELECT *
FROM cte2
WHERE rn = 1
https://dbfiddle.uk/?rdbms=mariadb_10.3&fiddle=380ad334de0c980a0ddf1b49bb6fa38e
version = MySQL 8.0
MRE:
create table test_table(
item_id int,
price decimal,
transaction_time datetime
);
insert into test_table(item_id, price, transaction_time)
Values (1, 5500, "2020-01-01 00:11:11")
, (1, 1000, "2020-01-07 01:11:11")
, (3, 1100, "2020-01-06 18:10:10")
, (3, 7700, "2020-01-03 18:10:10")
, (4, 1900, "2020-01-02 12:00:11");
Using windowing function to get cumulative price for each item_id I run:
select *
, sum(price) over(partition by item_id) as cum_fee
from test_table;
which outputs:
item_id price transaction_time cum_fee
1 5500 2020-01-01 00:11:11 6500
1 1000 2020-01-07 01:11:11 6500
3 1100 2020-01-06 18:10:10 8800
3 7700 2020-01-03 18:10:10 8800
4 1900 2020-01-02 12:00:11 1900
Now I want to get rid of the duplicate item_id rows but keep their cumulative price, "cum_fee"; that is why I added the window function.
My initial attempt was to add GROUP BY item_id at the end:
select *
, sum(price) over(partition by item_id) as cum_fee
from test_table
group by item_id;
This seems to group by item_id first and then run the window function, outputting:
item_id price transaction_time cum_fee
1 5500 2020-01-01 00:11:11 5500
3 1100 2020-01-06 18:10:10 1100
4 1900 2020-01-02 12:00:11 1900
I know people compare GROUP BY with window functions, which probably means we use one or the other but not both? Is that true?
If yes, what is an alternative way to achieve my goal?
You seem to want aggregation. Perhaps this?
select item_id, min(price), min(transaction_time), sum(price)
from test_table
group by item_id;
Window functions do not change the number of rows. That is what group by does.
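The difference in row counts is easy to see side by side. Here is a minimal sketch, assuming Python's built-in sqlite3 module in place of MySQL 8.0 and an INTEGER price column for simplicity; the variable names windowed and grouped are illustrative only.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL 8.0 (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_table (item_id INTEGER, price INTEGER, transaction_time TEXT)"
)
conn.executemany(
    "INSERT INTO test_table VALUES (?, ?, ?)",
    [
        (1, 5500, "2020-01-01 00:11:11"),
        (1, 1000, "2020-01-07 01:11:11"),
        (3, 1100, "2020-01-06 18:10:10"),
        (3, 7700, "2020-01-03 18:10:10"),
        (4, 1900, "2020-01-02 12:00:11"),
    ],
)

# Window function: every input row is kept, each carrying its group total.
windowed = conn.execute(
    """
    SELECT item_id, price, SUM(price) OVER (PARTITION BY item_id) AS cum_fee
    FROM test_table
    ORDER BY item_id, transaction_time
    """
).fetchall()

# GROUP BY: one output row per item_id with the same total.
grouped = conn.execute(
    """
    SELECT item_id, SUM(price) AS cum_fee
    FROM test_table
    GROUP BY item_id
    ORDER BY item_id
    """
).fetchall()

print(len(windowed), "rows from the window query")
print(grouped)
```

If the total per item_id is all that is needed, the grouped query alone is enough; the window version is useful when the detail rows must stay.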
I have the following table structure..
emp_id  base_rate  base_sal  effective_on
1001    26.22      1200      2015-10-12
1001    26.00      1100      2015-11-12
1001    26.00      1100      2015-12-12
1002    18         1200      2015-10-12
1002    19         1100      2015-11-12
I need to get the last updated base_rate, with its effective_on date, for each emp_id.
Like output ..
1001 26.00 1100 2015-11-12
1002 19 1100 2015-11-12
Note that for 1001, 2015-11-12 is selected instead of the latest date, 2015-12-12, because the base_rate is the same and has therefore been in effect since 2015-11-12.
I have tried everything but cannot find the right query.
This method is simple and easy to understand.
1) Assign rank for all the effective dates in descending order by partitioning
for each employee.
2) Select all the required fields for the last updated effective date from the
inner query and display the result.
SELECT emp_id,base_rate,base_sal,effective_on
FROM
(
SELECT *,
ROW_NUMBER() OVER ( PARTITION BY emp_id ORDER BY effective_on DESC ) AS rn
FROM table
) t
WHERE rn = 1;
One method is to generate a subset of employees with their max effective_on date and join it back to the base set.
Below, we generate set "B" with Emp_ID and ME (max effective), then join it back to the full table and use the columns Emp_ID and ME to limit the rows in the base set and return all the columns we care about.
Put in English:
We generate a data set of all employees with only their max effective date, then join it back to the base set so that the base set contains only the records with each employee's most recent effective_on date.
SELECT A.Emp_ID, A.Base_Rate, A.Base_Sal, min(C.Effective_On)
FROM Table A
INNER JOIN (SELECT emp_ID, Max(Effective_on) ME
FROM Table A
GROUP BY Emp_ID) B
on A.Emp_ID = B.Emp_ID
and A.Effective_ON = B.ME
INNER JOIN TABLE C
on C.Emp_ID = A.Emp_ID
and C.Base_Rate= A.Base_rate
and C.base_Sal = A.Base_Sal
GROUP BY A.Emp_ID, A.Base_Rate, A.Base_Sal
This is more or less database agnostic, whereas a row_number-based approach would not work on MySQL versions before 8.0, which do not support window functions.
You can first get the minimum date each base_rate becomes effective on for every employee and then take the max from there. Here is how you can do it using row_number() in oracle:
with temp(emp_id, base_rate, base_sal, effective_on)
as (select 1001, 26.22, 1200, '2015-10-12' from dual union all
select 1001, 26.00, 1100, '2015-11-12' from dual union all
select 1001, 26.00, 1100, '2015-12-12' from dual union all
select 1002, 18, 1200, '2015-10-12' from dual union all
select 1002, 19, 1100, '2015-11-12' from dual
)
SELECT emp_id,base_rate,base_sal,effective_on FROM(
SELECT temp2.*,
row_number() OVER (PARTITION BY EMP_ID ORDER BY effective_on DESC) AS rn2
FROM
(
SELECT temp.*,
row_number() OVER (PARTITION BY EMP_ID, BASE_RATE ORDER BY effective_on) AS rn
FROM temp
) temp2
WHERE rn = 1
)
WHERE rn2 = 1;
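The two-step idea (earliest date per rate, then the latest of those per employee) can be verified on the question's sample rows. This is a minimal sketch, assuming Python's built-in sqlite3 module rather than Oracle; the table name rates and the aliases first_dates and latest are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rates (emp_id INTEGER, base_rate REAL, base_sal INTEGER, effective_on TEXT)"
)
conn.executemany(
    "INSERT INTO rates VALUES (?, ?, ?, ?)",
    [
        (1001, 26.22, 1200, "2015-10-12"),
        (1001, 26.00, 1100, "2015-11-12"),
        (1001, 26.00, 1100, "2015-12-12"),
        (1002, 18, 1200, "2015-10-12"),
        (1002, 19, 1100, "2015-11-12"),
    ],
)

query = """
SELECT emp_id, base_rate, base_sal, effective_on
FROM (
    SELECT first_dates.*,
           -- step 2: newest of the per-rate first dates for each employee
           ROW_NUMBER() OVER (PARTITION BY emp_id
                              ORDER BY effective_on DESC) AS rn2
    FROM (
        SELECT rates.*,
               -- step 1: earliest date on which each rate took effect
               ROW_NUMBER() OVER (PARTITION BY emp_id, base_rate
                                  ORDER BY effective_on) AS rn
        FROM rates
    ) AS first_dates
    WHERE rn = 1
) AS latest
WHERE rn2 = 1
ORDER BY emp_id
"""

current_rates = conn.execute(query).fetchall()
for row in current_rates:
    print(row)
```

One caveat of partitioning by base_rate: if a rate is later changed back to an earlier value, the first date of that earlier period would be picked, so this sketch assumes rates never revert.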
I am trying to get the second-to-last record per merchant using MySQL.
I did some research, but the examples I found rely on a fixed gap between numbers or dates. In my situation the contract_id is not always +1 from the previous one. Any ideas? Thank you so much.
merchant_id contract_id start_date
10 501 2016-05-01
10 506 2016-06-01
13 456 2015-12-01
13 462 2016-01-01
14 620 2016-06-01
14 642 2016-07-01
14 656 2016-07-05
merchant_id Second_last_contract_id
10 501
13 456
14 642
contract_id != previous contract_id + X. (The X is not fixed)
'start_date' tells us the order in which the contracts were created.
Here's one option using user-defined variables to establish a row number per group of merchants and then filtering on the 2nd in each group ordered by contracts:
select *
from (
select *,
@rn:=if(@prevMerchantId=merchantid,
@rn+1,
if(@prevMerchantId:=merchantid, 1, 1)
) as rn
from yourtable cross join (select @rn:=0, @prevMerchantId:=null) t
order by merchantId, contractid desc
) t
where rn = 2
SQL Fiddle Demo
Here's another option, filtering the results of GROUP_CONCAT() using SUBSTRING_INDEX():
SELECT merchant_id,
SUBSTRING_INDEX(SUBSTRING_INDEX(
GROUP_CONCAT(contract_id ORDER BY start_date DESC),
',', 2), ',', -1) AS Second_last_contract_id
FROM the_table
GROUP BY merchant_id
See it on sqlfiddle.
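On engines with window functions (MySQL 8.0 and later, for example), the same "second row per group" filter can be written with ROW_NUMBER() instead of user variables or GROUP_CONCAT(). This is a different technique from the two answers above, shown as a minimal sketch that assumes Python's built-in sqlite3 module as the engine; the table name contracts is hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contracts (merchant_id INTEGER, contract_id INTEGER, start_date TEXT)"
)
conn.executemany(
    "INSERT INTO contracts VALUES (?, ?, ?)",
    [
        (10, 501, "2016-05-01"),
        (10, 506, "2016-06-01"),
        (13, 456, "2015-12-01"),
        (13, 462, "2016-01-01"),
        (14, 620, "2016-06-01"),
        (14, 642, "2016-07-01"),
        (14, 656, "2016-07-05"),
    ],
)

query = """
SELECT merchant_id, contract_id AS second_last_contract_id
FROM (
    SELECT contracts.*,
           -- 1 = newest contract per merchant, 2 = second newest, ...
           ROW_NUMBER() OVER (PARTITION BY merchant_id
                              ORDER BY start_date DESC) AS rn
    FROM contracts
) AS ranked
WHERE rn = 2
ORDER BY merchant_id
"""

second_last = conn.execute(query).fetchall()
for row in second_last:
    print(row)
```

Merchants with only one contract produce no row here, because they never reach rn = 2.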
I am looking for some query help
here is the following table data
Name Runs Status
Ram 50 out
Ram 103 not out
Krish 51 out
Sam 15 out
Ram 15 out
Krish 78 not out
I am expecting a single query to give the following results
Name Total >100 >50&<100 TotalTimes Notout
Ram 168 1 1 3 1
Sam 15 0 0 1 0
Krish 129 0 2 2 1
I am able to write a query that gets Total and TotalTimes using GROUP BY, but I am stuck on the rest.
Here is the query I have come up with:
select Name, sum(Runs) as total, count(*) as totalTimes
from tempTable
where classID IN (Select classID from upcoming_Clases where classes_id=175)
group by Name order by total desc
I am using the Mysql Database
You can do this using case:
select Name,
sum(Runs) as total,
count(case when Runs>100 then 1 end) `>100`,
count(case when Runs>50 and Runs<100 then 1 end) `>50&<100`,
count(*) as totalTimes,
count(case when Status='not out' then 1 end) `Not Out`
from tempTable
where classID IN (Select classID from upcoming_Clases where classes_id=175)
group by Name order by total desc
You can use SUM() together with IF() to test your criteria:
SELECT
Name,
SUM(Runs) AS Total,
SUM(IF(Runs>100, 1, 0)) AS `>100`,
SUM(IF(Runs>50 AND Runs<100, 1, 0)) AS `>50&<100`,
COUNT(*) AS TotalTimes,
SUM(IF(Status='not out', 1, 0)) AS Notout
FROM tempTable
WHERE classID IN (SELECT classID FROM upcoming_Clases WHERE classes_id = 175)
GROUP BY Name
ORDER BY Total DESC
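Conditional aggregation of this kind can be tried on the question's sample rows. The following is a minimal sketch, assuming Python's built-in sqlite3 module instead of MySQL, so it uses the portable SUM(CASE ...) form rather than IF(); the table name scores and the column aliases are made up, and the classID filter is left out because the sample data has no such column.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, runs INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [
        ("Ram", 50, "out"),
        ("Ram", 103, "not out"),
        ("Krish", 51, "out"),
        ("Sam", 15, "out"),
        ("Ram", 15, "out"),
        ("Krish", 78, "not out"),
    ],
)

query = """
SELECT name,
       SUM(runs) AS total,
       -- each CASE yields 1 for a matching row and 0 otherwise
       SUM(CASE WHEN runs > 100 THEN 1 ELSE 0 END) AS over_100,
       SUM(CASE WHEN runs > 50 AND runs < 100 THEN 1 ELSE 0 END) AS between_50_100,
       COUNT(*) AS total_times,
       SUM(CASE WHEN status = 'not out' THEN 1 ELSE 0 END) AS not_out
FROM scores
GROUP BY name
ORDER BY total DESC
"""

summary = conn.execute(query).fetchall()
for row in summary:
    print(row)
```

With the strict bounds runs > 50 and runs < 100, a score of exactly 50 is not counted, so Ram's 50 falls outside that bucket; use >= 50 if it should be included.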