Get last updated value SQL - mysql

I have the following table structure:
emp_id | base_rate | base_sal | effective_on
1001   | 26.22     | 1200     | 2015-10-12
1001   | 26.00     | 1100     | 2015-11-12
1001   | 26.00     | 1100     | 2015-12-12
1002   | 18        | 1200     | 2015-10-12
1002   | 19        | 1100     | 2015-11-12
I need to get the last updated base_rate, with its effective_on date, for each emp_id.
The expected output is:
1001   | 26.00     | 1100     | 2015-11-12
1002   | 19        | 1100     | 2015-11-12
Note that for 1001, 2015-11-12 is selected instead of the latest date, 2015-12-12: the base_rate did not change on 2015-12-12, so the current rate has been effective since 2015-11-12.
I have tried everything I could think of but cannot find the right query.

This method is simple and easy to understand.
1) Assign a rank to each effective date in descending order, partitioned by employee.
2) In the outer query, select the required fields from the row with the latest effective date.
Note that this returns the row with the latest effective_on for each employee (2015-12-12 for 1001), not the date on which the current base_rate first took effect.
SELECT emp_id, base_rate, base_sal, effective_on
FROM
(
    SELECT *,
           ROW_NUMBER() OVER ( PARTITION BY emp_id ORDER BY effective_on DESC ) AS rn
    FROM table
) t
WHERE rn = 1;

One method is to generate a subset containing each employee's maximum effective_on date and join it back to the base set.
Below, we generate set "B" with Emp_ID and ME (max effective), then join it back to the full table on Emp_ID and ME so that only each employee's most recent row remains. A second join to the same table (C) then finds the earliest effective_on date that has the same base_rate and base_sal as that most recent row.
In plain English:
We build a data set holding only the max effective date for each employee, then join it back to the base set to keep only the records with each employee's most recent effective_on date.
SELECT A.Emp_ID, A.Base_Rate, A.Base_Sal, MIN(C.Effective_On) AS Effective_On
FROM Table A
INNER JOIN (SELECT Emp_ID, MAX(Effective_On) ME
            FROM Table
            GROUP BY Emp_ID) B
   ON A.Emp_ID = B.Emp_ID
  AND A.Effective_On = B.ME
INNER JOIN Table C
   ON C.Emp_ID = A.Emp_ID
  AND C.Base_Rate = A.Base_Rate
  AND C.Base_Sal = A.Base_Sal
GROUP BY A.Emp_ID, A.Base_Rate, A.Base_Sal
This is more or less database agnostic, whereas a query based on row_number would not work on MySQL versions before 8.0, which do not support window functions.
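As a quick check, the join-back query above can be run against the question's sample rows. This is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL; the table name emp_rates is an assumption made for the example, since the original uses a placeholder name.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL.
# emp_rates is a hypothetical table name chosen for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp_rates ("
    "emp_id INTEGER, base_rate REAL, base_sal INTEGER, effective_on TEXT)"
)
conn.executemany(
    "INSERT INTO emp_rates VALUES (?, ?, ?, ?)",
    [
        (1001, 26.22, 1200, "2015-10-12"),
        (1001, 26.00, 1100, "2015-11-12"),
        (1001, 26.00, 1100, "2015-12-12"),
        (1002, 18, 1200, "2015-10-12"),
        (1002, 19, 1100, "2015-11-12"),
    ],
)

query = """
SELECT a.emp_id, a.base_rate, a.base_sal, MIN(c.effective_on) AS effective_on
FROM emp_rates a
JOIN (SELECT emp_id, MAX(effective_on) AS me
      FROM emp_rates
      GROUP BY emp_id) b
  ON a.emp_id = b.emp_id AND a.effective_on = b.me
JOIN emp_rates c
  ON c.emp_id = a.emp_id
 AND c.base_rate = a.base_rate
 AND c.base_sal = a.base_sal
GROUP BY a.emp_id, a.base_rate, a.base_sal
ORDER BY a.emp_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

One caveat of this approach: MIN looks at every row with the same rate and salary, so if an employee's rate ever returned to an older value after changing, the query would report the start of the older period rather than the current one.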

You can first get the earliest date on which each base_rate became effective for every employee, and then take the latest of those dates. Here is how you can do it using row_number() in Oracle:
with temp(emp_id, base_rate, base_sal, effective_on)
as (select 1001, 26.22, 1200, '2015-10-12' from dual union all
    select 1001, 26.00, 1100, '2015-11-12' from dual union all
    select 1001, 26.00, 1100, '2015-12-12' from dual union all
    select 1002, 18, 1200, '2015-10-12' from dual union all
    select 1002, 19, 1100, '2015-11-12' from dual
)
SELECT emp_id, base_rate, base_sal, effective_on
FROM (
    SELECT temp2.*,
           row_number() OVER (PARTITION BY EMP_ID ORDER BY effective_on DESC) AS rn2
    FROM (
        SELECT temp.*,
               row_number() OVER (PARTITION BY EMP_ID, BASE_RATE ORDER BY effective_on) AS rn
        FROM temp
    ) temp2
    WHERE rn = 1
)
WHERE rn2 = 1;
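The same two-step ranking can be tried outside Oracle. The sketch below assumes Python's built-in sqlite3 module with an SQLite library of version 3.25 or later (the first release with window functions); the table name emp_rates and the aliases first_dates and latest are made up for the example.

```python
import sqlite3

# In-memory SQLite database; emp_rates is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp_rates ("
    "emp_id INTEGER, base_rate REAL, base_sal INTEGER, effective_on TEXT)"
)
conn.executemany(
    "INSERT INTO emp_rates VALUES (?, ?, ?, ?)",
    [
        (1001, 26.22, 1200, "2015-10-12"),
        (1001, 26.00, 1100, "2015-11-12"),
        (1001, 26.00, 1100, "2015-12-12"),
        (1002, 18, 1200, "2015-10-12"),
        (1002, 19, 1100, "2015-11-12"),
    ],
)

query = """
SELECT emp_id, base_rate, base_sal, effective_on
FROM (
    -- Step 2: among the first dates of each rate, keep the latest per employee.
    SELECT first_dates.*,
           ROW_NUMBER() OVER (PARTITION BY emp_id ORDER BY effective_on DESC) AS rn2
    FROM (
        -- Step 1: number the rows of each (employee, rate) pair by date.
        SELECT emp_rates.*,
               ROW_NUMBER() OVER (PARTITION BY emp_id, base_rate
                                  ORDER BY effective_on) AS rn
        FROM emp_rates
    ) AS first_dates
    WHERE rn = 1
) AS latest
WHERE rn2 = 1
ORDER BY emp_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```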

Related

I want to assign rank following with some condition

I want to assign a rank to 'drug_name' in order of 'svcdate' for each 'patient_id' in a dataset. (To describe the issue, I'm only showing one patient_id in the image.)
select patient_id
,svcdate
,drug_name
,dense_rank() over(partition by patient_id order by first_date) as rank
from (
select *
,first_value(svcdate) over (
partition by patient_id, drug_name
order by svcdate) as first_date
from table
)
order by 1,2;
With this query I'm getting the following output,
Although, I want something like this (as shown in image below)
Please help me understand what I'm missing out in the query and how to address this issue.
Thanks!!
Using this CTE for the data:
with data(patient_id, svcdate, drug_name) as (
select * from values
(110, '2018-08-09'::date, 'TRANEXAMIC ACID'),
(110, '2020-05-28'::date, 'TAKHZYRO'),
(110, '2020-06-10'::date, 'ICATIBANT'),
(110, '2020-06-24'::date, 'TAKHZYRO'),
(110, '2020-07-22'::date, 'TAKHZYRO'),
(110, '2020-07-24'::date, 'ICATIBANT'),
(110, '2020-08-31'::date, 'ICATIBANT'),
(110, '2020-08-31'::date, 'TAKHZYRO')
)
And using CONDITIONAL_CHANGE_EVENT gives you what you want:
select patient_id
,svcdate
,drug_name
,CONDITIONAL_CHANGE_EVENT( drug_name ) OVER (
PARTITION BY patient_id ORDER BY svcdate )+1 as rank
from data
order by 1,2;
gives:
PATIENT_ID  SVCDATE     DRUG_NAME        RANK
110         2018-08-09  TRANEXAMIC ACID  1
110         2020-05-28  TAKHZYRO         2
110         2020-06-10  ICATIBANT        3
110         2020-06-24  TAKHZYRO         4
110         2020-07-22  TAKHZYRO         4
110         2020-07-24  ICATIBANT        5
110         2020-08-31  ICATIBANT        5
110         2020-08-31  TAKHZYRO         6
We can use the LAG window function in a subquery to get the previous drug_name for each row, then use a conditional running SUM window function to build the rank column. The first row has no previous value, so the running sum starts at 0; adding 1 makes the rank start at 1.
select patient_id
,svcdate
,drug_name
,SUM(CASE WHEN prev_drug_name <> drug_name THEN 1 ELSE 0 END) over(partition by patient_id order by svcdate) + 1 as rank
from (
select *,LAG(drug_name) OVER(partition by patient_id ORDER BY svcdate) prev_drug_name
from table
)
order by 1,2;
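The LAG approach above can be checked against the sample rows from the CTE. This is a sketch under a few assumptions: it uses Python's built-in sqlite3 module (SQLite 3.25 or later) instead of the original database, the table name claims and the alias drug_rank are invented for the example, and drug_name is added as a tie-breaker with an explicit ROWS frame so that the two rows dated 2020-08-31 are processed one at a time in a fixed order.

```python
import sqlite3

# In-memory SQLite database; claims is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE claims (patient_id INTEGER, svcdate TEXT, drug_name TEXT)")
conn.executemany(
    "INSERT INTO claims VALUES (?, ?, ?)",
    [
        (110, "2018-08-09", "TRANEXAMIC ACID"),
        (110, "2020-05-28", "TAKHZYRO"),
        (110, "2020-06-10", "ICATIBANT"),
        (110, "2020-06-24", "TAKHZYRO"),
        (110, "2020-07-22", "TAKHZYRO"),
        (110, "2020-07-24", "ICATIBANT"),
        (110, "2020-08-31", "ICATIBANT"),
        (110, "2020-08-31", "TAKHZYRO"),
    ],
)

query = """
SELECT patient_id, svcdate, drug_name,
       -- Running count of drug changes; +1 so the first row gets rank 1.
       SUM(CASE WHEN prev_drug_name <> drug_name THEN 1 ELSE 0 END) OVER (
           PARTITION BY patient_id
           ORDER BY svcdate, drug_name
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) + 1 AS drug_rank
FROM (
    SELECT claims.*,
           LAG(drug_name) OVER (PARTITION BY patient_id
                                ORDER BY svcdate, drug_name) AS prev_drug_name
    FROM claims
) AS with_prev
ORDER BY patient_id, svcdate, drug_name
"""
rows = conn.execute(query).fetchall()
ranks = [row[3] for row in rows]
print(ranks)
```

Without the tie-breaker and the ROWS frame, the default frame treats rows with the same svcdate as peers and gives them the same running total, so the result for 2020-08-31 could differ from the expected ranks.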

How to optimize the subqueries in SQL?

I have a data set; sample rows look like this:
Date ID Cost
05/01 1001 30
05/01 1024 19
05/01 1001 29
05/02 1001 28
05/02 1002 19
05/02 1008 16
05/03 1017 89
05/04 1003 28
05/04 1001 16
05/05 1017 28
05/06 1002 44
... etc...
I want to create a table that shows the top payer (the ID with the highest cost) for each day, so the table has only two columns. The output should look like this:
Date ID
05/01 1001
05/02 1001
05/03 1017
05/04 1003
...etc...
I know this question is simple, and my problem is that I want to simplify the queries.
My query:
select Date, ID
from (select Date, ID, max(SumCost)
from (select Date, ID, sum(cost) as SumCost
from table1
group by Date, ID) a
group by Date, ID) b;
It seems kind of stupid, and I want to optimize the queries. The point is that I want to only output the Date and the Id, these two columns.
Any suggestions?
Here is a method using a correlated subquery:
select t.*
from t
where t.cost = (select max(t2.cost) from t t2 where t2.date = t.date);
If we take the maximum single cost when a payer has multiple costs on the same day, then the query below will work. The query in the question is incorrect.
Select date, ID
from
(
Select Date, ID, row_number() over(partition by date order by cost desc) as rnk
from table
) a
where rnk = 1
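The question's own query sums the cost per ID and day, which suggests the top payer should be decided by the daily total rather than by a single row. A minimal sketch of that reading follows, using Python's built-in sqlite3 module (SQLite 3.25 or later); the table name payments and the column name day are assumptions made for the example.

```python
import sqlite3

# In-memory SQLite database; payments and day are hypothetical names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (day TEXT, id INTEGER, cost INTEGER)")
conn.executemany(
    "INSERT INTO payments VALUES (?, ?, ?)",
    [
        ("05/01", 1001, 30), ("05/01", 1024, 19), ("05/01", 1001, 29),
        ("05/02", 1001, 28), ("05/02", 1002, 19), ("05/02", 1008, 16),
        ("05/03", 1017, 89),
        ("05/04", 1003, 28), ("05/04", 1001, 16),
        ("05/05", 1017, 28),
        ("05/06", 1002, 44),
    ],
)

query = """
WITH totals AS (
    -- Total cost per payer per day.
    SELECT day, id, SUM(cost) AS total
    FROM payments
    GROUP BY day, id
),
ranked AS (
    -- Rank payers within each day by their total, highest first.
    SELECT day, id,
           ROW_NUMBER() OVER (PARTITION BY day ORDER BY total DESC) AS rnk
    FROM totals
)
SELECT day, id
FROM ranked
WHERE rnk = 1
ORDER BY day
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

ROW_NUMBER keeps exactly one row per day, so if two payers tie on the total, one of them is picked arbitrarily; RANK could be used instead to return all tied payers.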

MySQL last record in each group with multiple records in same date

Below is the sample data
row_id cust txn_dt txn_amount
-------------------------------------
1 1 31-01-2018 3000
2 1 04-02-2018 4000
3 1 04-02-2018 6000
4 2 29-01-2018 2500
5 2 02-02-2018 3900
6 1 01-02-2018 5000
7 1 01-02-2018 3900
Below is the Expected output
row_id cust txn_dt txn_amount
-------------------------------------
3 1 04-02-2018 6000
5 2 02-02-2018 3900
Need to pick the latest record for each customer based on date and then row_id
It is tricky when there are two columns that define the ordering. Here is one method:
select t.*
from t
where t.row_id = (select t2.row_id
from t t2
where t2.cust = t.cust
order by t2.txn_date desc, t2.row_id desc
limit 1
);
An index on t(cust, txn_date, row_id) should help performance a bit.
Here's an approach that will return the specified result:
SELECT t.row_id
     , t.cust
     , t.txn_date
     , t.txn_amount
  FROM ( SELECT r.cust
              , MAX(r.row_id) AS max_row_id
           FROM ( SELECT p.cust
                       , DATE_FORMAT(
                           MAX(
                             STR_TO_DATE( p.txn_date ,'%d-%m-%Y')
                           )
                           ,'%d-%m-%Y'
                         ) AS max_txn_date
                    FROM sample_data p
                   GROUP BY p.cust
                ) q
           JOIN sample_data r
             ON r.cust = q.cust
            AND r.txn_date = q.max_txn_date
          GROUP BY r.cust
       ) s
  JOIN sample_data t
    ON t.cust = s.cust
   AND t.row_id = s.max_row_id
 ORDER BY t.row_id ASC
Inline view q gets the latest txn_date for each cust.
Inline view s gets the maximum row_id value for the latest txn_date for each cust.
(If the txn_date column were of DATE datatype, we could avoid the rigmarole of the STR_TO_DATE and DATE_FORMAT functions. And with an appropriate index available, we would likely avoid a full scan and an expensive "Using filesort" operation.)
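The correlated-subquery approach with ORDER BY and LIMIT 1 can be tried on the sample data. This sketch uses Python's built-in sqlite3 module rather than MySQL, and it assumes the dates are stored in ISO format (YYYY-MM-DD) so that they sort correctly as text; the table name txns is also an assumption.

```python
import sqlite3

# In-memory SQLite database; txns is a hypothetical table name.
# Dates are stored as ISO strings (an assumption) so text ordering matches date ordering.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE txns (row_id INTEGER, cust INTEGER, txn_date TEXT, txn_amount INTEGER)"
)
conn.executemany(
    "INSERT INTO txns VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2018-01-31", 3000),
        (2, 1, "2018-02-04", 4000),
        (3, 1, "2018-02-04", 6000),
        (4, 2, "2018-01-29", 2500),
        (5, 2, "2018-02-02", 3900),
        (6, 1, "2018-02-01", 5000),
        (7, 1, "2018-02-01", 3900),
    ],
)

query = """
SELECT t.row_id, t.cust, t.txn_date, t.txn_amount
FROM txns t
WHERE t.row_id = (
    -- Latest date first, then the highest row_id on that date.
    SELECT t2.row_id
    FROM txns t2
    WHERE t2.cust = t.cust
    ORDER BY t2.txn_date DESC, t2.row_id DESC
    LIMIT 1
)
ORDER BY t.row_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

If the dates were kept as DD-MM-YYYY strings, as in the sample, the ORDER BY would compare them as text and could pick the wrong row, which is why the second answer converts them with STR_TO_DATE first.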

Count number of rows of a column SQL

How can I count, for every row, the number of rows in a table that share the same column values?
ReportID Reader ReadTime
100 A 12:00
100 A 12:10
100 A 12:15
200 B 15:00
200 B 15:00
200 B 15:05
Expected outcome:
ReportID  Reader  ReadTime  Count Read by Reader and Time
100       A       12:00     1
100       A       12:10     1
100       A       12:15     1
200       B       15:00     2
200       B       15:00     2
200       B       15:05     1
You want to count without GROUP BY. This is done with OVER (a so-called window function):
COUNT(*) OVER (PARTITION BY ReportID, Reader, ReadTime)
I cannot tell whether this works in your database, because the question has no DBMS tag.
However, here are some slides that explain window functions and also show which databases support them:
https://www.slideshare.net/MarkusWinand/modern-sql/75
If your dbms doesn't support window functions, a simple correlated sub-query will do the trick:
select t1.ReportID, t1.Reader, t1.ReadTime,
(select count(*) from tablename t2
where t2.ReportID = t1.ReportID
and t2.Reader = t1.Reader
and t2.ReadTime = t1.ReadTime) as cnt
from tablename t1
Or, join with a derived table:
select t1.ReportID, t1.Reader, t1.ReadTime, t2.cnt
from tablename t1
join (select ReportID, Reader, ReadTime, count(*) as cnt
from tablename
group by ReportID, Reader, ReadTime) t2
on t2.ReportID = t1.ReportID
and t2.Reader = t1.Reader
and t2.ReadTime = t1.ReadTime
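To confirm that the correlated subquery and the derived-table join produce the same counts, both can be run on the sample rows. The sketch below uses Python's built-in sqlite3 module and keeps the placeholder table name tablename from the queries above; the ORDER BY clauses are added only to make the output order predictable.

```python
import sqlite3

# In-memory SQLite database holding the question's sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (ReportID INTEGER, Reader TEXT, ReadTime TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?)",
    [
        (100, "A", "12:00"),
        (100, "A", "12:10"),
        (100, "A", "12:15"),
        (200, "B", "15:00"),
        (200, "B", "15:00"),
        (200, "B", "15:05"),
    ],
)

# Variant 1: correlated subquery counting matching rows for each row.
correlated = """
SELECT t1.ReportID, t1.Reader, t1.ReadTime,
       (SELECT COUNT(*) FROM tablename t2
        WHERE t2.ReportID = t1.ReportID
          AND t2.Reader = t1.Reader
          AND t2.ReadTime = t1.ReadTime) AS cnt
FROM tablename t1
ORDER BY t1.ReportID, t1.ReadTime
"""

# Variant 2: join each row to a pre-aggregated derived table.
joined = """
SELECT t1.ReportID, t1.Reader, t1.ReadTime, t2.cnt
FROM tablename t1
JOIN (SELECT ReportID, Reader, ReadTime, COUNT(*) AS cnt
      FROM tablename
      GROUP BY ReportID, Reader, ReadTime) t2
  ON t2.ReportID = t1.ReportID
 AND t2.Reader = t1.Reader
 AND t2.ReadTime = t1.ReadTime
ORDER BY t1.ReportID, t1.ReadTime
"""

correlated_rows = conn.execute(correlated).fetchall()
joined_rows = conn.execute(joined).fetchall()
counts = [row[3] for row in correlated_rows]
print(counts)
```

Both variants keep all six original rows, unlike a plain GROUP BY, which would collapse the two 15:00 rows into one.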
You could use:
Select reportid,reader,readtime,count(*) over (partition by reportid,reader,readtime) from table;
Simply do count(*) over()...
SELECT *, COUNT(*) OVER (PARTITION BY Reader, ReadTime) [Count] FROM <table>
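Try grouping, which returns one row per distinct combination rather than one row per original record:
Select reportid,reader,readtime,count(*) from table1
group by reportid,reader,readtime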
http://sqlfiddle.com/#!9/2b938d/12

Second Last records

I am trying to get the second-to-last record for each merchant using MySQL.
I did some research, but the examples I found assume a fixed gap between numbers or dates. In my situation the contract_id is not always +1 from the previous one. Any ideas? Thank you so much.
merchant_id contract_id start_date
10 501 2016-05-01
10 506 2016-06-01
13 456 2015-12-01
13 462 2016-01-01
14 620 2016-06-01
14 642 2016-07-01
14 656 2016-07-05
merchant_id Second_last_contract_id
10 501
13 456
14 642
contract_id != previous contract_id + X. (The X is not fixed)
'start_date' tells us the order in which the contracts were created.
Here's one option using user-defined variables to establish a row number per merchant and then filtering on the 2nd row in each group, ordered by contract:
select *
from (
    select *,
           @rn:=if(@prevMerchantId=merchant_id,
                   @rn+1,
                   if(@prevMerchantId:=merchant_id, 1, 1)
                  ) as rn
    from yourtable cross join (select @rn:=0, @prevMerchantId:=null) t
    order by merchant_id, contract_id desc
) t
where rn = 2
SQL Fiddle Demo
Here's another option, filtering the results of GROUP_CONCAT() using SUBSTRING_INDEX():
SELECT merchant_id,
SUBSTRING_INDEX(SUBSTRING_INDEX(
GROUP_CONCAT(contract_id ORDER BY start_date DESC),
',', 2), ',', -1) AS Second_last_contract_id
FROM the_table
GROUP BY merchant_id
See it on sqlfiddle.
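On databases with window functions (MySQL 8.0 and later, for example), the same result can be obtained with ROW_NUMBER ordered by start_date, which the question names as the creation order. The following is a minimal sketch, assuming Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in; the table name contracts and the alias ranked are made up for the example.

```python
import sqlite3

# In-memory SQLite database; contracts is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contracts (merchant_id INTEGER, contract_id INTEGER, start_date TEXT)"
)
conn.executemany(
    "INSERT INTO contracts VALUES (?, ?, ?)",
    [
        (10, 501, "2016-05-01"),
        (10, 506, "2016-06-01"),
        (13, 456, "2015-12-01"),
        (13, 462, "2016-01-01"),
        (14, 620, "2016-06-01"),
        (14, 642, "2016-07-01"),
        (14, 656, "2016-07-05"),
    ],
)

query = """
SELECT merchant_id, contract_id AS second_last_contract_id
FROM (
    -- Number each merchant's contracts from newest (1) to oldest.
    SELECT merchant_id, contract_id,
           ROW_NUMBER() OVER (PARTITION BY merchant_id
                              ORDER BY start_date DESC) AS rn
    FROM contracts
) AS ranked
WHERE rn = 2
ORDER BY merchant_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Merchants with only one contract have no row numbered 2 and therefore do not appear in the result.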