MySql-Pivot Table - mysql

I have a database table (return_period) having records
id ReturnPeriod Value Date
1 10 10X 11/1/2012
2 20 20x 11/1/2012
3 30 30x 11/1/2012
4 10 10xx 12/1/2013
5 20 20xx 12/1/2013
6 30 30y 1/1/2015
7 30 303 1/1/2015
and expecting an output table like below:
Date Rp10_Value Rp20_Value Rp30_Value
11/1/2012 10x 20x 30x
12/1/2013 10XX 20XX
1/1/2015 30y
1/1/2015 303
I want the records grouped by date, keeping multiple rows when a date has duplicate entries. Is there a way to write a query for this type of requirement? Thanks.

select date,
case when ReturnPeriod=10 then value else null end as Rp10_Value,
case when ReturnPeriod=20 then value else null end as Rp20_Value,
case when ReturnPeriod=30 then value else null end as Rp30_Value
from
(
SELECT date, value, ReturnPeriod
FROM Table2 where ReturnPeriod=10
union all
SELECT date, value, ReturnPeriod
FROM Table2 where ReturnPeriod=20
union all
SELECT date, value, ReturnPeriod
FROM Table2 where ReturnPeriod=30
) t;

This is a pivot. In MySQL, you can use conditional aggregation:
select rp.date,
max(case when returnperiod = 10 then value end) as rp10_value,
. . .
from return_period rp
group by rp.date;
EDIT:
I see you have duplicates. The same idea applies, but you need to include a sequence number (row_number() requires MySQL 8.0 or later):
select rp.date,
max(case when returnperiod = 10 then value end) as rp10_value,
. . .
from (select rp.*,
row_number() over (partition by date, returnperiod order by id) as seqnum
from return_period rp
) rp
group by rp.date, seqnum;
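For anyone who wants to try the sequence-number idea without a MySQL server, here is a minimal sketch that runs the same kind of query through Python's sqlite3 module. SQLite is only a stand-in here (an assumption, not what the question uses), the helper name pivot_return_periods is made up for illustration, and the sample rows are the ones from the question. The inner ROW_NUMBER() is ordered by id so that duplicates come out in a stable order.

```python
import sqlite3

def pivot_return_periods(rows):
    """Pivot (id, ReturnPeriod, Value, Date) rows into one column per return period.

    Duplicates within the same (Date, ReturnPeriod) get their own output row,
    because the sequence number is part of the GROUP BY.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE return_period ("
        "id INTEGER PRIMARY KEY, ReturnPeriod INTEGER, Value TEXT, Date TEXT)"
    )
    conn.executemany("INSERT INTO return_period VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT Date,
               MAX(CASE WHEN ReturnPeriod = 10 THEN Value END) AS Rp10_Value,
               MAX(CASE WHEN ReturnPeriod = 20 THEN Value END) AS Rp20_Value,
               MAX(CASE WHEN ReturnPeriod = 30 THEN Value END) AS Rp30_Value
        FROM (SELECT rp.*,
                     ROW_NUMBER() OVER (PARTITION BY Date, ReturnPeriod
                                        ORDER BY id) AS seqnum
              FROM return_period rp) AS numbered
        GROUP BY Date, seqnum
        ORDER BY MIN(id)
        """
    ).fetchall()
    conn.close()
    return result

# Sample rows copied from the question.
sample = [
    (1, 10, "10X", "11/1/2012"),
    (2, 20, "20x", "11/1/2012"),
    (3, 30, "30x", "11/1/2012"),
    (4, 10, "10xx", "12/1/2013"),
    (5, 20, "20xx", "12/1/2013"),
    (6, 30, "30y", "1/1/2015"),
    (7, 30, "303", "1/1/2015"),
]
pivoted = pivot_return_periods(sample)
for row in pivoted:
    print(row)
```

The two 1/1/2015 rows stay separate because they get seqnum 1 and 2; without seqnum in the GROUP BY, MAX() would collapse them into one row.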

Related

Interpolate Multiseries Data In SQL

I have a system that stores the data only when they are changed. So, the dataset looks like below.
data_type_id | data_value | inserted_at
----------------------------------------
2 | 240 | 2022-01-19 17:20:52
1 | 30 | 2022-01-19 17:20:47
2 | 239 | 2022-01-19 17:20:42
1 | 29 | 2022-01-19 17:20:42
My data arrives every 5 seconds. So, whether or not a row exists for a given 5-second timestamp, I need the result to assume that the value at that timestamp is the same as the previous value.
Since I only store values that have changed, the full dataset should actually look like this:
data_type_id | data_value | inserted_at
----------------------------------------
2 | 240 | 2022-01-19 17:20:52
1 | 30 | 2022-01-19 17:20:52
2 | 239 | 2022-01-19 17:20:47
1 | 30 | 2022-01-19 17:20:47
2 | 239 | 2022-01-19 17:20:42
1 | 29 | 2022-01-19 17:20:42
I don't want to insert into my table, I just want to retrieve the data like this on the SELECT statement.
Is there any way I can create this query?
PS. I have many data_types, so when a user runs a query, it usually returns around a million rows.
EDIT:
Server information: Server version: 10.3.27-MariaDB-0+deb10u1 Debian 10
The user determines the datetime range of the SELECT, so there is no fixed time range.
As @Akina mentioned, there are sometimes gaps between the inserted_at values. The difference might be ~4 seconds or ~6 seconds instead of exactly 5 seconds. Since this does not happen frequently, it is okay to generate the result ignoring this fact.
With the help of a query that gets you all the combinations of data_type_id and the 5-second moments you need, you can achieve the result you need using a subquery that gets you the closest data_value:
with recursive u as
(select '2022-01-19 17:20:42' as d
union all
select DATE_ADD(d, interval 5 second) from u
where d < '2022-01-19 17:20:52'),
v as
(select * from u cross join (select distinct data_type_id from table_name) t)
select v.data_type_id,
(select data_value from table_name where inserted_at <= d and data_type_id = v.data_type_id
order by inserted_at desc limit 1) as data_value,
d as inserted_at
from v
You can replace the recursive CTE with any query that gets you all the 5-second moments you need.
WITH RECURSIVE
cte1 AS ( SELECT @start_datetime dt
UNION ALL
SELECT dt + INTERVAL 5 SECOND FROM cte1 WHERE dt < @end_datetime),
cte2 AS ( SELECT *,
ROW_NUMBER() OVER (PARTITION BY test.data_type_id, cte1.dt
ORDER BY test.inserted_at DESC) rn
FROM cte1
LEFT JOIN test ON FIND_IN_SET(test.data_type_id, @data_type_ids)
AND cte1.dt >= test.inserted_at )
AND cte1.dt >= test.inserted_at )
SELECT *
FROM cte2
WHERE rn = 1
https://dbfiddle.uk/?rdbms=mariadb_10.3&fiddle=380ad334de0c980a0ddf1b49bb6fa38e
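As a rough, self-contained way to check the forward-fill idea from the answers above, the sketch below rebuilds the question's four rows in an in-memory SQLite database via Python's sqlite3 module. SQLite stands in for MariaDB here, so datetime(d, '+5 seconds') replaces DATE_ADD; the function name fill_forward and its parameters are hypothetical. It generates the 5-second grid with a recursive CTE, cross joins it with the distinct data_type_id values, and picks the latest value at or before each tick.

```python
import sqlite3

def fill_forward(rows, start, end):
    """Return one (data_type_id, data_value, tick) row per 5-second tick.

    Each tick carries the most recent value stored at or before it.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test (data_type_id INTEGER, data_value INTEGER, inserted_at TEXT)"
    )
    conn.executemany("INSERT INTO test VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        WITH RECURSIVE ticks(d) AS (
            SELECT :start
            UNION ALL
            SELECT datetime(d, '+5 seconds') FROM ticks WHERE d < :end
        ),
        grid AS (
            SELECT t.d, k.data_type_id
            FROM ticks t
            CROSS JOIN (SELECT DISTINCT data_type_id FROM test) k
        )
        SELECT g.data_type_id,
               (SELECT r.data_value
                FROM test r
                WHERE r.data_type_id = g.data_type_id
                  AND r.inserted_at <= g.d
                ORDER BY r.inserted_at DESC
                LIMIT 1) AS data_value,
               g.d AS inserted_at
        FROM grid g
        ORDER BY g.d DESC, g.data_type_id DESC
        """,
        {"start": start, "end": end},
    ).fetchall()
    conn.close()
    return result

# Rows copied from the question (only changed values are stored).
stored = [
    (2, 240, "2022-01-19 17:20:52"),
    (1, 30, "2022-01-19 17:20:47"),
    (2, 239, "2022-01-19 17:20:42"),
    (1, 29, "2022-01-19 17:20:42"),
]
filled = fill_forward(stored, "2022-01-19 17:20:42", "2022-01-19 17:20:52")
for row in filled:
    print(row)
```

With around a million rows, the correlated subquery will probably want an index on (data_type_id, inserted_at); that is a guess about the workload, not something measured here.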

MySQL find rows where yesterday's value is > X AND where last 5 days value < X exists

Let's say I have the following table:
date | name | value
----------------------------
2020-09-01 | name1 | 10
2020-09-02 | name1 | 9
2020-09-03 | name1 | 12
2020-09-04 | name1 | 11
2020-09-05 | name1 | 11
I would like to identify names where the latest value is >= 10 AND where the value has dropped below 10 at some point over the last 5 days. In the example table above, name1 would be returned because the latest date has a value of 11 (which is >= 10), and over the last 5 days it has dropped below 10 at least once.
Here is my SELECT statement, but it always returns zero rows:
SELECT
name,
count(value) as count
FROM table_name
WHERE
(date = @date AND value >= 10) AND
date BETWEEN date_sub(@date, interval 5 day) AND @date AND value < 10
GROUP BY name
HAVING count < 5
ORDER BY name
I understand why it's failing, but I don't know what to change.
In MySQL 8.0, you could use window functions and aggregation:
select name
from (
select t.*, row_number() over(partition by name order by date desc) rn
from mytable t
where date >= @date - interval 5 day and date <= @date
) t
group by name
having max(case when rn = 1 then value end) >= 10 and min(value) < 10
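Here is a small runnable sketch of the window-function approach, using Python's sqlite3 module as a stand-in for MySQL 8.0 (so date(:as_of, '-5 days') replaces the interval arithmetic). The function name names_recovered is hypothetical, and the name2 and name3 rows are invented test data that should be filtered out; only the name1 rows come from the question. It groups by name and requires a strict drop below 10.

```python
import sqlite3

def names_recovered(rows, as_of):
    """Names whose latest value in the window is >= 10 but which dipped below 10.

    The window is the 5 days up to and including as_of (an ISO date string).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (date TEXT, name TEXT, value INTEGER)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT name
        FROM (SELECT t.*,
                     ROW_NUMBER() OVER (PARTITION BY name ORDER BY date DESC) AS rn
              FROM mytable t
              WHERE date >= date(:as_of, '-5 days') AND date <= :as_of) t
        GROUP BY name
        HAVING MAX(CASE WHEN rn = 1 THEN value END) >= 10
           AND MIN(value) < 10
        ORDER BY name
        """,
        {"as_of": as_of},
    ).fetchall()
    conn.close()
    return [r[0] for r in result]

rows = [
    # name1: rows from the question
    ("2020-09-01", "name1", 10),
    ("2020-09-02", "name1", 9),
    ("2020-09-03", "name1", 12),
    ("2020-09-04", "name1", 11),
    ("2020-09-05", "name1", 11),
    # name2 (invented): never drops below 10
    ("2020-09-04", "name2", 12),
    ("2020-09-05", "name2", 13),
    # name3 (invented): dropped below 10, but outside the 5-day window
    ("2020-08-20", "name3", 5),
    ("2020-09-05", "name3", 15),
]
matches = names_recovered(rows, "2020-09-05")
print(matches)
```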
How about something like this:
SELECT Name, COUNT(*) AS Ct FROM
(SELECT A.*,B.mdate,
CASE WHEN A.date=B.mdate AND A.value >= 10 THEN 1
WHEN A.date >= B.mdate - INTERVAL 5 DAY AND A.date <> B.mdate AND A.value < 10 THEN 1
ELSE 0 END AS Chk
FROM table_name A
JOIN (SELECT Name,MAX(DATE) AS mdate FROM table_name GROUP BY Name) B ON A.Name=B.Name
HAVING Chk <> 0) V
GROUP BY Name
HAVING Ct >= 2
Here's a fiddle for reference: https://www.db-fiddle.com/f/jX4GktCdTrUbqHBf7ZQwdr/0
And here's a breakdown of what the query above is doing.
It joins table_name with a sub-query on the same table that returns the MAX(DATE) value per name for comparison.
It uses a CASE expression to check your conditions; a row that matches either condition returns 1, otherwise 0. The HAVING clause excludes the rows where the CASE expression returned 0.
It turns that query into a sub-query (aliased as V), does a COUNT(*) of the remaining rows per name, then uses HAVING again to keep any name with 2 or more occurrences.

MySQL last record in each group with multiple records in same date

Below is the sample data
row_id cust txn_dt txn_amount
-------------------------------------
1 1 31-01-2018 3000
2 1 04-02-2018 4000
3 1 04-02-2018 6000
4 2 29-01-2018 2500
5 2 02-02-2018 3900
6 1 01-02-2018 5000
7 1 01-02-2018 3900
Below is the Expected output
row_id cust txn_dt txn_amount
-------------------------------------
3 1 04-02-2018 6000
5 2 02-02-2018 3900
I need to pick the latest record for each customer, based on date and then row_id.
It is tricky when there are two columns that define the ordering. Here is one method:
select t.*
from t
where t.row_id = (select t2.row_id
from t t2
where t2.cust = t.cust
order by t2.txn_date desc, row_id desc
limit 1
);
An index on t(cust, txn_date, row_id) should help performance a bit.
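To see the correlated-subquery method in action, here is a minimal sketch using Python's sqlite3 module instead of MySQL. It assumes txn_date is stored in a sortable form (ISO yyyy-mm-dd strings here, standing in for a real DATE column), which the sample data in the question does not guarantee. The table name t matches the answer; the function name latest_per_customer is made up.

```python
import sqlite3

def latest_per_customer(rows):
    """Return the latest row per cust, ordering by txn_date and then row_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (row_id INTEGER PRIMARY KEY, cust INTEGER, "
        "txn_date TEXT, txn_amount INTEGER)"
    )
    # Index suggested in the answer; it lets the subquery find the top row quickly.
    conn.execute("CREATE INDEX idx_t_cust_date_row ON t (cust, txn_date, row_id)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT t.row_id, t.cust, t.txn_date, t.txn_amount
        FROM t
        WHERE t.row_id = (SELECT t2.row_id
                          FROM t t2
                          WHERE t2.cust = t.cust
                          ORDER BY t2.txn_date DESC, t2.row_id DESC
                          LIMIT 1)
        ORDER BY t.row_id
        """
    ).fetchall()
    conn.close()
    return result

# Sample data from the question, with dates rewritten as ISO strings (assumption).
txns = [
    (1, 1, "2018-01-31", 3000),
    (2, 1, "2018-02-04", 4000),
    (3, 1, "2018-02-04", 6000),
    (4, 2, "2018-01-29", 2500),
    (5, 2, "2018-02-02", 3900),
    (6, 1, "2018-02-01", 5000),
    (7, 1, "2018-02-01", 3900),
]
latest = latest_per_customer(txns)
for row in latest:
    print(row)
```

If the dates really are dd-mm-yyyy strings, sorting them as text would give wrong results, which is why the other answer converts them with STR_TO_DATE first.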
Here's an approach that will return the specified result:
SELECT t.row_id
, t.cust
, t.txn_date
, t.txn_amount
FROM ( SELECT r.cust
, MAX(r.row_id) AS max_row_id
FROM ( SELECT p.cust
, DATE_FORMAT(
MAX(
STR_TO_DATE( p.txn_date ,'%d-%m-%Y')
)
,'%d-%m-%Y'
) AS max_txn_date
FROM sample_data p
GROUP BY p.cust
) q
JOIN sample_data r
ON r.cust = q.cust
AND r.txn_date = q.max_txn_date
GROUP BY r.cust
) s
JOIN sample_data t
ON t.cust = s.cust
AND t.row_id = s.max_row_id
ORDER BY t.row_id ASC
Inline view q gets the latest txn_date for each cust.
Inline view s gets the maximum row_id value for the latest txn_date for each cust.
(If txn_date column was DATE datatype, we could avoid the rigmarole of the STR_TO_DATE and DATE_FORMAT functions. And with an appropriate index available, we would (likely) avoid a full scan and an expensive "Using filesort" operation.)

Is it possible to group by a few different date periods in mysql?

There is a table likes:
like_user_id | like_post_id | like_date
----------------------------------------
1 | 2 | 1399274149
5 | 2 | 1399271149
....
1 | 3 | 1399270129
I need to write one SELECT query that counts records for a specific like_post_id, grouped by periods of 1 day, 7 days, 1 month, and 1 year.
The result must be like:
period | total
---------------
1_day | 2
7_days | 31
1_month | 87
1 year | 141
Is it possible?
Thank you.
I have created a query in Oracle syntax; please adapt it to your database:
select '1_Day' as period , count(*) as Total
from likes
where like_date>(sysdate-1)
union
select '7_days' , count(*)
from likes
where like_date>(sysdate-7)
union
select '1_month' , count(*)
from likes
where like_date>(sysdate-30)
union
select '1 year' , count(*)
from likes
where like_date>(sysdate-365)
The idea here is to use one subquery per period and apply that period's filter in its WHERE clause.
This code shows how to build a cross-tab style query that you will likely need. This aggregates by like_post_id and you may want to put restrictions on it. Further, in terms of last month I don't know whether you mean month to date, last 30 days or last calendar month so I've left that to you.
SELECT
like_post_id,
-- cross-tab example, rinse and repeat as required
-- aside of date logic, the SUM(CASE logic is designed to be ANSI compliant but you could use IF instead of CASE
SUM(CASE WHEN FROM_UNIXTIME(like_date)>=DATE_SUB(CURRENT_DATE(), interval 1 day) THEN 1 ELSE 0 END) as 1_day,
...
FROM likes
-- to restrict the number of rows considered
WHERE FROM_UNIXTIME(like_date)>=DATE_SUB(CURRENT_DATE(), interval 1 year)
GROUP BY like_post_id
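Below is a small sketch of the same cross-tab pattern that can be run locally with Python's sqlite3 module. Several things are assumptions for the sake of a repeatable example: SQLite instead of MySQL, a fixed "now" passed in as a parameter instead of CURRENT_DATE(), rolling windows measured in seconds, a month taken as 30 days, and invented sample likes. The function name like_counts is hypothetical.

```python
import sqlite3

DAY = 86400  # seconds per day

def like_counts(likes, now):
    """Count likes per post within the last 1, 7, 30 and 365 days before `now`."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE likes (like_user_id INTEGER, like_post_id INTEGER, like_date INTEGER)"
    )
    conn.executemany("INSERT INTO likes VALUES (?, ?, ?)", likes)
    result = conn.execute(
        """
        SELECT like_post_id,
               SUM(CASE WHEN like_date >= :now - 1 * :day THEN 1 ELSE 0 END) AS one_day,
               SUM(CASE WHEN like_date >= :now - 7 * :day THEN 1 ELSE 0 END) AS seven_days,
               SUM(CASE WHEN like_date >= :now - 30 * :day THEN 1 ELSE 0 END) AS one_month,
               SUM(CASE WHEN like_date >= :now - 365 * :day THEN 1 ELSE 0 END) AS one_year
        FROM likes
        -- restrict the rows considered to the widest window
        WHERE like_date >= :now - 365 * :day
        GROUP BY like_post_id
        ORDER BY like_post_id
        """,
        {"now": now, "day": DAY},
    ).fetchall()
    conn.close()
    return result

NOW = 1_400_000_000  # fixed reference time so the example is repeatable
sample_likes = [
    (1, 2, NOW - 3600),       # one hour ago
    (5, 2, NOW - 2 * DAY),    # two days ago
    (7, 2, NOW - 20 * DAY),   # twenty days ago
    (9, 2, NOW - 400 * DAY),  # older than a year, filtered out
    (1, 3, NOW - 200 * DAY),  # only inside the one-year window
]
counts = like_counts(sample_likes, NOW)
for row in counts:
    print(row)
```

This gives one row per post with the periods as columns; turning those columns into period/total rows would need an extra step such as a UNION ALL or a join against a list of period names.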
To be flexible, simply make a table time_intervals which holds from_seconds and to_seconds, the bounds of each interval in seconds:
CREATE TABLE time_intervals
( id int(11) not null auto_increment primary key,
name varchar(255),
from_seconds int,
to_seconds int
);
The select is then quite straightforward:
select like_post_id, ti.name as interval_name, count(likes.like_post_id) as cnt_likes
from time_intervals ti
left /* or inner */ join likes on likes.like_post_id = 175
and likes.like_date between unix_timestamp(now()) - ti.to_seconds and unix_timestamp(now()) + ti.from_seconds
group by ti.id
With a left join you always get all intervals (even those with no likes); with an inner join you get only the intervals that have likes.
So you only need to change the time_intervals table to get what you want. The "175" stands for the post you want, and of course you can change it to where ... in () if you want.
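A rough sketch of the interval-table idea, runnable with Python's sqlite3 module, is shown below. The assumptions are mine: SQLite instead of MySQL, a fixed "now" parameter instead of unix_timestamp(now()), both from_seconds and to_seconds read as "seconds ago", and invented interval rows and likes. It counts a column from the likes side so that an interval with no matching likes reports 0 under the left join. The function name counts_by_interval is hypothetical.

```python
import sqlite3

DAY = 86400  # seconds per day

def counts_by_interval(intervals, likes, post_id, now):
    """Count likes of one post per configured interval, including empty intervals."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE time_intervals (id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "name TEXT, from_seconds INTEGER, to_seconds INTEGER)"
    )
    conn.execute(
        "CREATE TABLE likes (like_user_id INTEGER, like_post_id INTEGER, like_date INTEGER)"
    )
    conn.executemany(
        "INSERT INTO time_intervals (name, from_seconds, to_seconds) VALUES (?, ?, ?)",
        intervals,
    )
    conn.executemany("INSERT INTO likes VALUES (?, ?, ?)", likes)
    result = conn.execute(
        """
        SELECT ti.name, COUNT(l.like_post_id) AS cnt_likes
        FROM time_intervals ti
        LEFT JOIN likes l
               ON l.like_post_id = :post_id
              AND l.like_date BETWEEN :now - ti.to_seconds AND :now - ti.from_seconds
        GROUP BY ti.id, ti.name
        ORDER BY ti.id
        """,
        {"post_id": post_id, "now": now},
    ).fetchall()
    conn.close()
    return result

NOW = 1_400_000_000  # fixed reference time so the example is repeatable
interval_rows = [
    ("1_day", 0, 1 * DAY),
    ("7_days", 0, 7 * DAY),
    ("1_month", 0, 30 * DAY),
    ("1_year", 0, 365 * DAY),
]
like_rows = [
    (1, 2, NOW - 3600),
    (5, 2, NOW - 2 * DAY),
    (7, 2, NOW - 20 * DAY),
    (1, 3, NOW - 200 * DAY),
]
post2 = counts_by_interval(interval_rows, like_rows, 2, NOW)
post3 = counts_by_interval(interval_rows, like_rows, 3, NOW)
print(post2)
print(post3)
```

Adding a new period is then just another row in time_intervals, with no change to the query.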
Here is an alternative using CROSS JOIN. First, the time difference is calculated using the TIMESTAMPDIFF function and the appropriate parameter (DAY/WEEK/MONTH/YEAR). Then, if the counts are equal to 1, then the value is added up. Finally, the CROSS JOIN is made with an inline view containing the names of the periods.
SELECT
periods.period,
CASE periods.period
WHEN '1_day' THEN totals.1_day
WHEN '7_days' THEN totals.7_days
WHEN '1_month' THEN totals.1_month
WHEN '1_year' THEN totals.1_year
END total
FROM
(
SELECT
SUM(CASE days WHEN 2 THEN 1 ELSE 0 END) 1_day,
SUM(CASE weeks WHEN 1 THEN 1 ELSE 0 END) 7_days,
SUM(CASE months WHEN 1 THEN 1 ELSE 0 END) 1_month,
SUM(CASE years WHEN 1 THEN 1 ELSE 0 END) 1_year
FROM
(
SELECT
TIMESTAMPDIFF(YEAR, FROM_UNIXTIME(like_date), NOW()) years,
TIMESTAMPDIFF(MONTH, FROM_UNIXTIME(like_date), NOW()) months,
TIMESTAMPDIFF(WEEK, FROM_UNIXTIME(like_date), NOW()) weeks,
TIMESTAMPDIFF(DAY, FROM_UNIXTIME(like_date), NOW()) days
FROM likes
) counts
) totals
CROSS JOIN
(
SELECT
'1_day' period
UNION ALL
SELECT
'7_days'
UNION ALL
SELECT
'1_month'
UNION ALL
SELECT
'1_year'
) periods

Select All Columns By Most Recent Date and Highest Version

I have been stumped on this for quite a while. Request#, SlotId, Segment, and Version together make up the primary key. What I want from my stored proc is to retrieve all rows by passing in the Request # and Segment, but for each slot I want the most recent effective date on or before today's date, and from that I need the highest version #. I appreciate your time.
Values in database
Request# SlotId Segment Version Effective Date ContentId
A123 1 A 1 2012-01-01 1
A123 2 A 1 2012-01-01 2
A123 2 A 2 2012-02-01 34
A123 2 A 3 2012-02-01 24
A123 2 A 4 2015-01-01 6 //beyond todays date. dont want
Values I want to return from my stored proc when i pass in A123 for Request # and A for Segment.
A123 1 A 1 2012-01-01 1
A123 2 A 3 2012-02-01 24
The query could be written like this:
; WITH cte AS
( SELECT Request, SlotId, Segment, Version, [Effective Date], ContentId,
ROW_NUMBER() OVER ( PARTITION BY Request, Segment, SlotId
ORDER BY [Effective Date] DESC, Version DESC ) AS Rn
FROM
tableX
WHERE
Request = @Req AND Segment = @Seg --- the 2 parameters
AND [Effective Date] <= GETDATE()
)
SELECT Request, SlotId, Segment, Version, [Effective Date], ContentId
FROM cte
WHERE Rn = 1 ;
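The ROW_NUMBER() approach can be tried out with the sketch below, which uses Python's sqlite3 module rather than SQL Server. The column names are simplified, "today" is passed in as a hypothetical fixed date (2013-06-01) so the 2015 row counts as being in the future, and the function name current_slots is made up. The data rows are the ones from the question. Ordering by effective date first and version second picks the most recent date, then the highest version on that date.

```python
import sqlite3

def current_slots(rows, request, segment, today):
    """Per slot: the row with the latest effective date <= today, highest version first."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE slots (request TEXT, slot_id INTEGER, segment TEXT, "
        "version INTEGER, effective_date TEXT, content_id INTEGER, "
        "PRIMARY KEY (request, slot_id, segment, version))"
    )
    conn.executemany("INSERT INTO slots VALUES (?, ?, ?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        WITH cte AS (
            SELECT request, slot_id, segment, version, effective_date, content_id,
                   ROW_NUMBER() OVER (PARTITION BY request, segment, slot_id
                                      ORDER BY effective_date DESC, version DESC) AS rn
            FROM slots
            WHERE request = :req AND segment = :seg
              AND effective_date <= :today
        )
        SELECT request, slot_id, segment, version, effective_date, content_id
        FROM cte
        WHERE rn = 1
        ORDER BY slot_id
        """,
        {"req": request, "seg": segment, "today": today},
    ).fetchall()
    conn.close()
    return result

# Rows from the question.
slot_rows = [
    ("A123", 1, "A", 1, "2012-01-01", 1),
    ("A123", 2, "A", 1, "2012-01-01", 2),
    ("A123", 2, "A", 2, "2012-02-01", 34),
    ("A123", 2, "A", 3, "2012-02-01", 24),
    ("A123", 2, "A", 4, "2015-01-01", 6),  # in the future relative to `today`
]
selected = current_slots(slot_rows, "A123", "A", "2013-06-01")
for row in selected:
    print(row)
```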
Consider this:
;
WITH A as
(
SELECT DISTINCT
Request
, Segment
, SlotId
FROM Table1
)
SELECT A.Request
, A.SlotId
, A.Segment
, B.EffectiveDate
, B.Version
, B.ContentID
FROM A
-- CROSS APPLY lets the subquery reference A; a derived table in a plain JOIN cannot
CROSS APPLY (
SELECT Top 1
Request
, SlotId
, Segment
, EffectiveDate
, Version
, ContentId
FROM Table1 t1
WHERE t1.Request = A.Request
AND t1.SlotId = A.SlotId
AND T1.Segment = A.Segment
AND T1.EffectiveDate <= GetDate()
ORDER BY
T1.EffectiveDate DESC
, T1.Version DESC
) as B