Limit query result per unique columns combination [duplicate] - mysql

I would like to modify this query so that it returns at most 2 rows (the latest ones) per group:
select
    distinct clusterName,
    aksNamespace,
    acrName,
    acrImageName,
    acrImageVersion,
    date
from
    (
        select
            clusterName,
            aksNamespace,
            acrName,
            acrImageName,
            acrImageVersion,
            date
        from
            aks_images
        order by
            acrImageName,
            date desc
    ) as t
where
    acrName = "storage"
order by
    clusterName,
    acrImageName,
    date desc
Current result:
clusterName  aksNamespace  acrName  acrImageName  acrImageVersion  `date`
dev          support       storage  app           f74581b          17.02.2023 14:35
dev          support       storage  app           c6040a0          17.02.2023 7:45
dev          support       storage  app           4410f39          16.02.2023 10:43
dev          abc           storage  qwer          93241f1          15.02.2023 12:45
dev          abc           storage  qwer          249b089          14.02.2023 13:15
dev          abc           storage  qwer          1c40785          13.02.2023 13:30
prod         support       storage  app           469a492          07.02.2023 14:15
test         support       storage  app           07e22a6          17.02.2023 14:40
test         support       storage  app           daf975d          17.02.2023 13:40
test         support       storage  app           7e1a50b          15.02.2023 13:10
test         support       storage  app           8f27715          15.02.2023 9:35
Expected result:
clusterName  aksNamespace  acrName  acrImageName  acrImageVersion  `date`
dev          support       storage  app           f74581b          17.02.2023 14:35
dev          support       storage  app           c6040a0          17.02.2023 7:45
dev          abc           storage  qwer          93241f1          15.02.2023 12:45
dev          abc           storage  qwer          249b089          14.02.2023 13:15
prod         support       storage  app           469a492          07.02.2023 14:15
test         support       storage  app           07e22a6          17.02.2023 14:40
test         support       storage  app           daf975d          17.02.2023 13:40
MySQL version: 8.0.31
I'd be grateful for any advice or solutions.

On MySQL 8+, we can use ROW_NUMBER():
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY clusterName, acrImageName
ORDER BY date DESC) rn
FROM aks_images
WHERE acrName = 'storage'
)
SELECT
clusterName,
aksNamespace,
acrName,
acrImageName,
acrImageVersion,
date
FROM cte
WHERE rn <= 2
ORDER BY
clusterName,
acrImageName,
date DESC;
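To try the ROW_NUMBER() query without a MySQL server, here is a minimal sketch that runs the same statement against an in-memory SQLite database from Python. SQLite is only a stand-in for MySQL 8 here (it needs SQLite 3.25 or later for window functions), the helper name `latest_two_per_group` is hypothetical, and the dates are rewritten in ISO format as an assumption so that plain text ordering matches chronological order.

```python
import sqlite3

# A few rows from the question, with dates rewritten as ISO strings (assumption)
# so that ORDER BY on a TEXT column sorts chronologically.
SAMPLE_ROWS = [
    ("dev", "support", "storage", "app", "f74581b", "2023-02-17 14:35"),
    ("dev", "support", "storage", "app", "c6040a0", "2023-02-17 07:45"),
    ("dev", "support", "storage", "app", "4410f39", "2023-02-16 10:43"),
    ("prod", "support", "storage", "app", "469a492", "2023-02-07 14:15"),
]


def latest_two_per_group(rows):
    """Return at most the two newest rows per (clusterName, acrImageName)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE aks_images (clusterName TEXT, aksNamespace TEXT, "
        "acrName TEXT, acrImageName TEXT, acrImageVersion TEXT, date TEXT)"
    )
    conn.executemany("INSERT INTO aks_images VALUES (?, ?, ?, ?, ?, ?)", rows)
    query = """
        WITH cte AS (
            SELECT *, ROW_NUMBER() OVER (
                PARTITION BY clusterName, acrImageName
                ORDER BY date DESC
            ) AS rn
            FROM aks_images
            WHERE acrName = 'storage'
        )
        SELECT clusterName, acrImageName, acrImageVersion, date
        FROM cte
        WHERE rn <= 2
        ORDER BY clusterName, acrImageName, date DESC
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


for row in latest_two_per_group(SAMPLE_ROWS):
    print(row)
```

The third `dev`/`app` row (4410f39) gets rn = 3 and is filtered out, while `prod` keeps its single row.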


MySQL / BigQuery - Weighted Average & Group By

I am trying to calculate a weighted average of a dataset and return the maximum value for each month over a period of 12 months, along with its corresponding ticket description.
I'm aware that there are tons of questions out there addressing similar problems, but I have yet to find a solution that combines the syntaxes I believe are required.
Here's some sample table data:
Month_Begin_Date  Priority  ticket_about_tag  Phone_Time  Occurances
2019-02-01        Urgent    Power Bill        22.42       36
2019-02-01        Normal    Power Bill        3.41        89
2019-05-01        Normal    Wifi Issue        45.32       12
Here's my current query for determining the weighted average:
SELECT Month_Begin_Date,
SUM(phone_time * occurances) / SUM(occurances) AS Weighted_Average_Phone_Time
FROM `database`
GROUP BY month_begin_date
This returns the weighted average total for all ticket_about_tags, monthly.
But I still need this to display the maximum weighted average per month together with its ticket description, i.e. something that looks like this:
Month_Begin_Date  ticket_about_tag  Weighted_average_phone_time
2019-01-01        Power Bill        22.42
2019-02-01        Power Bill        3.41
2019-03-01        Wifi Issue        45.32
I've tried adding this as a subquery into another query in order to return the data I'm after, like so:
SELECT Month_Begin_date, Ticket_About_Tag, Phone_Average_Handle_Time
FROM database WHERE CONCAT(month_begin_date,phone_time) IN
(SELECT CONCAT (Month_Begin_Date,
(sum(phone_time * occurances))/sum(occurances)) AS Weighted_Average_Phone_Time
FROM database
GROUP BY month_begin_date
)
ORDER BY month_begin_date ASC
Thanks very much for any assistance
Not sure I got your question right, but using the following data:
Month_Begin_Date  Priority  Ticket_About_Tag  Phone_Time  Occurences
2019-02-01        Urgent    Power Bill        22.42       36
2019-02-01        Normal    Power Bill        3.41        89
2019-05-01        Normal    Wifi Issue        45.32       12
2019-02-01        Urgent    Wifi Issue        14.2        7
2019-02-01        Normal    Wifi Issue        30.7        5
Is this the query you're after?
SELECT
Month_Begin_Date, Ticket_About_Tag,
SUM(Phone_Time * Occurences) / SUM(Occurences) AS Weighted_Average_Phone_Time
FROM `database`
GROUP BY Month_Begin_Date, Ticket_About_Tag
ORDER BY Month_Begin_Date ASC, Ticket_About_Tag ASC;
That gives you a result like the one you posted:
Month_Begin_Date  Ticket_About_Tag  Weighted_Average_Phone_Time
2019-02-01        Power Bill        8.884880083084106
2019-02-01        Wifi Issue        21.075000206629436
2019-05-01        Wifi Issue        45.31999969482422
Response to your comment
To answer your comment, you could use:
SELECT
    a.Month_Begin_Date,
    a.Ticket_About_Tag,
    b.Max_Weighted_Average_Phone_Time
FROM (
    SELECT
        Month_Begin_Date,
        Ticket_About_Tag,
        SUM(Phone_Time * Occurences) / SUM(Occurences) AS Weighted_Average_Phone_Time
    FROM `database`
    GROUP BY Month_Begin_Date, Ticket_About_Tag
) a
LEFT JOIN (
    SELECT
        b1.Month_Begin_Date,
        MAX(b1.Weighted_Average_Phone_Time) AS Max_Weighted_Average_Phone_Time
    FROM (
        SELECT
            Month_Begin_Date,
            Ticket_About_Tag,
            SUM(Phone_Time * Occurences) / SUM(Occurences) AS Weighted_Average_Phone_Time
        FROM `database`
        GROUP BY Month_Begin_Date, Ticket_About_Tag
    ) b1
    GROUP BY b1.Month_Begin_Date
) b ON a.Month_Begin_Date = b.Month_Begin_Date
WHERE a.Weighted_Average_Phone_Time = b.Max_Weighted_Average_Phone_Time
That gives you the following output:
Month_Begin_Date  Ticket_About_Tag  Max_Weighted_Average_Phone_Time
2019-02-01        Wifi Issue        21.075000206629436
2019-05-01        Wifi Issue        45.31999969482422
There are other ways of doing this, but I think this is by far the easiest way to understand without using other SQL constructs. It reflects your need of going through the same data twice, first to aggregate by month and ticket tag, then to find the maximum of the aggregate data by month.
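As one of those other ways, here is a rough sketch that aggregates once in a CTE and then ranks the aggregates per month with ROW_NUMBER(). This is a swapped-in technique, not the join shown above. It runs on an in-memory SQLite database from Python as a stand-in for MySQL 8 or BigQuery; the table name `tickets` and the function name are assumptions made for the example.

```python
import sqlite3

# Sample rows from the answer above.
ROWS = [
    ("2019-02-01", "Urgent", "Power Bill", 22.42, 36),
    ("2019-02-01", "Normal", "Power Bill", 3.41, 89),
    ("2019-05-01", "Normal", "Wifi Issue", 45.32, 12),
    ("2019-02-01", "Urgent", "Wifi Issue", 14.2, 7),
    ("2019-02-01", "Normal", "Wifi Issue", 30.7, 5),
]


def max_weighted_average_per_month(rows):
    """Return (month, tag, weighted average) for the top tag of each month."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tickets (Month_Begin_Date TEXT, Priority TEXT, "
        "Ticket_About_Tag TEXT, Phone_Time REAL, Occurences INTEGER)"
    )
    conn.executemany("INSERT INTO tickets VALUES (?, ?, ?, ?, ?)", rows)
    query = """
        WITH per_tag AS (
            -- Step 1: weighted average per month and ticket tag
            SELECT Month_Begin_Date, Ticket_About_Tag,
                   SUM(Phone_Time * Occurences) / SUM(Occurences) AS wavg
            FROM tickets
            GROUP BY Month_Begin_Date, Ticket_About_Tag
        ),
        ranked AS (
            -- Step 2: rank the aggregates within each month, highest first
            SELECT *, ROW_NUMBER() OVER (
                PARTITION BY Month_Begin_Date ORDER BY wavg DESC
            ) AS rn
            FROM per_tag
        )
        SELECT Month_Begin_Date, Ticket_About_Tag, wavg
        FROM ranked
        WHERE rn = 1
        ORDER BY Month_Begin_Date
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


for month, tag, wavg in max_weighted_average_per_month(ROWS):
    print(month, tag, round(wavg, 3))
```

Unlike the join on the maximum, ROW_NUMBER() keeps exactly one row per month even when two tags tie for the highest average.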

how to retrieve latest data from mysql table out of duplicate records

I am trying to retrieve the latest data from my MySQL table for each record. Each record has duplicates with some of the data changed, and I need to retrieve the row with the latest timestamp. Can someone suggest the best solution in terms of performance? I have seen some solutions with inner joins and subqueries.
Sample data given below
Technology Students Amount Area Date
python 500 1000 Bangalore 2021-08-06 12:03:26
Ruby 100 1000 Bangalore 2021-08-06 05:18:50
Java 300 1000 Bangalore 2021-08-06 18:23:40
python 900 1000 Bangalore 2021-08-06 16:23:30
Java 100 1000 Bangalore 2021-08-06 12:23:50
Ruby 500 1000 Bangalore 2021-08-06 15:13:40
My output should contain the latest data for each technology:
Technology Students Amount Area Date
Java 300 1000 Bangalore 2021-08-06 18:23:40
python 900 1000 Bangalore 2021-08-06 16:23:30
Ruby 500 1000 Bangalore 2021-08-06 15:13:40
One way to do this (replace your_table with the real table name):
select your_table.*
from your_table
join
(
    select Technology, max(Date) as max_dt
    from your_table
    group by Technology
) t
on your_table.Technology = t.Technology and your_table.Date = t.max_dt
A window function avoids the self-join and is often the faster option; it can be wrapped in a CTE, although a subquery works just as well.
Unfortunately row_number() is only supported from MySQL 8.0 onwards, but it is included here for completeness and to show why you should upgrade!
with latest as (
select * , Row_Number() over(partition by technology order by date desc) rn
from t
)
select Technology, Students, Amount, Area, Date
from latest
where rn=1
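To compare the two approaches on the sample data, here is a small sketch that runs both against an in-memory SQLite database from Python. SQLite is only a stand-in for MySQL, and the table name `courses` and the helper `run` are assumptions made for the example.

```python
import sqlite3

# Sample rows from the question: Technology, Students, Amount, Area, Date.
ROWS = [
    ("python", 500, 1000, "Bangalore", "2021-08-06 12:03:26"),
    ("Ruby", 100, 1000, "Bangalore", "2021-08-06 05:18:50"),
    ("Java", 300, 1000, "Bangalore", "2021-08-06 18:23:40"),
    ("python", 900, 1000, "Bangalore", "2021-08-06 16:23:30"),
    ("Java", 100, 1000, "Bangalore", "2021-08-06 12:23:50"),
    ("Ruby", 500, 1000, "Bangalore", "2021-08-06 15:13:40"),
]

# Approach 1: join each row to the maximum Date of its Technology.
JOIN_QUERY = """
    SELECT c.Technology, c.Students, c.Date
    FROM courses AS c
    JOIN (
        SELECT Technology, MAX(Date) AS max_dt
        FROM courses
        GROUP BY Technology
    ) AS t ON c.Technology = t.Technology AND c.Date = t.max_dt
    ORDER BY c.Date DESC
"""

# Approach 2: number rows per Technology, newest first, and keep number 1.
WINDOW_QUERY = """
    WITH latest AS (
        SELECT *, ROW_NUMBER() OVER (
            PARTITION BY Technology ORDER BY Date DESC
        ) AS rn
        FROM courses
    )
    SELECT Technology, Students, Date
    FROM latest
    WHERE rn = 1
    ORDER BY Date DESC
"""


def run(query, rows=ROWS):
    """Load the rows into a fresh in-memory table and run one query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE courses (Technology TEXT, Students INTEGER, "
        "Amount INTEGER, Area TEXT, Date TEXT)"
    )
    conn.executemany("INSERT INTO courses VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(query).fetchall()
    conn.close()
    return result


print(run(JOIN_QUERY))
print(run(WINDOW_QUERY))
```

The two queries agree on this data. They differ when two rows of the same technology share the latest timestamp: the join returns both rows, while ROW_NUMBER() keeps only one of them.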

Select rows from MySQL in the last 24 hours and skip rows based on interval X?

I have created a trading bot that uses MySQL to import data and calculate technical indicators. I want to create a feature that lets me import data more frequently and control the interval at which I select the data.
Is there a query that will allow me to select data at a fixed interval in mysql?
SELECT * FROM PriceHistory
WHERE `RefrenceID`=1001
and `TimeStamp` > (SELECT max(`TimeStamp`) FROM PriceHistory) -
Interval 1440 Minute
Group by `TimeStamp`;
Using this query I am able to select price data for the last 24 hours. Is there a way to select data at intervals of 5 minutes, 10 minutes, 30 minutes, etc.?
DataSet Example
`TimeStamp` `RefrenceID`
1. 2018-12-14 23:00:05 1001
2. 2018-12-14 23:05:10 1001
3. 2018-12-14 23:11:16 1001
4. 2018-12-14 23:16:21 1001
5. 2018-12-14 23:21:25 1001
6. 2018-12-14 23:26:30 1001
7. 2018-12-14 23:32:41 1001
8. 2018-12-14 23:37:46 1001
9. 2018-12-14 23:42:51 1001
10. 2018-12-14 23:47:51 1001
11. 2018-12-14 23:52:56 1001
I have thought of two possible solutions, but unfortunately I have not yet figured out how to implement them:
1. Add an auto-increment id to my table and create a query that selects the row number: create a local variable #rownum and select all rows where #rownum = #rownum + (interval).
2. Select the first timestamp, create local variables #start_time, #offset and #count, then select min(TimeStamp) > #start_time + INTERVAL (#offset * #count) MINUTE.
The issues I am facing by using an auto-increment ID solution is that I am tracking the price of 220 items in the same table (so sequential ids will not work) and therefore there may need to be a new index row created at the start of the query. The other issue I am facing is that my code is synchronous and therefore due to other running processes every import of data is between 5min - 5min 30sec.
thanks for your help!
best regards,
slurp
Expected output:
1. 2018-12-14 23:00:05 1001
3. 2018-12-14 23:11:16 1001
5. 2018-12-14 23:21:25 1001
7. 2018-12-14 23:32:41 1001
9. 2018-12-14 23:42:51 1001
11. 2018-12-14 23:52:56 1001
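Using window functions (MySQL 8.0, MariaDB 10.2), we can integer-divide the Unix timestamp by 600 (DIV 600) to bucket rows into 10-minute (600-second) intervals, and then keep the first row in each bucket, ordered by id. Note that the query below uses a table `timedata` with columns `id` and `entrytime` rather than the question's `PriceHistory` and `TimeStamp`.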
SELECT id, entrytime, RefrenceID
FROM (
SELECT
id, entrytime, RefrenceID,
ROW_NUMBER() OVER (PARTITION BY RefrenceID,UNIX_TIMESTAMP(entrytime) DIV 600 ORDER BY id) AS `rank`
FROM timedata
ORDER BY id
) AS tmp
WHERE tmp.`rank` = 1
ORDER BY id, entrytime;
Ref: dbfiddle
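The bucketing idea can be checked against the question's dataset with a short sketch on an in-memory SQLite database from Python. SQLite is a stand-in for MySQL here, so `UNIX_TIMESTAMP(entrytime) DIV 600` is replaced by SQLite's `strftime('%s', ...)` plus integer division; the function name `first_row_per_bucket` is hypothetical.

```python
import sqlite3

# The eleven timestamps from the question, in id order (ids 1..11).
TIMESTAMPS = [
    "2018-12-14 23:00:05", "2018-12-14 23:05:10", "2018-12-14 23:11:16",
    "2018-12-14 23:16:21", "2018-12-14 23:21:25", "2018-12-14 23:26:30",
    "2018-12-14 23:32:41", "2018-12-14 23:37:46", "2018-12-14 23:42:51",
    "2018-12-14 23:47:51", "2018-12-14 23:52:56",
]


def first_row_per_bucket(timestamps, bucket_seconds=600):
    """Keep the first row (lowest id) of every bucket_seconds-wide interval."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE timedata (id INTEGER PRIMARY KEY, entrytime TEXT, "
        "RefrenceID INTEGER)"
    )
    conn.executemany(
        "INSERT INTO timedata (id, entrytime, RefrenceID) VALUES (?, ?, 1001)",
        [(i, ts) for i, ts in enumerate(timestamps, start=1)],
    )
    query = """
        SELECT id, entrytime
        FROM (
            SELECT id, entrytime,
                   ROW_NUMBER() OVER (
                       -- integer division assigns each row to a time bucket
                       PARTITION BY RefrenceID,
                                    CAST(strftime('%s', entrytime) AS INTEGER) / ?
                       ORDER BY id
                   ) AS rn
            FROM timedata
        ) AS tmp
        WHERE rn = 1
        ORDER BY id
    """
    rows = conn.execute(query, (bucket_seconds,)).fetchall()
    conn.close()
    return rows


for row in first_row_per_bucket(TIMESTAMPS):
    print(row)
```

With 600-second buckets this keeps ids 1, 3, 5, 7, 9 and 11, matching the expected output in the question. The buckets are aligned to the clock (23:00, 23:10, ...) rather than to the first row, so an import that drifts across a boundary can leave a bucket empty or shift which row is picked.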
SELECT *
FROM PriceHistory
WHERE
`RefrenceID`=1001
AND `TimeStamp` > (SELECT max(`TimeStamp`) FROM PriceHistory) - Interval 1440 Minute
AND MINUTE(`TimeStamp`) % 5 = 0
GROUP BY `TimeStamp`;

Getting the latest entry for a distinct column value [duplicate]

I have 2 tables I need to reference: software_releases and platforms. The software_releases table contains a platform ID, a software ID and a version number.
I want to get the latest version number for each of the distinct platforms.
The query that I currently have is the following:
SELECT ReleaseID,
DateCreated,
SoftwareID,
platforms.PlatformID,
PlatformName,
VersionNumber
FROM software_releases, platforms
WHERE software_releases.PlatformID = platforms.PlatformID
AND software_releases.SoftwareID='3'
ORDER BY VersionNumber DESC
Which returns:
5 27/05/2017 22:37 3 7 Windows 3.0.0.0
9 27/05/2017 22:56 3 7 Windows 2.6.0.0
7 27/05/2017 22:46 3 5 Android 2.5.1.1
1 27/05/2017 23:21 3 5 Android 2.5.0.0
The column order is as follows:
ReleaseID
Date Released
Software ID
Platform ID
Platform Name
Version Number
What I am looking for is getting the latest release for each platform returned for the specified software ID, therefore, I am only wanting to return the following:
5 27/05/2017 22:37 3 7 Windows 3.0.0.0
7 27/05/2017 22:46 3 5 Android 2.5.1.1
I got this answer from CBrowe's comment; I had no idea that what I was after is called a group-wise maximum query.
I changed my query to:
SELECT * FROM (SELECT DateCreated, SoftwareID, software_releases.PlatformID,
VersionNumber, PlatformName FROM software_releases, platforms WHERE
SoftwareID='3' AND
platforms.PlatformID = software_releases.PlatformID ORDER BY VersionNumber DESC)
AS s GROUP BY PlatformID
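One caveat with the query above: it depends on MySQL's non-standard GROUP BY handling, which row each group returns is not guaranteed, and it is rejected when ONLY_FULL_GROUP_BY is enabled (the default since MySQL 5.7.5). A more portable variant joins each release to the maximum version of its platform. The sketch below runs that variant on an in-memory SQLite database from Python as a stand-in for MySQL. The sample rows come from the output shown in the question, the function name is hypothetical, and it assumes that comparing VersionNumber as text orders versions correctly, which holds for this data but not in general (for example '10.0' sorts before '9.0').

```python
import sqlite3

# (ReleaseID, DateCreated, SoftwareID, PlatformID, VersionNumber)
RELEASES = [
    (5, "27/05/2017 22:37", 3, 7, "3.0.0.0"),
    (9, "27/05/2017 22:56", 3, 7, "2.6.0.0"),
    (7, "27/05/2017 22:46", 3, 5, "2.5.1.1"),
    (1, "27/05/2017 23:21", 3, 5, "2.5.0.0"),
]
PLATFORMS = [(7, "Windows"), (5, "Android")]


def latest_release_per_platform(software_id):
    """Return the highest-version release of each platform for one software."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE software_releases (ReleaseID INTEGER, DateCreated TEXT, "
        "SoftwareID INTEGER, PlatformID INTEGER, VersionNumber TEXT)"
    )
    conn.execute("CREATE TABLE platforms (PlatformID INTEGER, PlatformName TEXT)")
    conn.executemany("INSERT INTO software_releases VALUES (?, ?, ?, ?, ?)", RELEASES)
    conn.executemany("INSERT INTO platforms VALUES (?, ?)", PLATFORMS)
    query = """
        SELECT r.ReleaseID, p.PlatformName, r.VersionNumber
        FROM software_releases AS r
        JOIN platforms AS p ON p.PlatformID = r.PlatformID
        JOIN (
            -- highest version per platform for the requested software
            SELECT PlatformID, MAX(VersionNumber) AS MaxVersion
            FROM software_releases
            WHERE SoftwareID = ?
            GROUP BY PlatformID
        ) AS m ON m.PlatformID = r.PlatformID
              AND m.MaxVersion = r.VersionNumber
        WHERE r.SoftwareID = ?
        ORDER BY r.VersionNumber DESC
    """
    result = conn.execute(query, (software_id, software_id)).fetchall()
    conn.close()
    return result


print(latest_release_per_platform(3))
```

This returns releases 5 (Windows 3.0.0.0) and 7 (Android 2.5.1.1), the two rows the question asks for.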

Is there any way to get the last inserted value for each day?

id date calls
5 2015-02-17 01:06:01 1
6 2015-02-17 11:07:01 2
7 2015-02-17 23:06:01 3
8 2015-02-18 03:07:01 1
9 2015-02-18 09:06:01 2
10 2015-02-18 17:07:01 3
11 2015-02-18 22:06:01 4
12 2015-02-19 01:07:01 1
13 2015-02-19 08:06:01 2
14 2015-02-19 18:07:01 3
15 2015-02-19 23:06:01 4
My table is structured like this, and I need to calculate the sum of the last call count of each day. In this table, the last call on Feb 17 was at 23:06:01 with a call count of 3, and the last call on Feb 18 was at 22:06:01 with a call count of 4. Can I get the sum of these last call counts across all days?
You can use a subquery to determine which rows to sum (the ones matching the last call for each date). Using MySQL it would be:
select sum(calls) sum_last_calls
from your_table
where `date` in (
select max(date) max_date
from your_table
group by date(`date`)
)
This query will return 11 as the sum (from 3+4+4).
The date() function used in the subquery is specific to your database and might need to be changed according to your database's syntax. The point is that it should return the date without the time; it could be date::date (PostgreSQL) or cast(date as date) (MSSQL and others).
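As a quick check of the figure 11, here is a minimal sketch that loads the table from the question into an in-memory SQLite database from Python and runs the same subquery. SQLite is a stand-in for MySQL here (its date() function also strips the time part), and the function name `sum_last_calls` is hypothetical.

```python
import sqlite3

# (id, date, calls) rows from the question.
CALLS = [
    (5, "2015-02-17 01:06:01", 1), (6, "2015-02-17 11:07:01", 2),
    (7, "2015-02-17 23:06:01", 3), (8, "2015-02-18 03:07:01", 1),
    (9, "2015-02-18 09:06:01", 2), (10, "2015-02-18 17:07:01", 3),
    (11, "2015-02-18 22:06:01", 4), (12, "2015-02-19 01:07:01", 1),
    (13, "2015-02-19 08:06:01", 2), (14, "2015-02-19 18:07:01", 3),
    (15, "2015-02-19 23:06:01", 4),
]


def sum_last_calls(rows):
    """Sum the calls value of the last row recorded on each day."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE your_table (id INTEGER, `date` TEXT, calls INTEGER)")
    conn.executemany("INSERT INTO your_table VALUES (?, ?, ?)", rows)
    query = """
        SELECT SUM(calls) AS sum_last_calls
        FROM your_table
        WHERE `date` IN (
            -- latest timestamp of each calendar day
            SELECT MAX(`date`)
            FROM your_table
            GROUP BY date(`date`)
        )
    """
    (total,) = conn.execute(query).fetchone()
    conn.close()
    return total


print(sum_last_calls(CALLS))
```

The last rows of the three days carry 3, 4 and 4 calls, so the function returns 11.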
Sample SQL Fiddle for MySQL and PostgreSQL
PostgreSQL version:
select sum(calls) as calls
from (
select max(calls) as calls
from t
where date::date between '2015-02-17' and '2015-02-19'
group by date::date
) s