MySQL select last unique occurrence - mysql

Let's say you have the following table (the column of interest here is binid):
id datetime agid binid status
1 2013-02-01 11:03:49 0 25 1
2 2013-02-01 11:03:53 0 25 1
3 2013-02-01 11:04:21 0 26 1
4 2013-02-01 11:04:23 0 26 0
5 2013-03-01 11:04:26 0 25 0
6 2013-03-01 11:04:30 0 36 0
7 2013-03-01 11:04:34 0 36 1
8 2013-03-01 11:04:35 0 36 1
9 2013-03-01 11:04:36 0 36 1
10 2013-03-01 11:04:39 0 36 0
11 2013-03-01 11:04:41 0 36 1
13 2013-03-01 11:04:50 0 25 1
14 2013-03-01 11:04:53 0 26 1
15 2013-03-01 11:15:25 0 25 1
16 2013-03-01 11:15:30 0 25 0
17 2013-03-01 11:15:39 0 23 1
18 2013-03-01 11:15:43 0 26 1
How can I extract the last occurrence of each binid that occurs in a certain timeframe?
This is what I am using so far:
SELECT * FROM ( reports ORDER BY datetime ASC )
WHERE datetime >= TIMESTAMP('2013-03-01')
GROUP BY binid
but it returns the first occurrences instead. How can I return the last occurrence of each unique binid?

You should use a subquery to get the result:
select r1.*
from reports r1
inner join
(
select max(datetime) MaxDate, binid
from reports
WHERE datetime >= TIMESTAMP('2013-03-01')
group by binid
) r2
on r1.binid = r2.binid
and r1.datetime = r2.maxdate
WHERE r1.datetime >= TIMESTAMP('2013-03-01')
The problem is that when you are using a GROUP BY on a single column then MySQL can return an unexpected value for the other columns not in the GROUP BY. (see MySQL Extensions to GROUP BY).
From the MySQL Docs:
MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. ... You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause. Sorting of the result set occurs after values have been chosen, and ORDER BY does not affect which values the server chooses.
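The join against a per-group MAX(datetime) described above can be tried out end to end. The following is a minimal sketch, not the answerer's exact setup: it uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example runs without a server) and loads a subset of the rows from the question.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; the join is standard SQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reports "
    "(id INTEGER, datetime TEXT, agid INTEGER, binid INTEGER, status INTEGER)"
)
# A subset of the rows from the question.
conn.executemany(
    "INSERT INTO reports VALUES (?, ?, ?, ?, ?)",
    [
        (4, "2013-02-01 11:04:23", 0, 26, 0),
        (5, "2013-03-01 11:04:26", 0, 25, 0),
        (11, "2013-03-01 11:04:41", 0, 36, 1),
        (14, "2013-03-01 11:04:53", 0, 26, 1),
        (16, "2013-03-01 11:15:30", 0, 25, 0),
        (17, "2013-03-01 11:15:39", 0, 23, 1),
        (18, "2013-03-01 11:15:43", 0, 26, 1),
    ],
)

# The inner query finds the latest datetime per binid inside the timeframe;
# the join then pulls the complete row that carries that datetime.
latest = conn.execute(
    """
    SELECT r1.id, r1.binid, r1.datetime, r1.status
    FROM reports r1
    INNER JOIN (
        SELECT binid, MAX(datetime) AS maxdate
        FROM reports
        WHERE datetime >= ?
        GROUP BY binid
    ) r2
      ON r1.binid = r2.binid
     AND r1.datetime = r2.maxdate
    ORDER BY r1.binid
    """,
    ("2013-03-01",),
).fetchall()

for row in latest:
    print(row)
```

Each binid appears once, with the id and status taken from its most recent row in the timeframe. If two rows of the same binid share the same maximum datetime, both are returned.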

SELECT binid,
SUBSTRING_INDEX(GROUP_CONCAT(agid ORDER BY datetime DESC), ',', 1) AS agid,
SUBSTRING_INDEX(GROUP_CONCAT(status ORDER BY datetime DESC), ',', 1) AS status
FROM reports
WHERE datetime >= TIMESTAMP('2013-03-01')
GROUP BY binid;

SELECT * FROM ( reports ORDER BY datetime DESC)
WHERE datetime >= TIMESTAMP('2013-03-01')
GROUP BY binid LIMIT 1

Change ORDER BY ASC to ORDER BY DESC. Should do the trick.

Related

Query runs too slow and even stops because of exceeded time with 17000 rows

I have table 1:
historial_id timestamp address value insertion_time
1 2022-01-29 1 84 2022-01-31
2 2022-01-29 2 40 2022-01-31
3 2022-01-30 1 84 2022-01-31
4 2022-01-30 2 41 2022-01-31
5 2022-01-30 2 41 2022-01-31
(sometimes it has repeated rows)
...
I need a Query to get:
timestamp value(address 1) value(address 2)
2022-01-29 84 40
2022-01-30 84 41
......
I tried with:
SELECT timestamp, ( SELECT value
FROM historical
WHERE register_type=11
AND address=2
AND timestamp=t1.timestamp
GROUP BY value
) AS CORRIENTE_mA,
( SELECT value
FROM historical
WHERE register_type=11
AND address=1
AND timestamp=t1.timestamp
GROUP BY value ) AS Q_M3pH
FROM historical AS t1
GROUP BY timestamp;
But it's too slow; it even stops because the time limit is exceeded.
I also tried DISTINCT instead of GROUP BY.
I think you need a dynamic pivot.
Please try to avoid MySQL keywords like timestamp as column names.
The query below returns only the max value for addresses 1 and 2, grouped by timestamp.
This is a simplified version of your query:
select
`timestamp`
, max(case when address=1 then value end) as value_address1
, max(case when address=2 then value end) as value_address2
from historical
group by `timestamp`;
Result:
timestamp value_address1 value_address2
2022-01-29 84 40
2022-01-30 84 41
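The conditional-aggregation pivot above can be checked against the sample rows from the question. This is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for MySQL; the insertion_time column is left out because the query does not use it.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE historical '
    '(historial_id INTEGER, "timestamp" TEXT, address INTEGER, value INTEGER)'
)
# Sample rows from the question, including the repeated row (ids 4 and 5).
conn.executemany(
    "INSERT INTO historical VALUES (?, ?, ?, ?)",
    [
        (1, "2022-01-29", 1, 84),
        (2, "2022-01-29", 2, 40),
        (3, "2022-01-30", 1, 84),
        (4, "2022-01-30", 2, 41),
        (5, "2022-01-30", 2, 41),
    ],
)

# One pass over the table: each CASE yields the value only for its address,
# and MAX collapses the group (and any duplicates) into a single cell.
pivot = conn.execute(
    """
    SELECT "timestamp",
           MAX(CASE WHEN address = 1 THEN value END) AS value_address1,
           MAX(CASE WHEN address = 2 THEN value END) AS value_address2
    FROM historical
    GROUP BY "timestamp"
    ORDER BY "timestamp"
    """
).fetchall()

for row in pivot:
    print(row)
```

Because the table is scanned once instead of running two correlated subqueries per output row, this form usually scales much better than the original attempt.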

MySQL SELECT record immediately prior to range and select range

I have a MySQL table of states for three things, a,b and c
id a b c time
--------------------------
1 0 1 1 78
2 1 1 0 89
3 1 0 0 105
4 0 0 0 107
5 1 0 1 122
6 0 0 1 134
7 0 1 0 167
8 1 1 1 168
9 0 1 0 177
10 0 0 0 180
As an example, the bounds of time are chosen by the user as time>100 AND time<170
But I need to know the value of ‘a’ immediately prior to the 1st returned record. (where id=2)
I’m trying to find the most efficient way of creating this query, without resorting to 2 separate queries.
SELECT a, time FROM states WHERE time<100 order by time DESC limit 1
AND
SELECT a, time FROM states WHERE time>100 AND time<170 ORDER BY time ASC
To return a result set of ...
a time
1 89
1 105
0 107
1 122
0 134
0 167
1 168
Any advice would be gratefully received, thanks!
One method uses LEAD():
SELECT a, time
FROM (SELECT s.*, LEAD(time) OVER (ORDER BY time) as next_time
FROM states s
) s
WHERE next_time > 100 AND time < 170;
You can also use:
select s.*
from states s
where s.time >= (select s2.time from states s2 where s2.time <= 100 order by s2.time desc limit 1) and
s.time < 170;
This, alas, doesn't work when the subquery returns no values. That can be fixed, but it complicates the query.
However, your solution is actually fine (with union all):
(SELECT a, time
FROM states
WHERE time <= 100
ORDER BY time DESC
LIMIT 1
) UNION ALL
(SELECT a, time
FROM states
WHERE time > 100 AND time < 170
)
ORDER BY time ASC;
From a performance perspective, this should be okay if you have an index on time. This also readily handles the problem when there are no values 100 or less.
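The UNION ALL approach, including the case where no row exists at or below the lower bound, can be sketched as follows. Assumptions: Python's built-in sqlite3 module stands in for MySQL, and because SQLite does not accept the parenthesized form with ORDER BY and LIMIT inside a UNION member, the first branch is wrapped in a subquery instead. The helper name rows_with_predecessor is hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE states (id INTEGER, a INTEGER, b INTEGER, c INTEGER, time INTEGER)"
)
# Rows from the question.
conn.executemany(
    "INSERT INTO states VALUES (?, ?, ?, ?, ?)",
    [
        (1, 0, 1, 1, 78), (2, 1, 1, 0, 89), (3, 1, 0, 0, 105),
        (4, 0, 0, 0, 107), (5, 1, 0, 1, 122), (6, 0, 0, 1, 134),
        (7, 0, 1, 0, 167), (8, 1, 1, 1, 168), (9, 0, 1, 0, 177),
        (10, 0, 0, 0, 180),
    ],
)

# First branch: the single latest row at or before the lower bound.
# Second branch: every row strictly inside the bounds.
QUERY = """
SELECT a, time FROM (
    SELECT a, time FROM states WHERE time <= :lo ORDER BY time DESC LIMIT 1
)
UNION ALL
SELECT a, time FROM states WHERE time > :lo AND time < :hi
ORDER BY time ASC
"""


def rows_with_predecessor(lo, hi):
    """Hypothetical helper: rows in (lo, hi) plus the row just before lo."""
    return conn.execute(QUERY, {"lo": lo, "hi": hi}).fetchall()


with_prior = rows_with_predecessor(100, 170)  # a prior row (time 89) exists
no_prior = rows_with_predecessor(50, 100)     # nothing at or below 50

print(with_prior)
print(no_prior)
```

When the first branch finds nothing, it simply contributes no row and the range rows are still returned.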

MySQL - Average ignoring Null and based on weekday

I'm trying to do some analysis on the following data:
WeekDay Date Count
5 06/09/2018 20
6 07/09/2018 Null
7 08/09/2018 19
1 09/09/2018 16
2 10/09/2018 17
3 11/09/2018 24
4 12/09/2018 25
5 13/09/2018 24
6 14/09/2018 23
7 15/09/2018 23
1 16/09/2018 9
2 17/09/2018 23
3 18/09/2018 33
4 19/09/2018 22
5 20/09/2018 31
6 21/09/2018 17
7 22/09/2018 10
1 23/09/2018 12
2 24/09/2018 26
3 25/09/2018 29
4 26/09/2018 27
5 27/09/2018 24
6 28/09/2018 29
7 29/09/2018 27
1 30/09/2018 19
2 01/10/2018 26
3 02/10/2018 39
4 03/10/2018 32
5 04/10/2018 37
6 05/10/2018 Null
7 06/10/2018 26
1 07/10/2018 11
2 08/10/2018 32
3 09/10/2018 41
4 10/10/2018 37
5 11/10/2018 25
6 12/10/2018 20
The problem I want to solve is this: for each day, I want the average of the last 3 occurrences of the same weekday before that day. When one of those values is NULL, I want to ignore it and average only the remaining numbers, not count the NULL as 0. Here is an example:
The dates in this table are in day/month/year format :)
Ex: For 12/10/2018, I need the average of the days 05/10/2018, 28/09/2018, and 21/09/2018. These are the last 3 days with the same weekday (6) as 12/10/2018.
Their values are Null, 29, and 17. The result must be 23, because the NULL is ignored, and not 15.333.
How can I do this?
The count() function ignores nulls (i.e. it does NOT increment when it encounters null), so I suggest you simply count the values, which may contain the nulls you wish to ignore.
dow datecol value
6 21/09/2018 17
6 28/09/2018 29
6 05/10/2018 Null
e.g. sum(value) above = 46, and the count(value) = 2 so the average is 23.0 (and avg(value) will also return 23.0 as it also ignores nulls)
select
weekday
, `date`
, `count`
, (select (sum(`count`) * 1.0) / (count(`count`) * 1.0)
from atable as t2
where t2.weekday = t1.weekday
and t2.`date` < t1.`date`
order by t2.`date` DESC
limit 3
) as average
from atable as t1
You could just use avg(count) in the query above, and get the same result.
ps. I do hope you do NOT use count as a column name! I also would suggest you do NOT use date as a column name either. i.e. Avoid using SQL terms as names.
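As an alternative to the correlated subquery, the "last 3 same weekdays before this day" average can be written with a window frame. This is a sketch under several assumptions, not the answer's own method: it needs window-function support (MySQL 8.0 or later; here SQLite 3.25 or later through Python's sqlite3 module), the table and column names daily, day and cnt are hypothetical replacements for the reserved-looking names, and the dates are stored in ISO format so that they sort correctly.

```python
import sqlite3

# Assumption: in-memory SQLite (3.25+) stands in for MySQL 8.0+.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily (weekday INTEGER, day TEXT, cnt INTEGER)")
conn.executemany(
    "INSERT INTO daily VALUES (?, ?, ?)",
    [
        # All weekday-6 rows from the question, dates rewritten as ISO.
        (6, "2018-09-07", None),
        (6, "2018-09-14", 23),
        (6, "2018-09-21", 17),
        (6, "2018-09-28", 29),
        (6, "2018-10-05", None),
        (6, "2018-10-12", 20),
        # Two weekday-5 rows, to show that partitions do not mix.
        (5, "2018-10-04", 37),
        (5, "2018-10-11", 25),
    ],
)

# The frame covers the three rows before the current one within the same
# weekday; AVG skips NULLs inside that frame on its own.
rows = conn.execute(
    """
    SELECT weekday, day, cnt,
           AVG(cnt) OVER (
               PARTITION BY weekday
               ORDER BY day
               ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING
           ) AS avg_prev3
    FROM daily
    ORDER BY day
    """
).fetchall()

avg_by_day = {day: avg for _, day, _, avg in rows}
print(avg_by_day["2018-10-12"])
```

For 2018-10-12 the frame holds 17, 29 and NULL, so the average is taken over two values. Days with no earlier non-NULL value in the frame get NULL as their average.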
SELECT WeekDay, AVG(Count)
FROM myTable
WHERE Count IS NOT NULL
GROUP BY WeekDay
Use IFNULL(Count, 0) in your SELECT (note that this counts each NULL as 0 in the average):
SELECT WeekDay, AVG(IFNULL(Count,0))
FROM myTable
GROUP BY WeekDay
First off, you need to get the number of instances of that weekday in the data since you just need the last 3 same week days
create table table2
as
select
row_number() over(partition by weekday order by date desc) as rn
,weekday
,date
,count
from table
From here, you can get what you want. Given your explanation, you don't need to filter out the NULL values for count. The avg() aggregation simply ignores them.
select
weekday
,avg(count)
from table2
where rn in (1,2,3)
group by weekday

How can I ignore duplicate values in another column when using GROUP?

I have the following query:
SELECT
DATE(`timeStamp`),COUNT(*)
FROM
`wf`.`sh`
WHERE
(DATE(`timeStamp`) >= curdate()- INTERVAL 31 DAY)
GROUP BY
DATE(`timeStamp`)
HAVING
COUNT(DATE(`timeStamp`)) > 0
ORDER BY
DATE(`timeStamp`) ASC;
The purpose of this query is to retrieve the amount of users online in my system per day, in the space of a month.
Example dataset:
uID timeStamp
1 2016-11-28 00:27:01
1 2016-11-28 01:10:15
1234 2016-11-28 02:50:00
2 2016-11-28 06:11:09
47 2016-11-28 08:32:48
1246 2016-11-28 09:51:47
In its current format, this query returns the count of rows with duplicate dates, for example:
timeStamp COUNT(*)
2017-01-29 256
2017-01-30 224
2017-01-31 240
2017-02-01 95
2017-02-02 136
I have another field uID; I need to modify my query so that GROUP also ignores rows with a duplicate uID field for each day. I tried creating another GROUP BY but was given an error that 'incorrect GROUP BY clause' (or something of that nature).
Can this be done via pure MySQL?
You can use a subselect
SELECT
visitDate,COUNT(*)
FROM
(SELECT DISTINCT DATE(`timeStamp`) as visitDate, uID FROM `wf`.`sh`) alias_t
WHERE
(visitDate >= curdate()- INTERVAL 31 DAY)
GROUP BY
visitDate
HAVING
COUNT(visitDate) > 0
ORDER BY
visitDate ASC;
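A related way to count each user once per day is COUNT(DISTINCT uID), which avoids the derived table. The sketch below is an alternative to the subselect, not a restatement of it. Assumptions: Python's built-in sqlite3 module stands in for MySQL, the 31-day filter is left out so the result does not depend on the current date, and the row for 2016-11-29 is a hypothetical addition to show a second day.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sh (uID INTEGER, timeStamp TEXT)")
conn.executemany(
    "INSERT INTO sh VALUES (?, ?)",
    [
        # Rows from the question; uID 1 appears twice on the same day.
        (1, "2016-11-28 00:27:01"),
        (1, "2016-11-28 01:10:15"),
        (1234, "2016-11-28 02:50:00"),
        (2, "2016-11-28 06:11:09"),
        (47, "2016-11-28 08:32:48"),
        (1246, "2016-11-28 09:51:47"),
        # Hypothetical extra row, not in the question.
        (1, "2016-11-29 10:00:00"),
    ],
)

# COUNT(DISTINCT uID) counts every user at most once within each day group.
users_per_day = conn.execute(
    """
    SELECT DATE(timeStamp) AS visitDate, COUNT(DISTINCT uID) AS users
    FROM sh
    GROUP BY DATE(timeStamp)
    ORDER BY visitDate
    """
).fetchall()

for row in users_per_day:
    print(row)
```

The six rows on 2016-11-28 collapse to five users because uID 1 is counted once.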

Have to get the corresponding time stamp when i get max of a column from a table

I need to extract the required fields from a table along with relevant time stamp
SELECT * FROM Glm_Test.LicenseUsage where FeatureId='2';
Output :
VendorId,FeatureId,Total_Lic_Installed,Total_Lic_Used,Reserved,CurrentTime
1 2 106 19 67 2015-12-15 15:00:05
1 2 106 19 67 2015-12-15 15:02:02
1 2 106 19 69 2015-12-15 15:04:02
1 2 106 19 67 2015-12-15 15:06:01
1 2 106 20 67 2015-12-15 15:08:02
select VendorId,FeatureId,Total_Lic_Installed,Max(Total_Lic_Used),Reserved,CurrentTime from Glm_Test.LicenseUsage where FeatureId= '2' group by VendorId,FeatureId;
output:
1 2 106 20 69 2015-12-15 15:00:05
Of the 2 queries above, the 1st lists all entries from the table.
I want the 2nd to return the timestamp for the MAX value of column Total_Lic_Used, but it returns only the timestamp of the first entry.
Help is much appreciated.
Selecting columns that are neither inside an aggregate function (count/max/min/sum...) nor in the GROUP BY clause gives unexpected results.
Other RDBMSs won't allow these statements (they give errors like):
SQL Server ==> the select list because it is not contained in either an aggregate function or the GROUP BY clause
Oracle ==> not a GROUP BY expression
You can do this by a sub query and join
select
a.VendorId,
a.FeatureId,
a.Total_Lic_Installed,
b.max_Total_Lic_Used,
a.Reserved,
a.CurrentTime
from Glm_Test.LicenseUsage a
join (
select
VendorId,
FeatureId,
Max(Total_Lic_Used) max_Total_Lic_Used
from Glm_Test.LicenseUsage
where FeatureId = '2'
group by VendorId, FeatureId
) b
on a.VendorId = b.VendorId and
a.FeatureId = b.FeatureId and
a.Total_Lic_Used = b.max_Total_Lic_Used
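To see that the subquery-and-join form returns the timestamp belonging to the maximum, the query can be run on the five rows shown in the question. This is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for MySQL; the schema prefix Glm_Test is dropped and FeatureId is stored as an integer for simplicity.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE LicenseUsage (VendorId INTEGER, FeatureId INTEGER, "
    "Total_Lic_Installed INTEGER, Total_Lic_Used INTEGER, "
    "Reserved INTEGER, CurrentTime TEXT)"
)
# Rows from the question.
conn.executemany(
    "INSERT INTO LicenseUsage VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 2, 106, 19, 67, "2015-12-15 15:00:05"),
        (1, 2, 106, 19, 67, "2015-12-15 15:02:02"),
        (1, 2, 106, 19, 69, "2015-12-15 15:04:02"),
        (1, 2, 106, 19, 67, "2015-12-15 15:06:01"),
        (1, 2, 106, 20, 67, "2015-12-15 15:08:02"),
    ],
)

# b holds the maximum per (VendorId, FeatureId); joining back on that value
# selects the full row, so Reserved and CurrentTime come from the same row.
peak = conn.execute(
    """
    SELECT a.VendorId, a.FeatureId, a.Total_Lic_Installed,
           b.max_Total_Lic_Used, a.Reserved, a.CurrentTime
    FROM LicenseUsage a
    JOIN (
        SELECT VendorId, FeatureId, MAX(Total_Lic_Used) AS max_Total_Lic_Used
        FROM LicenseUsage
        WHERE FeatureId = ?
        GROUP BY VendorId, FeatureId
    ) b
      ON a.VendorId = b.VendorId
     AND a.FeatureId = b.FeatureId
     AND a.Total_Lic_Used = b.max_Total_Lic_Used
    """,
    (2,),
).fetchall()

print(peak)
```

If several rows share the maximum Total_Lic_Used, the join returns all of them, one per matching timestamp.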
You can also try this:
select
`VendorId`,
`FeatureId`,
`Total_Lic_Installed`,
`Total_Lic_Used`,
`Reserved`,
`CurrentTime`
from Glm_Test.LicenseUsage
order by Total_Lic_Used desc
limit 1