Shortest way to GROUP BY similar values and retain latest rows? - mysql

Sometimes I want to get just one row for each similar value. I usually do something like this:
SELECT * GROUP BY Text ORDER BY Date DESC
My problem with using GROUP BY to select similar rows is that I don't get the values from the latest row in each group (I'm not quite sure what criteria decide which row stays). I want to retain only the newest row in each group.
I know how to do it with a self join but when statements are already very long it seems a bit complicated. Is there any shorter method? Maybe using DISTINCT instead of GROUP BY?

Assuming you have a table with multiple columns, two of which are GroupID and Date, and you want to select the latest record for each GroupID, you need a subquery that gets the latest Date for each GroupID. For example:
SELECT a.* -- this just selects all records from original table
FROM tableName a
INNER JOIN
(
-- this subquery gets the latest DATE entry for each GROUPID
SELECT GroupID, MAX(DATE) maxDate
FROM tableName
GROUP BY GroupID
) b ON a.GroupID = b.GroupID AND
a.Date = b.maxDate
if this answer is not clear, please do ask :D
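To see the join from the answer above in action, here is a minimal sketch. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL; the sample rows and the id and Note columns are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the join itself is plain SQL.
conn = sqlite3.connect(":memory:")
# Hypothetical schema: only GroupID and Date come from the answer above.
conn.execute("CREATE TABLE tableName (id INTEGER, GroupID INTEGER, Date TEXT, Note TEXT)")
conn.executemany(
    "INSERT INTO tableName VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2013-01-01", "old"),
        (2, 1, "2013-02-01", "new"),
        (3, 2, "2013-01-15", "only"),
    ],
)

# The subquery finds the latest Date per GroupID; the join pulls the full rows.
latest_rows = conn.execute(
    """
    SELECT a.id, a.GroupID, a.Date, a.Note
    FROM tableName a
    INNER JOIN (
        SELECT GroupID, MAX(Date) AS maxDate
        FROM tableName
        GROUP BY GroupID
    ) b ON a.GroupID = b.GroupID AND a.Date = b.maxDate
    ORDER BY a.GroupID
    """
).fetchall()
print(latest_rows)
```

With this sample data, each GroupID comes back once, carrying the values of its newest row.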

Did you try to use the max function:
SELECT A,B,max(Date) GROUP BY Text

Related

MySQL 5.7 How to do GROUP BY with sorting?

Similar to this issue: MySQL 5.7 group by latest record
I'm not sure how to do this properly in 5.7. Also with possibility of 2nd sort column. Working query in 5.6 that I'm trying to replicate in 5.7:
SELECT id FROM test
GROUP BY category
ORDER BY sort1 DESC, sort2 DESC
id is not always the highest, so MAX(id) does not work.
Looking into the link above, the solution for single sort should be:
SELECT t1.*
FROM test t1
INNER JOIN (
SELECT category, max(sort) AS sort FROM test GROUP BY category
) t2 ON t2.category = t1.category AND t2.sort = t1.sort
But how would it work with two sort columns?
You are using GROUP BY the wrong way.
Think of group by as a way to separate rows into different groups. Each group holds the rows that share the same value in the group by column.
Once you have those groups, selecting plain table columns (as in: select *) is like picking an arbitrary row from each group. This is neither helpful nor useful.
Usually once we group records (or rows), we need to find meta information about those records. For example: get us the count of records in that group (as in: select count(*)), or the sum of values of a specific column in that group (as in: select sum(price)), or get the min, max or avg values.
So in a nutshell, when you use group by you should use one of the aggregation functions with it, otherwise it's not going to do you any good.
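As an illustration of that point, here is a small sketch of GROUP BY used together with aggregate functions. It runs the SQL through Python's sqlite3 module as a stand-in for MySQL; the price column and the sample rows are assumptions made up for the example.

```python
import sqlite3

# In-memory SQLite as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
# Hypothetical table: the price column is invented for this illustration.
conn.execute("CREATE TABLE test (id INTEGER, category INTEGER, price INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [(1, 1, 10), (2, 1, 30), (3, 2, 5)],
)

# Every selected column is either the grouping column or an aggregate,
# so each output value is well defined for its group.
summary = conn.execute(
    """
    SELECT category, COUNT(*), SUM(price), MIN(price), MAX(price)
    FROM test
    GROUP BY category
    ORDER BY category
    """
).fetchall()
print(summary)
```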
Why don't you have the ORDER BY at your outer query, instead?
SELECT *
FROM (
SELECT 100 AS id, 1 AS category, NULL AS sort
UNION
SELECT 200 AS id, 1 AS category, 2 AS sort
) dt
GROUP BY category
ORDER BY sort DESC;
It seems that when the data was grouped, MySQL took the first row of each group and ignored the ORDER BY ... DESC. In your first query, the rows were sorted in descending order first, and GROUP BY then took the first record, which is 200. And yes, this is not how GROUP BY should be used. It is meant to be used in conjunction with aggregate functions.
When you select a column in a GROUP BY query that is not one of the columns you are grouping by (i.e., your id), you have no control over which value you get unless you wrap it in an aggregate function. If you want the lowest or highest value, use MIN or MAX:
SELECT MAX(id), category FROM `test2`
GROUP BY category; -- always returns 200
SELECT MIN(id), category FROM `test2`
GROUP BY category; -- always returns 100
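For the two-column ordering asked about in the question, one possible approach (not taken from the answers above) is a NOT EXISTS filter: keep a row only if no other row in the same category ranks higher on (sort1, sort2). The sketch below runs it through Python's sqlite3 module as a stand-in for MySQL 5.7; the sample data is invented, and it assumes sort1 and sort2 are never NULL.

```python
import sqlite3

# In-memory SQLite as a stand-in for MySQL 5.7.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER, category INTEGER, sort1 INTEGER, sort2 INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?, ?)",
    [
        (100, 1, 5, 1),
        (200, 1, 5, 9),   # same sort1 as id 100, higher sort2
        (300, 1, 3, 99),  # lower sort1, so sort2 is irrelevant
        (400, 2, 1, 1),
    ],
)

# Keep a row only if no row in the same category sorts before it under
# ORDER BY sort1 DESC, sort2 DESC. Assumes no NULLs in sort1 or sort2.
top_rows = conn.execute(
    """
    SELECT t1.id, t1.category
    FROM test t1
    WHERE NOT EXISTS (
        SELECT 1
        FROM test t2
        WHERE t2.category = t1.category
          AND (t2.sort1 > t1.sort1
               OR (t2.sort1 = t1.sort1 AND t2.sort2 > t1.sort2))
    )
    ORDER BY t1.category
    """
).fetchall()
print(top_rows)
```

If two rows in a category tie on both sort columns, both are returned, so a further tie-breaker (for example on id) would be needed to guarantee exactly one row per category.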

Mysql: Select last occurence of every ID

I have multiple occurrences of a Client-ID called "IDKLIENT". I want to select the last occurrence of IDKLIENT for each ID, like
1|x 2|x
1|y 2|y
1|z 2|z
would be:
1|z
2|z
I used this code:
select a.*
from test a inner join
(select Name_Kl, max(IDKLIENT) as maxid from test group by IDKLIENT) as b on a.IDKLIENT = b.maxid
This way, I only get the same output as with
select a.*
Any help is appreciated!
Thanks in advance.
Edit: The table also has timestamps. So I would be content, if for each ID the max(timestamp) is selected.
Judging by the expected output, I believe you are looking to group by id to find the alphabetically greatest value for idklient. You can sort alphabetically by idklient using max if that is what you need:
select id, max(idklient) from test group by id;
If instead you want it sorted by insert order, I would suggest adding an AUTO_INCREMENT field, which you can then use for the grouping. This might work better than inserting a timestamp.
In response to your edit:
select id, max(timestamp) from test group by id;
This is a classic example for the group-by statement
I think group by has been changed.
Try this way
select a.*
from test a
inner join
(select Name_Kl, max(IDKLIENT) as maxid
from test
group by Name_Kl
) as b
on a.IDKLIENT = b.maxid
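Since the edit to the question mentions timestamps, here is a sketch of picking, for each IDKLIENT, the row with the latest timestamp. It is only an illustration: the SQL runs through Python's sqlite3 module rather than MySQL, the timestamp column name ts is a guess, and the rows mirror the 1|x ... 2|z sample from the question.

```python
import sqlite3

# In-memory SQLite as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
# "ts" is a hypothetical name for the timestamp column mentioned in the edit.
conn.execute("CREATE TABLE test (IDKLIENT INTEGER, Name_Kl TEXT, ts TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [
        (1, "x", "2014-01-01 10:00:00"),
        (1, "y", "2014-01-01 10:05:00"),
        (1, "z", "2014-01-01 10:10:00"),
        (2, "x", "2014-01-01 11:00:00"),
        (2, "y", "2014-01-01 11:05:00"),
        (2, "z", "2014-01-01 11:10:00"),
    ],
)

# Group by the ID, take the newest timestamp, then join back on BOTH columns
# so that only the newest row of each ID survives.
last_rows = conn.execute(
    """
    SELECT a.IDKLIENT, a.Name_Kl
    FROM test a
    INNER JOIN (
        SELECT IDKLIENT, MAX(ts) AS max_ts
        FROM test
        GROUP BY IDKLIENT
    ) b ON a.IDKLIENT = b.IDKLIENT AND a.ts = b.max_ts
    ORDER BY a.IDKLIENT
    """
).fetchall()
print(last_rows)
```

The difference from the query in the question is that the subquery groups by the ID itself and the join condition matches on the ID plus its maximum timestamp.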

SQL find distinct and show other columns

I have read many replies to similar questions but cannot seem to apply them to my situation. I have a table that averages 10,000 records and is ever changing. It contains a column called deviceID which has about 20 unique values, another called dateAndTime, and many others including status1 and status2. I need to isolate one instance of each deviceID, showing the record that has the most recent dateAndTime. This works great using:
select DISTINCT deviceID, MAX(dateAndTime)
from MyTable
Group By deviceID
ORDER BY MAX(dateAndTime) DESC
(I have noticed omitting DISTINCT from the above statement also yields the same result)
However, I cannot expand this statement to include the status fields without incurring errors in the statement or incorrect results. I have tried using IN and EXISTS to isolate rows, all without luck. I am wondering how I can nest or rewrite this query so that the results display the unique deviceIDs, the date of the most recent record, and the status fields associated with those records.
If you can guarantee that the DeviceID + DateAndTime is UNIQUE you can do the following:
SELECT *
FROM
MyTable as T1,
(SELECT DeviceID, max(DateAndTime) as mx FROM MyTable group by DeviceID) as T2
WHERE
T1.DeviceID = T2.DeviceID AND
T1.DateAndTime = T2.mx
So basically what happens is, that you do a group by on the DeviceID (NOTE: A GROUP BY always goes with an aggregate function. We are using MAX in this case).
Then you join the Query with the Table, and add the DeviceID + DateAndTime in the WHERE clause.
Side note: GROUP BY returns one row per deviceID with or without DISTINCT, because grouping already collapses each group into a single row.
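The uniqueness condition on DeviceID + DateAndTime matters because the join returns every row that matches the group's maximum. The sketch below shows this with made-up data in which device B has two rows sharing the latest timestamp. It uses Python's sqlite3 module as a stand-in for the real database, and writes the join with INNER JOIN ... ON, which is equivalent to the comma join plus WHERE in the answer.

```python
import sqlite3

# In-memory SQLite as a stand-in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (deviceID TEXT, dateAndTime TEXT, status1 TEXT, status2 TEXT)"
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?, ?)",
    [
        ("A", "2020-01-01 10:00", "ok", "ok"),
        ("A", "2020-01-02 10:00", "warn", "ok"),
        ("B", "2020-01-03 09:00", "ok", "fail"),
        ("B", "2020-01-03 09:00", "ok", "ok"),  # same timestamp: a tie
    ],
)

latest = conn.execute(
    """
    SELECT T1.deviceID, T1.dateAndTime, T1.status1, T1.status2
    FROM MyTable AS T1
    INNER JOIN (
        SELECT deviceID, MAX(dateAndTime) AS mx
        FROM MyTable
        GROUP BY deviceID
    ) AS T2 ON T1.deviceID = T2.deviceID AND T1.dateAndTime = T2.mx
    ORDER BY T1.deviceID, T1.status2
    """
).fetchall()
print(latest)
```

Device A yields a single row, while device B yields two, which is why the answer asks for the pair to be unique.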
Maybe:
SELECT a.*
FROM( SELECT DISTINCT *,
ROW_NUMBER() OVER (PARTITION BY deviceID ORDER BY dateAndTime DESC) as rown
FROM MyTable ) a
WHERE a.rown = 1

MySQL -- Making it all one query

I am trying to see if I can accomplish the following situation in one query:
I have a table with multiple columns, but only two are important: version and groupId. Many rows can share the same groupId value; the version column is a number that needs to be sorted.
Given two groupId values, A and B, I would like to return two rows in the end. I want to find the most recent version number for each group A and B.
Thanks for your help. Sorry if this is fairly obvious, but I was having difficulty
Something like this?
SELECT groupId, MAX(version) max_version
FROM YourTable
WHERE groupId IN ('A', 'B')
GROUP BY groupId;
You didn't specify any data types so I assumed that groupId could actually take character values like 'A'. Just change this to suit your needs. The basic idea is that you GROUP BY your groupId after filtering out only those values which interest you. Then you SELECT the MAX(version) for each of those values.
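A quick way to check this filter-then-group idea is to run it on a few made-up rows. The sketch below does so with Python's sqlite3 module as a stand-in for MySQL; the table contents are invented, and groupId is assumed to hold text values as in the answer.

```python
import sqlite3

# In-memory SQLite as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (groupId TEXT, version INTEGER)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?)",
    [("A", 1), ("A", 3), ("B", 2), ("C", 9)],
)

# Filter to the two groups of interest first, then take MAX(version) per group.
max_versions = conn.execute(
    """
    SELECT groupId, MAX(version) AS max_version
    FROM YourTable
    WHERE groupId IN ('A', 'B')
    GROUP BY groupId
    ORDER BY groupId
    """
).fetchall()
print(max_versions)
```

Group C is filtered out by the WHERE clause before grouping, so exactly two rows come back.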
Try below
select p.id, p.groupid, p.version from tablename p
-- inner join (not left join) so that only the matching rows are kept
inner join
(
select max(id) id1 from tablename
group by groupId
) t on t.id1 = p.id
This assumes the table has a primary key column id and that a higher id means a newer version.
I assume version is an integer
SELECT MAX(version), `group` FROM `table` WHERE `group` IN ('A', 'B') GROUP BY `group`

1st Row in Group By vs. Last Row

Wondering how to return the 1st row in a MySQL group by vs. the last row. Returning a series of dates, but need the first one displayed.
select * from calendar where approved = 'pending' group by eventid;
Sounds like you want to use ORDER BY instead of GROUP BY:
select * from calendar where approved = 'pending' order by eventid asc limit 1;
Use the min function instead of max to return the first row in a group rather than the last row.
select min(date) from calendar ...
If you want the entire row for each group, join the table with itself filtered for the min dates on each group:
select c2.* from
(select eventid, min(dt) dt from calendar
where approved = 'pending' group by eventid) c1
inner join calendar c2 on c1.eventid=c2.eventid and c1.dt=c2.dt
group by eventid;
Demo: http://www.sqlize.com/6lOr55p67c
When you use GROUP BY like that, MySQL makes no guarantees about which row in each group you get. Indeed, theoretically it could even pick some of the columns from one row and some from another. The MySQL Reference Manual says:
"MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. [...] You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate."
What you want to do instead is something like this:
SELECT
calendar.*
FROM
calendar
NATURAL JOIN (
SELECT
eventid, MIN(date) AS date
FROM
calendar
GROUP BY eventid
) AS foo
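As a rough check of this approach, the sketch below fetches the earliest pending row per eventid. It is an illustration only: it uses Python's sqlite3 module instead of MySQL, swaps the NATURAL JOIN for an explicit INNER JOIN ... ON on the same two columns, adds the approved = 'pending' filter from the question, and invents the title column and the sample rows.

```python
import sqlite3

# In-memory SQLite as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
# "title" is a hypothetical extra column to show that whole rows come back.
conn.execute("CREATE TABLE calendar (eventid INTEGER, date TEXT, approved TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO calendar VALUES (?, ?, ?, ?)",
    [
        (1, "2012-05-03", "pending", "second"),
        (1, "2012-05-01", "pending", "first"),
        (2, "2012-06-10", "pending", "solo"),
        (3, "2012-07-01", "approved", "skip"),
    ],
)

# The subquery finds the earliest pending date per event; the join on
# (eventid, date) then retrieves the complete row for that date.
first_rows = conn.execute(
    """
    SELECT c.eventid, c.date, c.title
    FROM calendar c
    INNER JOIN (
        SELECT eventid, MIN(date) AS date
        FROM calendar
        WHERE approved = 'pending'
        GROUP BY eventid
    ) AS foo ON c.eventid = foo.eventid AND c.date = foo.date
    WHERE c.approved = 'pending'
    ORDER BY c.eventid
    """
).fetchall()
print(first_rows)
```

Because every selected column comes from a row that was matched explicitly, the result does not depend on which row the server happens to pick for a group.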
If you want to show the first AND the last element of a group-by statement, this was useful for me:
SELECT min(date) AS minn, max(date) AS maxx
FROM table_name
GROUP BY occasion
Thanks for the question
I have the solution. To select the last row in a group, use this
select * from calendar
where approved = 'pending'
group by eventid
ORDER BY max(`date`) DESC;
or this to select the first row from a group...
select * from calendar
where approved = 'pending'
group by eventid
ORDER BY min(`date`) DESC;
The DESC does not matter; it only sorts the already-grouped result in descending order, after one row from each group has been selected.