I want to get the row with staff_id = 'AZ' that has the MAX(step) value in this table (name: test).
id general_id staff_id step
1 4 A1 1
2 4 AZ 2
3 4 A2 3
4 5 A3 1
5 5 A4 2
6 5 AZ 3
SELECT MAX(step) AS STEP, staff_id
FROM test AS t
WHERE staff_id = AZ
GROUP BY general_id
For just one staff member, you can use order by and limit:
select *
from test
where staff_id = 'AZ'
order by step desc limit 1
If you want to do this for multiple staff members at once, one option is a correlated subquery:
select *
from test t
where t.step = (select max(t1.step) from test t1 where t1.staff_id = t.staff_id)
Unlike order by with limit, this approach also returns ties for the top step, if any.
Or, in MySQL 8.0, you can use window functions:
select *
from (
    select t.*,
        rank() over(partition by staff_id order by step desc) rn
    from test t
) t
where rn = 1
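To see how the single-row and per-staff queries behave on the sample rows, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (that substitution is an assumption, chosen only so the snippet runs without a server); the two queries use syntax that both engines accept.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test (id INTEGER, general_id INTEGER, staff_id TEXT, step INTEGER)"
)
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?, ?)",
    [(1, 4, "A1", 1), (2, 4, "AZ", 2), (3, 4, "A2", 3),
     (4, 5, "A3", 1), (5, 5, "A4", 2), (6, 5, "AZ", 3)],
)

# One staff member: sort by step descending and keep the first row.
single = conn.execute(
    "SELECT id, staff_id, step FROM test "
    "WHERE staff_id = 'AZ' ORDER BY step DESC LIMIT 1"
).fetchall()

# Every staff member: correlated subquery, which keeps ties for the top step.
per_staff = conn.execute(
    """
    SELECT t.staff_id, t.step
    FROM test t
    WHERE t.step = (SELECT MAX(t1.step) FROM test t1 WHERE t1.staff_id = t.staff_id)
    ORDER BY t.staff_id
    """
).fetchall()

print(single)
print(per_staff)
```

With the sample data, the first query returns the row with id 6 for AZ, and the second returns one row per staff_id because no staff member has two rows sharing the same top step.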
I have a table like this:
id name version ref_id deleted
1 a 1 1 1
2 b 3 1 0
3 c 2 1 1
4 a 3 2 1
5 bb 1 2 0
6 cc 2 2 0
What I would like to achieve is to select the rows with the latest version for each ref_id:
id name version ref_id deleted
2 b 3 1 0
4 a 3 2 1
This is my original approach, but it is too slow for our system now:
select t.*
from (
select ref_id, max(version) as version
from table1
group by ref_id
) latest
inner join table1 t on t.ref_id = latest.ref_id and t.version = latest.version
Is there a way to do something like:
select if(version = max(version), id, other columns) from table group by ref_id ?
In MySQL 8 or later, you can use the row_number window function:
with cte as (
    select *, row_number() over (partition by ref_id order by version desc) as rn
    from t
)
select *
from cte
where rn = 1
For earlier versions of MySQL, your existing approach is best, but here is an alternative worth trying:
select *
from t
where (ref_id, version) in (
select ref_id, max(version)
from t
group by ref_id
)
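As a quick check that the window-function form and the row-value IN form agree on the sample data, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (an assumption for runnability; a reasonably recent SQLite is needed for window functions), and it uses the table name t from the answer rather than table1 from the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER, name TEXT, version INTEGER, ref_id INTEGER, deleted INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [(1, "a", 1, 1, 1), (2, "b", 3, 1, 0), (3, "c", 2, 1, 1),
     (4, "a", 3, 2, 1), (5, "bb", 1, 2, 0), (6, "cc", 2, 2, 0)],
)

# Window-function form: number the rows per ref_id, newest version first.
window_ids = [row[0] for row in conn.execute(
    """
    WITH cte AS (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY ref_id ORDER BY version DESC) AS rn
        FROM t
    )
    SELECT id FROM cte WHERE rn = 1 ORDER BY id
    """
)]

# Row-value IN form, for engines without window functions.
tuple_ids = [row[0] for row in conn.execute(
    """
    SELECT id FROM t
    WHERE (ref_id, version) IN (SELECT ref_id, MAX(version) FROM t GROUP BY ref_id)
    ORDER BY id
    """
)]

print(window_ids, tuple_ids)
```

The two forms only differ when a ref_id has several rows sharing the highest version: row_number keeps exactly one of them, while the IN form keeps them all.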
I'm working with a sample table like below. A Dataset has multiple groups, and each time a write to the table occurs, the RunNumber increments for the dataset, along with data for each group and the total. Each Dataset/Group combo will usually have multiple rows, example below:
RunNumber  Group   Dataset    Total
1          Group1  Dataset A  10
1          Group1  Dataset A  20
2          Group1  Dataset A  30
2          Group2  Dataset A  15
1          Group1  Dataset B  5
1          Group2  Dataset B  10
1          Group3  Dataset A  30
2          Group3  Dataset A  30
1          Group1  Dataset C  15
1          Group2  Dataset C  50
2          Group2  Dataset C  70
2          Group2  Dataset C  90
What I want to do is essentially this: for each combination of Dataset and Group, return all data for rows that have the max(RunNumber) for the given Dataset/Group combination. So for example, the above sample would return this:
RunNumber  Group   Dataset    Total
2          Group1  Dataset A  30
2          Group2  Dataset A  15
1          Group1  Dataset B  5
1          Group2  Dataset B  10
2          Group3  Dataset A  30
1          Group1  Dataset C  15
2          Group2  Dataset C  70
2          Group2  Dataset C  90
Where the Dataset/Groups match, all rows are kept with the max RunNumber for that given combo.
For now, I've split this into 2 separate queries, where I first query for the max(RunNumber) for all distinct Dataset/Group combos, then do a select * for all matches. Any help would be appreciated, thanks in advance!
In MySQL 5.x you can use a sub-query.
SELECT *
FROM your_table
WHERE (`Group`, Dataset, RunNumber) IN (
SELECT `Group`, Dataset, MAX(RunNumber) AS MaxRunNumber
FROM your_table
GROUP BY `Group`, Dataset
);
Alternatives
--
-- LEFT JOIN on bigger
--
SELECT t.*
FROM your_table t
LEFT JOIN your_table t2
ON t2.`Group` = t.`Group`
AND t2.Dataset = t.Dataset
AND t2.RunNumber > t.RunNumber
WHERE t2.RunNumber IS NULL
ORDER BY t.`Group`, t.Dataset;
--
-- where NOT EXISTS on bigger
--
SELECT *
FROM your_table t
WHERE NOT EXISTS (
SELECT 1
FROM your_table t2
WHERE t2.`Group` = t.`Group`
AND t2.Dataset = t.Dataset
AND t2.RunNumber > t.RunNumber
)
ORDER BY `Group`, Dataset;
--
-- Emulating DENSE_RANK = 1 with variables
-- Works also in 5.x
--
SELECT RunNumber, `Group`, Dataset, Total
FROM
(
SELECT
@rnk:=IF(@ds=Dataset AND @grp=`Group`, IF(@run=RunNumber, @rnk, @rnk+1), 1) AS Rnk
, @grp := `Group` as `Group`
, @ds := Dataset as Dataset
, @run := RunNumber as RunNumber
, Total
FROM your_table t
CROSS JOIN (SELECT @grp:=null, @ds:=null, @run:=null, @rnk := 0) var
ORDER BY `Group`, Dataset, RunNumber DESC
) q
WHERE Rnk = 1
ORDER BY `Group`, Dataset;
--
-- DENSE_RANK = 1
-- MySql 8 and beyond.
--
SELECT *
FROM
(
SELECT *
, DENSE_RANK() OVER (PARTITION BY `Group`, Dataset ORDER BY RunNumber DESC) AS rnk
FROM your_table
) q
WHERE rnk = 1
ORDER BY `Group`, Dataset;
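To confirm that the anti-join and the DENSE_RANK variants return the same rows on the sample data, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (an assumption for runnability; SQLite accepts MySQL-style backtick quoting, and a reasonably recent SQLite is needed for window functions). An extra ORDER BY on Total is added only to make the comparison deterministic.

```python
import sqlite3

rows = [
    (1, "Group1", "Dataset A", 10), (1, "Group1", "Dataset A", 20),
    (2, "Group1", "Dataset A", 30), (2, "Group2", "Dataset A", 15),
    (1, "Group1", "Dataset B", 5), (1, "Group2", "Dataset B", 10),
    (1, "Group3", "Dataset A", 30), (2, "Group3", "Dataset A", 30),
    (1, "Group1", "Dataset C", 15), (1, "Group2", "Dataset C", 50),
    (2, "Group2", "Dataset C", 70), (2, "Group2", "Dataset C", 90),
]

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table (RunNumber INTEGER, `Group` TEXT, Dataset TEXT, Total INTEGER)"
)
conn.executemany("INSERT INTO your_table VALUES (?, ?, ?, ?)", rows)

# Anti-join: keep a row only if no row in the same Group/Dataset has a bigger RunNumber.
anti_join = conn.execute(
    """
    SELECT t.RunNumber, t.`Group`, t.Dataset, t.Total
    FROM your_table t
    LEFT JOIN your_table t2
      ON t2.`Group` = t.`Group`
     AND t2.Dataset = t.Dataset
     AND t2.RunNumber > t.RunNumber
    WHERE t2.RunNumber IS NULL
    ORDER BY t.`Group`, t.Dataset, t.Total
    """
).fetchall()

# Window function: rank RunNumber per Group/Dataset and keep every row ranked first.
ranked = conn.execute(
    """
    SELECT RunNumber, `Group`, Dataset, Total
    FROM (
        SELECT *, DENSE_RANK() OVER (PARTITION BY `Group`, Dataset ORDER BY RunNumber DESC) AS rnk
        FROM your_table
    ) q
    WHERE rnk = 1
    ORDER BY `Group`, Dataset, Total
    """
).fetchall()

for row in ranked:
    print(row)
```

Both variants keep all rows that share the highest RunNumber, which is why Group2 / Dataset C appears twice in the result.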
I have the following table structure:
Id Num1 Num2 Type Num3
1 2 2 1 4
1 3 1 2 5
1 1 1 3 2
2 2 1 1 3
2 0 1 2 2
2 4 3 3 6
I need a query with group by 'Id', sum of 'Num1', sum of 'Num2', max of 'Num3' and the 'Type' related to the MAX of 'Num3'.
So, the desired output is:
Id Sum(Num1) Sum(Num2) type Max(Num3)
1 6 4 2 5
2 6 5 3 6
Without this related 'Type' the query below works fine:
SELECT
Id,
SUM(Num1),
SUM(Num2),
MAX(Num3)
FROM yourTable
GROUP BY
Id
I tried different methods of subquery but can't make it work yet.
Your problem is a bit of a spin on the greatest value per group problem. In this case, we can use a subquery to find the max Num3 value for each Id. But, in the same subquery we also compute the sum aggregates.
SELECT
t1.Id,
t2.s1,
t2.s2,
t1.Type,
t1.Num3
FROM yourTable t1
INNER JOIN
(
SELECT Id, SUM(Num1) AS s1, SUM(Num2) AS s2, MAX(Num3) AS m3
FROM yourTable
GROUP BY Id
) t2
ON t1.Id = t2.Id AND t1.Num3 = t2.m3;
As a hat tip to MySQL 8+, and to ward off evil spirits, we can also write a query using analytic functions:
SELECT Id, s1, s2, Type, Num3
FROM
(
SELECT
Id,
SUM(Num1) OVER (PARTITION BY Id) s1,
SUM(Num2) OVER (PARTITION BY Id) s2,
Type,
Num3,
MAX(Num3) OVER (PARTITION BY Id) m3
FROM yourTable
) t
WHERE Num3 = m3;
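The join-based query can be tried against the sample rows with a minimal sketch like the one below. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (an assumption for runnability) and keeps the placeholder table name yourTable from the answer.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable (Id INTEGER, Num1 INTEGER, Num2 INTEGER, Type INTEGER, Num3 INTEGER)"
)
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?, ?)",
    [(1, 2, 2, 1, 4), (1, 3, 1, 2, 5), (1, 1, 1, 3, 2),
     (2, 2, 1, 1, 3), (2, 0, 1, 2, 2), (2, 4, 3, 3, 6)],
)

# The subquery computes the per-Id sums and the max Num3 in one pass;
# joining back on (Id, Num3) recovers the Type of the row holding that max.
result = conn.execute(
    """
    SELECT t1.Id, t2.s1, t2.s2, t1.Type, t1.Num3
    FROM yourTable t1
    INNER JOIN (
        SELECT Id, SUM(Num1) AS s1, SUM(Num2) AS s2, MAX(Num3) AS m3
        FROM yourTable
        GROUP BY Id
    ) t2
      ON t1.Id = t2.Id AND t1.Num3 = t2.m3
    ORDER BY t1.Id
    """
).fetchall()

for row in result:
    print(row)
```

Note that this returns one row per Id only when the maximum Num3 is unique within that Id; if two rows tie for the maximum, both of their Type values come back.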
I am aggregating data and I cannot sum certain columns so I would like to take the most frequent observation from that column, or the median value. Example follows, thanks in advance.
ID site
1 3
1 3
1 2
1 3
2 4
2 5
2 5
2 5
I want it to look like
ID Site
1 3
2 5
WITH temp AS(
SELECT ID, Site, COUNT(*) As counts
FROM id_table
GROUP BY ID, Site
)
SELECT temp.ID, temp.Site
FROM temp
JOIN (SELECT ID, MAX(counts) max_counts
FROM temp
GROUP BY ID
)b
ON temp.ID = b.ID
AND temp.counts = b.max_counts
ORDER BY ID ASC
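The count-then-pick-the-max idea can be checked on the sample rows with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for the original database (an assumption for runnability), and it renames the CTE from temp to site_counts, a hypothetical name chosen to avoid clashing with SQLite's built-in temp schema.

```python
import sqlite3

# In-memory SQLite database standing in for the original engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE id_table (ID INTEGER, Site INTEGER)")
conn.executemany(
    "INSERT INTO id_table VALUES (?, ?)",
    [(1, 3), (1, 3), (1, 2), (1, 3), (2, 4), (2, 5), (2, 5), (2, 5)],
)

# Step 1 (CTE): count how often each Site occurs per ID.
# Step 2 (join): keep the Site whose count equals the per-ID maximum count.
modes = conn.execute(
    """
    WITH site_counts AS (
        SELECT ID, Site, COUNT(*) AS counts
        FROM id_table
        GROUP BY ID, Site
    )
    SELECT sc.ID, sc.Site
    FROM site_counts sc
    JOIN (
        SELECT ID, MAX(counts) AS max_counts
        FROM site_counts
        GROUP BY ID
    ) b
      ON sc.ID = b.ID AND sc.counts = b.max_counts
    ORDER BY sc.ID
    """
).fetchall()

print(modes)
```

If two Site values are equally frequent for an ID, this query returns both of them, so a tie-breaking rule would be needed when exactly one row per ID is required.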
I am writing a query to grab the items that a specific user_id was the first to use. Here is some sample data -
item_id used_user_id date_used
1 1 2012-08-25
1 2 2012-08-26
1 3 2012-08-27
2 2 2012-08-27
3 1 2012-08-27
4 1 2012-08-21
4 3 2012-08-24
5 3 2012-08-23
Query:
select item_id as inner_item_id, ( select used_user_id
from test
where test.item_id = inner_item_id
order by date_used asc
limit 1 ) as first_to_use_it
from test
where used_user_id = 1
group by item_id
It returns the correct values
inner_item_id first_to_use_it
1 1
3 1
4 1
but the query is VERY slow on a giant table. Is there a certain index that I can use or a better query that I can write?
I can't tell exactly what you mean, because your inner query sorts by used_user_id and your outer query also filters by user id. Why not do this directly?
SELECT DISTINCT item_id AS inner_item_id,
used_user_id AS first_to_use_it
FROM test
WHERE used_user_id = 1
UPDATE 1
SELECT b.item_id,
b.used_user_id AS first_to_use_it
FROM
(
SELECT item_ID, MIN(date_used) minDate
FROM tableName
GROUP BY item_ID
) a
INNER JOIN tableName b
ON a.item_ID = b.item_ID AND
a.minDate = b.date_used
WHERE b.used_user_id = 1
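To check the MIN(date_used) join against the sample data, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (an assumption for runnability), stores the dates as ISO-formatted text so that MIN compares them correctly, and wraps the query in a hypothetical helper, first_used_by, so it can be tried for several users.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (item_id INTEGER, used_user_id INTEGER, date_used TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [(1, 1, "2012-08-25"), (1, 2, "2012-08-26"), (1, 3, "2012-08-27"),
     (2, 2, "2012-08-27"), (3, 1, "2012-08-27"),
     (4, 1, "2012-08-21"), (4, 3, "2012-08-24"), (5, 3, "2012-08-23")],
)


def first_used_by(user_id):
    """Return (item_id, user_id) pairs for items whose earliest use was by user_id."""
    return conn.execute(
        """
        SELECT b.item_id, b.used_user_id AS first_to_use_it
        FROM (
            SELECT item_id, MIN(date_used) AS minDate
            FROM test
            GROUP BY item_id
        ) a
        INNER JOIN test b
          ON a.item_id = b.item_id AND a.minDate = b.date_used
        WHERE b.used_user_id = ?
        ORDER BY b.item_id
        """,
        (user_id,),
    ).fetchall()


print(first_used_by(1))
print(first_used_by(3))
```

User 3 appears on items 1, 4 and 5 in the sample data but was first only on item 5, which is the case a plain DISTINCT filter on used_user_id would get wrong.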