SQL: Find top n in groupby data - mysql

I have a table like this:
id | name | surname | city|
-------------------------------
'1', 'mohit', 'garg', 'delhi'
'2', 'mohit', 'gupta', 'delhi'
'3', 'ankita', 'gupta', 'jaipur'
'4', 'ankita', 'garg', 'jaipur'
'5', 'vivek', 'garg', 'delhi'
I am looking for a query that returns (id,city) grouped by city, with at most two (id) per city, but without using nested queries.
Expected output:
'1', 'delhi'
'2', 'delhi'
'3', 'jaipur'
'4', 'jaipur'

Perhaps the only way without subqueries is to use a trick with substring_index() and group_concat():
select city, substring_index(group_concat(id), ',', 2)
from t
group by city;
This puts the ids in a comma-delimited list rather than in separate rows. Also, be careful about the size of the intermediate result: group_concat() output is truncated at group_concat_max_len (1024 bytes by default).
Of course, the accepted practice would use either a subquery in the where clause or a subquery using variables.
EDIT:
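Here is a method for getting two ids per city without listing the cities. Note that it returns the lowest and the highest id in each city, so it matches the expected output only for cities with at most two rows: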
select city, min(id) as id
from t
group by city
union
select city, max(id)
from t
group by city;
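To see what the min/max union returns on the question's sample data, here is a minimal sketch that runs it through Python's sqlite3 module. SQLite is only a stand-in for MySQL here, and the table name t follows the answer above.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT, surname TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, "mohit", "garg", "delhi"),
        (2, "mohit", "gupta", "delhi"),
        (3, "ankita", "gupta", "jaipur"),
        (4, "ankita", "garg", "jaipur"),
        (5, "vivek", "garg", "delhi"),
    ],
)

# Lowest and highest id per city; ORDER BY applies to the whole UNION.
rows = conn.execute(
    """
    SELECT city, MIN(id) AS id FROM t GROUP BY city
    UNION
    SELECT city, MAX(id) FROM t GROUP BY city
    ORDER BY city, id
    """
).fetchall()
print(rows)
# [('delhi', 1), ('delhi', 5), ('jaipur', 3), ('jaipur', 4)]
```

For delhi this yields ids 1 and 5 rather than 1 and 2, because the union picks the two extremes instead of the first two rows.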

You can do this with a LEFT OUTER JOIN, although using a subquery will probably be clearer and might be faster. Here's a method using the JOIN:
SELECT
T1.id,
T1.city
FROM
My_Table T1
LEFT OUTER JOIN My_Table T2 ON T2.city = T1.city AND T2.id <= T1.id
GROUP BY
T1.id,
T1.city
HAVING
COUNT(*) <= 2
You're effectively finding all rows in T1 where the number of rows with the same city and a lower or equal id is <= 2, which means that it must be one of the first two rows by id.
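The self-join can be checked against the question's sample data. The sketch below runs the same query through Python's sqlite3 module; SQLite is used only as a stand-in for MySQL, and the ORDER BY is added just to make the output deterministic.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE My_Table (id INTEGER, name TEXT, surname TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO My_Table VALUES (?, ?, ?, ?)",
    [
        (1, "mohit", "garg", "delhi"),
        (2, "mohit", "gupta", "delhi"),
        (3, "ankita", "gupta", "jaipur"),
        (4, "ankita", "garg", "jaipur"),
        (5, "vivek", "garg", "delhi"),
    ],
)

# Each T1 row is paired with every row of the same city whose id is <= its own,
# so COUNT(*) is the row's position within its city when ordered by id.
rows = conn.execute(
    """
    SELECT T1.id, T1.city
    FROM My_Table T1
    LEFT OUTER JOIN My_Table T2 ON T2.city = T1.city AND T2.id <= T1.id
    GROUP BY T1.id, T1.city
    HAVING COUNT(*) <= 2
    ORDER BY T1.id
    """
).fetchall()
print(rows)
# [(1, 'delhi'), (2, 'delhi'), (3, 'jaipur'), (4, 'jaipur')]
```

Row 5 pairs with ids 1, 2 and 5 in delhi, so its count is 3 and the HAVING clause drops it.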

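Try the following. Note that it uses SQL Server syntax (a #test temporary table), and that the ROW_NUMBER() window function is available in MySQL only from version 8.0: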
create table
#test (id int, name varchar(10), name2 varchar(10),place varchar(10))
insert into #test
select
'1', 'mohit', 'garg', 'delhi'
union
select
'2', 'mohit', 'gupta', 'delhi'
union
select
'3', 'ankita', 'gupta', 'jaipur'
union
select
'4', 'ankita', 'garg', 'jaipur'
union
select
'5', 'vivek', 'garg', 'delhi'
with data
as
(
select ROW_NUMBER() OVER(PARTITION BY place ORDER BY id) RN,id,name,name2,place
from #test
),
data1
as(
select id, place
from data
where rn <=2
)
select * from data1

Related

Find the latest rows by filtering the status

I have a table called person_list. The data is,
Insert into person_list(person_allocation_id, person_id, created_datetime, boss_user_name, allocation_status_id) values
(111008, 1190016, '2021-01-05 11:09:25', 'Rajesh', '2'),
(111007, 1190015, '2020-12-12 09:23:31', 'Sushmita', '2'),
(111006, 1190014, '2020-12-11 10:48:26', '', '3'),
(111005, 1190014, '2020-12-10 13:46:15', 'Rangarao', '2'),
(111004, 1190014, '2020-12-10 13:36:10', '', '3');
Here person_allocation_id is the primary key.
person_id may sometimes be duplicated.
All of these rows are sorted by person_allocation_id (in descending order).
Now, I would like to keep only the rows that have allocation_status_id = '2' and a non-empty boss_user_name for the person_id.
The difficulty here is that I have to exclude the row if the person_id has allocation_status_id = '3' as its latest status (according to date).
I am unable to understand how I could compare the dates in one row with those in the previous row.
So finally I should get only 2 rows in my final result set (person_allocation_id 111008 and 111007).
Somehow I achieved this in Oracle.
select person_id, person_allocation_id, created_datetime, boss_user_name, allocation_status_id
from (
select person_id, person_allocation_id, created_datetime, boss_user_name, allocation_status_id,
rank() over (partition by person_id order by created_datetime desc) rnk
from person_list
where allocation_status_id = '2')
where rnk = 1;
But I need this for a MySQL DB. Can anyone please help?
Thanks.
SELECT t1.*
FROM person_list t1
JOIN ( SELECT MAX(t2.person_allocation_id) person_allocation_id, t2.person_id
FROM person_list t2
GROUP BY t2.person_id ) t3 USING (person_allocation_id, person_id)
WHERE t1.allocation_status_id = '2'
Add more conditions to WHERE clause if needed (for example, AND boss_user_name != '').
You can use a correlated subquery to get the latest allocation_status_id value per person_id:
select person_allocation_id
, person_id
, created_datetime
, boss_user_name
, allocation_status_id
from (
select person_allocation_id
, person_id
, created_datetime
, boss_user_name
, allocation_status_id
, (select pl2.allocation_status_id
from person_list pl2
where pl2.person_id = pl.person_id
order by pl2.created_datetime desc
limit 1) latest_allocation_status_id
from person_list pl) t
where
allocation_status_id = '2' and latest_allocation_status_id <> '3'
and boss_user_name <> ''
The outer query is able to check the latest status and return the expected result set. The query works for MySQL 5.7
As a side note, for MySQL 8.0 you can replace the correlated subquery with a window function:
first_value(allocation_status_id) over (partition by person_id
order by created_datetime desc)
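As a rough check of the correlated-subquery idea, the sketch below loads the question's rows into SQLite through Python's sqlite3 module (a stand-in for MySQL, not the answerer's setup). It is a variation: the subquery is placed directly in the WHERE clause instead of in a derived table, and column types are assumptions.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE person_list (
        person_allocation_id INTEGER PRIMARY KEY,
        person_id INTEGER,
        created_datetime TEXT,
        boss_user_name TEXT,
        allocation_status_id TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO person_list VALUES (?, ?, ?, ?, ?)",
    [
        (111008, 1190016, "2021-01-05 11:09:25", "Rajesh", "2"),
        (111007, 1190015, "2020-12-12 09:23:31", "Sushmita", "2"),
        (111006, 1190014, "2020-12-11 10:48:26", "", "3"),
        (111005, 1190014, "2020-12-10 13:46:15", "Rangarao", "2"),
        (111004, 1190014, "2020-12-10 13:36:10", "", "3"),
    ],
)

# Keep status-2 rows with a boss, unless the person's most recent row has status 3.
kept = conn.execute(
    """
    SELECT pl.person_allocation_id
    FROM person_list pl
    WHERE pl.allocation_status_id = '2'
      AND pl.boss_user_name <> ''
      AND (SELECT pl2.allocation_status_id
           FROM person_list pl2
           WHERE pl2.person_id = pl.person_id
           ORDER BY pl2.created_datetime DESC
           LIMIT 1) <> '3'
    ORDER BY pl.person_allocation_id DESC
    """
).fetchall()
print(kept)
# [(111008,), (111007,)]
```

Row 111005 is dropped because the latest row for person 1190014 (111006) has status 3.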

SQL Query to get latest records for that user

I have a MySQL database and I need a little help with querying the data from the table.
// Table
id INTEGER,
column1 VARCHAR,
completiondate DATETIME
// Sample data
(101, 'a', '2020-03-20 12:00:00')
(101, 'b', '2020-03-21 12:00:00')
(101, 'c', '2020-03-22 12:00:00')
(101, 'c', '2020-03-23 12:00:00')
(101, 'd', '2020-03-24 12:00:00')
(102, 'a', '2020-03-20 12:00:00')
(102, 'b', '2020-03-21 12:00:00')
Here, I want to view all the records for that specific user and display only the latest one from the duplicates found in column1.
Expected Output for user 101:
(101, 'a', '2020-03-20 12:00:00')
(101, 'b', '2020-03-21 12:00:00')
(101, 'c', '2020-03-23 12:00:00')
(101, 'd', '2020-03-24 12:00:00')
I'm new with SQL. Would be great if anyone can provide any insight on this.
Thanks in advance!
You can filter with a subquery:
select t.*
from mytable t
where
t.id = 101
and t.completiondate = (
select max(t1.completiondate)
from mytable t1
where t1.id = t.id and t1.column1 = t.column1
)
Alternatively, in MySQL 8.0, you can use window function rank():
select *
from (
select t.*, rank() over(partition by id, column1 order by completiondate desc) rn
from mytable t
where id = 101
) t
where rn = 1
Note that, for this dataset, you could also use simple aggregation:
select id, column1, max(completiondate) completiondate
from mytable
where id = 101
group by id, column1
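To compare the subquery filter with the simple aggregation on the sample data, here is a small sketch using Python's sqlite3 module. SQLite stands in for MySQL, the column types are assumptions, and the ORDER BY clauses are added only to make the two results comparable.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, column1 TEXT, completiondate TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (101, "a", "2020-03-20 12:00:00"),
        (101, "b", "2020-03-21 12:00:00"),
        (101, "c", "2020-03-22 12:00:00"),
        (101, "c", "2020-03-23 12:00:00"),
        (101, "d", "2020-03-24 12:00:00"),
        (102, "a", "2020-03-20 12:00:00"),
        (102, "b", "2020-03-21 12:00:00"),
    ],
)

# Variant 1: correlated subquery picks the latest row per (id, column1).
by_subquery = conn.execute(
    """
    SELECT t.*
    FROM mytable t
    WHERE t.id = 101
      AND t.completiondate = (
          SELECT MAX(t1.completiondate)
          FROM mytable t1
          WHERE t1.id = t.id AND t1.column1 = t.column1
      )
    ORDER BY t.column1
    """
).fetchall()

# Variant 2: plain aggregation, enough here because the table has no other columns.
by_group = conn.execute(
    """
    SELECT id, column1, MAX(completiondate) AS completiondate
    FROM mytable
    WHERE id = 101
    GROUP BY id, column1
    ORDER BY column1
    """
).fetchall()

print(by_group)
```

Both variants return four rows for user 101, with the 2020-03-23 row kept for column1 = 'c'. The aggregation stops being sufficient once the table has extra columns whose values must come from the latest row.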
Here is one PHP-friendly way to do this, using joins:
SELECT t1.*
FROM yourTable t1
INNER JOIN
(
SELECT id, column1, MAX(completiondate) AS maxcompletiondate
FROM yourTable
GROUP BY id, column1
) t2
ON t1.id = t2.id AND
t1.column1 = t2.column1 AND
t1.completiondate = t2.maxcompletiondate;
I think the easiest way would be to join the table's max value back to the table itself, something like this:
SELECT yourtable.user, yourtable.`date`
FROM yourtable
INNER JOIN
(
SELECT MAX(`date`) AS `date`, user
FROM yourtable
GROUP BY user
) latest ON latest.`date` = yourtable.`date` AND latest.user = yourtable.user

Error Code: 1060. Duplicate column name

I've been receiving Error Code: 1060. :
Duplicate column name 'NULL'
Duplicate column name '2016-08-04 01:25:06'
Duplicate column name 'john'
However, I need to insert some fields with the same value, but MySQL rejects the statement with the above error. The error is probably because SQL can't select the same column name twice; in that case, is there another way of writing the code? Below is my current code:
INSERT INTO test.testTable SELECT *
FROM (SELECT NULL, 'hello', 'john', '2016-08-04 01:25:06', 'john'
, '2016-08-04 01:25:06', NULL, NULL) AS tmp
WHERE NOT EXISTS (SELECT * FROM test.testTable WHERE message= 'hello' AND created_by = 'john') LIMIT 1
My Column:
(id, message, created_by, created_date, updated_by, updated_date, deleted_by, deleted_date)
Please assist, thanks.
Your duplicate column names are coming from your subquery. You select null, john, and 2016-08-04 01:25:06 multiple times. Provide the columns you are selecting with names/aliases:
INSERT INTO test.testTable
SELECT *
FROM (SELECT NULL as col1, 'hello' as col2,
'john' as col3, '2016-08-04 01:25:06' as col4,
'john' as col5, '2016-08-04 01:25:06' as col6,
NULL as col7, NULL as col8) AS tmp
WHERE NOT EXISTS (SELECT *
FROM test.testTable
WHERE message= 'hello' AND created_by = 'john')
LIMIT 1
Not sure LIMIT 1 is useful here; you are only selecting a single row to potentially insert.
You are using a subquery. Because you don't give the columns aliases, MySQL has to choose names for you, and it uses the text of each expression as its column name, so repeated literals produce duplicate names.
You can write the query without the subquery:
INSERT INTO test.testTable( . . .)
SELECT NULL, 'hello', 'john', '2016-08-04 01:25:06', 'john',
'2016-08-04 01:25:06', NULL, NULL
FROM dual
WHERE NOT EXISTS (SELECT 1
FROM test.testTable tt
WHERE tt.message = 'hello' AND tt.created_by = 'john'
);
If you do use a subquery in the FROM clause, then use correlation clauses in the WHERE subquery:
INSERT INTO test.testTable( . . .)
SELECT *
FROM (SELECT NULL as col1, 'hello' as message, 'john' as created_by,
'2016-08-04 01:25:06' as date, 'john' as col2,
'2016-08-04 01:25:06' as col3, NULL as col4, NULL as col5
) t
WHERE NOT EXISTS (SELECT 1
FROM test.testTable tt
WHERE tt.message = t.message AND
tt.created_by = t.created_by
);
In addition, the LIMIT 1 isn't doing anything because you only have one row.
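The insert-if-absent pattern without a derived table can be sketched as follows, using Python's sqlite3 module as a stand-in for MySQL. SQLite accepts a SELECT with a WHERE clause and no FROM, so FROM dual is omitted; the column types and the insert_once helper are hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE testTable (
        id INTEGER PRIMARY KEY,
        message TEXT, created_by TEXT, created_date TEXT,
        updated_by TEXT, updated_date TEXT,
        deleted_by TEXT, deleted_date TEXT
    )
    """
)

# No derived table, so no column aliases are needed and error 1060 cannot arise.
INSERT_SQL = """
INSERT INTO testTable (message, created_by, created_date, updated_by, updated_date)
SELECT ?, ?, ?, ?, ?
WHERE NOT EXISTS (SELECT 1
                  FROM testTable tt
                  WHERE tt.message = ? AND tt.created_by = ?)
"""

def insert_once(message, user, stamp):
    """Hypothetical helper: insert the row only if (message, created_by) is absent."""
    conn.execute(INSERT_SQL, (message, user, stamp, user, stamp, message, user))

insert_once("hello", "john", "2016-08-04 01:25:06")
insert_once("hello", "john", "2016-08-04 01:25:06")  # second call inserts nothing

count = conn.execute("SELECT COUNT(*) FROM testTable").fetchone()[0]
print(count)
# 1
```

Listing the target columns explicitly also removes the need to pass NULL placeholders for id and the deleted_* columns.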

unable to group by category and unixtime desc

I have created: http://sqlfiddle.com/#!2/7bb44/1
CREATE TABLE if not exists tblA
(
id int(11) NOT NULL auto_increment ,
userid int(255),
category int(255),
unixtime int(255),
PRIMARY KEY (id)
);
INSERT INTO tblA (userid,category,unixtime) VALUES
('1', '1','1438689946'),
('1', '2','1438690005'),
('1', '3','1438690007'),
('5', '1','1438690009'),
('2', '1','1438690005'),
('2', '1','1438690398'),
('1', '2','1438691020'),
('1', '3','1438691028'),
('4', '2','1438690005'),
('2', '3','1438691025'),
('2', '2','1438691020'),
('3', '3','1438691022');
and
Select * from tblA group by category order by unixtime desc;
But I am getting wrong values. The rows do not come back in the right unixtime desc order. How can I make it work? I really appreciate any help.
Try this query. If two rows share the same unixtime, it should display only one of them:
Select a.*
from tblA a join
(select category, max(unixtime) as maxut
from tblA
group by category
) c
on a.category = c.category and a.unixtime = c.maxut
group by unixtime order by a.unixtime desc;
You cannot express what you want in the way you have done it. The order by is processed after the group by. Presumably you want:
Select a.*
from tblA a join
(select category, max(unixtime) as maxut
from tblA
group by category
) c
on a.category = c.category and a.unixtime = c.maxut
order by a.unixtime desc;
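Running the join on the question's data shows how ties behave. The sketch below uses Python's sqlite3 module as a stand-in for MySQL; the selected columns and the extra a.id tie-breaker in ORDER BY are choices made for this example only.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE tblA (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        userid INTEGER, category INTEGER, unixtime INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO tblA (userid, category, unixtime) VALUES (?, ?, ?)",
    [
        (1, 1, 1438689946), (1, 2, 1438690005), (1, 3, 1438690007),
        (5, 1, 1438690009), (2, 1, 1438690005), (2, 1, 1438690398),
        (1, 2, 1438691020), (1, 3, 1438691028), (4, 2, 1438690005),
        (2, 3, 1438691025), (2, 2, 1438691020), (3, 3, 1438691022),
    ],
)

# Join each row to the newest unixtime of its category.
latest = conn.execute(
    """
    SELECT a.userid, a.category, a.unixtime
    FROM tblA a
    JOIN (SELECT category, MAX(unixtime) AS maxut
          FROM tblA
          GROUP BY category) c
      ON a.category = c.category AND a.unixtime = c.maxut
    ORDER BY a.unixtime DESC, a.id
    """
).fetchall()
print(latest)
# [(1, 3, 1438691028), (1, 2, 1438691020), (2, 2, 1438691020), (2, 1, 1438690398)]
```

Category 2 appears twice because two rows share its maximum unixtime, which is the case the extra group by unixtime in the first answer is meant to collapse.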

join user id from tblB to user from tblA and get username

How do I join userid to user and get the username?
I really appreciate any help. Thanks in advance.
http://sqlfiddle.com/#!2/ac600/1
CREATE TABLE if not exists tblA
(
id int(11) NOT NULL auto_increment ,
user varchar(255),
category int(255),
PRIMARY KEY (id)
);
CREATE TABLE if not exists tblB
(
id int(11) NOT NULL auto_increment ,
username varchar(255),
userid int(255),
PRIMARY KEY (id)
);
INSERT INTO tblA (user, category ) VALUES
('1', '1'),
('1', '2'),
('1', '3'),
('1', '1'),
('2', '1'),
('2', '1'),
('2', '1'),
('2', '1'),
('3', '1'),
('2', '1'),
('4', '1'),
('4', '1'),
('2', '1');
INSERT INTO tblB (userid, username ) VALUES
('1', 'A'),
('2', 'B'),
('3', 'C'),
('4', 'D'),
('5', 'E');
query:
SELECT
groups.*,
@rank:=@rank+1 AS rank
FROM
(select
user,
category,
count(*) as num
from
tblA
where
category=1
group by
user,
category
order by
num desc,
user) AS groups
CROSS JOIN (SELECT @rank:=0) AS init
the table looks like :
username | category | num | rank
B | 1 | 6 | 2
A | 1 | 2 | 1
D | 1 | 2 | 4
C | 1 | 1 | 3
Use JOIN, for example:
SELECT
tblB.username,
groups.*,
@rank:=@rank+1 AS rank
FROM
(select
user,
category,
count(*) as num
from
tblA
where
category=1
group by
user,
category
order by
num desc,
user) AS groups
-- left join, in case data integrity fails:
left join
tblB ON groups.user=tblB.userid
CROSS JOIN (SELECT @rank:=0) AS init
You just need to do left join
SELECT
groups.*,
@rank:=@rank+1 AS rank
FROM
(select
user,
category,
count(*) as num,
tblB.username
from
tblA
left join tblB on tblA.user = tblB.userid
where
category=1
group by
user,
category
order by
num desc,
user) AS groups
CROSS JOIN (SELECT @rank:=0) AS init
You don't need a subquery to do what you want. You can simply join in the name:
select username, category,
count(*) as num,
@rank:=@rank+1 AS rank
from tblA join
tblB
on tblA.user = tblB.userId CROSS JOIN
(SELECT @rank:=0) AS cont
where category = 1
group by username, category
order by num desc, username;
TblB has an odd format. Normally, the auto-incrementing id would be the "userid" for the table.
Also, because you are selecting only one category, strictly speaking it is unnecessary to put category in the group by statement.
EDIT:
You cannot create a view with this method of doing the rank because it uses variables. It isn't easy to generate a rank on aggregated data in a view-compatible way in MySQL.
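The join-and-count part of this answer can be tried on the question's data with Python's sqlite3 module. SQLite stands in for MySQL and has no @rank user variables, so in this sketch the rank is assigned in Python with enumerate; that substitution, and storing user as an integer, are assumptions of the example rather than part of the answer.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblA (id INTEGER PRIMARY KEY AUTOINCREMENT, user INTEGER, category INTEGER)"
)
conn.execute(
    "CREATE TABLE tblB (id INTEGER PRIMARY KEY AUTOINCREMENT, username TEXT, userid INTEGER)"
)
conn.executemany(
    "INSERT INTO tblA (user, category) VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 3), (1, 1), (2, 1), (2, 1), (2, 1),
     (2, 1), (3, 1), (2, 1), (4, 1), (4, 1), (2, 1)],
)
conn.executemany(
    "INSERT INTO tblB (userid, username) VALUES (?, ?)",
    [(1, "A"), (2, "B"), (3, "C"), (4, "D"), (5, "E")],
)

# Join on tblA.user = tblB.userid, then count rows per username in category 1.
counts = conn.execute(
    """
    SELECT tblB.username, COUNT(*) AS num
    FROM tblA
    JOIN tblB ON tblA.user = tblB.userid
    WHERE tblA.category = 1
    GROUP BY tblB.username
    ORDER BY num DESC, tblB.username
    """
).fetchall()

# Rank by position in the ordered result (replaces the MySQL @rank variable).
ranked = [(rank, name, num) for rank, (name, num) in enumerate(counts, start=1)]
print(ranked)
# [(1, 'B', 6), (2, 'A', 2), (3, 'D', 2), (4, 'C', 1)]
```

User E has no rows in tblA, so the inner join leaves it out of the ranking.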