MYSQL Group by column with 2 rows for each group


I need two ids for each group.
SELECT `id`, `category`.`cat_name`
FROM `info`
LEFT JOIN `category` ON `info`.`cat_id` = `category`.`cat_id`
WHERE `category`.`cat_name` IS NOT NULL
GROUP BY `category`.`cat_name`
ORDER BY `category`.`cat_name` ASC
How to do this?
Sample Data:
id cat_name
1 Cat-1
2 Cat-1
3 Cat-2
4 Cat-1
5 Cat-2
6 Cat-1
7 Cat-2
The output should be:
id cat_name
6 Cat-1
4 Cat-1
7 Cat-2
5 Cat-2

If you need two arbitrary ids, then use min() and max():
SELECT c.`cat_name` , min(id), max(id)
FROM `info` i INNER JOIN
`category` c
ON i.`cat_id` = c.`cat_id`
WHERE c.`cat_name` IS NOT NULL
GROUP BY c.`cat_name`
ORDER BY c.`cat_name` ASC ;
Note: You are using a LEFT JOIN and then aggregating by a column in the second table. This is usually not a good idea, because non-matches are all placed in a NULL group. Furthermore, your WHERE clause turns the LEFT JOIN to an INNER JOIN anyway, so I've fixed that. The WHERE clause may or may not be necessary, depending on whether or not cat_name is ever NULL.
If you want the two biggest or smallest -- and can bear to have them in the same column:
SELECT c.`cat_name`,
substring_index(group_concat(id order by id), ',', 2) as ids_2
FROM `info` i INNER JOIN
`category` c
ON i.`cat_id` = c.`cat_id`
WHERE c.`cat_name` IS NOT NULL
GROUP BY c.`cat_name`
ORDER BY c.`cat_name` ASC ;

SELECT id, cat_name
FROM
( SELECT @prev := '', @n := 0 ) init
JOIN
( SELECT @n := if(c.cat_name != @prev, 1, @n + 1) AS n,
@prev := c.cat_name,
c.cat_name,
i.id
FROM `info` i
LEFT JOIN `category` c ON i.`cat_id` = c.`cat_id`
ORDER BY c.cat_name ASC, i.id DESC
) x
WHERE n <= 2
ORDER BY cat_name ASC, id DESC;
More discussion in Group-wise-max blog.

In a database that supports window functions (MySQL 8.0 and later does), you can enumerate the position of each record in each group (ROW_NUMBER() OVER (PARTITION BY cat_name ORDER BY id DESC)) and then select those records in relative position 1 or 2.
In older MySQL versions, you can mimic this with a self-join that counts the number of records whose id is greater than or equal to that of a record with the same cat_name (PARTITION ... ORDER BY id DESC). Record #1 in a cat_name group has only one record with an id >= its own, and record #N has N such records.
This query
SELECT id, cat_name
FROM ( SELECT c.id, c.cat_name, COUNT(1) AS relative_position_in_group
FROM category c
LEFT JOIN category others
ON c.cat_name = others.cat_name
AND
c.id <= others.id
GROUP BY 1, 2) d
WHERE relative_position_in_group <= 2
ORDER BY cat_name, id DESC;
produces:
+----+----------+
| id | cat_name |
+----+----------+
| 6 | Cat-1 |
| 4 | Cat-1 |
| 7 | Cat-2 |
| 5 | Cat-2 |
+----+----------+
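The counting self-join can be tried without a MySQL server. Here is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for MySQL. The table name sample and the function top_two_per_group are made up for illustration, and the rows are the sample data from the question.

```python
import sqlite3

# Sample data from the question: (id, cat_name).
ROWS = [(1, "Cat-1"), (2, "Cat-1"), (3, "Cat-2"), (4, "Cat-1"),
        (5, "Cat-2"), (6, "Cat-1"), (7, "Cat-2")]

# For each row, count the rows of the same cat_name whose id is >= its own.
# That count is the row's position when the group is sorted by id DESC.
QUERY = """
SELECT id, cat_name
FROM (SELECT c.id, c.cat_name, COUNT(*) AS relative_position_in_group
      FROM sample c
      JOIN sample others
        ON c.cat_name = others.cat_name
       AND c.id <= others.id
      GROUP BY c.id, c.cat_name) d
WHERE relative_position_in_group <= 2
ORDER BY cat_name, id DESC
"""


def top_two_per_group(rows):
    """Return the two highest ids of every cat_name (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sample (id INTEGER PRIMARY KEY, cat_name TEXT)")
    conn.executemany("INSERT INTO sample VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in top_two_per_group(ROWS):
        print(row)
```

Every row matches itself through c.id <= others.id, so a plain JOIN is enough here and the count never drops below 1.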

Your query is like SELECT id, cat_name FROM mytable GROUP BY cat_name. Change it to SELECT SUBSTRING_INDEX(GROUP_CONCAT(id), ',', 2), cat_name FROM mytable GROUP BY cat_name and you will get output like this:
id cat_name
1,2 Cat-1
3,5 Cat-2
Does it help?

The only thing you need is to add a LIMIT clause to the end of your query and order in descending order, as shown below:
SELECT `id`, `category`.`cat_name`
FROM `info`
LEFT JOIN `category` ON `info`.`cat_id` = `category`.`cat_id`
WHERE `category`.`cat_name` IS NOT NULL
GROUP BY `category`.`cat_name`
ORDER BY `category`.`cat_name` DESC
LIMIT 2

Very simple: group by id. Grouping collapses duplicate data.

I have written a query for you. I hope it resolves your problem:
SELECT
id, cat_name
FROM
(SELECT
*,
@prevcat,
CASE
WHEN cat_name != @prevcat THEN @rownum:=0
END,
@rownum:=@rownum + 1 AS cnt,
@prevcat:=cat_name
FROM
category
CROSS JOIN (SELECT @rownum:=0, @prevcat:='') r
ORDER BY cat_name ASC , id DESC) AS t
WHERE
t.cnt <= 2;

It is better to use the rank function. The sample query below should help you get your output:
select a.* from
(
select a, b, rank() over(partition by b order by a desc) as rnk -- RANK is a reserved word in MySQL 8.0, so the alias is rnk
from a
group by b,a) a
where rnk<=2
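The window-function approach can be checked against the question's sample data. This is only a sketch: it assumes Python's built-in sqlite3 module linked against SQLite 3.25 or later as a stand-in for MySQL 8.0, uses ROW_NUMBER() rather than RANK() (the two agree when ids are unique), and the names sample, rn and top_n_per_group are invented for the example.

```python
import sqlite3

# Sample data from the question: (id, cat_name).
ROWS = [(1, "Cat-1"), (2, "Cat-1"), (3, "Cat-2"), (4, "Cat-1"),
        (5, "Cat-2"), (6, "Cat-1"), (7, "Cat-2")]

# Number the rows of each cat_name from the highest id down, then keep
# the first n of every group.
QUERY = """
SELECT id, cat_name
FROM (SELECT id, cat_name,
             ROW_NUMBER() OVER (PARTITION BY cat_name ORDER BY id DESC) AS rn
      FROM sample) ranked
WHERE rn <= ?
ORDER BY cat_name, id DESC
"""


def top_n_per_group(rows, n=2):
    """Return the n highest ids of every cat_name (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sample (id INTEGER PRIMARY KEY, cat_name TEXT)")
    conn.executemany("INSERT INTO sample VALUES (?, ?)", rows)
    result = conn.execute(QUERY, (n,)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in top_n_per_group(ROWS):
        print(row)
```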

Please try this; it worked on the sample data given:
SELECT `id`, `category`.`cat_name`
FROM `info`
LEFT JOIN `category` ON `info`.`cat_id` = `category`.`cat_id`
WHERE `category`.`cat_name` IS NOT NULL and (SELECT count(*)
FROM info t
WHERE t.id>=info.id and t.cat_id=category.cat_id )<3
GROUP BY `category`.`cat_name`,id
ORDER BY `category`.`cat_name` ASC

Well, it's pretty ugly, but it looks like it works.
select
cat_name,
max(id) as maxid
from
table1
group by cat_name
union all
select
cat_name,
max(id) as maxid
from
table1
where not exists
(select
cat_name,
maxid
from
(select cat_name,max(id) as maxid from table1 group by cat_name) t
where t.cat_name = table1.cat_name and t.maxid = table1.id)
group by cat_name
order by cat_name
SQLFiddle
Basically, it takes the max for each cat_name, and then unions that to a second query that excludes the actual max id for each cat_name, allowing you to get the second largest id for each. Hopefully all that made sense...

select id, cat_name from
(
select @rank:=if(@prev_cat=cat_name,@rank+1,1) as rank,
id,cat_name,@prev_cat:=cat_name
from Table1,(select @rank:=0, @prev_cat:="")t
order by cat_name, id desc
) temp
where temp.rank<=2
You may verify result at http://sqlfiddle.com/#!9/acd1b/7

Related

SQL/MySQL DELETE all rows EXCEPT 2 of them

I have a database table setup like this:
id | code | group_id | status
---|-------|----------|----------
1 | abcd1 | group_1 | available
2 | abcd2 | group_1 | available
3 | adsd3 | group_1 | available
4 | dfgd4 | group_1 | available
5 | vfcd5 | group_1 | available
6 | bgcd6 | group_2 | available
7 | abcd7 | group_2 | available
8 | ahgf8 | group_2 | available
9 | dfgd9 | group_2 | available
10 | qwer6 | group_2 | available
In the example above, each group_id has 5 total rows (arbitrary for example, total rows will be dynamic and vary), I need to remove every row that matches available in status except for 2 of them (which 2 does not matter, as long as there are 2 of them remaining)
Basically every unique group_id should only have 2 total rows with status of available. I am able to do a simple SQL query to remove all of them, but struggling to come up with a SQL query to remove all except for 2 ... please helppppp :)

If code is unique, you can use subqueries to keep the "min" and "max"
DELETE FROM t
WHERE t.status = 'available'
AND (t.group_id, t.code) NOT IN (
SELECT group_id, MAX(code)
FROM t
WHERE status = 'available'
GROUP BY group_id
)
AND (t.group_id, t.code) NOT IN (
SELECT group_id, MIN(code)
FROM t
WHERE status = 'available'
GROUP BY group_id
)
Similarly, with an auto increment id:
DELETE FROM t
WHERE t.status = 'available'
AND t.id NOT IN (
SELECT MAX(id) FROM t WHERE status = 'available' GROUP BY group_id
UNION
SELECT MIN(id) FROM t WHERE status = 'available' GROUP BY group_id
)
I reworked the subquery into a UNION instead in this version, but the "AND" format would work just as well too. Also, if "code" was unique across the whole table, the NOT IN could be simplified down to excluding the group_id as well (though it would still be needed in the subqueries' GROUP BY clauses).
Edit: MySQL doesn't like subqueries referencing tables being UPDATEd/DELETEd in the WHERE of the query doing the UPDATE/DELETE; in those cases, you can usually double-wrap the subquery to give it an alias, causing MySQL to treat it as a temporary table (behind the scenes).
DELETE FROM t
WHERE t.status = 'available'
AND t.id NOT IN (
SELECT * FROM (
SELECT MAX(id) FROM t WHERE status = 'available' GROUP BY group_id
UNION
SELECT MIN(id) FROM t WHERE status = 'available' GROUP BY group_id
) AS a
)
Another alternative, I don't recall if MySQL complains as much about joins in DELETE/UPDATE....
DELETE t
FROM t
LEFT JOIN (
SELECT MIN(id) AS minId, MAX(id) AS maxId, 1 AS keep_flag
FROM t
WHERE status = 'available'
GROUP BY group_id
) AS tKeep ON t.id IN (tKeep.minId, tKeep.maxId)
WHERE t.status = 'available'
AND tKeep.keep_flag IS NULL
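To see which rows survive the "keep MIN(id) and MAX(id) per group" delete, here is a small sketch against the question's table. It assumes Python's built-in sqlite3 module as a stand-in for MySQL. SQLite does not raise MySQL's Error 1093, so the plain NOT IN form is used here; on MySQL the double-wrapped or join variants are the ones to reach for. The helper names build_table, delete_all_but_two and remaining_ids are hypothetical.

```python
import sqlite3

# Codes from the question; ids 1-5 belong to group_1, ids 6-10 to group_2.
CODES = ["abcd1", "abcd2", "adsd3", "dfgd4", "vfcd5",
         "bgcd6", "abcd7", "ahgf8", "dfgd9", "qwer6"]


def build_table():
    """Create the question's table in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (id INTEGER PRIMARY KEY, code TEXT, "
        "group_id TEXT, status TEXT)"
    )
    rows = [(i + 1, code, "group_1" if i < 5 else "group_2", "available")
            for i, code in enumerate(CODES)]
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)
    return conn


def delete_all_but_two(conn):
    """Delete 'available' rows except the lowest and highest id per group."""
    conn.execute("""
        DELETE FROM t
        WHERE status = 'available'
          AND id NOT IN (
              SELECT MIN(id) FROM t WHERE status = 'available' GROUP BY group_id
              UNION
              SELECT MAX(id) FROM t WHERE status = 'available' GROUP BY group_id
          )
    """)


def remaining_ids(conn):
    return [row[0] for row in conn.execute("SELECT id FROM t ORDER BY id")]


if __name__ == "__main__":
    conn = build_table()
    delete_all_but_two(conn)
    print(remaining_ids(conn))
```

A group with a single available row keeps that one row, since its MIN(id) and MAX(id) are the same.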

To keep the min and max ids, I think a join is the simplest solution:
DELETE t
FROM t LEFT JOIN
(SELECT group_id, MIN(id) as min_id, MAX(id) as max_id
FROM t
WHERE t.status = 'available'
GROUP BY group_id
) tt
ON t.id IN (tt.min_id, tt.max_id)
WHERE t.status = 'available' AND
tt.group_id IS NULL;

If the column "id" is the PRIMARY KEY or a UNIQUE KEY, then we could use a correlated subquery to get the second lowest value for a particular group_id.
We could then use that to identify rows for group_id that have higher values of the "id" column.
A query something like this:
SELECT t.`id`
, t.`group_id`
FROM `setup_like_this` t
WHERE t.`status` = 'available'
AND t.`id`
> ( SELECT s.`id`
FROM `setup_like_this` s
WHERE s.`status` = 'available'
AND s.`group_id` = t.`group_id`
ORDER
BY s.`id`
LIMIT 1,1
)
We test that as a SELECT first, to examine the rows that are returned. When we are satisfied this query is returning the set of rows we want to delete, we can replace SELECT ... FROM with DELETE t.* FROM to convert it to a DELETE statement to remove the rows.
Converting this directly to a DELETE statement runs into MySQL Error 1093, because the table being deleted from is also referenced in the subquery.
One workaround is to make the query above into an inline view, and then join it to the target table:
DELETE q.*
FROM `setup_like_this` q
JOIN ( -- inline view, query from above returns `id` of rows we want to delete
SELECT t.`id`
, t.`group_id`
FROM `setup_like_this` t
WHERE t.`status` = 'available'
AND t.`id`
> ( SELECT s.`id`
FROM `setup_like_this` s
WHERE s.`status` = 'available'
AND s.`group_id` = t.`group_id`
ORDER
BY s.`id`
LIMIT 1,1
)
) r
ON r.id = q.id

select id, code, group_id, status
from (
select id, code, group_id, status
, ROW_NUMBER() OVER (
PARTITION BY group_id
ORDER BY id DESC) AS rownum
from a
) q
where rownum < 3

How select rows fully completed in IN-query?

We have a table with two columns - art_id and cat_id.
We need to select rows WHERE cat_id = 12, 13, 15
I tried to do something:
SELECT art_id
FROM table
WHERE cat_id IN (12,13,15)
GROUP BY art_id
HAVING COUNT(cat_id) > 2
but this query also selects art_id = 4 and 9.
Expected Output:
art_id = 1 and 7

Assuming you do not know the art_ids beforehand, perhaps this?
SELECT art_id, GROUP_CONCAT(cat_id SEPARATOR ',') as concatenated
FROM table
GROUP BY art_id
HAVING concatenated = '12,13,15'
If the sequence in each group can be different (e.g. 13,12,15), then you'll need to sort the combinations too.
SELECT art_id,
GROUP_CONCAT(DISTINCT cat_id ORDER BY cat_id ASC SEPARATOR ',') AS concatenated
FROM table
GROUP BY art_id
HAVING concatenated = '12,13,15'

I hope this will work for you
SELECT art_id
FROM table
WHERE cat_id IN (12,13,15)
AND (art_id = 4 OR art_id = 7)
GROUP BY art_id
HAVING COUNT(cat_id) > 2

You can try this, but it works only if there are no repeated (art_id, cat_id) pairs; e.g. it doesn't work if you run INSERT INTO T1 VALUES (9,15) twice on your sample data.
SELECT DISTINCT T1.art_id
FROM T1
INNER JOIN (SELECT art_id, COUNT(*) AS RC
FROM T1
WHERE CAT_ID IN (12,13,15)
GROUP BY art_id) X ON T1.art_id = X.art_id
WHERE X.RC>2
Output:
art_id
-----------
1
7
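The duplicate-pair problem mentioned above can be sidestepped by counting distinct categories instead of rows. This is a swapped-in variation on the asker's query, not something taken from the answers, and it is only a sketch: it assumes Python's built-in sqlite3 module as a stand-in for MySQL. The question did not include its data, so the rows below, the table name art_cat and the function articles_with_all are all made up for illustration.

```python
import sqlite3

# Hypothetical data: (art_id, cat_id). Articles 4 and 9 contain repeated
# pairs, so they reach three matching rows without having all categories.
ROWS = [(1, 12), (1, 13), (1, 15),
        (4, 12), (4, 13), (4, 13),
        (7, 12), (7, 13), (7, 15), (7, 20),
        (9, 15), (9, 15), (9, 15)]


def articles_with_all(rows, wanted):
    """Return art_ids that have every cat_id in wanted (hypothetical helper)."""
    wanted = sorted(set(wanted))
    placeholders = ", ".join("?" for _ in wanted)
    # COUNT(DISTINCT cat_id) ignores repeated (art_id, cat_id) pairs.
    query = f"""
        SELECT art_id
        FROM art_cat
        WHERE cat_id IN ({placeholders})
        GROUP BY art_id
        HAVING COUNT(DISTINCT cat_id) = ?
        ORDER BY art_id
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE art_cat (art_id INTEGER, cat_id INTEGER)")
    conn.executemany("INSERT INTO art_cat VALUES (?, ?)", rows)
    result = [r[0] for r in conn.execute(query, wanted + [len(wanted)])]
    conn.close()
    return result


if __name__ == "__main__":
    print(articles_with_all(ROWS, [12, 13, 15]))
```

Unlike the GROUP_CONCAT comparison, this also accepts an article that has extra categories beyond the three requested, such as article 7 in the made-up data.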

Mysql - Get the difference between two sequential values

I want to get the difference between two sequential values from my table.
| id | count |
| 1 | 1 |
| 2 | 7 |
| 3 | 9 |
| 4 | 3 |
| 5 | 7 |
| 6 | 9 |
For example the difference between
id2-id1 = 6,
id3-id2 = 2,
...
How can I do it? SELECT SUM(id(x+1) - id(x)) FROM table1

You can use a subquery to find count for the preceding id.
In case there are no gaps in the ID column:
SELECT CONCAT(t.`id` ,' - ', t.`id` - 1) AS `IDs`
, t.`count` - (SELECT `count`
FROM `tbl`
WHERE `id` = t.`id` - 1) AS `Difference`
FROM `tbl` t
WHERE t.`id` > 1
SQLFiddle
In case there are gaps in the ID column:
First solution, using ORDER BY <...> DESC with LIMIT 1:
SELECT CONCAT(t.id ,' - ', (SELECT `id` FROM tbl WHERE t.id > id ORDER BY id DESC LIMIT 1)) AS IDs
, t.`count` - (SELECT `count`
FROM tbl
WHERE t.id > id
ORDER BY id DESC
LIMIT 1) AS difference
FROM tbl t
WHERE t.id > 1;
SQLFiddle
Second solution, using another subquery to find count with the MAX(id) less than current id:
SELECT CONCAT(t.id ,' - ', (SELECT MAX(`id`) FROM tbl WHERE id < t.id)) AS IDs
, t.`count` - (SELECT `count`
FROM tbl
WHERE `id` = (SELECT MAX(`id`)
FROM tbl
WHERE id < t.id)
) AS difference
FROM tbl t
WHERE t.id > 1;
SQLFiddle
P.S. : First column, IDs, is just for readability, you can omit it or change completely, if it is necessary.
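The MAX(id)-based lookup of the previous row can be exercised on both gap-free and gapped ids. A minimal sketch follows, assuming Python's built-in sqlite3 module as a stand-in for MySQL. The function name consecutive_differences and the gapped rows used at the end are hypothetical, and the filter uses MIN(id) rather than the literal 1 so that the first id does not have to be 1.

```python
import sqlite3

# Data from the question: (id, count).
ROWS = [(1, 1), (2, 7), (3, 9), (4, 3), (5, 7), (6, 9)]

# For every row except the first, find the largest id below the current one
# and subtract that row's count from the current count.
QUERY = '''
SELECT t.id,
       (SELECT MAX(id) FROM tbl WHERE id < t.id) AS prev_id,
       t."count" - (SELECT "count"
                    FROM tbl
                    WHERE id = (SELECT MAX(id) FROM tbl WHERE id < t.id)
                   ) AS difference
FROM tbl t
WHERE t.id > (SELECT MIN(id) FROM tbl)
ORDER BY t.id
'''


def consecutive_differences(rows):
    """Return (id, previous id, difference) tuples (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE tbl (id INTEGER PRIMARY KEY, "count" INTEGER)')
    conn.executemany("INSERT INTO tbl VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(consecutive_differences(ROWS))
    # Made-up rows with gaps in the id column.
    print(consecutive_differences([(1, 1), (4, 7), (9, 9)]))
```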

If you know that the ids have no gaps, then just use a join:
select t.*, (tnext.count - t.count) as diff
from table t join
table tnext
on t.id = tnext.id - 1;
If you just want the sum of the differences, then that is the same as the last value minus the first value (all the intermediate values cancel out in the summation). You can do this with limit:
select last.count - first.count
from (select t.* from table t order by id limit 1) as first cross join
(select t.* from table t order by id desc limit 1) as last;
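The cancellation argument (the sum of all consecutive differences equals the last value minus the first) is easy to confirm on the question's data. The sketch below assumes Python's built-in sqlite3 module as a stand-in for MySQL; tbl, sum_of_differences and sum_pairwise are illustrative names, not anything from the answer.

```python
import sqlite3

# Data from the question: (id, count).
ROWS = [(1, 1), (2, 7), (3, 9), (4, 3), (5, 7), (6, 9)]


def sum_of_differences(rows):
    """Last count minus first count, computed in SQL (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE tbl (id INTEGER PRIMARY KEY, "count" INTEGER)')
    conn.executemany("INSERT INTO tbl VALUES (?, ?)", rows)
    (total,) = conn.execute('''
        SELECT (SELECT "count" FROM tbl ORDER BY id DESC LIMIT 1)
             - (SELECT "count" FROM tbl ORDER BY id LIMIT 1)
    ''').fetchone()
    conn.close()
    return total


def sum_pairwise(rows):
    """The same sum computed the long way, one neighbouring pair at a time."""
    counts = [count for _, count in sorted(rows)]
    return sum(b - a for a, b in zip(counts, counts[1:]))


if __name__ == "__main__":
    print(sum_of_differences(ROWS), sum_pairwise(ROWS))
```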

Try this:
SELECT MAX(count)-MIN(count) diff FROM table1 WHERE id IN(1,2)
Or this way
SELECT 2*STD(count) diff FROM table1 WHERE id IN(1,2)

This works even if the ids have gaps between them:
SELECT *,
((SELECT value FROM example e2 WHERE e2.id > e1.id ORDER BY id ASC LIMIT 1) - value) as diff
FROM example e1;

Mysql select last 2 elements ascending followed by 1st element

I want to select the last two elements in ascending order followed by the first element. Here is my code
SELECT products.*, locations.logo FROM
(SELECT products.* FROM
(SELECT products.* FROM products AS products ORDER BY products.id DESC )
AS products LEFT JOIN users ON users.id=products.userid WHERE users.hide=0)
AS products LEFT JOIN locations ON products.location=locations.id LIMIT 2
UNION SELECT products.*, locations.logo FROM
(SELECT products.* FROM
(SELECT products.* FROM products AS products ORDER BY products.id ASC )
AS products LEFT JOIN users ON users.id=products.userid WHERE users.hide=0)
AS products LEFT JOIN locations ON products.location=locations.id LIMIT 3
E.g. for 20 products now I'm getting
20, 19, 1 (ordered by id).
I'm trying to get 19, 20, 1.
At the moment the above statement works as in the example. I know I have to add an ORDER BY clause, but I don't know where, because in my trials I'm getting the error
"Incorrect usage of UNION and ORDER BY"
Can anybody help me with that?

You can do something like this
SELECT id
FROM
(
(
SELECT id, 0 sort_order
FROM Table1
ORDER BY id DESC
LIMIT 2
)
UNION ALL
(
SELECT id, 1 sort_order
FROM Table1
ORDER BY id
LIMIT 1
)
) q
ORDER BY sort_order, id
Output:
| ID |
|----|
| 19 |
| 20 |
| 1 |
Here is SQLFiddle demo
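The sort_order trick can be reproduced locally. Treat this as a sketch under assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, a made-up products table holding ids 1 to 20, and a hypothetical function last_two_then_first. SQLite does not accept parenthesized SELECTs as UNION members, so each branch is wrapped in a derived table instead; the idea is otherwise the same.

```python
import sqlite3

# Each branch carries a constant sort_order so the outer ORDER BY can put
# the "last two" block before the "first one" block.
QUERY = """
SELECT id
FROM (
    SELECT id, sort_order
    FROM (SELECT id, 0 AS sort_order FROM products ORDER BY id DESC LIMIT 2)
    UNION ALL
    SELECT id, sort_order
    FROM (SELECT id, 1 AS sort_order FROM products ORDER BY id LIMIT 1)
) q
ORDER BY sort_order, id
"""


def last_two_then_first(ids):
    """Return the two highest ids ascending, then the lowest id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO products VALUES (?)", [(i,) for i in ids])
    result = [row[0] for row in conn.execute(QUERY)]
    conn.close()
    return result


if __name__ == "__main__":
    print(last_two_then_first(range(1, 21)))
```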

Fetch 2nd Highest value from MySql DB with GROUP BY

I have a table tbl_patient and I want to fetch the last 2 visits of each patient in order to compare whether the patient's condition is improving or degrading.
tbl_patient
id | patient_ID | visit_ID | patient_result
1 | 1 | 1 | 5
2 | 2 | 1 | 6
3 | 2 | 3 | 7
4 | 1 | 2 | 3
5 | 2 | 3 | 2
6 | 1 | 3 | 9
I tried the query below to fetch the last visit of each patient:
SELECT MAX(id), patient_result FROM `tbl_patient` GROUP BY `patient_ID`
Now I want to fetch the 2nd last visit of each patient with a query, but it gives me an error
(#1242 - Subquery returns more than 1 row)
SELECT id, patient_result FROM `tbl_patient` WHERE id <(SELECT MAX(id) FROM `tbl_patient` GROUP BY `patient_ID`) GROUP BY `patient_ID`
Where am I wrong?

select p1.patient_id, p2.maxid id1, max(p1.id) id2
from tbl_patient p1
join (select patient_id, max(id) maxid
from tbl_patient
group by patient_id) p2
on p1.patient_id = p2.patient_id and p1.id < p2.maxid
group by p1.patient_id
id1 is the ID of the last visit, id2 is the ID of the 2nd to last visit.

Your first query doesn't get the last visits, since it gives results 5 and 6 instead of 2 and 9.
You can try this query:
SELECT patient_ID,visit_ID,patient_result
FROM tbl_patient
where id in (
select max(id)
from tbl_patient
GROUP BY patient_ID)
union
SELECT patient_ID,visit_ID,patient_result
FROM tbl_patient
where id in (
select max(id)
from tbl_patient
where id not in (
select max(id)
from tbl_patient
GROUP BY patient_ID)
GROUP BY patient_ID)
order by 1,2

SELECT id, patient_result FROM `tbl_patient` t1
JOIN (SELECT MAX(id) as max, patient_ID FROM `tbl_patient` GROUP BY `patient_ID`) t2
ON t1.patient_ID = t2.patient_ID
WHERE id <max GROUP BY t1.`patient_ID`

There are a couple of approaches to getting the specified resultset returned in a single SQL statement.
Unfortunately, most of those approaches yield rather unwieldy statements.
The more elegant looking statements tend to come with poor (or unbearable) performance when dealing with large sets. And the statements that tend to have better performance are more un-elegant looking.
Three of the most common approaches make use of:
correlated subquery
inequality join (nearly a Cartesian product)
two passes over the data
Here's an approach that uses two passes over the data, using MySQL user variables, which basically emulates the analytic RANK() OVER(PARTITION ...) function available in other DBMS:
SELECT t.id
, t.patient_id
, t.visit_id
, t.patient_result
FROM (
SELECT p.id
, p.patient_id
, p.visit_id
, p.patient_result
, @rn := if(@prev_patient_id = patient_id, @rn + 1, 1) AS rn
, @prev_patient_id := patient_id AS prev_patient_id
FROM tbl_patients p
JOIN (SELECT @rn := 0, @prev_patient_id := NULL) i
ORDER BY p.patient_id DESC, p.id DESC
) t
WHERE t.rn <= 2
Note that this involves an inline view, which means there's going to be a pass over all the data in the table to create a "derived table". Then, the outer query will run against the derived table. So, this is essentially two passes over the data.
This query can be tweaked a bit to improve performance, by eliminating the duplicated value of the patient_id column returned by the inline view. But I show it as above, so we can better understand what is happening.
This approach can be rather expensive on large sets, but is generally MUCH more efficient than some of the other approaches.
Note also that this query will return a row for a patient_id even if only one id value exists for that patient; it does not restrict the return to just those patients that have at least two rows.
It's also possible to get an equivalent resultset with a correlated subquery:
SELECT t.id
, t.patient_id
, t.visit_id
, t.patient_result
FROM tbl_patients t
WHERE ( SELECT COUNT(1) AS cnt
FROM tbl_patients p
WHERE p.patient_id = t.patient_id
AND p.id >= t.id
) <= 2
ORDER BY t.patient_id ASC, t.id ASC
Note that this is making use of a "dependent subquery", which basically means that for each row returned from t, MySQL is effectively running another query against the database. So, this will tend to be very expensive (in terms of elapsed time) on large sets.
As another approach, if there are relatively few id values for each patient, you might be able to get by with an inequality join:
SELECT t.id
, t.patient_id
, t.visit_id
, t.patient_result
FROM tbl_patients t
LEFT
JOIN tbl_patients p
ON p.patient_id = t.patient_id
AND t.id <= p.id
GROUP
BY t.id
, t.patient_id
, t.visit_id
, t.patient_result
HAVING COUNT(1) <= 2
Note that this will create a nearly Cartesian product for each patient. For a limited number of id values for each patient, this won't be too bad. But if a patient has hundreds of id values, the intermediate result can be huge, on the order of O(n**2).
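The correlated-subquery and inequality-join approaches can be compared on the question's six rows. Here is a sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL. The function names are hypothetical, and the join version uses <= so that every row matches itself and the per-row count lines up with the subquery's count.

```python
import sqlite3

# Rows from the question: (id, patient_id, visit_id, patient_result).
ROWS = [(1, 1, 1, 5), (2, 2, 1, 6), (3, 2, 3, 7),
        (4, 1, 2, 3), (5, 2, 3, 2), (6, 1, 3, 9)]

CORRELATED = """
SELECT t.id, t.patient_id, t.visit_id, t.patient_result
FROM tbl_patient t
WHERE (SELECT COUNT(*)
       FROM tbl_patient p
       WHERE p.patient_id = t.patient_id
         AND p.id >= t.id) <= 2
ORDER BY t.patient_id, t.id
"""

INEQUALITY_JOIN = """
SELECT t.id, t.patient_id, t.visit_id, t.patient_result
FROM tbl_patient t
JOIN tbl_patient p
  ON p.patient_id = t.patient_id
 AND t.id <= p.id
GROUP BY t.id, t.patient_id, t.visit_id, t.patient_result
HAVING COUNT(*) <= 2
ORDER BY t.patient_id, t.id
"""


def run(query, rows=ROWS):
    """Load rows into an in-memory table and run one of the queries."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl_patient (id INTEGER PRIMARY KEY, patient_id INTEGER, "
        "visit_id INTEGER, patient_result INTEGER)"
    )
    conn.executemany("INSERT INTO tbl_patient VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(query).fetchall()
    conn.close()
    return result


def last_two_correlated():
    return run(CORRELATED)


def last_two_join():
    return run(INEQUALITY_JOIN)


if __name__ == "__main__":
    print(last_two_correlated())
    print(last_two_join())
```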

Try this..
SELECT id, patient_result FROM tbl_patient AS tp WHERE id < ((SELECT MAX(id) FROM tbl_patient AS tp_max WHERE tp_max.patient_ID = tp.patient_ID) - 1) GROUP BY patient_ID

Why not use simply...
GROUP BY `patient_ID` DESC LIMIT 2
... and do the rest in the next step?
