I'm trying to write a SQL query to do the following:
Given the following table:
+----+----------+-----------+
| id | group_id | value |
+----+----------+-----------+
| 1 | 1 | 0 |
+----+----------+-----------+
| 2 | 1 | 0 |
+----+----------+-----------+
| 3 | 2 | null |
+----+----------+-----------+
| 4 | 3 | -1 |
+----+----------+-----------+
| 5 | 3 | 1 |
+----+----------+-----------+
| 6 | 4 | something |
+----+----------+-----------+
| 7 | 5 | something |
+----+----------+-----------+
select *
where values do not equal each other
group by group_id
For this example, output should be:
+----+----------+-----------+
| id | group_id | value |
+----+----------+-----------+
| 4 | 3 | -1 |
+----+----------+-----------+
| 5 | 3 | 1 |
+----+----------+-----------+
Does anyone know if this is possible?
If you only want to find the group_id values whose rows have differing values, you can use this query:
SELECT group_id
FROM data
GROUP BY group_id
HAVING MIN(value) != MAX(value)
Output for your sample data:
group_id
3
If you want to get the rows that are associated with that group_id, use the above query as a subquery for an IN expression:
SELECT *
FROM data
WHERE group_id IN (
SELECT group_id
FROM data
GROUP BY group_id
HAVING MIN(value) != MAX(value)
)
Output
id group_id value
4 3 -1
5 3 1
Demo on dbfiddle
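To try the MIN/MAX approach without a database server, here is a minimal sketch that runs the same query against the sample data. It assumes SQLite through Python's sqlite3 module instead of the asker's actual database, and the function name is made up for illustration. Note that MIN and MAX skip NULLs, so group 2 (a single NULL value) is not reported.

```python
import sqlite3

def groups_with_differing_values():
    # In-memory SQLite database; an assumption, the question does not name the engine.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data (id INTEGER PRIMARY KEY, group_id INTEGER, value)")
    conn.executemany(
        "INSERT INTO data VALUES (?, ?, ?)",
        [
            (1, 1, 0), (2, 1, 0), (3, 2, None), (4, 3, -1), (5, 3, 1),
            (6, 4, "something"), (7, 5, "something"),
        ],
    )
    rows = conn.execute(
        """
        SELECT * FROM data
        WHERE group_id IN (
            SELECT group_id FROM data
            GROUP BY group_id
            HAVING MIN(value) != MAX(value)  -- NULL != NULL is not true, so group 2 drops out
        )
        ORDER BY id
        """
    ).fetchall()
    conn.close()
    return rows

print(groups_with_differing_values())  # [(4, 3, -1), (5, 3, 1)]
```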
You can use EXISTS to get the group_id values whose rows do not all share the same value. Here is the demo.
select
distinct group_id
from data d1
where exists
(
select
group_id
from data d2
where d1.group_id = d2.group_id
and d1.value <> d2.value
)
Output
*--------*
|group_id|
*--------*
| 3 |
*--------*
If you want both group_id and value, try the following:
select
group_id,
value
from data d1
where exists
(
select
group_id
from data d2
where d1.group_id = d2.group_id
and d1.value <> d2.value
)
Output:
*------------------*
|group_id | value |
*------------------*
| 3 | -1 |
| 3 | 1 |
*------------------*
This query:
select group_id
from tablename
group by group_id
having count(distinct value) > 1 and count(distinct value) = count(value)
returns the group_ids that you want, so use it with the operator IN:
select * from tablename
where group_id in (
select group_id
from tablename
group by group_id
having count(distinct value) > 1 and count(distinct value) = count(value)
)
See the demo.
Results:
| id | group_id | value |
| --- | -------- | ----- |
| 4 | 3 | -1 |
| 5 | 3 | 1 |
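The three approaches above agree on the sample data, but they are not equivalent once a group repeats a value. The sketch below compares them on a small made-up table (the rows and the function name are assumptions for illustration, and it uses SQLite via Python's sqlite3). Group 1 holds the values 1, 1, 2: the MIN/MAX and EXISTS queries report it, while the `count(distinct value) = count(value)` condition rejects it because not every value in the group is unique.

```python
import sqlite3

def compare_approaches():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data (id INTEGER PRIMARY KEY, group_id INTEGER, value INTEGER)")
    conn.executemany(
        "INSERT INTO data VALUES (?, ?, ?)",
        [
            (1, 1, 1), (2, 1, 1), (3, 1, 2),  # group 1: mixed, with a repeated value
            (4, 2, 5), (5, 2, 6),             # group 2: all values distinct
            (6, 3, 7), (7, 3, 7),             # group 3: all values equal
        ],
    )
    queries = {
        "min_max": """
            SELECT group_id FROM data
            GROUP BY group_id
            HAVING MIN(value) != MAX(value)
            ORDER BY group_id""",
        "exists": """
            SELECT DISTINCT group_id FROM data d1
            WHERE EXISTS (
                SELECT 1 FROM data d2
                WHERE d1.group_id = d2.group_id AND d1.value <> d2.value)
            ORDER BY group_id""",
        "count_distinct": """
            SELECT group_id FROM data
            GROUP BY group_id
            HAVING COUNT(DISTINCT value) > 1
               AND COUNT(DISTINCT value) = COUNT(value)
            ORDER BY group_id""",
    }
    result = {name: [row[0] for row in conn.execute(sql)] for name, sql in queries.items()}
    conn.close()
    return result

print(compare_approaches())
# {'min_max': [1, 2], 'exists': [1, 2], 'count_distinct': [2]}
```

Which behaviour is right depends on whether "values do not equal each other" means "at least two differ" or "all differ pairwise".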
Related
My table has duplicate row values in specific columns. I would like to remove those rows and keep only the row with the latest id.
The columns I want to check and compare are:
sub_id, spec_id, ex_time
so, for this table
+----+--------+---------+---------+-------+
| id | sub_id | spec_id | ex_time | count |
+----+--------+---------+---------+-------+
| 1 | 100 | 444 | 09:29 | 2 |
| 2 | 101 | 555 | 10:01 | 10 |
| 3 | 100 | 444 | 09:29 | 23 |
| 4 | 200 | 321 | 05:15 | 5 |
| 5 | 100 | 444 | 09:29 | 8 |
| 6 | 101 | 555 | 10:01 | 1 |
+----+--------+---------+---------+-------+
I would like to get this result:
+----+--------+---------+---------+-------+
| id | sub_id | spec_id | ex_time | count |
+----+--------+---------+---------+-------+
| 5 | 100 | 444 | 09:29 | 8 |
| 6 | 101 | 555 | 10:01 | 1 |
+----+--------+---------+---------+-------+
I was able to build this query to select all rows that are duplicated across multiple columns, based on another question:
select t.*
from mytable t join
(select id, sub_id, spec_id, ex_time, count(*) as NumDuplicates
from mytable
group by sub_id, spec_id, ex_time
having NumDuplicates > 1
) tsum
on t.sub_id = tsum.sub_id and t.spec_id = tsum.spec_id and t.ex_time = tsum.ex_time
But now I'm not sure how to wrap this select in a delete query that deletes all rows except the ones with the highest id.
You can modify your sub-select query to get the maximum id for each combination of duplicated values.
Then, when joining to the main table, add a condition that the id must not equal that maximum id.
The join result then contains exactly the rows to delete.
Try the following:
DELETE t
FROM mytable AS t
JOIN
(SELECT MAX(id) as max_id,
sub_id,
spec_id,
ex_time,
COUNT(*) as NumDuplicates
FROM mytable
GROUP BY sub_id, spec_id, ex_time
HAVING NumDuplicates > 1
) AS tsum
ON t.sub_id = tsum.sub_id AND
t.spec_id = tsum.spec_id AND
t.ex_time = tsum.ex_time AND
t.id <> tsum.max_id
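The multi-table `DELETE t FROM ... JOIN` form is MySQL syntax. As a way to check the idea on the sample data, here is a rough sketch in SQLite (via Python's sqlite3), which does not support that form, so it swaps in an equivalent `NOT IN (SELECT MAX(id) ...)` filter. The function name is hypothetical. Note that row 4 survives as well, since it has no duplicate to lose against.

```python
import sqlite3

def dedupe_keep_latest():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE mytable (id INTEGER PRIMARY KEY, sub_id INTEGER, '
        'spec_id INTEGER, ex_time TEXT, "count" INTEGER)'
    )
    conn.executemany(
        "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
        [
            (1, 100, 444, "09:29", 2),
            (2, 101, 555, "10:01", 10),
            (3, 100, 444, "09:29", 23),
            (4, 200, 321, "05:15", 5),
            (5, 100, 444, "09:29", 8),
            (6, 101, 555, "10:01", 1),
        ],
    )
    # Keep only the highest id of every (sub_id, spec_id, ex_time) combination.
    # This is a SQLite-friendly substitute for the MySQL DELETE ... JOIN above.
    conn.execute(
        """
        DELETE FROM mytable
        WHERE id NOT IN (
            SELECT MAX(id) FROM mytable
            GROUP BY sub_id, spec_id, ex_time
        )
        """
    )
    remaining = [row[0] for row in conn.execute("SELECT id FROM mytable ORDER BY id")]
    conn.close()
    return remaining

print(dedupe_keep_latest())  # [4, 5, 6]
```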
I have the following problem:
For each other_ID, I want to update all rows with criteria = 1 when that other_ID also has a row with criteria greater than 1. When it does not, I want to update all of those rows except one.
Dummytable:
+----+----------+----------+-------------+
| id | other_ID | criteria | updatefield |
+----+----------+----------+-------------+
| 1 | 1 | 1 | 0 |
| 2 | 1 | 1 | 0 |
| 3 | 1 | 1234 | 0 |
| 4 | 2 | 2 | 0 |
| 5 | 2 | 1 | 0 |
| 6 | 2 | 1 | 0 |
| 7 | 4 | 20 | 0 |
| 8 | 4 | 1 | 0 |
| 9 | 4 | 60 | 0 |
| 10 | 5 | 1 | 0 |
| 11 | 5 | 1 | 0 |
| 12 | 6 | 5 | 0 |
+----+----------+----------+-------------+
Expected result:
+----+----------+----------+-------------+
| id | other_ID | criteria | updatefield |
+----+----------+----------+-------------+
| 1 | 1 | 1 | 1 |
| 2 | 1 | 1 | 1 |
| 3 | 1 | 1234 | 0 |
| 4 | 2 | 2 | 0 |
| 5 | 2 | 1 | 1 |
| 6 | 2 | 1 | 1 |
| 7 | 4 | 20 | 0 |
| 8 | 4 | 1 | 1 |
| 9 | 4 | 60 | 0 |
| 10 | 5 | 1 | 0 |
| 11 | 5 | 1 | 1 |
| 12 | 6 | 5 | 0 |
+----+----------+----------+-------------+
my idea:
UPDATE pics AS tu SET updatefield=1 WHERE criteria=1 AND (select count(*) as cnt2 from pics where criteria>1 group by other_id)>1;
Error: Table 'tu' is specified twice, both as a target for 'UPDATE' and as a separate source for data
I also have problems getting the right count:
SELECT other_id, count(*) as cnt FROM pics AS ts WHERE criteria=1 and (select count(*) as cnt2 from pics where criteria>1)>0 GROUP BY other_id;
I want to get cnt = 1 for other_id=5, but I get cnt=2.
with
SELECT other_id, COUNT(*) AS cnt2
FROM pics
WHERE criteria>1
GROUP BY other_id;
I get all the other_ids for which I want to update the updatefield. But how can I connect this with the update? And how do I update all rows except one for other_id=5?
You can alias the sub query into another query, e.g.:
UPDATE test
SET updatefield = 1
WHERE updatefield = 0 AND criteria = 1
AND other_id IN (
SELECT a.id FROM (
SELECT other_id AS id
FROM test
WHERE criteria > 1
GROUP BY other_id
HAVING COUNT(*) > 1
) a
);
Here's the SQL Fiddle.
Update
The query above updates the records with updatefield 0 and criteria 1 whose other_id has more than one record with criteria > 1. To update the records where more than one record has criteria = 1, you need something like this:
UPDATE test
SET updatefield = 1
WHERE updatefield = 0 AND criteria = 1
AND id IN (
SELECT a.id FROM (
SELECT MIN(id) AS id
FROM test
WHERE criteria = 1
GROUP BY other_id
HAVING COUNT(*) > 1
) a
);
Thanks to @Darshan Mehta's answer and help, I finally found a solution that works the way I want.
Here's the complete solution:
UPDATE test
SET updatefield = 1
WHERE updatefield = 0 AND criteria = 1
AND id not IN (
SELECT a.id FROM (
SELECT id
FROM test
WHERE criteria>1
) a
)
AND id not IN (
SELECT b.id FROM (
SELECT id
FROM test
GROUP BY other_id
HAVING COUNT(*) = 1
) b
)
AND id NOT IN (
SELECT c.id FROM (
SELECT id
FROM test
WHERE criteria=1 AND other_id NOT IN (
SELECT other_id FROM test WHERE Criteria>1
)
GROUP BY other_id, criteria
HAVING COUNT(criteria)>1
) c
);
Short description:
The first subquery (a) excludes IDs where criteria is greater than 1.
The second subquery (b) excludes IDs that have only one result.
The third subquery (c) excludes IDs where criteria is 1 and there is no higher criteria and, thanks to the grouping, keeps the first result.
The only drawback is that the last subquery (c) keeps the first (usually the oldest) result instead of the newest.
Edit:
To keep the last result, use this for subquery (c) instead:
AND id NOT IN (
SELECT c.id FROM (
SELECT id
FROM test t1
JOIN (SELECT other_id, max(id) maxid
FROM test
GROUP BY other_id) t2
ON t1.other_id=t2.other_id AND t1.id=t2.maxid
WHERE criteria=1 AND t1.other_id NOT IN (
SELECT other_id FROM test WHERE Criteria>1
)
GROUP BY t1.other_id, criteria
) c
);
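For comparison, the rule from the question can also be written as one condition per row: update a criteria = 1 row if its other_id has any row with criteria > 1, or otherwise if it is not the lowest-id criteria = 1 row of its other_id. The sketch below is an alternative formulation, not the solution above, and it assumes SQLite (via Python's sqlite3), which accepts correlated subqueries on the table being updated. MySQL rejects that form with the "specified twice" error, which is why the answers wrap subqueries in derived tables. The names `ROWS` and `updated_ids` are made up for the example.

```python
import sqlite3

# (id, other_id, criteria) taken from the dummy table in the question.
ROWS = [
    (1, 1, 1), (2, 1, 1), (3, 1, 1234), (4, 2, 2), (5, 2, 1), (6, 2, 1),
    (7, 4, 20), (8, 4, 1), (9, 4, 60), (10, 5, 1), (11, 5, 1), (12, 6, 5),
]

def updated_ids():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test (id INTEGER PRIMARY KEY, other_id INTEGER, "
        "criteria INTEGER, updatefield INTEGER DEFAULT 0)"
    )
    conn.executemany("INSERT INTO test (id, other_id, criteria) VALUES (?, ?, ?)", ROWS)
    conn.execute(
        """
        UPDATE test
        SET updatefield = 1
        WHERE criteria = 1
          AND (
            -- the same other_id has a row with a higher criteria: update every match
            EXISTS (SELECT 1 FROM test AS hi
                    WHERE hi.other_id = test.other_id AND hi.criteria > 1)
            -- otherwise: update all but the lowest-id criteria = 1 row
            OR id <> (SELECT MIN(lo.id) FROM test AS lo
                      WHERE lo.other_id = test.other_id AND lo.criteria = 1)
          )
        """
    )
    ids = [row[0] for row in conn.execute("SELECT id FROM test WHERE updatefield = 1 ORDER BY id")]
    conn.close()
    return ids

print(updated_ids())  # [1, 2, 5, 6, 8, 11]
```

To leave the newest row untouched instead of the oldest, `MIN(lo.id)` could be replaced with `MAX(lo.id)`.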
After looking at other examples I still have not been able to find a solution, that is why I am asking for some help.
My table structure:
V_id | name | group_id | other columns
----------------------
1 | | 1
2 | | 1
3 | | 2
4 | | 3
5 | | 3
I have been struggling to build a query that selects all the rows which have the maximum value in the group_id column.
The output should therefore look like this:
V_id | name | group_id | other columns
----------------------
4 | | 3
5 | | 3
which I believe can be solved by selecting all records where group_id is the highest.
and also need a query to get all the other remaining rows.
which in this case, should be like this:
V_id | name | group_id | other columns
----------------------
1 | | 1
2 | | 1
3 | | 2
which I believe can be done by selecting all records where group_id < Max(group_id)
for the first part of the problem,
SELECT *
FROM tableName
WHERE group_id = (SELECT MAX(group_ID) FROM TableName)
and for the second part,
SELECT *
FROM tableName
WHERE group_id < (SELECT MAX(group_ID) FROM TableName)
You can use JOIN for that:
SELECT a.*
FROM Table1 a
JOIN (SELECT MAX(Group_ID) AS MAXID
FROM Table1) B
ON a.Group_id = B.MaxID;
Result:
| V_ID | NAME | GROUP_ID |
----------------------------
| 4 | (null) | 3 |
| 5 | (null) | 3 |
For the remaining rows use LEFT JOIN with a condition like this:
SELECT a.*
FROM Table1 a
LEFT JOIN (SELECT MAX(Group_ID) AS MAXID
FROM Table1) B
ON a.Group_id = B.MaxID
WHERE B.MaxID IS NULL;
Result:
| V_ID | NAME | GROUP_ID |
----------------------------
| 1 | (null) | 1 |
| 2 | (null) | 1 |
| 3 | (null) | 2 |
See this SQLFiddle
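The subquery and JOIN variants above return the same rows for the sample table. One case where the two "remaining rows" queries can differ is a NULL group_id: `group_id < MAX(...)` is not true for NULL, while the LEFT JOIN anti-join keeps the row. The sketch below illustrates this with an extra row (id 6, not in the original data) using SQLite via Python's sqlite3; the function name is an assumption for illustration.

```python
import sqlite3

def split_by_max_group():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (v_id INTEGER PRIMARY KEY, name TEXT, group_id INTEGER)")
    conn.executemany(
        "INSERT INTO table1 (v_id, group_id) VALUES (?, ?)",
        [(1, 1), (2, 1), (3, 2), (4, 3), (5, 3), (6, None)],  # row 6 is an added NULL case
    )

    def ids(sql):
        return [row[0] for row in conn.execute(sql)]

    top = ids("""
        SELECT v_id FROM table1
        WHERE group_id = (SELECT MAX(group_id) FROM table1)
        ORDER BY v_id""")
    less_than = ids("""
        SELECT v_id FROM table1
        WHERE group_id < (SELECT MAX(group_id) FROM table1)
        ORDER BY v_id""")
    anti_join = ids("""
        SELECT a.v_id FROM table1 a
        LEFT JOIN (SELECT MAX(group_id) AS maxid FROM table1) b
               ON a.group_id = b.maxid
        WHERE b.maxid IS NULL
        ORDER BY a.v_id""")
    conn.close()
    return top, less_than, anti_join

print(split_by_max_group())
# ([4, 5], [1, 2, 3], [1, 2, 3, 6])
```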
Imagine this table t1,
+----------+-------+--------+
| group_id | name | age |
+----------+-------+--------+
| 1 | A1 | 1 |
| 1 | A2 | 2 |
| 1 | A3 | 3 |
| 2 | B1 | 4 |
+----------+-------+--------+
Using the following query in MySQL,
SELECT group_id, name, age, COUNT(*) FROM t1 GROUP BY group_id
we get,
+----------+-------+--------+----------+
| group_id | name | age | COUNT(*) |
+----------+-------+--------+----------+
| 1 | A1 | 2 | 3 |
| 2 | B1 | 4 | 1 |
+----------+-------+--------+----------+
As you can see here, it's possible that the values name=A1 and age=2 are not from the same record.
My question is: how can I control which single result from the name and age columns is shown, so that the content comes from one record? Is there a way to sort them somehow? For example, sorting by age in reverse order would give
+----------+-------+--------+----------+
| group_id | name | age | COUNT(*) |
+----------+-------+--------+----------+
| 1 | A3 | 3 | 3 |
| 2 | B1 | 4 | 1 |
+----------+-------+--------+----------+
Thanks.
I don't know why you say that your query works. You should also group by name...
SELECT group_id, name, COUNT(*) FROM t1 GROUP BY group_id, name
If you want to get only one of them, try:
SELECT group_id, MIN(name), COUNT(*) FROM t1 GROUP BY group_id
I don't know about full control, but you can do something like this:
SELECT student_name, MIN(test_score), MAX(test_score)
FROM student
GROUP BY student_name;
SELECT group_id, name, COUNT(*)
FROM t1
WHERE name IN ( 'xxx', 'yyy', ..., 'zzz' )
GROUP BY group_id
ORDER BY COUNT(*)
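None of the queries above guarantees that name and age come from the same record. One common way to get that, sketched here under the assumption that "the record with the highest age per group" is what is wanted, is to compute the per-group aggregate in a subquery and join it back to the table. The example uses SQLite via Python's sqlite3, and the function name is hypothetical. If two rows tie for the highest age in a group, both are returned.

```python
import sqlite3

def oldest_per_group():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (group_id INTEGER, name TEXT, age INTEGER)")
    conn.executemany(
        "INSERT INTO t1 VALUES (?, ?, ?)",
        [(1, "A1", 1), (1, "A2", 2), (1, "A3", 3), (2, "B1", 4)],
    )
    rows = conn.execute(
        """
        SELECT t1.group_id, t1.name, t1.age, g.cnt
        FROM t1
        JOIN (
            -- one row per group: its highest age and its row count
            SELECT group_id, MAX(age) AS max_age, COUNT(*) AS cnt
            FROM t1
            GROUP BY group_id
        ) AS g
          ON g.group_id = t1.group_id AND g.max_age = t1.age
        ORDER BY t1.group_id
        """
    ).fetchall()
    conn.close()
    return rows

print(oldest_per_group())  # [(1, 'A3', 3, 3), (2, 'B1', 4, 1)]
```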
How to select 1st, 2nd or 3rd value before MAX ?
usually we do it with order by and limit
SELECT * FROM table1
ORDER BY field1 DESC
LIMIT 2,1
but with my current query I don't know how to make it...
Sample table
+----+------+------+-------+
| id | name | type | count |
+----+------+------+-------+
| 1 | a | 1 | 2 |
| 2 | ab | 1 | 3 |
| 3 | abc | 1 | 1 |
| 4 | b | 2 | 7 |
| 5 | ba | 2 | 1 |
| 6 | cab | 3 | 9 |
+----+------+------+-------+
I'm taking name for each type with max count with this query
SELECT
`table1b`.`name`
FROM
(SELECT
`table1a`.`type`, MAX(`table1a`.`count`) AS `Count`
FROM
`table1` AS `table1a`
GROUP BY `table1a`.`type`) AS `table1a`
INNER JOIN
`table1` AS `table1b` ON (`table1b`.`type` = `table1a`.`type` AND `table1b`.`count` = `table1a`.`Count`)
and I want one more column additional to name with value before max(count)
so result should be
+------+------------+
| name | before_max |
+------+------------+
| ab | 2 |
| b | 1 |
| cab | NULL |
+------+------------+
Please ask if something isn't clear ;)
Based on your table structure (called test here), the query would be as follows:
select max_name.name,before_max.count
from
(SELECT type,max(count) as max
FROM `test`
group by type) as type_max
join
(select type,name,count
from test
) as max_name on (type_max.type = max_name.type and max_name.count = type_max.max)
left join
-- highest count per type that is below that type's maximum
(select type,max(t1.count) as count
from test as t1
where t1.count != (select max(t2.count) from test as t2 where t1.type = t2.type)
group by type) as before_max on (type_max.type = before_max.type)
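The same "value before max" idea can be checked against the sample table with a correlated subquery: for each row holding its type's maximum count, look up the largest count of that type that is strictly smaller. This is a minimal sketch rather than the answer's exact query; it assumes SQLite via Python's sqlite3, and the function name is made up.

```python
import sqlite3

def name_with_before_max():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT, type INTEGER, "count" INTEGER)'
    )
    conn.executemany(
        "INSERT INTO table1 VALUES (?, ?, ?, ?)",
        [(1, "a", 1, 2), (2, "ab", 1, 3), (3, "abc", 1, 1),
         (4, "b", 2, 7), (5, "ba", 2, 1), (6, "cab", 3, 9)],
    )
    rows = conn.execute(
        """
        SELECT t.name,
               -- largest count of the same type below this row's count (NULL if none)
               (SELECT MAX(s."count") FROM table1 AS s
                WHERE s.type = t.type AND s."count" < t."count") AS before_max
        FROM table1 AS t
        WHERE t."count" = (SELECT MAX(m."count") FROM table1 AS m WHERE m.type = t.type)
        ORDER BY t.type
        """
    ).fetchall()
    conn.close()
    return rows

print(name_with_before_max())  # [('ab', 2), ('b', 1), ('cab', None)]
```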