MySQL: select N of X duplicates, omitting 1 duplicate

Considering this table:
+----+-------+
| id | value |
+----+-------+
|  1 |    22 |
|  2 |    12 |
|  3 |    22 |
|  4 |    22 |
+----+-------+
I can select all where the column value is duplicated like so:
select value from table having count(value) > 1 ;
This will output the ids 1, 3 and 4.
What I'm attempting to do is select the duplicates but leave one duplicate unselected, so the above would output only the ids 3 and 4 (or 1 and 3, etc.; which duplicate is omitted does not matter, only that one is).
How can I achieve this?
This question IS NOT a duplicate of
Using LIMIT within GROUP BY to get N results per group?

You could use an aggregate function to pick one id per value, and then select all the other rows. Because the subquery lists the max(id) row of every value, values that occur only once are excluded as well.
select * from table
where (value, id) not in (
select value, max(id)
from table
group by value
)
;
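A minimal sketch of the "skip one row per value" idea, run against an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for MySQL here, which is an assumption, and the table name t is hypothetical. It uses a join on the per-value MAX(id) instead of a row-value NOT IN, but the effect is the same: every row except the highest id of its value group is returned.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table name "t" is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, value INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 22), (2, 12), (3, 22), (4, 22)],
)

# For each value, find the id to leave out (the highest one),
# then keep every other row that shares that value.
rows = conn.execute(
    """
    SELECT t.id, t.value
    FROM t
    JOIN (SELECT value, MAX(id) AS keep_id FROM t GROUP BY value) g
      ON g.value = t.value
    WHERE t.id <> g.keep_id
    ORDER BY t.id
    """
).fetchall()

print(rows)
```

With the sample data this leaves out id 4 (the highest id for value 22) and id 2 (the only row for value 12).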

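Either of these works for the sample data. Note that LIMIT 2 is hardcoded: the sample has three duplicates of a single value, so two of them are returned.
select *
from test t1
where exists (select 1
from test t2
where t2.value = t1.value
having count(value)>1)
limit 2
OR:
select t1.*
from test t1 inner join
(select value from test group by value having count(value)>1) t2
on t1.value = t2.value
limit 2;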

Related

sql group by with excluded data

My table:
id | request | subject | date
1 | 5 | 1 | 576677
2 | 2 | 3 | 576698
3 | 5 | 1 | 576999
4 | 2 | 3 | 586999
5 | 2 | 7 | 596999
I need to select unique records by two columns (request, subject). But if a request has different request-subject pairs (2-3, 2-7), those records should be excluded from the result.
My query now is:
SELECT MAX(id), id, request, subject, date
FROM `tbl`
GROUP BY request, subject
having count(request) > 1
order by MAX(id) desc
How can I exclude the records with id=4 and id=5 from this query? Thanks!
You may group by request, and then check for every group if all subjects in it are equal. You could do it using MIN() and MAX():
SELECT request, MIN(subject) AS subject
FROM table_1
GROUP BY request
HAVING MIN(subject) = MAX(subject)
As for your update, I assume you want all the fields for the max ID in the group (in your example, ID 3). The query would then look like this one:
SELECT *
FROM table_1 t
WHERE t.id IN (SELECT MAX(s.id)
FROM table_1 s
GROUP BY s.request
HAVING MIN(s.subject) = MAX(s.subject))
ORDER BY t.id
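As a quick check of the MIN/MAX idea, here is a small sketch that loads the sample rows into an in-memory SQLite database via Python's sqlite3 module (SQLite instead of MySQL is an assumption) and runs the same query. The table name table_1 follows the answer above.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL for this demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_1 ("
    "id INTEGER PRIMARY KEY, request INTEGER, subject INTEGER, date INTEGER)"
)
conn.executemany(
    "INSERT INTO table_1 VALUES (?, ?, ?, ?)",
    [
        (1, 5, 1, 576677),
        (2, 2, 3, 576698),
        (3, 5, 1, 576999),
        (4, 2, 3, 586999),
        (5, 2, 7, 596999),
    ],
)

# A request qualifies only if all of its subjects are equal,
# which is the case exactly when MIN(subject) = MAX(subject).
rows = conn.execute(
    """
    SELECT *
    FROM table_1 t
    WHERE t.id IN (SELECT MAX(s.id)
                   FROM table_1 s
                   GROUP BY s.request
                   HAVING MIN(s.subject) = MAX(s.subject))
    ORDER BY t.id
    """
).fetchall()

print(rows)
```

Request 2 has subjects 3 and 7, so all of its rows drop out; request 5 has a single subject and contributes its highest id.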
You can try this.
select * from MyTable T1
WHERE NOT EXISTS( SELECT * FROM MyTable T2
WHERE T1.id <> T2.id
and T1.request = T2.request
and T1.subject <> T2.subject)

Mysql - Select at least one or select none

I have a table as so...
----------------------------------------
| id | name | group | number |
----------------------------------------
| 1 | joey | 1 | 2 |
| 2 | keidy | 1 | 3 |
| 3 | james | 2 | 2 |
| 4 | steven | 2 | 5 |
| 5 | jason | 3 | 2 |
| 6 | shane | 3 | 3 |
----------------------------------------
I'm running a select like so:
SELECT * FROM table WHERE number IN (2,3);
The problem I'm trying to solve is that I only want results from groups that have at least one row for each number. For instance, the above query returns ids 1, 2, 3, 5 and 6, but I'd like the results to exclude id 3: group 2 has a row with number 2 but no row with number 3, so it matches only one of the two numbers and none of its rows should be selected.
Any help would be great.
Try it this way
SELECT *
FROM table1 t
WHERE number IN(2, 3)
AND EXISTS
(
SELECT *
FROM table1
WHERE number IN(2, 3)
AND `group` = t.`group`
GROUP BY `group`
HAVING MAX(number = 2) > 0
AND MAX(number = 3) > 0
)
or
SELECT *
FROM table1 t JOIN
(
SELECT `group`
FROM table1
WHERE number IN(2, 3)
GROUP BY `group`
HAVING MAX(number = 2) > 0
AND MAX(number = 3) > 0
) q
ON t.`group` = q.`group`;
or
SELECT *
FROM table1
WHERE `group` IN
(
SELECT `group`
FROM table1
WHERE number IN(2, 3)
GROUP BY `group`
HAVING MAX(number = 2) > 0
AND MAX(number = 3) > 0
);
Sample output (for all three queries):
| ID | NAME | GROUP | NUMBER |
|----|-------|-------|--------|
| 1 | joey | 1 | 2 |
| 2 | keidy | 1 | 3 |
| 5 | jason | 3 | 2 |
| 6 | shane | 3 | 3 |
You can approach this in a fun way with multiple joins for exactly what you WANT qualified, or apply a prequery to get all qualified groups as others have suggested, though the readability of that is a bit off for me.
Anyhow, here's an approach going through the table once, but with joins:
select DISTINCT
T.id,
T.Name,
T.`Group`,
T.Number
from
YourTable T
Join YourTable T2
on T.`Group` = T2.`Group` AND T2.Number = 2
Join YourTable T3
on T.`Group` = T3.`Group` AND T3.Number = 3
where
T.Number IN ( 2, 3 )
So each record is joined, by its own group, to a T2 row in the same group whose number is specifically 2, and then again to a T3 row in the same group whose number is 3.
If the join to either the T2 or the T3 instance cannot be completed, the record is dropped from consideration. Since indexes work great for joins like this, make sure you have one index on your NUMBER criteria and another on (GROUP, NUMBER) for those comparisons and for the next query sample.
If you need more than these two numbers, prequery the qualified groups and then join to that:
select
YT2.*
from
( select YT1.`group`
from YourTable YT1
where YT1.Number in (2, 3)
group by YT1.`group`
having count( DISTINCT YT1.Number ) = 2 ) PreQualified
JOIN YourTable YT2
on PreQualified.`group` = YT2.`group`
AND YT2.Number in (2,3)
Maybe this, if I understand you correctly:
SELECT id FROM table WHERE `group` IN
(SELECT `group` FROM table WHERE number IN (2,3)
GROUP BY `group`
HAVING COUNT(DISTINCT number)=2)
This will return all ids in groups where BOTH numbers exist. Without DISTINCT, a group with two rows of the same number (for example, two rows with number 2) would also qualify.
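The COUNT(DISTINCT ...) approach can be tried out with a short sketch using Python's sqlite3 module. SQLite in place of MySQL is an assumption, and the table name members and the column name grp (used to avoid the reserved word GROUP) are hypothetical.

```python
import sqlite3

# Assumptions: SQLite stands in for MySQL; "members" and "grp" are
# hypothetical names chosen to avoid the reserved words TABLE and GROUP.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE members ("
    "id INTEGER PRIMARY KEY, name TEXT, grp INTEGER, number INTEGER)"
)
conn.executemany(
    "INSERT INTO members VALUES (?, ?, ?, ?)",
    [
        (1, "joey", 1, 2),
        (2, "keidy", 1, 3),
        (3, "james", 2, 2),
        (4, "steven", 2, 5),
        (5, "jason", 3, 2),
        (6, "shane", 3, 3),
    ],
)

wanted = (2, 3)

# A group qualifies only if it contains every wanted number, i.e. the
# count of distinct matching numbers equals the size of the wanted list.
ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT id
        FROM members
        WHERE number IN (?, ?)
          AND grp IN (SELECT grp
                      FROM members
                      WHERE number IN (?, ?)
                      GROUP BY grp
                      HAVING COUNT(DISTINCT number) = ?)
        ORDER BY id
        """,
        (*wanted, *wanted, len(wanted)),
    )
]

print(ids)
```

Comparing against the length of the wanted list means the same query shape extends to three or more numbers, as long as the IN lists are extended to match.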

MySQL update statement match only the first row

Here is the my table :
mysql> select * from t1;
+------+-------+
| id | value |
+------+-------+
| 1 | 1 |
+------+-------+
1 row in set (0.00 sec)
mysql> select * from t2;
+------+-------+
| id | value |
+------+-------+
| 1 | 2 |
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
+------+-------+
4 rows in set (0.00 sec)
Then I run a statement to update the data in table t1:
mysql> update t1 join t2 on t1.id=t2.id set t1.value=t2.value ;
Query OK, 0 rows affected (0.00 sec)
Rows matched: 1 Changed: 0 Warnings: 0
And now,see the changes:
mysql> select * from t1;
+------+-------+
| id | value |
+------+-------+
| 1 | 2 |
+------+-------+
1 row in set (0.00 sec)
I wonder why the rows-matched count is 1, and it's hard to understand why the value column of t1 is 2 where id=1 rather than 3. Does the update stop when it matches the first row?
I thought it would do a full match across t1 and t2 in this case.
Any help is appreciated!
Update:
Thanks, here is the situation I'm actually dealing with:
For each id, concatenate the values in t2, separated by ',', and merge them into the value in table t1; all the elements in t1's value should be distinct. For example, with tables t1 and t2 as listed above, after the update t1's value should be "1,2,3", neither 2 nor 3.
If I use the function GROUP_CONCAT(), it will be hard to keep the elements of t1's value distinct.
Again, I don't think it's clever to update with only one row in this case. For an update across multiple tables, all the rows matched by the join condition should be applied one by one in a loop.
Based on your update to your question you can do it like this
UPDATE t1 JOIN
(
SELECT id, GROUP_CONCAT(DISTINCT value ORDER BY value) value
FROM t2
GROUP BY id
) q
ON t1.id = q.id
SET t1.value = q.value
Outcome:
+------+-------+
| id | value |
+------+-------+
| 1 | 1,2,3 |
+------+-------+
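For readers who want to try the aggregate-then-update idea without a MySQL server, here is a rough sketch using Python's sqlite3 module. SQLite is an assumption here: it has no UPDATE ... JOIN and, in older versions, no ORDER BY inside group_concat, so the sketch uses a correlated subquery and does not rely on the order of the concatenated values.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL, so a correlated subquery
# replaces the UPDATE ... JOIN form shown above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER, value TEXT)")
conn.execute("CREATE TABLE t2 (id INTEGER, value INTEGER)")
conn.execute("INSERT INTO t1 VALUES (1, '1')")
conn.executemany(
    "INSERT INTO t2 VALUES (?, ?)",
    [(1, 2), (1, 1), (1, 2), (1, 3)],
)

# Collapse all matching t2 rows into ONE value per id before updating,
# so the result no longer depends on which joined row happens to win.
conn.execute(
    """
    UPDATE t1
    SET value = (SELECT group_concat(DISTINCT t2.value)
                 FROM t2
                 WHERE t2.id = t1.id)
    WHERE EXISTS (SELECT 1 FROM t2 WHERE t2.id = t1.id)
    """
)

merged = conn.execute("SELECT value FROM t1 WHERE id = 1").fetchone()[0]
# Element order is not guaranteed here, so compare the sorted parts.
parts = sorted(merged.split(","))
print(parts)
```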
UPDATE: Based on your comments, which changed your question again: to update a delimited string of values in t1 based on values in t2, you'll need the help of a numbers (tally) table to split t1.value on the fly.
You can easily create such a table like this:
CREATE TABLE tally(n INT NOT NULL PRIMARY KEY);
INSERT INTO tally (n)
SELECT a.N + b.N * 10 + 1 n
FROM
(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
ORDER BY n
That script creates a table with a sequence of numbers from 1 to 100, which allows you to split up to 100 delimited values. If you need more or fewer, you can easily adjust the script.
Now to update t1.value you can do
UPDATE t1 JOIN
(
SELECT id, GROUP_CONCAT(value ORDER BY value) value
FROM
(
SELECT id, SUBSTRING_INDEX(SUBSTRING_INDEX(t1.value, ',', n.n), ',', -1) value
FROM t1 CROSS JOIN tally n
WHERE n.n <= 1 + (LENGTH(t1.value) - LENGTH(REPLACE(t1.value, ',', '')))
UNION
SELECT id, value
FROM t2
) v
GROUP BY id
) q
ON t1.id = q.id
SET t1.value = q.value
Assuming that you have in t1
| ID | VALUE |
|----|-------|
| 1 | 1,4 |
outcome of the update will be
| ID | VALUE |
|----|---------|
| 1 | 1,2,3,4 |
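The splitting trick in that query can be hard to read, so here is a plain-Python sketch of the same steps. The function substring_index is a hypothetical helper that mimics MySQL's SUBSTRING_INDEX for this purpose; split_with_tally and merge_values are likewise illustrative names, not part of any library.

```python
def substring_index(s: str, delim: str, count: int) -> str:
    """Hypothetical helper mimicking MySQL's SUBSTRING_INDEX."""
    parts = s.split(delim)
    if count >= 0:
        # Everything to the left of the count-th delimiter.
        return delim.join(parts[:count])
    # Negative count: everything to the right, counting from the end.
    return delim.join(parts[count:])


def split_with_tally(s: str, delim: str = ",") -> list:
    # Number of elements = number of delimiters + 1, computed the same
    # way as the LENGTH(...) - LENGTH(REPLACE(...)) expression.
    n_items = 1 + len(s) - len(s.replace(delim, ""))
    # For each tally number n, take the first n elements, then the last one.
    return [
        substring_index(substring_index(s, delim, n), delim, -1)
        for n in range(1, n_items + 1)
    ]


def merge_values(existing: str, new_values: list) -> str:
    # UNION removes duplicates; a set plays that role here.
    merged = set(split_with_tally(existing)) | {str(v) for v in new_values}
    # Values are sorted as strings in this sketch.
    return ",".join(sorted(merged))


result = merge_values("1,4", [2, 1, 2, 3])
print(result)
```

The nested call is the key step: the inner call keeps the first n elements and the outer call keeps only the last of those, which yields the n-th element.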
That being said, in the long run you'd better reconsider your db schema and normalize your data. That will pay off big time by letting you maintain and query your data normally.
What a weird query! It seems that MySQL is clever enough not to update the same row 4 times. But besides that, on any database the result (the new value for t1.value) is undefined. You should always make sure that you update with the value of a single row, or use an aggregate function (like MIN, MAX, ...).

Query with subquery not returning all results

I am running the following query:
SELECT id, name, keyt
FROM table
WHERE id = (SELECT t2.id FROM table t2 WHERE t2.keyt=21 ORDER BY RAND() LIMIT 1)
Supposing table is like this:
| id | name | keyt |
+ ------------------------- +
| 1 | Hello | 21 |
| 3 | Katzet | 1 |
| 1 | Welcome | 1 |
| 2 | Two | 21 |
| 2 | Other | 1 |
It should return one of these pairs:
Hello | Welcome (id 1 in common)
Two | Other (id 2 in common)
So, the idea is:
Get one id, which has the keyt value set to 21
Then, get all the rows with this selected id (independently of all the other keyt values)
If I do as you suggested... I would get mixed id values, and all result rows must have the same id.
SELECT x.*
FROM my_table x
JOIN
( SELECT id
FROM my_table
WHERE keyt = 21
ORDER
BY RAND() LIMIT 1
) y
ON y.id = x.id;
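A small sketch of this derived-table approach, run in an in-memory SQLite database via Python's sqlite3 module. SQLite is an assumption here, so RANDOM() takes the place of MySQL's RAND(). The point it illustrates is that the random id is picked once in the derived table, and every returned row then shares that id.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL, so RANDOM() replaces RAND().
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, name TEXT, keyt INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        (1, "Hello", 21),
        (3, "Katzet", 1),
        (1, "Welcome", 1),
        (2, "Two", 21),
        (2, "Other", 1),
    ],
)

# The derived table y holds exactly one randomly chosen id with keyt = 21;
# the join then returns all rows carrying that id, whatever their keyt.
rows = conn.execute(
    """
    SELECT x.id, x.name, x.keyt
    FROM my_table x
    JOIN (SELECT id
          FROM my_table
          WHERE keyt = 21
          ORDER BY RANDOM()
          LIMIT 1) y
      ON y.id = x.id
    """
).fetchall()

ids = {row[0] for row in rows}
print(rows)
```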
The subquery in this query
SELECT id, name, keyt
FROM table
WHERE id = (SELECT t2.id FROM table t2 WHERE t2.keyt=21 ORDER BY RAND() LIMIT 1)
returns only one id, because it has LIMIT 1 added at the end.
In your sample table, two ids (1 and 2) have keyt = 21, and the subquery picks just one of them.
If you want the rows for every such id, remove the LIMIT. In that case you may rephrase your query as:
SELECT id, name, keyt
FROM table
WHERE id IN (SELECT t2.id FROM table t2 WHERE t2.keyt=21 ORDER BY RAND())
Hope this is what you expected, as your actual goal is not very clear from the question.
Your table has two rows with 21 in the keyt column, so the subquery in the WHERE clause returns two ids, 1 and 2. So instead of the equality operator "=", use the IN operator in the WHERE clause.
SELECT id, name, keyt FROM table WHERE id IN (SELECT t2.id FROM table t2 WHERE t2.keyt=21 ORDER BY RAND())

SELECTING SUM based on conditional if from another table

Database: mysql > ver 5.0
table 1: type_id (int), type
table 2: name_id, name, is_same_as = table2.name_id or NULL
table 3: id, table2.name_id, table1.type_id, value (float)
I want to sum and count the values in table 3 where table2.name_id is the same, and also include the values of the ids where is_same_as = name_id. I want to select all data in table3 for all values in table2.
Apologies if my question is not very clear, or if it has already been answered; I was unable to find a relevant answer, or don't know exactly what to look for.
[data]. table1
id | type
=========
1 | test1
2 | test2
[data].table2
name_id | name | is_same_as
==============================
1 | tb_1 | NULL
2 | tb_2 | 1
3 | tb_3 | NULL
4 | tb_4 | 1
[data].table3
id | name_id | type_id | value
======================================
1 | 1 | 1 | 1.5
2 | 2 | 1 | 0.5
3 | 2 | 2 | 1.0
output:
name_id| type_id|SUM(value)
=======================================================
1 | 1 |2.0 < because in table2, is_same_as = 1
2 | 2 |1.0
I think the following does what you want:
select coalesce(t2.is_same_as, t2.name_id) as name_id, t3.type_id, sum(t3.value)
from table3 t3 join
table2 t2
on t3.name_id = t2.name_id
group by coalesce(t2.is_same_as, t2.name_id), t3.type_id
order by 1, 2
It joins the table on name_id. However, it then uses the is_same_as column, if present, or the name_id if not, for summarizing the data.
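To see what the COALESCE grouping produces on the sample data, here is a minimal sketch using Python's sqlite3 module with an in-memory database. SQLite in place of MySQL is an assumption, as are the table names table2 and table3 taken from the data listing in the question; the COUNT column is added because the question also asks for a count.

```python
import sqlite3

# Assumptions: SQLite stands in for MySQL; table names follow the
# "[data].table2" / "[data].table3" listing in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table2 (name_id INTEGER, name TEXT, is_same_as INTEGER)"
)
conn.execute(
    "CREATE TABLE table3 ("
    "id INTEGER, name_id INTEGER, type_id INTEGER, value REAL)"
)
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?, ?)",
    [(1, "tb_1", None), (2, "tb_2", 1), (3, "tb_3", None), (4, "tb_4", 1)],
)
conn.executemany(
    "INSERT INTO table3 VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 1.5), (2, 2, 1, 0.5), (3, 2, 2, 1.0)],
)

# Rows whose name is marked "same as" another are counted under that
# other name_id; all remaining rows keep their own name_id.
rows = conn.execute(
    """
    SELECT COALESCE(t2.is_same_as, t2.name_id) AS name_id,
           t3.type_id,
           SUM(t3.value) AS total,
           COUNT(t3.value) AS n
    FROM table3 t3
    JOIN table2 t2 ON t3.name_id = t2.name_id
    GROUP BY COALESCE(t2.is_same_as, t2.name_id), t3.type_id
    ORDER BY 1, 2
    """
).fetchall()

print(rows)
```

Note that the third table3 row (name_id 2, type_id 2) is reported under name_id 1 here, because tb_2 is marked as the same as tb_1.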
This might be what you are looking for (I haven't tested it in MySQL, so there may be a typo; note that WITH requires MySQL 8.0 or later):
with combined_names_tab (name_id, name_id_ref) as
(
select name_id, name_id from table2
union select t2a.name_id, t2b.name_id
from table2 t2a
join table2 t2b
on (t2a.name_id = t2b.is_same_as)
)
select cnt.name_id, t3.type_id, sum(t3.value) sum_val
from combined_names_tab cnt
join table3 t3
on ( cnt.name_id_ref = t3.name_id )
group by cnt.name_id, t3.type_id
Here's what the query does:
First, it creates 'combined_names_tab', which joins all the table2 rows that you want to GROUP BY, using the "is_same_as" column to make that determination. I make sure to include the "parent" row by doing a UNION.
Second, once you have those rows, it's a simple join to table3 with a GROUP BY and a SUM.
Note: table1 was unnecessary (I believe).
Let me know if this works!
john...