Select ID that doesn't have field with certain content - mysql

I have the following table:
userID | key | value
1 color green
1 eyes blue
1 hair brunette
2 color red
How can I select all the userIDs that don't have a key 'eyes'?

With a single query, you can count the rows where key = 'eyes' for each user and keep only the userIDs whose count is zero, i.e. the users who have no key named eyes:
select `userID`,
sum(`key` = 'eyes') `count`
from t
group by `userID`
having `count` = 0
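For anyone who wants to try the conditional-count idea without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. The table name t and the rows come from the question. SQLite also evaluates the comparison to 0 or 1, but it is not MySQL, so treat this as an illustration rather than a MySQL test.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE t (userID INTEGER, "key" TEXT, value TEXT)')
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "color", "green"), (1, "eyes", "blue"), (1, "hair", "brunette"), (2, "color", "red")],
)

# The comparison yields 1 or 0 per row, so SUM counts the 'eyes' rows per user.
query = """
SELECT userID
FROM t
GROUP BY userID
HAVING SUM("key" = 'eyes') = 0
"""
users_without_eyes = [row[0] for row in conn.execute(query)]
print(users_without_eyes)  # [2]
```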

I prefer the LEFT JOIN ... IS NULL approach:
SELECT DISTINCT tn1.userID
FROM table_name tn1
LEFT JOIN table_name tn2
ON tn2.userID = tn1.userID
AND tn2.`key` = 'eyes'
WHERE tn2.userID IS NULL
This tends to outperform other approaches when tables are properly indexed.
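A quick way to see the anti-join at work is the sketch below, again with Python's sqlite3 standing in for MySQL. The aliases are written out as tn1 and tn2, the key value is quoted as a string, and the rows are the ones from the question. It says nothing about performance; it only checks the result.

```python
import sqlite3

# SQLite in memory, used only to illustrate the query shape (not a MySQL benchmark).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE table_name (userID INTEGER, "key" TEXT, value TEXT)')
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?, ?)",
    [(1, "color", "green"), (1, "eyes", "blue"), (1, "hair", "brunette"), (2, "color", "red")],
)

# Rows of tn1 that find no matching 'eyes' row get NULLs for tn2 and pass the WHERE.
anti_join_sql = """
SELECT DISTINCT tn1.userID
FROM table_name tn1
LEFT JOIN table_name tn2
  ON tn2.userID = tn1.userID
 AND tn2."key" = 'eyes'
WHERE tn2.userID IS NULL
"""
anti_join_users = [row[0] for row in conn.execute(anti_join_sql)]
print(anti_join_users)  # [2]
```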

You could just do something like this:
SELECT DISTINCT userID
FROM table_name
WHERE userID NOT IN (SELECT b.userID
FROM table_name b
WHERE b.`key` = 'eyes')
You'd be better off with a separate users table to select the userID values from. If you had one, you could substitute it for the first FROM table_name and drop the DISTINCT. You could even SELECT * FROM users, if that is what you are after, to get all the details about any users who don't have the key.
But this should work in any event.
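Here is a rough sketch of the separate-users-table variant, using Python's sqlite3 in place of MySQL. The table names users and attrs, the name column, and the third user are all made up for illustration; only the attribute rows come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: users holds one row per user, attrs holds the key/value rows.
conn.executescript("""
CREATE TABLE users (userID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE attrs (userID INTEGER, "key" TEXT, value TEXT);
INSERT INTO users VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
INSERT INTO attrs VALUES
    (1, 'color', 'green'), (1, 'eyes', 'blue'), (1, 'hair', 'brunette'), (2, 'color', 'red');
""")

# No DISTINCT needed: users has exactly one row per userID.
users_missing_eyes = conn.execute("""
SELECT *
FROM users
WHERE userID NOT IN (SELECT a.userID FROM attrs a WHERE a."key" = 'eyes')
ORDER BY userID
""").fetchall()
print(users_missing_eyes)  # [(2, 'bob'), (3, 'carol')]
```

Selecting from users also picks up user 3, who has no attribute rows at all and would never show up when selecting from the key/value table itself.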

Related

Compare multiple values to subquery result in where clause

I have two related tables as follows :
USERS
user_id <\PK>
USERACTIONS
user_action_id <\PK>
user_id <\FK>
user_action <\int>
Whenever a user performs an action, a new row is inserted into the "useractions" table. I need a query to fetch those USERACTIONS rows where the user performed only a particular set of actions, say (1,2), but not (3,4).
So I have a query like -
select * from USERACTIONS where (1,2) in(select user_action from USERACTIONS where user_id=100) and user_id=100;
The problem is that the query above doesn't work: supplying (1,2) makes the database expect the subquery to return two columns as well, which is understandable. This is the error I get -
ERROR: subquery has too few columns
Giving a single value, say (1) or (2), works perfectly. I want to know if there is any way I can use the same query and compare the subquery's result with multiple values. I prefer the same query because the case demonstrated here is just a part of a large query.
Please note the query should not list users who performed (1,2,3,4); only those who performed just (1,2) should be listed. Also, user_action values can be any random integer.
Any alternate queries are welcome, but I would prefer changes in the same query. Thanks in advance.
Try this:
SELECT USERS.user_id, USERACTIONS.user_action
FROM USERACTIONS
LEFT JOIN USERS ON USERS.user_id = USERACTIONS.user_id where USERACTIONS.user_action in (1,2);
This works for your query.
You add the numbers to the IN clause:
SELECT a.user_id
FROM
(SELECT DISTINCT user_id
FROM USERACTIONS
WHERE user_action IN (1,2)) a
LEFT JOIN
(SELECT DISTINCT user_id
FROM USERACTIONS
WHERE user_action NOT IN (1,2)) b
ON a.user_id = b.user_id
WHERE b.user_id IS NULL
;
Test setup:
CREATE TABLE USERACTIONS (id INT NOT NULL AUTO_INCREMENT
, PRIMARY KEY(id)
, user_action INT
, user_id INT
);
INSERT USERACTIONS VALUES (NULL,1,100),(NULL,2,100),(NULL,3,100), (NULL,1,101),(NULL,2,101);
Result:
| user_id |
| ------: |
| 101 |
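To check the "did 1 or 2 but nothing else" idea on slightly more data, here is a small sketch with Python's sqlite3 standing in for the real database. It uses a LEFT JOIN ... IS NULL anti-join between the two derived tables; user 102 is an extra, made-up user who did action 1 and also action 4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE USERACTIONS (id INTEGER PRIMARY KEY, user_action INT, user_id INT)"
)
conn.executemany(
    "INSERT INTO USERACTIONS (user_action, user_id) VALUES (?, ?)",
    # Users 100 and 101 are from the fiddle; user 102 is hypothetical.
    [(1, 100), (2, 100), (3, 100), (1, 101), (2, 101), (1, 102), (4, 102)],
)

# a: users with at least one action in (1,2); b: users with any other action.
# Keep the users from a that have no counterpart in b.
query = """
SELECT a.user_id
FROM (SELECT DISTINCT user_id FROM USERACTIONS WHERE user_action IN (1, 2)) a
LEFT JOIN (SELECT DISTINCT user_id FROM USERACTIONS WHERE user_action NOT IN (1, 2)) b
  ON a.user_id = b.user_id
WHERE b.user_id IS NULL
ORDER BY a.user_id
"""
only_1_2_users = [row[0] for row in conn.execute(query)]
print(only_1_2_users)  # [101]
```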
I see typical SO answers that aren't answering OP's question, but rather trying to steer them in a different direction. I know this is old, but if anyone stumbles upon this, I believe this will be more helpful.
I too have a large, enterprise solution where the WHERE check is MUCH more performant in a subquery than using a JOIN.
You can set a variable in your WHERE clause and use it afterwards. I am currently trying to find a better way to do this without setting a variable, but something like this works:
SELECT * FROM USERACTIONS
WHERE
( (@useraction :=
(select user_action from USERACTIONS where user_id=100 LIMIT 1))
= 1
OR @useraction = 2)
AND user_id=100;
What you are doing is creating a variable in your WHERE clause, setting that variable, then using it later. The assignment is wrapped in parentheses, so the condition can match either one of the values.

How to get that Id when the condition does not match

I have a table T. There are multiple records for a particular user_id, some of them with type = "Red Card". I want the user_id and match_id of users who never received a "Red Card" in an entire match.
Based on the attached table image, the output I expect is:
match_id 3036 and 3090, with user_id 4 and 6 respectively
If you want to select the fields, use a subquery:
SELECT DISTINCT user_id, match_id from tbl where match_id NOT IN (
SELECT match_id from tbl where type = 'Red Card'
)
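I'd do something like this:
SELECT match_id, user_id FROM T WHERE type <> 'Red Card' GROUP BY match_id, user_id
This selects all records from the table that are not a "Red Card", and the GROUP BY gives you just one record for each match/user pair. Note that a user who also has a "Red Card" row in the same match is still returned, because only that one row is filtered out.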
SELECT t1.* FROM `T` t1 where not exists (SELECT 1 FROM `T` t2 where t2.user_id = t1.user_id and t2.match_id = t1.match_id and t2.`type` = 'Red Card')
SELECT MATCH_ID, USER_ID FROM TABLENAME WHERE TYPE NOT IN ('RED CARD')
This query may help you.
SELECT DISTINCT user_id, match_id FROM T WHERE type <> 'Red Card'
Because the table contains duplicate user_id and match_id values, I've used DISTINCT to select each pair only once.
Select match_id, user_id from T where user_id Not IN (Select user_id from T where type = 'Red Card') Group By match_id, user_id
The above query removes all the users who have red cards: the nested query first selects the user_id values from T that have a red card, "NOT IN" then keeps only the users who are not in that list, and grouping by match_id and user_id removes the duplicate records.
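Since the table image from the question is not available, here is a sketch on invented rows that are merely consistent with the expected output (user 4 in match 3036, user 6 in match 3090). It uses Python's sqlite3 as a stand-in and a correlated NOT EXISTS, which is one way to express "this user has no Red Card row in this match".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (user_id INT, match_id INT, type TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?)",
    [
        # Hypothetical rows; the original data was only shown as an image.
        (4, 3036, "Goal"), (4, 3036, "Yellow Card"),
        (5, 3036, "Goal"), (5, 3036, "Red Card"),
        (6, 3090, "Goal"),
        (7, 3090, "Red Card"),
    ],
)

# Keep a (user, match) pair only if no Red Card row exists for that same pair.
query = """
SELECT DISTINCT t1.user_id, t1.match_id
FROM T t1
WHERE NOT EXISTS (
    SELECT 1
    FROM T t2
    WHERE t2.user_id = t1.user_id
      AND t2.match_id = t1.match_id
      AND t2.type = 'Red Card'
)
ORDER BY t1.match_id
"""
clean_pairs = conn.execute(query).fetchall()
print(clean_pairs)  # [(4, 3036), (6, 3090)]
```

User 5 is dropped even though they also have a non-red-card row, which a plain type <> 'Red Card' filter would not do.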

Get number of values that only appear once in a column

Firstly, if it is relevant, I'm using MySQL, though I assume a solution would work across DB products. My problem is thus:
I have a simple table with a single column. There are no constraints on the column. Within this column there is some simple data, e.g.
a
a
b
c
d
d
I need to get the number/count of values that only appear once. From the example above that would be 2 (since only b and c occur once in the column).
Hopefully it's clear I don't want DISTINCT values, but UNIQUE values. I have actually done this before, by creating an additional table with a UNIQUE constraint on the column and simply INSERTing to the new table from the old one, handling the duplicates accordingly.
I was hoping to find a solution that did not require the temporary table, and could somehow just be accomplished with a nifty SELECT.
Assuming your table is called T and your field is called F:
SELECT COUNT(F)
FROM (
SELECT F
FROM T
GROUP BY F
HAVING COUNT(*) = 1
) AS ONLY_ONCE
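A small sketch of that query on the sample data from the question, run through Python's sqlite3 as a stand-in for MySQL (T and F are the placeholder names used in the answer):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (F TEXT)")
conn.executemany("INSERT INTO T VALUES (?)", [("a",), ("a",), ("b",), ("c",), ("d",), ("d",)])

# Inner query keeps the values seen exactly once; outer query counts them.
once_count = conn.execute("""
SELECT COUNT(F)
FROM (
    SELECT F
    FROM T
    GROUP BY F
    HAVING COUNT(*) = 1
) AS ONLY_ONCE
""").fetchone()[0]
print(once_count)  # 2
```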
select count(*) from
(
select
col1, count(*)
from
`Table`
group by
col1
having
count(col1) = 1
) as only_once
Just nest it a little...
select count( cnt ) from
( select count(mycol) cnt from mytab group by mycol ) as counts
where cnt = 1
select field1, count(field1) from my_table group by field1 having count(field1) = 1
select count(*) from (select field1, count(field1) from my_table group by field1 having count(field1) = 1) as once
The first query returns the values that appear only once, and the second returns how many such values there are.
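Could it be as simple as this:
Select count(*) From MyTable Group By MyColumn Having Count(MyColumn) = 1
Note that this returns one row (containing 1) per value that appears once, so you still have to count the rows to get the total.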
This is what I did and it worked:
SELECT name
FROM people JOIN stars ON stars.person_id = people.id
JOIN movies ON movies.id = stars.movie_id
WHERE year = 2004
GROUP BY name, person_id ORDER BY birth;
note: I was working with several tables here.
CS50 Problem Set 7 (pset7) 9.sql fix!!

How do I write a SQL query to detect duplicate primary keys?

Suppose I want to alter the table so that my primary keys are as follows
user_id , round , tournament_id
Currently there are duplicates that I need to clean up. What is the query to find all duplicates?
This is for MySQL and I would like to see duplicate rows
Technically, you don't need such a query; any RDBMS worth its salt will not allow the insertion of a row which would produce a duplicate primary key in the table. Such a thing violates the very definition of a primary key.
However, if you are looking to write a query to find duplicates of these groups of columns before applying a primary key to the table that consists of these columns, then this is what you'd want:
select
t.user_id, t.round, t.tournament_id
from
table as t
group by
t.user_id, t.round, t.tournament_id
having
count(*) > 1
The above only gives you the combinations of columns that have more than one row. If you want to see all of the columns in those rows, do the following:
select
o.*
from
table as o
inner join (
select
t.user_id, t.round, t.tournament_id
from
table as t
group by
t.user_id, t.round, t.tournament_id
having
count(*) > 1
) as t on
t.user_id = o.user_id and
t.round = o.round and
t.tournament_id = o.tournament_id
Note that you could also create a temporary table and join on that if you need to use the results multiple times.
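Both queries can be tried on a toy table with the sketch below, which uses Python's sqlite3 in place of MySQL. The table name scores, the points column, and the rows are invented for illustration; only the three key columns come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the three intended key columns plus one payload column.
conn.execute("CREATE TABLE scores (user_id INT, round INT, tournament_id INT, points INT)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?, ?)",
    [(1, 1, 10, 5), (1, 1, 10, 7), (2, 1, 10, 3), (1, 2, 10, 4)],
)

# Key combinations that occur more than once.
dup_keys = conn.execute("""
SELECT t.user_id, t.round, t.tournament_id
FROM scores AS t
GROUP BY t.user_id, t.round, t.tournament_id
HAVING COUNT(*) > 1
""").fetchall()

# Full rows belonging to those combinations, via a join on the grouped result.
dup_rows = conn.execute("""
SELECT o.*
FROM scores AS o
INNER JOIN (
    SELECT user_id, round, tournament_id
    FROM scores
    GROUP BY user_id, round, tournament_id
    HAVING COUNT(*) > 1
) AS t
  ON t.user_id = o.user_id
 AND t.round = o.round
 AND t.tournament_id = o.tournament_id
ORDER BY o.points
""").fetchall()

print(dup_keys)  # [(1, 1, 10)]
print(dup_rows)  # [(1, 1, 10, 5), (1, 1, 10, 7)]
```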
SELECT name, COUNT(*) AS counter
FROM customers
GROUP BY name
HAVING COUNT(*) > 1
That's what you are looking for.
Given the table:
ID  NAME        email
--  ----        -----
1   John Doe    john@teratrax.com
2   Mark Smith  marks@teratrax.com
3   John Doe    jdoe@company.com
the query will return:
name      counter
----      -------
John Doe  2
Assuming you either have a table with those three columns, or that you can make and populate a table with those three columns, this query will show the duplicates.
select user_id, round, tournament_id
from yourtable
group by user_id, round, tournament_id
having count(*) > 1
This query selects all rows from the customers table that have a duplicate name but also shows the email of each duplicate.
SELECT c.name, c.email FROM customers c, customers d
WHERE c.name = d.name
GROUP BY c.name, c.email
HAVING COUNT(*) > 1
The downside of this is that you have to list all the columns you want to output twice, once in the SELECT and once in the GROUP BY clause. The other approach is to use a subquery or join to filter the table against the list of known duplicate keys.

Find duplicate records in MySQL

I want to pull out duplicate records in a MySQL Database. This can be done with:
SELECT address, count(id) as cnt FROM list
GROUP BY address HAVING cnt > 1
Which results in:
100 MAIN ST 2
I would like to pull it so that it shows each row that is a duplicate. Something like:
JIM JONES 100 MAIN ST
JOHN SMITH 100 MAIN ST
Any thoughts on how this can be done? I'm trying to avoid doing the first one then looking up the duplicates with a second query in the code.
The key is to rewrite this query so that it can be used as a subquery.
SELECT firstname,
lastname,
list.address
FROM list
INNER JOIN (SELECT address
FROM list
GROUP BY address
HAVING COUNT(id) > 1) dup
ON list.address = dup.address;
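Here is a minimal sketch of the subquery-join approach on the two rows from the question plus one made-up non-duplicate row (MARY MAJOR at 200 OAK AVE), using Python's sqlite3 as a stand-in for MySQL. The column layout of list is assumed from the queries in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE list (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT, address TEXT)"
)
conn.executemany(
    "INSERT INTO list (firstname, lastname, address) VALUES (?, ?, ?)",
    [
        ("JIM", "JONES", "100 MAIN ST"),
        ("JOHN", "SMITH", "100 MAIN ST"),
        ("MARY", "MAJOR", "200 OAK AVE"),  # hypothetical row with a unique address
    ],
)

# dup holds the addresses that occur more than once; the join pulls every matching row.
duplicate_people = conn.execute("""
SELECT firstname, lastname, list.address
FROM list
INNER JOIN (SELECT address
            FROM list
            GROUP BY address
            HAVING COUNT(id) > 1) dup
  ON list.address = dup.address
ORDER BY list.id
""").fetchall()
print(duplicate_people)
# [('JIM', 'JONES', '100 MAIN ST'), ('JOHN', 'SMITH', '100 MAIN ST')]
```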
SELECT date FROM logs group by date having count(*) >= 2
Why not just INNER JOIN the table with itself?
SELECT a.firstname, a.lastname, a.address
FROM list a
INNER JOIN list b ON a.address = b.address
WHERE a.id <> b.id
A DISTINCT is needed if the address could exist more than two times.
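To see why the DISTINCT matters, the sketch below puts three people at the same address (JANE DOE and MARY MAJOR are made-up rows) and counts the rows returned with and without it. Python's sqlite3 stands in for MySQL here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE list (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT, address TEXT)"
)
conn.executemany(
    "INSERT INTO list (firstname, lastname, address) VALUES (?, ?, ?)",
    [
        ("JIM", "JONES", "100 MAIN ST"),
        ("JOHN", "SMITH", "100 MAIN ST"),
        ("JANE", "DOE", "100 MAIN ST"),    # hypothetical third person at the same address
        ("MARY", "MAJOR", "200 OAK AVE"),  # hypothetical unique address
    ],
)

self_join = """
SELECT {distinct} a.firstname, a.lastname, a.address
FROM list a
INNER JOIN list b ON a.address = b.address
WHERE a.id <> b.id
"""
# Each of the three people pairs with the two others, so every person appears twice.
plain_rows = conn.execute(self_join.format(distinct="")).fetchall()
distinct_rows = conn.execute(self_join.format(distinct="DISTINCT")).fetchall()
print(len(plain_rows), len(distinct_rows))  # 6 3
```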
I tried the best answer chosen for this question, but it confused me somewhat. I actually needed that just on a single field from my table. The following example worked out very well for me:
SELECT COUNT(*) c,title FROM `data` GROUP BY title HAVING c > 1;
Isn't this easier?
SELECT group_id, COUNT(*) AS cnt
FROM tc_tariff_groups
GROUP BY group_id
HAVING COUNT(group_id) > 1
select `cityname` from `codcities` group by `cityname` having count(*)>=2
This is similar to the query you asked for; it works and is easy too.
Find duplicate users by email address with this query...
SELECT users.name, users.uid, users.mail, from_unixtime(created)
FROM users
INNER JOIN (
SELECT mail
FROM users
GROUP BY mail
HAVING count(mail) > 1
) dupes ON users.mail = dupes.mail
ORDER BY users.mail;
We can also find duplicates based on more than one field. For those cases you can use the format below.
SELECT COUNT(*), column1, column2
FROM tablename
GROUP BY column1, column2
HAVING COUNT(*)>1;
Finding duplicate addresses is much more complex than it seems, especially if you require accuracy. A MySQL query is not enough in this case...
I work at SmartyStreets, where we do address validation and de-duplication and other stuff, and I've seen a lot of diverse challenges with similar problems.
There are several third-party services which will flag duplicates in a list for you. Doing this solely with a MySQL subquery will not account for differences in address formats and standards. The USPS (for US address) has certain guidelines to make these standard, but only a handful of vendors are certified to perform such operations.
So, I would recommend the best answer for you is to export the table into a CSV file, for instance, and submit it to a capable list processor. One such is LiveAddress which will have it done for you in a few seconds to a few minutes automatically. It will flag duplicate rows with a new field called "Duplicate" and a value of Y in it.
Another solution would be to use table aliases, like so:
SELECT p1.id, p2.id, p1.address
FROM list AS p1, list AS p2
WHERE p1.address = p2.address
AND p1.id != p2.id
All you're really doing in this case is taking the original list table, creating two pretend tables -- p1 and p2 -- out of that, and then performing a join on the address column (line 3). The 4th line makes sure that the same record doesn't show up multiple times in your set of results ("duplicate duplicates").
Not going to be very efficient, but it should work:
SELECT *
FROM list AS o
WHERE (SELECT COUNT(*)
FROM list AS i
WHERE i.address = o.address) > 1;
This will select duplicates in one table pass, no subqueries.
SELECT *
FROM (
SELECT ao.*, (@r := @r + 1) AS rn
FROM (
SELECT @_address := 'N'
) vars,
(
SELECT *
FROM
list a
ORDER BY
address, id
) ao
WHERE CASE WHEN @_address <> address THEN @r := 0 ELSE 0 END IS NOT NULL
AND (@_address := address ) IS NOT NULL
) aoo
WHERE rn > 1
This query actually emulates ROW_NUMBER(), which is present in Oracle and SQL Server.
See the article in my blog for details:
Analytic functions: SUM, AVG, ROW_NUMBER - emulating in MySQL.
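On engines that have window functions (MySQL added them in 8.0, SQLite in 3.25), the same "second and later row per address" selection can be sketched with ROW_NUMBER() directly. The example below runs it through Python's sqlite3 and assumes the bundled SQLite is at least 3.25; the rows are invented for illustration and the alias numbered is mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE list (id INTEGER PRIMARY KEY, address TEXT)")
conn.executemany(
    "INSERT INTO list (id, address) VALUES (?, ?)",
    [(1, "100 MAIN ST"), (2, "100 MAIN ST"), (3, "200 OAK AVE"), (4, "100 MAIN ST")],
)

# Number the rows within each address by id; anything past the first is a repeat.
repeats = conn.execute("""
SELECT id, address, rn
FROM (
    SELECT id, address,
           ROW_NUMBER() OVER (PARTITION BY address ORDER BY id) AS rn
    FROM list
) AS numbered
WHERE rn > 1
ORDER BY id
""").fetchall()
print(repeats)  # [(2, '100 MAIN ST', 2), (4, '100 MAIN ST', 3)]
```

As with the variable-based version, the first row of each address group is not returned, only the later ones.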
This will also show you how many duplicates there are and order the results, without joins:
SELECT `Language` , id, COUNT( id ) AS how_many
FROM `languages`
GROUP BY `Language`
HAVING how_many >=2
ORDER BY how_many DESC
SELECT firstname, lastname, address FROM list
WHERE
Address in
(SELECT address FROM list
GROUP BY address
HAVING count(*) > 1)
select * from table_name t1 inner join (select distinct <attribute list> from table_name as temp)t2 where t1.attribute_name = t2.attribute_name
For your table it would be something like
select * from list l1 inner join (select distinct address from list as list2)l2 where l1.address=l2.address
This query will give you all the distinct address entries in your list table... I am not sure how this will work if you have any primary key values for name, etc..
Fastest duplicates removal queries procedure:
/* create temp table with one primary column id */
INSERT INTO temp(id) SELECT MIN(id) FROM list GROUP BY (isbn) HAVING COUNT(*)>1;
DELETE FROM list WHERE id IN (SELECT id FROM temp);
DELETE FROM temp;
Personally this query has solved my problem:
SELECT `SUB_ID`, COUNT(SRV_KW_ID) as subscriptions FROM `SUB_SUBSCR` group by SUB_ID, SRV_KW_ID HAVING subscriptions > 1;
This script shows all the subscriber IDs that exist more than once in the table, along with the number of duplicates found.
These are the table columns:
| SUB_SUBSCR_ID | int(11) | NO | PRI | NULL | auto_increment |
| MSI_ALIAS | varchar(64) | YES | UNI | NULL | |
| SUB_ID | int(11) | NO | MUL | NULL | |
| SRV_KW_ID | int(11) | NO | MUL | NULL | |
Hope it will be helpful for you too!
SELECT t.*,(select count(*) from city as tt where tt.name=t.name) as count FROM `city` as t where (select count(*) from city as tt where tt.name=t.name) > 1 order by count desc
Replace city with your Table.
Replace name with your field name
SELECT id, count(*) as c
FROM `list`
GROUP BY id HAVING c > 1
This returns each id together with the number of times it is repeated, or nothing if no id is repeated.
Change the column in the GROUP BY (e.g. to address) and it returns the number of times each address is repeated, identified by the first id found with that address.
SELECT id, count(*) as c
FROM `list`
GROUP BY address HAVING c > 1
I hope it helps. Enjoy ;)
SELECT *
FROM (SELECT address, COUNT(id) AS cnt
FROM list
GROUP BY address
HAVING ( COUNT(id) > 1 )) AS dup
I use the following:
SELECT * FROM mytable
WHERE (column1, column2, column3) IN (
SELECT column1, column2, column3 FROM mytable
GROUP BY column1, column2, column3
HAVING count(*) > 1
)
Most of the answers here don't cope with the case when you have MORE THAN ONE duplicate result and/or when you have MORE THAN ONE column to check for duplications. When you are in such case, you can use this query to get all duplicate ids:
SELECT address, email, COUNT(*) AS QUANTITY_DUPLICATES, GROUP_CONCAT(id) AS ID_DUPLICATES
FROM list
GROUP BY address, email
HAVING COUNT(*)>1;
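A sketch of the multi-column GROUP_CONCAT query on invented rows, using Python's sqlite3 as a stand-in (SQLite also has a group_concat aggregate, though MySQL's version has more options). The addresses and example.com emails are made up, and the concatenation order is not relied on, so the ids are sorted after splitting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE list (id INTEGER PRIMARY KEY, address TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO list (id, address, email) VALUES (?, ?, ?)",
    [
        # Hypothetical rows: ids 1, 2 and 5 share both address and email.
        (1, "100 MAIN ST", "a@example.com"),
        (2, "100 MAIN ST", "a@example.com"),
        (3, "100 MAIN ST", "b@example.com"),
        (4, "200 OAK AVE", "c@example.com"),
        (5, "100 MAIN ST", "a@example.com"),
    ],
)

groups = conn.execute("""
SELECT address, email, COUNT(*) AS QUANTITY_DUPLICATES, GROUP_CONCAT(id) AS ID_DUPLICATES
FROM list
GROUP BY address, email
HAVING COUNT(*) > 1
""").fetchall()

address, email, quantity, id_csv = groups[0]
duplicate_ids = sorted(int(part) for part in id_csv.split(","))
print(address, email, quantity, duplicate_ids)
```

Row 3 shares the address but not the email, so it is left out of the group.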
If you want to list every result as a single line, you need a more complex query. This is the one I found working:
CREATE TEMPORARY TABLE IF NOT EXISTS temptable AS (
SELECT GROUP_CONCAT(id) AS ID_DUPLICATES
FROM list
GROUP BY address, email
HAVING COUNT(*)>1
);
SELECT d.*
FROM list AS d, temptable AS t
WHERE FIND_IN_SET(d.id, t.ID_DUPLICATES)
ORDER BY d.id;
Find duplicate Records:
Suppose we have table : Student
student_id int
student_name varchar
Records:
+------------+---------------------+
| student_id | student_name        |
+------------+---------------------+
|        101 | usman               |
|        101 | usman               |
|        101 | usman               |
|        102 | usmanyaqoob         |
|        103 | muhammadusmanyaqoob |
|        103 | muhammadusmanyaqoob |
+------------+---------------------+
Now we want to see duplicate records
Use this query:
select student_name,student_id ,count(*) c from student group by student_id,student_name having c>1;
+---------------------+------------+---+
| student_name        | student_id | c |
+---------------------+------------+---+
| usman               |        101 | 3 |
| muhammadusmanyaqoob |        103 | 2 |
+---------------------+------------+---+
To quickly see the duplicate rows you can run a single simple query
Here I am querying the table and listing all duplicate rows with same user_id, market_place and sku:
select user_id, market_place,sku, count(id)as totals from sku_analytics group by user_id, market_place,sku having count(id)>1;
To delete the duplicate rows you have to decide which row you want to delete, e.g. the one with the lower id (usually the older one) or one chosen by some other date information. In my case I just want to delete the lower ids, since the newest id holds the latest information.
First double check that the right records will be deleted. Here I am selecting the records among the duplicates that will be deleted (by unique id).
select a.user_id, a.market_place,a.sku from sku_analytics a inner join sku_analytics b where a.id< b.id and a.user_id= b.user_id and a.market_place= b.market_place and a.sku = b.sku;
Then I run the delete query to delete the dupes:
delete a from sku_analytics a inner join sku_analytics b where a.id< b.id and a.user_id= b.user_id and a.market_place= b.market_place and a.sku = b.sku;
Backup, Double check, verify, verify backup then execute.
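The preview-then-delete routine can be sketched as below with Python's sqlite3. SQLite does not accept MySQL's multi-table "delete a from ... inner join" form, so this sketch swaps in a different mechanism: it collects the doomed ids with the same self-join and then deletes them by id. The rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sku_analytics (id INTEGER PRIMARY KEY, user_id INT, market_place TEXT, sku TEXT)"
)
conn.executemany(
    "INSERT INTO sku_analytics (id, user_id, market_place, sku) VALUES (?, ?, ?, ?)",
    # Hypothetical rows: ids 1-3 are duplicates of each other, 4 and 5 are unique.
    [(1, 7, "US", "A"), (2, 7, "US", "A"), (3, 7, "US", "A"), (4, 7, "UK", "A"), (5, 8, "US", "B")],
)

# Step 1: preview. A row is doomed if a newer row (higher id) has the same key columns.
doomed_ids = [row[0] for row in conn.execute("""
SELECT DISTINCT a.id
FROM sku_analytics a
INNER JOIN sku_analytics b
  ON a.id < b.id
 AND a.user_id = b.user_id
 AND a.market_place = b.market_place
 AND a.sku = b.sku
ORDER BY a.id
""")]

# Step 2: delete exactly the previewed ids.
conn.executemany("DELETE FROM sku_analytics WHERE id = ?", [(i,) for i in doomed_ids])
remaining_ids = [row[0] for row in conn.execute("SELECT id FROM sku_analytics ORDER BY id")]
print(doomed_ids, remaining_ids)  # [1, 2] [3, 4, 5]
```

Only the newest row of each duplicate group survives, which matches the "keep the higher id" choice described above.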
SELECT * FROM bookings
WHERE DATE(created_at) = '2022-01-11'
AND code IN (
SELECT code FROM bookings
GROUP BY code
HAVING COUNT(code) > 1
) ORDER BY id DESC
Would go with something like this:
SELECT t1.firstname, t1.lastname, t1.address FROM list t1
INNER JOIN list t2
ON
t1.id < t2.id AND
t1.address = t2.address;
select address from list where address = any (select address from (select address, count(id) cnt from list group by address having cnt > 1 ) as t1) order by address
The inner sub-query returns the rows with duplicate addresses, then
the outer sub-query returns just the address column for the addresses with duplicates.
The outer sub-query must return only one column because it is used as the operand of the '= any' operator.
Powerlord's answer is indeed the best, and I would recommend one more change: use LIMIT to make sure the db does not get overloaded:
SELECT firstname, lastname, list.address FROM list
INNER JOIN (SELECT address FROM list
GROUP BY address HAVING count(id) > 1) dup ON list.address = dup.address
LIMIT 10
It is a good habit to use LIMIT when there is no WHERE and when making joins. Start with a small value, check how heavy the query is, and then increase the limit.