MySQL: Compare fields of 3 columns

I have a table that looks like this:
id name dob
007 name1 19680514
007 name2 20110830
16842 name3 19660927
250718 name3 19660927
253692 name4 19350328
25576 name5 19520813
25576 name5 19520813
I need:
- a SELECT statement that gives me every row where the id is the same as in another row, but the corresponding name or dob differs.
example output:
id name dob
007 name1 19680514
007 name2 20110830
- a second statement that gives me every row where the name and (!) dob are the same as in another row, but the id differs.
example output:
id name dob
16842 name3 19660927
250718 name3 19660927
Background:
Completely identical rows are allowed, but:
- every id should be unique for a combination of name&dob
- every combination of name&dob should have only one id assigned.
So I want to find out the entries with errors to be able to manually correct them.
Thank you!

First query:
SELECT yt.* FROM your_table yt
INNER JOIN (
SELECT
id
FROM
your_table
GROUP BY id
HAVING COUNT(DISTINCT name, dob) > 1
) sq ON yt.id = sq.id
Second query:
SELECT yt.* FROM your_table yt
INNER JOIN (
SELECT
name, dob
FROM
your_table
GROUP BY name, dob
HAVING COUNT(DISTINCT id) > 1
) sq ON yt.name = sq.name AND yt.dob = sq.dob

SELECT statement that gives every row where the id is the same as in another row, but the corresponding name or dob differs:
select distinct t1.* from tbl t1
join tbl t2
on t1.id = t2.id
where (t1.name <> t2.name) or (t1.dob <> t2.dob)
Statement that gives every row where the name and (!) dob are the same as in another row, but the id differs:
select distinct t1.* from tbl t1
join tbl t2
on t1.name = t2.name AND t1.dob = t2.dob
where (t1.id <> t2.id)
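To check the self-join approach against the sample rows from the question, here is a minimal sketch using Python's sqlite3 module. SQLite is only a stand-in for MySQL because it needs no server; the table name tbl, the text column types, and the ORDER BY clauses are assumptions made for the illustration.

```python
import sqlite3

# Sample rows from the question; ids are stored as text so "007" keeps its zeros.
rows = [
    ("007", "name1", "19680514"),
    ("007", "name2", "20110830"),
    ("16842", "name3", "19660927"),
    ("250718", "name3", "19660927"),
    ("253692", "name4", "19350328"),
    ("25576", "name5", "19520813"),
    ("25576", "name5", "19520813"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id TEXT, name TEXT, dob TEXT)")
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)

# Same id, but name or dob differs in some other row.
same_id = conn.execute("""
    SELECT DISTINCT t1.id, t1.name, t1.dob
    FROM tbl t1
    JOIN tbl t2 ON t1.id = t2.id
    WHERE t1.name <> t2.name OR t1.dob <> t2.dob
    ORDER BY t1.id, t1.name
""").fetchall()

# Same name and dob, but the id differs in some other row.
same_person = conn.execute("""
    SELECT DISTINCT t1.id, t1.name, t1.dob
    FROM tbl t1
    JOIN tbl t2 ON t1.name = t2.name AND t1.dob = t2.dob
    WHERE t1.id <> t2.id
    ORDER BY t1.id
""").fetchall()

print(same_id)
print(same_person)
```

The fully identical 25576 rows appear in neither result, which matches the rule that complete duplicates are allowed.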

Related

How to compare tables one column value with other column having a group of values in MySQL?

Prerequisite: both columns 'name' and 'sname' in the tables below are of type varchar, and id is int.
Table1
id name
1 test1.txt
2 test2.txt
Table2
id sname
111 ['test1']
222 ['test1', 'test2']
Requirement: To get the matching values of column 'name' and column 'sname' from table1 & table2 respectively
MySQL query:
SELECT *
FROM table1
INNER JOIN table2
ON CONCAT('[\'', substring_index(name,'.',1), '\']') LIKE CONCAT('%', table2.sname, '%');
NOTE:
CONCAT('[\'', substring_index(name,'.',1), '\']') strips the '.txt' extension from table1.name and wraps the result in [' and '] so that it can be compared with the table2.sname values.
Current (wrong) output:
id name id sname
1 test1.txt 11 ['test1']
Expected output (two rows should be returned, since the second row of table2 also contains the matching value 'test1'):
id name id sname
1 test1.txt 11 ['test1']
2 test1.txt 22 ['test1', 'test2']
How to get this result?
If you want "table1.name" to match "table2.sname" as a substring, the pattern has to be built from table1.name and applied to table2.sname, so change the condition to table2.sname LIKE CONCAT('%\'', substring_index(name,'.',1), '\'%').
Then, if you want only the shortest "table2.sname" string for each "table1.id", you can use a ranking function such as ROW_NUMBER and keep the first row per "table1.id" (WHERE rn = 1).
WITH cte AS (
SELECT table1.id AS t1_id,
table1.name AS t1_name,
table2.id AS t2_id,
table2.sname AS t2_name,
ROW_NUMBER() OVER(PARTITION BY table1.id ORDER BY LENGTH(table2.sname)) AS rn
FROM table1
INNER JOIN table2
ON table2.sname LIKE CONCAT('%\'', substring_index(name,'.',1), '\'%')
)
SELECT t1_id,
t1_name,
t2_id,
t2_name
FROM cte
WHERE rn = 1
Check the demo here.
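The effect of swapping the two sides of the LIKE can be sketched with Python's sqlite3 module, shown here without the ROW_NUMBER filter so that every match stays visible. SQLite stands in for MySQL, and because it has no substring_index, the expression substr(name, 1, instr(name, '.') - 1) is an assumed equivalent for "text before the first dot"; the || operator replaces CONCAT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE table2 (id INTEGER, sname TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)",
                 [(1, "test1.txt"), (2, "test2.txt")])
conn.executemany("INSERT INTO table2 VALUES (?, ?)",
                 [(111, "['test1']"), (222, "['test1', 'test2']")])

# The pattern is built from table1.name (for example %'test1'%) and
# matched against table2.sname, not the other way round.
matches = conn.execute("""
    SELECT table1.id, table1.name, table2.id, table2.sname
    FROM table1
    JOIN table2
      ON table2.sname LIKE '%''' || substr(name, 1, instr(name, '.') - 1) || '''%'
    ORDER BY table1.id, table2.id
""").fetchall()

for row in matches:
    print(row)
```

With the sample tables, test1.txt matches both table2 rows and test2.txt matches only the second one.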

How join statements execute in sql

I'm trying to fetch data from the user table so that every row shows a non-null date. If a row's date is null, it should show the date of another row with the same id that does have a date.
Can this be done with only a SELECT statement, without updating the table rows?
Here is the table
NAME, DATE, ID
A, 2021-01-21, 1
B, null, 1
C, null, 1
D, 2021-01-18, 2
D, null, 2
It should be viewed like
A, 2021-01-21, 1
B, 2021-01-21, 1
C, 2021-01-21, 1
D, 2021-01-18, 2
D, 2021-01-18, 2
The query I came up with is:
select t1.name, t2.date ,t1.id from user t1
left join (select id ,date from user where id=1) t2
on t1.id=t2.id;
But this query doesn't work the way I expected.
Can anyone please explain how the join in this query works, and how I can improve it to get the required result?
To test the query above, use these statements:
create table user(
name varchar(20),
date date,
id integer
);
insert into user values("A",'2021-01-21',1);
insert into user values("",null,1);
insert into user values("",null,1);
insert into user values("",null,1);
insert into user values("",null,1);
insert into user values("",null,1);
insert into user values("B",'2021-01-20',2);
select t1.name, t2.date ,t1.id from user t1
left join (select id ,date from user where id=1) t2
on t1.id=t2.id;
The first problem is that you are joining a table with itself on the condition t1.id = t2.id. If you have, for example, 4 rows with id=1 and 3 rows with id=2, a self-join on id yields 4 * 4 + 3 * 3 = 25 rows. In your specific case, where the subquery keeps only the id=1 rows, you end up with 6 * 6 + 1 = 37 rows: 36 matches for id=1, plus the single id=2 row that the left join keeps with a null date.
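The row count can be checked with a short sketch in Python's sqlite3 module, with SQLite standing in for MySQL. It loads the test rows from the question and counts what the original left join returns; the table name user_demo is a hypothetical replacement for user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_demo (name TEXT, date TEXT, id INTEGER)")
# One dated row and five empty rows for id=1, one dated row for id=2.
conn.executemany(
    "INSERT INTO user_demo VALUES (?, ?, ?)",
    [("A", "2021-01-21", 1)] + [("", None, 1)] * 5 + [("B", "2021-01-20", 2)],
)

# Each of the 6 id=1 rows matches all 6 id=1 rows of the subquery (36 rows);
# the id=2 row has no match and is kept once by the LEFT JOIN.
(row_count,) = conn.execute("""
    SELECT COUNT(*)
    FROM user_demo t1
    LEFT JOIN (SELECT id, date FROM user_demo WHERE id = 1) t2
      ON t1.id = t2.id
""").fetchone()

print(row_count)
```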
The second problem is that you have hard-coded id=1 in your subquery:
(select id ,date from user where id=1) t2
This can't be the appropriate value for all possible rows.
You could try the obvious:
select
t1.name,
ifnull(t1.date, (select t2.date from user t2 where t2.date is not null and t2.id = t1.id limit 1)) as date,
t1.id
from user t1
;
see db-fiddle
name | id | date
A | 1 | 2021-01-21
'' | 1 | 2021-01-21
'' | 1 | 2021-01-21
'' | 1 | 2021-01-21
'' | 1 | 2021-01-21
'' | 1 | 2021-01-21
B | 2 | 2021-01-20
But better would be to use a join:
select u.name, ifnull(u.date, sq.date) as date, u.id
from user u join (
select id, min(date) as date from user group by id
) sq on u.id = sq.id
;
see db-fiddle
I would expect the second version using a join to be more efficient because the first version has a dependent subquery that has to get executed for every row that has a null date.
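A minimal sketch of the join version, run against the five rows shown in the question rather than the test inserts, using Python's sqlite3 module as a stand-in for MySQL. The table name user_demo and the ORDER BY are assumptions for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_demo (name TEXT, date TEXT, id INTEGER)")
conn.executemany(
    "INSERT INTO user_demo VALUES (?, ?, ?)",
    [
        ("A", "2021-01-21", 1),
        ("B", None, 1),
        ("C", None, 1),
        ("D", "2021-01-18", 2),
        ("D", None, 2),
    ],
)

# The derived table holds one non-null date per id (MIN ignores nulls);
# IFNULL falls back to it only when the row's own date is null.
filled = conn.execute("""
    SELECT u.name, IFNULL(u.date, sq.date) AS date, u.id
    FROM user_demo u
    JOIN (SELECT id, MIN(date) AS date FROM user_demo GROUP BY id) sq
      ON u.id = sq.id
    ORDER BY u.id, u.name
""").fetchall()

for row in filled:
    print(row)
```

If an id has several different non-null dates, MIN picks the earliest one for the rows with a null date; that choice comes from the query above, not from the question.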
You don't need a join. Just use a window function:
select name,
max(date) over (partition by id) as date,
id
from user;
Note that your sample data doesn't match the data in the question. That data suggests:
select max(name) over (partition by id) as name,
max(date) over (partition by id) as date,
id
from user;
Here is a db<>fiddle.

Select all orders except the max order for each distinct customer

Sorry for the poor formatting but as part of a larger problem, I have created a query that produces this table:
id id2
4 7
4 6
1 3
1 2
1 1
How would I extract the rows that don't have the highest id2 for each id?
What I want:
id id2
4 6
1 2
1 1
I can only figure out how to get rid of the overall max id2, not the max for each distinct id. Any help on finding the max id2 for each id would be appreciated.
You can try it this way:
select a.id, a.id2
from tablename a
where a.id2 <> (select max(a1.id2) from tablename a1 where a.id=a1.id)
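As a quick check, the correlated subquery can be run on the sample rows with Python's sqlite3 module, with SQLite standing in for MySQL. The table name tablename follows the answer; the ORDER BY is an assumption added to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER, id2 INTEGER)")
conn.executemany("INSERT INTO tablename VALUES (?, ?)",
                 [(4, 7), (4, 6), (1, 3), (1, 2), (1, 1)])

# For each row, the subquery finds the largest id2 within the same id;
# the row is kept only if its own id2 is not that maximum.
non_max = conn.execute("""
    SELECT a.id, a.id2
    FROM tablename a
    WHERE a.id2 <> (SELECT MAX(a1.id2) FROM tablename a1 WHERE a1.id = a.id)
    ORDER BY a.id DESC, a.id2 DESC
""").fetchall()

print(non_max)
```

An id that has only one id2 value disappears entirely with this query, because its single row is also its maximum.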
If you are using MySQL 8+, then RANK() provides one option:
WITH cte AS (
SELECT id, id2, RANK() OVER (PARTITION BY id ORDER BY id2 DESC) rnk
FROM yourTable
)
SELECT id, id2
FROM cte
WHERE rnk > 1
ORDER BY id DESC, id2 DESC;
Demo
Instead of a correlated subquery in the WHERE clause, you can LEFT JOIN to the per-id maximum and keep only the rows that find no match (an anti-join):
select YT.id, YT.id2
from yourTable YT
LEFT JOIN
( select id, max( id2 ) highestID2
from yourTable
group by id ) TopPerID
on YT.id = TopPerID.id
AND YT.id2 = TopPerID.highestID2
where TopPerID.id IS NULL
Since you can have id values with only one id2 value, you need to check for that situation as well, which you can do by comparing the MAX(id2) value with the MIN(id2) value in a JOIN:
SELECT t1.*
FROM Table1 t1
JOIN (SELECT id, MAX(id2) AS max_id2, MIN(id2) AS min_id2
FROM Table1
GROUP BY id) t2 ON t2.id = t1.id
AND (t1.id2 < t2.max_id2 OR t2.min_id2 = t2.max_id2)
If we add a row 2, 5 to your sample data this correctly gives the result as
id id2
4 6
1 2
1 1
2 5
Demo on SQLFiddle

How to exclude from my populated query each row with the first occurrence of a value which appears in subsequent rows

I want to exclude from my populated query each row with the first occurrence of a value which appears in subsequent rows.
I've looked into offset but this only applies to the whole table
SELECT
myTable.name,
myTable.Id
FROM myTable
GROUP BY myTable.name HAVING COUNT(*) > 1
ORDER BY myTable.name ASC, myTable.Id ASC
What I'm getting:
NAME ID
A 1
A 2
A 3
B 1
B 2
B 3
What I want:
NAME ID
A 2
A 3
B 2
B 3
You can filter in the where clause:
select name, id
from t
where t.id > (select min(t2.id) from t t2 where t2.name = t.name);
I excluded the lowest ID per name by concatenating NAME and ID and filtering out the combinations that carry the MIN value of each group (SQL Server syntax):
DECLARE @myTable TABLE ( NAME NVARCHAR(MAX), ID INT )
INSERT INTO @myTable VALUES
('A',1),
('A',2),
('A',3),
('B',1),
('B',2),
('B',3)
SELECT * FROM @myTable
WHERE CONCAT(NAME,ID) NOT IN
(
SELECT CONCAT(NAME,MIN(ID)) FROM @myTable
GROUP BY NAME
)
GROUP BY NAME,ID
ORDER BY NAME,ID ASC
OUTPUT:
NAME ID
A 2
A 3
B 2
B 3

MySql: How to select rows where all values are the same?

I have a table like this:
name |id | state
name1 12 4
name1 12 4
name2 33 3
name2 33 4
...
I want to select every name and id from table where state is only 4, that means name1 is correct, because it only has two records with state 4 and nothing more. Meanwhile name2 is wrong, because it has record with state 4 and record with state 3.
You can use aggregation as shown below:
SELECT name, id
FROM your_table
GROUP BY name, id
HAVING SUM(state<>4)=0;
See a Demo on SQL Fiddle.
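The aggregation can be tried on the four sample rows with Python's sqlite3 module; SQLite stands in for MySQL here and also evaluates state <> 4 to 0 or 1, so the SUM counts the rows whose state is not 4. The column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (name TEXT, id INTEGER, state INTEGER)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [("name1", 12, 4), ("name1", 12, 4), ("name2", 33, 3), ("name2", 33, 4)],
)

# A group qualifies only if none of its rows has a state other than 4.
only_state_4 = conn.execute("""
    SELECT name, id
    FROM your_table
    GROUP BY name, id
    HAVING SUM(state <> 4) = 0
""").fetchall()

print(only_state_4)
```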
select name, id from mytable where id not in
(select distinct id from mytable where state <> 4)
You might need two subqueries: one that counts the rows per name where state is 4, and one that counts all rows per name. If both counts are the same for a name, select it:
select t1.name
from (select name, count(*) as cnt from mytable where state = 4 group by name) t1
join (select name, count(*) as cnt from mytable group by name) t2
on t1.name = t2.name
where t1.cnt = t2.cnt
You can use not exists like this:
select distinct name, id
from table1 a
where not exists (select *
from table1 b
where a.id=b.id and state<>4)
In a more general case you can use count distinct (with not exists or with a join):
select distinct name, id
from table1 a
where not exists (
select *
from table1 b
where a.id=b.id
group by id
HAVING count(distinct state)>1)