Removing redundant rows from db table - mysql

I have a database table like the following (bad design I know, but there are a ton of rows like this):
person1 | person2 | counselor
Jane Doe | John Doe | Mary Smith
John Doe | Jane Doe | Mary Smith
Frank Jones | Ann Jones | Tom Jones
Ann Jones | Frank Jones | Tom Jones
I'm trying to figure out how to just select one of the 'unique' rows so that a result would look like:
person1 | person2 | counselor
Jane Doe | John Doe | Mary Smith
Frank Jones | Ann Jones | Tom Jones
I've tried various things like SELECT distinct and SELECT MIN(person1), etc., but am striking out.

You will have 6 permutations of (person1,person2,counselor) and you can use union with all of them. Finally use a where clause so only one row per combination will be returned.
select * from (
select person1,person2,counselor
from tablename
union
select person1,counselor,person2
from tablename
union
select person2,person1,counselor
from tablename
union
select person2,counselor,person1
from tablename
union
select counselor,person2,person1
from tablename
union
select counselor,person1,person2
from tablename) t
where person1 < person2 and person2 < counselor

I included a case with no reversed duplicate and another case where person1 = person2. P2.* is in the select list only for debugging.
SELECT P1.person1, P1.person2, P1.counselor, P2.*
FROM patient P1
LEFT JOIN patient P2
ON P1.person1 = P2.person2
AND P1.person2 = P2.person1
AND P1.counselor = P2.counselor
WHERE
concat(P1.person1, P1.person2) <= concat(P2.person1, P2.person2)
OR P2.person1 is null
When P2 is NULL, there is no reversed combination of person1 and person2, so the row is kept.
When the reversed combination exists, only the row whose concatenated names compare smaller as strings is chosen.
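A different way to get the same effect is to sort each pair before comparing, so that reversed rows become identical and DISTINCT can collapse them. Below is a minimal sketch of that idea, run through Python's sqlite3 module as a stand-in for MySQL. The table name patient is borrowed from the answer above; the aliases first_person and second_person are made up for illustration. SQLite spells the two-argument functions MIN/MAX, whereas MySQL calls them LEAST/GREATEST.

```python
import sqlite3

# In-memory table mirroring the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (person1 TEXT, person2 TEXT, counselor TEXT)")
conn.executemany(
    "INSERT INTO patient VALUES (?, ?, ?)",
    [
        ("Jane Doe", "John Doe", "Mary Smith"),
        ("John Doe", "Jane Doe", "Mary Smith"),
        ("Frank Jones", "Ann Jones", "Tom Jones"),
        ("Ann Jones", "Frank Jones", "Tom Jones"),
    ],
)

# Two-argument MIN/MAX are scalar functions in SQLite (LEAST/GREATEST in MySQL).
# Putting the smaller name first makes reversed rows compare equal.
rows = conn.execute(
    """
    SELECT DISTINCT MIN(person1, person2) AS first_person,
                    MAX(person1, person2) AS second_person,
                    counselor
    FROM patient
    ORDER BY first_person
    """
).fetchall()

for row in rows:
    print(row)
```

Note that this returns each pair in alphabetical order, which is not necessarily the order stored in either original row.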

WITH ranked_records AS
(
SELECT *,
-- assumes an id column such as AddressID to order the duplicates
ROW_NUMBER() OVER (PARTITION BY person1, person2, counselor
ORDER BY AddressID) AS ranked
FROM address
)
SELECT * FROM ranked_records
WHERE ranked > 1
The rows with ranked > 1 are the later duplicates. To delete those and keep the oldest record in MySQL, assuming the table has an ID column, try this:
-- MySQL rejects a subquery on the DELETE target table (error 1093), so use a self-join
DELETE A1
FROM Address A1
JOIN Address A2
ON A2.person1 = A1.person1
AND A2.person2 = A1.person2
AND A2.counselor = A1.counselor
AND A1.AddressID > A2.AddressID

Related

How can I create a Group_Concat field per table using values from other records

Suppose I have a table of first and last names, and for each record I want to build a comma-delimited list of relatives.
1ST | LAST | RELATIVES
Bob | Smith | Alice,Andrew
Alice | Smith | Bob,Andrew
Andrew | Smith | Bob,Alice
Alex | Jones | Anny, Ricky
Anny | Jones | Alex, Ricky
Ricky | Jones | Alex, Anny
As per this sqlFiddle
http://sqlfiddle.com/#!9/25d80c/1
I know how to run group_concat manually for any one last name, but I am unclear how to make each record find the records with a matching last name and run the same group_concat.
You can do it with a self LEFT join and aggregation:
SELECT s1.First, s1.Last,
GROUP_CONCAT(s2.First) Relatives
FROM Surnames s1 LEFT JOIN Surnames s2
ON s2.Last = s1.Last AND s2.First <> s1.First
GROUP BY s1.First, s1.Last;
See the demo.
You can put the aggregation in a lateral join, like so:
select s.First, s.Last, r.Relatives
from Surnames s,
lateral (
select group_concat(First) Relatives
from Surnames r
where s.Last = r.Last AND s.First != r.First
)r
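For anyone who wants to try the self-join variant without a MySQL server, here is a rough sketch using Python's sqlite3 module, which also has GROUP_CONCAT. The column names first_name and last_name are my own renames (to stay clear of SQLite keywords), and the order of names inside each list is not guaranteed, so the check below compares sets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# first_name / last_name are illustrative renames of the question's First / Last.
conn.execute("CREATE TABLE Surnames (first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO Surnames VALUES (?, ?)",
    [
        ("Bob", "Smith"), ("Alice", "Smith"), ("Andrew", "Smith"),
        ("Alex", "Jones"), ("Anny", "Jones"), ("Ricky", "Jones"),
    ],
)

# Self LEFT JOIN: pair every person with the other people sharing the last name,
# then collapse those matches into one comma-separated string per person.
rows = conn.execute(
    """
    SELECT s1.first_name, s1.last_name, GROUP_CONCAT(s2.first_name) AS relatives
    FROM Surnames s1
    LEFT JOIN Surnames s2
      ON s2.last_name = s1.last_name AND s2.first_name <> s1.first_name
    GROUP BY s1.first_name, s1.last_name
    """
).fetchall()

# A person without relatives would get NULL (None) thanks to the LEFT JOIN.
relatives = {
    first: set(rel.split(",")) if rel else set()
    for first, _last, rel in rows
}
print(relatives)
```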

Select Distinct value Multiple Columns

This is my simplified table
year | teacher
1 | john
2 | john
2 | sam
3 | john
3 | simon
When I run the query below
SELECT year, teacher FROM table1 GROUP BY year
It gives me the result :
year | teacher
1 | john
2 | john
3 | john
In this case, year column is fine as it shows all distinct value, however teacher column is still repeated. I wish to have distinct values on teacher columns too.
Output I am looking for :
year | teacher
1 | john
2 | sam
3 | simon
This query is not valid SQL (even if MySQL happens to accept it):
SELECT year, teacher
FROM table1
GROUP BY year;
You need an aggregation function around teacher:
SELECT year, MAX(teacher)
FROM table1
GROUP BY year;
That said, this doesn't do what you want. That is hard to do in a single query. Instead, use two queries:
SELECT DISTINCT year FROM table1;
SELECT DISTINCT teacher FROM table1;

Distinct values of grouped attributes in SQL

I have two tables:
Table1              Table2
name   | object     name_old | name_corr
-------|-------     ---------|----------
John   | A          John     | John
Ben    | B          Ben      | Ben
Jon    | B          Jon      | John
Be n   | B          Be n     | Ben
Peter  | B          Peter    | Peter
Petera | C          Petera   | Peter
In my example there are three persons. Table1 contains some typing errors, so Table2 maps every name to its correct spelling.
Now I want the distinct objects for every person (John, Ben, Peter).
This would be the outcome:
John  | A
      | B
Ben   | B
Peter | B
      | C
This was my try, but I get an error:
Select b.name_corr, distinct(a.object) from Table1 as a join Table2 as b on (a.name=b.name_old) group by b.name_corr
Without the grouping, meaning if I select a specific name via 'where' my query works.
distinct isn't a function. It is a qualifier on select:
Select distinct b.name_corr, a.object
from Table1 a join
Table2 b
on a.name = b.name_old;
Use group_concat to get one row per person with a comma-separated list of distinct objects:
Select b.name_corr, group_concat(distinct a.object) from Table1 as a join Table2 as b on (a.name=b.name_old) group by b.name_corr;
I figured out a solution to my problem: a double 'group by'. So simple.
Thanks for the help anyway.
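The asker did not post the final query, so the following is only a guess at what "double group by" means: grouping by both the corrected name and the object. It is sketched with Python's sqlite3 module and the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (name TEXT, object TEXT)")
conn.execute("CREATE TABLE Table2 (name_old TEXT, name_corr TEXT)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?)",
    [("John", "A"), ("Ben", "B"), ("Jon", "B"),
     ("Be n", "B"), ("Peter", "B"), ("Petera", "C")],
)
conn.executemany(
    "INSERT INTO Table2 VALUES (?, ?)",
    [("John", "John"), ("Ben", "Ben"), ("Jon", "John"),
     ("Be n", "Ben"), ("Peter", "Peter"), ("Petera", "Peter")],
)

# Grouping by both columns yields one row per (corrected name, object) pair,
# which is the same result as SELECT DISTINCT on those two columns.
rows = conn.execute(
    """
    SELECT b.name_corr, a.object
    FROM Table1 AS a
    JOIN Table2 AS b ON a.name = b.name_old
    GROUP BY b.name_corr, a.object
    ORDER BY b.name_corr, a.object
    """
).fetchall()

for row in rows:
    print(row)
```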

optimal query variant for msg senders table

I have a table "s_msgs" with the following structure:
id | from | to
------------------
1 | John | Robert
2 | John | Michael
3 | Robert | John
4 | Michael | John
I need to obtain every sender-recipient pair only once, regardless of direction, so the result must be:
John | Robert
John | Michael
I wrote the query below, but I don't think it is optimal, especially since the table is expected to hold several million rows. Can anyone suggest a more efficient query?
This is my query:
SELECT `from`,`to` FROM s_msgs WHERE id IN(
SELECT id FROM (
SELECT MIN(id) AS id,
CASE
WHEN STRCMP(`to`,`from`) = -1 THEN CONCAT(`to`,`from`)
ELSE CONCAT(`from`,`to`)
END
AS conc
FROM s_msgs
GROUP BY conc
) AS t
)
How about
SELECT DISTINCT `from`, `to`
FROM YOUR_TABLE
UNION
SELECT DISTINCT `to`,`from`
FROM YOUR_TABLE
EDIT
I have a new query; the id column is what makes it work.
SELECT *
FROM MY_TABLE f
WHERE NOT EXISTS (
SELECT 1
FROM MY_TABLE t
WHERE f.`from` = t.`to`
AND f.`to` = t.`from`
AND f.id > t.id)
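To see what the NOT EXISTS query keeps, here is a small sketch on the question's four rows, using Python's sqlite3 module in place of MySQL (SQLite accepts the same backtick quoting for `from` and `to`). A row survives only if no reversed row with a smaller id exists. Exact repeats in the same direction are not filtered by this condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Backticks quote the reserved words `from` and `to`, as in the MySQL original.
conn.execute("CREATE TABLE s_msgs (id INTEGER PRIMARY KEY, `from` TEXT, `to` TEXT)")
conn.executemany(
    "INSERT INTO s_msgs VALUES (?, ?, ?)",
    [
        (1, "John", "Robert"),
        (2, "John", "Michael"),
        (3, "Robert", "John"),
        (4, "Michael", "John"),
    ],
)

# Keep a row unless an earlier row (smaller id) holds the reversed pair.
rows = conn.execute(
    """
    SELECT f.`from`, f.`to`
    FROM s_msgs f
    WHERE NOT EXISTS (
        SELECT 1
        FROM s_msgs t
        WHERE f.`from` = t.`to`
          AND f.`to` = t.`from`
          AND f.id > t.id)
    ORDER BY f.id
    """
).fetchall()

for row in rows:
    print(row)
```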

Select every other row as male/female from mysql table

I've got a table containing persons gender-coded as 0 and 1. I need to select every other row as male/female. I thought I could manage this somehow by using modulo and the gender-codes 0 and 1, but I haven't managed to figure it out yet...
The result I'm looking for would look like this:
+-----+--------+-------+
| row | gender | name |
+-----+--------+-------+
| 1 | female | Lisa |
| 2 | male | Greg |
| 3 | female | Mary |
| 4 | male | John |
| 5 | female | Jenny |
+-----+--------+-------+
etc.
The alternative is to do it in PHP by merging 2 separate arrays, but I would really like it as a SQL query...
Any suggestions are appreciated!
Use two subqueries, one selecting males and one selecting females, and use a ranking function to enumerate each set.
Males:
1 | Peter
2 | John
3 | Chris
Females:
1 | Marry
2 | Christina
3 | Kate
Then multiply the rank by 10, and add 5 for females. So you have this:
Males:
10 | Peter
20 | John
30 | Chris
Females:
15 | Marry
25 | Christina
35 | Kate
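Then do the UNION ALL and sort by the new sort order.
Together it should look like this (MySQL 8.0+ for ROW_NUMBER(); assuming a table persons with gender 0 = male):
SELECT name
FROM (
SELECT name, ROW_NUMBER() OVER (ORDER BY name) * 10 AS sortOrd
FROM persons WHERE gender = 0 -- males: 10, 20, 30, ...
UNION ALL
SELECT name, ROW_NUMBER() OVER (ORDER BY name) * 10 + 5 AS sortOrd
FROM persons WHERE gender = 1 -- females: 15, 25, 35, ...
) AS t
ORDER BY sortOrd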
Result should be like this:
Males and Females:
10 | Peter
15 | Marry
20 | John
25 | Christina
30 | Chris
35 | Kate
I found "Emulate Row_Number()" and modified it a bit for your case.
set @rownum := 0;
set @pg := -1;
select p.name,
p.gender
from
(
select name,
gender,
@rownum := if(@pg = gender, @rownum+1, 1) as rn,
@pg := gender as pg
from persons
order by gender
) as p
order by p.rn, p.gender
Note: From 9.4. User-Defined Variables
As a general rule, you should never assign a value to a user variable
and read the value within the same statement. You might get the
results you expect, but this is not guaranteed.
I will leave it up to you to decide if you can use this. I don't use MySQL, so I can't really tell you whether you should be concerned.
Similar to Mikael's solution but without the need to order the resultset multiple times -
SELECT *
FROM (
SELECT people.*,
IF(gender=0, @mr:=@mr+1, @fr:=@fr+1) AS rnk
FROM people, (SELECT @mr:=0, @fr:=0) initvars
) tmp
ORDER BY rnk ASC, gender ASC;
To avoid having to order both the inner and outer selects I have used separate counters (@mr - male rank, @fr - female rank) in the inner select. The alias is rnk because RANK is a reserved word in MySQL 8.0 and later.
I've got a table containing persons gender-coded as 0 and 1
Then why would you make assumptions on the order of rows in the result set? Seems to me transforming the 0/1 into 'male'/'female' is far more robust:
select name, case gender when 0 then 'male' else 'female' end
from Person
SELECT alias.*, ROW_NUMBER() OVER (PARTITION BY GENDER ORDER BY GENDER) rnk
FROM TABLE_NAME alias
ORDER BY rnk, GENDER DESC
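Here is a small runnable sketch of the window-function approach, using Python's sqlite3 module as a stand-in for MySQL 8.0 (it needs SQLite 3.25 or newer for ROW_NUMBER()). The table persons, its id column, and the insertion order are assumptions made for the demo. Each gender is numbered separately, then rows are sorted by that number with females (1) first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: gender 0 = male, 1 = female, as in the answers above.
conn.execute("CREATE TABLE persons (id INTEGER PRIMARY KEY, name TEXT, gender INTEGER)")
conn.executemany(
    "INSERT INTO persons (name, gender) VALUES (?, ?)",
    [("Greg", 0), ("John", 0), ("Lisa", 1), ("Mary", 1), ("Jenny", 1)],
)

# Number the rows within each gender, then interleave by sorting on that number.
# gender DESC puts the female row before the male row that has the same number.
rows = conn.execute(
    """
    SELECT CASE gender WHEN 0 THEN 'male' ELSE 'female' END AS gender_label,
           name
    FROM (
        SELECT name, gender,
               ROW_NUMBER() OVER (PARTITION BY gender ORDER BY id) AS rn
        FROM persons
    ) AS ranked
    ORDER BY rn, gender DESC
    """
).fetchall()

for position, (gender_label, name) in enumerate(rows, start=1):
    print(position, gender_label, name)
```

When one gender has more rows than the other, the surplus rows simply follow at the end without a partner.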