How to select distinct pairs in MySQL join (same table) with transitivity? - mysql

I'm facing a very poorly designed database with a non-normalized table X.
This table X should have a N:M relationship with another table Y.
The problem is that this relationship is currently 1:N, and the jury-rigged solution so far has been to duplicate entries whenever several records had to be related.
Simplifying, I have this:
| ID | TEXT | LOCATION_ID |
| 1 | foo | 1 |
| 2 | foo | 2 |
| 3 | bar | 1 |
| 4 | bar | 4 |
| 5 | bar | 3 |
I have to normalize this table. So, my first idea was to try to obtain pairs of similar records. Something like this:
| a.ID | b.ID |
| 1 | 2 |
| 3 | 4 |
| 3 | 5 |
Experimenting a little bit:
SELECT a.id, b.id
FROM mytable AS a
INNER JOIN mytable AS b
ON a.text = b.text AND a.id != b.id
GROUP BY a.id, b.id
This led to a problem like this:
| a.ID | b.ID |
| 1 | 2 |
| 2 | 1 |
| 3 | 4 |
| 3 | 5 |
| 4 | 3 |
| 4 | 5 |
| 5 | 3 |
| 5 | 4 |
The pairs were duplicated.
After some digging, I realized that this was more efficient:
SELECT a.id, b.id
FROM mytable AS a
INNER JOIN mytable AS b
ON a.text = b.text AND a.id < b.id
GROUP BY a.id, b.id
So, I got this:
| a.ID | b.ID |
| 1 | 2 |
| 3 | 4 |
| 3 | 5 |
| 4 | 5 |
But I still need to get rid of that last row (4, 5), since it already follows from (3, 4) and (3, 5).

Group on only one side and take the MIN() of the other:
SELECT MIN(a.ID) a, b.ID b
FROM mytable a JOIN mytable b ON b.text = a.text AND b.ID > a.ID
GROUP BY b.ID
See it on sqlfiddle.
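A minimal sketch of the MIN() approach run against the sample rows from the question. It uses Python's sqlite3 module with an in-memory database as a stand-in for MySQL (an assumption made only so the example is runnable); the ORDER BY is added to make the output order deterministic.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, text TEXT, location_id INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, "foo", 1), (2, "foo", 2), (3, "bar", 1), (4, "bar", 4), (5, "bar", 3)],
)

# Every later duplicate (b) is paired only with the smallest id sharing its text,
# so the redundant pair (4, 5) never appears.
pairs = conn.execute(
    """
    SELECT MIN(a.id) AS a_id, b.id AS b_id
    FROM mytable a
    JOIN mytable b ON b.text = a.text AND b.id > a.id
    GROUP BY b.id
    ORDER BY a_id, b_id
    """
).fetchall()

print(pairs)  # [(1, 2), (3, 4), (3, 5)]
```

Each group of duplicates ends up with a single representative (the lowest id), which is handy when the duplicates are later collapsed into one row.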

Related

Count how many rows with the same value

I have the following tables:
Table A:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ID | User | Enterpr_id |
| 1 | test1 | 1 |
| 2 | test2 | 2 |
| 3 | test3 | 3 |
| 4 | test4 | 4 |
| 5 | test5 | 1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Table B:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Enterpr_id | Name |
| 1 | Nespresso |
| 2 | what |
| 3 | else |
| 4 | need |
| 5 | help |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Table A has a foreign key on Enterpr_id. How can I count the users of each enterprise and get this expected output:
Nespresso - 2 users
what - 1 user
else - 1 user
need - 1 user
help - 0 user
That's a simple join:
select a.user, b.name
from tablea a
inner join tableb b on b.enterpr_id = a.enterpr_id
Edit: from your updated question, you seem to want aggregation and a left join:
select b.name, count(a.id) cnt_users
from tableb b
left join tablea a on a.enterpr_id = b.enterpr_id
group by b.enterpr_id, b.name
order by b.enterpr_id
It's a left join query with a count:
SELECT Name, COUNT(TableA.ID)
FROM TableB LEFT JOIN TableA ON TableB.Enterpr_id = TableA.Enterpr_id
GROUP BY TableB.Enterpr_id, TableB.Name;
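A small runnable sketch of the LEFT JOIN plus COUNT idea, using the sample rows from the question. SQLite (via Python's sqlite3) is used in place of MySQL purely as an assumption for testing; the column is spelled Enterpr_id as in the question's tables.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE TableB (Enterpr_id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE TableA (ID INTEGER PRIMARY KEY, User TEXT, Enterpr_id INTEGER);
    INSERT INTO TableB VALUES (1, 'Nespresso'), (2, 'what'), (3, 'else'),
                              (4, 'need'), (5, 'help');
    INSERT INTO TableA VALUES (1, 'test1', 1), (2, 'test2', 2), (3, 'test3', 3),
                              (4, 'test4', 4), (5, 'test5', 1);
    """
)

# COUNT(a.ID) ignores the NULLs produced by the LEFT JOIN, so an enterprise
# without users gets 0 rather than 1.
counts = conn.execute(
    """
    SELECT b.Name, COUNT(a.ID) AS cnt_users
    FROM TableB b
    LEFT JOIN TableA a ON a.Enterpr_id = b.Enterpr_id
    GROUP BY b.Enterpr_id, b.Name
    ORDER BY b.Enterpr_id
    """
).fetchall()

for name, cnt in counts:
    print(f"{name} - {cnt} users")
```

Counting a column from the right-hand table, rather than COUNT(*), is what keeps "help" at zero.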

Join multiple SQL tables [duplicate]

My attempt, which is not working:
SELECT a.name, atn.name
FROM t1 a
JOIN t2 ap ON a.id = ap.area_id
JOIN t3 atn ON atn.id = ap.parent_id
I have a table t1 with area names and their type (like pin, ward and simple area name) and table t2 with their mapping and table t3 with type name and their id's.
I want a result with three columns (area name, pin, ward), i.e. the result should show which pin and ward each area falls under.
t1:
--------------------------
| id | area name | type |
---------------------------
| 1 | a | 5 |
| 2 | b | 8 |
| 3 | x | 7 |
| 4 | z | 8 |
| 5 | pq | 8 |
---------------------------
t2:
------------------------------
| id | area_id | parent_id |
------------------------------
| 1 | 2 | 1 |
| 2 | 2 | 3 |
| 3 | 4 | 1 |
| 4 | 5 | 3 |
-----------------------------
t3:
------------------
| id | name |
------------------
| 5 | pin |
| 7 | ward |
| 8 | area |
------------------
Result:
--------------------------
| area | pin | ward |
--------------------------
| b | a | x |
| z | a | |
| pq | | x |
--------------------------
Anybody knows how to get this, please help me. I don't know how to get that value. I tried but couldn't find anything.
Just a guess. Pivoting parent name by parent type
SELECT a.name,
max(case when atn.name = 'pin' then p.name end) as pin,
max(case when atn.name = 'ward' then p.name end) as ward
FROM t2 ap
JOIN t1 a ON a.id = ap.area_id
JOIN t1 p ON p.id = ap.parent_id
JOIN t3 atn ON atn.id = p.type
GROUP BY a.name
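To check the pivot against the sample data, here is a hedged sketch using Python's sqlite3 as a stand-in for MySQL. It assumes the "area name" column of t1 is actually called name, and it groups and orders by a.id as well, only to make the output order predictable.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t1 (id INTEGER PRIMARY KEY, name TEXT, type INTEGER);
    CREATE TABLE t2 (id INTEGER PRIMARY KEY, area_id INTEGER, parent_id INTEGER);
    CREATE TABLE t3 (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO t1 VALUES (1, 'a', 5), (2, 'b', 8), (3, 'x', 7), (4, 'z', 8), (5, 'pq', 8);
    INSERT INTO t2 VALUES (1, 2, 1), (2, 2, 3), (3, 4, 1), (4, 5, 3);
    INSERT INTO t3 VALUES (5, 'pin'), (7, 'ward'), (8, 'area');
    """
)

# t1 is joined twice: once as the area (a) and once as its parent (p).
# The parent's type name decides which output column receives the parent's name.
rows = conn.execute(
    """
    SELECT a.name AS area,
           MAX(CASE WHEN atn.name = 'pin'  THEN p.name END) AS pin,
           MAX(CASE WHEN atn.name = 'ward' THEN p.name END) AS ward
    FROM t2 ap
    JOIN t1 a   ON a.id = ap.area_id
    JOIN t1 p   ON p.id = ap.parent_id
    JOIN t3 atn ON atn.id = p.type
    GROUP BY a.id, a.name
    ORDER BY a.id
    """
).fetchall()

print(rows)  # [('b', 'a', 'x'), ('z', 'a', None), ('pq', None, 'x')]
```

The MAX(CASE ...) pattern only handles a fixed, known set of parent types; a dynamic number of columns would need a different approach.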

Counts from 3 tables with 2 left joins and 1 composite primary key

I have 3 tables like this
SecretAgents
| id | name |
|----|------|
| 1 | A |
| 2 | B |
Victims
| id | name | agent_id |
|----|------|----------|
| 1 | Z | 1 |
| 2 | Y | 1 |
| 3 | X | 2 |
Data
| id | keys | values | victim_id | form_id |
|----|------|--------|-----------|---------|
| 1 | a1 | x | 1 | 1 |
| 2 | a2 | xx | 1 | 2 |
| 3 | a3 | xxx | 2 | 1 |
| 4 | a5 | xxx | 1 | 1 |
I have to get the count of forms (here victim_id and form_id together form a composite primary key) and the count of victims for each agent.
I have tried this for any 2 tables with left joins and group by, but I am not able to achieve both together. If anyone can be generous enough to offer a pointer/solution, that would be super awesome.
EDIT 1: The query
This is definitely not the right query but anyways
SELECT count(DISTINCT v.id) as victimcount, `sa`.`username`, `sa`.`id`, count(DISTINCT d.form_id) as submissions
FROM `SecretAgents` as `sa`
LEFT JOIN `Victims` as `v` ON `v`.`agent_id`=`sa`.`id`
LEFT JOIN `Data` as `d` ON `d`.`victim_id`=`v`.`id`
GROUP BY `v`.`agent_id`
ORDER BY `sa`.`id` ASC
The victimcount is correct but the submissions count becomes wrong. Tried lots of other things too but this is the most relevant...
Thanks
I believe you can count the forms-per-agent like so:
SELECT COUNT(*) as form_count, a.id as id, a.name as agent
FROM Data d
LEFT JOIN Victims v ON v.id = d.victim_id
LEFT JOIN SecretAgents a on v.agent_id = a.id
GROUP BY a.id;
To count the victims, just leave off the Data table.
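If both counts are wanted in one result, one possible sketch is shown below. It assumes a "form" means a distinct (victim_id, form_id) pair, which is my reading of the question rather than something the answer states, and it runs on SQLite through Python's sqlite3 instead of MySQL. The keys and values columns of Data are omitted because they do not affect the counts.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE SecretAgents (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Victims (id INTEGER PRIMARY KEY, name TEXT, agent_id INTEGER);
    CREATE TABLE Data (id INTEGER PRIMARY KEY, victim_id INTEGER, form_id INTEGER);
    INSERT INTO SecretAgents VALUES (1, 'A'), (2, 'B');
    INSERT INTO Victims VALUES (1, 'Z', 1), (2, 'Y', 1), (3, 'X', 2);
    INSERT INTO Data VALUES (1, 1, 1), (2, 1, 2), (3, 2, 1), (4, 1, 1);
    """
)

# The derived table f collapses Data to one row per (victim_id, form_id),
# so each form is counted once no matter how many key/value rows it has.
# COUNT(DISTINCT v.id) undoes the row multiplication caused by joining forms.
counts = conn.execute(
    """
    SELECT sa.id, sa.name,
           COUNT(DISTINCT v.id) AS victim_count,
           COUNT(f.form_id)     AS form_count
    FROM SecretAgents sa
    LEFT JOIN Victims v ON v.agent_id = sa.id
    LEFT JOIN (SELECT DISTINCT victim_id, form_id FROM Data) f
           ON f.victim_id = v.id
    GROUP BY sa.id, sa.name
    ORDER BY sa.id
    """
).fetchall()

print(counts)  # [(1, 'A', 2, 3), (2, 'B', 1, 0)]
```

Starting from SecretAgents with LEFT JOINs keeps agents that have victims but no submitted forms, such as agent B in the sample data.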

One to many join but only the first row

I have 2 tables that I need to join via ID without getting duplicate values for ID, InfoA, and InfoB. I do not need the data in column InfoB2. Because it is a 1-to-many join, joining the tables on ID gives me duplicate rows, and I want to get rid of those. I only want ID, InfoA, and InfoB without the duplicates. Any ideas?
Example:
TableA:
| ID | InfoA |
| 1 | animals|
| 2 | plants |
TableB:
| ID | InfoB | InfoB2 |
| 1 | A | X |
| 1 | A | Y |
| 1 | A | Z |
| 2 | B | X |
| 2 | B | Y |
| 2 | B | Z |
With a normal join, because it is 1-to-many, I get the duplicates below, which I do not want:
| ID | InfoA | InfoB |
| 1 | animals| A |
| 1 | animals| A |
| 1 | animals| A |
| 2 | plants | B |
| 2 | plants | B |
| 2 | plants | B |
My goal is to get this (note I do not need column InfoB2):
| ID | InfoA | InfoB |
| 1 | animals| A |
| 2 | plants | B |
You could use the distinct keyword:
SELECT DISTINCT a.id, infoa, infob
FROM tablea a
JOIN tableb b ON a.id = b.id
The fastest way is likely to be:
select a.*,
(select b.infob from tableb b where b.id = a.id limit 1) as infob
from tablea a;
For performance, you would want an index on tableb(id, infob).
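Both suggestions can be compared on the sample data with the sketch below. It runs on SQLite through Python's sqlite3 rather than MySQL (an assumption for runnability), writes the correlated subquery with WHERE, and uses a hypothetical index name, idx_tableb_id_infob. No claim about relative speed is made here; the sketch only shows that the two queries return the same rows for this data.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tablea (id INTEGER PRIMARY KEY, infoa TEXT);
    CREATE TABLE tableb (id INTEGER, infob TEXT, infob2 TEXT);
    INSERT INTO tablea VALUES (1, 'animals'), (2, 'plants');
    INSERT INTO tableb VALUES (1, 'A', 'X'), (1, 'A', 'Y'), (1, 'A', 'Z'),
                              (2, 'B', 'X'), (2, 'B', 'Y'), (2, 'B', 'Z');
    -- Hypothetical index name; it covers the lookup done by the subquery.
    CREATE INDEX idx_tableb_id_infob ON tableb (id, infob);
    """
)

# Option 1: join everything, then collapse identical rows.
distinct_rows = conn.execute(
    """
    SELECT DISTINCT a.id, a.infoa, b.infob
    FROM tablea a
    JOIN tableb b ON a.id = b.id
    ORDER BY a.id
    """
).fetchall()

# Option 2: fetch a single matching infob per row of tablea.
subquery_rows = conn.execute(
    """
    SELECT a.id, a.infoa,
           (SELECT b.infob FROM tableb b WHERE b.id = a.id LIMIT 1) AS infob
    FROM tablea a
    ORDER BY a.id
    """
).fetchall()

print(distinct_rows)
print(subquery_rows)
```

The two options only agree when InfoB is the same for every row sharing an ID, as in the sample; otherwise DISTINCT returns one row per distinct InfoB while the subquery picks an arbitrary one.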

MySQL Union on Dissimilar Fields without Dummy Columns

So let's say I have 2 or more tables consisting of dissimilar columns, in which a shared key (id) is not necessarily present:
Alpha:
+----+-------+-------+-------+
| id | paula | randy | simon |
+----+-------+-------+-------+
| 1 | 8 | 7 | 2 |
| 2 | 9 | 6 | 2 |
| 3 | 10 | 5 | 2 |
+----+-------+-------+-------+
Beta:
+----+---------+-----+------------+------+
| id | is_nice | sex | dob | gift |
+----+---------+-----+------------+------+
| 2 | 1 | F | 1990-05-25 | iPod |
| 3 | 0 | M | 1990-05-25 | coal |
+----+---------+-----+------------+------+
Gamma:
+----+---------+--------+
| id | is_tall | is_fat |
+----+---------+--------+
| 1 | 1 | 1 |
| 99 | 0 | 1 |
+----+---------+--------+
The desired effect is to mash the tables together on id inserting NULLs where data is not available:
+----+-------+-------+-------+---------+-----+------------+------+---------+--------+
| id | paula | randy | simon | is_nice | sex | dob | gift | is_tall | is_fat |
+----+-------+-------+-------+---------+-----+------------+------+---------+--------+
| 1 | 8 | 7 | 2 | | | | | 1 | 1 |
| 2 | 9 | 6 | 2 | 1 | F | 1990-05-25 | iPod | | |
| 3 | 10 | 5 | 2 | 0 | M | 1990-05-25 | coal | | |
| 99 | | | | | | | | 0 | 1 |
+----+-------+-------+-------+---------+-----+------------+------+---------+--------+
I can use NULL 'dummy' columns and UNION (MySql SELECT union for different columns?) but that seems like a royal pain if the number of tables is great. I'd like to think there is a JOIN method I can use to accomplish this, but I need some help to figure this out.
This works:
SELECT `id`, `paula`, `randy`, ..., NULL AS `is_nice`, ... FROM `Alpha`
UNION SELECT `id`, NULL AS `paula`, ..., FROM `Beta`
UNION SELECT `id`, NULL AS `paula`, ..., `is_fat` FROM `Gamma` ;
but it sure feels like the wrong way to do it. How can I get the same results without having to edit lines and lines of SQL inserting NULL AS whatever all over the place whenever I want to tack on additional tables?
Thanks in advance!
SELECT
allid.id
, a.paula, a.randy, a.simon
, b. ...
, g. ...
FROM
( SELECT id
FROM Alpha
UNION
SELECT id
FROM Beta
UNION
SELECT id
FROM Gamma
) AS allid
LEFT JOIN
Alpha AS a
ON a.id = allid.id
LEFT JOIN
Beta AS b
ON b.id = allid.id
LEFT JOIN
Gamma AS g
ON g.id = allid.id
If the tables share no other column except the id, you could use this version, which is simpler to write (but easier to break):
SELECT
*
FROM
( SELECT id
FROM Alpha
UNION
SELECT id
FROM Beta
UNION
SELECT id
FROM Gamma
) AS allid
NATURAL LEFT JOIN
Alpha
NATURAL LEFT JOIN
Beta
NATURAL LEFT JOIN
Gamma
You want to use LEFT JOINs.
http://dev.mysql.com/doc/refman/5.0/en/join.html
In your example:
SELECT id_t.id, a.paula, a.randy, a.simon, b.is_nice, b.sex, b.dob, b.gift, g.is_tall, g.is_fat
FROM (SELECT id FROM alpha UNION SELECT id FROM beta UNION SELECT id FROM gamma) as id_t
LEFT JOIN alpha a ON a.id = id_t.id
LEFT JOIN beta b on b.id = id_t.id
LEFT JOIN gamma g on g.id = id_t.id
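A runnable sketch of the "collect all ids, then LEFT JOIN each table" idea on the sample data. SQLite via Python's sqlite3 is used instead of MySQL as an assumption for testing. The id list is built with UNION, since selecting a bare id from a comma join of the three tables would be ambiguous.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Alpha (id INTEGER PRIMARY KEY, paula INTEGER, randy INTEGER, simon INTEGER);
    CREATE TABLE Beta  (id INTEGER PRIMARY KEY, is_nice INTEGER, sex TEXT, dob TEXT, gift TEXT);
    CREATE TABLE Gamma (id INTEGER PRIMARY KEY, is_tall INTEGER, is_fat INTEGER);
    INSERT INTO Alpha VALUES (1, 8, 7, 2), (2, 9, 6, 2), (3, 10, 5, 2);
    INSERT INTO Beta  VALUES (2, 1, 'F', '1990-05-25', 'iPod'),
                             (3, 0, 'M', '1990-05-25', 'coal');
    INSERT INTO Gamma VALUES (1, 1, 1), (99, 0, 1);
    """
)

# UNION (without ALL) yields each id once; the LEFT JOINs then fill in
# NULLs wherever a table has no row for that id.
rows = conn.execute(
    """
    SELECT allid.id,
           a.paula, a.randy, a.simon,
           b.is_nice, b.sex, b.dob, b.gift,
           g.is_tall, g.is_fat
    FROM (SELECT id FROM Alpha
          UNION SELECT id FROM Beta
          UNION SELECT id FROM Gamma) AS allid
    LEFT JOIN Alpha a ON a.id = allid.id
    LEFT JOIN Beta  b ON b.id = allid.id
    LEFT JOIN Gamma g ON g.id = allid.id
    ORDER BY allid.id
    """
).fetchall()

for row in rows:
    print(row)
```

Adding another table then means one more UNION branch and one more LEFT JOIN, with no NULL AS placeholders to maintain.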