Is there a shortcut to normalizing a table where the columns=rows? - mysql

Suppose you had a MySQL table describing whether two substances can be mixed:
Product A B C
---------------------
A y n y
B n y y
C y y y
The first step would be to transform it like this:
P1 P2 ?
-----------
A A y
A B n
A C y
B A n
B B y
B C y
C A y
C B y
C C y
But then you have duplicate information (e.g., if A can mix with B, then B can mix with A), so you can remove several rows to get
P1 P2 ?
-----------
A A y
A B n
A C y
B B y
B C y
C C y
While the last step was pretty easy with a small table, doing it manually would take forever on a larger table. How would one go about automating the removal of rows with duplicate MEANING, but not identical content?
Thanks, I hope my question makes sense as I am still learning databases

If it's safe to assume that you're starting with all relationships doubled up (that is, if A B is in the table, then B A is guaranteed to be in the table), then all you have to do is remove all rows where P2 < P1:
DELETE FROM `table_name` WHERE `P2` < `P1`;
If this isn't the case, you can make it the case by going through the table and inserting all the duplicate rows if they don't already exist, then running this.
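Both steps can be sketched as follows. This is only an illustration run against an in-memory SQLite database from Python, used here as a stand-in for MySQL; the table name `mix` and the flag column name `ok` are made up, and the statements may need adjusting for MySQL. It first inserts the mirror of any row whose mirror is missing, then deletes the rows where P2 < P1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mix (P1 TEXT, P2 TEXT, ok TEXT)")
# (B, A) is deliberately left out so that the fill-in step has work to do.
conn.executemany(
    "INSERT INTO mix VALUES (?, ?, ?)",
    [("A", "A", "y"), ("A", "B", "n"), ("A", "C", "y"), ("C", "A", "y"),
     ("B", "B", "y"), ("B", "C", "y"), ("C", "B", "y"), ("C", "C", "y")],
)

# Step 1: insert the mirror (P2, P1) of every row whose mirror is missing.
conn.execute("""
    INSERT INTO mix (P1, P2, ok)
    SELECT a.P2, a.P1, a.ok
    FROM mix AS a
    LEFT JOIN mix AS b ON b.P1 = a.P2 AND b.P2 = a.P1
    WHERE b.P1 IS NULL
""")

# Step 2: of each mirrored pair, keep only the row where P1 <= P2.
conn.execute("DELETE FROM mix WHERE P2 < P1")

remaining = conn.execute(
    "SELECT P1, P2, ok FROM mix ORDER BY P1, P2"
).fetchall()
print(remaining)
```

Because every pair is present in both directions after step 1, the delete in step 2 cannot lose a relationship.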

I don't think it's necessary in your situation, but as an intellectual exercise, you could build on Jamie Wong's solution and prevent rows that have no mirrored counterpart from being removed, by checking that the counterpart exists first. MySQL does not let a subquery inside a DELETE read from the table being deleted from, so this uses a self-join rather than an EXISTS clause. Something like this:
DELETE t1
FROM `table_name` AS t1
JOIN `table_name` AS t2
ON t1.`P1` = t2.`P2` AND t1.`P2` = t2.`P1`
WHERE t1.`P2` < t1.`P1`;
It pretty much just makes sure that there's a mirrored row before deleting anything.
(My MySQL syntax might be a little off; it's been a while.)
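To see what the guard buys you, here is a small sketch using an in-memory SQLite database from Python as a stand-in for MySQL. The table name `mix` and column `ok` are hypothetical, and the guard is written as a correlated EXISTS subquery, which SQLite accepts. The row (C, A) has no mirrored (A, C) row, so it survives the delete.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mix (P1 TEXT, P2 TEXT, ok TEXT)")
conn.executemany(
    "INSERT INTO mix VALUES (?, ?, ?)",
    [("A", "A", "y"), ("A", "B", "n"), ("B", "A", "n"),
     ("C", "A", "y")],  # note: there is no ("A", "C") row
)

# Delete a "backwards" row only if its mirrored row is present.
conn.execute("""
    DELETE FROM mix
    WHERE P2 < P1
      AND EXISTS (SELECT 1 FROM mix AS t2
                  WHERE t2.P1 = mix.P2 AND t2.P2 = mix.P1)
""")

remaining = conn.execute(
    "SELECT P1, P2, ok FROM mix ORDER BY P1, P2"
).fetchall()
print(remaining)
```

An unguarded `DELETE ... WHERE P2 < P1` would have removed (C, A) as well, losing the only record of that pair.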

Step 1 (as you've already done): Transform to Table2
P1 P2 ?
-----------
A A y
A B n
A C y
B A n
B B y
B C y
C A y
C B y
C C y
Step 2: ReOrder Columns, Select Distinct
SELECT DISTINCT
CASE WHEN P1 < P2 THEN P1 ELSE P2 END AS P1, -- this puts the smallest value in P1
CASE WHEN P1 > P2 THEN P1 ELSE P2 END AS P2 -- this puts the largest value in P2
FROM Table2
WHERE NOT P1 = P2; -- (assuming records like A, A, y are not interesting)
I'm not a mySQL guy, so you might need to check the if/then syntax, but this seems conceptually ok anyway.
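The two steps can be tried end to end with a short sketch. It uses Python with an in-memory SQLite database as a stand-in for MySQL, and it assumes the mixing matrix is symmetric; the column names `ok`, `lo`, and `hi` are made up for illustration. Unlike the query above, it also carries the y/n flag through.

```python
import sqlite3

# The mixing matrix from the question, as nested dicts (assumed symmetric).
matrix = {
    "A": {"A": "y", "B": "n", "C": "y"},
    "B": {"A": "n", "B": "y", "C": "y"},
    "C": {"A": "y", "B": "y", "C": "y"},
}

# Step 1: unpivot the matrix into (P1, P2, flag) rows.
pairs = [(p1, p2, flag) for p1, row in matrix.items() for p2, flag in row.items()]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table2 (P1 TEXT, P2 TEXT, ok TEXT)")
conn.executemany("INSERT INTO Table2 VALUES (?, ?, ?)", pairs)

# Step 2: put the smaller name first, then let DISTINCT collapse the mirrors.
rows = conn.execute("""
    SELECT DISTINCT
        CASE WHEN P1 < P2 THEN P1 ELSE P2 END AS lo,
        CASE WHEN P1 > P2 THEN P1 ELSE P2 END AS hi,
        ok
    FROM Table2
    WHERE P1 <> P2
    ORDER BY lo, hi
""").fetchall()
print(rows)
```

If the matrix were not symmetric, a pair would show up twice with different flags, which makes this query a cheap consistency check as well.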


Making unique values into duplicates in R

I am working with R and I have a table that looks like this...
A
B
C
D
E
F
And I need the table to look like this...
A
A
A
A
A
B
B
B
B
B
C
C
C
C
C
D
D
D
D
D
E
E
E
E
E
F
F
F
F
F
So, I need each value repeated 5 times in order to match them with another column.
Any help would be greatly appreciated.
Thanks!
I'm not sure whether the table has associated data that also has to be duplicated. If you are working with a vector or a single data.frame column, you can use rep:
data <- LETTERS[1:6]
rep(data, each = 5) # will repeat each position 5 times prior to going to next position
rep(data, times = 5) # will repeat entire array 5 times

How do I form a correct SQL statement?

I want to pick a value from Table A, depending on some value from Table B. How can I achieve this?
Select
Ax
from
TableA
where TableA.Ay = (Select
Bx
from
TableB
where TableB.By = L AND TableB.Bz = M )
You could try using a JOIN
Select TableA.Ax
from TableA
INNER JOIN TableB ON TableA.Ay = TableB.Bx
AND TableB.By = L
AND TableB.Bz = M
It looks OK.
The only problem arises when you are not sure that the subquery on Bx returns only one row; in that case, use IN instead of =:
Select
Ax
from
TableA a
where a.Ay IN (Select
Bx
from
TableB b
where b.By = L AND b.Bz = M )
Or you must limit the subquery on TableB to one row with LIMIT 1.
Using aliases is also helpful, for better readability and less typing.
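As a rough check, the IN form and the JOIN form can be compared on a few made-up rows. The sketch below uses Python with an in-memory SQLite database as a stand-in for MySQL; the sample data and the string values 'L' and 'M' for the placeholders L and M are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# BY is an SQL keyword, so the column name By is quoted here.
conn.executescript("""
    CREATE TABLE TableA (Ax TEXT, Ay INTEGER);
    CREATE TABLE TableB (Bx INTEGER, "By" TEXT, Bz TEXT);
    INSERT INTO TableA VALUES ('a1', 1), ('a2', 2), ('a3', 3);
    INSERT INTO TableB VALUES (1, 'L', 'M'), (2, 'L', 'M'), (3, 'X', 'M');
""")

# IN copes with a subquery that returns more than one row.
with_in = conn.execute("""
    SELECT a.Ax
    FROM TableA AS a
    WHERE a.Ay IN (SELECT b.Bx FROM TableB AS b
                   WHERE b."By" = ? AND b.Bz = ?)
    ORDER BY a.Ax
""", ("L", "M")).fetchall()

# The same lookup written as an INNER JOIN.
with_join = conn.execute("""
    SELECT a.Ax
    FROM TableA AS a
    INNER JOIN TableB AS b
        ON a.Ay = b.Bx AND b."By" = ? AND b.Bz = ?
    ORDER BY a.Ax
""", ("L", "M")).fetchall()

print(with_in, with_join)
```

The two forms agree here because each Bx value occurs once. If TableB could hold the same Bx several times, the JOIN would repeat the matching TableA row while IN would not.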

Can't run a DELETE to merge rows inside a MySQL UPDATE trigger?

Hello, I have a MySQL table named example with the fields a, b, and c. All fields are VARCHAR(255), and values must be unique within each column.
I enforce the per-column uniqueness with a BEFORE INSERT trigger.
An example of table:
------------
a b c
------------
a c
b
d e
f g
h
i
j
k
l n
c
a
This table is OK, because all values in each column are unique, except for the empty string "".
An example of a wrong table:
------------
a b c
------------
a b c
b
This table is wrong, because the value 'b' appears in column b in two rows.
I want to be able to merge rows while keeping the values in each column unique.
Example:
Merge ('a', '', 'c') with ('', 'b', '') and get ('a', 'b', 'c')
------------
a b c
------------
a b c
d e
f g
h
i
j
k
l n
c
a
Note that ('', 'b', '') was deleted.
Merge('j', 'a', 'i')
------------
a b c
------------
a c
b
d e
f g
h
j a i
k
l n
c
I do not need to merge, for example, ('', 'l', 'n') and ('', 'h', ''), because the a field is empty, but I do need to merge ('j', 'a', 'i') and ('', 'h', '') and get
------------
a b c
------------
a c
b
d e
f g
j h i
k
l n
c
I'm trying to do this with a trigger, with the following logic:
if 'b' changed, delete the other appearance of 'b'
if 'c' changed, delete the other appearance of 'c'
But I can't do this, because I can't execute a DELETE inside a MySQL trigger. Any other ideas?
My goal is for the logic to be executed in the database and, if I have to call a procedure, to block updates on the table so that the uniqueness restriction keeps holding.

SQL: Can't understand how to select from my tables

I need help with a data extraction. I'm an SQL noob and I think I have a serious issue with my data design skills. The DB system is MySQL running on Linux.
Table A is structured like this one:
TYPE SUBTYPE ID
-------------------
xyz aaa 0001
xyz aab 0001
xyz aac 0001
xyz aad 0001
xyz aaa 0002
xyz aaj 0002
xyz aac 0002
xyz aav 0002
Table B is:
TYPE1 SUBTYPE1 TYPE2 SUBTYPE2
-------------------------------------
xyz aaa xyz aab
xyz aac xyz aad
Looking at whole table A, I need to extract all rows where both type and subtype are present as columns in a single table B row. Of course this condition is never met since A.subtype can't be at same time equal to B.subtype1 AND B.subtype2 ...
In the example the result set for id should be:
xyz aaa 0001
xyz aab 0001
xyz aac 0001
xyz aad 0001
I'm trying to use a join with 2 AND conditions, but of course I get an empty set.
EDIT:
@Barmar, thank you for your support. It seems that I'm really near the final solution. Just to keep things clear, I opened this thread with a shortened and simplified data structure, just to highlight the point where I was stuck.
I thought about your solution, and it is acceptable to have both results on a single row. Now I need to reduce execution time.
The first join takes about 2 minutes to complete and produces around 23 million rows. The second join (with table B) is probably taking longer.
In fact, I need 3 hours to get the final set of 10 million rows. How can we improve things a bit? I noticed that the MySQL engine is not threaded, and the query is only using a single CPU. I indexed all fields used by the joins, but I'm not sure it's the right thing to do, since I'm not a DBA.
I suppose having to rely on VARCHAR comparison for such a big join is also not the best solution. I should probably rewrite things using numerical IDs, which should be faster.
Splitting things into different queries would probably help parallelism. Thanks for any feedback.
You can join Table A with itself to find all combinations of types and subtypes with the same ID, then compare them with the values in Table B.
SELECT t1.type AS type1, t1.subtype AS subtype1, t2.type AS type2, t2.subtype AS subtype2, t1.id
FROM TableA AS t1
JOIN TableA AS t2 ON t1.id = t2.id AND NOT (t1.type = t2.type AND t1.subtype = t2.subtype)
JOIN TableB AS b ON t1.type = b.type1 AND t1.subtype = b.subtype1 AND t2.type = b.type2 AND t2.subtype = b.subtype2
This returns the two rows from Table A as a single row in the result, rather than as separate rows; I hope that's OK. If you need to split them up, you can move this into a subquery and join it back with the original table A to return each row.
SELECT a.*
FROM TableA AS a
JOIN (the above query) AS x
ON a.id = x.id AND
((a.type = x.type1 AND a.subtype = x.subtype1)
OR
(a.type = x.type2 AND a.subtype = x.subtype2))
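For reference, here is the combined query with the subquery written out in full, run on the sample data from the question. It is a sketch using Python with an in-memory SQLite database as a stand-in for MySQL; the DISTINCT is an extra precaution added here in case one row matches several Table B pairs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (type TEXT, subtype TEXT, id TEXT);
    CREATE TABLE TableB (type1 TEXT, subtype1 TEXT, type2 TEXT, subtype2 TEXT);
""")
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?, ?)",
    [("xyz", s, "0001") for s in ("aaa", "aab", "aac", "aad")]
    + [("xyz", s, "0002") for s in ("aaa", "aaj", "aac", "aav")],
)
conn.executemany(
    "INSERT INTO TableB VALUES (?, ?, ?, ?)",
    [("xyz", "aaa", "xyz", "aab"), ("xyz", "aac", "xyz", "aad")],
)

rows = conn.execute("""
    SELECT DISTINCT a.type, a.subtype, a.id
    FROM TableA AS a
    JOIN (
        -- pairs of Table A rows with the same id that match a Table B row
        SELECT t1.type AS type1, t1.subtype AS subtype1,
               t2.type AS type2, t2.subtype AS subtype2, t1.id AS id
        FROM TableA AS t1
        JOIN TableA AS t2
          ON t1.id = t2.id
         AND NOT (t1.type = t2.type AND t1.subtype = t2.subtype)
        JOIN TableB AS b
          ON t1.type = b.type1 AND t1.subtype = b.subtype1
         AND t2.type = b.type2 AND t2.subtype = b.subtype2
    ) AS x
      ON a.id = x.id
     AND ((a.type = x.type1 AND a.subtype = x.subtype1)
       OR (a.type = x.type2 AND a.subtype = x.subtype2))
    ORDER BY a.id, a.subtype
""").fetchall()
print(rows)
```

Only the rows with id 0001 come back, because id 0002 never has both halves of a Table B pair.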
You can use EXISTS:
SELECT a.*
FROM TableA a
WHERE EXISTS(
SELECT 1
FROM TableB b
WHERE
(b.Type1 = a.Type AND b.SubType1 = a.SubType)
OR (b.Type2 = a.Type AND b.SubType2 = a.SubType)
)
AND a.ID = '0001'
You can use a JOIN like this:
Select A.Type, A.SubType, A.ID from a_table A JOIN b_table B
ON (A.Type = B.Type1 AND A.SubType = B.SubType1) OR
(A.Type = B.Type2 AND A.SubType = B.SubType2)
But I think there is a problem in your design: you have the same values in Table A with different IDs, and there is no condition on ID!
Instead of storing Type and SubType in Table B, you can store a unique ID of each Table A record in Table B; then you can think about better ways to get the results you want.
Edit:
With a UNION of two joins you can get that result:
Select A.Type, A.SubType, A.ID from A_table A
JOIN b_table B1 ON A.Type = B1.Type1 AND A.SubType = B1.SubType1
WHERE (B1.Type2, B1.SubType2) IN (SELECT Type, SubType FROM A_table) AND ID = '0001'
UNION
Select A.Type, A.SubType, A.ID from A_table A
JOIN b_table B2 ON A.Type = B2.Type2 AND A.SubType = B2.SubType2
WHERE (B2.Type1, B2.SubType1) IN (SELECT Type, SubType FROM A_table) AND ID = '0001'
But as I said, I think there is a design problem; it seems better for each type and subtype to have a unique ID in Table A, and to work with this ID in Table B.

SQL Query - Combined rows with selective columns [MySQL]

I have the following table:
Game | Name | Result | Stage
1 A W F
1 B L 0
2 C L F
2 D W 0
3 E L 0
3 F W 0
The output I am looking for:
Game | Name | Result | Stage
1 A W F
2 D W F
I only want to see the winners (W) from the results of stage F.
I can do this via joins (which isn't very fast):
SELECT *
FROM (
SELECT *
FROM MyTable
WHERE Stage = 'F'
) AA
JOIN MyTable
ON AA.Game = MyTable.Game AND AA.Result <> MyTable.Result
...but I am wondering if there is an easier and more efficient way to do it. Plus, this requires that I do some more filtering afterwards.
Thanks in advance!
To perform a job of this sort without a self-join or an equivalent, you would want to use SQL window functions, which MySQL did not support before version 8.0. The join you are using is not too bad, but this would be a little simpler:
SELECT
players.Game AS Game,
players.Name AS Name,
'W' AS Result,
'F' as Stage
FROM
MyTable stage
JOIN MyTable players
ON stage.Game = players.Game
WHERE
stage.stage = 'F'
AND players.result = 'W'
With "The winners only from stage F" you only need:
SELECT * FROM MyTable WHERE stage="F" and result="W";
Your own result example however also shows name "D" which is not a winner in stage F.
If you only want to see the winners (W) from the results of stage F you don't need a join. The following statement will work:
SELECT * FROM MyTable where Stage='F' AND Result= 'W'
You probably need a subquery:
SELECT
*
FROM
MyTable
WHERE
Result = 'W'
AND Game IN (SELECT Game FROM MyTable WHERE Stage = 'F')
SELECT x.Game
, y.Name
FROM my_table x
JOIN my_table y
ON y.game = x.game
AND y.result = 'w'
WHERE x.stage = 'f';
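The answers above disagree on whether "winners from stage F" means rows that are themselves marked F, or winners of any game that has an F row. A quick sketch on the question's sample data shows the difference. It uses Python with an in-memory SQLite database as a stand-in for MySQL, and stores Stage as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Game INTEGER, Name TEXT, Result TEXT, Stage TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?, ?)",
    [(1, "A", "W", "F"), (1, "B", "L", "0"),
     (2, "C", "L", "F"), (2, "D", "W", "0"),
     (3, "E", "L", "0"), (3, "F", "W", "0")],
)

# Row-level filter: the winning row itself must be marked F.
row_level = conn.execute("""
    SELECT Game, Name FROM MyTable
    WHERE Stage = 'F' AND Result = 'W'
    ORDER BY Game
""").fetchall()

# Game-level filter: the winner of any game that has an F row.
game_level = conn.execute("""
    SELECT Game, Name FROM MyTable
    WHERE Result = 'W'
      AND Game IN (SELECT Game FROM MyTable WHERE Stage = 'F')
    ORDER BY Game
""").fetchall()

print(row_level)
print(game_level)
```

Only the game-level filter returns both A and D, which matches the output the question asks for.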