Making unique values into duplicates in R - duplicates

I am working with R and I have a table that looks like this...
A
B
C
D
E
F
And I need the table to look like this...
A
A
A
A
A
B
B
B
B
B
C
C
C
C
C
D
D
D
D
D
E
E
E
E
E
F
F
F
F
F
So, I need each value repeated 5 times in order to match them with another column.
Any help would be greatly appreciated.
Thanks!

I'm not sure whether the table has associated data that also has to be duplicated. For a vector or a single data.frame column, you can use rep:
data <- LETTERS[1:6]
rep(data, each = 5) # will repeat each position 5 times prior to going to next position
rep(data, times = 5) # will repeat entire array 5 times
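The two calls differ only in the order of the output. For readers more at home outside R, here is a minimal Python sketch of the same distinction. The helper names rep_each and rep_times are made up for illustration; they are not built-ins in either language.

```python
def rep_each(values, n):
    """Repeat every element n times before moving on, like rep(x, each = n)."""
    return [v for v in values for _ in range(n)]


def rep_times(values, n):
    """Repeat the whole sequence n times, like rep(x, times = n)."""
    return list(values) * n


data = ["A", "B", "C"]
print(rep_each(data, 2))   # ['A', 'A', 'B', 'B', 'C', 'C']
print(rep_times(data, 2))  # ['A', 'B', 'C', 'A', 'B', 'C']
```

The `each` form is the one the question asks for, since it keeps identical values next to each other.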

Related

Can't run a DELETE inside a MySQL update trigger to merge rows?

Hello, I have a MySQL table named example with the fields a, b and c. All fields are varchar(255), and values are unique per column.
I enforce the per-column unique restriction with a BEFORE INSERT trigger.
An example of the table:
------------
a b c
------------
a c
b
d e
f g
h
i
j
k
l n
c
a
This table is OK, because all values in each column are unique, except for the empty string "".
An example of a wrong table:
------------
a b c
------------
a b c
b
This table is wrong, because the value b appears more than once.
I want to be able to merge rows while keeping the values unique.
Example:
Merge ('a', '', 'c') with ('', 'b', '') and get ('a', 'b', 'c')
------------
a b c
------------
a b c
d e
f g
h
i
j
k
l n
c
a
Note that ('', 'b', '') was deleted.
Merge('j', 'a', 'i')
------------
a b c
------------
a c
b
d e
f g
h
j a i
k
l n
c
I don't need to merge, for example, ('', 'l', 'n') and ('', 'h', ''), because the 'a' field is empty, but I do need to merge ('j', 'a', 'i') and ('', 'h', '') and get
------------
a b c
------------
a c
b
d e
f g
j h i
k
l n
c
I'm trying to do this with a trigger, with the following logic:
if 'b' changed, delete the other occurrence of 'b'
if 'c' changed, delete the other occurrence of 'c'
But I can't do this, because I can't execute a DELETE inside a MySQL trigger. Any other ideas?
My goal is for the logic to be executed in the database and, if I have to call a procedure, to block updates on the table so that the uniqueness restriction always holds.

Find the middle of a string after first space and before last in MySql

Given a Name Field with Contents "A B C D" how can I extract "B C"?
I can use:
substring_Index(Name,' ',1) to extract 'A'
substring_Index(Name,' ',-1) to extract 'D'
But I am not sure how to combine those to get the middle string. One issue is that both 'A' and 'D' can be anywhere from 1 to 4 characters long, and the value can also be 'A B D'. So basically I'm looking to extract whatever is not one of the two substrings above.
Something like below:
select trim(replace(replace(col,concat(substring_Index(col,' ',1),' '),''),concat(' ',substring_Index(col,' ',-1)),'')) from foo;
Check sqlfiddle: http://sqlfiddle.com/#!9/57f88/8
After getting some good answers I tried my own approach, which seems to automatically account for:
However many words between 'A' and 'D'
However many letters in 'A' or 'D'
Whether or not 'A' starts with same letter as D
Sorry for not taking the time to set up an sqlfiddle (I'm deep in the middle of a project), but as you can see it works on any pattern:
Original 'A B C D' pattern:
Select 'A B C D', substring('A B C D',(char_length(substring_Index('A B C D',' ',1))+2),(char_length('A B C D')-char_length(substring_Index('A B C D',' ',-1)))-(char_length(substring_Index('A B C D',' ',1))+2))
More words between 'A' and 'D':
Select 'A B C X D', substring('A B C X D',(char_length(substring_Index('A B C X D',' ',1))+2),(char_length('A B C X D')-char_length(substring_Index('A B C X D',' ',-1)))-(char_length(substring_Index('A B C X D',' ',1))+2))
Length of 'A' and 'D' greater than one letter:
Select 'AA B C DDDD', substring('AA B C DDDD',(char_length(substring_Index('AA B C DDDD',' ',1))+2),(char_length('AA B C DDDD')-char_length(substring_Index('AA B C DDDD',' ',-1)))-(char_length(substring_Index('AA B C DDDD',' ',1))+2))
'A' and 'D' start with same letter (issue in earlier answer):
Select 'A B C A', substring('A B C A',(char_length(substring_Index('A B C A',' ',1))+2),(char_length('A B C A')-char_length(substring_Index('A B C A',' ',-1)))-(char_length(substring_Index('A B C A',' ',1))+2))
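The arithmetic in these queries boils down to "start one character after the first space, stop just before the last space". A minimal Python sketch of that same rule, useful for checking expected results outside the database (the function name middle_words is hypothetical, and returning an empty string for fewer than three words is an assumption):

```python
def middle_words(name):
    """Return the text after the first space and before the last space."""
    first = name.find(" ")
    last = name.rfind(" ")
    if first == -1 or first == last:
        # Zero or one space: nothing lies between the first and last word.
        return ""
    return name[first + 1:last]


for sample in ["A B C D", "A B C X D", "AA B C DDDD", "A B C A", "A B D"]:
    print(repr(sample), "->", repr(middle_words(sample)))
```

Because it works on the positions of the spaces rather than on the text of the first and last words, it is not affected when 'A' and 'D' are identical.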

Matching contents from one column to another column in a CSV

So I have a csv with two columns like this:
A B
C D
E F
etc..
And another column like this:
B
X
D
Y
Z
F
etc..
I'd like to match the first column (A,C,E,etc..) to the corresponding values in the last column (B,X,D,Y,etc...).
So that the result is this:
A B
X
C D
Y
Z
E F
etc..
Is there a way to accomplish this?
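One possible approach, sketched in Python under the assumption that the two-column data and the single column have already been read into lists (for example with the csv module) and that each value in the second column of the pairs is unique. The name align is made up for illustration.

```python
pairs = [("A", "B"), ("C", "D"), ("E", "F")]
other = ["B", "X", "D", "Y", "Z", "F"]


def align(pairs, other):
    """For each value in `other`, attach the first-column key that maps to it, if any."""
    lookup = {second: first for first, second in pairs}
    # Values with no partner get an empty first column.
    return [(lookup.get(value, ""), value) for value in other]


for first, second in align(pairs, other):
    print(f"{first}\t{second}")
```

The output keeps the order of the single column and leaves the first column blank wherever no match exists, which is the layout shown in the question.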

Data step/SQL Join/Merge/Union 2 datasets/tables and remove the same rows/observations

For example, I have 2 tables like this
data have;
input name $ status $;
datalines;
A a
B b
C c
;;;;
run;
2nd table:
data addon;
input name $ status $;
datalines;
A a
C f
D d
E e
F f
B z
;;;;
run;
How do I get the result like below:
B b
C c
C f
D d
E e
F f
B z
The row A - a appears in both tables, so it is removed. I tried a left join, but the result is not right. Please help, and thanks in advance. I really appreciate it.
Another way
data have;
input name $ status $;
datalines;
A a
B b
C c
;;;;
run;
data addon;
input name $ status $;
datalines;
A a
C f
D d
E e
F f
B z
;;;;
run;
Data Together;
Set have addon;
/* If the data sets were already sorted */
/* By Name Status; */
/* Then skip the Proc Sort */
Run;
Proc sort data=together;
by name status;
Run;
Data final;
Set Together;
by name status;
if first.status and last.status;
Run;
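The data steps above stack both tables, sort them, and keep only the name/status combinations that occur exactly once. A minimal Python sketch of the same idea, for checking the expected result outside SAS (the function name rows_in_only_one is hypothetical, and it assumes neither table contains internal duplicates):

```python
from collections import Counter

have = [("A", "a"), ("B", "b"), ("C", "c")]
addon = [("A", "a"), ("C", "f"), ("D", "d"), ("E", "e"), ("F", "f"), ("B", "z")]


def rows_in_only_one(left, right):
    """Stack both tables and keep the rows that occur exactly once overall."""
    counts = Counter(left + right)
    return sorted(row for row, n in counts.items() if n == 1)


for name, status in rows_in_only_one(have, addon):
    print(name, status)
```

Counting occurrences plays the role of `first.status and last.status`: a row that is both the first and the last of its BY group is a row that appears only once.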
Try this:
SELECT COALESCE(have.name, addon.name) AS name
, COALESCE(have.status, addon.status) AS status
FROM have
FULL OUTER JOIN addon ON have.name = addon.name
AND have.status = addon.status
WHERE (have.name IS NULL OR addon.name IS NULL)
ORDER BY 1
Output :
NAME STATUS
---- ------
B b
B z
C f
C c
D d
E e
F f
I don't have time to test this, but it is approximately right. It won't work in SQLFiddle, since older MySQL versions don't support EXCEPT.
select * from (
select * from have union select * from addon) as combined
except
( select have.* from have, addon
where have.status=addon.status and have.name=addon.name)
SELECT t1.name,t1.status
FROM
(
SELECT name,status
FROM have
UNION ALL
SELECT name,status
FROM addon
) as t1
JOIN have t2 ON t1.name!=t2.name AND t1.status!=t2.status
JOIN addon t3 ON t2.name=t3.name AND t2.status=t3.status
I created an SQL fiddle for you.
Simply use a MERGE statement. Sort both datasets by the key variables before this step.
DATA RESULT;
MERGE HAVE(IN = H) ADDON(IN = A);
BY NAME STATUS;
IF NOT (H AND A); /* keep rows that appear in only one dataset */
RUN;

Is there a shortcut to normalizing a table where the columns=rows?

Suppose you had a MySQL table describing whether you can mix two substances
Product A B C
---------------------
A y n y
B n y y
C y y y
The first step would be to transform it like
P1 P2 ?
-----------
A A y
A B n
A C y
B A y
B B y
B C n
C A y
C B n
C C y
But then you have duplicate information. (eg. If A can mix with B, then B can mix with A), so, you can remove several rows to get
P1 P2 ?
-----------
A A y
A B n
A C y
B B y
B C n
C C y
While the last step was pretty easy with a small table, doing it manually would take forever on a larger table. How would one go about automating the removal of rows with duplicate MEANING, but not identical content?
Thanks, I hope my question makes sense as I am still learning databases
If it's safe to assume that you're starting with all relationships doubled up, e.g.
If A B is in the table, then B A is guaranteed to be in the table.
Then all you have to do is remove all rows where P2 < P1;
DELETE FROM `table_name` WHERE `P2` < `P1`;
If this isn't the case, you can make it the case by going through the table and inserting all the duplicate rows if they don't already exist, then running this.
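The same rule (keep a pair only when P1 is not greater than P2) can also be applied while unpivoting, so the mirrored rows are never created. A minimal Python sketch, using the values from the matrix in the question; the nested-dict layout and the name unique_pairs are assumptions made for illustration:

```python
matrix = {
    "A": {"A": "y", "B": "n", "C": "y"},
    "B": {"A": "n", "B": "y", "C": "y"},
    "C": {"A": "y", "B": "y", "C": "y"},
}


def unique_pairs(matrix):
    """Unpivot a symmetric matrix, keeping each unordered pair once (P1 <= P2)."""
    return [
        (p1, p2, flag)
        for p1, row in sorted(matrix.items())
        for p2, flag in sorted(row.items())
        if p1 <= p2
    ]


for p1, p2, flag in unique_pairs(matrix):
    print(p1, p2, flag)
```

For n products this yields n * (n + 1) / 2 rows instead of n * n, and like the DELETE it relies on the matrix being symmetric.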
I don't think it's necessary in your situation, but as an intellectual exercise, you could build on Jamie Wong's solution and prevent non-duplicated rows from being removed by requiring a matching mirrored row. MySQL does not allow a subquery on the table being deleted from, so a self-join takes the place of an EXISTS clause. Something like this:
DELETE t1 FROM `table_name` AS t1
JOIN `table_name` AS t2
ON t1.`P1` = t2.`P2` AND t1.`P2` = t2.`P1`
WHERE t1.`P2` < t1.`P1`;
It pretty much just makes sure that there's a duplicate before deleting anything.
(My MySQL syntax might be a little off; it's been a while.)
Step 1 (as you've already done): Transform to Table2
P1 P2 ?
-----------
A A y
A B n
A C y
B A y
B B y
B C n
C A y
C B n
C C y
Step 2: ReOrder Columns, Select Distinct
SELECT DISTINCT
IF(P1 < P2, P1, P2) AS P1, -- this puts the smallest value in P1
IF(P1 > P2, P1, P2) AS P2 -- this puts the largest value in P2
FROM Table2
WHERE NOT P1 = P2 -- (assuming records like A, A, y are not interesting)
I'm not a mySQL guy, so you might need to check the if/then syntax, but this seems conceptually ok anyway.