I have a table (table_string) with the data below. I am trying to split the comma-separated string values into single values and store them in a separate table.
ID Name ADD
1 a,b,c d,e,f
2 x,y,c n,e,f
3 n,b,c d,e,f
4 x,y,c n,e,f
After the transformation, the table data should look like this:
ID Name ADD
1 a d
1 b e
1 c f
2 x n
2 y e
2 c f
and so on...
SELECT ID, regexp_substr(Name, '[^,]+', 1, LEVEL) AS Name
FROM table_string
CONNECT BY regexp_substr(Name, '[^,]+', 1, LEVEL) IS NOT NULL
AND PRIOR ID = ID
AND PRIOR sys_guid() IS NOT NULL
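To make the target transformation concrete, here is a minimal sketch of the same split done outside the database, in Python. The function name `split_rows` is hypothetical, and the sketch assumes that Name and ADD always hold the same number of comma-separated values per row.

```python
def split_rows(rows):
    """Expand (id, names, adds) rows into one row per comma-separated position."""
    result = []
    for row_id, names, adds in rows:
        # zip pairs the 1st Name value with the 1st ADD value, the 2nd with the 2nd, ...
        for name, add in zip(names.split(","), adds.split(",")):
            result.append((row_id, name, add))
    return result


# First two rows of the example table from the question.
table_string = [(1, "a,b,c", "d,e,f"), (2, "x,y,c", "n,e,f")]
for row in split_rows(table_string):
    print(row)
```

If the two columns can differ in length, `zip` silently drops the extra values, so that case would need its own handling.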
Usually I use R for my data analysis, but these days I have to use SPSS. I expected data manipulation to be a little more difficult this way, but after my first day I more or less surrender :D and would really appreciate some help ...
My problem is the following:
I have two data sets that share an ID number. Neither data set has unique IDs (in the one that should have unique IDs, there is a duplicated row).
In a perfect world I would keep this duplicated row and simply perform a many-to-many join. But I have accepted that I might have to delete this "bad" row (in dataset A) and perform a 1:many join (joining dataset B to dataset A, which contains the unique IDs).
If I run the join (and accept that only a many:1 join seems to be possible, not a 1:many join), I lose IDs. If I join dataset A to dataset B, I lose all cases that are not part of dataset B. But I would really like to keep the IDs from both data sets, as in a full join.
Do you know if there is a (reasonably) simple solution to my problem?
Example:
dataset A:
ID VAL1
1 A
1 B
2 D
3 K
4 A
dataset B:
ID VAL2
1 g
2 k
4 a
5 c
5 d
5 a
2 x
expected result (best solution):
ID VAL1 VAL2
1 A g
1 B g
2 D k
3 K NA
4 A a
2 D x
expected result (second best solution):
ID VAL1 VAL2
1 A g
2 D k
3 K NA
4 A a
5 NA c
5 NA d
5 NA a
2 D x
what I get (worst solution):
ID VAL1 VAL2
1 A g
2 D k
4 A a
5 NA c
5 NA d
5 NA a
2 D x
From your example, it looks like what you need is a full many-to-many join based on the IDs that exist in dataset A. You can get this by creating a full Cartesian product of the two datasets, using dataset A as the first (left) dataset.
The following syntax assumes you have the STATS CARTPROD extension command installed. If you don't, you can see here about installing it.
First I'll recreate your example to demonstrate on:
dataset close all.
data list list/id1 vl1 (2F3) .
begin data
1 232
1 433
2 456
3 246
4 468
end data.
dataset name aaa.
data list list/id2 vl2 (2F3) .
begin data
1 111
2 222
4 333
5 444
5 555
5 666
2 777
3 888
end data.
dataset name bbb.
Now the actual work is fairly simple:
DATASET ACTIVATE aaa.
STATS CARTPROD VAR1=id1 vl1 INPUT2=bbb VAR2=id2 vl2
/SAVE OUTFILE="C:\somepath\yourcartesianproduct.sav".
* The new dataset now contains all possible combinations of rows in the two datasets.
* we will select only the relevant combinations, where the two ID's match.
select if id1=id2.
exe.
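For comparison, the "best solution" table from the question is what SQL calls a left join on ID, with dataset A on the left. The sketch below reproduces it with Python's built-in sqlite3 module and the question's own data. It is not SPSS syntax; the table names `a` and `b` and the function name are assumptions made for this illustration.

```python
import sqlite3


def left_join_example():
    """Many-to-many left join of dataset A with dataset B on ID."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE a (id INTEGER, val1 TEXT);
        CREATE TABLE b (id INTEGER, val2 TEXT);
        INSERT INTO a VALUES (1,'A'),(1,'B'),(2,'D'),(3,'K'),(4,'A');
        INSERT INTO b VALUES (1,'g'),(2,'k'),(4,'a'),(5,'c'),(5,'d'),(5,'a'),(2,'x');
    """)
    # Every row of a is kept; rows without a match in b get NULL (None) for val2.
    rows = conn.execute("""
        SELECT a.id, a.val1, b.val2
        FROM a LEFT JOIN b ON b.id = a.id
        ORDER BY a.id, a.val1, b.val2
    """).fetchall()
    conn.close()
    return rows


for row in left_join_example():
    print(row)
```

ID 3 survives with a missing VAL2, while ID 5, which exists only in dataset B, is dropped, which matches the first expected result in the question.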
I have a table named contacts
id name value
1 a x
2 b c
3 c x
4 d x
5 e x
How can I delete the rows whose value is x?
A simple SQL query will do.
DELETE FROM contacts WHERE value = 'x';
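As a quick way to try the statement safely, here is a small sketch that runs it against an in-memory SQLite copy of the example table. The function name `delete_rows_with_value` is hypothetical; the value is passed as a bound parameter rather than pasted into the SQL string.

```python
import sqlite3


def delete_rows_with_value(value):
    """Delete all contacts rows with the given value; return (deleted count, remaining rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE contacts (id INTEGER, name TEXT, value TEXT)")
    conn.executemany(
        "INSERT INTO contacts VALUES (?, ?, ?)",
        [(1, "a", "x"), (2, "b", "c"), (3, "c", "x"), (4, "d", "x"), (5, "e", "x")],
    )
    # The ? placeholder keeps the value out of the SQL text itself.
    cursor = conn.execute("DELETE FROM contacts WHERE value = ?", (value,))
    deleted = cursor.rowcount
    remaining = conn.execute(
        "SELECT id, name, value FROM contacts ORDER BY id"
    ).fetchall()
    conn.close()
    return deleted, remaining


print(delete_rows_with_value("x"))
```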
I'm quite new to databases and would be very grateful for some help. I have a database in the following format:
ID  Nbr  Data1  Data2  Data3
1   1    a
2   1           b
3   1                  c
4   2    d
5   2           e
6   2                  f
I would like a way to extract the data, with a MySQL query, in the following format:
Nbr Data1 Data2 Data3
1 a b c
2 d e f
I know it is not best practice to keep the data in a non-normalized format, but sadly I can't change the source data.
Grateful for your help!
INSERT INTO newtable
SELECT Nbr, MAX(Data1), MAX(Data2), MAX(Data3) FROM `table` GROUP BY Nbr
Try this and let me know whether it worked.
SELECT Nbr,
       MAX(Data1) AS Data1,
       MAX(Data2) AS Data2,
       MAX(Data3) AS Data3
FROM `table`
GROUP BY Nbr
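The MAX/GROUP BY approach works because MAX ignores NULLs, so each group collapses to its single filled-in value per column. The sketch below checks this on an in-memory SQLite table. The table name `source` is hypothetical, and it assumes that the empty cells in the question are NULL and that each Nbr has at most one value per DataN column.

```python
import sqlite3


def pivot_by_nbr():
    """Collapse the sparse rows into one row per Nbr using MAX and GROUP BY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE source (id INTEGER, nbr INTEGER, data1 TEXT, data2 TEXT, data3 TEXT)"
    )
    conn.executemany(
        "INSERT INTO source VALUES (?, ?, ?, ?, ?)",
        [
            (1, 1, "a", None, None),
            (2, 1, None, "b", None),
            (3, 1, None, None, "c"),
            (4, 2, "d", None, None),
            (5, 2, None, "e", None),
            (6, 2, None, None, "f"),
        ],
    )
    # MAX skips NULLs, so only the one non-NULL value per column and group remains.
    rows = conn.execute("""
        SELECT nbr, MAX(data1), MAX(data2), MAX(data3)
        FROM source
        GROUP BY nbr
        ORDER BY nbr
    """).fetchall()
    conn.close()
    return rows


print(pivot_by_nbr())
```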
Let's say I have 2 Tables
One named Baskets,
Another named Fruits.
Baskets-
basket_id , basket_name
1 - Basket One
2 - Basket Two
Fruits-
fruit_id , basket_id , fruit_name
1 - 1 - Banana
2 - 1 - Apple
3 - 2 - Pear
SELECT * FROM baskets
JOIN (SELECT GROUP_CONCAT(fruit_id SEPARATOR ', ') FROM fruits WHERE baskets.basket_id=fruits.basket_id) AS der_fruits
ON baskets.basket_id=der_fruits.basket_id
With this query I want to get 2 rows (since there are 2 baskets), each with a list of its fruit IDs.
Like this:
basket_id, fruits
1 - 1, 2
2 - 3
But what I currently get is this:
basket_id, fruits
2 - 1, 2, 3
The thing is, I have to pass the global baskets.basket_id value in the DERIVED table.
Is there anything like a global scope in MySQL?
Or is there a way to pass the global baskets.basket_id value in a variable inside that derived table?
SELECT b.*,
       (SELECT GROUP_CONCAT(f.fruit_id SEPARATOR ', ')
        FROM fruits f
        WHERE f.basket_id = b.basket_id) AS der_baskets
FROM baskets b
The fruit list is a correlated subquery. I don't understand why you define the relationship twice. Is there something you are trying to do that I don't understand?
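To show the correlated subquery producing one row per basket, here is a sketch against an in-memory SQLite database with the question's data. One assumption to note: SQLite writes the separator as a second argument, `GROUP_CONCAT(x, ', ')`, where MySQL uses the `SEPARATOR` keyword. The function name is hypothetical.

```python
import sqlite3


def fruits_per_basket():
    """Return one (basket_id, 'id, id, ...') row per basket."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE baskets (basket_id INTEGER, basket_name TEXT);
        CREATE TABLE fruits (fruit_id INTEGER, basket_id INTEGER, fruit_name TEXT);
        INSERT INTO baskets VALUES (1,'Basket One'),(2,'Basket Two');
        INSERT INTO fruits VALUES (1,1,'Banana'),(2,1,'Apple'),(3,2,'Pear');
    """)
    # The subquery is evaluated once per outer row and can see b.basket_id.
    rows = conn.execute("""
        SELECT b.basket_id,
               (SELECT GROUP_CONCAT(f.fruit_id, ', ')
                FROM fruits f
                WHERE f.basket_id = b.basket_id) AS fruits
        FROM baskets b
        ORDER BY b.basket_id
    """).fetchall()
    conn.close()
    return rows


for basket_id, fruits in fruits_per_basket():
    print(basket_id, "-", fruits)
```

The order of the IDs inside each concatenated list is not guaranteed unless the aggregate is given an explicit ordering.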
I have a more exotic SQL statement I'm trying to perform, which "combines" 3 tables as a Cartesian product and adds together the identically named columns.
I've simplified this as much as possible. Say I've made three tables as such, which will then be combined to make table_d:
mysql>select * from table_a;
Code Goat Dog Cat
A 4 5 6
B 7 8 9
C 10 11 12
mysql>select * from table_b;
Code Goat Dog Cat
D 1 2 3
E 4 5 6
F 7 8 9
mysql>select * from table_c;
Code Goat Dog Cat Bird
T 1 1 1 2
Y 2 2 2 3
U 3 3 3 4
An SQL create table statement along the lines of "create table table_d as (select..." should then produce a table like the one below.
Here the identically named columns are added together, while the Code field is built up as a concatenated string. However, I'm not sure how to go about this.
Thus
mysql>select * from table_d;
Code Goat Dog Cat Bird
ADT 6 8 10 2
ADY 7 9 11 3
ADU 8 10 12 4
BDT 9 11 13 2
BDY .....
....
....
CFU 20 22 24 4
Any advice or help is greatly appreciated at this point. This will also be performed on more than 3 tables at once but I showed only 3 here for simplicity. Thanks!
SQL insertion code:
create table table_a(code varchar(1),goat integer, dog integer, cat integer);
create table table_b(code varchar(1),goat integer, dog integer, cat integer);
create table table_c(code varchar(1),goat integer, dog integer, cat integer, bird integer);
insert into table_a values('A','4','5','6');
insert into table_a values('B','7','8','9');
insert into table_a values('C','10','11','12');
insert into table_b values('D','1','2','3');
insert into table_b values('E','4','5','6');
insert into table_b values('F','7','8','9');
insert into table_c values('T','1','1','1','2');
insert into table_c values('Y','2','2','2','3');
insert into table_c values('U','3','3','3','4');
Try this:
CREATE TABLE table_d AS
SELECT CONCAT(a.code, b.code, c.code) AS code,
       (a.goat + b.goat + c.goat) AS goat,
       (a.dog + b.dog + c.dog) AS dog,
       (a.cat + b.cat + c.cat) AS cat,
       c.bird AS bird
FROM table_a a
CROSS JOIN table_b b
CROSS JOIN table_c c
ORDER BY code;
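The cross-join idea can be checked against the expected table_d from the question with a short sketch using Python's sqlite3 module. Two assumptions: SQLite concatenates strings with `||` where MySQL uses `CONCAT()`, and the Bird column is taken from table_c alone because only that table has it.

```python
import sqlite3


def build_table_d():
    """Build table_d as the Cartesian product of the three tables, summing shared columns."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table_a (code TEXT, goat INTEGER, dog INTEGER, cat INTEGER);
        CREATE TABLE table_b (code TEXT, goat INTEGER, dog INTEGER, cat INTEGER);
        CREATE TABLE table_c (code TEXT, goat INTEGER, dog INTEGER, cat INTEGER, bird INTEGER);
        INSERT INTO table_a VALUES ('A',4,5,6),('B',7,8,9),('C',10,11,12);
        INSERT INTO table_b VALUES ('D',1,2,3),('E',4,5,6),('F',7,8,9);
        INSERT INTO table_c VALUES ('T',1,1,1,2),('Y',2,2,2,3),('U',3,3,3,4);

        CREATE TABLE table_d AS
        SELECT a.code || b.code || c.code AS code,
               a.goat + b.goat + c.goat AS goat,
               a.dog + b.dog + c.dog AS dog,
               a.cat + b.cat + c.cat AS cat,
               c.bird AS bird
        FROM table_a a
        CROSS JOIN table_b b
        CROSS JOIN table_c c;
    """)
    rows = conn.execute(
        "SELECT code, goat, dog, cat, bird FROM table_d ORDER BY code"
    ).fetchall()
    conn.close()
    return rows


rows = build_table_d()
print(len(rows), "rows; first:", rows[0])
```

With more than three tables, the same pattern extends by adding one CROSS JOIN and one term per sum for each extra table.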
You might be getting into trouble because your column schema varies from table to table across a partitioned dataset. Relational DBs really prefer rows to columns when structure fluctuates. What about a more row-oriented model, like:
mysql>select * from table_a;
Code Type Number
A Goat 4
A Dog 5
A Cat 6
B Goat 7
B Dog 8
B Cat 9
C Goat 10
C Dog 11
C Cat 12
If you joined the tables with themselves multiple times then you would be able to use SUM() aggregate functions to do your counts rather than using calculated columns.
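One possible reading of this suggestion, sketched with Python's sqlite3 module: keep all rows in a single long table, build the code combinations with cross joins, and let SUM() add up whatever types exist. The table name `counts` and the `src` column (which original table a row came from) are assumptions introduced for this illustration, and only part of the example data is loaded to keep it short.

```python
import sqlite3


def summed_combinations():
    """Sum Number per (code combination, Type) in a row-oriented model."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- One long table; src records which original table a row came from.
        CREATE TABLE counts (src TEXT, code TEXT, type TEXT, number INTEGER);
        INSERT INTO counts VALUES
            ('a','A','Goat',4), ('a','A','Dog',5), ('a','A','Cat',6),
            ('a','B','Goat',7), ('a','B','Dog',8), ('a','B','Cat',9),
            ('b','D','Goat',1), ('b','D','Dog',2), ('b','D','Cat',3),
            ('c','T','Goat',1), ('c','T','Dog',1), ('c','T','Cat',1),
            ('c','T','Bird',2);
    """)
    # Cross join the distinct codes of each source, then attach every count row
    # that belongs to one of the three codes in the combination.
    rows = conn.execute("""
        SELECT a.code || b.code || c.code AS combo, n.type, SUM(n.number) AS total
        FROM (SELECT DISTINCT code FROM counts WHERE src = 'a') a
        CROSS JOIN (SELECT DISTINCT code FROM counts WHERE src = 'b') b
        CROSS JOIN (SELECT DISTINCT code FROM counts WHERE src = 'c') c
        JOIN counts n
          ON (n.src = 'a' AND n.code = a.code)
          OR (n.src = 'b' AND n.code = b.code)
          OR (n.src = 'c' AND n.code = c.code)
        GROUP BY a.code, b.code, c.code, n.type
        ORDER BY combo, n.type
    """).fetchall()
    conn.close()
    return rows


for row in summed_combinations():
    print(row)
```

A type that appears in only one source, such as Bird, needs no special handling here: it simply contributes a single row to its sum.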