Sorry for the beginner question.
I have an Outputs table:
ID
value
0
x
1
y
2
z
And an Inputs table that is linked to the Outputs through the outputsID:
ID
outputsID
name
0
0
A
1
1
B
2
1
C
3
2
B
4
2
C
Assuming that multiple outputs have at least one shared input (in this example outputID 1,3 and 2,4 are the same), is there a way to avoid the duplication of entries in my Inputs table (inputID 3 and 4)?
The 'normal' answer to your question is no. Rows 1 and 2 address output 1, and Rows 3 and 4 address output 2. They aren't duplicates and each reflect something distinct.
So if you are a beginner, I would say you shouldn't want to get rid of these rows.
That said, there are some more advanced techniques. For example, you could have the OutputsID column be an array with multiple values. This is harder, more complex, and non-standard.
Related
usually I use [R] for my data analysis, but these days I have to use SPSS. I was expecting that data manipulation might get a little bit more difficult this way, but after my first day I kind of surrender :D and I really would appreciate some help ...
My problem is the following:
I have two data sets, which have an ID number. Neither data sets have a unique ID (in one data set, which should have unique IDs, there is kind of a duplicated row)
In a perfect world I would like to keep this duplicated row and simply perform a many-to-many-join. But I accepted, that I might have to delete this "bad" row (in dataset A) and perform a 1:many-join (join dataset B to dataset A, which contains the unique IDs).
If I run the join (and accept that it seems not to be possible to run a 1:many, but only a many:1-join), I have the problem, that I lose IDs. If I join dataset A to dataset B I lose all cases, that are not part of dataset B. But I really would like to have both IDs like in a full join or something.
Do you know if there is (kind of) a simple solution to my problem?
Example:
dataset A:
ID
VAL1
1
A
1
B
2
D
3
K
4
A
dataset B:
ID
VAL2
1
g
2
k
4
a
5
c
5
d
5
a
2
x
expected result (best solution):
ID
VAL1
VAL2
1
A
g
1
B
g
2
D
k
3
K
NA
4
A
a
2
D
x
expected result (second best solution):
ID
VAL1
VAL2
1
A
g
2
D
k
3
K
NA
4
A
a
5
NA
c
5
NA
d
5
NA
a
2
D
x
what I get (worst solution):
ID
VAL1
VAL2
1
A
g
2
D
k
4
A
a
5
NA
c
5
NA
d
5
NA
a
2
D
x
From your example It looks like what you need is a full many to many join, based on the ID's existing in dataset A. You can get this by creating a full Cartesian-Product of the two dataset, using dataset A as the first\left dataset.
The following syntax assumes you have the STATS CARTPROD extention command installed. If you don't you can see here about installing it.
First I'll recreate your example to demonstrate on:
dataset close all.
data list list/id1 vl1 (2F3) .
begin data
1 232
1 433
2 456
3 246
4 468
end data.
dataset name aaa.
data list list/id2 vl2 (2F3) .
begin data
1 111
2 222
4 333
5 444
5 555
5 666
2 777
3 888
end data.
dataset name bbb.
Now the actual work is fairly simple:
DATASET ACTIVATE aaa.
STATS CARTPROD VAR1=id1 vl1 INPUT2=bbb VAR2=id2 vl2
/SAVE OUTFILE="C:\somepath\yourcartesianproduct.sav".
* The new dataset now contains all possible combinations of rows in the two datasets.
* we will select only the relevant combinations, where the two ID's match.
select if id1=id2.
exe.
I have data in multiple rows, and I need to combine data in similar columns and separate with a semi colon, to end up with one row with grouped by ID.
I have
ID Type
1 A
1 B
1 C
2 D
3 A
3 F
I want results to be
1 A;B;C
2 D
3 A;F
I have limited knowledge of access, but know this should be basic and easy. I appreciate assistance.
I am wondering if any of you would be able to help me. I am trying to loop through table 1 (which has duplicate values of the plant codes) and based on the unique plant codes, create a new record for the two other tables. For each unique Plant code I want to create a new row in the other two tables and regarding the non unique PtypeID I link any one of the PTypeID's for all inserts it doesnt matter which I choose and for the rest of the fields like name etc. I would like to set those myself, I am just stuck on the logic of how to insert based on looping through a certain table and adding to another. So here is the data:
Table 1
PlantCode PlantID PTypeID
MEX 1 10
USA 2 11
USA 2 12
AUS 3 13
CHL 4 14
Table 2
PTypeID PtypeName PRID
123 Supplier 1
23 General 2
45 Customer 3
90 Broker 4
90 Broker 5
Table 3
PCreatedDate PRID PRName
2005-03-21 14:44:27.157 1 Classification
2005-03-29 00:00:00.000 2 Follow Up
2005-04-13 09:27:17.720 3 Step 1
2005-04-13 10:31:37.680 4 Step 2
2005-04-13 10:32:17.663 5 General Process
Any help at all would be greatly appreciated
I'm unclear on what relationship there is between Table 1 and either of the other two, so this is going to be a bit general.
First, there are two options and both require a select statement to get the unique values of PlantCode out of table1, along with one of the PTypeId's associated with it, so let's do that:
select PlantCode, min(PTypeId)
from table1
group by PlantCode;
This gets the lowest valued PTypeId associated with the PlantCode. You could use max(PTypeId) instead which gets the highest value if you wanted: for 'USA' min will give you 11 and max will give you 12.
Having selected that data you can either write some code (C#, C++, java, whatever) to read through the results row by row and insert new data into table2 and table3. I'm not going to show that, but I'll show how the do it using pure SQL.
insert into table2 (PTypeId, PTypeName, PRID)
select PTypeId, 'YourChoiceOfName', 24 -- set PRID to 24 for all
from
(
select PlantCode, min(PTypeId) as PTypeId
from table1
group by PlantCode
) x;
and follow that with a similar insert.... select... for table3.
Hope that helps.
I have a table structure as follows - there are two columns A and B. For one value of column A, there can be many values of column B (Corresponding to multiple rows). I want to query SQL in a manner that I get all the values of column A for which corresponding to one particular value of column A, column B does not take a particular value. eg:
A B
1 1
1 2
2 1
2 3
2 4
3 2
3 4
3 5
If I don't want column B to have the value 3 for a particular value of column A, the query should return the following on above data
A
1
3
I cannot figure out how to write such a query and searching manually is too time consuming. Please help me write the query. Thanks in advance.
you question is not very clear. I understand that you want something like
SELECT DISTINCT A FROM table WHERE A NOT IN (SELECT A FROM table WHERE B = 3)
I've following problem and don't know the terminology to describe it and hence search for possible solutions.
I have a pivot table (matrix), eg each row and column have a named header. there is a defined set for rows and columns. Now let's assume that 10 rows are "combined" meaning each column is summed up to create a new "pattern".
What I would like is a way to determine alternative row combinations that lead to the same or similar "combined" pattern.
1 1 1
5 5 5
"Combined"
6 6 6
alternate row combination:
2 2 2
4 4 4
Suggestions? How is this problem called?
http://en.wikipedia.org/wiki/System_of_linear_equations#Matrix_equation
I just have to transpose above matrix to get Matrix A
[code]
1 5
1 5
1 5
[/code]
combined matrix is a vector b:
[code]
6
6
6
[/code]
and x would be just a vector full of 1.