Why is the second relation affecting the result? - mysql

Given relations R(a,b) and S(c,d), I execute the following query:
select a,b from R,S;
When S is empty, the result is always empty, even though R(a,b) is non-empty. I don't understand how S affects the query, since there should be no interaction with S.

It's because, regardless of the items you're selecting from the query, you're still doing a join between the two tables.
If S is empty, the result of the join is zero rows because that's what the join gives you. That is indeed what you're seeing.
If S had 10,000 rows you would get that many copies of each row in R.
The only way you'll see the correct number of rows from R (assuming no where clause affecting the join) is if S has exactly one row in it.
If you're not using any columns in S for the query, you really shouldn't be listing it as a source table. The correct query would be:
select a, b from R
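To see the row counts described above, here is a minimal sketch that runs the same query against an in-memory SQLite database from Python. SQLite stands in for MySQL here, which is an assumption on my part, and the helper name count_rows and the sample values are made up for illustration.

```python
import sqlite3

def count_rows(r_rows, s_rows):
    """Return how many rows `SELECT a, b FROM R, S` yields for the given contents."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not MySQL (assumption)
    conn.execute("CREATE TABLE R (a INTEGER, b INTEGER)")
    conn.execute("CREATE TABLE S (c INTEGER, d INTEGER)")
    conn.executemany("INSERT INTO R VALUES (?, ?)", r_rows)
    conn.executemany("INSERT INTO S VALUES (?, ?)", s_rows)
    rows = conn.execute("SELECT a, b FROM R, S").fetchall()
    conn.close()
    return len(rows)

r = [(1, 10), (2, 20), (3, 30)]                        # hypothetical sample data
empty_s = count_rows(r, [])                            # S is empty
one_s = count_rows(r, [(0, 0)])                        # S has exactly one row
four_s = count_rows(r, [(i, i) for i in range(4)])     # S has four rows

print(empty_s, one_s, four_s)  # 0 3 12
```

The result size is always (rows in R) x (rows in S), so an empty S gives zero rows even though no column of S is selected.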

I am not getting how S is affecting the query even there should be no interaction with S
Because the Cartesian product of A and the empty set is an empty set. Reference: http://en.wikipedia.org/wiki/Empty_set
Also see the question "Why is the Cartesian product of a set A and the empty set an empty set?"

Related

LEFT OUTER JOIN GIVING UNEXPECTED RESULT

I'm making a simple query :
SELECT * FROM Bench LEFT JOIN ASSIGNED_DOM
ON (Bench.B_id=ASSIGNED_DOM.B_id) WHERE Bench.B_type=0 ;
As expected, all the rows of the Bench table are returned, BUT when I try to read the B_id field I find that it is set to NULL.
Then I have tried with this other query that should be totally equivalent:
SELECT * FROM Bench LEFT JOIN ASSIGNED_DOM USING (B_id) WHERE Bench.B_type=0 ;
But in that case the B_id field is returned correctly.
What's wrong with the first query? What is the difference between the two?
The two queries are not equivalent. According to the documentation,
Natural joins and joins with USING, including outer join variants, are processed according to the SQL:2003 standard.
Redundant columns of a NATURAL join do not appear.
That specifically comes down to the following difference:
A USING clause can be rewritten as an ON clause that compares corresponding columns. However, although USING and ON are similar, they are not quite the same.
With respect to determining which rows satisfy the join condition, both joins are semantically identical.
With respect to determining which columns to display for SELECT * expansion, the two joins are not semantically identical. The USING join selects the coalesced value of corresponding columns, whereas the ON join selects all columns from all tables.
So your first query has two columns of the same name, bench.B_id, ASSIGNED_DOM.B_id, while the second one just has one, coalesce(bench.B_id, ASSIGNED_DOM.B_id) as B_id.
It will depend on your application/framework how exactly the first case will be handled. E.g. the MySQL client or phpmyadmin will just display all columns. Some frameworks may alter the names in some way to make them unique.
PHP in particular (and I assume you are using this) will not, though: if you use $row['B_id'], it will return the last occurrence (although that behaviour is not specified), so in your case you will get ASSIGNED_DOM.B_id. You can, however, still access both columns by their index (e.g. $row[0], $row[1]), but only one of them by the shared column name.
To prevent such problems, you can/should use aliases, e.g. select bench.B_id as bench_B_id, ASSIGNED_DOM.B_id as ASSIGNED_DOM_B_id, ....
Values from the second table overwrite values from the first one if the column name is the same. Try using aliases in your query:
SELECT Bench.B_id AS bid1, ASSIGNED_DOM.B_id AS bid2
FROM Bench
LEFT JOIN ASSIGNED_DOM ON (Bench.B_id=ASSIGNED_DOM.B_id)
WHERE Bench.B_type=0;
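The difference in SELECT * expansion can be reproduced with a small sketch. It uses Python's sqlite3 module with an in-memory database rather than MySQL and PHP, which is an assumption on my part; SQLite also shows a USING column only once. The dom column and the sample rows are hypothetical. Building a dict from the column names mimics the "last occurrence wins" lookup by name described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not MySQL (assumption)
conn.executescript("""
    CREATE TABLE Bench (B_id INTEGER, B_type INTEGER);
    CREATE TABLE ASSIGNED_DOM (B_id INTEGER, dom TEXT);  -- dom is a made-up column
    INSERT INTO Bench VALUES (1, 0), (2, 0);
    INSERT INTO ASSIGNED_DOM VALUES (1, 'x');            -- bench 2 has no match
""")

def columns_and_rows(sql):
    """Run a query and return (column names, rows)."""
    cur = conn.execute(sql)
    return [d[0] for d in cur.description], cur.fetchall()

on_names, on_rows = columns_and_rows(
    "SELECT * FROM Bench LEFT JOIN ASSIGNED_DOM "
    "ON (Bench.B_id = ASSIGNED_DOM.B_id) WHERE Bench.B_type = 0 ORDER BY Bench.B_id")
using_names, using_rows = columns_and_rows(
    "SELECT * FROM Bench LEFT JOIN ASSIGNED_DOM "
    "USING (B_id) WHERE Bench.B_type = 0 ORDER BY Bench.B_id")

# Name-keyed rows: a duplicate column name keeps only the last value.
on_dicts = [dict(zip(on_names, row)) for row in on_rows]
using_dicts = [dict(zip(using_names, row)) for row in using_rows]

print(on_names)                # B_id appears twice
print(using_names)             # B_id appears once
print(on_dicts[1]["B_id"])     # None: the value from ASSIGNED_DOM won
print(using_dicts[1]["B_id"])  # 2
```

With the ON join, the unmatched Bench row still carries its own B_id in the first column, but a lookup by name picks up the NULL from ASSIGNED_DOM.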

sql workbench adding rows after running a join

Hello everyone, I have a quick question. I am running MySQL Workbench and after joining two tables I get 10000 rows as the result. Considering that the first dataset has 6000 rows and the second 450, I'm clearly doing something wrong, but I can't figure out what it is or why it is happening.
I am selecting some columns from the first dataset and matching it against the second one on the sv3 and sv4 columns.
Can you tell me what I am doing wrong?
the code
select media.Timestamp, media.Campaign, media.Media, media.sv3, media.sv4
from media
inner join media_1
on media.sv3=media_1.sv3 and media.sv4=media_1.sv4
JOIN queries yielding more results than their source records is not necessarily a sign something is outright wrong; but can be an indicator of something amiss (queries that need to behave that way exist, but are relatively rare).
The source of your issue is likely because you are joining on a value that is non-unique in both tables. As a simple example: If table X has two records with field A = 5, and table Y has three records with field A = 5, and they are JOINed on field A; those records will produce six results.
This may mean there is a problem with your source data, or you may just need to query it in a different manner. I notice you are only selecting fields from media and none from media_1; this query may yield the results you are expecting:
SELECT media.Timestamp, media.Campaign, media.Media, media.sv3, media.sv4
FROM media
WHERE (sv3, sv4) IN (SELECT sv3, sv4 FROM media_1)
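A small sketch of both effects, using Python's sqlite3 module with an in-memory database (SQLite instead of MySQL is an assumption, and the row-value IN form needs SQLite 3.15 or later). The table contents are invented: two media rows and three media_1 rows share the same (sv3, sv4) pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not MySQL (assumption)
conn.executescript("""
    CREATE TABLE media (Campaign TEXT, sv3 INTEGER, sv4 INTEGER);
    CREATE TABLE media_1 (sv3 INTEGER, sv4 INTEGER);
    -- two media rows share the key (5, 1); 'c3' has no partner in media_1
    INSERT INTO media VALUES ('c1', 5, 1), ('c2', 5, 1), ('c3', 9, 9);
    -- three media_1 rows share that same key
    INSERT INTO media_1 VALUES (5, 1), (5, 1), (5, 1);
""")

# Inner join: every matching pair becomes a result row.
joined = conn.execute("""
    SELECT media.Campaign
    FROM media
    INNER JOIN media_1
      ON media.sv3 = media_1.sv3 AND media.sv4 = media_1.sv4
""").fetchall()

# IN subquery: each media row appears at most once.
filtered = conn.execute("""
    SELECT media.Campaign
    FROM media
    WHERE (sv3, sv4) IN (SELECT sv3, sv4 FROM media_1)
    ORDER BY media.Campaign
""").fetchall()

print(len(joined))  # 6, i.e. 2 media rows x 3 media_1 rows
print(filtered)     # [('c1',), ('c2',)]
```

The join returns more rows than media contains, while the IN form only filters media and can never return more rows than the table has.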

Basics: Query results not returning as expected

I have less than basic knowledge of MS Access, as I only need to use it to pull down information irregularly before using R to do the manipulation. As a result, I have no SQL coding knowledge - I just use the Access GUI.
My problem: When I create a query that includes multiple tables Access seems to exclude the results that don't have values in all of the tables.
Solution: I'm looking for a simple way, through the GUI, to tell Access to include all the IDs in the parent table, irrespective of whether they have values in any of the child tables. Those IDs that have no values in the child tables should just return with blanks in those columns.
I know this is probably SQL 101 but my searching hasn't returned anything useful.
You should use LEFT JOIN or RIGHT JOIN, the direction meaning the table from which you want to get all rows. See the select below:
SELECT * FROM TABLE_A a LEFT JOIN TABLE_B b ON a.id=b.id
This will return all rows from TABLE_A linked to the corresponding rows from TABLE_B. When there is no match the TABLE_B columns will return NULL.
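For comparison, here is a minimal sketch of an inner join next to the LEFT JOIN above. It runs in an in-memory SQLite database from Python rather than in Access, which is an assumption on my part, and the name and value columns are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not Access (assumption)
conn.executescript("""
    CREATE TABLE TABLE_A (id INTEGER, name TEXT);    -- parent table
    CREATE TABLE TABLE_B (id INTEGER, value TEXT);   -- child table
    INSERT INTO TABLE_A VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO TABLE_B VALUES (1, 'child of one');  -- only id 1 has a child row
""")

# Inner join: parents without a child row are dropped.
inner_rows = conn.execute("""
    SELECT a.id, b.value
    FROM TABLE_A a INNER JOIN TABLE_B b ON a.id = b.id
    ORDER BY a.id
""").fetchall()

# Left join: every parent row is kept; missing child values come back as NULL.
left_rows = conn.execute("""
    SELECT a.id, b.value
    FROM TABLE_A a LEFT JOIN TABLE_B b ON a.id = b.id
    ORDER BY a.id
""").fetchall()

print(inner_rows)  # [(1, 'child of one')]
print(left_rows)   # [(1, 'child of one'), (2, None), (3, None)]
```

The None values are how Python shows SQL NULL, which is what the blank columns for unmatched IDs would be.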

Valid SQL without JOIN?

I came across the following SQL statement and I was wondering if it was valid:
SELECT COUNT(*)
FROM
registration_waitinglist,
registration_registrationprofile
WHERE
registration_registrationprofile.activation_key = "ALREADY_ACTIVATED"
What do the two tables separated by a comma mean?
When you SELECT data from multiple tables, you obtain the Cartesian product of all the tuples from these tables.
This means you get each row from the first table paired with all the rows from the second table. Most of the time, it is not what you want. If you really want it, then it's clearer to use the CROSS JOIN notation:
SELECT * FROM A CROSS JOIN B;
In this context, it means that you are going to be joining every row from registration_waitinglist to every row in registration_registrationprofile
It's called a cartesian join
That query is 'syntactically' correct, meaning it will run. What the query will return is the entire product of every row in registration_waitinglist x registration_registrationprofile.
For example, if there were 2 rows in waitinglist and 3 rows in profile, then 6 rows will be returned.
As a practical matter, this is almost always a logical error and not intended. With rare exceptions, there should be either join criteria or criteria in the where clause.
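As a rough check of the counts discussed above, this sketch runs the comma form, the CROSS JOIN form, and the original filtered query against an in-memory SQLite database from Python. SQLite instead of MySQL is an assumption, the id columns and sample rows are invented, and the string literal uses single quotes because SQLite treats double quotes as identifier quoting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not MySQL (assumption)
conn.executescript("""
    CREATE TABLE registration_waitinglist (id INTEGER);
    CREATE TABLE registration_registrationprofile (id INTEGER, activation_key TEXT);
    INSERT INTO registration_waitinglist VALUES (1), (2);
    INSERT INTO registration_registrationprofile VALUES
        (1, 'ALREADY_ACTIVATED'), (2, 'abc123'), (3, 'def456');
""")

def scalar(sql):
    """Return the single value produced by a COUNT(*) query."""
    return conn.execute(sql).fetchone()[0]

comma_count = scalar(
    "SELECT COUNT(*) FROM registration_waitinglist, registration_registrationprofile")
cross_count = scalar(
    "SELECT COUNT(*) FROM registration_waitinglist "
    "CROSS JOIN registration_registrationprofile")
filtered_count = scalar(
    "SELECT COUNT(*) FROM registration_waitinglist, registration_registrationprofile "
    "WHERE registration_registrationprofile.activation_key = 'ALREADY_ACTIVATED'")

print(comma_count, cross_count, filtered_count)  # 6 6 2
```

The filtered count is 2 rather than 1: the single activated profile is paired with each of the two waiting-list rows, which is rarely what the author of such a query meant.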

Determine if joined table has 1 or more than 1 matching rows. Is there a better way than GROUP BY and COUNT?

I join table A to table B and need to know if table B has 1 matching row or more than one.
Of course, I can do it with GROUP BY and COUNT, but that's overkill, because it has to count all the matches and I don't need that information.
Is there a simple way to get the info I need (only one matching row, or more than one) that short-circuits the evaluation and stops as soon as it knows the answer, without scanning and counting all the remaining matches?
Or should I not care about this, because it's not a big performance hit, and simply go with COUNT?
It really depends on the size of the DB, and your exact requirements. Generally a count()/Group By/Having combination is a pretty efficient query, with the right indexes. You could do it in a more complicated way, for example, having a trigger on after update that keeps a count table updated.
Are you seeing the count(*)/group/having combination giving you performance issues?
If you just need to know whether the join produces one matching row or more than one:
-- Without any sample SQL code, here's a return sample
SELECT B.SOMEJOINAPPLICABLECOLUMN
FROM A
LEFT OUTER JOIN B
ON A.SOMEJOINAPPLICABLECOLUMN = B.SOMEJOINAPPLICABLECOLUMN
WHERE
B.SOMEJOINAPPLICABLECOLUMN IS NOT NULL
LIMIT 2;
Naturally:
2 returned rows = more than one match
1 returned row = one match
0 returned rows = no matches
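A possible way to apply the LIMIT 2 idea per row of A is sketched below, using Python's sqlite3 module with an in-memory database. SQLite instead of MySQL, the column names id and a_id, and the helper match_kind are all assumptions for illustration. Whether the engine really stops scanning after two rows depends on its query plan and indexes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not MySQL (assumption)
conn.executescript("""
    CREATE TABLE A (id INTEGER);
    CREATE TABLE B (a_id INTEGER);               -- a_id is a hypothetical join column
    INSERT INTO A VALUES (1), (2), (3);
    INSERT INTO B VALUES (1), (2), (2), (2);     -- id 1: one match, id 2: three, id 3: none
""")

def match_kind(a_id):
    """Classify one row of A as having 'none', 'one' or 'many' matches in B."""
    rows = conn.execute(
        "SELECT B.a_id FROM A JOIN B ON A.id = B.a_id WHERE A.id = ? LIMIT 2",
        (a_id,),
    ).fetchall()
    # LIMIT 2 caps the result at two rows, so len(rows) is 0, 1 or 2.
    return ("none", "one", "many")[len(rows)]

print(match_kind(1), match_kind(2), match_kind(3))  # one many none
```

At most two rows are ever fetched per key, no matter how many matches B holds.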