Valid SQL without JOIN? - mysql

I came across the following SQL statement and I was wondering if it was valid:
SELECT COUNT(*)
FROM
registration_waitinglist,
registration_registrationprofile
WHERE
registration_registrationprofile.activation_key = "ALREADY_ACTIVATED"
What do the two tables separated by a comma mean?

When you SELECT data from multiple tables without a join condition, you obtain the Cartesian product of all the tuples from these tables.
This means you get each row from the first table paired with all the rows from the second table. Most of the time, it is not what you want. If you really want it, then it's clearer to use the CROSS JOIN notation:
SELECT * FROM A CROSS JOIN B;

In this context, it means that you are joining every row from registration_waitinglist to every row in registration_registrationprofile.
This is called a Cartesian join.

That query is syntactically correct, meaning it will run. It operates on the entire product of every row in registration_waitinglist x registration_registrationprofile.
For example, if there were 2 rows in waitinglist and 3 rows in profile, the product would contain 6 rows. Because the query selects COUNT(*), it returns a single row holding the number of product rows that pass the WHERE filter.
As a practical matter, this is almost always a logical error and not intended. With rare exceptions, there should be either join criteria or criteria in the WHERE clause that relate the two tables.
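The row arithmetic described above can be checked with a small experiment. This is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL (comma joins behave the same way for this purpose); the table contents are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE registration_waitinglist (id INTEGER PRIMARY KEY);
    CREATE TABLE registration_registrationprofile (
        id INTEGER PRIMARY KEY,
        activation_key TEXT
    );
    INSERT INTO registration_waitinglist (id) VALUES (1), (2);
    INSERT INTO registration_registrationprofile (id, activation_key) VALUES
        (1, 'ALREADY_ACTIVATED'),
        (2, 'ALREADY_ACTIVATED'),
        (3, 'abc123');
""")

# Comma-separated tables: every waitinglist row is paired with every profile row.
comma_total = conn.execute(
    "SELECT COUNT(*) FROM registration_waitinglist, registration_registrationprofile"
).fetchone()[0]

# The explicit CROSS JOIN spelling produces the same product.
cross_total = conn.execute(
    "SELECT COUNT(*) FROM registration_waitinglist "
    "CROSS JOIN registration_registrationprofile"
).fetchone()[0]

# The WHERE clause only filters profile rows, so each surviving profile row
# is still counted once per waitinglist row.
filtered_total = conn.execute(
    "SELECT COUNT(*) FROM registration_waitinglist, registration_registrationprofile "
    "WHERE registration_registrationprofile.activation_key = 'ALREADY_ACTIVATED'"
).fetchone()[0]

print(comma_total, cross_total, filtered_total)  # 6 6 4
```

With 2 waiting-list rows and 2 activated profiles, the filtered count is 4 rather than 2, which is the kind of inflated number this query shape tends to produce.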

Related

MySQL aggregate function to filter nulls and conform with ONLY_FULL_GROUP_BY

I have a single record which joins to N other tables, and extracts a single column from each of them. I would like to put all N of those extracted columns in a single record.
After diagramming the problem, it seems like I can get to the second step easily, and then I should be able to use an aggregate function to filter out the NULLs. I have looked around for something like GROUP_COALESCE, but I couldn't find anything which accomplishes this.
I have a fiddle here which unfortunately works, because MySQL will let you select columns which aren't in the GROUP BY without an aggregate at your own peril http://sqlfiddle.com/#!9/304992/1/0.
Is there a way I can make sure that it always selects the column from the record, if the record exists?
The end result should be one record per group, and each column would contain the value which was inside the only row successfully joined for that group.
If I followed you correctly, you can just use aggregate functions on the columns coming from the joined tables. Aggregate functions ignore null values, so, since you have two null values and one non-null value for each column and each group, this will return the expected output (while conforming to the ONLY_FULL_GROUP_BY option).
SELECT
group_table_id,
MAX(t1.v) t1_v,
MAX(t2.v) t2_v,
MAX(t3.v) t3_v
FROM group_table
LEFT JOIN t1 ON t1.group_id = group_table_id
LEFT JOIN t2 ON t2.group_id = group_table_id
LEFT JOIN t3 ON t3.group_id = group_table_id
GROUP BY group_table_id
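To see why MAX collapses the NULLs, here is a small sketch using Python's sqlite3 module in place of MySQL. The `joined` table is a hypothetical stand-in for the intermediate result of the LEFT JOINs (three rows per group, each with exactly one non-NULL value per column); it is not the schema from the fiddle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical intermediate result: one non-NULL value per column per group.
    CREATE TABLE joined (group_id INTEGER, t1_v TEXT, t2_v TEXT, t3_v TEXT);
    INSERT INTO joined VALUES
        (1, 'a',  NULL, NULL),
        (1, NULL, 'b',  NULL),
        (1, NULL, NULL, 'c'),
        (2, 'x',  NULL, NULL),
        (2, NULL, 'y',  NULL),
        (2, NULL, NULL, 'z');
""")

# MAX skips NULLs, so each group keeps its single non-NULL value per column.
collapsed = conn.execute(
    "SELECT group_id, MAX(t1_v), MAX(t2_v), MAX(t3_v) "
    "FROM joined GROUP BY group_id ORDER BY group_id"
).fetchall()

print(collapsed)  # [(1, 'a', 'b', 'c'), (2, 'x', 'y', 'z')]
```

Every selected column is either grouped or aggregated, which is what ONLY_FULL_GROUP_BY requires. MIN would work equally well here, since each group has only one non-NULL candidate per column.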

sql workbench adding rows after running a join

Hello everyone, I have a quick question. I am running MySQL Workbench, and after joining two tables I get 10000 rows as a result. Considering that the first dataset has 6000 rows and the second 450, I'm clearly doing something wrong, but I can't figure out what it is or why it is happening.
I am selecting some columns from the first dataset and matching them against the second one on the sv3 and sv4 columns.
Can you tell me what I am doing wrong?
The code:
select media.Timestamp, media.Campaign, media.Media, media.sv3, media.sv4
from media
inner join media_1
on media.sv3 = media_1.sv3 and media.sv4 = media_1.sv4
JOIN queries yielding more results than their source records are not necessarily a sign that something is outright wrong, but they can be an indicator of something amiss (queries that need to behave that way exist, but are relatively rare).
The likely source of your issue is that you are joining on a value that is non-unique in both tables. As a simple example: if table X has two records with field A = 5, and table Y has three records with field A = 5, and they are JOINed on field A, those records will produce six results.
This may mean there is a problem with your source data, or you may just need to query it in a different manner. I notice you are only selecting fields from media and none from media_1; this query may yield the results you are expecting:
SELECT media.Timestamp, media.Campaign, media.Media, media.sv3, media.sv4
FROM media
WHERE (sv3, sv4) IN (SELECT sv3, sv4 FROM media_1)
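The difference between the two queries can be sketched with Python's sqlite3 module standing in for MySQL. The rows are invented, the tables are reduced to an id plus the two join columns, and the filter is written with EXISTS, which is assumed here as an equivalent of the (sv3, sv4) IN (...) form that does not depend on row-value support in the SQLite build.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE media (id INTEGER, sv3 TEXT, sv4 TEXT);
    CREATE TABLE media_1 (id INTEGER, sv3 TEXT, sv4 TEXT);
    -- ('a', 'x') appears twice in media and three times in media_1.
    INSERT INTO media VALUES (1, 'a', 'x'), (2, 'a', 'x'), (3, 'b', 'y');
    INSERT INTO media_1 VALUES
        (1, 'a', 'x'), (2, 'a', 'x'), (3, 'a', 'x'), (4, 'c', 'z');
""")

# Inner join: each matching media row is repeated once per matching media_1 row.
joined = conn.execute(
    "SELECT media.id FROM media "
    "INNER JOIN media_1 ON media.sv3 = media_1.sv3 AND media.sv4 = media_1.sv4"
).fetchall()

# Filter only: each media row appears at most once, however many matches exist.
filtered = conn.execute(
    "SELECT media.id FROM media "
    "WHERE EXISTS (SELECT 1 FROM media_1 "
    "              WHERE media_1.sv3 = media.sv3 AND media_1.sv4 = media.sv4) "
    "ORDER BY media.id"
).fetchall()

print(len(joined), filtered)  # 6 [(1,), (2,)]
```

Two source rows times three matches gives six joined rows, while the filtering form returns just the two media rows that have a match.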

Mysql join query with where condition and distinct records

I have two tables called tc_revenue and tc_rates.
tc_revenue contains: code, revenue, startDate, endDate
tc_rates contains: code, tier, payout, startDate, endDate
Now I need to get the records where code = 100, and the records should be unique.
I have used this query
SELECT *
FROM task_code_rates
LEFT JOIN task_code_revenue ON task_code_revenue.code = task_code_rates.code
WHERE task_code_rates.code = 105;
But I am getting repeated records. Please help me find the correct solution.
For example, every record is repeated two times.
Thanks
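Use a GROUP BY on whatever fields need to be unique. For example, if you want one row per code and tier, then:
SELECT *
FROM task_code_rates
LEFT JOIN task_code_revenue ON task_code_revenue.code = task_code_rates.code
WHERE task_code_rates.code = 105
GROUP BY task_code_revenue.code, task_code_rates.tier
Note that SELECT * combined with GROUP BY is generally rejected when ONLY_FULL_GROUP_BY is enabled; in that case, list the grouped columns explicitly and aggregate the rest.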
If code admits duplicates in both tables and you perform the join using only code, then you will get the Cartesian product between all matching rows from one table and all matching rows from the other.
If you have 5 records with code 100 in the first table and 2 records with code 100 in the second table, you'll get 5 times 2 results: all combinations between matching rows from the left and the right.
Unless you have duplicates inside one (or both) tables, all 10 results will differ in columns coming from one table, the other, or both.
But if you were expecting to get two combined rows and three rows from the first table with nulls for the second table's columns, this will not happen.
This is how joins work. And anyway, how should the database decide which rows to combine if it didn't generate all combinations and let you decide in the WHERE clause?
Maybe you need to add more criteria to the ON clause, such as also matching dates?
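As a rough illustration of that suggestion, the sketch below uses Python's sqlite3 module in place of MySQL. It assumes, hypothetically, that rate and revenue periods line up exactly on startDate and endDate; the original question does not confirm this, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE task_code_rates (
        code INTEGER, tier INTEGER, payout REAL, startDate TEXT, endDate TEXT);
    CREATE TABLE task_code_revenue (
        code INTEGER, revenue REAL, startDate TEXT, endDate TEXT);
    -- Two periods for the same code in each table (made-up data).
    INSERT INTO task_code_rates VALUES
        (105, 1, 0.10, '2020-01-01', '2020-01-31'),
        (105, 1, 0.12, '2020-02-01', '2020-02-29');
    INSERT INTO task_code_revenue VALUES
        (105, 500.0, '2020-01-01', '2020-01-31'),
        (105, 650.0, '2020-02-01', '2020-02-29');
""")

# Joining on code alone pairs every rate period with every revenue period.
by_code = conn.execute(
    "SELECT r.startDate, v.revenue FROM task_code_rates r "
    "LEFT JOIN task_code_revenue v ON v.code = r.code "
    "WHERE r.code = 105"
).fetchall()

# Adding the dates to the ON clause pairs each rate period with its own revenue.
by_code_and_dates = conn.execute(
    "SELECT r.startDate, v.revenue FROM task_code_rates r "
    "LEFT JOIN task_code_revenue v "
    "  ON v.code = r.code AND v.startDate = r.startDate AND v.endDate = r.endDate "
    "WHERE r.code = 105 ORDER BY r.startDate"
).fetchall()

print(len(by_code), by_code_and_dates)
# 4 [('2020-01-01', 500.0), ('2020-02-01', 650.0)]
```

If the periods only overlap rather than match exactly, the ON clause would need a range comparison instead of equality.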

Why is the second relation affecting the result?

Given relations R(a,b) and S(c,d), I execute the following query:
select a,b from R,S;
When S is empty, the result is always empty even though R(a,b) is non-empty. I don't understand how S is affecting the query when there should be no interaction with S.
It's because, regardless of the items you're selecting from the query, you're still doing a join between the two tables.
If S is empty, the result of the join is zero rows because that's what the join gives you. That is indeed what you're seeing.
If S had 10,000 rows you would get that many copies of each row in R.
The only way you'll see the correct number of rows from R (assuming no where clause affecting the join), is if S had exactly one row in it.
If you're not using any columns in S for the query, you really shouldn't be listing it as a source table. The correct query would be:
select a, b from R
"I am not getting how S is affecting the query even there should be no interaction with S"
Because the Cartesian product of a set A and the empty set is the empty set. Reference: http://en.wikipedia.org/wiki/Empty_set
Also, see the related question "Why is the Cartesian product of a set A and empty set an empty set?"
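The effect of S's size on the result can be reproduced in a few lines. This sketch uses Python's sqlite3 module rather than any particular database from the question, with invented rows for R and S.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R (a INTEGER, b INTEGER);
    CREATE TABLE S (c INTEGER, d INTEGER);
    INSERT INTO R VALUES (1, 10), (2, 20);
""")

def count_product(conn):
    # Only columns of R are selected, but S still takes part in the product.
    return len(conn.execute("SELECT a, b FROM R, S").fetchall())

empty_count = count_product(conn)   # S is empty: 2 x 0 rows

conn.execute("INSERT INTO S VALUES (7, 70)")
one_count = count_product(conn)     # S has one row: 2 x 1 rows

conn.executemany("INSERT INTO S VALUES (?, ?)", [(8, 80), (9, 90)])
three_count = count_product(conn)   # S has three rows: 2 x 3 rows

print(empty_count, one_count, three_count)  # 0 2 6
```

The result size is always the number of rows in R times the number of rows in S, regardless of which columns are selected.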

Determine if joined table has 1 or more than 1 matching rows. Is there a better way than GROUP BY and COUNT?

I join table A to table B and need to know if table B has 1 matching row or more than one.
Of course, I can do it with GROUP BY and COUNT, but it's an overkill, because it has to count all the matches and I don't need this info.
Is there a simple way to get the info I need (only one matching row or more) which short circuits the evaluation and stops when it knows the answer without scanning and counting all the remaining matches?
Or should I not care about this, because it's not a big performance hit, and simply go with COUNT?
It really depends on the size of the DB and your exact requirements. Generally, a COUNT()/GROUP BY/HAVING combination is a pretty efficient query with the right indexes. You could do it in a more complicated way, for example with an AFTER UPDATE trigger that keeps a count table updated.
Are you seeing the COUNT(*)/GROUP BY/HAVING combination giving you performance issues?
If you just need to know whether a given join has one matching row or more than one:
-- Without any sample SQL code, here's a return sample
SELECT B.SOMEJOINAPPLICABLECOLUMN
FROM A
LEFT OUTER JOIN B
ON A.SOMEJOINAPPLICABLECOLUMN = B.SOMEJOINAPPLICABLECOLUMN
WHERE
B.SOMEJOINAPPLICABLECOLUMN IS NOT NULL
LIMIT 2;
Naturally:
2 returned rows = more than one match
1 returned row = one match
0 returned rows = no matches
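The LIMIT 2 idea can be wrapped in a small helper. The sketch below uses Python's sqlite3 module in place of MySQL; the tables A(id) and B(a_id), the sample rows, the `match_kind` function, and the added `A.id = ?` filter (to check one A row at a time) are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY);
    CREATE TABLE B (a_id INTEGER);
    INSERT INTO A VALUES (1), (2), (3);
    -- A.id 1 has one match, 2 has three matches, 3 has none.
    INSERT INTO B VALUES (1), (2), (2), (2);
""")

def match_kind(conn, a_id):
    """Classify the matches for one A row, fetching at most two joined rows."""
    rows = conn.execute(
        "SELECT B.a_id FROM A "
        "LEFT OUTER JOIN B ON A.id = B.a_id "
        "WHERE A.id = ? AND B.a_id IS NOT NULL "
        "LIMIT 2",
        (a_id,),
    ).fetchall()
    # 0 rows = no match, 1 row = exactly one match, 2 rows = more than one.
    return ("none", "one", "many")[len(rows)]

kinds = {a_id: match_kind(conn, a_id) for a_id in (1, 2, 3)}
print(kinds)  # {1: 'one', 2: 'many', 3: 'none'}
```

Because of LIMIT 2, the database can stop after finding a second match instead of counting all of them, though whether it actually short-circuits depends on the engine and the available indexes.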