MySQL Sub-Query or LEFT JOIN for SELECTing missing columns? - mysql

I need to perform a SELECT query on 3 tables and I don't know whether a sub-query would be better than a LEFT JOIN, since one column might be missing in some cases. These are the tables:
Options (name, info...)
Owners (name, address)
Rel (idoption, idowner)
The SELECT should return all the Options with the name of the Owner inside each record but, in some cases, the Option might not be connected to any Owner, and then the name of the Owner should be empty.
Any suggestions? Thanks in advance.

A LEFT JOIN is likely the appropriate response and will probably be faster than a subquery depending on your results (it's possible that they'd compile to the same plan).
SELECT
op.name
,op.info
,...
,ow.name
,ow.address
FROM
options op
LEFT OUTER JOIN
Rel r
ON r.idoption = op.id
LEFT OUTER JOIN
owners ow
ON ow.id = r.idowner

LEFT JOIN then: it will get all the Options regardless of whether there is a matching Owner or not - "This extra consideration to the left table can be thought of as special kind of preservation. Each item in the left table will show up in a MySQL result, even if there isn't a match with the other table that it is being joined to."
from: http://www.tizag.com/mysqlTutorial/mysqlleftjoin.php

A left join is usually at least as efficient as a subquery, and often faster. If you can live with NULLs for the cases where there's no match, it's the better approach.
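To make the NULL behaviour concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The `id` columns and the sample rows are assumptions (the question only lists name, info and address); the join itself mirrors the answer above. The second query uses COALESCE, which also exists in MySQL, to turn the missing owner name into an empty string as the question asked.

```python
import sqlite3

# Hypothetical schema and data: the id columns are assumed, not given in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE options (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE owners  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE rel     (idoption INTEGER, idowner INTEGER);
    INSERT INTO options VALUES (1, 'opt-a'), (2, 'opt-b');
    INSERT INTO owners  VALUES (10, 'alice');
    INSERT INTO rel     VALUES (1, 10);   -- opt-b has no owner
""")

# Every option is kept; the owner name is NULL (None) when nothing matches.
rows = conn.execute("""
    SELECT op.name, ow.name
    FROM options op
    LEFT JOIN rel r     ON r.idoption = op.id
    LEFT JOIN owners ow ON ow.id = r.idowner
    ORDER BY op.id
""").fetchall()

# Same join, but the missing owner is reported as an empty string.
rows_empty = conn.execute("""
    SELECT op.name, COALESCE(ow.name, '')
    FROM options op
    LEFT JOIN rel r     ON r.idoption = op.id
    LEFT JOIN owners ow ON ow.id = r.idowner
    ORDER BY op.id
""").fetchall()

print(rows)        # [('opt-a', 'alice'), ('opt-b', None)]
print(rows_empty)  # [('opt-a', 'alice'), ('opt-b', '')]
```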

Related

LEFT JOIN or INNER JOIN?

I have the following tables. All fields are NOT NULL.
tb_post
id
account_id
created_at
content
tb_account
id
name
I want to select the latest post along with the name. Should I use INNER JOIN or LEFT JOIN? From my understanding both produce the same results. But which is more correct or faster?
SELECT p.content, a.name
FROM tb_post AS p
[INNER or LEFT] JOIN tb_account AS a
ON a.id = p.account_id
ORDER BY p.created_at DESC
LIMIT 50
A LEFT JOIN is absolutely not faster than an INNER JOIN. In fact, it's slower; by definition, an outer join (LEFT JOIN or RIGHT JOIN) has to do all the work of an INNER JOIN plus the extra work of null-extending the results. It would also be expected to return more rows, further increasing the total execution time simply due to the larger size of the result set.
(And even if a LEFT JOIN were faster in specific situations due to some difficult-to-imagine confluence of factors, it is not functionally equivalent to an INNER JOIN, so you cannot simply go replacing all instances of one with the other!)
Better go for INNER JOIN.
In my view the correct one is INNER JOIN,
because it returns a result set that includes only the matched rows, whereas LEFT JOIN returns all entries from the left table. In this case I think INNER JOIN returns only the data that actually needs to be processed.
You have to ask yourself two questions.
1) Is there any chance that at some point in your application lifetime, there will be posts with an empty or invalid account_id?
If not, it doesn't matter.
If yes...
2) Would it be desirable to include posts without an associated account in the result of the query? If yes, use LEFT JOIN, if no, use INNER JOIN.
I personally don't think speed is very relevant: the difference between them is what they do.
They happen to give the same result in your case, but that does not mean they can be interchanged, because choosing the one or the other still tells the other guy that reads your code something.
I tend to think like this:
INNER JOIN - the two tables are basically ONE set, we just need to combine two sources.
LEFT JOIN - the left table is the source, and optionally we may have additional information (in the right table).
So if I would read your code and see a LEFT JOIN, that's the impression you give me about your data model.
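The two questions above can be checked with a small experiment. This is only a sketch, run on SQLite through Python's sqlite3 module rather than on MySQL, and the sample rows are invented: one post deliberately points at an account id (99) that does not exist, which is the "invalid account_id" case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tb_account (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE tb_post (
        id INTEGER PRIMARY KEY,
        account_id INTEGER NOT NULL,
        created_at TEXT NOT NULL,
        content TEXT NOT NULL
    );
    INSERT INTO tb_account VALUES (1, 'ann');
    -- Invented data: post 2 references account 99, which does not exist.
    INSERT INTO tb_post VALUES (1, 1,  '2024-01-01', 'hello'),
                               (2, 99, '2024-01-02', 'orphan');
""")

# The query from the question, with the join kind left as a placeholder.
QUERY = """
    SELECT p.content, a.name
    FROM tb_post AS p
    {kind} JOIN tb_account AS a ON a.id = p.account_id
    ORDER BY p.created_at DESC
    LIMIT 50
"""

inner_rows = conn.execute(QUERY.format(kind="INNER")).fetchall()
left_rows = conn.execute(QUERY.format(kind="LEFT")).fetchall()

print(inner_rows)  # [('hello', 'ann')]
print(left_rows)   # [('orphan', None), ('hello', 'ann')]
```

With only valid account ids the two lists would be identical, which is why the choice is mostly about what the query says regarding the data model.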

Difference between FROM and JOIN tables

I'm working through the JOIN tutorial on SQL zoo.
Let's say I'm about to execute the code below:
SELECT a.stadium, COUNT(g.matchid)
FROM game a
JOIN goal g
ON g.matchid = a.id
GROUP BY a.stadium
As it happens, it produces the same output as the code below:
SELECT a.stadium, COUNT(g.matchid)
FROM goal g
JOIN game a
ON g.matchid = a.id
GROUP BY a.stadium
So then, when does it matter which table you assign at FROM and which one you assign at JOIN?
When you are using an INNER JOIN like you are here, the order doesn't matter. That is because an inner join only returns rows that match in both tables, so the result is the same whichever table comes first, and the order in which you write them is up to you. You should pick an order that is most logical to you, and easiest to read. A habit of mine is to put the table I'm selecting from first. In your case, you're selecting information about a stadium, which comes from the game table, so my preference would be to put that first.
In other joins, however, such as LEFT OUTER JOIN and RIGHT OUTER JOIN the order will matter. That is because these joins will select all rows from one table. Consider for example I have a table for Students and a table for Projects. They can exist independently, some students may have an associated project, but not all will.
If I want to get all students and project information while still seeing students without projects, I need a LEFT JOIN:
SELECT s.name, p.project
FROM student s
LEFT JOIN project p ON p.student_id = s.id;
Note here that the LEFT JOIN refers to the table in the FROM clause, which means that ALL students are selected. This also means that p.project will be null for some rows. Order matters here.
If I took the same concept with a RIGHT JOIN, it will select all rows from the table in the join clause. So if I changed the query to this:
SELECT s.name, p.project
FROM student s
RIGHT JOIN project p ON p.student_id = s.id;
This will return all rows from the project table, regardless of whether or not it has a match for students. This means that in some rows, s.name will be null. Similar to the first example, because I've made project the outer joined table, p.project will never be null (assuming it isn't in the original table). In the first example, s.name should never be null.
In the case of outer joins, order will matter. Thankfully, you can think intuitively with LEFT and RIGHT joins. A left join will return all rows in the table to the left of that statement, while a right join returns all rows from the right of that statement. Take this as a rule of thumb, but be careful. You might want to develop a pattern to be consistent with yourself, as I mentioned earlier, so these queries are easier for you to understand later on.
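A quick way to see that the table order matters for outer joins is to run both orders on the same data. The sketch below uses Python's sqlite3 module in place of MySQL, with made-up rows. Because older SQLite versions lack RIGHT JOIN, the second query expresses "keep all projects" as a LEFT JOIN with the tables swapped, which is equivalent to the RIGHT JOIN shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE project (id INTEGER PRIMARY KEY, project TEXT, student_id INTEGER);
    -- Made-up data: bob has no project, and 'kite' has no student.
    INSERT INTO student VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO project VALUES (1, 'robot', 1), (2, 'kite', NULL);
""")

# student is the preserved (left) table: every student appears.
students_first = conn.execute("""
    SELECT s.name, p.project
    FROM student s
    LEFT JOIN project p ON p.student_id = s.id
    ORDER BY s.id
""").fetchall()

# project is the preserved table: every project appears.
# Equivalent to: FROM student s RIGHT JOIN project p ON p.student_id = s.id
projects_first = conn.execute("""
    SELECT s.name, p.project
    FROM project p
    LEFT JOIN student s ON p.student_id = s.id
    ORDER BY p.id
""").fetchall()

print(students_first)  # [('ann', 'robot'), ('bob', None)]
print(projects_first)  # [('ann', 'robot'), (None, 'kite')]
```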
When you only JOIN 2 tables, usually the order does not matter: MySQL scans the tables in the optimal order.
When you scan more than 2 tables, the order could matter:
SELECT ...
FROM a
JOIN b ON ...
JOIN c ON ...
Also, MySQL's optimizer tries to read the tables in the order it estimates to be cheapest. But if a join is slow, it is possible that MySQL is scanning them in a non-optimal order. You can verify this with EXPLAIN. In that case, you can force the join order by adding the STRAIGHT_JOIN keyword.
The order doesn't always matter. I usually just order the tables in a way that makes sense to someone reading the query.
Sometimes order does matter. Try it with LEFT JOIN and RIGHT JOIN.
In this instance you are using an INNER JOIN, if you're expecting a match on a common ID or foreign key, it probably doesn't matter too much.
You would however need to specify the tables the correct way round if you were performing an OUTER JOIN, as not all records in this type of join are guaranteed to match via the same field.
Yes, it will matter when you use another join type such as LEFT JOIN or RIGHT JOIN.
Currently you are using a plain JOIN, which in MySQL is an INNER JOIN: it returns the related data from both tables, and a row with no match in the joined table is excluded from the result.
If you use a LEFT / RIGHT {OUTER} join then the result will be different.

MySQL SELECT from two tables with COUNT

I have two tables as below:
Table 1 "customer" with fields "Cust_id", "first_name", "last_name" (10 customers)
Table 2 "cust_order" with fields "order_id", "cust_id" (26 orders)
I need to display "Cust_id", "first_name", "last_name" and a count of "order_id" grouped by cust_id, i.e. the total number of orders placed by each customer.
I am running below query, however, it is counting all the 26 orders and applying that 26 orders to each of the customer.
SELECT COUNT(order_id), cus.cust_id, cus.first_name, cus.last_name
FROM cust_order, customer cus
GROUP BY cust_id;
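Could you please advise what is wrong in the query?
Your issue here is that you have not told the database how these two tables are 'connected', or what they should be connected by. Without a join condition, every customer row is paired with every order row, so each customer gets all 26 orders.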
This effectively allows you to 'join' two tables together, and use a query between them.
so you might want to use something like:
SELECT COUNT(B.order_id), A.cust_id, A.first_name, A.last_name
FROM customer A
LEFT JOIN cust_order B -- this is using a left join, but an inner may be appropriate also
ON (A.cust_id = B.cust_id) -- what links them together
GROUP BY A.cust_id; -- the group by clause
As per your comment requesting some further info:
Left Join (right joins are almost identical, only the other way around):
The SQL LEFT JOIN returns all rows from the left table, even if there are no matches in the right table. This means that if the ON clause matches 0 (zero) records in right table, the join will still return a row in the result, but with NULL in each column from right table. ~Tutorials Point.
This means that a left join returns all the values from the left table, plus matched values from the right table or NULL in case of no matching join predicate.
LEFT joins will be used in the cases where you wish to retrieve all the data from the table in the left hand side, and only data from the right that match.
Execution Time
While the accepted answer in this case may work well in small datasets, it may however become 'heavy' in larger databases. This is because it was not actually designed for this type of operation.
This is the purpose joins were introduced for.
Much work in database-systems has aimed at efficient implementation of joins, because relational systems commonly call for joins, yet face difficulties in optimising their efficient execution. The problem arises because inner joins operate both commutatively and associatively. ~Wikipedia
In practice, this means that the user merely supplies the list of tables for joining and the join conditions to use, and the database system has the task of determining the most efficient way to perform the operation. A query optimizer determines how to execute a query containing joins. So, by allowing the dbms to choose the way your data is queried, you can save a lot of time.
Other Joins/Summary
An INNER JOIN will return data from both tables where the keys in each table match.
A LEFT JOIN or RIGHT JOIN will return all the rows from one table and matching data from the other table.
Use a join when you want to query multiple tables.
Joins are usually faster than other ways of querying two or more tables, and the difference tends to show more clearly on larger datasets.
You could try this one:
SELECT COUNT(cus_order.order_id), cus.cust_id, cus.first_name, cus.last_name
FROM cust_order cus_order, customer cus
WHERE cus_order.cust_id = cus.cust_id
GROUP BY cus.cust_id;
Maybe a left join will help you:
SELECT COUNT(order_id), cus.cust_id, cus.first_name, cus.last_name
FROM customer cus
LEFT JOIN cust_order co
ON (co.cust_id= cus.Cust_id )
GROUP BY cus.cust_id;
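The difference between the original query and the joined versions can be reproduced on a tiny dataset. This is a sketch using Python's sqlite3 module instead of MySQL, with three invented customers and three invented orders rather than the 10 and 26 from the question. The first query has no join condition, like the one in the question, so every customer is paired with every order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE cust_order (order_id INTEGER PRIMARY KEY, cust_id INTEGER);
    -- Invented data: customer 3 has placed no orders.
    INSERT INTO customer VALUES (1, 'Ann', 'A'), (2, 'Bob', 'B'), (3, 'Cy', 'C');
    INSERT INTO cust_order VALUES (100, 1), (101, 1), (102, 2);
""")

# No join condition: a cross product, so every customer "has" all 3 orders.
cross_counts = conn.execute("""
    SELECT cus.cust_id, COUNT(order_id)
    FROM cust_order, customer cus
    GROUP BY cus.cust_id
    ORDER BY cus.cust_id
""").fetchall()

# LEFT JOIN with a condition: real counts, including 0 for customer 3.
left_counts = conn.execute("""
    SELECT cus.cust_id, COUNT(co.order_id)
    FROM customer cus
    LEFT JOIN cust_order co ON co.cust_id = cus.cust_id
    GROUP BY cus.cust_id
    ORDER BY cus.cust_id
""").fetchall()

# INNER JOIN: customers without orders drop out entirely.
inner_counts = conn.execute("""
    SELECT cus.cust_id, COUNT(co.order_id)
    FROM customer cus
    INNER JOIN cust_order co ON co.cust_id = cus.cust_id
    GROUP BY cus.cust_id
    ORDER BY cus.cust_id
""").fetchall()

print(cross_counts)  # [(1, 3), (2, 3), (3, 3)]
print(left_counts)   # [(1, 2), (2, 1), (3, 0)]
print(inner_counts)  # [(1, 2), (2, 1)]
```

Note that COUNT(co.order_id) counts only non-NULL values, which is why the customer without orders gets 0 rather than 1 in the LEFT JOIN version.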

SQL - Joining tables BUT not always

I need to write a SELECT query that joins three tables (no problem with that). However, the third table may or may not have a row that matches the join key.
I want ALL data from the first two tables and, if the items ALSO have information in the third table, to fetch that data too.
For example, imagine that the first table holds a person, the second table holds his/her address (everyone lives somewhere), and the third table stores the driving license (not everyone has one). I need to fetch all the data for all people, whether or not they have a driving license.
Thanks a lot for reading, and for any suggestion / solution you can give!
Use LEFT JOIN to join the third table. With INNER JOIN, a matching row has to exist. With LEFT JOIN, the 'gaps' will be filled with NULLs.
SELECT
p.PersonID, -- NOT NULL
-- dl.PersonID, -- Can be null. Don't use this one.
p.FirstName,
p.LastName,
a.City,
a.Street,
dl.ValidUntilDate
FROM
Person p
INNER JOIN Addresse a ON a.AddressID = p.HomeAddressID
LEFT JOIN DrivingLicence dl ON dl.PersonId = p.PersonID

Including rest of the rows in MySQL

I have an SQL query that selects user's privileges, and adds true to them.
SELECT
PrivilageName,
'true' hasrights -- imaginary column
FROM
users
NATURAL JOIN usermemberships
NATURAL JOIN groupprivileges
NATURAL JOIN `privileges`
WHERE
UserID = '2'
Result is
AddBuilding true
RemoveBuilding true
EditBuilding true
I'm trying to add the rest of the privileges with a false value.
AddBuilding true
RemoveBuilding true
EditBuilding true
RemoveUser false
AddUser false
How can I do this?
Edit: the structure of the tables:
users(UserID),
usermemberships(UserID, groupID),
groupprivileges(GroupID, PrivilegeID),
privileges(PrivilegeID, PrivilageName)
Edit: misspelling, sorry.
(NOTE: The queries in this answer are now updated, to include the column names that were added to the question.)
One approach to getting that resultset would be to use LEFT JOIN operations (with appropriate predicates in the ON clause), in place of all those NATURAL JOIN operations.
(I'm just guessing at the column names referenced by the NATURAL JOIN. In order to decipher that, we would need to inspect each table definition to get a list of all of the columns, and then find all the column names that match, to figure out which columns MySQL is using to do those inner join operations.)
Based on the scant information provided in the query text, here's the approach I would take (again, just guessing at the names referenced in each ON clause):
SELECT p.PrivilageName
, IF(u.UserID IS NOT NULL,'true','false') AS hasrights
FROM `privileges` p
LEFT
JOIN groupprivileges g
ON g.PrivilegeID = p.PrivilegeID
LEFT
JOIN usermemberships m
ON m.GroupId = g.GroupID
LEFT
JOIN users u
ON u.UserID = m.UserID
AND u.UserID = 2
Depending on the cardinality in those tables (i.e. is "AddBuilding" privilege granted to two different groups, one which the user is a member of and the other not...)
and depending on whether you want to avoid returning any "duplicate" PrivilageName values (either multiple rows with "true" or "false", or rows with both "true" and "false" for each PrivilageName), and depending on how you want the resultset ordered (i.e. do you want all the "true" privileges listed first?)...
Then this query is more deterministic in the resultset that is returned, it will return each PrivilageName only once. This resultset seems better suited to answer the question whether a user has a privilege or not.
SELECT p.PrivilageName
, MAX(IF(u.UserID IS NOT NULL,'true','false')) AS hasrights
FROM `privileges` p
LEFT
JOIN groupprivileges g
ON g.PrivilegeID = p.PrivilegeID
LEFT
JOIN usermemberships m
ON m.GroupId = g.GroupID
LEFT
JOIN users u
ON u.UserID = m.UserID
AND u.UserID = 2
GROUP BY p.PrivilageName
ORDER BY hasrights DESC, p.PrivilageName ASC
(Personally, I'd omit the ORDER BY, and let the results be ordered by PrivilageName, but with the ORDER BY, this better matches the resultset specified in the question.)
Of course, that's not the only way to get the result set, but it's likely to be the most efficient (given suitable indexes).
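As a sanity check of the grouped LEFT JOIN approach, here is a sketch that runs it on SQLite through Python's sqlite3 module. Several things are assumptions: the sample rows are invented, the last join is written against m.UserID (usermemberships is the table that carries UserID in the structure given in the question), and the MySQL-specific IF() is replaced by the standard CASE expression, which MySQL also accepts, because SQLite has no two-branch IF() in older versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (UserID INTEGER PRIMARY KEY);
    CREATE TABLE usermemberships (UserID INTEGER, GroupID INTEGER);
    CREATE TABLE groupprivileges (GroupID INTEGER, PrivilegeID INTEGER);
    CREATE TABLE privileges (PrivilegeID INTEGER PRIMARY KEY, PrivilageName TEXT);
    -- Invented data: user 2 is in group 1, user 3 is in group 2.
    INSERT INTO users VALUES (2), (3);
    INSERT INTO usermemberships VALUES (2, 1), (3, 2);
    INSERT INTO privileges VALUES (1, 'AddBuilding'), (2, 'RemoveBuilding'),
                                  (3, 'EditBuilding'), (4, 'RemoveUser'), (5, 'AddUser');
    -- Group 1 holds the building privileges; group 2 holds AddBuilding and AddUser.
    INSERT INTO groupprivileges VALUES (1, 1), (1, 2), (1, 3), (2, 1), (2, 5);
""")

# One row per privilege; 'true' if any of user 2's groups grants it.
rights = conn.execute("""
    SELECT p.PrivilageName,
           MAX(CASE WHEN u.UserID IS NOT NULL THEN 'true' ELSE 'false' END) AS hasrights
    FROM privileges p
    LEFT JOIN groupprivileges g ON g.PrivilegeID = p.PrivilegeID
    LEFT JOIN usermemberships m ON m.GroupID = g.GroupID
    LEFT JOIN users u           ON u.UserID = m.UserID AND u.UserID = 2
    GROUP BY p.PrivilageName
    ORDER BY hasrights DESC, p.PrivilageName ASC
""").fetchall()

for name, has in rights:
    print(name, has)
```

MAX works here because the string 'true' sorts after 'false', so a single granting group is enough to make the privilege come out as 'true', even when another group the user is not in (group 2 for AddBuilding) also grants it.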
Personally, I don't ever use NATURAL JOIN. (I want to see the predicates in the statement, and I don't want any of my queries to "break" if someone adds a column with a matching name to one of the tables in my query.) Actually, thinking about it, I can't use NATURAL JOIN because id is typically the name of the primary key column of nearly all my tables, and foreign key columns are typically named referencedtable_id. But even if I did name the columns in a way that I could use NATURAL JOIN, I see the potential drawbacks outweighing any advantage.
But something like the statement below might work. (I say "might" because I don't have any experience using syntax like this... I never use NATURAL JOIN, and I always prefer LEFT joins to RIGHT joins. If someone in my shop came to me with this, I would give them the statement above.) But I don't want to leave you with the impression that a NATURAL JOIN can't be used to return the specified resultset. It's possible your specified resultset might be returned by a statement like this:
SELECT
PrivilageName,
MAX(IF(UserID=2,'true','false')) AS hasrights
FROM
users
NATURAL RIGHT JOIN usermemberships
NATURAL RIGHT JOIN groupprivileges
NATURAL RIGHT JOIN `privileges`
GROUP BY PrivilageName
You can use UNION to concatenate the results of two queries.
The IF() function may also help you.