Including rest of the rows in MySQL - mysql

I have an SQL query that selects user's privileges, and adds true to them.
SELECT
PrivilageName,
'true' hasrights <-- imaginary column
FROM
users
NATURAL JOIN usermemberships
NATURAL JOIN groupprivileges
NATURAL JOIN `privileges`
WHERE
UserID = '2'
Result is
AddBuilding true
RemoveBuilding true
EditBuilding true
I'm trying to add the rest of the privileges with a false value:
AddBuilding true
RemoveBuilding true
EditBuilding true
RemoveUser false
AddUser false
How can I do this?
Edit: the structure of the tables:
users(UserID),
usermemberships(UserID, groupID),
groupprivileges(GroupID, PrivilegeID),
privileges(PrivilegeID, PrivilageName)
Edit: misspelling, sorry.

(NOTE: The queries in this answer are now updated, to include the column names that were added to the question.)
One approach to getting that resultset would be to use LEFT JOIN operations (with appropriate predicates in the ON clause), in place of all those NATURAL JOIN operations.
(I'm just guessing at the column names referenced by the NATURAL JOIN. In order to decipher that, we would need to inspect each table definition to get a list of all of the columns, and then find all the column names that match, to figure out which columns MySQL is using to do those inner join operations.)
Based on the scant information provided in the query text, here's the approach I would take (again, just guessing at the names referenced in each ON clause):
SELECT p.PrivilageName
, IF(u.UserID IS NOT NULL,'true','false') AS hasrights
FROM `privileges` p
LEFT
JOIN groupprivileges g
ON g.PrivilegeID = p.PrivilegeID
LEFT
JOIN usermemberships m
ON m.GroupId = g.GroupID
LEFT
JOIN users u
ON u.UserID = m.UserID
AND u.UserID = 2
Depending on the cardinality in those tables (e.g. the "AddBuilding" privilege may be granted to two different groups, only one of which the user is a member of), the query above can return "duplicate" PrivilageName values: multiple rows with "true" or "false", or rows with both "true" and "false" for the same PrivilageName.
You may also care how the resultset is ordered (e.g. do you want all the "true" privileges listed first?).
The following query is more deterministic in the resultset it returns: it returns each PrivilageName only once. This resultset seems better suited to answering the question of whether a user has a privilege or not.
SELECT p.PrivilageName
, MAX(IF(u.UserID IS NOT NULL,'true','false')) AS hasrights
FROM `privileges` p
LEFT
JOIN groupprivileges g
ON g.PrivilegeID = p.PrivilegeID
LEFT
JOIN usermemberships m
ON m.GroupId = g.GroupID
LEFT
JOIN users u
ON u.UserID = m.UserID
AND u.UserID = 2
GROUP BY p.PrivilageName
ORDER BY hasrights DESC, p.PrivilageName ASC
(Personally, I'd omit the ORDER BY, and let the results be ordered by PrivilageName, but with the ORDER BY, this better matches the resultset specified in the question.)
Of course, that's not the only way to get the result set, but it's likely to be the most efficient (given suitable indexes).
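For anyone who wants to try the LEFT JOIN approach without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up for illustration, SQLite stands in for MySQL, and CASE is used in place of MySQL's IF(), so treat it as an approximation of the query above rather than the exact statement.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (UserID INTEGER PRIMARY KEY);
CREATE TABLE usermemberships (UserID INTEGER, GroupID INTEGER);
CREATE TABLE groupprivileges (GroupID INTEGER, PrivilegeID INTEGER);
CREATE TABLE privileges (PrivilegeID INTEGER PRIMARY KEY, PrivilageName TEXT);
-- Hypothetical sample data: user 2 is in group 10, user 1 is in group 20.
INSERT INTO users VALUES (1), (2);
INSERT INTO usermemberships VALUES (2, 10), (1, 20);
INSERT INTO groupprivileges VALUES (10, 1), (10, 2), (10, 3), (20, 1), (20, 4), (20, 5);
INSERT INTO privileges VALUES
  (1, 'AddBuilding'), (2, 'RemoveBuilding'), (3, 'EditBuilding'),
  (4, 'RemoveUser'), (5, 'AddUser');
""")

# Start from privileges so every privilege survives the outer joins,
# then collapse duplicates with GROUP BY and MAX ('true' sorts after 'false').
query = """
SELECT p.PrivilageName,
       MAX(CASE WHEN u.UserID IS NOT NULL THEN 'true' ELSE 'false' END) AS hasrights
FROM privileges p
LEFT JOIN groupprivileges g ON g.PrivilegeID = p.PrivilegeID
LEFT JOIN usermemberships m ON m.GroupID = g.GroupID
LEFT JOIN users u ON u.UserID = m.UserID AND u.UserID = ?
GROUP BY p.PrivilageName
ORDER BY hasrights DESC, p.PrivilageName ASC
"""
rows = conn.execute(query, (2,)).fetchall()
for name, hasrights in rows:
    print(name, hasrights)
```

Note that "AddBuilding" is granted to both groups in this sample, which is exactly the case where the GROUP BY / MAX version matters.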
Personally, I don't ever use NATURAL JOIN. I want to see the predicates in the statement, and I don't want any of my queries to "break" if someone adds a column with a matching name to one of the tables in my query. (Actually, thinking about it, I can't use NATURAL JOIN because id is typically the name of the primary key column of nearly all my tables, and foreign key columns are typically named referencedtable_id.) But even if I did name the columns in a way that let me use NATURAL JOIN, I see the potential drawbacks outweighing any advantage.
But something like the statement below might work. (I say "might" because I don't have any experience using syntax like this. I never use NATURAL JOIN, and I always prefer LEFT joins to RIGHT joins. If someone in my shop came to me with this, I would give them the statement above.) But I don't want to leave you with the impression that a NATURAL JOIN can't be used to return the specified resultset. It's possible your specified resultset might be returned by a statement like this:
SELECT
PrivilageName,
MAX(IF(UserID=2,'true','false')) AS hasrights
FROM
users
NATURAL RIGHT JOIN usermemberships
NATURAL RIGHT JOIN groupprivileges
NATURAL RIGHT JOIN `privileges`
GROUP BY PrivilageName

You can use UNION to concatenate the results of two queries.
The IF() function may also help you.
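The UNION idea could look roughly like the sketch below: one query returns the privileges the user has (with 'true'), the other returns the remaining privileges (with 'false'). This is only one way to read the suggestion; it uses Python's sqlite3 with made-up sample rows, and SQLite stands in for MySQL.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE usermemberships (UserID INTEGER, GroupID INTEGER);
CREATE TABLE groupprivileges (GroupID INTEGER, PrivilegeID INTEGER);
CREATE TABLE privileges (PrivilegeID INTEGER PRIMARY KEY, PrivilageName TEXT);
-- Hypothetical sample data.
INSERT INTO usermemberships VALUES (2, 10);
INSERT INTO groupprivileges VALUES (10, 1), (10, 2);
INSERT INTO privileges VALUES (1, 'AddBuilding'), (2, 'RemoveBuilding'), (3, 'AddUser');
""")

# First branch: privileges granted through any of the user's groups.
# Second branch: every privilege not in that set.
query = """
SELECT p.PrivilageName, 'true' AS hasrights
FROM privileges p
WHERE p.PrivilegeID IN (SELECT g.PrivilegeID
                        FROM usermemberships m
                        JOIN groupprivileges g ON g.GroupID = m.GroupID
                        WHERE m.UserID = :uid)
UNION ALL
SELECT p.PrivilageName, 'false' AS hasrights
FROM privileges p
WHERE p.PrivilegeID NOT IN (SELECT g.PrivilegeID
                            FROM usermemberships m
                            JOIN groupprivileges g ON g.GroupID = m.GroupID
                            WHERE m.UserID = :uid)
ORDER BY hasrights DESC, PrivilageName
"""
union_rows = conn.execute(query, {"uid": 2}).fetchall()
print(union_rows)
```

The two branches are disjoint, so UNION ALL is enough and avoids the duplicate-removal step that plain UNION performs.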

Related

Choosing none in set

I have two tables:
Invariant (UniqueID, characteristic1, characteristic2)
Variant (VariantID, UniqueID, specification1, specification2)
Each project has its own unchanging characteristics between implementations. Each implementation also has its own individual properties.
So, I use queries like this to find projects with the given characteristics and specifications:
SELECT *
FROM `Invariants`
LEFT JOIN (`Variants`) ON (`Variants`.`UniqueID`=`Invariants`.`UniqueID`)
WHERE char2='y' and spec1='x'
GROUP BY `Invariants`.`UniqueID`;
I'm looking for a query that will return all projects that have never satisfied a given specification. So, if one of project 100's variants had spec1='bad', then I don't want project 100 to be included, regardless of whether it also had variants where spec1='good'.
select *
from Invariants iv
where not exists (
select 1
from Variants v
where v.UniqueId = iv.UniqueId and v.spec1 = 'bad'
)
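A quick way to check the NOT EXISTS query is the sketch below, using Python's sqlite3 with made-up rows (SQLite stands in for MySQL, and the column names char1/char2/spec1/spec2 are shortened guesses based on the question). Project 100 has both a 'good' and a 'bad' variant, project 200 has only 'good', and project 300 has no variants at all.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Invariants (UniqueID INTEGER PRIMARY KEY, char1 TEXT, char2 TEXT);
CREATE TABLE Variants (VariantID INTEGER PRIMARY KEY, UniqueID INTEGER,
                       spec1 TEXT, spec2 TEXT);
-- Hypothetical sample data.
INSERT INTO Invariants VALUES (100, 'a', 'y'), (200, 'b', 'y'), (300, 'c', 'y');
INSERT INTO Variants VALUES (1, 100, 'good', 'p'), (2, 100, 'bad', 'q'),
                            (3, 200, 'good', 'r');
""")

# Keep a project only if no variant of it ever had spec1 = 'bad'.
never_bad = conn.execute("""
SELECT iv.UniqueID
FROM Invariants iv
WHERE NOT EXISTS (
    SELECT 1
    FROM Variants v
    WHERE v.UniqueID = iv.UniqueID AND v.spec1 = 'bad'
)
ORDER BY iv.UniqueID
""").fetchall()
print(never_bad)
```

Project 300 is returned too, because a project with no variants has trivially never had a 'bad' one; add an EXISTS check if such projects should be excluded.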
The queries below do not address your question; I probably read too fast and thought you wanted to pick up only the invariant properties of a particular type. But I will note that you shouldn't use a left join and then filter, in the where clause, against columns from the right table (except for checking nulls). People make that mistake all the time and that's what jumped out at me at first glance.
The whole purpose of a left join is that some of the rows will not match and will thus have filler null values in the columns for the right-hand table. This join logic happens first, and only after that is the where clause applied. A condition like where spec1 = 'x' evaluates to NULL (which is not true) against a null value, so the row is filtered out. You end up eliminating all the rows you wanted to keep.
This happens a lot with these invariant/custom values tables. You're only interested in one of the properties, but if you don't filter prior to joining or inside the join condition, you end up dropping rows: the value didn't exist, so there was nothing left to compare when the where-clause condition on the property was applied.
Hope that made sense. See below for examples:
select iv.UniqueId, ...
from
Invariants iv left outer join
Variants v
on v.UniqueId = iv.UniqueId and v.spec1 = 'x'
or
select iv.UniqueId, ...
from
Invariants iv left outer join
(
select
from Variants
where spec1 = 'x'
) v
on v.UniqueId = vi.UniqueId
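To see the difference between filtering in the WHERE clause and filtering in the ON clause, here is a small sketch using Python's sqlite3 (SQLite stands in for MySQL; the rows are invented for the demo). Only project 1 has a variant with spec1 = 'x'.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Invariants (UniqueID INTEGER PRIMARY KEY);
CREATE TABLE Variants (VariantID INTEGER PRIMARY KEY, UniqueID INTEGER, spec1 TEXT);
-- Hypothetical sample data.
INSERT INTO Invariants VALUES (1), (2), (3);
INSERT INTO Variants VALUES (1, 1, 'x'), (2, 2, 'z');
""")

# Filter in WHERE: rows whose right-hand side is NULL are thrown away,
# so the LEFT JOIN behaves like an inner join.
where_rows = conn.execute("""
SELECT iv.UniqueID, v.spec1
FROM Invariants iv
LEFT JOIN Variants v ON v.UniqueID = iv.UniqueID
WHERE v.spec1 = 'x'
ORDER BY iv.UniqueID
""").fetchall()

# Filter in ON: every Invariants row is kept, unmatched ones get NULL.
on_rows = conn.execute("""
SELECT iv.UniqueID, v.spec1
FROM Invariants iv
LEFT JOIN Variants v ON v.UniqueID = iv.UniqueID AND v.spec1 = 'x'
ORDER BY iv.UniqueID
""").fetchall()

print(where_rows)
print(on_rows)
```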

Difference between FROM and JOIN tables

I'm working through the JOIN tutorial on SQL zoo.
Let's say I'm about to execute the code below:
SELECT a.stadium, COUNT(g.matchid)
FROM game a
JOIN goal g
ON g.matchid = a.id
GROUP BY a.stadium
As it happens, it produces the same output as the code below:
SELECT a.stadium, COUNT(g.matchid)
FROM goal g
JOIN game a
ON g.matchid = a.id
GROUP BY a.stadium
So then, when does it matter which table you assign at FROM and which one you assign at JOIN?
When you are using an INNER JOIN like you are here, the order doesn't matter. That is because an inner join only returns rows that match in both tables, so the order in which you list them is up to you. You should pick an order that is most logical to you, and easiest to read. A habit of mine is to put the table I'm selecting from first. In your case, you're selecting information about a stadium, which comes from the game table, so my preference would be to put that first.
In other joins, however, such as LEFT OUTER JOIN and RIGHT OUTER JOIN the order will matter. That is because these joins will select all rows from one table. Consider for example I have a table for Students and a table for Projects. They can exist independently, some students may have an associated project, but not all will.
If I want to get all students and project information while still seeing students without projects, I need a LEFT JOIN:
SELECT s.name, p.project
FROM student s
LEFT JOIN project p ON p.student_id = s.id;
Note here that the LEFT JOIN preserves the table in the FROM clause, which means that all students are selected. This also means that p.project will be null for some rows. Order matters here.
If I took the same concept with a RIGHT JOIN, it will select all rows from the table in the join clause. So if I changed the query to this:
SELECT s.name, p.project
FROM student s
RIGHT JOIN project p ON p.student_id = s.id;
This will return all rows from the project table, regardless of whether or not they have a matching student. This means that in some rows, s.name will be null. As in the first example, because project is now the table whose rows are all preserved, p.project will never be null (assuming it isn't null in the original table). In the first example, it was s.name that should never be null.
In the case of outer joins, order will matter. Thankfully, you can think intuitively with LEFT and RIGHT joins. A left join will return all rows in the table to the left of that statement, while a right join returns all rows from the right of that statement. Take this as a rule of thumb, but be careful. You might want to develop a pattern to be consistent with yourself, as I mentioned earlier, so these queries are easier for you to understand later on.
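Here is a small sketch of the student/project example using Python's sqlite3 (SQLite stands in for MySQL, and the rows are invented). Because older SQLite versions do not support RIGHT JOIN, the second query is written as a LEFT JOIN with the tables swapped, which preserves the same side as the RIGHT JOIN in the answer.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE project (id INTEGER PRIMARY KEY, project TEXT, student_id INTEGER);
-- Hypothetical sample data: Ben has no project, 'Orphan' has no student.
INSERT INTO student VALUES (1, 'Ann'), (2, 'Ben');
INSERT INTO project VALUES (1, 'Robot', 1), (2, 'Orphan', NULL);
""")

# student is the preserved table: every student appears.
students_first = conn.execute("""
SELECT s.name, p.project
FROM student s
LEFT JOIN project p ON p.student_id = s.id
ORDER BY s.id
""").fetchall()

# project is the preserved table: every project appears.
# Same rows as "FROM student s RIGHT JOIN project p ON ...".
projects_first = conn.execute("""
SELECT s.name, p.project
FROM project p
LEFT JOIN student s ON p.student_id = s.id
ORDER BY p.id
""").fetchall()

print(students_first)
print(projects_first)
```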
When you only JOIN 2 tables, usually the order does not matter: MySQL scans the tables in the optimal order.
When you scan more than 2 tables, the order could matter:
SELECT ...
FROM a
JOIN b ON ...
JOIN c ON ...
Also, MySQL tries to scan the tables in the order it estimates to be cheapest (typically starting with the table expected to contribute the fewest rows). But if a join is slow, it is possible that MySQL has chosen a non-optimal order. You can verify this with EXPLAIN. In that case, you can force the join order by adding the STRAIGHT_JOIN keyword.
The order doesn't always matter. I usually just order the tables in a way that makes sense to someone reading the query.
Sometimes order does matter. Try it with LEFT JOIN and RIGHT JOIN.
In this instance you are using an INNER JOIN. If you're expecting a match on a common ID or foreign key, it probably doesn't matter too much.
You would however need to specify the tables the correct way round if you were performing an OUTER JOIN, as not all records in this type of join are guaranteed to match via the same field.
Yes, it will matter when you use another join type such as LEFT JOIN or RIGHT JOIN.
Currently you are using a plain JOIN (an inner join), which returns only the related data from both tables: if a row in the joined table does not match, it is excluded from the result.
If you use a LEFT / RIGHT {OUTER} join then the result will be different; follow this link for more detail.

MySQL SELECT from two tables with COUNT

I have two tables as below:
Table 1 "customer" with fields "Cust_id", "first_name", "last_name" (10 customers)
Table 2 "cust_order" with fields "order_id", "cust_id", (26 orders)
I need to display "Cust_id", "first_name", "last_name" and the count of "order_id" grouped by cust_id,
i.e. list the total number of orders placed by each customer.
I am running below query, however, it is counting all the 26 orders and applying that 26 orders to each of the customer.
SELECT COUNT(order_id), cus.cust_id, cus.first_name, cus.last_name
FROM cust_order, customer cus
GROUP BY cust_id;
Could you please advise what is wrong with the query?
Your issue here is that you have not told the database how these two tables are 'connected', that is, which columns they should be connected by.
A join condition effectively allows you to 'join' two tables together, and run a query across them.
so you might want to use something like:
SELECT COUNT(B.order_id), A.cust_id, A.first_name, A.last_name
FROM customer A
LEFT JOIN cust_order B -- this is using a left join, but an inner join may also be appropriate
ON (A.cust_id = B.cust_id) -- what links them together
GROUP BY A.cust_id; -- the group by clause
As per your comment requesting some further info:
Left Join (right joins are almost identical, only the other way around):
The SQL LEFT JOIN returns all rows from the left table, even if there are no matches in the right table. This means that if the ON clause matches 0 (zero) records in right table, the join will still return a row in the result, but with NULL in each column from right table. ~Tutorials Point.
This means that a left join returns all the values from the left table, plus matched values from the right table or NULL in case of no matching join predicate.
LEFT joins will be used in the cases where you wish to retrieve all the data from the table in the left hand side, and only data from the right that match.
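To make the difference concrete, here is a minimal sketch in Python's sqlite3 (SQLite stands in for MySQL; three customers and three orders are invented for the demo). The first query has no join condition, so every order is paired with every customer; the second links the tables on cust_id.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE cust_order (order_id INTEGER PRIMARY KEY, cust_id INTEGER);
-- Hypothetical sample data: customer 3 has placed no orders.
INSERT INTO customer VALUES (1, 'Ann', 'Lee'), (2, 'Ben', 'Kim'), (3, 'Cy', 'Ng');
INSERT INTO cust_order VALUES (10, 1), (11, 1), (12, 2);
""")

# No join condition: a cross product, so each customer "gets" all 3 orders.
cross_counts = conn.execute("""
SELECT cus.cust_id, COUNT(o.order_id)
FROM cust_order o, customer cus
GROUP BY cus.cust_id
ORDER BY cus.cust_id
""").fetchall()

# LEFT JOIN on cust_id: per-customer counts, including 0 for customer 3.
joined_counts = conn.execute("""
SELECT cus.cust_id, COUNT(o.order_id)
FROM customer cus
LEFT JOIN cust_order o ON o.cust_id = cus.cust_id
GROUP BY cus.cust_id
ORDER BY cus.cust_id
""").fetchall()

print(cross_counts)
print(joined_counts)
```

COUNT(o.order_id) counts only non-NULL values, which is why the customer without orders shows 0 rather than 1 in the LEFT JOIN version.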
Execution Time
While the accepted answer in this case may work well on small datasets, it may become 'heavy' on larger databases, because it was not actually designed for this type of operation.
This is the reason joins were introduced.
Much work in database-systems has aimed at efficient implementation of joins, because relational systems commonly call for joins, yet face difficulties in optimising their efficient execution. The problem arises because inner joins operate both commutatively and associatively. ~Wikipedia
In practice, this means that the user merely supplies the list of tables for joining and the join conditions to use, and the database system has the task of determining the most efficient way to perform the operation. A query optimizer determines how to execute a query containing joins. So, by allowing the dbms to choose the way your data is queried, you can save a lot of time.
Other Joins/Summary
An INNER JOIN will return data from both tables where the keys in each table match.
A LEFT JOIN or RIGHT JOIN will return all the rows from one table and matching data from the other table.
Use a join when you want to query multiple tables.
Joins are usually much faster than other ways of querying two or more tables (the difference shows more clearly on larger datasets).
You could try this one:
SELECT COUNT(cus_order.order_id), cus.cust_id, cus.first_name, cus.last_name
FROM cust_order cus_order, customer cus
WHERE cus_order.cust_id = cus.cust_id
GROUP BY cus.cust_id;
Maybe a left join will help you:
SELECT COUNT(order_id), cus.cust_id, cus.first_name, cus.last_name
FROM customer cus
LEFT JOIN cust_order co
ON (co.cust_id= cus.Cust_id )
GROUP BY cus.cust_id;

MySQL Sub-Query or LEFT JOIN for SELECTing missing columns?

I need to perform a SELECT query on 3 tables and I don't know whether a sub-query would be better than a LEFT JOIN, since one column might be missing in some cases. These are the tables:
Options (name, info...)
Owners (name, address)
Rel (idoption, idowner)
The SELECT should return all the Options with the name of the Owner inside each record but, in some case, the Option might not be connected to any Owner and the name of the Owner should be empty.
Any suggestions? Thanks in advance
A LEFT JOIN is likely the appropriate response and will probably be faster than a subquery depending on your results (it's possible that they'd compile to the same plan).
SELECT
op.name
,op.info
,...
,ow.name
,ow.address
FROM
options op
LEFT OUTER JOIN
Rel r
ON r.idoption = op.id
LEFT OUTER JOIN
owners ow
ON ow.id = r.idowner
Use a LEFT JOIN, then. It will get all the Options regardless of whether there is a matching Owner or not - "This extra consideration to the left table can be thought of as special kind of preservation. Each item in the left table will show up in a MySQL result, even if there isn't a match with the other table that it is being joined to."
from: http://www.tizag.com/mysqlTutorial/mysqlleftjoin.php
A left join will often be more efficient and faster than a subquery. If you can live with NULLs for the cases where there's no match, it's the better approach.
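For comparison, here is a sketch of both variants in Python's sqlite3 (SQLite stands in for MySQL). The id columns on options and owners are assumptions, since the question elides the full column lists, and the rows are invented. Both queries return the same result on this data; which one is faster on a real MySQL server is something to check with EXPLAIN rather than assume.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The id columns are assumed; the question does not list them.
CREATE TABLE options (id INTEGER PRIMARY KEY, name TEXT, info TEXT);
CREATE TABLE owners (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
CREATE TABLE rel (idoption INTEGER, idowner INTEGER);
-- Hypothetical sample data: 'Garage' has no owner.
INSERT INTO options VALUES (1, 'Pool', 'x'), (2, 'Garage', 'y');
INSERT INTO owners VALUES (1, 'Dana', '1 Main St');
INSERT INTO rel VALUES (1, 1);
""")

# LEFT JOIN variant: unmatched options get NULL for the owner name.
join_rows = conn.execute("""
SELECT op.name, ow.name
FROM options op
LEFT JOIN rel r ON r.idoption = op.id
LEFT JOIN owners ow ON ow.id = r.idowner
ORDER BY op.id
""").fetchall()

# Correlated scalar subquery variant: also NULL when nothing matches.
subquery_rows = conn.execute("""
SELECT op.name,
       (SELECT ow.name
        FROM rel r
        JOIN owners ow ON ow.id = r.idowner
        WHERE r.idoption = op.id) AS owner_name
FROM options op
ORDER BY op.id
""").fetchall()

print(join_rows)
print(subquery_rows)
```

The two forms diverge if an option can have several owners: the join returns one row per owner, while a scalar subquery can only return one value.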

In what order are MySQL JOINs evaluated?

I have the following query:
SELECT c.*
FROM companies AS c
JOIN users AS u USING(companyid)
JOIN jobs AS j USING(userid)
JOIN useraccounts AS us USING(userid)
WHERE j.jobid = 123;
I have the following questions:
1. Is the USING syntax synonymous with ON syntax?
2. Are these joins evaluated left to right? In other words, does this query say: x = companies JOIN users; y = x JOIN jobs; z = y JOIN useraccounts;
3. If the answer to question 2 is yes, is it safe to assume that the companies table has companyid, userid and jobid columns?
4. I don't understand how the WHERE clause can be used to pick rows on the companies table when it is referring to the alias "j"
Any help would be appreciated!
USING (fieldname) is a shorthand way of saying ON table1.fieldname = table2.fieldname.
SQL doesn't define the 'order' in which JOINS are done because it is not the nature of the language. Obviously an order has to be specified in the statement, but an INNER JOIN can be considered commutative: you can list them in any order and you will get the same results.
That said, when constructing a SELECT ... JOIN, particularly one that includes LEFT JOINs, I've found it makes sense to read the joins left to right: the second JOIN joins its table to the result of the first JOIN, the third JOIN joins its table to the result of the second, and so on.
More rarely, the specified order can influence the behaviour of the query optimizer, due to the way it influences the heuristics.
No. The way the query is assembled, it requires that companies and users both have a companyid, jobs has a userid and a jobid and useraccounts has a userid. However, only one of companies or users needs a userid for the JOIN to work.
The WHERE clause is filtering the whole result -- i.e. all JOINed columns -- using a column provided by the jobs table.
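As a rough check of the points above, the sketch below runs the query once with USING and once with the equivalent ON conditions, using Python's sqlite3 (SQLite stands in for MySQL; the non-key columns and the rows are invented). Only users has a userid on the left-hand side before jobs is joined, matching the answer to question 3, and the WHERE clause filters on a jobs column that is never selected.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE companies (companyid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE users (userid INTEGER PRIMARY KEY, companyid INTEGER);
CREATE TABLE jobs (jobid INTEGER PRIMARY KEY, userid INTEGER);
CREATE TABLE useraccounts (userid INTEGER, account TEXT);
-- Hypothetical sample data.
INSERT INTO companies VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO users VALUES (7, 1), (8, 2);
INSERT INTO jobs VALUES (123, 7), (124, 8);
INSERT INTO useraccounts VALUES (7, 'a7'), (8, 'a8');
""")

# Shorthand form from the question.
using_rows = conn.execute("""
SELECT c.*
FROM companies AS c
JOIN users AS u USING(companyid)
JOIN jobs AS j USING(userid)
JOIN useraccounts AS us USING(userid)
WHERE j.jobid = 123
""").fetchall()

# The same joins spelled out with ON.
on_rows = conn.execute("""
SELECT c.*
FROM companies AS c
JOIN users AS u ON u.companyid = c.companyid
JOIN jobs AS j ON j.userid = u.userid
JOIN useraccounts AS us ON us.userid = u.userid
WHERE j.jobid = 123
""").fetchall()

print(using_rows)
print(on_rows)
```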
I can't answer the bit about the USING syntax. That's weird. I've never seen it before, having always used an ON clause instead.
But what I can tell you is that the order of JOIN operations is determined dynamically by the query optimizer when it constructs its query plan, based on a system of optimization heuristics, some of which are:
Is the JOIN performed on a primary key field? If so, this gets high priority in the query plan.
Is the JOIN performed on a foreign key field? This also gets high priority.
Does an index exist on the joined field? If so, bump the priority.
Is a JOIN operation performed on a field in WHERE clause? Can the WHERE clause expression be evaluated by examining the index (rather than by performing a table scan)? This is a major optimization opportunity, so it gets a major priority bump.
What is the cardinality of the joined column? Columns with high cardinality give the optimizer more opportunities to discriminate against false matches (those that don't satisfy the WHERE clause or the ON clause), so high-cardinality joins are usually processed before low-cardinality joins.
How many actual rows are in the joined table? Joining against a table with only 100 values is going to create less of a data explosion than joining against a table with ten million rows.
Anyhow... the point is... there are a LOT of variables that go into the query execution plan. If you want to see how MySQL optimizes its queries, use the EXPLAIN syntax.
And here's a good article to read:
http://www.informit.com/articles/article.aspx?p=377652
ON EDIT:
To answer your 4th question: You aren't querying the "companies" table. You're querying the joined cross-product of ALL four tables in your FROM and USING clauses.
The "j.jobid" alias is just the fully-qualified name of one of the columns in that joined collection of tables.
In MySQL, it's often interesting to ask the query optimizer what it plans to do, with:
EXPLAIN SELECT [...]
See "7.2.1 Optimizing Queries with EXPLAIN"
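If you don't have a MySQL server at hand, the same habit of asking the optimizer for its plan can be tried in SQLite, whose counterpart is EXPLAIN QUERY PLAN. The sketch below uses Python's sqlite3; note that this is SQLite's planner and output format, not MySQL's EXPLAIN, and the schema is a cut-down guess at the tables in the question.

```python
import sqlite3

# In-memory SQLite database; this shows SQLite's planner, not MySQL's.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE companies (companyid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE users (userid INTEGER PRIMARY KEY, companyid INTEGER);
CREATE TABLE jobs (jobid INTEGER PRIMARY KEY, userid INTEGER);
""")

sql = """
SELECT c.*
FROM companies AS c
JOIN users AS u ON u.companyid = c.companyid
JOIN jobs AS j ON j.userid = u.userid
WHERE j.jobid = 123
"""

# Each plan row ends with a human-readable description of one step;
# the steps are listed in the order the planner chose, which need not
# be the order the tables were written in.
plan = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
plan_details = [row[-1] for row in plan]
for detail in plan_details:
    print(detail)
```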
Here is a more detailed answer on JOIN precedence. In your case, the JOINs are all commutative. Let's try one where they aren't.
Build schema:
CREATE TABLE users (
name text
);
CREATE TABLE orders (
order_id text,
user_name text
);
CREATE TABLE shipments (
order_id text,
fulfiller text
);
Add data:
INSERT INTO users VALUES ('Bob'), ('Mary');
INSERT INTO orders VALUES ('order1', 'Bob');
INSERT INTO shipments VALUES ('order1', 'Fulfilling Mary');
Run query:
SELECT *
FROM users
LEFT OUTER JOIN orders
ON orders.user_name = users.name
JOIN shipments
ON shipments.order_id = orders.order_id
Result:
Only the Bob row is returned
Analysis:
In this query the LEFT OUTER JOIN was evaluated first and the JOIN was evaluated on the composite result of the LEFT OUTER JOIN.
Second query:
SELECT *
FROM users
LEFT OUTER JOIN (
orders
JOIN shipments
ON shipments.order_id = orders.order_id)
ON orders.user_name = users.name
Result:
One row for Bob (with the fulfillment data) and one row for Mary with NULLs for fulfillment data.
Analysis:
The parentheses changed the evaluation order.
Further MySQL documentation is at https://dev.mysql.com/doc/refman/5.5/en/nested-join-optimization.html
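The two results above can be reproduced with the sketch below, which uses Python's sqlite3 (SQLite stands in for MySQL). One deliberate difference: the parenthesized join of the second query is written as an aliased derived table (os, a name made up here), which expresses the same "join orders to shipments first" grouping in a form that is safe across SQLite versions.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (name TEXT);
CREATE TABLE orders (order_id TEXT, user_name TEXT);
CREATE TABLE shipments (order_id TEXT, fulfiller TEXT);
INSERT INTO users VALUES ('Bob'), ('Mary');
INSERT INTO orders VALUES ('order1', 'Bob');
INSERT INTO shipments VALUES ('order1', 'Fulfilling Mary');
""")

# Left to right: the inner JOIN runs on the result of the LEFT JOIN,
# so Mary's NULL-padded row has no matching shipment and is dropped.
flat_rows = conn.execute("""
SELECT users.name, orders.order_id, shipments.fulfiller
FROM users
LEFT OUTER JOIN orders ON orders.user_name = users.name
JOIN shipments ON shipments.order_id = orders.order_id
ORDER BY users.name
""").fetchall()

# Grouped: orders and shipments are joined first, then LEFT JOINed to users,
# so Mary is preserved with NULLs.
nested_rows = conn.execute("""
SELECT users.name, os.order_id, os.fulfiller
FROM users
LEFT OUTER JOIN (
    SELECT orders.user_name, orders.order_id, shipments.fulfiller
    FROM orders
    JOIN shipments ON shipments.order_id = orders.order_id
) AS os ON os.user_name = users.name
ORDER BY users.name
""").fetchall()

print(flat_rows)
print(nested_rows)
```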
SEE http://dev.mysql.com/doc/refman/5.0/en/join.html
AND start reading here:
Join Processing Changes in MySQL 5.0.12
Beginning with MySQL 5.0.12, natural joins and joins with USING, including outer join variants, are processed according to the SQL:2003 standard. The goal was to align the syntax and semantics of MySQL with respect to NATURAL JOIN and JOIN ... USING according to SQL:2003. However, these changes in join processing can result in different output columns for some joins. Also, some queries that appeared to work correctly in older versions must be rewritten to comply with the standard.
These changes have five main aspects:
The way that MySQL determines the result columns of NATURAL or USING join operations (and thus the result of the entire FROM clause).
Expansion of SELECT * and SELECT tbl_name.* into a list of selected columns.
Resolution of column names in NATURAL or USING joins.
Transformation of NATURAL or USING joins into JOIN ... ON.
Resolution of column names in the ON condition of a JOIN ... ON.
I'm not sure about the ON vs USING part (though this website says they are the same).
As for the ordering question, it's entirely implementation (and probably query) specific. MySQL most likely picks an order when compiling the request. If you do want to enforce a particular order you would have to 'nest' your queries:
SELECT c.*
FROM companies AS c
JOIN (SELECT u.companyid
FROM users AS u
JOIN (SELECT j.userid
FROM jobs AS j
JOIN useraccounts AS us USING(userid)
WHERE j.jobid = 123) AS ju USING(userid)
) AS cu USING(companyid)
As for part 4: the WHERE clause limits which rows from the jobs table are eligible to be JOINed on. So if there are rows which would join due to matching userids but don't have the correct jobid, they will be omitted.
1) USING is not exactly the same as ON, but it is shorthand for the case where both tables have a column with the same name that you are joining on... see: http://www.java2s.com/Tutorial/MySQL/0100__Table-Join/ThekeywordUSINGcanbeusedasareplacementfortheONkeywordduringthetableJoins.htm
It is more difficult to read in my opinion, so I'd go spelling out the joins.
3) It is not clear from this query, but I would guess it does not.
2) Assuming you are joining through the other tables (not all directly on companies), the order in this query does matter... see comparisons below:
Original:
SELECT c.*
FROM companies AS c
JOIN users AS u USING(companyid)
JOIN jobs AS j USING(userid)
JOIN useraccounts AS us USING(userid)
WHERE j.jobid = 123
What I think it is likely suggesting:
SELECT c.*
FROM companies AS c
JOIN users AS u on u.companyid = c.companyid
JOIN jobs AS j on j.userid = u.userid
JOIN useraccounts AS us on us.userid = u.userid
WHERE j.jobid = 123
You could switch the lines joining jobs and useraccounts here.
What it would look like if everything joined on company:
SELECT c.*
FROM companies AS c
JOIN users AS u on u.companyid = c.companyid
JOIN jobs AS j on j.userid = c.userid
JOIN useraccounts AS us on us.userid = c.userid
WHERE j.jobid = 123
This doesn't really make logical sense... unless each user has their own company.
4.) The magic of SQL is that you can show only certain columns, but all of them are there for sorting and filtering...
if you returned
SELECT c.*, j.jobid....
you could clearly see what it was filtering on, but the database server doesn't care whether or not you output a column that it uses for filtering.