Foreword: I am afraid that I'm asking to do something that can't be achieved. But I know that, if there is a way to do it, here someone will know :)
Suppose I have a MySQL database (or a MariaDB one, if compatibility can be retained). I need to select from TableA and join TableB based on a certain column having a certain value X; it is possible that no suitable row exists in TableB, which means I shall use a LEFT OUTER JOIN. However, in case no row is found (no row of TableB has the value X in that column), then I wish to use a row from TableB with value Y. In other words, Y should act as a fallback condition for the JOIN.
Here is my concrete example. Suppose I handle data localization by splitting an entity (say, a product) between two tables, one with the localizable data and the other with the fixed data. When I want to get the product info in French, I wish to fall back to English if no information is available in French. However, if I do
-- basic stub of the query for demonstration purposes
SELECT * FROM Products p
LEFT OUTER JOIN ProductsLocalization pLoc
ON pLoc.ProductID = p.ID AND pLoc.Culture = 'fr-FR'
I will get NULL when no French info is available. Instead, only for those rows, I wish to have the result of the query
-- basic stub of the query for demonstration purposes
SELECT * FROM Products p
LEFT OUTER JOIN ProductsLocalization pLoc
ON pLoc.ProductID = p.ID AND pLoc.Culture = 'en-US'
But the latter query is not acceptable, of course.
I know I could use an OR to join based on the Culture being 'fr-FR' or 'en-US', but this would potentially double the number of selected rows, and it would require me to post-process the resulting dataset. I wish to know if there is a way to avoid this post-processing.
Non-duplicate disclaimer: my question is similar to many others, but there's a crucial difference. I need to use a fallback value for the JOIN condition itself, not as the result of the query. My default doesn't end up in the dataset, but the dataset is computed based on that default.
You can join twice and use COALESCE() to get the value you want:
SELECT p.product_desc,
COALESCE(pLoc_fr.value, pLoc_us.value)
FROM Products p
LEFT OUTER JOIN ProductsLocalization pLoc_us ON
pLoc_us.ProductID = p.ID
AND pLoc_us.Culture = 'en-US'
LEFT OUTER JOIN ProductsLocalization pLoc_fr ON
pLoc_fr.ProductID = p.ID
AND pLoc_fr.Culture = 'fr-FR';
Also, if you can imagine a way of combining data sets, you can probably achieve it using plain old SQL. Some edge cases require dynamically generating a SQL statement, but those aren't terribly common. In this case, you could also use two correlated subqueries inside a COALESCE, a CASE expression, or an IF() in the SELECT list. There are almost always a few ways to skin the cat.
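The double join with COALESCE() can be tried out in a small self-contained sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL/MariaDB, and the Sku and Name columns and the sample rows are made up for illustration; the query text itself should carry over unchanged.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL/MariaDB (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (ID INTEGER PRIMARY KEY, Sku TEXT);
    CREATE TABLE ProductsLocalization (ProductID INTEGER, Culture TEXT, Name TEXT);
    INSERT INTO Products VALUES (1, 'A-1'), (2, 'B-2'), (3, 'C-3');
    -- Product 1 has both cultures, product 2 only English, product 3 none.
    INSERT INTO ProductsLocalization VALUES
        (1, 'en-US', 'Chair'), (1, 'fr-FR', 'Chaise'),
        (2, 'en-US', 'Table');
""")

# One LEFT JOIN per culture; COALESCE picks French first, then English.
fallback_rows = conn.execute("""
    SELECT p.ID, COALESCE(pLoc_fr.Name, pLoc_us.Name) AS Name
    FROM Products p
    LEFT OUTER JOIN ProductsLocalization pLoc_fr
        ON pLoc_fr.ProductID = p.ID AND pLoc_fr.Culture = 'fr-FR'
    LEFT OUTER JOIN ProductsLocalization pLoc_us
        ON pLoc_us.ProductID = p.ID AND pLoc_us.Culture = 'en-US'
    ORDER BY p.ID
""").fetchall()

print(fallback_rows)
```

Each product appears exactly once, so no post-processing is needed. A product with neither localization still comes back, with NULL for the name.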
I am pretty new to SQL. Here is an operation I am sure is simple for a lot of you. I am trying to join two tables across databases on the same server: dbA and dbB, with TableA (with IdA) and TableB (with IdB) respectively. But before doing that I want to transform column IdA into a number by removing the ":XYZ" suffix from its values, and I also want to add a where condition on another column in dbA. Below I show my code for the join, but I am not sure how to convert the values of the column. The conversion is what allows me to match idA with idB in the join. Thanks a ton in advance.
Select replace(idA, “:XYZ”, "")
from dbA.TableA guid
where event like “%2015”
left join dbB.TableB own
on guid.idA = own.idB
A few things:
The syntactic order FROM, JOINs, WHERE is also the order of execution (unless you use subqueries). Notice that SELECT isn't listed: it comes first syntactically but near the end in the order of operations.
Alias or fully qualify columns when multiple tables are involved, so we know which field comes from which table.
Because the FROM and JOINs are processed first, what you do in the SELECT isn't in scope yet. This is why you can't use SELECT column aliases in the FROM or WHERE, or even in the GROUP BY.
I don't usually like SELECT *, but as I don't know what columns you really need, I used it here.
As for applying the WHERE before the join: most SQL engines now use cost-based optimization and work out the best execution plan for the tables involved. So just put the limiting criteria in the WHERE in this case, since it limits the left table of the left join. If you needed to limit data on the right table of a left join, you'd put the limit in the join criteria, allowing it to filter as it joins.
You probably need to cast idA to an integer (or to the same type as idB). I used TRIM to eliminate spaces, but if there are other non-display characters, you'd have issues with the left join matching.
SELECT guid.*, own.*
FROM dbA.TableA guid
LEFT JOIN dbB.TableB own
on cast(trim(replace(guid.idA, ':XYZ', '')) as int) = own.idB
WHERE guid.event like '%2015'
Or materialize the transformation first with a subquery, so that idA is already in its transformed state before the join (as in algebra, parentheses matter and are processed from the inside out):
SELECT *
FROM (SELECT cast(trim(replace(guid.idA, ':XYZ', '')) as int) as idA
FROM dbA.TableA guid
WHERE guid.event like '%2015') B
LEFT JOIN dbB.TableB own
on B.IDA = own.idB
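Here is a minimal runnable sketch of the subquery variant, using Python's sqlite3 module in place of the real server. The cross-database prefixes (dbA., dbB.) are dropped, the owner column and the sample rows are invented, and the cast target is spelled INTEGER because that is what SQLite expects; the spelling your engine accepts may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (idA TEXT, event TEXT);
    CREATE TABLE TableB (idB INTEGER, owner TEXT);
    -- The second idA carries a leading space to show why TRIM is useful.
    INSERT INTO TableA VALUES
        ('101:XYZ', 'Expo 2015'), (' 102:XYZ', 'Fair 2015'), ('103:XYZ', 'Expo 2014');
    INSERT INTO TableB VALUES (101, 'alice'), (103, 'carol');
""")

# Strip the suffix, trim, and cast inside the subquery; then join on the clean value.
joined_rows = conn.execute("""
    SELECT B.idA, own.owner
    FROM (SELECT CAST(TRIM(REPLACE(guid.idA, ':XYZ', '')) AS INTEGER) AS idA
          FROM TableA guid
          WHERE guid.event LIKE '%2015') B
    LEFT JOIN TableB own
        ON B.idA = own.idB
    ORDER BY B.idA
""").fetchall()

print(joined_rows)
```

The 2014 row is filtered out before the join, 101 finds its match, and 102 is kept with a NULL owner because the join is a LEFT JOIN.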
I have two tables:
Invariant (UniqueID, characteristic1, characteristic2)
Variant (VariantID, UniqueID, specification1, specification2)
Each project has its own unchanging characteristics between implementations. Each implementation also has its own individual properties.
So, I use queries like this to find projects with the given characteristics and specifications:
SELECT *
FROM `Invariants`
LEFT JOIN (`Variants`) ON (`Variants`.`UniqueID`=`Invariants`.`UniqueID`)
WHERE char2='y' and spec1='x'
GROUP BY `Invariants`.`UniqueID`;
I'm looking for a query that will return all projects that have never satisfied a given specification. So, if one of project 100's variants had spec1='bad', then I don't want project 100 to be included, regardless if it had variants where spec1='good'.
select *
from Invariants iv
where not exists (
select 1
from Variants v
where v.UniqueId = iv.UniqueId and v.spec1 = 'bad'
)
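To see the NOT EXISTS query in action, here is a small sketch on made-up data, run through Python's sqlite3 module rather than MySQL. Project 100 has one 'bad' variant among its variants, project 200 has only 'good' ones, and project 300 has no variants at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Invariants (UniqueID INTEGER PRIMARY KEY, char1 TEXT, char2 TEXT);
    CREATE TABLE Variants (VariantID INTEGER PRIMARY KEY, UniqueID INTEGER,
                           spec1 TEXT, spec2 TEXT);
    INSERT INTO Invariants VALUES (100, 'a', 'y'), (200, 'b', 'y'), (300, 'c', 'y');
    INSERT INTO Variants VALUES
        (1, 100, 'good', 'p'), (2, 100, 'bad', 'q'), (3, 200, 'good', 'r');
""")

# Keep a project only if no variant of it has ever had spec1 = 'bad'.
never_bad = [row[0] for row in conn.execute("""
    SELECT iv.UniqueID
    FROM Invariants iv
    WHERE NOT EXISTS (
        SELECT 1
        FROM Variants v
        WHERE v.UniqueID = iv.UniqueID AND v.spec1 = 'bad'
    )
    ORDER BY iv.UniqueID
""")]

print(never_bad)
```

Project 100 is excluded even though it also has a 'good' variant, and project 300 is included because it has no variants that could fail the test.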
The queries below do not address your question; I probably read too fast and thought you wanted to pick up only the invariant properties of a particular type. But I will note that you shouldn't use a left join and then filter, in the where clause, against columns from the right table (except for checking nulls). People make that mistake all the time, and that's what jumped out at me at first glance.
The whole purpose of a left join is that some of the rows will not match and will thus have filler null values in the columns for the right-hand table. This join logic happens first, and only afterwards is the where clause applied. A condition like where spec1 = 'x' can never evaluate to true against a null value (the comparison yields NULL, which the where clause treats as not satisfied). So you end up eliminating all the rows you wanted to keep.
This happens a lot with these invariant/custom values tables. You're only interested in one of the properties, but if you don't filter prior to joining or inside the join condition, you end up dropping rows: the value didn't exist, so there was nothing left to compare when the where-clause condition on the property was applied.
Hope that made sense. See below for examples:
select iv.UniqueId, ...
from
Invariants iv left outer join
Variants v
on v.UniqueId = iv.UniqueId and v.spec1 = 'x'
or
select iv.UniqueId, ...
from
Invariants iv left outer join
(
select *
from Variants
where spec1 = 'x'
) v
on v.UniqueId = iv.UniqueId
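The difference between filtering in the where clause and filtering in the join condition is easy to check on a toy dataset. The sketch below uses Python's sqlite3 module and invented rows: project 1 has a variant with spec1 = 'x', project 2 has a variant with a different value, and project 3 has no variants.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Invariants (UniqueID INTEGER PRIMARY KEY);
    CREATE TABLE Variants (VariantID INTEGER PRIMARY KEY, UniqueID INTEGER, spec1 TEXT);
    INSERT INTO Invariants VALUES (1), (2), (3);
    INSERT INTO Variants VALUES (1, 1, 'x'), (2, 2, 'z');
""")

# Filter in WHERE: rows with a NULL or non-matching spec1 are discarded after the join.
where_filtered = conn.execute("""
    SELECT iv.UniqueID, v.spec1
    FROM Invariants iv
    LEFT OUTER JOIN Variants v ON v.UniqueID = iv.UniqueID
    WHERE v.spec1 = 'x'
    ORDER BY iv.UniqueID
""").fetchall()

# Filter in ON: every Invariants row survives; non-matching variants become NULL.
on_filtered = conn.execute("""
    SELECT iv.UniqueID, v.spec1
    FROM Invariants iv
    LEFT OUTER JOIN Variants v ON v.UniqueID = iv.UniqueID AND v.spec1 = 'x'
    ORDER BY iv.UniqueID
""").fetchall()

print(where_filtered)
print(on_filtered)
```

With the filter in the where clause the left join behaves like an inner join and only project 1 remains. With the filter in the join condition all three projects are returned.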
SELECT dbo.postst.cost as Cost2014,
case when (dbo.InvNum.OrderDate) between '2014/01/01' and '2014/12/31' then dbo.Postst.Cost end as Cost2014
from dbo.postst INNER JOIN dbo.invnum on dbo.invnum.autoindex = dbo.postst.accountLink
Here is the technique I use for tracking down an issue like this:
SELECT dbo.postst.cost as Cost2014,
dbo.InvNum.OrderDate,
case when (dbo.InvNum.OrderDate) between '2014/01/01' and '2014/12/31' then dbo.Postst.Cost end as Cost2014
from dbo.postst
left JOIN dbo.invnum on dbo.invnum.autoindex = dbo.postst.accountLink
So I add the column I am doing some kind of processing on, to see the values being returned. Then I change the inner join to a left join (temporarily). When I run the select, I can usually see why the rows may not be meeting my expectations and why my query is not returning the correct results.
In this case, you may not have any data in the right range or the join might be incorrect and thus no records are picked up at all.
What you have is a relatively simple query. If it were more complex, I would often use select * instead of the specific columns, just to see if there is something in the other columns that is affecting the results. This is often the case when you have a one-to-many relationship and want to get only one record but are getting duplicates in the fields you selected.
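As an illustration of this debugging technique, here is a sketch on invented data using Python's sqlite3 module (the dbo. schema prefix is dropped and dates are stored as 'YYYY/MM/DD' text, both assumptions made for the demo). One row falls inside 2014, one falls outside, and one has no matching invnum row at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE postst (accountLink INTEGER, cost REAL);
    CREATE TABLE invnum (autoindex INTEGER, OrderDate TEXT);
    INSERT INTO postst VALUES (1, 10.0), (2, 20.0), (3, 30.0);
    INSERT INTO invnum VALUES (1, '2014/03/15'), (2, '2013/11/02');
""")

# Temporary LEFT JOIN plus the raw OrderDate column, so every postst row is visible.
diagnostic_rows = conn.execute("""
    SELECT postst.cost,
           invnum.OrderDate,
           CASE WHEN invnum.OrderDate BETWEEN '2014/01/01' AND '2014/12/31'
                THEN postst.cost END AS Cost2014
    FROM postst
    LEFT JOIN invnum ON invnum.autoindex = postst.accountLink
    ORDER BY postst.accountLink
""").fetchall()

for row in diagnostic_rows:
    print(row)
```

The second row shows a date outside the range, and the third shows a NULL OrderDate, which means the join itself found nothing. Those are the two failure modes described above.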
I have an SQL query that selects user's privileges, and adds true to them.
SELECT
PrivilageName,
'true' hasrights -- imaginary column
FROM
users
NATURAL JOIN usermemberships
NATURAL JOIN groupprivileges
NATURAL JOIN `privileges`
WHERE
UserID = '2'
Result is
AddBuilding true
RemoveBuilding true
EditBuilding true
I'm trying to add the rest of the privileges with a false value.
AddBuilding true
RemoveBuilding true
EditBuilding true
RemoveUser false
AddUser false
How can I do this?
Edit: the structure of the tables:
users(UserID),
usermemberships(UserID, groupID),
groupprivileges(GroupID, PrivilegeID),
privileges(PrivilegeID, PrivilageName)
Edit: misspelling, sorry.
(NOTE: The queries in this answer are now updated, to include the column names that were added to the question.)
One approach to getting that resultset would be to use LEFT JOIN operations (with appropriate predicates in the ON clause), in place of all those NATURAL JOIN operations.
(I'm just guessing at the column names referenced by the NATURAL JOIN. In order to decipher that, we would need to inspect each table definition to get a list of all of the columns, and then find all the column names that match, to figure out which columns MySQL is using to do those inner join operations.)
Based on the scant information provided in the query text, here's the approach I would take (again, just guessing at the names referenced in each ON clause):
SELECT p.PrivilageName
, IF(u.UserID IS NOT NULL,'true','false') AS hasrights
FROM `privileges` p
LEFT
JOIN groupprivileges g
ON g.PrivilegeID = p.PrivilegeID
LEFT
JOIN usermemberships m
ON m.GroupId = g.GroupID
LEFT
JOIN users u
ON u.UserID = m.UserID
AND u.UserID = 2
What that query returns depends on the cardinality in those tables (e.g. is the "AddBuilding" privilege granted to two different groups, one of which the user is a member of and the other not?), on whether you want to avoid returning "duplicate" PrivilageName values (either multiple rows with "true" or "false", or rows with both "true" and "false" for the same PrivilageName), and on how you want the resultset ordered (e.g. do you want all the "true" privileges listed first?).
The following query is more deterministic in the resultset it returns: it returns each PrivilageName only once. This resultset seems better suited to answering the question of whether a user has a privilege or not.
SELECT p.PrivilageName
, MAX(IF(u.UserID IS NOT NULL,'true','false')) AS hasrights
FROM `privileges` p
LEFT
JOIN groupprivileges g
ON g.PrivilegeID = p.PrivilegeID
LEFT
JOIN usermemberships m
ON m.GroupId = g.GroupID
LEFT
JOIN users u
ON u.UserID = m.UserID
AND u.UserID = 2
GROUP BY p.PrivilageName
ORDER BY hasrights DESC, p.PrivilageName ASC
(Personally, I'd omit the ORDER BY, and let the results be ordered by PrivilageName, but with the ORDER BY, this better matches the resultset specified in the question.)
Of course, that's not the only way to get the result set, but it's likely to be the most efficient (given suitable indexes).
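The grouped LEFT JOIN query can be checked against a small made-up dataset. This sketch uses Python's sqlite3 module, so IF() is written as an equivalent CASE expression (SQLite has no IF() function); the group IDs and memberships are invented. User 2 belongs to group 10, which holds the three building privileges, while group 20 (user 1 only) holds AddBuilding and the two user privileges.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (UserID INTEGER);
    CREATE TABLE usermemberships (UserID INTEGER, GroupID INTEGER);
    CREATE TABLE groupprivileges (GroupID INTEGER, PrivilegeID INTEGER);
    CREATE TABLE privileges (PrivilegeID INTEGER, PrivilageName TEXT);
    INSERT INTO users VALUES (1), (2);
    INSERT INTO usermemberships VALUES (2, 10), (1, 20);
    INSERT INTO groupprivileges VALUES (10, 1), (10, 2), (10, 3), (20, 1), (20, 4), (20, 5);
    INSERT INTO privileges VALUES
        (1, 'AddBuilding'), (2, 'RemoveBuilding'), (3, 'EditBuilding'),
        (4, 'RemoveUser'), (5, 'AddUser');
""")

# Start from privileges so every privilege is listed; MAX collapses the
# per-group rows, and 'true' sorts above 'false' as a string.
rights = conn.execute("""
    SELECT p.PrivilageName,
           MAX(CASE WHEN u.UserID IS NOT NULL THEN 'true' ELSE 'false' END) AS hasrights
    FROM privileges p
    LEFT JOIN groupprivileges g ON g.PrivilegeID = p.PrivilegeID
    LEFT JOIN usermemberships m ON m.GroupID = g.GroupID
    LEFT JOIN users u ON u.UserID = m.UserID AND u.UserID = 2
    GROUP BY p.PrivilageName
    ORDER BY hasrights DESC, p.PrivilageName ASC
""").fetchall()

for name, flag in rights:
    print(name, flag)
```

AddBuilding is granted to both groups, so without the GROUP BY it would appear twice, once as "true" and once as "false". The MAX() keeps only the "true" row.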
Personally, I don't ever use NATURAL JOIN. (I want to see the predicates in the statement, and I don't want any of my queries to "break" if someone adds a column with a matching name to one of the tables in my query.) Actually, thinking about it, I can't use NATURAL JOIN, because id is typically the name of the primary key column of nearly all my tables, and foreign key columns are typically named referencedtable_id. But even if I did name the columns in a way that let me use NATURAL JOIN, I see the potential drawbacks outweighing any advantage.
But something like the statement below might work. (I say "might" because I don't have any experience using syntax like this: I never use NATURAL JOIN, and I always prefer LEFT joins to RIGHT joins. If someone in my shop came to me with this, I would give them the statement above.) But I don't want to leave you with the impression that a NATURAL JOIN can't be used to return the specified resultset. It's possible your specified resultset might be returned by a statement like this:
SELECT
PrivilageName,
MAX(IF(UserID=2,'true','false')) AS hasrights
FROM
users
NATURAL RIGHT JOIN usermemberships
NATURAL RIGHT JOIN groupprivileges
NATURAL RIGHT JOIN `privileges`
GROUP BY PrivilageName
You can use UNION to concatenate the results of two queries.
The IF() function may also help you.
I have a table called faq. This table consists of the fields faq_id,faq_subject.
I have another table called article which consists of article_id,ticket_id,a_body and which stores articles in a specific ticket. Naturally there is also a table "ticket" with fields ticket_id,ticket_number.
I want to retrieve a result table in format:
ticket_number,faq_id,faq_subject.
In order to do this I need to search for faq_id in the article.a_body field using a LIKE '%...%' pattern.
My question is, how can I do this dynamically such that I return with SQL one result table, which is in format ticket_number,faq_id,faq_subject.
I tried multiple configurations of UNION ALL, LEFT JOIN, LEFT OUTER JOIN statements, but they all return either too many rows, or have different problems.
Is this even possible with MySQL, and is it possible to write an SQL statement which includes #variables and can take care of this?
First off, that kind of a design is problematic. You have certain data embedded within another column, which is going to cause logic as well as performance problems (since you can't index the a_body in such a way that it will help the JOIN). If this is a one-time thing then that's one issue, but otherwise you're going to have problems with this design.
Second, consider this example: You're searching for faq_id #123. You have an article that includes faq_id 4123. You're going to end up with a false match there. You can embed the faq_id values in the text with some sort of mark-up (for example, [faq_id:123]), but at that point you might as well be saving them off in another table as well.
The following query should work (I think that MySQL supports CAST, if not then you might need to adjust that).
SELECT
T.ticket_number,
F.faq_id,
F.faq_subject
FROM
article A
INNER JOIN faq F ON
A.a_body LIKE CONCAT('%', F.faq_id, '%')
INNER JOIN ticket T ON
T.ticket_id = A.ticket_id
EDIT: Corrected to use CONCAT
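The false-match problem described above (searching for 123 and hitting 4123) is easy to reproduce. The sketch below uses Python's sqlite3 module with invented rows; SQLite has no CONCAT() in older versions, so the pattern is built with the || operator, which is an adaptation for the demo and not MySQL syntax. The second query assumes the bodies carry the [faq_id:123] style mark-up suggested above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE faq (faq_id INTEGER, faq_subject TEXT);
    CREATE TABLE ticket (ticket_id INTEGER, ticket_number TEXT);
    CREATE TABLE article (article_id INTEGER, ticket_id INTEGER, a_body TEXT);
    INSERT INTO faq VALUES (123, 'Reset password'), (4123, 'Billing');
    INSERT INTO ticket VALUES (1, 'T-001'), (2, 'T-002');
    INSERT INTO article VALUES
        (1, 1, 'See faq 4123 for details'),
        (2, 2, 'Answered via [faq_id:123]');
""")

# Bare number in the pattern: '4123' also contains '123', so ticket T-001
# is wrongly linked to faq 123.
naive_matches = conn.execute("""
    SELECT t.ticket_number, f.faq_id
    FROM article a
    INNER JOIN faq f ON a.a_body LIKE '%' || f.faq_id || '%'
    INNER JOIN ticket t ON t.ticket_id = a.ticket_id
    ORDER BY t.ticket_number, f.faq_id
""").fetchall()

# Delimited mark-up in the pattern: only exact IDs match. The '_' in
# 'faq_id' is a single-character LIKE wildcard, which is harmless here.
delimited_matches = conn.execute("""
    SELECT t.ticket_number, f.faq_id
    FROM article a
    INNER JOIN faq f ON a.a_body LIKE '%[faq_id:' || f.faq_id || ']%'
    INNER JOIN ticket t ON t.ticket_id = a.ticket_id
    ORDER BY t.ticket_number, f.faq_id
""").fetchall()

print(naive_matches)
print(delimited_matches)
```

The naive pattern reports three links, one of them spurious. The delimited pattern reports only the article that really references faq 123, at the cost of missing bodies that do not use the mark-up.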
SELECT DISTINCT t.ticket_number, f.faq_id, f.faq_subject
FROM faq f
INNER JOIN article a ON (a.a_body RLIKE CONCAT('faq_id: ', f.faq_id))
INNER JOIN ticket t ON (t.ticket_id = a.ticket_id)
WHERE somecriteria