Choosing none in set - mysql

I have two tables:
Invariant (UniqueID, characteristic1, characteristic2)
Variant (VariantID, UniqueID, specification1, specification2)
Each project has its own unchanging characteristics between implementations. Each implementation also has its own individual properties.
So, I use queries like this to find projects with the given characteristics and specifications:
SELECT *
FROM `Invariants`
LEFT JOIN (`Variants`) ON (`Variants`.`UniqueID`=`Invariants`.`UniqueID`)
WHERE char2='y' and spec1='x'
GROUP BY `Invariants`.`UniqueID`;
I'm looking for a query that will return all projects that have never satisfied a given specification. So, if one of project 100's variants had spec1='bad', then I don't want project 100 to be included, regardless of whether it also had variants where spec1='good'.

select *
from Invariants iv
where not exists (
select 1
from Variants v
where v.UniqueId = iv.UniqueId and v.spec1 = 'bad'
)
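Here is a small, runnable sketch of the NOT EXISTS query above. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the sample rows (projects 100 to 102) are made up for illustration:

```python
import sqlite3

# In-memory SQLite stands in for MySQL; all sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Invariants (UniqueID INTEGER PRIMARY KEY, char2 TEXT);
    CREATE TABLE Variants (VariantID INTEGER PRIMARY KEY, UniqueID INTEGER, spec1 TEXT);
    INSERT INTO Invariants VALUES (100, 'y'), (101, 'y'), (102, 'y');
    -- 100 has one bad variant, 101 has only good ones, 102 has no variants
    INSERT INTO Variants VALUES (1, 100, 'good'), (2, 100, 'bad'), (3, 101, 'good');
""")

# Keep a project only if no variant of it ever had spec1 = 'bad'.
never_bad = [row[0] for row in conn.execute("""
    SELECT iv.UniqueID
    FROM Invariants iv
    WHERE NOT EXISTS (
        SELECT 1
        FROM Variants v
        WHERE v.UniqueID = iv.UniqueID AND v.spec1 = 'bad'
    )
    ORDER BY iv.UniqueID
""")]
print(never_bad)  # [101, 102]
```

Project 100 is dropped even though it also has a 'good' variant, and project 102 (no variants at all) is kept.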
The queries below do not address your question; I probably read too fast and thought you wanted to pick up only the invariant properties of a particular type. But I will note that you shouldn't use a left join and then filter, in the where clause, against columns from the right table (except for checking nulls). People make that mistake all the time and that's what jumped out at me at first glance.
The whole purpose of a left join is that some of the rows will not match and will thus have filler null values in the columns for the right-hand table. This join logic happens first, and only after that is the where clause applied. A condition like where spec1 = 'x' never evaluates to true against a null value (the comparison yields NULL, which the where clause treats as no match). So you end up eliminating all the rows you wanted to keep.
This happens a lot with these invariant/custom values tables. You're only interested in one of the properties but if you don't filter prior to joining or inside the join condition, you end up dropping rows because the value didn't exist and you didn't have a value left to compare once it tried to apply a where-clause condition on the property name.
Hope that made sense. See below for examples:
select iv.UniqueId, ...
from
Invariants iv left outer join
Variants v
on v.UniqueId = iv.UniqueId and v.spec1 = 'x'
or
select iv.UniqueId, ...
from
Invariants iv left outer join
(
select *
from Variants
where spec1 = 'x'
) v
on v.UniqueId = iv.UniqueId
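To make the difference concrete, here is a rough sketch comparing the filter in the where clause with the filter in the join condition. It runs against in-memory SQLite rather than MySQL, and the three sample projects are invented:

```python
import sqlite3

# In-memory SQLite stands in for MySQL; all sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Invariants (UniqueID INTEGER PRIMARY KEY);
    CREATE TABLE Variants (VariantID INTEGER PRIMARY KEY, UniqueID INTEGER, spec1 TEXT);
    INSERT INTO Invariants VALUES (1), (2), (3);
    INSERT INTO Variants VALUES (10, 1, 'x'), (11, 2, 'z');
""")

# Filter in WHERE: rows whose right-hand columns are NULL are thrown away,
# so the left join behaves like an inner join.
in_where = conn.execute("""
    SELECT iv.UniqueID, v.spec1
    FROM Invariants iv
    LEFT JOIN Variants v ON v.UniqueID = iv.UniqueID
    WHERE v.spec1 = 'x'
    ORDER BY iv.UniqueID
""").fetchall()

# Filter in ON: every Invariants row survives; non-matching ones get NULLs.
in_on = conn.execute("""
    SELECT iv.UniqueID, v.spec1
    FROM Invariants iv
    LEFT JOIN Variants v ON v.UniqueID = iv.UniqueID AND v.spec1 = 'x'
    ORDER BY iv.UniqueID
""").fetchall()

print(in_where)  # [(1, 'x')]
print(in_on)     # [(1, 'x'), (2, None), (3, None)]
```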

Related

Fallback value in a JOIN condition if no row is found

Foreword: I am afraid that I'm asking to do something that can't be achieved. But I know that, if there is a way to do it, here someone will know :)
Suppose I have a MySQL database (or a MariaDB one, if compatibility can be retained). I need to select from TableA and join TableB based on a certain column having a certain value X; it is possible that no suitable row exists in TableB, which means I shall use a LEFT OUTER JOIN. However, in case no row is found (no row of TableB has the value X in that column), then I wish to use a row from TableB with value Y. In other words, Y should act as a fallback condition for the JOIN.
Here is my concrete example. Suppose I handle data localization by splitting an entity (say, a product) between two tables, one with the localizable data and the other with the fixed data. When I want to get the product info in French, I wish to fall back to English if no information is available in French. However, if I do
// basic stub of the query for demonstration purposes
SELECT * FROM Products p
LEFT OUTER JOIN ProductsLocalization pLoc
ON pLoc.ProductID = p.ID AND pLoc.Culture = 'fr-FR'
I will get NULL when no French info is available. Instead, only for those rows, I wish to have the result of the query
// basic stub of the query for demonstration purposes
SELECT * FROM Products p
LEFT OUTER JOIN ProductsLocalization pLoc
ON pLoc.ProductID = p.ID AND pLoc.Culture = 'en-US'
But the latter query is not acceptable, of course.
I know I could use an OR to join based on the Culture being 'fr-FR' or 'en-US', but this would potentially double the number of selected rows, and it would require me to post-process the resulting dataset. I wish to know if there is a way to avoid this post-processing.
Non-duplicate disclaimer: my question is similar to many others, but there's a crucial difference. I need to use a fallback value for the JOIN condition itself, not as the result of the query. My default doesn't end up in the dataset, but the dataset is computed based on that default.
You can join twice and use COALESCE() to get the value you want:
SELECT p.product_desc,
COALESCE(pLoc_fr.value, pLoc_us.value)
FROM Products p
LEFT OUTER JOIN ProductsLocalization pLoc_us ON
pLoc_us.ProductID = p.ID
AND pLoc_us.Culture = 'en-US'
LEFT OUTER JOIN ProductsLocalization pLoc_fr ON
pLoc_fr.ProductID = p.ID
AND pLoc_fr.Culture = 'fr-FR';
Also, if you can imagine a way of combining data sets, you can probably achieve it using plain old SQL. Some edge cases require dynamically generating a SQL statement, but those aren't terribly common. In this case, you could also use two correlated subqueries inside a COALESCE, a CASE, or an IF expression in the SELECT. There are almost always a few ways to skin the cat.
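As a quick check of the double-join approach, here is a sketch using in-memory SQLite in place of MySQL. The `value` column name follows the answer's query; the products and translations are made up:

```python
import sqlite3

# In-memory SQLite stands in for MySQL; all sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (ID INTEGER PRIMARY KEY, product_desc TEXT);
    CREATE TABLE ProductsLocalization (ProductID INTEGER, Culture TEXT, value TEXT);
    INSERT INTO Products VALUES (1, 'widget'), (2, 'gadget'), (3, 'gizmo');
    INSERT INTO ProductsLocalization VALUES
        (1, 'en-US', 'Widget'), (1, 'fr-FR', 'Bidule'),  -- both cultures
        (2, 'en-US', 'Gadget');                          -- English only
""")

# One join per culture; COALESCE takes French first and falls back to English.
localized = conn.execute("""
    SELECT p.ID, COALESCE(pLoc_fr.value, pLoc_us.value) AS value
    FROM Products p
    LEFT OUTER JOIN ProductsLocalization pLoc_us
        ON pLoc_us.ProductID = p.ID AND pLoc_us.Culture = 'en-US'
    LEFT OUTER JOIN ProductsLocalization pLoc_fr
        ON pLoc_fr.ProductID = p.ID AND pLoc_fr.Culture = 'fr-FR'
    ORDER BY p.ID
""").fetchall()

print(localized)  # [(1, 'Bidule'), (2, 'Gadget'), (3, None)]
```

Each product still yields exactly one row, so no post-processing is needed, provided there is at most one localization row per product and culture.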

SQL transform id and add where statement before join

I am pretty new to SQL. Here is an operation I am sure is simple for a lot of you. I am trying to join two tables across databases on the same server – dbA and dbB, with TableA (with IdA) and TableB (with IdB) respectively. But before doing that I want to transform column IdA into a number, where I would like to remove the “:XYZ” characters from its values, and add a where statement for another column in dbA too. Below I show my code for the join but I am not sure how to convert the values of the column. The conversion would allow me to match idA with idB in the join. Thanks a ton in advance.
Select replace(idA, “:XYZ”, "")
from dbA.TableA guid
where event like “%2015”
left join dbB.TableB own
on guid.idA = own.idB
A few things:
The clauses are written in the order FROM, JOIN, WHERE (unless you use subqueries), and that is also the logical order of execution. Notice that SELECT isn't listed: it comes first syntactically but is evaluated near the end.
Alias or fully qualify columns when multiple tables are involved, so we know which field comes from which table.
Because the FROM and JOIN clauses are processed first, what you do in the SELECT isn't available (not yet in scope) at that point. This is why you can't use SELECT column aliases in the FROM, the WHERE, or (in standard SQL) even the GROUP BY.
I don't usually like SELECT *, but as I don't know what columns you really need, I used it here.
As for applying the WHERE before the join: most SQL engines now use cost-based optimization and figure out the best execution plan for the tables and data involved. So just put the limiting criteria in the WHERE in this case, since it limits the left table of the left join. If you needed to limit data on the right table of a left join, you'd put the limit in the join criteria, thus allowing it to filter as it joins.
You probably need to cast idA to an integer (or to the same type as idB). I used TRIM to eliminate spaces, but if there are other non-display characters, the left join would still fail to match.
SELECT guid.*, own.*
FROM dbA.TableA guid
LEFT JOIN dbB.TableB own
on cast(trim(replace(guid.idA, ':XYZ', '')) as int) = own.idB
WHERE guid.event like '%2015'
Or materialize the transformation first in a subquery, so that idA is already in its transformed state before the join (as in algebra, parentheses matter and are evaluated from the inside out):
SELECT *
FROM (SELECT cast(trim(replace(guid.idA, ':XYZ', '')) as int) as idA
FROM dbA.TableA guid
WHERE guid.event like '%2015') B
LEFT JOIN dbB.TableB own
on B.IDA = own.idB
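A small sketch of the subquery variant, run against in-memory SQLite instead of two MySQL databases. The tables are created without the dbA/dbB prefixes, the `owner` column and all rows are invented, and SQLite spells the cast target INTEGER:

```python
import sqlite3

# In-memory SQLite stands in for the two MySQL databases; rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (idA TEXT, event TEXT);
    CREATE TABLE TableB (idB INTEGER, owner TEXT);
    INSERT INTO TableA VALUES
        ('42:XYZ', 'Spring 2015'),
        (' 7:XYZ', 'Fall 2015'),   -- leading space, handled by TRIM
        ('9:XYZ',  'Fall 2014');   -- filtered out by the WHERE
    INSERT INTO TableB VALUES (42, 'alice'), (8, 'bob');
""")

# Strip ':XYZ', trim, and cast inside the subquery, then join on the clean id.
joined = conn.execute("""
    SELECT a.clean_id, own.owner
    FROM (SELECT CAST(TRIM(REPLACE(idA, ':XYZ', '')) AS INTEGER) AS clean_id
          FROM TableA
          WHERE event LIKE '%2015') a
    LEFT JOIN TableB own ON a.clean_id = own.idB
    ORDER BY a.clean_id
""").fetchall()

print(joined)  # [(7, None), (42, 'alice')]
```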

Replace the id's with name using single query

SELECT team_with.participant1,team_with.participant2,team_with.participant3
FROM event,team_with
WHERE team_with.for_event_no=event.event_no AND
event.event_no=4 AND
team_with.participant1=9 OR
team_with.participant2=9 OR
team_with.participant3=9;
I have written the particular query, and obtained the required id's in a row. I am not able to modify this query such that, in place of these id's, names connected to the id's are displayed.
The student_detail table consists of the PK (sam_id) and the attribute name.
The IDs displayed by the present query are FKs connected to student_detail.sam_id.
It seems like a bad design to have multiple columns storing the different participants. Consider storing each participant as a separate row in its own table. Your joining logic would also be easier.
Also, please use explicit JOIN syntax: it makes the query clearer and easier to understand by separating the join logic from the conditions for data retrieval.
Remember that the AND operator takes precedence over OR, so your event.event_no = 4 condition does not apply to each participant condition. I believe this was a mistake, but you are the one to judge.
As to the query itself, you could put OR conditions into the join, or simply join the student_detail table three times.
SELECT
s1.name,
s2.name,
s3.name
FROM
event e
INNER JOIN team_with t ON t.for_event_no = e.event_no
LEFT JOIN student_detail s1 ON s1.sam_id = t.participant1
LEFT JOIN student_detail s2 ON s2.sam_id = t.participant2
LEFT JOIN student_detail s3 ON s3.sam_id = t.participant3
WHERE
e.event_no = 4
AND ( t.participant1=9 OR t.participant2=9 OR t.participant3=9 );
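The precedence point is easy to verify. Below is a minimal sketch on in-memory SQLite (AND binds tighter than OR there as well as in MySQL), with three made-up team_with rows:

```python
import sqlite3

# In-memory SQLite stands in for MySQL; all sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE team_with (for_event_no INTEGER, participant1 INTEGER,
                            participant2 INTEGER, participant3 INTEGER);
    INSERT INTO team_with VALUES
        (4, 9, 1, 2),   -- event 4, student 9 takes part
        (5, 3, 9, 4),   -- event 5, student 9 takes part
        (4, 5, 6, 7);   -- event 4, student 9 does not take part
""")

# Without parentheses this reads as (event = 4 AND p1 = 9) OR p2 = 9 OR p3 = 9.
without_parens = [r[0] for r in conn.execute("""
    SELECT for_event_no FROM team_with
    WHERE for_event_no = 4 AND participant1 = 9
       OR participant2 = 9 OR participant3 = 9
    ORDER BY for_event_no
""")]

# With parentheses the event filter applies to all three participant checks.
with_parens = [r[0] for r in conn.execute("""
    SELECT for_event_no FROM team_with
    WHERE for_event_no = 4
      AND (participant1 = 9 OR participant2 = 9 OR participant3 = 9)
    ORDER BY for_event_no
""")]

print(without_parens)  # [4, 5]
print(with_parens)     # [4]
```

The unparenthesized version leaks the row from event 5, which is presumably not what was intended.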

MySql - Duplicate Columns When Joining Multiple tables to Main Table

BACKGROUND
I was given a bunch of data that looks like this, a little over 200 columns wide:
Name|Address|etc...|Value 1|Value 2|Code 1|Code 2|repeat 7 times for codes|Value 3|repeat 200 times for values....|Value 200
They included definition lists that are used to decipher the Coded Values, for example: U6 = Local Limit and U7 = More than 100 Times
So I loaded it up into mysql because they wanted reports that swapped in the value from the definition lists for the Coded Values. However not all cells in main table have data, some are blank.
PROBLEM
So, when I build my select statement, I would usually use a left join and be fine. But I need multiple left joins to get the 8 definition lists swapped in when needed, and multiple left joins give me a lot of extra fields. I am having trouble with this.
Main Table is called
RAW_DATA
and the tables that hold all of the definition lists are named:
COUNTRY
ORIGIN
LANGUAGE
PREFERENCE
HAS_VEHICLE
EDUCATION
MARITAL_STATUS
OCCUPATION
TECHCODE
TYPE
INCOME
These tables above are just the ones that have definitions. All of the other fields in the 225 table are static and often unique. It could be normalized I am sure, but it would be tons of effort for converting one report, one time. That is why I just used the ones that had codes that were not human recognizable via the definition lists.
MY QUERY
SELECT `raw_data`.`id_raw_data`,
`raw_data`.`id`,
`raw_data`.`first_name`,
`raw_data`.`last_name`,
`raw_data`.`OTHER_COLUMNS_AS_NEEDED`,
`country`.`longname` as `country`,
`origin`.`longname` as `origin`,
`language`.`longname` as `language`,
`preference`.`longname` as `preference`,
`has_vehicle`.`longname` as `vehichle_type`,
`education`.`longname` as `education`,
`marital_status`.`longname` as `marital_status`,
`occupation`.`longname` as `occupation`,
`techcode`.`longname` as `tech_group`,
`typestat`.`longname` as `typecode`,
`income`.`longname` as `income`
FROM `raw_data`
left join `country`
on `raw_data`.`countrycode` = `country`.`shortname`
left join `origin`
on `raw_data`.`origincode` = `origin`.`shortname`
left join `language`
on `raw_data`.`languagecode` = `language`.`shortname`
left join `preference`
on `raw_data`.`preferencecode` = `preference`.`shortname`
left join `has_vehicle`
on `raw_data`.`has_vehiclecode` = `has_vehicle`.`shortname`
left join `education`
on `raw_data`.`educationcode` = `education`.`shortname`
left join `marital_status`
on `raw_data`.`marital_statuscode` = `marital_status`.`shortname`
left join `occupation`
on `raw_data`.`occupationcode` = `occupation`.`shortname`
left join `techcode`
on `raw_data`.`techcodecode` = `techcode`.`shortname`
left join `typestat`
on `raw_data`.`typestatcode` = `typestat`.`shortname`
left join `income`
on `raw_data`.`incomecode` = `income`.`shortname`
I have done some searching, all seem to use some form of sub-query or question involved joining back to itself. I am pretty sure it has something to do with the columns in the massive raw_data table that do not have values so there is no match, but need help.
This seemed close, but my query times out if too many joins already and this seems like even more work for all my lookups: Removing duplicates from result of multiple join on tables with different columns in MySQL
Thanks for the help,
David
In case anyone else wants to know, I found the issue was not with the sql at all, which worked fine for my purpose.
Rather, the data in a definition table had some values that were not unique, so the result returned an extra row in those cases where there was a duplicate definition defined.
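For anyone hitting the same thing, here is a small sketch of the symptom and of a check that finds the offending definitions. It uses in-memory SQLite instead of MySQL, only the country lookup, and invented rows:

```python
import sqlite3

# In-memory SQLite stands in for MySQL; all sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_data (id_raw_data INTEGER PRIMARY KEY, countrycode TEXT);
    CREATE TABLE country (shortname TEXT, longname TEXT);
    INSERT INTO raw_data VALUES (1, 'US'), (2, 'CA'), (3, NULL);
    INSERT INTO country VALUES
        ('US', 'United States'),
        ('CA', 'Canada'),
        ('CA', 'Canada (duplicate entry)');  -- shortname defined twice
""")

# The left join returns one extra row for every duplicated definition.
joined_rows = conn.execute("""
    SELECT COUNT(*)
    FROM raw_data
    LEFT JOIN country ON raw_data.countrycode = country.shortname
""").fetchone()[0]

# Lookup codes that are defined more than once.
duplicates = conn.execute("""
    SELECT shortname, COUNT(*)
    FROM country
    GROUP BY shortname
    HAVING COUNT(*) > 1
""").fetchall()

print(joined_rows)  # 4 rows come back for 3 rows of raw_data
print(duplicates)   # [('CA', 2)]
```

Running the second query against each definition table shows which codes need cleaning up before the report query will return one row per raw_data row.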

Including rest of the rows in MySQL

I have an SQL query that selects user's privileges, and adds true to them.
SELECT
PrivilageName,
'true' hasrights -- imaginary column
FROM
users
NATURAL JOIN usermemberships
NATURAL JOIN groupprivileges
NATURAL JOIN `privileges`
WHERE
UserID = '2'
Result is
AddBuilding true
RemoveBuilding true
EditBuilding true
I'm trying to add the rest of the privilages with false value.
AddBuilding true
RemoveBuilding true
EditBuilding true
RemoveUser false
AddUser false
How can I do this?
Edit: the structure of the tables:
users(UserID),
usermemberships(UserID, groupID),
groupprivileges(GroupID, PrivilegeID),
privileges(PrivilegeID, PrivilageName)
Edit: misspelling, sorry.
(NOTE: The queries in this answer are now updated, to include the column names that were added to the question.)
One approach to getting that resultset would be to use LEFT JOIN operations (with appropriate predicates in the ON clause), in place of all those NATURAL JOIN operations.
(I'm just guessing at the column names referenced by the NATURAL JOIN. In order to decipher that, we would need to inspect each table definition to get a list of all of the columns, and then find all the column names that match, to figure out which columns MySQL is using to do those inner join operations.)
Based on the scant information provided in the query text, here's the approach I would take (again, just guessing at the names referenced in each ON clause):
SELECT p.PrivilageName
, IF(u.UserID IS NOT NULL,'true','false') AS hasrights
FROM `privileges` p
LEFT
JOIN groupprivileges g
ON g.PrivilegeID = p.PrivilegeID
LEFT
JOIN usermemberships m
ON m.GroupId = g.GroupID
LEFT
JOIN users u
ON u.UserID = m.UserID
AND u.UserID = 2
Depending on the cardinality in those tables (i.e. is "AddBuilding" privilege granted to two different groups, one which the user is a member of and the other not...)
and depending on whether you want to avoid returning any "duplicate" PrivilageName values (either multiple rows with "true" or "false", or rows with both "true" and "false" for each PrivilageName), and depending on how you want the resultset ordered (i.e. do you want all the "true" privileges listed first?)...
Then this query is more deterministic in the resultset that is returned, it will return each PrivilageName only once. This resultset seems better suited to answer the question whether a user has a privilege or not.
SELECT p.PrivilageName
, MAX(IF(u.UserID IS NOT NULL,'true','false')) AS hasrights
FROM `privileges` p
LEFT
JOIN groupprivileges g
ON g.PrivilegeID = p.PrivilegeID
LEFT
JOIN usermemberships m
ON m.GroupId = g.GroupID
LEFT
JOIN users u
ON u.UserID = m.UserID
AND u.UserID = 2
GROUP BY p.PrivilageName
ORDER BY hasrights DESC, p.PrivilageName ASC
(Personally, I'd omit the ORDER BY, and let the results be ordered by PrivilageName, but with the ORDER BY, this better matches the resultset specified in the question.)
Of course, that's not the only way to get the result set, but it's likely to be the most efficient (given suitable indexes).
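Here is a rough sketch of the grouped LEFT JOIN approach on in-memory SQLite. Two things are assumptions on my part: SQLite has no IF(), so a CASE expression takes its place, and the join to users goes through usermemberships.UserID, since groupprivileges has no UserID column in the structure given in the question. The sample users, groups, and grants are made up:

```python
import sqlite3

# In-memory SQLite stands in for MySQL; all sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (UserID INTEGER PRIMARY KEY);
    CREATE TABLE usermemberships (UserID INTEGER, GroupID INTEGER);
    CREATE TABLE groupprivileges (GroupID INTEGER, PrivilegeID INTEGER);
    CREATE TABLE privileges (PrivilegeID INTEGER PRIMARY KEY, PrivilageName TEXT);
    INSERT INTO users VALUES (1), (2);
    INSERT INTO usermemberships VALUES (2, 10), (1, 20);
    INSERT INTO groupprivileges VALUES (10, 1), (10, 2), (10, 3), (20, 4), (20, 1);
    INSERT INTO privileges VALUES (1, 'AddBuilding'), (2, 'RemoveBuilding'),
        (3, 'EditBuilding'), (4, 'RemoveUser'), (5, 'AddUser');
""")

# Start from every privilege, then work outwards to user 2.
# MAX() picks 'true' over 'false' because 'true' sorts higher as text.
rights = conn.execute("""
    SELECT p.PrivilageName,
           MAX(CASE WHEN u.UserID IS NOT NULL THEN 'true' ELSE 'false' END) AS hasrights
    FROM privileges p
    LEFT JOIN groupprivileges g ON g.PrivilegeID = p.PrivilegeID
    LEFT JOIN usermemberships m ON m.GroupID = g.GroupID
    LEFT JOIN users u ON u.UserID = m.UserID AND u.UserID = 2
    GROUP BY p.PrivilageName
    ORDER BY hasrights DESC, p.PrivilageName ASC
""").fetchall()

for name, has in rights:
    print(name, has)
```

AddBuilding is granted to both groups here, yet it appears only once and as true, because the GROUP BY collapses the rows and MAX() keeps the 'true'.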
Personally, I never use NATURAL JOIN: I want to see the predicates in the statement, and I don't want any of my queries to "break" if someone adds a column with a matching name to one of the tables in my query. (Actually, thinking about it, I can't use NATURAL JOIN because id is typically the name of the primary key column of nearly all my tables... foreign key columns are typically named referencedtable_id.) But even if I did name the columns in a way that I could use NATURAL JOIN, I see the potential drawbacks outweighing any advantage.
But something like the statement below might work. (I say "might" because I don't have any experience using syntax like this... I never use NATURAL JOIN, and I always prefer LEFT joins to RIGHT joins. If someone in my shop came to me with this, I would give them the statement above.) But I don't want to leave you with the impression that a NATURAL JOIN can't be used to return the specified resultset. It's possible your specified resultset might be returned by a statement like this:
SELECT
PrivilageName,
MAX(IF(UserID=2,'true','false')) AS hasrights
FROM
users
NATURAL RIGHT JOIN usermemberships
NATURAL RIGHT JOIN groupprivileges
NATURAL RIGHT JOIN `privileges`
GROUP BY PrivilageName
You can use UNION to concatenate the results of two queries.
The IF() function may also help you.