I will simplify my problem to make the core issue clear: I have a query that does TableA INNER JOIN TableB LEFT JOIN TableC.
The result of the left join is that in the result set, two of the columns might have NULL values in some rows. To fill in the missing values I have to loop over the result set and query another database that has the data (so it is not possible to join in the first place).
My question is: Is there a standard/optimised approach when we need to fill nulls of a result set after a left join?
You can use COALESCE(...) (MSDN - COALESCE) instead.
Your query will then look like:
select a, b, COALESCE(TableC.c, 'replacement value')
from TableA INNER JOIN TableB LEFT JOIN TableC ...
If you don't want to use a static value, add another join for your replacement table and put the column that should replace the NULL values in the COALESCE function.
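As a rough sketch of that last suggestion, here is a version that can be run standalone with Python's built-in sqlite3 module. The Replacement table, the id and c columns, and the sample rows are all made up for illustration, and SQLite only stands in for whatever server you are actually using.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (id INTEGER PRIMARY KEY, a TEXT);
    CREATE TABLE TableB (id INTEGER PRIMARY KEY, b TEXT);
    CREATE TABLE TableC (id INTEGER PRIMARY KEY, c TEXT);
    -- Hypothetical table holding the values to use when TableC has none.
    CREATE TABLE Replacement (id INTEGER PRIMARY KEY, c TEXT);

    INSERT INTO TableA VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO TableB VALUES (1, 'b1'), (2, 'b2'), (3, 'b3');
    INSERT INTO TableC VALUES (1, 'c1');
    INSERT INTO Replacement VALUES (2, 'r2');
""")

# COALESCE returns its first non-NULL argument: the TableC value if the
# left join matched, else the Replacement value, else the static default.
rows = conn.execute("""
    SELECT TableA.a, TableB.b,
           COALESCE(TableC.c, Replacement.c, 'replacement value') AS c
    FROM TableA
    INNER JOIN TableB ON TableB.id = TableA.id
    LEFT JOIN TableC ON TableC.id = TableA.id
    LEFT JOIN Replacement ON Replacement.id = TableA.id
    ORDER BY TableA.id
""").fetchall()

print(rows)
```

Row 1 keeps its TableC value, row 2 picks up the value from the replacement table, and row 3 falls through to the static default.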
"To fill in the missing values I have to loop over the result set and query another database that has the data (so it is not possible to join in the first place)."
Consider a different solution than looping to fill in the data.
1. Another database on the same server: easy. Just join using db.schema qualified names.
2. Another database on another server: still possible, depending on your network topology. Then join using server.db.schema qualified names.
3. Consider replicating the data you need if you regularly need to do this.
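To give a feel for the "qualified names" idea, here is a small sketch using Python's sqlite3 module, where ATTACH DATABASE plays the role of a second database. The names other, results and lookup are invented for the example, and on SQL Server or MySQL the qualification syntax would be db.schema.table or db.table rather than what SQLite uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach a second in-memory database under the (hypothetical) name "other".
conn.execute("ATTACH DATABASE ':memory:' AS other")
conn.executescript("""
    CREATE TABLE main.results (id INTEGER PRIMARY KEY, val TEXT);
    CREATE TABLE other.lookup (id INTEGER PRIMARY KEY, val TEXT);
    INSERT INTO main.results VALUES (1, 'x'), (2, NULL);
    INSERT INTO other.lookup VALUES (2, 'y');
""")

# One query joins across both databases, so no loop over the result set
# is needed to fill in the NULL.
filled = conn.execute("""
    SELECT r.id, COALESCE(r.val, l.val) AS val
    FROM main.results AS r
    LEFT JOIN other.lookup AS l ON l.id = r.id
    ORDER BY r.id
""").fetchall()

print(filled)
```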
Related
Foreword: I am afraid that I'm asking to do something that can't be achieved. But I know that, if there is a way to do it, here someone will know :)
Suppose I have a MySQL database (or a MariaDB one, if compatibility can be retained). I need to select from TableA and join TableB based on a certain column having a certain value X; it is possible that no suitable row exists in TableB, which means I shall use a LEFT OUTER JOIN. However, in case no row is found (no row of TableB has the value X in that column), then I wish to use a row from TableB with value Y. In other words, Y should act as a fallback condition for the JOIN.
Here is my concrete example. Suppose I handle data localization by splitting an entity (say, a product) between two tables, one with the localizable data and the other with the fixed data. When I want to get the product info in French, I wish to fallback to English if no information is available in French. However, if I do
-- basic stub of the query for demonstration purposes
SELECT * FROM Products p
LEFT OUTER JOIN ProductsLocalization pLoc
ON pLoc.ProductID = p.ID AND pLoc.Culture = 'fr-FR'
I will get NULL when no French info is available. Instead, only for those rows, I wish to have the result of the query
-- basic stub of the query for demonstration purposes
SELECT * FROM Products p
LEFT OUTER JOIN ProductsLocalization pLoc
ON pLoc.ProductID = p.ID AND pLoc.Culture = 'en-US'
But the latter query is not acceptable, of course.
I know I could use an OR to join based on the Culture being 'fr-FR' or 'en-US', but this would potentially double the number of selected rows, and it would require me to post-process the resulting dataset. I wish to know if there is a way to avoid this post-processing.
Non-duplicate disclaimer: my question is similar to many others, but there's a crucial difference. I need to use a fallback value for the JOIN condition itself, not as the result of the query. My default doesn't end up in the dataset, but the dataset is computed based on that default.
You can join twice and use COALESCE() to get the value you want:
SELECT p.product_desc,
COALESCE(pLoc_fr.value, pLoc_en.value)
FROM Products p
LEFT OUTER JOIN ProductsLocalization pLoc_en ON
pLoc_en.ProductID = p.ID
AND pLoc_en.Culture = 'en-US'
LEFT OUTER JOIN ProductsLocalization pLoc_fr ON
pLoc_fr.ProductID = p.ID
AND pLoc_fr.Culture = 'fr-FR';
Also, if you can imagine a way of combining data sets, you can probably achieve it using plain old SQL. Some edge cases require dynamically generating a SQL statement, but those aren't terribly common. In this case, you could also use two correlated subqueries inside a COALESCE, a CASE, or an IF expression in the SELECT. There are almost always a few ways to skin the cat.
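For what it's worth, the correlated-subquery variant could look roughly like this. It is only a sketch, run through Python's sqlite3 module so it is self-contained. The sample rows are invented, and it assumes there is at most one ProductsLocalization row per product and culture (otherwise the scalar subqueries would need a LIMIT or an aggregate).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (ID INTEGER PRIMARY KEY, product_desc TEXT);
    CREATE TABLE ProductsLocalization (ProductID INTEGER, Culture TEXT, value TEXT);
    INSERT INTO Products VALUES (1, 'widget'), (2, 'gadget'), (3, 'gizmo');
    INSERT INTO ProductsLocalization VALUES
        (1, 'en-US', 'Widget'), (1, 'fr-FR', 'Bidule'),
        (2, 'en-US', 'Gadget');
""")

# The first subquery looks for the French text; if it yields NULL,
# COALESCE falls back to the English one.
localized = conn.execute("""
    SELECT p.ID,
           COALESCE(
               (SELECT l.value FROM ProductsLocalization l
                 WHERE l.ProductID = p.ID AND l.Culture = 'fr-FR'),
               (SELECT l.value FROM ProductsLocalization l
                 WHERE l.ProductID = p.ID AND l.Culture = 'en-US')
           ) AS value
    FROM Products p
    ORDER BY p.ID
""").fetchall()

print(localized)
```

Product 1 gets the French text, product 2 falls back to English, and product 3 has neither, so it stays NULL.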
I am pretty new to SQL. Here is an operation I am sure is simple for a lot of you. I am trying to join two tables across databases on the same server: TableA (with IdA) in dbA and TableB (with IdB) in dbB. But before doing that I want to transform column IdA into a number by removing the ":XYZ" characters from its values, so that I can match idA with idB in the join. I also want to add a where clause on another column in dbA. Below I show my code for the join, but I am not sure how to convert the values of the column. Thanks a ton in advance.
Select replace(idA, “:XYZ”, "")
from dbA.TableA guid
where event like “%2015”
left join dbB.TableB own
on guid.idA = own.idB
A few things:
FROM, JOINs, WHERE (unless you use subqueries): the syntax order is also the order of execution. (Notice SELECT isn't listed: it comes near the end in order of operation, but first syntactically!)
Alias or fully qualify columns when multiple tables are involved, so we know which field comes from which table.
Because the FROM and JOINs are processed first, what you do in the SELECT isn't available (not in scope yet) to the compiler. This is why you can't use SELECT column aliases in the FROM, the WHERE, or even the GROUP BY.
I don't usually like SELECT *, but as I don't know what columns you really need, I used it here.
As for putting the WHERE before the join: most SQL engines now use cost-based optimization and figure out the best execution plan given your tables and data. So just put the limiting criteria in the WHERE in this case, since it limits the left table of the left join. If you needed to limit data on the right table of a left join, you'd put the limit in the join criteria, allowing it to filter as it joins.
You probably need to cast idA to integer (or to the same type as idB). I used TRIM to eliminate spaces, but if there are other non-display characters, you'd have issues with the left join matching.
SELECT guid.*, own.*
FROM dbA.TableA guid
LEFT JOIN dbB.TableB own
on cast(trim(replace(guid.idA, ':XYZ', '')) as int) = own.idB
WHERE guid.event like '%2015'
Or materialize the transformation first by using a subquery, so that idA is already in its transformed state before the join (as in algebra, parentheses matter and get processed from the inside out):
SELECT *
FROM (SELECT cast(trim(replace(guid.idA, ':XYZ', '')) as int) as idA
FROM dbA.TableA guid
WHERE guid.event like '%2015') B
LEFT JOIN dbB.TableB own
on B.IDA = own.idB
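If you want to try the subquery version without a server, here is a rough standalone sketch using Python's sqlite3 module. The sample rows and the owner column are invented, both tables live in one database here, and SQLite spells the cast AS INTEGER, so adjust for your own engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (idA TEXT, event TEXT);
    CREATE TABLE TableB (idB INTEGER, owner TEXT);
    INSERT INTO TableA VALUES
        ('101:XYZ', 'Expo 2015'),
        (' 102:XYZ ', 'Fair 2015'),   -- stray spaces, handled by TRIM
        ('103:XYZ', 'Expo 2014'),     -- filtered out by the WHERE
        ('104:XYZ', 'Show 2015');     -- no counterpart in TableB
    INSERT INTO TableB VALUES (101, 'alice'), (102, 'bob'), (103, 'carol');
""")

# The inner query cleans idA and applies the filter; the outer query
# joins on the already-converted value.
joined = conn.execute("""
    SELECT B.idA, own.owner
    FROM (SELECT CAST(TRIM(REPLACE(guid.idA, ':XYZ', '')) AS INTEGER) AS idA
          FROM TableA guid
          WHERE guid.event LIKE '%2015') B
    LEFT JOIN TableB own ON B.idA = own.idB
    ORDER BY B.idA
""").fetchall()

print(joined)
```

The 2014 row is dropped by the filter, and the row without a counterpart in TableB comes back with a NULL owner, as expected from a left join.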
Let's say I have about 25,000 records in two tables and the data in each should be the same. If I need to find any rows that are in table A but NOT in table B, what's the most efficient way to do this.
We've tried it as a subquery of one table and a NOT IN the result but this runs for over 10 minutes and almost crashes our site.
There must be a better way. Maybe a JOIN?
A LEFT OUTER JOIN should do the job:
select t1.similar_ID
, case when t2.similar_ID is not null then 1 else 0 end as row_exists
from table1 t1
left outer join (select distinct similar_ID from table2) t2
on t1.similar_ID = t2.similar_ID -- your WHERE goes here
I would suggest you read the following blog post, which goes into great detail on this question:
Which method is best to select values present in one table but missing
in another one?
And after a thorough analysis, arrives at the following conclusion:
However, these three methods [NOT IN, NOT EXISTS, LEFT JOIN]
generate three different plans which are executed by three different
pieces of code. The code that executes EXISTS predicate is about 30%
less efficient than those that execute index_subquery and LEFT JOIN
optimized to use Not exists method.
That’s why the best way to search for missing values in MySQL is using a LEFT JOIN / IS NULL or NOT IN rather than NOT
EXISTS.
If the performance you're seeing with NOT IN is not satisfactory, you won't improve this performance by switching to a LEFT JOIN / IS NULL or NOT EXISTS, and instead you'll need to take a different route to optimizing this query, such as adding indexes.
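Use EXISTS / NOT EXISTS instead. The subquery has to be correlated with the outer table, otherwise it only checks whether B is empty (id here stands for whatever column identifies a row in both tables):
Select * from A where not exists (select * from B where B.id = A.id);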
Left join. From the MySQL documentation:
If there is no matching row for the right table in the ON or USING
part in a LEFT JOIN, a row with all columns set to NULL is used for
the right table. You can use this fact to find rows in a table that
have no counterpart in another table:
SELECT left_tbl.* FROM left_tbl LEFT JOIN right_tbl ON left_tbl.id =
right_tbl.id WHERE right_tbl.id IS NULL;
This example finds all rows in left_tbl with an id value that is not
present in right_tbl (that is, all rows in left_tbl with no
corresponding row in right_tbl).
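To compare the approaches mentioned in this thread side by side, here is a small sketch that runs all three against the same toy data with Python's sqlite3 module. The sample ids are invented, and it says nothing about relative speed on MySQL. It only shows that the three forms return the same rows when the id columns contain no NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE left_tbl (id INTEGER PRIMARY KEY);
    CREATE TABLE right_tbl (id INTEGER PRIMARY KEY);
    INSERT INTO left_tbl VALUES (1), (2), (3), (4);
    INSERT INTO right_tbl VALUES (2), (4);
""")

queries = {
    "left_join": """
        SELECT l.id FROM left_tbl l
        LEFT JOIN right_tbl r ON l.id = r.id
        WHERE r.id IS NULL
        ORDER BY l.id""",
    "not_exists": """
        SELECT l.id FROM left_tbl l
        WHERE NOT EXISTS (SELECT 1 FROM right_tbl r WHERE r.id = l.id)
        ORDER BY l.id""",
    "not_in": """
        SELECT l.id FROM left_tbl l
        WHERE l.id NOT IN (SELECT id FROM right_tbl)
        ORDER BY l.id""",
}

# Run each variant and collect the ids that have no counterpart.
missing = {name: [row[0] for row in conn.execute(sql)]
           for name, sql in queries.items()}

print(missing)
```

Both id columns are primary keys here, so they are indexed. On a real table, an index on the compared column is usually the first thing to check.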
I am having a problem with my LEFT OUTER JOIN. I have a set of queries which gives me about 80,000 to 100,000 records in a #Temp table. Now when I LEFT OUTER JOIN this #Temp table with another table I have to put in a CASE statement, i.e. if the records are not found when joining on a particular column, then take that column's value and find its corresponding value in another table which has the matching records. The query works fine for a particular set of data, but for larger data it just keeps executing or takes too much time. My query is like:
SELECT * FROM #Temp
LEFT OUTER JOIN TABLE1 ON #Temp.Materialcode =
CASE WHEN TABLE1.MaterialCode LIKE 'HY%'
THEN TABLE1.MaterialCode
ELSE REPLACE(TABLE1.MaterialCode,
TABLE1.MaterialCode,
(SELECT NewMaterialCode
FROM TABLE2
WHERE OldMaterialCode = TABLE1.MaterialCode))
END
Here TABLE2 has only two columns, NewMaterialCode and OldMaterialCode. What I have to do is: if the material code in TABLE1 is not of the LIKE 'HY%' type, then it should take that material code and look up its corresponding NewMaterialCode in TABLE2, so as to get both kinds of records, 'HY' type and non-'HY' type. I think I made my problem clear. Any help would be greatly appreciated.
SELECT *
FROM #TEMP TMP
LEFT JOIN Table1 MATERIAL
ON TMP.MaterialCode = MATERIAL.MaterialCode
LEFT JOIN Table2 REPLACEMENT
ON MATERIAL.MaterialCode = REPLACEMENT.OldMaterialCode
WHERE ( COALESCE(MATERIAL.materialcode, '') LIKE 'HY%'
AND TMP.materialCode = MATERIAL.MaterialCode
)
OR MATERIAL.MaterialCode = REPLACEMENT.NewMaterialCode
I think this should do what you're trying to do, but I don't really know how the tables are related except by reverse-engineering your query.
For the record, the OUTER JOIN in your query isn't accomplishing a thing, because an outer condition would produce null values for the columns in TABLE1, and the case condition wouldn't work (a NULL would be neither a match for 'HY%' nor an ELSE). That's counter-intuitive to those not used to working in the three-valued logic of the database world, but that's why we have COALESCE and ISNULL.
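If the three-valued logic is unfamiliar, a tiny experiment makes the effect of COALESCE on a LIKE test visible. This sketch uses Python's sqlite3 module rather than SQL Server, and the literal 'HY123' is just a made-up material code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# sqlite3 maps SQL NULL to Python None, and SQL true/false to 1/0.
raw, wrapped, kept = conn.execute("""
    SELECT NULL LIKE 'HY%',                  -- comparison against NULL
           COALESCE(NULL, '') LIKE 'HY%',    -- NULL replaced by '' first
           COALESCE('HY123', '') LIKE 'HY%'  -- a real value is left alone
""").fetchone()

print(raw, wrapped, kept)
```

The bare comparison is neither true nor false (it comes back as NULL), while wrapping the column in COALESCE turns it into an ordinary false.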
I want to create a query in MS Access that will display information from two tables based on the values in one table. Both of these tables have the same exact columns. One has set records and the other one has records a visitor can insert/edit/delete. For the purpose of this question I will call the tables TableA and TableB. TableA has the predetermined records and can not be changed. Multiple users will be using these records. Visitors would add records to TableB. I need a query that will display the records from TableA unless a visitor adds a record to TableB and then it displays that record. The field I need to join on is CategoryID. So what I need is basically like this;
If TableB.CategoryID Is Not Null Then
Select * From TableB
Else
Select * From TableA
End If
Thanks for any assistance anyone can provide.
JW
You get part of the way there by unioning the individual table queries; that works if there's nothing in B, but shows the A records if there is.
So suppose we created a table just like A, say A2, but with an added column: the number of records in B. And then we select all of the records in A2 where this new column is 0, and only the columns originally in A; call this A3.
Now consider the union of A3 & B. If B is empty, we get A. If B is not empty, then none of the records from A2 will be chosen for A3, and we're left with just B.
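The A2/A3 construction above can be collapsed into a single query by testing the row count of B directly. Here is a rough sketch, run with Python's sqlite3 module instead of Access. The helper name visible_entries and the txtEntry column are assumptions for the example, and the Access SQL may need adjusting.

```python
import sqlite3

def visible_entries(a_rows, b_rows):
    """Return TableA's rows if TableB is empty, otherwise TableB's rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE TableA (CategoryID INTEGER, txtEntry TEXT);
        CREATE TABLE TableB (CategoryID INTEGER, txtEntry TEXT);
    """)
    conn.executemany("INSERT INTO TableA VALUES (?, ?)", a_rows)
    conn.executemany("INSERT INTO TableB VALUES (?, ?)", b_rows)
    # The first branch plays the role of A3: it keeps A's rows only
    # while the count of rows in B is 0.
    rows = conn.execute("""
        SELECT CategoryID, txtEntry FROM TableA
        WHERE (SELECT COUNT(*) FROM TableB) = 0
        UNION ALL
        SELECT CategoryID, txtEntry FROM TableB
        ORDER BY CategoryID
    """).fetchall()
    conn.close()
    return rows

print(visible_entries([(1, "default one"), (2, "default two")], []))
print(visible_entries([(1, "default one"), (2, "default two")], [(1, "custom")]))
```

Note that this is all-or-nothing: as soon as B has any row, none of A's rows are shown.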
That is easier than it seems at first. You'll have to join both tables on CategoryID and then conditionally select the right item like this:
SELECT tA.CategoryID, IIF(tB.CategoryID IS NULL, tA.txtEntry, tB.txtEntry) AS EntryText,
tB.CategoryID IS NULL AS bOriginalEntry
FROM TableA AS tA LEFT JOIN TableB AS tB ON tA.CategoryID=tB.CategoryID
However, there is one caveat: if TableB is empty then the join produces an empty set! Just populate TableB with at least one record (preferably one with an invalid CategoryID so it won't join with a valid record in TableA).
The bOriginalEntry is just a boolean expression to show whether the EntryText stems from TableA or TableB.
I found this thread searching for a similar problem. Note to self and others.
You can use the Join Types to cope with potential different values in a conditional select,
MS Access doesn't have the full range of JOIN that MS SQL has, but you can "fudge" it.
eg
Full outer joins: all the data, combined where feasible
In some systems, an outer join can include all rows from both tables, with rows combined when they correspond. This is called a full outer join, and Access doesn’t explicitly support them. However, you can use a cross join and criteria to achieve the same effect.
https://support.office.com/en-us/article/join-tables-and-queries-3f5838bd-24a0-4832-9bc1-07061a1478f6#typesofjoins