I am pretty new to SQL. Here is an operation I am sure is simple for a lot of you. I am trying to join two tables across databases on the same server – dbB and dbA, and TableA (with IdA) and TableB (with IdB) respectively. But before doing that I want to transform column IdA into a number, where I would like to remove the “:XYZ” character from its values and add a where statement for another column in dbA too. Below I show my code for the join but I am not sure how to convert the values of the column. This allows me to match idAwith idB in the join. Thanks a ton in advance.
Select replace(idA, “:XYZ”, "")
from dbA.TableA guid
where event like “%2015”
left join dbB.TableB own
on guid.idA = own.idB
Few things
FROM, Joins, WHERE (unless you use subqueries) syntax order it's also the order of execution (notice select isn't listed as its near the end in order of operation but first syntactically!)
alias/fully qualify columns when multiple tables are involved so we know what field comes from what table.
order of operations has the SQL doing the from and JOINS 1st thus what you do in the select isn't available (not in scope yet) for the compiler, this is why you can't use select column aliases in the from, where or even group by as well.
I don't like Select * usually but as I don't know what columns you really need... I used it here.
As far as where before the join. most SQL compilers anymore use cost based optimization and figure out the best execution plan given your data tables and what not involved. So just put the limiting criteria in the where in this case since it's limiting the left table of the left join. If you needed to limit data on the right table of a left join, you'd put the limit on the join criteria; thus allowing it to filter as it joins.
probably need to cast IDA as integer (or to the same type as IDB) I used trim to eliminate spaces but if there are other non-display characters, you'd have issues with the left join matching)
.
SELECT guild.*, own.*
FROM dbA.TableA guid
LEFT JOIN dbB.TableB own
on cast(trim(replace(guid.idA, ':XYZ', '')) as int) = own.idB
WHERE guid.event like '%2015'
Or materialize the transformation first by using a subquery so IDA in its transformed state before the join (like algebra ()'s matter and get processed inside out)
SELECT *
FROM (SELECT cast(trim(replace(guid.idA, ':XYZ', '')) as int) as idA
FROM dbA.TableA guid
WHERE guid.event like '%2015') B
LEFT JOIN dbB.TableB own
on B.IDA = own.idB
Related
Foreword: I am afraid that I'm asking to do something that can't be achieved. But I know that, if there is a way to do it, here someone will know :)
Suppose I have a MySQL database (or a MariaDB one, if compatibility can be retained). I need to select from TableA and join TableB based on a certain column having a certain value X; it is possible that no suitable row exists in TableB, which means I shall use a LEFT OUTER JOIN. However, in case no row is found (no row of TableB has the value X in that column), then I wish to use a row from TableB with value Y. In other words, Y should act as a fallback condition for the JOIN.
Here is my concrete example. Suppose I handle data localization by splitting an entity (say, a product) between two tables, one with the localizable data and the other with the fixed data. When I want to get the product info in French, I wish to fallback to English if no information is available in French. However, if I do
// basic stub of the query for demonstration purposes
SELECT * FROM Products p
LEFT OUTER JOIN ProductsLocalization pLoc
ON pLoc.ProductID = p.ID AND pLoc.Culture = 'fr-FR'
I will get NULL when no French info is available. Instead, only for those rows, I wish to have the result of the query
// basic stub of the query for demonstration purposes
SELECT * FROM Products p
LEFT OUTER JOIN ProductsLocalization pLoc
ON pLoc.ProductID = p.ID AND pLoc.Culture = 'en-US'
But the latter query is not acceptable, of course.
I know I could use an OR to join based on the Culture being 'fr-FR' or 'en-US', but this would potentially double the number of selected rows, and it would require me to post-process the resulting dataset. I wish to know if there is a way to avoid this post-processing.
Non-duplicate disclaimer: my question is similar to many others, but there's a crucial difference. I need to use a fallback value for the JOIN condition itself, not as the result of the query. My default doesn't end up in the dataset, but the dataset is computed based on that default.
You can join twice and use COALESCE() to get the value you want:
SELECT p.product_desc,
COALESCE(pLoc_fr.value, pLoc_en.value)
FROM Products p
LEFT OUTER JOIN ProductsLocalization pLoc_us ON
pLoc_us.ProductID = p.ID
AND pLoc_us.Culture = 'en-US'
LEFT OUTER JOIN ProductsLocalization pLoc_fr ON
pLoc_fr.ProductID = p.ID
AND pLoc_fr.Culture = 'fr-FR';
Also, if you can imagine a way of combining data sets, you can probably achieve using plain-ol SQL. Some edge cases require dynamically generating a SQL statement, but those aren't terribly common. In this case, you could also use two correlated subqueries inside of a COALESCE or a CASE or an IF statement inside the SELECT. There's almost always a few ways to skin the cat.
I have two tables:
Invariant (UniqueID, characteristic1, characteristic2)
Variant (VariantID, UniqueID, specification1, specification2)
Each project has its own unchanging characteristics between implementations. Each implementation also has its own individual properties.
So, I use queries like this to find projects with the given characteristics and specifications:
SELECT *
FROM `Invariants`
LEFT JOIN (`Variants`) ON (`Variants`.`UniqueID`=`Invariants`.`UniqueID`)
WHERE char2='y' and spec1='x'
GROUP BY `Invariant`.`UniqueID`;
I'm looking for a query that will return all projects that have never satisfied a given specification. So, if one of project 100's variants had spec1='bad', then I don't want project 100 to be included, regardless if it had variants where spec1='good'.
select *
from Invariants iv
where not exists (
select 1
from Variants v
where v.UniqueId = iv.UniqueId and v.spec1 = 'bad'
)
The queries below do not address your question, I probably read to fast and thought you wanted to pick up only the invariant properties of a particular type. But I will note that you shouldn't use a left join and then filter, in the where clause, against columns from the right table (except for checking nulls). People make that mistake all the time and that's what jumped out to me at first glance.
The whole purpose of a left join is that some of the rows will not match and will thus have filler null values in the columns for the right-hand table. This join logic happens first and then after that the where clause is applied. When you have a condition like where spec1 = 'x' it will always evaluate to false against a null value. So you end up eliminating all the rows you wanted to keep.
This happens a lot with these invariant/custom values tables. You're only interested in one of the properties but if you don't filter prior to joining or inside the join condition, you end up dropping rows because the value didn't exist and you didn't have a value left to compare once it tried to apply a where-clause condition on the property name.
Hope that made sense. See below for examples:
select iv.UniqueId, ...
from
Invariants iv left outer join
Variants
on v.UniqueId = vi.UniqueId and v.spec1 = 'x'
or
select iv.UniqueId, ...
from
Invariants iv left outer join
(
select
from Variants
where spec1 = 'x'
) v
on v.UniqueId = vi.UniqueId
Running an INNER JOIN type of query, i get duplicate column names, which can pose a problem. This has been covered here extensively and i was able to find the solution to this problem, asides from it being fairly logical, by SELECTing only the columns i need.
However, i would like to know how i could run such a query without actually returning any of the columns from the joined table.
This is my MySQL query
SELECT * FROM product z
INNER JOIN crosslink__productXmanufacturer a
ON z.id = a.productId
WHERE
(z.title LIKE "%search_term%" OR z.search_keywords LIKE "%search_term%")
AND
z.availability = 1
AND
a.manufacturerId IN (22,23,24)
Question
How would i modify this MySQL query in order to return only columns from product and none of the columns from crosslink__productXmanufacturer?
Add the table name to the *. Replace
SELECT * FROM product z
with
SELECT z.* FROM product z
Often when you are doing this, the intention may be clearer using in or exists rather than a join. The join is being used for filtering, so putting the condition in the where clause makes sense:
SELECT p.*
FROM product p
WHERE (p.title LIKE '%search_term%' OR p.search_keywords LIKE '%search_term%') AND
p.availability = 1 AND
exists (SELECT 1
FROM pXm
WHERE pXm.productId = p.id AND pxm.manufacturerId IN (22, 23, 24)
);
With the proper indexes, this should run at least as fast as the join version (the index is crosslink__productXmanufacturer(productId, manufacturerId). In addition, you don't have to worry about returning duplicate records, if there are multiple matches in crosslink__productXmanufacturer.
You may notice two other small changes I made to the query. First, the table aliases are abbreviates for the table names, making the logic easier to follow. Second, the string constants use single quotes (the ANSI standard) rather than double quotes. Using single quotes only for string and date constants helps prevent inadvertent syntax errors.
Let's say I have about 25,000 records in two tables and the data in each should be the same. If I need to find any rows that are in table A but NOT in table B, what's the most efficient way to do this.
We've tried it as a subquery of one table and a NOT IN the result but this runs for over 10 minutes and almost crashes our site.
There must be a better way. Maybe a JOIN?
Hope LEFT OUTER JOIN will do the job
select t1.similar_ID
, case when t2.similar_ID is not null then 1 else 0 end as row_exists
from table1 t1
left outer join (select distinct similar_ID from table2) t2
on t1.similar_ID = t2.similar_ID // your WHERE goes here
I would suggest you read the following blog post, which goes into great detail on this question:
Which method is best to select values present in one table but missing
in another one?
And after a thorough analysis, arrives at the following conclusion:
However, these three methods [NOT IN, NOT EXISTS, LEFT JOIN]
generate three different plans which are executed by three different
pieces of code. The code that executes EXISTS predicate is about 30%
less efficient than those that execute index_subquery and LEFT JOIN
optimized to use Not exists method.
That’s why the best way to search for missing values in MySQL is using a LEFT JOIN / IS NULL or NOT IN rather than NOT
EXISTS.
If the performance you're seeing with NOT IN is not satisfactory, you won't improve this performance by switching to a LEFT JOIN / IS NULL or NOT EXISTS, and instead you'll need to take a different route to optimizing this query, such as adding indexes.
Use exixts and not exists function instead
Select * from A where not exists(select * from B);
Left join. From the mysql documentation
If there is no matching row for the right table in the ON or USING
part in a LEFT JOIN, a row with all columns set to NULL is used for
the right table. You can use this fact to find rows in a table that
have no counterpart in another table:
SELECT left_tbl.* FROM left_tbl LEFT JOIN right_tbl ON left_tbl.id =
right_tbl.id WHERE right_tbl.id IS NULL;
This example finds all rows in left_tbl with an id value that is not
present in right_tbl (that is, all rows in left_tbl with no
corresponding row in right_tbl).
I have a table called faq. This table consists from fields faq_id,faq_subject.
I have another table called article which consists of article_id,ticket_id,a_body and which stores articles in a specific ticket. Naturally there is also a table "ticket" with fields ticket_id,ticket_number.
I want to retrieve a result table in format:
ticket_number,faq_id,faq_subject.
In order to do this I need to search for faq_id in the article.a_body field using %LIKE% statement.
My question is, how can I do this dynamically such that I return with SQL one result table, which is in format ticket_number,faq_id,faq_subject.
I tried multiple configurations of UNION ALL, LEFT JOIN, LEFT OUTER JOIN statements, but they all return either too many rows, or have different problems.
Is this even possible with MySQL, and is it possible to write an SQL statement which includes #variables and can take care of this?
First off, that kind of a design is problematic. You have certain data embedded within another column, which is going to cause logic as well as performance problems (since you can't index the a_body in such a way that it will help the JOIN). If this is a one-time thing then that's one issue, but otherwise you're going to have problems with this design.
Second, consider this example: You're searching for faq_id #123. You have an article that includes faq_id 4123. You're going to end up with a false match there. You can embed the faq_id values in the text with some sort of mark-up (for example, [faq_id:123]), but at that point you might as well be saving them off in another table as well.
The following query should work (I think that MySQL supports CAST, if not then you might need to adjust that).
SELECT
T.ticket_number,
F.faq_id,
F.faq_subject
FROM
Articles A
INNER JOIN FAQs F ON
A.a_body LIKE CONCAT('%', F.faq_id, '%')
INNER JOIN Tickets T ON
T.ticket_id = A.ticket_id
EDIT: Corrected to use CONCAT
SELECT DISTINCT t.ticket_number, f.faq_id, f.faq_subject
FROM faq.f
INNER JOIN article a ON (a.a_body RLIKE CONCAT('faq_id: ',faq_id))
INNER JOIN ticket t ON (t.ticket_id = a.ticket_id)
WHERE somecriteria