Joining MySQL Tables dynamically - mysql

I am writing a MySQL stored procedure which has to join a "base" table with other tables.
Which other table to join depends on a field in the base table.
This is the base table:
ID - Value - JoinValue - TableToJoin - FieldToJoin
---------------------------------------------------
1 - Test - aa - tbl_test1 - test1field
2 - Test2 - bb - tbl_test2 - test2field
As output, I would like to have:
ID - Value - ValueFromOtherTable
----------------------------------------------
aa - Test - ValueFromTBL_TEST1Field
bb - Test2 - ValueFromTBL_TEST2Field
Is this somehow possible?
Somewhat like this maybe?
SELECT
ID,
Value,
(SELECT #FieldToJoin FROM #TableToJoin AS t WHERE t.ID = #JoinValue) AS ValueFromOtherTable
FROM tbl_base;
Already tried some sort of JOIN, but unfortunately I could not find the answer.
Greetings, xola

This kind of design is usually a bad idea, and you cannot dynamically choose a table to join to in the way you are thinking, but it can be done in a crude, brute-force kind of way:
SELECT stuff
, COALESCE(t2.something, t3.something, ....) AS otherTsomething
FROM t1
LEFT JOIN t2 ON t1.TableToJoin = "t2"
AND t1.JoinValue = CASE t1.FieldToJoin WHEN "a" THEN t2.a WHEN "b" THEN t2.b .... END
LEFT JOIN t3 ON t1.TableToJoin = "t3"
AND t1.JoinValue = CASE t1.FieldToJoin WHEN "a" THEN t3.a WHEN "b" THEN t3.b .... END
;
Usually, the appropriate design for this kind of data is to have t2 and t3 reference t1, then it simply becomes...
SELECT stuff, COALESCE(t2.something, t3.something, ...) AS otherSomething
FROM t1
LEFT JOIN t2 ON t1.id = t2.t1_id
LEFT JOIN t3 ON t1.id = t3.t1_id
;
(For your purposes, this presumes to some degree that only one t2 or t3 will be associated with a t1 record; but that would be hard to enforce, and in general use doesn't really present a problem.)
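To make the brute-force variant concrete, here is a minimal sketch on the question's sample rows. It assumes the question's own reading of the columns (the row in the other table is found via its ID = JoinValue, and FieldToJoin names the column to return), and it uses Python's built-in sqlite3 as a stand-in for MySQL. The contents of tbl_test1 and tbl_test2 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_base (ID INTEGER, Value TEXT, JoinValue TEXT,
                       TableToJoin TEXT, FieldToJoin TEXT);
CREATE TABLE tbl_test1 (ID TEXT, test1field TEXT);  -- hypothetical layout
CREATE TABLE tbl_test2 (ID TEXT, test2field TEXT);  -- hypothetical layout
INSERT INTO tbl_base VALUES
  (1, 'Test',  'aa', 'tbl_test1', 'test1field'),
  (2, 'Test2', 'bb', 'tbl_test2', 'test2field');
INSERT INTO tbl_test1 VALUES ('aa', 'value from tbl_test1'),
                             ('bb', 'decoy in tbl_test1');
INSERT INTO tbl_test2 VALUES ('bb', 'value from tbl_test2');
""")

# One LEFT JOIN per candidate table; the TableToJoin test in each ON clause
# ensures that at most one of them matches for a given base row.
rows = conn.execute("""
SELECT b.ID, b.Value,
       COALESCE(t1.test1field, t2.test2field) AS ValueFromOtherTable
FROM tbl_base AS b
LEFT JOIN tbl_test1 AS t1
       ON b.TableToJoin = 'tbl_test1' AND t1.ID = b.JoinValue
LEFT JOIN tbl_test2 AS t2
       ON b.TableToJoin = 'tbl_test2' AND t2.ID = b.JoinValue
ORDER BY b.ID
""").fetchall()

for row in rows:
    print(row)
```

Every table that can appear in TableToJoin needs its own LEFT JOIN and its own COALESCE argument, which is why this only scales to a small, fixed set of tables.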

Related

MySQL find rows not in other table

I am a bit stunned that I am not able to produce the results I want, so I ask you experts for help!
I have three Tables showing only the important parts:
T1: (List of all names and schedules assigned)
Name | ScheduleName
T2: (all possible schedule names)
ScheduleName
T3: (List of all names)
Name
I try to find the ScheduleName per Person that was not assigned. In the end I would like to see:
Name1 | ScheduleName1
Name1 | ScheduleName2
Name1 | ScheduleName3
Name2 | ScheduleName2
What I tried:
SELECT
T2.ScheduleName
FROM
T2
WHERE
T2.ScheduleName NOT IN
(SELECT T1.ScheduleName FROM T1 WHERE T1.Name = "Name1")
This gives me the not assigned schedules for Person Name1. Works.
If I remove the WHERE statement within the parenthesis it returns no rows. (Because it matches it against the whole dataset)
Help very much appreciated
Thanks in advance
EDIT: Tried to work with a LEFT JOIN but it is not returning rows with "NULL" although it should
Here is one option, using a cross join:
SELECT
T3.Name,
T2.ScheduleName
FROM T2
CROSS JOIN T3
LEFT JOIN T1
ON T2.ScheduleName = T1.ScheduleName AND T3.Name = T1.Name
WHERE
T1.Name IS NULL;
The idea behind the cross join is that it generates all possible pairings of names and schedules. That is, it represents the entire set of all pairings. Then, we left join this to the T1 assignment table to find pairings which were never made.
We could also have achieved the same thing using EXISTS logic:
SELECT
T3.Name,
T2.ScheduleName
FROM T2
CROSS JOIN T3
WHERE NOT EXISTS (SELECT 1 FROM T1
WHERE T2.ScheduleName = T1.ScheduleName AND T3.Name = T1.Name);
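Both variants can be tried on a small data set. The sketch below uses Python's built-in sqlite3 as a stand-in for MySQL; the rows are made up so that the result matches the output the question asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T1 (Name TEXT, ScheduleName TEXT);  -- assignments
CREATE TABLE T2 (ScheduleName TEXT);             -- all schedules
CREATE TABLE T3 (Name TEXT);                     -- all names
INSERT INTO T2 VALUES ('ScheduleName1'), ('ScheduleName2'),
                      ('ScheduleName3'), ('ScheduleName4');
INSERT INTO T3 VALUES ('Name1'), ('Name2');
INSERT INTO T1 VALUES ('Name1', 'ScheduleName4'),
                      ('Name2', 'ScheduleName1'),
                      ('Name2', 'ScheduleName3'),
                      ('Name2', 'ScheduleName4');
""")

# Variant 1: all pairings, then keep those with no assignment row.
left_join_rows = conn.execute("""
SELECT T3.Name, T2.ScheduleName
FROM T2
CROSS JOIN T3
LEFT JOIN T1
       ON T2.ScheduleName = T1.ScheduleName AND T3.Name = T1.Name
WHERE T1.Name IS NULL
ORDER BY T3.Name, T2.ScheduleName
""").fetchall()

# Variant 2: the same pairings, filtered with NOT EXISTS.
not_exists_rows = conn.execute("""
SELECT T3.Name, T2.ScheduleName
FROM T2
CROSS JOIN T3
WHERE NOT EXISTS (SELECT 1 FROM T1
                  WHERE T2.ScheduleName = T1.ScheduleName
                    AND T3.Name = T1.Name)
ORDER BY T3.Name, T2.ScheduleName
""").fetchall()

for row in left_join_rows:
    print(row)
```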

MySQL: how to ignore that column is ambiguous

I have a mysql query like this:
SELECT * from T1
INNER JOIN T2 ON T2.t1_id = T1.id
INNER JOIN T3 ON T3.t1_id = T1.id
where T3.my_column = my_column
I get the Column 'my_column' in where clause is ambiguous error, because all 3 tables have the my_column column.
What I can NOT do is write where T3.my_column = T1.my_column: I'm using an ORM, and I have to set up the condition in the association definition, where I do not know whether there will be a T1 or a T2 table.
The actual value of my_column will be the same in T1 and T2, which means my condition would be ok for both T1.my_column and T2.my_column.
Is there any way in MySQL to say do not care about ambiguous columns, just take one of them randomly ?
What I'd like to do is something like this:
SELECT * from T1
INNER JOIN T2 ON T2.t1_id = T1.id
INNER JOIN T3 ON T3.t1_id = T1.id
where T3.my_column = IGNORE_AMBIGUITY(my_column)
The actual problem
I'm defining an n:m relationship in sequelize, which is an ORM for node.js.
All my tables have an id and a company_id, the id is not unique, only the id-company_id pairs are unique.
This is how I define the association:
{
through: {
model: 'T1_to_T2',
scope: {
company_id: {$col: 'company_id'}, // Problematic part
}
},
foreignKey: 'T2_id'
}
My problem is that through the T1_to_T2 relationship table, for company_id=X1, I'll get the relations of X2 or X3 companies because the query doesn't care about the company_id field in the relationship table.
If you don't know if it would be in T2 or T3, you have a problem. What is your expected behavior? If you are trying to join one or the other then how do you know what you are filtering?
I think you need to rethink your problem first and define what you want from your query before deciding how to solve it. You get the error because MySQL cannot determine what you want.
If you really want one arbitrarily, instead do:
WHERE t3.my_column = t1.my_column OR t3.my_column = t2.my_column LIMIT 1
That is not random but it is arbitrary.
Try this:
SELECT * from T1
INNER JOIN T2 ON T2.t1_id = T1.id
INNER JOIN T3 ON T3.t1_id = T1.id
where T3.my_column = (select my_column from T1)
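For reference, the ambiguity itself is easy to reproduce. The sketch below uses Python's built-in sqlite3 as a stand-in for MySQL (the wording of the error differs between the two engines, but both reject the unqualified name), and the table contents are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T1 (id INTEGER, my_column TEXT);
CREATE TABLE T2 (t1_id INTEGER, my_column TEXT);
CREATE TABLE T3 (t1_id INTEGER, my_column TEXT);
INSERT INTO T1 VALUES (1, 'X1'), (2, 'X2');
INSERT INTO T2 VALUES (1, 'X1'), (2, 'X2');
INSERT INTO T3 VALUES (1, 'X1'), (2, 'X9');
""")

join = ("SELECT T1.id FROM T1 "
        "INNER JOIN T2 ON T2.t1_id = T1.id "
        "INNER JOIN T3 ON T3.t1_id = T1.id ")

# Unqualified: the engine cannot tell which table's my_column is meant.
try:
    conn.execute(join + "WHERE T3.my_column = my_column")
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

# Qualified: naming any one table resolves the ambiguity.
qualified_rows = conn.execute(
    join + "WHERE T3.my_column = T1.my_column ORDER BY T1.id"
).fetchall()

print(error)
print(qualified_rows)
```

As far as I know there is no switch to make the engine pick one of the candidate columns on its own, so the qualification has to come from whatever generates the query.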

5 tables, scan first to find single match in 1st relational pair or 2nd relational pair without full table scans

I am still learning mysql and am not even sure how to phrase this to find the answer in a search.
I have 5 tables (actually more, but for the example, 5 suffices), one is the main table, T, and then we have T1 and T2 and their respective relational tables T1_x_T and T2_x_T. I need to go through every row in T to find if there is a match in either T1 or T2 with a given attribute, it only needs to match once, but can have multiple matches. Table structure is something like:
T.id
T1.id T1.attrib
T2.id T2.attrib
T1_x_T.T1_id, T1_x_T.T_id
T2_x_T.T2_id, T2_x_T.T_id
If the entry in T has a match in either T1 or T2 on that attrib something like:
(T.id = T1_x_T.T_id and T1.id = T1_x_T.T1_id and T1.attrib = SOMEVAL) or (T.id = T2_x_T.T_id and T2.id = T2_x_T.T2_id and T2.attrib = SOMEVAL)
I.e., as soon as it finds a match for T, move on to the next row in T and don't scan the rest of the table nor move to the next table. Basically to answer the question: "For each id in T, is there any match in T1_x_T or T2_x_T where the corresponding T1 or T2 value matches a given value for attrib?"
So the result would be a subset of table T.
My initial intuition is to use LEFT INNER JOIN, LIMIT and GROUP BY to achieve this, but I don't know enough about either (or mysql) to know how to accomplish this or if it is accomplishable. I do know how to do this in what I assume is the inefficient way (full table scans for both?) or in two queries and then parse the results outside of mysql, but I want to learn how to build nice efficient queries.
Sample data, as requested, for query where attrib = 1:
T.id:
i1
i2
i3
T1.id - T1.attrib:
a - 1
b - 0
T1_x_T.T1_id - T1_x_T.T_id:
a - i1
b - i1
b - i2
T2.id - T2.attrib:
y - 0
z - 1
T2_x_T.T2_id - T2_x_T.T_id:
z - i3
y - i2
Results in:
i1
i3
Since T1.id = a has T1.attrib = 1 and T1_x_T.T1_id = a has entry with T1_x_T.T_id = i1; and T2.id = z has T2.attrib = 1 and T2_x_T.T2_id = z has entry with T2_x_T.T_id = i3.
Hope that helps explain a bit.
Try this:
SELECT DISTINCT
T.id as T_id
FROM T
LEFT JOIN T1_x_T ON T.id= T1_x_T.T_id
LEFT JOIN T1 ON T1.id = T1_x_T.T1_id
LEFT JOIN T2_x_T ON T.id= T2_x_T.T_id
LEFT JOIN T2 ON T2.id = T2_x_T.T2_id
WHERE T1.attrib = '1' OR T2.attrib = '1';
Well this maps your question:
"For each id in T, is there any match in T1_x_T or T2_x_T where the
corresponding T1 or T2 value matches a given value for attrib?"
and provides your expected result in the example.
Just to clarify how things work.
A LEFT JOIN combines rows according to its ON clause, such as T.id = T1_x_T.T_id. If n rows of T and m rows of T1_x_T satisfy the ON clause for the same value, it produces an m x n result with all the possible combinations; rows of T without any match are kept once, with NULLs in the joined columns.
So here is the result of the joins in your case:
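The intermediate join result can be reproduced from the question's sample data. This is a minimal sketch using Python's built-in sqlite3 as a stand-in for MySQL; Python prints SQL NULL as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T (id TEXT);
CREATE TABLE T1 (id TEXT, attrib INTEGER);
CREATE TABLE T2 (id TEXT, attrib INTEGER);
CREATE TABLE T1_x_T (T1_id TEXT, T_id TEXT);
CREATE TABLE T2_x_T (T2_id TEXT, T_id TEXT);
INSERT INTO T VALUES ('i1'), ('i2'), ('i3');
INSERT INTO T1 VALUES ('a', 1), ('b', 0);
INSERT INTO T1_x_T VALUES ('a', 'i1'), ('b', 'i1'), ('b', 'i2');
INSERT INTO T2 VALUES ('y', 0), ('z', 1);
INSERT INTO T2_x_T VALUES ('z', 'i3'), ('y', 'i2');
""")

# The four LEFT JOINs without any WHERE clause: the "extended table"
# that the WHERE condition is later applied to.
joined = conn.execute("""
SELECT T.id, T1.id, T1.attrib, T2.id, T2.attrib
FROM T
LEFT JOIN T1_x_T ON T.id = T1_x_T.T_id
LEFT JOIN T1 ON T1.id = T1_x_T.T1_id
LEFT JOIN T2_x_T ON T.id = T2_x_T.T_id
LEFT JOIN T2 ON T2.id = T2_x_T.T2_id
ORDER BY T.id, T1.id
""").fetchall()

# Columns: T.id, T1.id, T1.attrib, T2.id, T2.attrib
for row in joined:
    print(row)
```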
Where you see NULL there is no match, which is the closest thing to the short circuit you mean: no match, so no value.
The WHERE or GROUP BY clause then acts on this extended table produced by the JOINs.
By the way, when you are writing a complex join it can be a good idea to look at the complete join result first, to check that you are doing it right and to choose the appropriate conditions for the desired result.
Regards
I would suggest indeed the use of INNER JOIN, but combined with UNION:
SELECT T.id
FROM T
INNER JOIN T1_x_T
ON T1_x_T.T_id = T.id
INNER JOIN T1
ON T1.id = T1_x_T.T1_id
WHERE T1.attrib = 1
UNION
SELECT T.id
FROM T
INNER JOIN T2_x_T
ON T2_x_T.T_id = T.id
INNER JOIN T2
ON T2.id = T2_x_T.T2_id
WHERE T2.attrib = 1
As your condition concerns the columns of the joined tables, you should not use outer joins like LEFT JOIN in this case. Although the output would be the same, LEFT JOIN can be more expensive in terms of performance.
The UNION clause will also make sure you don't get duplicates.
Also, if you are only interested in the id value of table T, then you don't need to include that table at all in the query, and this would be better:
SELECT T1_x_T.T_id
FROM T1_x_T
INNER JOIN T1
ON T1.id = T1_x_T.T1_id
WHERE T1.attrib = 1
UNION
SELECT T2_x_T.T_id
FROM T2_x_T
INNER JOIN T2
ON T2.id = T2_x_T.T2_id
WHERE T2.attrib = 1
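A minimal sketch that runs this shorter UNION query on the question's sample data, using Python's built-in sqlite3 as a stand-in for MySQL. The ORDER BY is only added to make the printed order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T (id TEXT);
CREATE TABLE T1 (id TEXT, attrib INTEGER);
CREATE TABLE T2 (id TEXT, attrib INTEGER);
CREATE TABLE T1_x_T (T1_id TEXT, T_id TEXT);
CREATE TABLE T2_x_T (T2_id TEXT, T_id TEXT);
INSERT INTO T VALUES ('i1'), ('i2'), ('i3');
INSERT INTO T1 VALUES ('a', 1), ('b', 0);
INSERT INTO T1_x_T VALUES ('a', 'i1'), ('b', 'i1'), ('b', 'i2');
INSERT INTO T2 VALUES ('y', 0), ('z', 1);
INSERT INTO T2_x_T VALUES ('z', 'i3'), ('y', 'i2');
""")

# Each branch finds the T ids linked to a row with attrib = 1;
# UNION (as opposed to UNION ALL) removes duplicate ids.
rows = conn.execute("""
SELECT T1_x_T.T_id
FROM T1_x_T
INNER JOIN T1 ON T1.id = T1_x_T.T1_id
WHERE T1.attrib = 1
UNION
SELECT T2_x_T.T_id
FROM T2_x_T
INNER JOIN T2 ON T2.id = T2_x_T.T2_id
WHERE T2.attrib = 1
ORDER BY 1
""").fetchall()

print(rows)
```

Note that this version trusts the link tables: an id that appears in T1_x_T or T2_x_T but not in T would still be returned.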
You might also compare the performance with this alternative, which performs sub queries. One might expect it will skip the second one if the first one gives a match, but as this may differ per id value, there really is no gain: both sub queries will be executed first before the matchings with the id values are done. There a short-circuit will take place, but only for the comparison with the already generated result sets:
SELECT id
FROM T
WHERE id IN (
SELECT T1_x_T.T_id
FROM T1_x_T
INNER JOIN T1
ON T1.id = T1_x_T.T1_id
WHERE T1.attrib = 1)
OR id IN (
SELECT T2_x_T.T_id
FROM T2_x_T
INNER JOIN T2
ON T2.id = T2_x_T.T2_id
WHERE T2.attrib = 1)
One could potentially force the short-circuit with correlated sub queries, but then such sub query has to be executed again and again for each id. And even though in some cases it would not have to repeat that for the second sub query, the loss in performance, due to the repeated executions for different id values, will be much greater than the gain from the short-circuit evaluation. Also the execution plan might see an optimisation and thus not follow the procedure I just described:
SELECT id
FROM T
WHERE EXISTS (
SELECT 1
FROM T1_x_T
INNER JOIN T1
ON T1.id = T1_x_T.T1_id
WHERE T1.attrib = 1
AND T1_x_T.T_id = T.id)
OR EXISTS (
SELECT 1
FROM T2_x_T
INNER JOIN T2
ON T2.id = T2_x_T.T2_id
WHERE T2.attrib = 1
AND T2_x_T.T_id = T.id)
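A minimal sketch of the correlated EXISTS form on the question's sample data, using Python's built-in sqlite3 as a stand-in for MySQL. The sub queries here are correlated through the link tables' T_id column, since that is the column that holds the ids of T in the question's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T (id TEXT);
CREATE TABLE T1 (id TEXT, attrib INTEGER);
CREATE TABLE T2 (id TEXT, attrib INTEGER);
CREATE TABLE T1_x_T (T1_id TEXT, T_id TEXT);
CREATE TABLE T2_x_T (T2_id TEXT, T_id TEXT);
INSERT INTO T VALUES ('i1'), ('i2'), ('i3');
INSERT INTO T1 VALUES ('a', 1), ('b', 0);
INSERT INTO T1_x_T VALUES ('a', 'i1'), ('b', 'i1'), ('b', 'i2');
INSERT INTO T2 VALUES ('y', 0), ('z', 1);
INSERT INTO T2_x_T VALUES ('z', 'i3'), ('y', 'i2');
""")

# For each row of T, ask whether a linked T1 or T2 row has attrib = 1.
rows = conn.execute("""
SELECT id
FROM T
WHERE EXISTS (
        SELECT 1
        FROM T1_x_T
        INNER JOIN T1 ON T1.id = T1_x_T.T1_id
        WHERE T1.attrib = 1
          AND T1_x_T.T_id = T.id)
   OR EXISTS (
        SELECT 1
        FROM T2_x_T
        INNER JOIN T2 ON T2.id = T2_x_T.T2_id
        WHERE T2.attrib = 1
          AND T2_x_T.T_id = T.id)
ORDER BY id
""").fetchall()

print(rows)
```

Whether this is faster or slower than the UNION form depends on the optimizer and the indexes, so it is worth checking the execution plan on real data.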

MySQL: How to Remove One Row of a Multi-Row Record Based on Column

If I have two tables that I'm joining and I write the most simple query possible like this:
SELECT *
FROM t1
LEFT JOIN t2 ON t1.id = t2.id
There are a few records who have multiple rows per ID because they have multiple employers, so t1 looks like this:
ID Name Employer
12345 Jerry Comedy Cellar
12345 Jerry NBC
12348 Elaine Pendant Publishing
12346 George Real Estate
12346 George Yankees
12346 George NBC
12347 Kramer Kramerica Industries
t2 is linked with the similar IDs but with some activities that I'd like to see -- hence the SELECT * above. Though I don't want multiple rows to return if the Employer column is "NBC" -- but everything else is good.
The only other thing that matters here is that t2 is smaller than t1, because t1 is everybody and t2 are only from people who did particular activities -- so some of the matches won't return anything from t2, but I would still like them to be returned, hence the LEFT JOIN.
If I write the query like this:
SELECT *
FROM t1
LEFT JOIN t2 ON t1.id = t2.id
WHERE Employer <> "NBC"
Then it removes Jerry and George completely -- when really all I want is for the NBC row to not be returned, but to return any other rows that are associated with them.
How can I write the query while joining t1 with t2 to return each row except for the NBC ones? The ideal output would be all of the rows from t1 regardless if they match up with all of t2 except removing all of the rows with "NBC" as the employer in the return file. Basically the ideal here is to return the JOINs where they fit, but regardless remove the entire row for anybody with "NBC" as employer without removing their other rows.
The more I write about it, it seems like I should potentially just run a query prior to my JOIN to delete all the rows in t1 who have "NBC" as their employer and then run the normal query.
Basic subset filtering
You can filter either of the two merged (joined) subsets by extending the ON clause.
SELECT *
FROM t1
LEFT JOIN t2
ON t1.ID = t2.ID
AND t2.Employer != 'NBC'
If you get null values now, and you don't want them, you'd add:
WHERE t2.Employer IS NOT NULL
Extended logic:
SELECT *
FROM t1
LEFT JOIN t2
ON (t1.ID = t2.ID AND t2.Employer != 'NBC')
OR (t1.ID = t2.ID AND t2.Employer IS NULL)
Using UNION
Basically, JOIN is for horizontal linking and UNION does vertical linking of datasets.
This merges two result sets: the first without NBC, and the second (which is basically an OUTER JOIN) adds everyone in t1 who is not part of t2.
SELECT *
FROM t1
LEFT JOIN t2
ON t1.ID = t2.ID
AND t2.Employer != 'NBC'
UNION
SELECT *
FROM t1
LEFT JOIN t2
ON t1.ID = t2.ID
AND t2.Employer IS NULL
String manipulation in the resultset
If you just want to remove NBC as a string, here is a workaround:
SELECT
t1.*,
IF (t2.Employer = 'NBC', NULL, t2.Employer) AS Employer
FROM t1
LEFT JOIN t2
ON t1.id = t2.id
This basically replaces "NBC" with NULL.
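For comparison, here is a minimal sketch that takes the question's sample literally: it assumes Employer is a column of t1 (as in the sample rows) and gives t2 a hypothetical activity column. It uses Python's built-in sqlite3 as a stand-in for MySQL. In that layout, a WHERE filter on the left table drops only the NBC rows and keeps every other row, matched or not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (id INTEGER, Name TEXT, Employer TEXT);
CREATE TABLE t2 (id INTEGER, activity TEXT);  -- hypothetical column
INSERT INTO t1 VALUES
  (12345, 'Jerry',  'Comedy Cellar'),
  (12345, 'Jerry',  'NBC'),
  (12348, 'Elaine', 'Pendant Publishing'),
  (12346, 'George', 'Real Estate'),
  (12346, 'George', 'Yankees'),
  (12346, 'George', 'NBC'),
  (12347, 'Kramer', 'Kramerica Industries');
INSERT INTO t2 VALUES (12345, 'standup'), (12346, 'parking');
""")

# The filter is on the LEFT table, so it removes individual t1 rows
# and does not interfere with keeping unmatched rows from the LEFT JOIN.
rows = conn.execute("""
SELECT t1.id, t1.Name, t1.Employer, t2.activity
FROM t1
LEFT JOIN t2 ON t1.id = t2.id
WHERE t1.Employer <> 'NBC'
ORDER BY t1.id, t1.Employer
""").fetchall()

for row in rows:
    print(row)
```

One caveat: a row whose Employer is NULL would also be dropped by the <> comparison, so such rows need an explicit OR t1.Employer IS NULL if they should be kept.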

Select from table1 where similar rows do NOT appear in table2?

I've been struggling with this for a while, and haven't been able to find any examples to point me in the right direction.
I have 2 MySQL tables that are virtually identical in structure. I'm trying to perform a query that returns results from Table 1 where the same data isn't present in table 2. For example, imagine both tables have 3 fields - fieldA, fieldB and fieldC. I need to exclude results where the data is identical in all 3 fields.
Is it even possible?
There are several ways to do it (assuming the fields don't allow NULLs):
SELECT a, b, c FROM Table1 T1 WHERE NOT EXISTS
(SELECT * FROM Table2 T2 WHERE T2.a = T1.a AND T2.b = T1.b AND T2.c = T1.c)
or
SELECT T1.a, T1.b, T1.c FROM Table1 T1
LEFT OUTER JOIN Table2 T2 ON T2.a = T1.a AND T2.b = T1.b AND T2.c = T1.c
WHERE T2.a IS NULL
select
t1.*
from
table1 t1
left join table2 t2 on
t1.fieldA = t2.fieldA and
t1.fieldB = t2.fieldB and
t1.fieldC = t2.fieldC
where
t2.fieldA is null
Note that this will not work as intended if any of the fields is NULL in both tables. The expression NULL = NULL evaluates to NULL rather than true, so such rows never match and are returned as if they were missing from table2.
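The NULL behaviour can be seen in a small sketch. It uses Python's built-in sqlite3 as a stand-in for MySQL, with made-up rows. SQLite's NULL-aware comparison is spelled IS, while in MySQL the equivalent operator is <=>, so the second query would need that substitution on MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (fieldA TEXT, fieldB TEXT, fieldC TEXT);
CREATE TABLE table2 (fieldA TEXT, fieldB TEXT, fieldC TEXT);
INSERT INTO table1 VALUES ('a', 'b', 'c'), ('x', NULL, 'z'),
                          ('only', 'in', 't1');
INSERT INTO table2 VALUES ('a', 'b', 'c'), ('x', NULL, 'z');
""")

QUERY = """
SELECT t1.fieldA, t1.fieldB, t1.fieldC
FROM table1 t1
LEFT JOIN table2 t2
       ON t1.fieldA {eq} t2.fieldA
      AND t1.fieldB {eq} t2.fieldB
      AND t1.fieldC {eq} t2.fieldC
WHERE t2.fieldA IS NULL
ORDER BY t1.fieldA
"""

# Plain "=": the ('x', NULL, 'z') row never matches its twin in table2.
plain = conn.execute(QUERY.format(eq="=")).fetchall()

# NULL-aware comparison (IS in SQLite, <=> in MySQL): the twin matches.
null_safe = conn.execute(QUERY.format(eq="IS")).fetchall()

print(plain)
print(null_safe)
```

The WHERE t2.fieldA IS NULL test still assumes that fieldA itself is never NULL in table2; otherwise a non-nullable column of table2 should be used for that check.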
This is a perfect use of EXCEPT (the keyword/phrase is "set difference"). However, MySQL lacks it (it was only added in MySQL 8.0.31). But no fear, a work-around is here:
Intersection and Set-Difference in MySQL (A workaround for EXCEPT)
Please note that approaches using NOT EXISTS in MySQL (as per above link) are actually less than ideal although they are semantically correct. For an explanation of the performance differences with the above (and alternative) approaches as handled by MySQL, complete with examples, see NOT IN vs. NOT EXISTS vs. LEFT JOIN / IS NULL: MySQL:
That’s why the best way to search for missing values in MySQL is using a LEFT JOIN / IS NULL or NOT IN rather than NOT EXISTS.
Happy coding.
A LEFT JOIN can be very slow in MySQL. The gifford algorithm shown below can be much faster; note that, as written, it compares fieldA only:
select * from t1
inner join
(select fieldA from
(select distinct fieldA, 1 as flag from t1
union all
select distinct fieldA, 2 as flag from t2) a
group by fieldA
having sum(flag) = 1) b on b.fieldA = t1.fieldA;
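To see what the flag trick does, here is a minimal sketch on made-up rows, using Python's built-in sqlite3 as a stand-in for MySQL. Each distinct fieldA value contributes 1 if it occurs in t1 and 2 if it occurs in t2, so a sum of exactly 1 means "present in t1 only". No timing is attempted here; any speed-up would have to be measured on real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (fieldA TEXT, fieldB TEXT);
CREATE TABLE t2 (fieldA TEXT, fieldB TEXT);
INSERT INTO t1 VALUES ('a', '1'), ('b', '2'), ('b', '3'), ('c', '4');
INSERT INTO t2 VALUES ('a', '9'), ('d', '9');
""")

# Flag sums per fieldA: a -> 1 + 2 = 3, b -> 1, c -> 1, d -> 2.
# DISTINCT keeps repeated values (like 'b') from being counted twice.
rows = conn.execute("""
SELECT t1.fieldA, t1.fieldB
FROM t1
INNER JOIN
  (SELECT fieldA FROM
     (SELECT DISTINCT fieldA, 1 AS flag FROM t1
      UNION ALL
      SELECT DISTINCT fieldA, 2 AS flag FROM t2) a
   GROUP BY fieldA
   HAVING SUM(flag) = 1) b ON b.fieldA = t1.fieldA
ORDER BY t1.fieldA, t1.fieldB
""").fetchall()

for row in rows:
    print(row)
```

To compare all three fields as the question asks, fieldB and fieldC would have to be added to both DISTINCT lists, the GROUP BY and the final join condition.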