Hello (apologies in advance for my English).
I have two tables, tbl1 and tbl2. Both have the same structure, but their data comes from different sources, so I can't merge them.
Another table, tbl3, holds dataid and datasource.
What I want is to select data from whichever table is named in datasource.
Pseudocode for what I'm trying to produce:
SELECT
tbl3.dataid,
tbl3.datasource,
SEL_TBL.important_data,
SEL_TBL.another_thing,
SEL_TBL.something_completly_different
FROM
tbl3
LEFT JOIN
( SWITCH tbl3.datasource
CASE 'tbl1':
tbl1 AS >> SEL_TBL
CASE 'tbl2':
tbl2 AS >> SEL_TBL
)
ON
SEL_TBL.dataid = tbl3.dataid
I need a result that contains important_data, another_thing and of course something_completly_different from the table selected in the "switch statement".
What works right now:
SELECT
tbl3.dataid,
tbl3.datasource,
(
CASE
WHEN tbl3.datasource ='tbl1'
THEN
tbl1.important_data
ELSE
tbl2.important_data
END
) important_data
FROM
tbl3
LEFT JOIN
tbl2
ON
tbl2.dataid = tbl3.dataid
LEFT JOIN
tbl1
ON
tbl1.dataid = tbl3.dataid
This query returns dataid, datasource and important_data. I could of course repeat the whole CASE block for every single field, but perhaps there is a more civilized method.
Oh, and one more thing: tbl1.dataid and tbl2.dataid can hold the same value (that's why I can't merge the tables).
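The best solution is to use left joins and put the datasource check into each join condition, so that every tbl3 row can match only the table it names.
COALESCE then picks the matching value for each field, because the join to the other table yields NULLs.
Since this is standard practice for star data models, SQL optimizers should handle it well.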
SELECT
tbl3.dataid,
tbl3.datasource,
COALESCE(tbl1.important_data, tbl2.important_data) important_data,
COALESCE(tbl1.another_thing, tbl2.another_thing) another_thing,
COALESCE(tbl1.something_completly_different, tbl2.something_completly_different) something_completly_different
FROM tbl3
LEFT JOIN tbl2 ON tbl2.dataid = tbl3.dataid AND tbl3.datasource = 'tbl2'
LEFT JOIN tbl1 ON tbl1.dataid = tbl3.dataid AND tbl3.datasource = 'tbl1'
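To see the conditional joins in action, here is a minimal sketch. It assumes SQLite (via Python's sqlite3 module) as a stand-in for MySQL, and the sample rows are invented. Note that dataid 1 exists in both source tables, which is exactly the situation the question worries about.

```python
import sqlite3

# SQLite stands in for MySQL here; all sample rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (dataid INTEGER, important_data TEXT, another_thing TEXT);
    CREATE TABLE tbl2 (dataid INTEGER, important_data TEXT, another_thing TEXT);
    CREATE TABLE tbl3 (dataid INTEGER, datasource TEXT);
    -- dataid 1 exists in BOTH source tables
    INSERT INTO tbl1 VALUES (1, 'one-from-tbl1', 'a1'), (2, 'two-from-tbl1', 'a2');
    INSERT INTO tbl2 VALUES (1, 'one-from-tbl2', 'b1'), (3, 'three-from-tbl2', 'b3');
    INSERT INTO tbl3 VALUES (1, 'tbl1'), (1, 'tbl2'), (3, 'tbl2'), (9, 'tbl1');
""")

rows = conn.execute("""
    SELECT tbl3.dataid,
           tbl3.datasource,
           COALESCE(tbl1.important_data, tbl2.important_data) AS important_data,
           COALESCE(tbl1.another_thing, tbl2.another_thing) AS another_thing
    FROM tbl3
    LEFT JOIN tbl1 ON tbl1.dataid = tbl3.dataid AND tbl3.datasource = 'tbl1'
    LEFT JOIN tbl2 ON tbl2.dataid = tbl3.dataid AND tbl3.datasource = 'tbl2'
    ORDER BY tbl3.dataid, tbl3.datasource
""").fetchall()

for row in rows:
    print(row)
```

Each tbl3 row picks up values only from the table named in its datasource, and a tbl3 row with no match in that table (dataid 9 here) comes back with NULLs.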
I'm really struggling with this query and I hope somebody can help.
I am querying across multiple tables to get the dataset that I require. The following query is an anonymised version:
SELECT main_table.id,
sub_table_1.field_1,
main_table.field_1,
main_table.field_2,
main_table.field_3,
main_table.field_4,
main_table.field_5,
main_table.field_6,
main_table.field_7,
sub_table_2.field_1,
sub_table_2.field_2,
sub_table_2.field_3,
sub_table_3.field_1,
sub_table_4.field_1,
sub_table_4.field_2
FROM main_table
INNER JOIN sub_table_4 ON sub_table_4.id = main_table.id
INNER JOIN sub_table_2 ON sub_table_2.id = main_table.id
INNER JOIN sub_table_3 ON sub_table_3.id = main_table.id
INNER JOIN sub_table_1 ON sub_table_1.id = main_table.id
WHERE sub_table_4.field_1 = '' AND sub_table_4.field_2 = '0' AND sub_table_2.field_1 != ''
The query works. The problem is that sub_table_1 has a revision number (int 11). Currently I get duplicate records with different revision numbers and different versions of sub_table_1.field_1, which is to be expected. I want to limit the result set to the latest revision number only, giving me just the latest sub_table_1.field_1, and I really cannot figure it out!
Can anybody lend me a hand?
Many Thanks.
It's always important to remember that a JOIN can be on a subquery as well as a table. You could build a subquery that returns the results you want to see then, once you've got the data you want, join it in the parent query.
It's hard to 'tailor' an answer that's specific to your problem, as it's too obfuscated (as you admit) to know what the data and tables really look like, but as an example:
Say table1 has four fields: id, revision_no, name and stuff. You want to return a distinct list of name values, with their latest version of stuff (which, we'll pretend varies by revision). You could do this in isolation as:
select t.* from table1 t
inner join
(SELECT name, max(revision_no) maxr
FROM table1
GROUP BY name) mx
on mx.name = t.name
and mx.maxr = t.revision_no;
(Note: see fiddle at the end)
That would return each individual name with the latest revision of stuff.
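A quick way to convince yourself of that is a small runnable sketch. The assumptions: SQLite through Python's sqlite3 module instead of MySQL, and made-up rows for the example table1 described above.

```python
import sqlite3

# SQLite stands in for MySQL; the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, revision_no INTEGER, name TEXT, stuff TEXT);
    INSERT INTO table1 VALUES
        (1, 1, 'alpha', 'first draft'),
        (1, 2, 'alpha', 'second draft'),
        (1, 3, 'alpha', 'final'),
        (2, 1, 'beta',  'only version');
""")

# The derived table mx holds one row per name with its highest revision_no;
# joining back on both columns keeps only that latest row.
latest = conn.execute("""
    SELECT t.name, t.revision_no, t.stuff
    FROM table1 t
    INNER JOIN (SELECT name, MAX(revision_no) AS maxr
                FROM table1
                GROUP BY name) mx
        ON mx.name = t.name
       AND mx.maxr = t.revision_no
    ORDER BY t.name
""").fetchall()

for row in latest:
    print(row)
```

Only revision 3 of 'alpha' and the single revision of 'beta' survive the join.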
Once you've got that nailed down, you could then swap out
INNER JOIN sub_table_1 ON sub_table_1.id = main_table.id
....with....
INNER JOIN (select t.* from table1 t
inner join
(SELECT name, max(revision_no) maxr
FROM table1
GROUP BY name) mx
on mx.name = t.name
and mx.maxr = t.revision_no) sub_table_1
ON sub_table_1.id = main_table.id
...which would allow a join with a recordset that is more tailored to that which you want to join (again, don't get hung up on the actual query I've used, it's just there to demonstrate the method).
There may well be more elegant ways to achieve this, but it's sometimes good to start with a simple solution that's easier to replicate, then simplify it once you've got the general understanding of the what and why nailed down.
Hope that helps - as I say, it's as specific as I could offer without having an idea of the real data you're using.
(for the sake of reference, here is a fiddle with a working version of the above example query)
In your case, where you only need one column from the table, make this a subquery in your select clause rather than a join. You get the latest revision by ordering by revision number descending and limiting the result to one row.
SELECT
main_table.id,
(
select sub_table_1.field_1
from sub_table_1
where sub_table_1.id = main_table.id
order by revision_number desc
limit 1
) as sub_table_1_field_1,
main_table.field_1,
...
FROM main_table
INNER JOIN sub_table_4 ON sub_table_4.id = main_table.id
INNER JOIN sub_table_2 ON sub_table_2.id = main_table.id
INNER JOIN sub_table_3 ON sub_table_3.id = main_table.id
WHERE sub_table_4.field_1 = ''
AND sub_table_4.field_2 = '0'
AND sub_table_2.field_1 != '';
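Here is a trimmed-down sketch of the same idea, assuming SQLite (Python's sqlite3 module) in place of MySQL and invented sample rows. The other sub tables are left out to keep it short.

```python
import sqlite3

# SQLite stands in for MySQL; tables are reduced to the columns that matter here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE main_table (id INTEGER PRIMARY KEY, field_1 TEXT);
    CREATE TABLE sub_table_1 (id INTEGER, revision_number INTEGER, field_1 TEXT);
    INSERT INTO main_table VALUES (1, 'm1'), (2, 'm2'), (3, 'm3');
    -- id 1 has two revisions, id 2 has one, id 3 has none
    INSERT INTO sub_table_1 VALUES (1, 1, 'old'), (1, 2, 'new'), (2, 7, 'single');
""")

rows = conn.execute("""
    SELECT main_table.id,
           (
               SELECT sub_table_1.field_1
               FROM sub_table_1
               WHERE sub_table_1.id = main_table.id
               ORDER BY revision_number DESC
               LIMIT 1
           ) AS sub_table_1_field_1,
           main_table.field_1
    FROM main_table
    ORDER BY main_table.id
""").fetchall()

for row in rows:
    print(row)
```

One difference from the original INNER JOIN is worth knowing: a main_table row without any revision (id 3 here) is kept with a NULL instead of being dropped.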
Update 1
I discovered when the wrong behaviour occurs. If the view is composed of two tables, only the fields from the first table have values inside the subquery. I don't know why, but if I change the JOIN order, it works. As soon as I try to match another field against the second table, it returns NULL again.
Update 2
I've created a working example here: http://sqlfiddle.com/#!2/d4eb97/1
Update 3
The same example works in a newer MySQL version (5.6.6), so maybe there is a bug in 5.5 - http://sqlfiddle.com/#!9/4e140/2
I have a schema in which I ended up writing SQL like this:
SELECT view.user,
(
SELECT tableA.user
FROM tableA
LEFT JOIN tableB ON tableA.id = tableB.tableA_id
WHERE tableA.user = view.user
LIMIT 1
) as b_user
FROM view
WHERE view.user = 1
What I'm doing here is simple:
1. Select two fields from view
   view is a MySQL view, not a real table.
2. The second field is a subquery of:
   2.1 The field user of the table tableA
   2.2 Left join with the table tableB with the relational field
       There are no rows in tableB yet
   2.3 Only where the tableA user is the same as in the view
   2.4 Limit 1, just for this example
3. Limit results to user = 1
The strange thing is that in some situations the field b_user is NULL, even though the data is OK.
Any one of three changes makes it work:
fix 1
Hard-coding the user id makes it work:
SELECT view.user,
(
SELECT tableA.user
FROM tableA
LEFT JOIN tableB ON tableA.id = tableB.tableA_id
WHERE tableA.user = 1
LIMIT 1
) as b_user
FROM view
WHERE view.user = 1
fix 2
Removing the left join also makes it work:
SELECT view.user,
(
SELECT tableA.user
FROM tableA
WHERE tableA.user = view.user
LIMIT 1
) as b_user
FROM view
WHERE view.user = 1
fix 3
Another option is not to use the MySQL view:
SELECT view_table_a.user,
(
SELECT tableA.user
FROM tableA
LEFT JOIN tableB ON tableA.id = tableB.tableA_id
WHERE tableA.user = view_table_a.user
LIMIT 1
) as b_user
FROM view_table_a INNER JOIN view_table_b ON condition
WHERE view_table_a.user = 1
I'm not able to reproduce this by recreating a new database schema manually; it only happens in my current setup, which I cannot expose here for security reasons.
Why does the subquery return NULL values? I need to make the first query work, since I can't use any of the three fixes.
Why have the subquery in the first place? I like subqueries, they are very handy things to have around. But they shouldn't be used if they don't have to be. Queries can get complicated enough with no help from us.
You are looking for a particular user from the main table (the fact that it is really a view is irrelevant) then using the same User value to join with TableA and then optionally joining to TableB using the ID value associated with that user:
select rs.Origin, a.Origin as Same_Origin
from requests_status rs
join assignments a
on a.employee = rs.employee
and a.origin = rs.origin
left join assignments_author aa
on aa.assignment = a.id
where rs.employee = 1;
Then I noticed that in your fiddles, you create the assignments_author table but never populate it. But that doesn't really matter because you left join to it. But you don't use any data from that table. So in actuality, you don't need that table in your query at all. Thus the equivalent query would be:
select rs.Origin, a.Origin as Same_Origin
from requests_status rs
join assignments a
on a.employee = rs.employee
and a.origin = rs.origin
where rs.employee = 1;
I don't know why you get a NULL in one but not the other. But since the query above returns the same answer in both fiddles and it is the expected results, my work here is finished.
I assume this is a bug, maybe this one (http://bugs.mysql.com/bug.php?id=52051) because the query fails in MySQL 5.5 (http://sqlfiddle.com/#!2/d4eb97/1) but works in 5.6 (http://sqlfiddle.com/#!9/4e140/2)
I think my title says it all, but in essence, this is what I want to do:
SELECT t1.*, t2.friendly_name, CONCAT_WS(" ",t3.name,t3.surname) AS user FROM activity
LEFT JOIN t1.typedb AS t2 ON t1.typeid = t2.id
LEFT JOIN users AS t3 on t1.loginid = t2.loginid
ORDER BY time DESC, user ASC
But as you can imagine, this will give me an error.
I can do a normal select on the activity db and then do a loop in php and run queries to fetch the info. But there has to be a way to do this in one query in MYSQL.
Please help.
So there is a table name in typedb? Then no, SQL doesn't support this. This is not how a relational database is supposed to work.
You can try something along the lines of
SELECT
t1.*,
coalesce(x1.friendly_name, x2.friendly_name, x3.friendly_name) as friendly_name,
CONCAT_WS(" ",t3.name,t3.surname) AS user
FROM activity t1
LEFT JOIN tableX AS x1 ON t1.typeid = x1.id and t1.typedb = 'TABLEX'
LEFT JOIN tableY AS x2 ON t1.typeid = x2.id and t1.typedb = 'TABLEY'
LEFT JOIN tableZ AS x3 ON t1.typeid = x3.id and t1.typedb = 'TABLEZ'
LEFT JOIN users AS t3 on t1.loginid = coalesce(x1.loginid,x2.loginid,x3.loginid)
ORDER BY time DESC, user ASC;
But this is not very efficient. It's just not what SQL is made for. So use a programming language and build dynamic SQL, or even better: change your database design.
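If you go the dynamic SQL route, a minimal sketch could look like the following. Everything here is an assumption for illustration: SQLite via Python's sqlite3 module instead of MySQL with PHP, a cut-down activity table, two type tables, and a hypothetical helper fetch_friendly_names. The important part is the allow-list, because a table name cannot be passed as a bound parameter and must never be spliced into SQL unchecked.

```python
import sqlite3

# Hypothetical allow-list: the only table names that may be spliced into SQL text.
ALLOWED_TABLES = {"tableX", "tableY"}

def fetch_friendly_names(conn):
    """Return sorted (activity id, friendly_name) pairs, one query per type table."""
    typedbs = [r[0] for r in conn.execute(
        "SELECT DISTINCT typedb FROM activity ORDER BY typedb")]
    results = []
    for typedb in typedbs:
        # Identifiers cannot be bound as parameters, so validate before formatting.
        if typedb not in ALLOWED_TABLES:
            raise ValueError(f"unexpected table name: {typedb!r}")
        sql = (
            f"SELECT a.id, x.friendly_name "
            f"FROM activity a LEFT JOIN {typedb} x ON x.id = a.typeid "
            f"WHERE a.typedb = ?"
        )
        results.extend(conn.execute(sql, (typedb,)).fetchall())
    return sorted(results)

# Invented sample data; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE activity (id INTEGER PRIMARY KEY, typedb TEXT, typeid INTEGER);
    CREATE TABLE tableX (id INTEGER, friendly_name TEXT);
    CREATE TABLE tableY (id INTEGER, friendly_name TEXT);
    INSERT INTO activity VALUES (1, 'tableX', 10), (2, 'tableY', 10), (3, 'tableX', 11);
    INSERT INTO tableX VALUES (10, 'x-ten'), (11, 'x-eleven');
    INSERT INTO tableY VALUES (10, 'y-ten');
""")

names = fetch_friendly_names(conn)
print(names)
```

This runs one query per distinct typedb value rather than one per activity row, which is usually much cheaper than looping over every row in application code.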
Table Name: Look
FieldName: LookUp
example fieldname value : Country.CountryCode
While selecting from the table 'Look', I need to split the value of the field 'LookUp' dynamically, using the first part as the table name and the second part as the field name for a dynamic select. I have the split function in place; the problem is how to make it work in a CASE statement, or maybe somebody has an alternative solution. Currently I have this, which is clearly not working:
SELECT l.Id,
case when l.lookup is not null then
SELECT t.Id
FROM (SPLIT_STR(l.LOOKUP,'.',1)) AS t
WHERE t.(SPLIT_STR(l.LOOKUP,'.',2)) = l.attValue
LIMIT 1
END AS attValue
FROM look as l
I don't believe it is possible to pick up the table name from a field. It does suggest that there is an issue with your database design, though.
Previous similar question:-
MYSQL query using variable as table name in LEFT JOIN
If there is a limited number of related tables / fields to join on and you know them all in advance then something like the following might do it:-
SELECT l.Id,
CASE
WHEN SPLIT_STR(l.LOOKUP,'.',1) = 'tableA' THEN tableA.Id
WHEN SPLIT_STR(l.LOOKUP,'.',1) = 'tableB' THEN tableB.Id
WHEN SPLIT_STR(l.LOOKUP,'.',1) = 'tableC' THEN tableC.Id
WHEN SPLIT_STR(l.LOOKUP,'.',1) = 'tableD' THEN tableD.Id
ELSE NULL
END AS SubId
FROM look as l
LEFT OUTER JOIN tableA ON tableA.ColA = l.attValue
LEFT OUTER JOIN tableB ON tableB.ColB = l.attValue
LEFT OUTER JOIN tableC ON tableC.ColC = l.attValue
LEFT OUTER JOIN tableD ON tableD.ColD = l.attValue
Ie, join against every possible sub table and use a CASE to return the field from the one you want.
But if you are reduced to doing this then I would suggest redesigning the database at the earliest opportunity.
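For what it's worth, here is a small runnable sketch of the join-everything-and-CASE approach. The assumptions: SQLite via Python's sqlite3 module stands in for MySQL, substr/instr replace the asker's SPLIT_STR function, and the Currency table plus all sample rows are invented.

```python
import sqlite3

# SQLite stands in for MySQL; substr/instr emulate SPLIT_STR(l.LookUp, '.', 1).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Look (Id INTEGER PRIMARY KEY, LookUp TEXT, attValue TEXT);
    CREATE TABLE Country (Id INTEGER, CountryCode TEXT);
    CREATE TABLE Currency (Id INTEGER, CurrencyCode TEXT);  -- hypothetical second lookup table
    INSERT INTO Country VALUES (4, 'DE');
    INSERT INTO Currency VALUES (7, 'EUR');
    INSERT INTO Look VALUES
        (1, 'Country.CountryCode', 'DE'),
        (2, 'Currency.CurrencyCode', 'EUR'),
        (3, NULL, 'n/a');
""")

rows = conn.execute("""
    SELECT l.Id,
           CASE substr(l.LookUp, 1, instr(l.LookUp, '.') - 1)
               WHEN 'Country'  THEN c.Id
               WHEN 'Currency' THEN cu.Id
               ELSE NULL
           END AS SubId
    FROM Look AS l
    LEFT OUTER JOIN Country  AS c  ON c.CountryCode   = l.attValue
    LEFT OUTER JOIN Currency AS cu ON cu.CurrencyCode = l.attValue
    ORDER BY l.Id
""").fetchall()

for row in rows:
    print(row)
```

Every possible lookup table has to be listed by hand, both in the joins and in the CASE, which is why this only works when the set of tables is small and known in advance.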
I learned the hard way that I shouldn't store serialized data in a table when I need to make it searchable.
So I made 3 tables: the base table and two 1-n relation tables.
Here is the query I use to select a specific activity:
SELECT
jdc_organizations_activities.id
FROM
jdc_activity_sector ,
jdc_activity_type
INNER JOIN jdc_organizations_activities ON jdc_activity_type.activityId = jdc_organizations_activities.id
AND
jdc_activity_sector.activityId = jdc_organizations_activities.id
WHERE
jdc_activity_sector.activitySector = 5 AND
jdc_activity_type.activityType = 3
Questions :
1- What kind of indexes can I add on a 1-n relation table? I already have unique combinations of (activityId, activitySector) and (activityId, activityType).
2- Is there a better way to write the query for better performance?
Thank you !
I would re-organise the query to avoid the cross product caused by using , notation.
Also, you are effectively only using the sector and type tables as filters. So put activity table first, and then join on your other tables.
Some may suggest that the first join should ideally be the one most likely to restrict your results, leaving the minimal amount of work for the second join. In reality, the SQL engine can re-arrange your query when generating a plan, but thinking this way helps you appreciate how much work the engine has to do.
Finally, there are the indexes on each table. I would actually suggest reversing the Indexes...
- ActivitySector THEN ActivityId
- ActivityType THEN ActivityId
This is specifically because the SQL engine is manipulating your query. It can take the WHERE clause and say "only include records from the Sector table where ActivitySector = 5", and similarly for the Type table. By having the Sector and Type identifiers FIRST in the index, this filtering of the tables can be done much faster, and then the joins will have much less work to do.
SELECT
[activity].id
FROM
jdc_organizations_activities AS [activity]
INNER JOIN
jdc_activity_sector AS [sector]
ON [activity].id = [sector].activityId
INNER JOIN
jdc_activity_type AS [type]
ON [activity].id = [type].activityId
WHERE
[sector].activitySector = 5
AND [type].activityType = 3
Or, because you don't actually use the content of the Activity table...
SELECT
[sector].activityId
FROM
jdc_activity_sector AS [sector]
INNER JOIN
jdc_activity_type AS [type]
ON [sector].activityId = [type].activityId
WHERE
[sector].activitySector = 5
AND [type].activityType = 3
Or...
SELECT
[activity].id
FROM
jdc_organizations_activities AS [activity]
WHERE
EXISTS (SELECT * FROM jdc_activity_sector WHERE activityId = [activity].id AND activitySector = 5)
AND EXISTS (SELECT * FROM jdc_activity_type WHERE activityId = [activity].id AND activityType = 3)
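As a rough sketch of the reversed indexes suggested in this answer, the snippet below creates them and checks that the JOIN form and the EXISTS form return the same ids. Assumptions: SQLite via Python's sqlite3 module rather than the asker's database, made-up index names and sample rows, and plain aliases instead of the bracketed ones. The query plan is printed without any claim about what it shows, since plan output differs between engines and versions.

```python
import sqlite3

# SQLite stands in for the real database; index names and rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jdc_organizations_activities (id INTEGER PRIMARY KEY);
    CREATE TABLE jdc_activity_sector (
        activityId INTEGER, activitySector INTEGER,
        UNIQUE (activityId, activitySector));
    CREATE TABLE jdc_activity_type (
        activityId INTEGER, activityType INTEGER,
        UNIQUE (activityId, activityType));
    -- Reversed indexes: the filtered column first, the join column second
    CREATE INDEX idx_sector_lookup ON jdc_activity_sector (activitySector, activityId);
    CREATE INDEX idx_type_lookup ON jdc_activity_type (activityType, activityId);
    INSERT INTO jdc_organizations_activities VALUES (1), (2), (3);
    INSERT INTO jdc_activity_sector VALUES (1, 5), (1, 6), (2, 5), (3, 4);
    INSERT INTO jdc_activity_type VALUES (1, 3), (2, 1), (3, 3);
""")

join_sql = """
    SELECT s.activityId
    FROM jdc_activity_sector AS s
    INNER JOIN jdc_activity_type AS t ON s.activityId = t.activityId
    WHERE s.activitySector = 5 AND t.activityType = 3
    ORDER BY s.activityId
"""
exists_sql = """
    SELECT a.id
    FROM jdc_organizations_activities AS a
    WHERE EXISTS (SELECT * FROM jdc_activity_sector
                  WHERE activityId = a.id AND activitySector = 5)
      AND EXISTS (SELECT * FROM jdc_activity_type
                  WHERE activityId = a.id AND activityType = 3)
    ORDER BY a.id
"""

join_ids = [r[0] for r in conn.execute(join_sql)]
exists_ids = [r[0] for r in conn.execute(exists_sql)]
print(join_ids, exists_ids)

# Inspect which indexes the planner chose; the output format is engine-specific.
for plan_row in conn.execute("EXPLAIN QUERY PLAN " + join_sql):
    print(plan_row)
```

On your own database, the equivalent check is to run EXPLAIN on the real query before and after adding the indexes and compare.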
I would advise against mixing old style from table1, table2 and new style from table1 inner join table2 ... in a single query. And you can alias tables using table1 as t1, shortening long table names to an easy to remember mnemonic:
select a.id
from jdc_organizations_activities a
join jdc_activity_sector as s
on s.activityId = a.Id
join jdc_activity_type as t
on t.activityId = a.Id
where s.activitySector = 5
and t.activityType = 3
Or even more readable using IN:
select a.id
from jdc_organizations_activities a
where a.id in
(
select activityId
from jdc_activity_sector
where activitySector = 5
)
and a.id in
(
select activityId
from jdc_activity_type
where activityType = 3
)