MySQL combination of view, subquery and left join produces a strange result

Update 1
I discovered when the wrong behaviour occurs. If the view is built from two tables, only the fields from the first table have values inside the subquery. I don't know why, but if I change the JOIN order, it works. As soon as I try to match another field from the second table, it returns NULL again.
Update 2
I've created a working example here: http://sqlfiddle.com/#!2/d4eb97/1
Update 3
The same example works in a newer MySQL version (5.6.6), so maybe there is a bug in 5.5: http://sqlfiddle.com/#!9/4e140/2
I have a schema in which I ended up writing a query like this:
SELECT view.user,
(
SELECT tableA.user
FROM tableA
LEFT JOIN tableB ON tableA.id = tableB.tableA_id
WHERE tableA.user = view.user
LIMIT 1
) as b_user
FROM view
WHERE view.user = 1
What I'm doing here is simple:
1. Select two fields from view. Note that view is a MySQL view, not a real table.
2. The second field is a subquery:
2.1 Select the field user from the table tableA
2.2 Left join with the table tableB on the relational field. There are no rows in tableB yet.
2.3 Only where the tableA user is the same as in the view
2.4 Limit 1, just for this example
3. Limit results to user = 1
The strange thing is that in some situations the field b_user is NULL, even though the data is fine.
Any one of three changes makes it work:
fix 1
Hard-coding the user id makes it work:
SELECT view.user,
(
SELECT tableA.user
FROM tableA
LEFT JOIN tableB ON tableA.id = tableB.tableA_id
WHERE tableA.user = 1
LIMIT 1
) as b_user
FROM view
WHERE view.user = 1
fix 2
Removing the LEFT JOIN also makes it work:
SELECT view.user,
(
SELECT tableA.user
FROM tableA
WHERE tableA.user = view.user
LIMIT 1
) as b_user
FROM view
WHERE view.user = 1
fix 3
Another option is not to use the MySQL view and to join its underlying tables directly:
SELECT view_table_a.user,
(
SELECT tableA.user
FROM tableA
LEFT JOIN tableB ON tableA.id = tableB.tableA_id
WHERE tableA.user = view_table_a.user
LIMIT 1
) as b_user
FROM view_table_a INNER JOIN view_table_b ON condition
WHERE view_table_a.user = 1
I have not been able to reproduce this by recreating the schema manually; it only happens in my current setup, which I cannot share here for security reasons.
Why does the subquery return NULL? I need to make the first query work, since I can't use any of the three fixes.

Why have the subquery in the first place? I like subqueries; they are very handy things to have around. But they shouldn't be used if they don't have to be. Queries can get complicated enough with no help from us.
You are looking for a particular user from the main table (the fact that it is really a view is irrelevant) then using the same User value to join with TableA and then optionally joining to TableB using the ID value associated with that user:
select rs.Origin, a.Origin as Same_Origin
from requests_status rs
join assignments a
on a.employee = rs.employee
and a.origin = rs.origin
left join assignments_author aa
on aa.assignment = a.id
where rs.employee = 1;
Then I noticed that in your fiddles you create the assignments_author table but never populate it. That doesn't really matter, because you left join to it, but you also don't use any data from that table. So you don't actually need that table in your query at all, and the equivalent query would be:
select rs.Origin, a.Origin as Same_Origin
from requests_status rs
join assignments a
on a.employee = rs.employee
and a.origin = rs.origin
where rs.employee = 1;
I don't know why you get a NULL in one but not the other. But since the query above returns the same answer in both fiddles and it is the expected results, my work here is finished.

I assume this is a bug, maybe this one (http://bugs.mysql.com/bug.php?id=52051) because the query fails in MySQL 5.5 (http://sqlfiddle.com/#!2/d4eb97/1) but works in 5.6 (http://sqlfiddle.com/#!9/4e140/2)

Related

MySQL Query limiting results by sub table

I'm really struggling with this query and I hope somebody can help.
I am querying across multiple tables to get the dataset that I require. The following query is an anonymised version:
SELECT main_table.id,
sub_table_1.field_1,
main_table.field_1,
main_table.field_2,
main_table.field_3,
main_table.field_4,
main_table.field_5,
main_table.field_6,
main_table.field_7,
sub_table_2.field_1,
sub_table_2.field_2,
sub_table_2.field_3,
sub_table_3.field_1,
sub_table_4.field_1,
sub_table_4.field_2
FROM main_table
INNER JOIN sub_table_4 ON sub_table_4.id = main_table.id
INNER JOIN sub_table_2 ON sub_table_2.id = main_table.id
INNER JOIN sub_table_3 ON sub_table_3.id = main_table.id
INNER JOIN sub_table_1 ON sub_table_1.id = main_table.id
WHERE sub_table_4.field_1 = '' AND sub_table_4.field_2 = '0' AND sub_table_2.field_1 != ''
The query works. The problem is that sub_table_1 has a revision number (int 11). Currently I get duplicate records with different revision numbers and different versions of sub_table_1.field_1, which is to be expected. I want to limit the result set to the latest revision number only, giving me just the latest sub_table_1.field_1, and I really cannot figure it out!
Can anybody lend me a hand?
Many Thanks.

It's always important to remember that a JOIN can be on a subquery as well as a table. You could build a subquery that returns the results you want to see then, once you've got the data you want, join it in the parent query.
It's hard to 'tailor' an answer that's specific to your problem, as it's too obfuscated (as you admit) to know what the data and tables really look like, but as an example:
Say table1 has four fields: id, revision_no, name and stuff. You want to return a distinct list of name values, with their latest version of stuff (which, we'll pretend varies by revision). You could do this in isolation as:
select t.* from table1 t
inner join
(SELECT name, max(revision_no) maxr
FROM table1
GROUP BY name) mx
on mx.name = t.name
and mx.maxr = t.revision_no;
(Note: see fiddle at the end)
That would return each individual name with the latest revision of stuff.
Once you've got that nailed down, you could then swap out
INNER JOIN sub_table_1 ON sub_table_1.id = main_table.id
....with....
INNER JOIN (select t.* from table1 t
inner join
(SELECT name, max(revision_no) maxr
FROM table1
GROUP BY name) mx
on mx.name = t.name
and mx.maxr = t.revision_no) sub_table_1
ON sub_table_1.id = main_table.id
...which would allow a join with a recordset that is more tailored to that which you want to join (again, don't get hung up on the actual query I've used, it's just there to demonstrate the method).
There may well be more elegant ways to achieve this, but it's sometimes good to start with a simple solution that's easier to replicate, then simplify it once you've got the general understanding of the what and why nailed down.
Hope that helps - as I say, it's as specific as I could offer without having an idea of the real data you're using.
(for the sake of reference, here is a fiddle with a working version of the above example query)
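As a rough, runnable sketch of the join-on-a-grouped-subquery method described above: the snippet below uses Python's sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained), and the rows in table1 are invented sample data. The query itself is the one from the answer.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, revision_no INTEGER, name TEXT, stuff TEXT);
    -- Invented sample data: several revisions per name.
    INSERT INTO table1 VALUES
        (1, 1, 'alpha', 'first draft'),
        (1, 2, 'alpha', 'second draft'),
        (2, 1, 'beta',  'only version'),
        (3, 1, 'gamma', 'old'),
        (3, 3, 'gamma', 'newest');
""")

# Join each row to the per-name maximum revision; only the latest rows survive.
latest = conn.execute("""
    SELECT t.name, t.revision_no, t.stuff
    FROM table1 t
    INNER JOIN (
        SELECT name, MAX(revision_no) AS maxr
        FROM table1
        GROUP BY name
    ) mx
        ON mx.name = t.name
       AND mx.maxr = t.revision_no
    ORDER BY t.name
""").fetchall()

for row in latest:
    print(row)
```

Each name appears once, carrying the stuff value of its highest revision_no.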

In your case, where you only need one column from the table, make this a subquery in your select clause rather than a join. You get the latest revision by ordering by revision number descending and limiting the result to one row.
SELECT
main_table.id,
(
select sub_table_1.field_1
from sub_table_1
where sub_table_1.id = main_table.id
order by revision_number desc
limit 1
) as sub_table_1_field_1,
main_table.field_1,
...
FROM main_table
INNER JOIN sub_table_4 ON sub_table_4.id = main_table.id
INNER JOIN sub_table_2 ON sub_table_2.id = main_table.id
INNER JOIN sub_table_3 ON sub_table_3.id = main_table.id
WHERE sub_table_4.field_1 = ''
AND sub_table_4.field_2 = '0'
AND sub_table_2.field_1 != '';
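The correlated-subquery approach can be tried out with a small sketch like the one below. It assumes SQLite (through Python's sqlite3) behaves like MySQL for this particular query shape, trims the table list down to main_table and sub_table_1, and uses invented sample rows.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE main_table (id INTEGER PRIMARY KEY, field_1 TEXT);
    CREATE TABLE sub_table_1 (id INTEGER, revision_number INTEGER, field_1 TEXT);
    -- Invented sample data: id 1 has three revisions, id 2 has one.
    INSERT INTO main_table VALUES (1, 'first'), (2, 'second');
    INSERT INTO sub_table_1 VALUES
        (1, 1, 'v1'), (1, 2, 'v2'), (1, 3, 'v3'),
        (2, 1, 'only');
""")

# The scalar subquery runs once per main_table row and picks the row
# with the highest revision_number for that id.
rows = conn.execute("""
    SELECT main_table.id,
           (SELECT sub_table_1.field_1
            FROM sub_table_1
            WHERE sub_table_1.id = main_table.id
            ORDER BY sub_table_1.revision_number DESC
            LIMIT 1) AS sub_table_1_field_1,
           main_table.field_1
    FROM main_table
    ORDER BY main_table.id
""").fetchall()

for row in rows:
    print(row)
```

Because the subquery returns at most one value per outer row, it cannot multiply the rows of main_table the way the original join on sub_table_1 did.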

Selecting single max values

Say I need to pull data from several tables like so:
item 1 - from table 1
item 2 - from table 1
item 3 - from table 1 - but select only max value of item 3 from table 1
item 4 - from table 2 - but select only max value of item 4 from table 2
My query is pretty simple:
select
a.item 1,
a.item 2,
b.item 3,
c.item 4
from table 1 a
left join (select b.key_item, max(item 3) from table 1, group by key_item) b on a.key_item = b.key_item
left join (select c.key_item, max(item 4) from table 2, group by key_item) c on c.key_item = a.key_item
I am not sure if my methodology of pulling just a single max item from a table is the most efficient. Assume both tables have over a million rows. My actual SQL runs forever with this setup.
EDIT: I changed the group by clause to reflect comments made. I hope it makes a bit of sense now?

Your best bet is to add an index on table1 and table2, as follows:
ALTER TABLE table1
ADD INDEX `GoodIndexName1` (`key_item`,`item3`);
ALTER TABLE table2
ADD INDEX `GoodIndexName2` (`key_item`,`item4`);
This will allow you to use queries as described in the MySQL documentation for finding the rows holding the group-wise maximum, which appears to be what you are looking for.
Your original (edited) query should work:
select
a.item1,
a.item2,
b.item3,
c.item4
from table1 a
LEFT OUTER JOIN (
SELECT
key_item,
MAX(item3) AS item3
FROM table1
GROUP BY key_item
) b
ON a.key_item = b.key_item
LEFT OUTER JOIN (
SELECT
key_item,
MAX(item4) AS item4
FROM table2
GROUP BY key_item
) c
ON c.key_item = a.key_item
and if that performs slowly after adding the indexes, try the following too:
SELECT
a.item1,
a.item2,
b.item3,
c.item4
FROM table1 a
LEFT OUTER JOIN table1 b
ON b.key_item = a.key_item
LEFT OUTER JOIN table1 larger_b
ON larger_b.key_item = b.key_item
AND larger_b.item3 > b.item3
LEFT OUTER JOIN table2 c
ON c.key_item = a.key_item
LEFT OUTER JOIN table2 larger_c
ON larger_c.key_item = c.key_item
AND larger_c.item4 > c.item4
WHERE larger_b.key_item IS NULL
AND larger_c.key_item IS NULL
(I have modified the table and column names only slightly, so that they conform to correct MySQL syntax.)
I work with queries that use the above structure all the time, and they perform very efficiently with indexes like the one I provided.
That said, usually I am using INNER JOINs on the b and c tables, but I don't see why your query should have any issues.
If you do experience performance problems still, report the data types of the key_item columns for each table, as if you try to join on different data types, you will generally get poor performance.
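To check that the two query shapes above agree, here is a small sketch that runs both against the same data. It assumes SQLite (via Python's sqlite3) as a stand-in for MySQL, so CREATE INDEX replaces ALTER TABLE ... ADD INDEX, and the index names and sample rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (key_item INTEGER, item1 TEXT, item2 TEXT, item3 INTEGER);
    CREATE TABLE table2 (key_item INTEGER, item4 INTEGER);
    -- Composite indexes on (key, value), mirroring the suggested ones.
    CREATE INDEX idx_table1_key_item3 ON table1 (key_item, item3);
    CREATE INDEX idx_table2_key_item4 ON table2 (key_item, item4);
    -- Invented sample data.
    INSERT INTO table1 VALUES (1, 'a', 'x', 10), (1, 'b', 'y', 30), (2, 'c', 'z', 5);
    INSERT INTO table2 VALUES (1, 7), (1, 9), (2, 4);
""")

# Shape 1: join against per-key MAX() subqueries.
grouped = conn.execute("""
    SELECT a.item1, a.item2, b.item3, c.item4
    FROM table1 a
    LEFT OUTER JOIN (
        SELECT key_item, MAX(item3) AS item3 FROM table1 GROUP BY key_item
    ) b ON b.key_item = a.key_item
    LEFT OUTER JOIN (
        SELECT key_item, MAX(item4) AS item4 FROM table2 GROUP BY key_item
    ) c ON c.key_item = a.key_item
    ORDER BY a.item1
""").fetchall()

# Shape 2: self-join looking for a larger value; keep rows where none exists.
anti_join = conn.execute("""
    SELECT a.item1, a.item2, b.item3, c.item4
    FROM table1 a
    LEFT OUTER JOIN table1 b ON b.key_item = a.key_item
    LEFT OUTER JOIN table1 larger_b
        ON larger_b.key_item = b.key_item AND larger_b.item3 > b.item3
    LEFT OUTER JOIN table2 c ON c.key_item = a.key_item
    LEFT OUTER JOIN table2 larger_c
        ON larger_c.key_item = c.key_item AND larger_c.item4 > c.item4
    WHERE larger_b.key_item IS NULL
      AND larger_c.key_item IS NULL
    ORDER BY a.item1
""").fetchall()

print(grouped)
print(anti_join)
```

One difference worth keeping in mind: if several rows tie for the maximum within a key, the second shape returns one output row per tied row, while the MAX() subquery always yields a single row per key.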

MySql dynamic select

Table Name: Look
FieldName: LookUp
example fieldname value : Country.CountryCode
While selecting from the table 'Look', I need to split the value of the field 'LookUp' dynamically, taking the first part as the table name and the second part as the field name for a dynamic select. I have the split function in place; the problem is how to make it work in a CASE statement, or maybe somebody has an alternative solution. Currently I have this, which is clearly not working:
SELECT l.Id,
case when l.lookup is not null then
SELECT t.Id
FROM (SPLIT_STR(l.LOOKUP,'.',1)) AS t
WHERE t.(SPLIT_STR(l.LOOKUP,'.',2)) = l.attValue
LIMIT 1
END AS attValue
FROM look as l

I don't believe it is possible to pick up the table name from a field value. It does suggest that there is an issue with your database design, though.
Previous similar question:-
MYSQL query using variable as table name in LEFT JOIN
If there is a limited number of related tables / fields to join on and you know them all in advance then something like the following might do it:-
SELECT l.Id,
CASE
WHEN SPLIT_STR(l.LOOKUP,'.',1) = 'tableA' THEN tableA.Id
WHEN SPLIT_STR(l.LOOKUP,'.',1) = 'tableB' THEN tableB.Id
WHEN SPLIT_STR(l.LOOKUP,'.',1) = 'tableC' THEN tableC.Id
WHEN SPLIT_STR(l.LOOKUP,'.',1) = 'tableD' THEN tableD.Id
ELSE NULL
END AS SubId
FROM look as l
LEFT OUTER JOIN tableA ON tableA.ColA = l.attValue
LEFT OUTER JOIN tableB ON tableB.ColB = l.attValue
LEFT OUTER JOIN tableC ON tableC.ColC = l.attValue
LEFT OUTER JOIN tableD ON tableD.ColD = l.attValue
That is, join against every possible sub table and use a CASE to return the field from the one you want.
But if you are reduced to doing this then I would suggest redesigning the database at the earliest opportunity.
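A minimal sketch of the join-everything-and-pick-with-CASE idea, under several assumptions: SQLite (via Python's sqlite3) stands in for MySQL, the asker's SPLIT_STR function is replaced by a hypothetical Python stand-in registered on the connection, and the Country and Currency tables, their columns, and all rows are invented for illustration. Each join here is also restricted to rows whose LookUp names that table, which is an addition to the query above.

```python
import sqlite3

def split_str(value, delimiter, position):
    """Hypothetical stand-in for the asker's SPLIT_STR: 1-based part of a split."""
    if value is None:
        return None
    parts = value.split(delimiter)
    return parts[position - 1] if 0 < position <= len(parts) else None

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.create_function("SPLIT_STR", 3, split_str)
conn.executescript("""
    CREATE TABLE Country  (Id INTEGER, CountryCode TEXT);
    CREATE TABLE Currency (Id INTEGER, CurrencyCode TEXT);
    CREATE TABLE Look     (Id INTEGER, LookUp TEXT, attValue TEXT);
    -- Invented sample data.
    INSERT INTO Country  VALUES (10, 'DE'), (11, 'FR');
    INSERT INTO Currency VALUES (20, 'EUR'), (21, 'USD');
    INSERT INTO Look VALUES
        (1, 'Country.CountryCode',   'FR'),
        (2, 'Currency.CurrencyCode', 'USD'),
        (3, NULL,                    'x');
""")

# Join against every known lookup table, then let CASE pick the matching Id.
rows = conn.execute("""
    SELECT l.Id,
           CASE
               WHEN SPLIT_STR(l.LookUp, '.', 1) = 'Country'  THEN Country.Id
               WHEN SPLIT_STR(l.LookUp, '.', 1) = 'Currency' THEN Currency.Id
               ELSE NULL
           END AS SubId
    FROM Look AS l
    LEFT OUTER JOIN Country
        ON SPLIT_STR(l.LookUp, '.', 1) = 'Country'
       AND Country.CountryCode = l.attValue
    LEFT OUTER JOIN Currency
        ON SPLIT_STR(l.LookUp, '.', 1) = 'Currency'
       AND Currency.CurrencyCode = l.attValue
    ORDER BY l.Id
""").fetchall()

for row in rows:
    print(row)
```

The table and column names still have to be hard-coded in the query, so this only works when the set of lookup targets is known in advance.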

UPDATE Syntax with ORDER BY, LIMIT and Multiple Tables

Learning SQL, sorry if this is rudimentary. Trying to figure out a working UPDATE solution for the following pseudoish-code:
UPDATE tableA
SET tableA.col1 = '$var'
WHERE tableA.user_id = tableB.id
AND tableB.username = '$varName'
ORDER BY tableA.datetime DESC LIMIT 1
The above is more like SELECT syntax, but am basically trying to update a single column value in the latest row of tableA, where a username found in tableB.username (represented by $varName) is linked to its ID number in tableB.id, which exists as the id in tableA.user_id.
Hopefully, that makes sense. I'm guessing some kind of JOIN is necessary, but subqueries seem troublesome for UPDATE. I understand ORDER BY and LIMIT are off limits when multiple tables are involved in UPDATE... But I need the functionality. Is there a way around this?
A little confused, thanks in advance.

The solution is to nest ORDER BY and LIMIT in a derived table that is part of a join. This lets you find the exact row to be updated (ta.id) first, then apply the update.
UPDATE tableA AS target
INNER JOIN (
SELECT ta.id
FROM tableA AS ta
INNER JOIN tableB AS tb ON tb.id = ta.user_id
WHERE tb.username = '$varName'
ORDER BY ta.datetime DESC
LIMIT 1) AS source ON source.id = target.id
SET col1 = '$var';
Hat tip to Baron Schwartz, a.k.a. Xaprb, for the excellent post on this exact topic:
http://www.xaprb.com/blog/2006/08/10/how-to-use-order-by-and-limit-on-multi-table-updates-in-mysql/

You can use the following query syntax:
update work_to_do as target
inner join (
select w.client, w.work_unit
from work_to_do as w
inner join eligible_client as e on e.client = w.client
where processor = 0
order by priority desc
limit 10
) as source on source.client = target.client
and source.work_unit = target.work_unit
set processor = #process_id;
This works perfectly.

Is there a way to force MySQL execution order?

I know I can change the way MySQL executes a query by using the FORCE INDEX (abc) keyword. But is there a way to change the execution order?
My query looks like this:
SELECT c.*
FROM table1 a
INNER JOIN table2 b ON a.id = b.table1_id
INNER JOIN table3 c ON b.itemid = c.itemid
WHERE a.itemtype = 1
AND a.busy = 1
AND b.something = 0
AND b.acolumn = 2
AND c.itemid = 123456
I have a key for every relation/constraint that I use. If I run explain on this statement I see that mysql starts querying c first.
id  select_type  table  type
1   SIMPLE       c      ref
2   SIMPLE       b      ref
3   SIMPLE       a      eq_ref
However, I know that querying in the order a -> b -> c would be faster (I have proven that)
Is there a way to tell mysql to use a specific order?
Update: That's how I know that a -> b -> c is faster.
The above query takes 1.9 seconds to complete and returns 7 rows. If I change the query to
SELECT c.*
FROM table1 a
INNER JOIN table2 b ON a.id = b.table1_id
INNER JOIN table3 c ON b.itemid = c.itemid
WHERE a.itemtype = 1
AND a.busy = 1
AND b.something = 0
AND b.acolumn = 2
HAVING c.itemid = 123456
the query completes in 0.01 seconds (without the HAVING clause I get 10,000 rows).
However, that is not an elegant solution, because this query is a simplified example. In the real world I have joins from c to other tables. Since HAVING is a filter that is applied to the entire result, it would mean pulling orders of magnitude more records from the database than necessary.
Edit2: Just some information:
The variable part in this query is c.itemid. Everything else are fixed values that don't change.
Indexes are set up fine and MySQL chooses the right ones for me.
Between a and b there is a 1:n relation (index PRIMARY is used).
Between b and c there is a many-to-many relation (index IDX_ITEMID is used).
The point is that MySQL should start querying table a and work its way down to c, not the other way round. Is there any chance to achieve that?
Solution: Not exactly what I wanted but this seems to work:
SELECT c.*
FROM table1 a
INNER JOIN table2 b ON a.id = b.table1_id
INNER JOIN table3 c ON b.itemid = c.itemid
WHERE a.itemtype = 1
AND a.busy = 1
AND b.something = 0
AND b.acolumn = 2
AND c.itemid = 123456
AND b.id IN (
SELECT DISTINCT table2.id FROM table1
INNER JOIN table2 ON table1.id = table2.table1_id
WHERE table1.itemtype = 1 AND table1.busy = 1)

Perhaps you need to use STRAIGHT_JOIN.
http://dev.mysql.com/doc/refman/5.0/en/join.html
STRAIGHT_JOIN is similar to JOIN, except that the left table is always read before the right table. This can be used for those (few) cases for which the join optimizer puts the tables in the wrong order.

You can use FORCE INDEX to force the execution order, and I've done that before.
If you think about it, there's usually only one order you could query tables in for any index you pick.
In this case, if you want MySQL to start querying a first, make sure the index you force on b is one that contains b.table1_id. MySQL will only be able to use that index if it's already queried a first.

You can try rewriting the query in two ways:
bring some of the WHERE condition into JOIN
introduce subqueries even though they are not necessary
Both things might impact the planner.
First thing to check, though, would be if your stats are up to date.
