Join predicate order - mysql

I'm always be amused and confused(at same time) whenever I have been to asked prepare and run Join query on Sql Console.
And the cause of most confusion is mainly based upon the fact whether/or not the ordering of join predicate hold any importances in Join results.
Example.
SELECT "zones"."name", "ip_addresses".*
FROM "ip_addresses"
INNER JOIN "zones" ON "zones"."id" = "ip_addresses"."zone_id"
WHERE "ip_addresses"."resporg_accnt_id" = 1
AND "zones"."name" = 'us-central1'
LIMIT 1;
Given the sql query, the Join predicate look like this.
... INNER JOIN "zones" ON "zones"."id" = "ip_addresses"."zone_id" WHERE "ip_addresses"."resporg_accnt_id"
Now, would it make any difference in term of performance of Join as well as the authenticity of the obtained result. If happen to change the predicate to look like this
... INNER JOIN "zones" ON "ip_addresses"."zone_id" = "zones"."id" WHERE "ip_addresses"."resporg_accnt_id"

The predicate order won't make a performance difference in your case, a simple equality condition, but personally I like to place the columns from the table I'm JOINing to on the LHS of each ON condition
SELECT ...
FROM ip_addresses ia
JOIN zones z
ON z.id = ia.zone_id
WHERE ...
The optimiser can use any index available on these columns during the JOIN and I find it easier to visualise this way.
Any additional conditions also tend to be on columns of the table being JOINed to and I find again this reads better when this table is consistently on the LHS
Not quite the same, but I did see a case where performance was affected by the choice of column to isolate
I think the JOIN looked something like
SELECT ...
FROM table_a a
JOIN table_b b
ON a.id = b.id - 1
Changing this to
SELECT ...
FROM table_a a
JOIN table_b b
ON b.id = a.id + 1
allowed the optimiser to use an index on b.id, but presumably at the cost of an index on a.id
I suspect this kind of query might need analysing on a case by case basis
Furthermore, I would probably switch your table order round too and write your original query:
SELECT z.name,
ia.*
FROM zones z
JOIN ip_addresses ia
ON ia.zone_id = z.id
AND ia.resporg_accnt_id = 1
WHERE z.name = 'us-central1'
LIMIT 1
Conceptually, you are saying "Start with the 'us-central1' zone and fetch me all the ip_addresses associated with a resporg_accnt_id of 1"
Check the EXPLAIN plans if you want to verify that there is no difference in your case

Related

Check condition across several rows from a join

I have some entities A which can all have one or more B. How can I query "all entities A which have this and that from B"?
I can query "all entities A which have "this" or "that" from B":
SELECT *
FROM A
INNER JOIN B ON B.A_id = A.id
WHERE f = "this" OR f = "that"
GROUP BY A.ID
But I can’t check that they have "this" and "that", because I can’t check something across different rows from A.
I might use a GROUP_CONCAT, but then how can I effectively check that "blah,thing,foo,bar" has blah and foo without some unmaintainable REGEXP mess?
(I actually love regexes, but not so much in the context of an SQL query, for something that seems like it wouldn’t need a regex).
SELECT *, GROUP_CONCAT(f)
FROM A
INNER JOIN B ON B.A_ID = A.ID
WHERE f = "this" OR f = "that"
GROUP BY A.ID
How can I query "all entities A which have this and that from B"?
For such tasks we usually don't join, but use EXISTS or IN instead. E.g.
select *
from a
where id in (select a_id from b where f = 'this')
and id in (select a_id from b where f = 'that');
Here is a solution with an aggregation. In this particular case I see no advantage in using this. There are other situations, though, when an aggregation may be the appropriate solution.
select *
from a
where id in
(
select a_id
from b
group by a_id
having max(f = 'this') = 1
and max(f = 'that') = 1
);
In MySQL true is 1 and false is 0, so taking the maximum of a boolean expression tells us, whether there is at least one row for which the condition is true.
Your own query works, too, by the way, if you add the appropriate HAVING clause. And there is no regular expression matching needed for that. As your WHEREclause limits f to 'this' and 'that', your GROUP_CONCAT result can never be 'this,something_else;that', but only contain 'this' and 'that'. Well, depending on the table there may be duplicates, like 'this,this,that'. Use an ORDER BY clause and DISTINCT:
SELECT a.*
FROM a
INNER JOIN b ON b.a_id = a.id
WHERE b.f IN ('this', 'that')
GROUP BY a.id
HAVING GROUP_CONCAT(DISTINCT b.f ORDER BY b.f) = 'that,this';

Grouping method

I am working on a query with the following format:
I require all the columns from the Database 'A', while I only require the summed amount (sum(amount)) from the Database 'B'.
SELECT A.*, sum(B.CURTRXAM) as 'Current Transaction Amt'
FROM A
LEFT JOIN C
ON A.Schedule_Number = C.Schedule_Number
LEFT JOIN B
ON A.DOCNUMBR = B.DOCNUMBR
ON A.CUSTNMBR = B.CUSTNMBR
GROUP BY A
ORDER BY A.CUSTNMBR
My question is regarding the grouping statement, database A has about 12 columns and to group by each individually is tedious, is there a cleaner way to do this such as:
GROUP BY A
I am not sure if a simpler way exists as I am new to SQL, I have previously investigated GROUPING_ID statements but thats about it.
Any help on lumped methods of grouping would be helpful
Since the docnumber is the primary key - just use the following SQL:
SELECT A.*, sum(B.CURTRXAM) as 'Current Transaction Amt'
FROM A
LEFT JOIN C
ON A.Schedule_Number = C.Schedule_Number
LEFT JOIN B
ON A.DOCNUMBR = B.DOCNUMBR
ORDER BY RM20401.CUSTNMBR
GROUP BY A.DOCNUMBR

Mysql query with '=' is much slower than 'LIKE'

I am currently running into an issue where, when I use a "LIKE" in my query I get the result in 2 seconds. But when I use the '=' instead, it takes around 1 minute for the result to show up.
The following is my query:
QUERY1
The following query takes 2 seconds:
`select distinct p.Name from Timeset s
join table1 f on (f.id = s.id)
join table2 p on (p.source=f.table_name)
join table3 d on (d.Name = p.Name) WHERE
s.Active = 'Y' AND **p.sourcefrom like '%sometable%'`
QUERY2
The same query replacing the 'like' by '=' takes 1 minute:
select distinct p.Name from Timeset s
join table1 f on (f.id = s.id)
join table2 p on (p.source=f.table_name)
join table3 d on (d.Name = p.Name) WHERE
s.Active = 'Y' AND **p.sourcefrom = 'sometable'
I am really puzzled because I know that 'LIKE' is usually slower (than '=') since mysql need to look for different possibilities. But I am sure why in my case, "=" is slower with such a substantial difference.
thank you kindly for the help in advance,
regards,
When you use = MySQL is probably using a different index compared to when you use LIKE. Check the output from the two execution plans and see what the differnce is. Then you can FORCE the use of the better performing index. Might be worth running ANALYZE TABLE for each of the tables involved.

Mysql - db-select with several joins and tablestructure

I am using the following mysql select-query with several joins. I am wondering if this is how a somewhat good select-statement should look like:
SELECT *
FROM table_news AS a
INNER JOIN table_cat AS b ON a.cat_id = b.id
INNER JOIN table_countries AS c ON a.country_id = c.id
INNER JOIN table_addresses AS d ON a.id = d.news_id
WHERE a.deleted = 0
AND a.hidden = 0
AND a.cat_id = ".$search_cat."
AND a.country_id = ".$search_country."
AND a.title LIKE '%".$search_string."%'
OR a.deleted = 0
AND a.hidden = 0
AND a.cat_id = ".$search_cat."
AND a.country_id = ".$search_country."
AND a.subtitle LIKE '%".$search_string."%'"
It seems to be a lot of joins. Even though table b and table c contain only 3 or 4 fields, I wonder if the number of joins would clearly slow down the search on the starting-page?
Would it be better to put the fields from table d (street, city and so on) back into the main-table, as they should be needed most of the time this query is executed?
Thanx in advance,
Jayden
I don't think there is necessarily anything wrong with having three joins. There are a couple of things you can do to make sure the query is optimised.
Firstly, you should never do SELECT * - instead explicitly state what fields you want to return from the database.
Also, I would create indexes on all the fields you have in the where clause, and all of the fields you are joining. This can be a little bit of a trade off - for example if you are doing a lot of write operations then there is a hit because you need to write to the index everytime.

Is there a way to force MySQL execution order?

I know I can change the way MySQL executes a query by using the FORCE INDEX (abc) keyword. But is there a way to change the execution order?
My query looks like this:
SELECT c.*
FROM table1 a
INNER JOIN table2 b ON a.id = b.table1_id
INNER JOIN table3 c ON b.itemid = c.itemid
WHERE a.itemtype = 1
AND a.busy = 1
AND b.something = 0
AND b.acolumn = 2
AND c.itemid = 123456
I have a key for every relation/constraint that I use. If I run explain on this statement I see that mysql starts querying c first.
id select_type table type
1 SIMPLE c ref
2 SIMPLE b ref
3 SIMPLE a eq_ref
However, I know that querying in the order a -> b -> c would be faster (I have proven that)
Is there a way to tell mysql to use a specific order?
Update: That's how I know that a -> b -> c is faster.
The above query takes 1.9 seconds to complete and returns 7 rows. If I change the query to
SELECT c.*
FROM table1 a
INNER JOIN table2 b ON a.id = b.table1_id
INNER JOIN table3 c ON b.itemid = c.itemid
WHERE a.itemtype = 1
AND a.busy = 1
AND b.something = 0
AND b.acolumn = 2
HAVING c.itemid = 123456
the query completes in 0.01 seconds (Without using having I get 10.000 rows).
However that is not a elegant solution because this query is a simplified example. In the real world I have joins from c to other tables. Since HAVING is a filter that is executed on the entire result it would mean that I would pull some magnitues more records from the db than nescessary.
Edit2: Just some information:
The variable part in this query is c.itemid. Everything else are fixed values that don't change.
Indexes are setup fine and mysql chooses the right ones for me
between a and b there is a 1:n relation (index PRIMARY is used)
between b and c there is a many to many relation (index IDX_ITEMID is used)
the point is that mysql should start querying table a and work it's way down to c and not the other way round. Any change to achive that.
Solution: Not exactly what I wanted but this seems to work:
SELECT c.*
FROM table1 a
INNER JOIN table2 b ON a.id = b.table1_id
INNER JOIN table3 c ON b.itemid = c.itemid
WHERE a.itemtype = 1
AND a.busy = 1
AND b.something = 0
AND b.acolumn = 2
AND c.itemid = 123456
AND f.id IN (
SELECT DISTINCT table2.id FROM table1
INNER JOIN table2 ON table1.id = table2.table1_id
WHERE table1.itemtype = 1 AND table1.busy = 1)
Perhaps you need to use STRAIGHT_JOIN.
http://dev.mysql.com/doc/refman/5.0/en/join.html
STRAIGHT_JOIN is similar to JOIN, except that the left table is always read before the right table. This can be used for those (few) cases for which the join optimizer puts the tables in the wrong order.
You can use FORCE INDEX to force the execution order, and I've done that before.
If you think about it, there's usually only one order you could query tables in for any index you pick.
In this case, if you want MySQL to start querying a first, make sure the index you force on b is one that contains b.table1_id. MySQL will only be able to use that index if it's already queried a first.
You can try rewriting in two ways
bring some of the WHERE condition into JOIN
introduce subqueries even though they are not necessary
Both things might impact the planner.
First thing to check, though, would be if your stats are up to date.