MySQL query optimisation - to alias or not? - mysql

Is there any major difference from an optimisation point of view between the following two alternatives? In the first option I alias the table, so the total_paid calculation is only run once. In the second option, there is no table alias, but the SUM calculation is required a few times.
Option 1
SELECT tt.*, tt.amount - tt.total_paid as outstanding
FROM
(
SELECT t1.id, t1.amount, SUM(t2.paid) as total_paid
FROM table1 t1
LEFT JOIN table2 t2 on t1.id = t2.t1_id
GROUP BY t1.id
) as temp_table tt
WHERE (tt.amount - tt.total_paid) > 0
LIMIT 0, 25
Option 2
SELECT t1.id, t1.amount, SUM(t2.paid) as total_paid
, (t1.amount - SUM(t2.paid)) as outstanding
FROM table1 t1
LEFT JOIN table2 t2 on t1.id = t2.t1_id
WHERE (t1.amount - SUM(t2.paid)) > 0
GROUP BY t1.id
LIMIT 0, 25
Or perhaps there is an even better option?

if you run the queries with EXPLAIN, you'll be able to see what's going on 'inside'.
Also, why don't you just run it and compare the execution times?
Read more on explain here

Related

MySQL: LIMIT parameter in JOIN producing unexpected results

I have read here that MySQL processes ordering before applying limits. However, I receive different results when applying a LIMIT parameter in conjunction with a JOIN subquery. Here is my query:
SELECT
t1.id,
(t2.counts / c.matches)
FROM
table_one t1
JOIN
table_two t2 ON t1.id = t2.id
JOIN
(
SELECT
t1.id, COUNT(DISTINCT t1.id) AS matches
FROM
table_one t1
JOIN table_two t2 ON t1.id = t2.id
WHERE
t1.id IN (3390 , 3236, 148, 2811, 829, 137)
AND t2.value_one <= 30
AND t2.value_two < 2
GROUP BY t1.id
ORDER BY (t2.counts / matches)
LIMIT 0, 50 -- PROBLEM IS HERE (I think)
) c ON c.id = t1.id
ORDER BY (t2.counts / c.matches), t1.id;
Here is a rough description of what I think is happening:
The sub-query selects a bunch of ids from table_one that meet the criteria
These are ordered by (t2.counts / matches)
The top 50 (in ascending order) are fashioned into a table
This resulting table is then joined on the the id column
Results are returned from the top level JOIN - without a GROUP BY clause this time. table_one is a reference table so this will return many rows with the same ID.
I appreciate that some of these joins don't make a lot of sense but I have stripped down my query for readability - it's normally quite chunky .
The problem is that when, I include the LIMIT parameter I get a different set of results and not just the top 50. What I want to do is get the top results from the subquery and use these to join onto a bunch of other tables based on the reference table.
Here is what I have tried so far:
LIMIT on the outer query (this is undesirable as this cuts off important information).
Trying different LIMIT tables and values.
Any idea what is going wrong, or what else I could try?
I have found a solution to my problem. It seems as if my matches column name does can't be used in my ORDER BY clause - which is weird since I don't get an error. Either way, this solves the problem:
SELECT
t1.id,
(t2.counts / c.matches)
FROM
table_one t1
JOIN
table_two t2 ON t1.id = t2.id
JOIN
(
SELECT
t1.id, COUNT(DISTINCT t1.id) AS matches
FROM
table_one t1
JOIN table_two t2 ON t1.id = t2.id
WHERE
t1.id IN (3390 , 3236, 148, 2811, 829, 137)
AND t2.value_one <= 30
AND t2.value_two < 2
GROUP BY t1.id
ORDER BY (t2.counts / COUNT(DISTINCT t1.id)) -- This line is changed
LIMIT 0, 50
) c ON c.id = t1.id
ORDER BY (t2.counts / c.matches), t1.id;

A query that will see which results are the most

There is a table where strings are stored by type.
How to write 1 query so that it counts and outputs the number of results of each type.
Tried to do this:
SELECT COUNT(t1.id), COUNT(t2.id)
FROM table t1
LEFT JOIN table t2
ON t2.type='n1'
WHERE t1.type='b2';
But nothing works.
There can be many types, how can this be done?
I think this is what you want:
SELECT t1.type, COUNT(DISTINCT t1.id), COUNT(DISTINCT t2.id)
FROM table1 AS t1
LEFT JOIN table2 AS t2 ON t1.type = t2.type
GROUP BY t1.type
Note that this won't show any types that are only in table2. For that you'd need a FULL OUTER JOIN, which MySQL doesn't have. See How can I do a FULL OUTER JOIN in MySQL? for how to emulate it.

How to optimize mysql on left join

I try to explain a very high level
I have two complex SELECT queries(for the sake of example I reduce the queries to the following):
SELECT id, t3_id FROM t1;
SELECT t3_id, MAX(added) as last FROM t2 GROUP BY t3_id;
query 1 returns 16k rows and query 2 returns 15k
each queries individually takes less than 1 second to compute
However what I need is to sort the results using column added of query 2, when I try to use LEFT join
SELECT
t1.id, t1.t3_Id
FROM
t1
LEFT JOIN
(SELECT t3_id, MAX(added) as last FROM t2 GROUP BY t3_id) AS t_t2
ON t_t2.t3_id = t1.t3_id
GROUP BY t1.t3_id
ORDER BY t_t2.last
However, the execution time goes up to over a 1 minute.
I like to understand the reason
what is the cause of such a huge explosion?
NOTE:
ALL the used columns on every table have been indexed
e.g. :
table t1 has index on id,t3_Id
table t2 has index on t3_id and added
EDIT1
after #Tim Biegeleisen suggestion, I change the query to the following now the query is executing in about 16 seconds. If I remove the ORDER BY it query gets executed in less than 1 seconds. The problem is that ORDER BY the sole reason for this.
SELECT
t1.id, t1.t3_Id
FROM
t1
LEFT JOIN
t2 ON t2.t3_id = t1.t3_id
GROUP BY t1.t3_id
ORDER BY MAX(t2.added)
Even though table t2 has an index on column t3_id, when you join t1 you are actually joining to a derived table, which either can't use the index, or can't use it completely effectively. Since t1 has 16K rows and you are doing a LEFT JOIN, this means the database engine will need to scan the entire derived table for each record in t1.
You should use MySQL's EXPLAIN to see what the exact execution strategy is, but my suspicion is that the derived table is what is slowing you down.
The correct query should be:
SELECT
t1.id,
t1.t3_Id,
MAX(t2.added) as last
FROM t1
LEFT JOIN t2 on t1.t3_Id = t2.t3_Id
GROUP BY t2.t3_id
ORDER BY last;
This is happen because a temp table is generating on each record.
I think you could try to order everything after the records are available. Maybe:
select * from (
select * from
(select t3_id,max(t1_id) from t1 group by t3_id) as t1
left join (select t3_id,max(added) as last from t2 group by t3_id) as t2
on t1.t3_id = t2.t3_id ) as xx
order by last

Mysql Join 3 Tables in One Query with sorted Result

I have 3 tables and want to join all in one query to show latest 10 entries by datetime.
t1: id, username
t2: id, id_t1, med_id, ga_id, au_id, re_id, text, datetime
t3: id, id_t1, pro_id, au_id, re_id, text, datetime
First I saw it would be easy with simple left join and where id, but i got double results. Then i tried inner and outer join, also group by, but the result was bad.
So my question is how can i join all without double results of the last 10 of t2 and t3?
Hard to tell what exactly you are trying to acheive, but here is a clue how it could be complemented.
SELECT TOP 10 DISTINCT T1.*
FROM T1
INNER JOIN T2 ON T1.id = T2.id_t1
INNER JOIN T3 ON T1.id = T3.id_t1
ORDER BY (CASE WHEN T2.[DateTime] > T3.[DateTime] THEN
T2.[DateTime]
ELSE
T3.[DateTime]
END) DESC
If you need to select field from T2 and T3, GROUP BY on all T1 field with aggregate on field from t2 and t3 is an option. Otherwise, linked-subquery is the way to go.
As sgeddes commented already, it's hard to know what you need, without seeing some example data from your tables. It would really help to know what the relationship between the three tables is.
One question I have, in particular, is: how are t2 and t3 related, if at all? It looks like they might not be, as each of them has its own datetime column.
Perhaps the following could do the job, but we need some more info to know for sure:
(SELECT DISTINCT t1.*, t2.id, t2.au_id, t2.re_id, t2.text, t2.`datetime`, t2.med_id, t2.ga_id, NULL AS pro_id
FROM t1
INNER JOIN t2 ON t1.id = t2.id_t1)
UNION
(SELECT DISTINCT t1.*, t3.id, t3.au_id, t3.re_id, t3.text, t3.`datetime`, NULL AS med_id, NULL AS ga_id, t3.pro_id
FROM t1
INNER JOIN t3 ON t1.id = t3.id_t1)
ORDER BY datetime DESC
LIMIT 10
The following selects the username and the datetime for the last ten posts.
SELECT username, last_ten.`datetime` AS lastpost
FROM t1
INNER JOIN (
SELECT 't2' AS tab, id, `datetime`, t2.id_t1
FROM t2
UNION ALL
SELECT 't3' AS tab, id, `datetime`, t3.id_t1
FROM t3
ORDER BY datetime DESC
LIMIT 10
) AS last_ten ON t1.id = last_ten.id_t1

Using a query on a joined table instead of actual table

I have a query now which works great for finding player info including their rank:
SELECT t1.pid,
t1.type,
t1.pspeed,
t1.distance,
t1.maxspeed,
t1.prestrafe,
t1.strafes,
t1.sync,
t1.wpn,
1+SUM(t2.pid IS NOT NULL) as rank
FROM records t1
LEFT JOIN records t2
ON t1.type = t2.type
AND t1.pspeed = t2.pspeed
AND t1.distance < t2.distance
WHERE t1.pid = "'.$pid.'"
AND t1.type IN ("type1","type2","type3")
GROUP BY t1.type, t1.pspeed
The problem now is that I wish to use an authid instead of a pid as my variable. I can do an easy join on the records table along with a player table that has an id equal to the pid in the records table as well as the authid I wish to search by:
SELECT * FROM records JOIN players ON records.pid=players.id
I wish to use this second query in place of the FROM records portion of the first query but I've confused myself in my attempts via the many aliases used.
Here's a pitiful example of an attempt:
SELECT t1.pid,
t1.type,
t1.pspeed,
t1.distance,
t1.maxspeed,
t1.prestrafe,
t1.strafes,
t1.sync,
t1.wpn,
1+SUM(t2.pid IS NOT NULL) as rank
FROM (
SELECT *
FROM records
JOIN players ON records.pid=players.id
) t3 t1
LEFT JOIN records t2
ON t1.type = t2.type
AND t1.pspeed = t2.pspeed
AND t1.distance < t2.distance
WHERE t1.authid = "'.$authid.'"
AND t1.type IN ("type1","type2","type3")
GROUP BY t1.type, t1.pspeed
Anyone able to whip me into shape please feel free to do so. I'm awful at SQL still.
Your Join looks pretty close. However you gave your subquery two aliases. Try just one
SELECT t1.pid, t1.type, t1.pspeed, t1.distance, t1.maxspeed, t1.prestrafe, t1.strafes, t1.sync, t1.wpn, 1+SUM(t2.pid IS NOT NULL) as rank
FROM
(
SELECT players.*, rec.pid, rec.type, rec.pspeed, rec.distance, rec.maxspeed,
rec.prestrafe, rec.strafes, rec.sync, rec.wpn
FROM records AS rec JOIN players ON rec.pid = players.id ) AS t1
LEFT OUTER JOIN records t2
ON t1.type = t2.type AND t1.pspeed = t2.pspeed AND t1.distance < t2.distance
WHERE t1.authid = "'.$authid.'" AND
t1.type IN ("type1","type2","type3")
GROUP BY t1.type, t1.pspeed
Also, naming your columns in your subquery should eliminate the duplicate column error.