How to optimize simple MySQL query with lots of INNER JOINS - mysql

I've found info on how to optimize MySQL queries, but most of the tips seem to suggest avoiding things MySQL isn't built for (e.g., calculations, validation, etc.). My query, on the other hand, is very straightforward but joins a lot of tables together.
Is there an approach to speeding up simple queries with many INNER JOINS? How would I fix my query below?
SELECT t_one.id FROM table_one t_one
INNER JOIN entr_to_state st
INNER JOIN entr_to_country ct
INNER JOIN entr_to_domain dm
INNER JOIN entr_timing t
INNER JOIN entr_to_weather a2w
INNER JOIN entr_to_imp_num a2i
INNER JOIN entr_collection c
WHERE t_one.type='normal'
AND t_one.campaign_id = c.id
AND t_one.status='running'
AND c.status='running'
AND (c.opt_schedule = 'continuous' OR (c.opt_schedule = 'schedulebydate'
AND (c.start_date <= '2011-03-06 14:25:52' AND c.end_date >= '2011-03-06 14:25:52')))
AND t.entr_id = t_one.id AND ct.entr_id = t_one.id
AND st.entr_id = t_one.id AND a2w.entr_id = t_one.id
AND (t_one.targeted_gender = 'male' OR t_one.targeted_gender = 'both')
AND t_one.targeted_min_age <= 23.1 AND t_one.targeted_max_age > 23.1
AND (ct.abbreviation = 'US' OR ct.abbreviation = 'any')
AND (st.abbreviation = 'CO' OR st.abbreviation = 'any')
AND t.sun = 1 AND t.hour_14 = 1
AND (a2w.weather_category_id = 1 OR a2w.weather_category_id = 0)
AND t_one.targeted_min_temp <= 46
AND t_one.targeted_max_temp > 46 GROUP BY t_one.id

Index all relevant fields, of course, which I'm sure you have.
Then find which joins are the most costly by running EXPLAIN SELECT...
Consider splitting them off into a separate query, i.e. narrow down the record(s) you're looking for first, then perform the joins on those records rather than on all the records.
For example:
SELECT c.*, ....
FROM (SELECT x, y, z .... ) AS c

You would need to run EXPLAIN SELECT on the query, check which parts of it are not using indexes, and then try to index those. If possible, break the query down into smaller parts as well.
If you really cannot optimize the underlying DB or your query in any way, you could resort to a flat table that holds the data you need for fast access. Then schedule the main query to refresh the flat table as often as needed.
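As a rough sketch of the advice above applied to the query in the question: each join gets its own ON condition, and the pairs of OR comparisons on one column become IN lists. One assumption needs flagging. The original query joins entr_to_domain and entr_to_imp_num with no condition at all, so every matching row is multiplied by the full size of both tables before GROUP BY collapses it again. They are left out here, which returns the same ids as long as neither table is empty. Whether they were meant to be joined on entr_id is not stated in the question.

```sql
-- Sketch only: join conditions live in ON, row filters stay in WHERE.
-- ASSUMPTION: entr_to_domain and entr_to_imp_num are omitted because the
-- original query never relates them to any other table.
SELECT t_one.id
FROM table_one AS t_one
INNER JOIN entr_collection AS c   ON c.id = t_one.campaign_id
INNER JOIN entr_timing     AS t   ON t.entr_id = t_one.id
INNER JOIN entr_to_country AS ct  ON ct.entr_id = t_one.id
INNER JOIN entr_to_state   AS st  ON st.entr_id = t_one.id
INNER JOIN entr_to_weather AS a2w ON a2w.entr_id = t_one.id
WHERE t_one.type = 'normal'
  AND t_one.status = 'running'
  AND c.status = 'running'
  AND (c.opt_schedule = 'continuous'
       OR (c.opt_schedule = 'schedulebydate'
           AND c.start_date <= '2011-03-06 14:25:52'
           AND c.end_date >= '2011-03-06 14:25:52'))
  AND t_one.targeted_gender IN ('male', 'both')
  AND t_one.targeted_min_age <= 23.1
  AND t_one.targeted_max_age > 23.1
  AND ct.abbreviation IN ('US', 'any')
  AND st.abbreviation IN ('CO', 'any')
  AND t.sun = 1
  AND t.hour_14 = 1
  AND a2w.weather_category_id IN (0, 1)
  AND t_one.targeted_min_temp <= 46
  AND t_one.targeted_max_temp > 46
GROUP BY t_one.id;
```

Prefixing this with EXPLAIN should then show, table by table, which of the remaining joins still lacks a usable index.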

Related

SQL JOIN query needs over 15s to run

I have a pretty big SQL query that gets data from multiple database tables. I use the ON condition to check that the guild_ids are always the same and, in some cases, it checks for a user_id too.
That is my query:
SELECT
SUM( f.guild_id = 787672220503244800 AND f.winner_id LIKE '%841827102331240468%' ) AS guild_winner,
SUM( f.winner_id LIKE '%841827102331240468%' ) AS win_sum,
m.message_count,
r.bypass_role_id,
i.real_count,
i.total_count,
i.bonus_count,
i.left_count
FROM
guild_finished_giveaways AS f
JOIN guild_message_count AS m
JOIN guild_role_settings AS r
JOIN guild_invite_count AS i ON m.guild_id = f.guild_id
AND m.user_id = 841827102331240468
AND r.guild_id = f.guild_id
AND i.guild_id = f.guild_id
AND i.user_id = m.user_id
But it runs pretty slowly, taking over 15s. I can't see why it takes so long.
I figured out that if I remove the "guild_invite_count" JOIN, it's pretty fast again. Do I have some simple error here that I don't see? Or what could be the issue?
Each JOIN expression needs its own ON. Don't wait until the end for this. As it was, the server may have been forced to build up a cartesian product of all those tables before narrowing them down again. MySQL accepts a JOIN with no ON clause and treats it as a cross join, which is why the query ran at all rather than failing with a syntax error.
FROM guild_finished_giveaways AS f
JOIN guild_message_count AS m ON m.guild_id = f.guild_id
JOIN guild_role_settings AS r ON r.guild_id = f.guild_id
JOIN guild_invite_count AS i ON i.guild_id = f.guild_id
AND i.user_id = m.user_id
WHERE m.user_id = 841827102331240468
It's also more than a little odd to use SUM() or any other aggregate function in the same query as non-aggregated values without a GROUP BY clause.
Are you using InnoDB?
Does every table have a PRIMARY KEY?
These may help:
m: PRIMARY KEY(user_id) -- assuming that is unique in that table
f: INDEX(guild_id, winner_id)
r: INDEX(guild_id, bypass_role_id)
i: INDEX(user_id)
It looks like some tables should not be separate -- perhaps r,i,f could be combined? (I need to see SHOW CREATE TABLE to say more.)
Do NOT keep a comma-separated list in winner_id. Instead, have another table with one row per winner per game (or whatever it is a winner of). Perhaps just two columns, like a many-to-many mapping table.
Noting that the execution is likely to start with m and then go next to i, let's improve on Joel's suggestion:
FROM guild_message_count AS m
JOIN guild_invite_count AS i ON i.user_id = m.user_id
JOIN guild_finished_giveaways AS f ON f.guild_id = m.guild_id
JOIN guild_role_settings AS r ON r.guild_id = m.guild_id
WHERE m.user_id = 841827102331240468
Note that 3 tables are joined on guild_id, but only 2 "=" comparisons are needed.
SUM without GROUP BY sums up the entire resultset (after JOINing). But you have 6 non-aggregates, so you need to GROUP BY all 6.
But that may lead to grossly inflated sums. Maybe you need to do the aggregation just over f first since that is where you are summing. Then JOIN to the rest??
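One way to read that last suggestion, as a sketch only: aggregate guild_finished_giveaways per guild in a derived table first, then join that single row per guild to the other tables. This assumes guild_role_settings has one row per guild and that the message and invite count tables have one row per (guild_id, user_id), none of which the question confirms. The alias w is made up for the example.

```sql
SELECT
    w.guild_winner,
    w.win_sum,
    m.message_count,
    r.bypass_role_id,
    i.real_count,
    i.total_count,
    i.bonus_count,
    i.left_count
FROM guild_message_count AS m
JOIN guild_invite_count AS i
    ON i.user_id = m.user_id
   AND i.guild_id = m.guild_id
JOIN guild_role_settings AS r
    ON r.guild_id = m.guild_id
JOIN (
    -- One row per guild, so the outer joins cannot multiply the sums.
    SELECT
        f.guild_id,
        SUM(f.guild_id = 787672220503244800
            AND f.winner_id LIKE '%841827102331240468%') AS guild_winner,
        SUM(f.winner_id LIKE '%841827102331240468%') AS win_sum
    FROM guild_finished_giveaways AS f
    GROUP BY f.guild_id
) AS w
    ON w.guild_id = m.guild_id
WHERE m.user_id = 841827102331240468;
```

Guilds with no finished giveaways drop out of this result; changing the last JOIN to a LEFT JOIN would keep them, with NULL in both sums.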

MariaDB Slow Query on Plesk vs local PC

I have the below query, which I appreciate probably isn't well written, but on my local PC with Xampp and MariaDB it executes in 0.1719 seconds, which is about the speed I would hope for.
However, on my development server with Plesk and MariaDB the same query with the same data takes over 12 seconds. Obviously that would be no use.
Probably the query could be modified to make it better, but can somebody explain why the performance difference? The server is a VPS, it has no shortage of resources - it isn't live so usage is almost none at all, yet still 12+ seconds for this query.
The query:
SELECT m.id AS match_id, e.event AS event1
FROM matches m
JOIN competitions co ON co.id = m.competition
JOIN clubs h ON h.id = m.hometeam
JOIN clubs a ON a.id = m.awayteam
LEFT JOIN match_events e ON e.match = m.id
AND e.player = '7138'
WHERE (m.hometeam = '1'
OR m.awayteam = '1'
)
AND m.season = '121'
Are you sure you need AND e.player = '7138' in the ON clause of a LEFT JOIN and not in the WHERE clause?
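To make the difference concrete, here is a stripped-down sketch using only the matches and match_events tables from the question (the other joins are omitted for brevity). The two statements are not equivalent.

```sql
-- Filter in ON: every match of the season is returned.
-- event1 is NULL for matches where player 7138 has no event.
SELECT m.id AS match_id, e.event AS event1
FROM matches m
LEFT JOIN match_events e
    ON e.`match` = m.id
   AND e.player = '7138'
WHERE m.season = '121';

-- Filter in WHERE: rows with no matching event have e.player = NULL,
-- which fails the comparison, so only matches where player 7138 has
-- an event survive. The LEFT JOIN effectively becomes an INNER JOIN.
SELECT m.id AS match_id, e.event AS event1
FROM matches m
LEFT JOIN match_events e
    ON e.`match` = m.id
WHERE m.season = '121'
  AND e.player = '7138';
```

Which one is right depends on whether matches without an event for that player should appear in the result.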
Better indexing
Recommend these composite, covering, indexes:
m: (season, awayteam, hometeam, competition, id)
e: (player, match, event)
Avoiding OR
OR optimizes poorly. A common trick is to turn it into a UNION. That may work for your query:
SELECT ...
FROM matches m JOIN ...
WHERE m.season = 121
AND m.hometeam = 1
UNION ALL
SELECT ...
FROM matches m JOIN ...
WHERE m.season = 121
AND m.awayteam = 1
And have these two indexes:
INDEX(season, hometeam) -- will be used by one part of the UNION
INDEX(season, awayteam) -- will be used by the other
I chose UNION ALL because it is faster than UNION DISTINCT. But if you get unwanted dups, change it.
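Written out against the query in the question, the UNION version and its two indexes might look like the sketch below. The index names are invented for the example, and the unquoted numbers assume season, hometeam and awayteam are numeric columns.

```sql
-- Hypothetical index names; one index serves each branch of the UNION.
ALTER TABLE matches
    ADD INDEX idx_season_home (season, hometeam),
    ADD INDEX idx_season_away (season, awayteam);

SELECT m.id AS match_id, e.event AS event1
FROM matches m
JOIN competitions co ON co.id = m.competition
JOIN clubs h ON h.id = m.hometeam
JOIN clubs a ON a.id = m.awayteam
LEFT JOIN match_events e ON e.`match` = m.id AND e.player = '7138'
WHERE m.season = 121
  AND m.hometeam = 1
UNION ALL
SELECT m.id AS match_id, e.event AS event1
FROM matches m
JOIN competitions co ON co.id = m.competition
JOIN clubs h ON h.id = m.hometeam
JOIN clubs a ON a.id = m.awayteam
LEFT JOIN match_events e ON e.`match` = m.id AND e.player = '7138'
WHERE m.season = 121
  AND m.awayteam = 1;
```

A match can only show up in both branches if the same club is listed as both home and away team, so UNION ALL should be safe for sensible data.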

Efficiency of LEFT JOIN

I've searched extensively and increased performance through my research, but I'm still not getting results as quickly as I think are possible.
I have the following SQL:
SELECT x.`level`, count(x.`level`) AS TOTAL FROM (
SELECT a.`level` FROM `gharaffa`.`wwlassessments` a
LEFT JOIN `gharaffa`.`users` u on u.`pupilID` = a.`pupilID`
WHERE a.`dateAchieved` = (
SELECT MAX(a2.`dateAchieved`)
FROM `gharaffa`.`wwlassessments` a2
WHERE a.`pupilID`= a2.`pupilID` && a2.`id`='867' && u.form='Y02GA' && u.enrolled='1' )
)x
GROUP BY x.`level`
It executes on my table of 50,000 rows in 1.8 seconds.
However, the page will run this query with different parameters 60 times.
That's taking too long.
Originally I had this part:
u.form='Y02GA' && u.enrolled='1'
outside of the join and that took 4.20 seconds.... I've more than halved the time, but I can't help thinking it's still not as efficient as it could be.
Any pointers gratefully received.
John :-)
This is your query:
SELECT x.level, count(x.level) AS TOTAL
FROM (SELECT a.level
FROM gharaffa.wwlassessments a LEFT JOIN
gharaffa.`users` u
on u.`pupilID` = a.`pupilID`
WHERE a.dateAchieved = (SELECT MAX(a2.dateAchieved)
FROM gharaffa.wwlassessments a2
WHERE a.pupilID = a2.pupilID AND
a2.id = 867 AND
u.form = 'Y02GA' AND
u.enrolled = 1
)
) x
GROUP BY x.level;
This query seems way overcomplicated. Do note two smallish changes:
Removed single quotes around the numeric constants. Don't use single quotes for numeric values (I presume id is numeric).
Changed && to AND. The latter is standard SQL.
Some observations:
The innermost subquery turns the LEFT JOIN to an INNER JOIN.
The conditions on u should be in the middle WHERE, not the innermost WHERE.
The outer subquery is not needed.
So, I would write this as:
SELECT a.level, COUNT(*) as TOTAL
FROM gharaffa.wwlassessments a JOIN
gharaffa.users u
ON u.pupilID = a.pupilID
WHERE u.form = 'Y02GA' AND
u.enrolled = 1 AND
a.dateAchieved = (SELECT MAX(a2.dateAchieved)
FROM gharaffa.wwlassessments a2
WHERE a.pupilID = a2.pupilID AND
a2.id = 867
)
GROUP BY a.level;
For this, you want indexes on users(form, enrolled, pupilID) and wwlassessments(pupilId, id, level).
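As a sketch, those two indexes could be created as follows, with EXPLAIN used afterwards to check whether the optimizer actually picks them up. The index names are made up for the example.

```sql
-- Hypothetical index names; the column lists are the ones suggested above.
CREATE INDEX idx_users_form_enrolled_pupil
    ON gharaffa.users (form, enrolled, pupilID);

CREATE INDEX idx_wwl_pupil_id_level
    ON gharaffa.wwlassessments (pupilID, id, level);

-- Inspect the plan of the rewritten query.
EXPLAIN
SELECT a.level, COUNT(*) AS TOTAL
FROM gharaffa.wwlassessments a
JOIN gharaffa.users u
    ON u.pupilID = a.pupilID
WHERE u.form = 'Y02GA'
  AND u.enrolled = 1
  AND a.dateAchieved = (SELECT MAX(a2.dateAchieved)
                        FROM gharaffa.wwlassessments a2
                        WHERE a2.pupilID = a.pupilID
                          AND a2.id = 867)
GROUP BY a.level;
```

In the EXPLAIN output, the key column shows which index, if any, was chosen for each table.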

Optimizing a where statement MYSQL

I'm writing this complex query to return a large dataset of about 100,000 records. The query runs fine until I add this OR condition to the WHERE clause:
AND (responses.StrategyFk = strategies.Id OR responses.StrategyFk IS NULL)
Now I understand that putting the OR condition in there adds a lot of overhead.
Without that statement and just:
AND responses.StrategyFk = strategies.Id
The query runs within 15 seconds, but doesn't return any records that don't have an FK linking to a strategy.
I would like these records as well, though. Is there an easier way to find both kinds of records with a simple WHERE condition? I can't just add another AND condition for the NULL records because that would break the previous condition. I'm kind of unsure where to go from here.
Here's the lower half of my query.
FROM
responses, subtestinstances, students, schools, items,
strategies, subtests
WHERE
subtestinstances.Id = responses.SubtestInstanceFk
AND subtestinstances.StudentFk = students.Id
AND students.SchoolFk = schools.Id
AND responses.ItemFk = items.Id
AND (responses.StrategyFk = strategies.Id Or responses.StrategyFk IS Null)
AND subtests.Id = subtestinstances.SubtestFk
Try:
SELECT ... FROM
responses
JOIN subtestinstances ON subtestinstances.Id = responses.SubtestInstanceFk
JOIN students ON subtestinstances.StudentFk = students.Id
JOIN schools ON students.SchoolFk = schools.Id
JOIN items ON responses.ItemFk = items.Id
JOIN subtests ON subtests.Id = subtestinstances.SubtestFk
LEFT JOIN strategies ON responses.StrategyFk = strategies.Id
That's it. No OR condition is really needed, because that's what a LEFT JOIN does in this case. Anywhere responses.StrategyFk IS NULL, there will be no match in the strategies table, and the query will still return a row for that response.
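A tiny self-contained illustration of that behaviour, using two throwaway tables whose names and contents are invented for the example:

```sql
-- Hypothetical demo tables, not part of the asker's schema.
CREATE TABLE demo_strategies (Id INT PRIMARY KEY, Name VARCHAR(20));
CREATE TABLE demo_responses  (Id INT PRIMARY KEY, StrategyFk INT NULL);

INSERT INTO demo_strategies VALUES (10, 'counting');
INSERT INTO demo_responses  VALUES (1, 10), (2, NULL);

-- LEFT JOIN keeps response 2 even though it has no strategy.
SELECT r.Id, s.Name
FROM demo_responses r
LEFT JOIN demo_strategies s ON s.Id = r.StrategyFk
ORDER BY r.Id;
-- 1, 'counting'
-- 2, NULL

-- INNER JOIN drops it.
SELECT r.Id, s.Name
FROM demo_responses r
INNER JOIN demo_strategies s ON s.Id = r.StrategyFk
ORDER BY r.Id;
-- 1, 'counting'
```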
See this link for a simple explanation of joins: http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html
After that, if you're still having performance issues, you can start looking at the EXPLAIN SELECT ... ; output for indexes that may need to be added. See "Optimizing Queries With EXPLAIN" in the MySQL manual.
Try using explicit JOINs:
...
FROM responses a
INNER JOIN subtestinstances b
ON b.id = a.subtestinstancefk
INNER JOIN students c
ON c.id = b.studentfk
INNER JOIN schools d
ON d.id = c.schoolfk
INNER JOIN items e
ON e.id = a.itemfk
INNER JOIN subtests f
ON f.id = b.subtestfk
LEFT JOIN strategies g
ON g.id = a.strategyfk

Mysql - db-select with several joins and tablestructure

I am using the following MySQL SELECT query with several joins. I am wondering if this is what a reasonably good SELECT statement should look like:
SELECT *
FROM table_news AS a
INNER JOIN table_cat AS b ON a.cat_id = b.id
INNER JOIN table_countries AS c ON a.country_id = c.id
INNER JOIN table_addresses AS d ON a.id = d.news_id
WHERE a.deleted = 0
AND a.hidden = 0
AND a.cat_id = ".$search_cat."
AND a.country_id = ".$search_country."
AND a.title LIKE '%".$search_string."%'
OR a.deleted = 0
AND a.hidden = 0
AND a.cat_id = ".$search_cat."
AND a.country_id = ".$search_country."
AND a.subtitle LIKE '%".$search_string."%'"
It seems to be a lot of joins. Even though table b and table c contain only 3 or 4 fields, I wonder if the number of joins would noticeably slow down the search on the start page?
Would it be better to put the fields from table d (street, city and so on) back into the main-table, as they should be needed most of the time this query is executed?
Thanx in advance,
Jayden
I don't think there is necessarily anything wrong with having three joins. There are a couple of things you can do to make sure the query is optimised.
Firstly, you should never do SELECT *; instead, explicitly state which fields you want to return from the database.
Also, I would create indexes on all the fields you have in the WHERE clause, and all of the fields you are joining on. This can be a bit of a trade-off: for example, if you are doing a lot of write operations, there is a hit because the index has to be updated on every write.
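Putting both points together, a possible shape for the query is sketched below. Because AND binds more tightly than OR, the repeated conditions in the original can be factored out and only the two LIKE tests need the OR. The selected columns are a guess (street and city are the address fields mentioned in the question), the ? markers are prepared-statement placeholders standing in for the PHP variables, and the index names are invented.

```sql
SELECT a.id, a.title, a.subtitle, d.street, d.city
FROM table_news AS a
INNER JOIN table_cat AS b ON a.cat_id = b.id
INNER JOIN table_countries AS c ON a.country_id = c.id
INNER JOIN table_addresses AS d ON a.id = d.news_id
WHERE a.deleted = 0
  AND a.hidden = 0
  AND a.cat_id = ?
  AND a.country_id = ?
  AND (a.title LIKE CONCAT('%', ?, '%')
       OR a.subtitle LIKE CONCAT('%', ?, '%'));

-- Hypothetical indexes covering the equality filters and the join column.
CREATE INDEX idx_news_cat_country
    ON table_news (cat_id, country_id, deleted, hidden);
CREATE INDEX idx_addresses_news
    ON table_addresses (news_id);
```

Note that a LIKE pattern starting with % cannot use an ordinary index on title or subtitle, so the equality filters are what narrow the rows down first.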