I have a pretty big SQL query that gets data from multiple database tables. I use the ON condition to check that the guild_ids are always the same, and in some cases it checks for a user_id too.
This is my query:
SELECT
SUM( f.guild_id = 787672220503244800 AND f.winner_id LIKE '%841827102331240468%' ) AS guild_winner,
SUM( f.winner_id LIKE '%841827102331240468%' ) AS win_sum,
m.message_count,
r.bypass_role_id,
i.real_count,
i.total_count,
i.bonus_count,
i.left_count
FROM
guild_finished_giveaways AS f
JOIN guild_message_count AS m
JOIN guild_role_settings AS r
JOIN guild_invite_count AS i ON m.guild_id = f.guild_id
AND m.user_id = 841827102331240468
AND r.guild_id = f.guild_id
AND i.guild_id = f.guild_id
AND i.user_id = m.user_id
But it runs pretty slowly, taking over 15 s. I can't see why it takes so long.
I figured out that if I remove the "guild_invite_count" JOIN, it's pretty fast again. Do I have some simple error here that I don't see? Or what could be the issue?
Each JOIN expression needs its own ON. Don't wait until the end for this. As it was, the server was forced to build up a cartesian product of all those tables before narrowing them down again, and I'm surprised the query ran at all (I'd expect a syntax error for missing ON clauses).
FROM guild_finished_giveaways AS f
JOIN guild_message_count AS m ON m.guild_id = f.guild_id
JOIN guild_role_settings AS r ON r.guild_id = f.guild_id
JOIN guild_invite_count AS i ON i.guild_id = f.guild_id
AND i.user_id = m.user_id
WHERE m.user_id = 841827102331240468
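To get a feel for the size of the cartesian product described above, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, and the three tiny tables f, m and r are hypothetical, cut down to the join columns only. It counts the row combinations produced by joins without ON clauses and compares that with one ON per JOIN.

```python
import sqlite3

# Hypothetical miniature tables; names and columns are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE f (guild_id INTEGER);
    CREATE TABLE m (guild_id INTEGER, user_id INTEGER);
    CREATE TABLE r (guild_id INTEGER);
    INSERT INTO f VALUES (1), (1), (2);
    INSERT INTO m VALUES (1, 10), (2, 10), (2, 11);
    INSERT INTO r VALUES (1), (2);
""")

# No ON clauses: every row pairs with every other row (3 * 3 * 2 combinations).
unfiltered_rows = conn.execute(
    "SELECT COUNT(*) FROM f JOIN m JOIN r"
).fetchone()[0]

# One ON per JOIN: only rows that share a guild_id are combined.
joined_rows = conn.execute("""
    SELECT COUNT(*)
    FROM f
    JOIN m ON m.guild_id = f.guild_id
    JOIN r ON r.guild_id = f.guild_id
""").fetchone()[0]

print(unfiltered_rows, joined_rows)
```

With real table sizes the unconstrained count grows as the product of all row counts, which is why the placement of the conditions can matter so much.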
It's also more than a little odd to use SUM() or any other aggregate function in the same query as non-aggregated values without a GROUP BY clause.
Are you using InnoDB?
Does every table have a PRIMARY KEY?
These may help:
m: PRIMARY KEY(user_id) -- assuming that is unique in that table
f: INDEX(guild_id, winner_id)
r: INDEX(guild_id, bypass_role_id)
i: INDEX(user_id,)
It looks like some tables should not be separate -- perhaps r,i,f could be combined? (I need to see SHOW CREATE TABLE to say more.)
Do NOT have a commalist in winner_id. Instead have another table with one row per winner per game (or whatever it is a winner of). Perhaps just two columns, like a many-to-many mapping table.
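A minimal sketch of such a mapping table, assuming SQLite through Python's sqlite3 module as a stand-in for MySQL. The table name giveaway_winners, its columns, the index name and the sample ids are all hypothetical. With one row per winner, the win counts become plain equality tests, whereas a LIKE pattern with a leading wildcard generally cannot use an index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical mapping table: one row per winner per giveaway.
    CREATE TABLE giveaway_winners (
        giveaway_id INTEGER NOT NULL,
        guild_id    INTEGER NOT NULL,
        user_id     INTEGER NOT NULL,
        PRIMARY KEY (giveaway_id, user_id)
    );
    CREATE INDEX idx_winner_user ON giveaway_winners (user_id, guild_id);
    INSERT INTO giveaway_winners VALUES
        (1, 100, 7), (1, 100, 8),
        (2, 100, 7),
        (3, 200, 7);
""")

# Total wins for user 7, and wins inside guild 100, using equality tests only.
win_sum, guild_winner = conn.execute("""
    SELECT COUNT(*),
           COALESCE(SUM(guild_id = ?), 0)
    FROM giveaway_winners
    WHERE user_id = ?
""", (100, 7)).fetchone()

print(win_sum, guild_winner)
```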
Noting that the execution is likely to start with m and then go next to i, let's improve on Joel's suggestion:
FROM guild_message_count AS m
JOIN guild_invite_count AS i ON i.user_id = m.user_id
JOIN guild_finished_giveaways AS f ON f.guild_id = m.guild_id
JOIN guild_role_settings AS r ON r.guild_id = m.guild_id
WHERE m.user_id = 841827102331240468
Note that 3 tables are joined on guild_id; but only 2 = are needed.
SUM without GROUP BY sums up the entire resultset (after JOINing). But you have 6 non-aggregates, so you need to GROUP BY all 6.
But that may lead to grossly inflated sums. Maybe you need to do the aggregation just over f first since that is where you are summing. Then JOIN to the rest??
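The inflation, and the "aggregate over f first" idea, can be sketched as follows. This assumes SQLite through Python's sqlite3 module as a stand-in for MySQL. The tables are hypothetical simplifications, and the won column is an invented 0/1 flag standing in for the winner_id test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified tables.
    CREATE TABLE f (guild_id INTEGER, won INTEGER);   -- one row per giveaway
    CREATE TABLE r (guild_id INTEGER, bypass_role_id INTEGER);
    INSERT INTO f VALUES (1, 1), (1, 1), (1, 0);
    INSERT INTO r VALUES (1, 50), (1, 51);            -- two role rows for guild 1
""")

# Summing after the join counts each giveaway once per matching role row.
inflated = conn.execute("""
    SELECT SUM(f.won)
    FROM f JOIN r ON r.guild_id = f.guild_id
""").fetchone()[0]

# Aggregating f on its own first keeps the sum independent of the other tables.
rows = conn.execute("""
    SELECT w.wins, r.bypass_role_id
    FROM (SELECT guild_id, SUM(won) AS wins FROM f GROUP BY guild_id) AS w
    JOIN r ON r.guild_id = w.guild_id
    ORDER BY r.bypass_role_id
""").fetchall()

print(inflated, rows)
```

Here the joined sum is doubled because guild 1 has two role rows, while the derived table reports the true count on every output row.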
my table user contains these fields
id,company_id,created_by,name,image
table valet contains
id,vid,dept_id
table cart contains
id,dept_id,map_id,purchase,time
to get the details i have written this mysql query
SELECT c.id, a.id, c.purchace, c.time
FROM user a
LEFT JOIN valet b ON a.vid = b.id
AND a.is_deleted = 0
LEFT JOIN cart c ON b.dept_id = c.dept_id
WHERE a.company_id = 18
AND a.created_by = 102
AND a.is_deleted = 0
AND c.time
IN ( SELECT MAX( time ) FROM cart WHERE dept_id = b.dept_id )
From these three tables I want to select the last updated row from cart along with the id from the user table that is mapped in the valet table.
This query works fine, but it takes almost 15 sec to retrieve the details.
Is there any way to improve this query, or am I doing something wrong?
Any help would be appreciated.
For one thing, I can see that you’re running the subquery for each row. Depending on what the optimiser does, that may have an impact. max is a pretty expensive operation (there’s nothing for it but to read every row).
If you plan to update and use this query repeatedly, perhaps you should at least index the table on cart.time. This will make it much easier to find the maximum value.
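MySQL has the concept of user variables, so you could set a variable to the result of the subquery, and that might help. Another option is to compute the maximum time per dept_id once in a derived table and join to it: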
SELECT c.id, a.id, c.purchace, c.time
FROM
user a
LEFT JOIN valet b ON a.vid = b.id AND a.is_deleted = '0'
LEFT JOIN cart c ON b.dept_id = c.dept_id
LEFT JOIN (SELECT dept_id,max(time) as mx FROM cart GROUP BY dept_id) m on m.dept_id=c.dept_id
WHERE
a.company_id = '18'
AND a.created_by = '102'
AND a.is_deleted = '0'
AND c.time=m.mx;
Note also:
since you’re only testing a single value (max) for c.time, you should be using = not in.
I’m not sure why you are using strings instead of integers. I should have thought that leaving off the quotes makes more sense.
Your JOIN includes AND a.is_deleted = '0', though you make no mention of it in your table description. In any case, why is it in the JOIN and not in the WHERE clause?
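The placement matters because a condition on the left-hand table inside a LEFT JOIN's ON clause does not remove that table's rows. A minimal sketch, assuming SQLite through Python's sqlite3 module as a stand-in for MySQL, with hypothetical cut-down versions of the user and valet tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified versions of the user and valet tables.
    CREATE TABLE user  (id INTEGER, vid INTEGER, is_deleted INTEGER);
    CREATE TABLE valet (id INTEGER, dept_id INTEGER);
    INSERT INTO user  VALUES (1, 10, 0), (2, 10, 1);
    INSERT INTO valet VALUES (10, 500);
""")

# Condition in ON: the deleted user is still returned, just without a match.
in_on = conn.execute("""
    SELECT a.id, b.dept_id
    FROM user a
    LEFT JOIN valet b ON a.vid = b.id AND a.is_deleted = 0
    ORDER BY a.id
""").fetchall()

# Condition in WHERE: the deleted user is removed from the result.
in_where = conn.execute("""
    SELECT a.id, b.dept_id
    FROM user a
    LEFT JOIN valet b ON a.vid = b.id
    WHERE a.is_deleted = 0
    ORDER BY a.id
""").fetchall()

print(in_on, in_where)
```

In the query under discussion the same test also appears in the WHERE clause, so the copy in the ON clause looks redundant.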
I have the following query:
SELECT PKID, QuestionText, Type
FROM Questions
WHERE PKID IN (
SELECT FirstQuestion
FROM Batch
WHERE BatchNumber IN (
SELECT BatchNumber
FROM User
WHERE RandomString = '$key'
)
)
I've heard that sub-queries are inefficient and that joins are preferred. I can't find anything explaining how to convert a 3+ tier sub-query to join notation, however, and can't get my head around it.
Can anyone explain how to do it?
SELECT DISTINCT a.*
FROM Questions a
INNER JOIN Batch b
ON a.PKID = b.FirstQuestion
INNER JOIN User c
ON b.BatchNumber = c.BatchNumber
WHERE c.RandomString = '$key'
DISTINCT is specified because a row in Questions might match multiple rows in the other tables, which would produce duplicate records in the result. Since you are only interested in records from the Questions table, a DISTINCT keyword will suffice.
To further gain more knowledge about joins, kindly visit the link below:
Visual Representation of SQL Joins
Try :
SELECT q.PKID, q.QuestionText, q.Type
FROM Questions q
INNER JOIN Batch b ON q.PKID = b.FirstQuestion
INNER JOIN User u ON u.BatchNumber = b.BatchNumber
WHERE u.RandomString = '$key'
select
q.pkid,
q.questiontext,
q.type
from user u
join batch b
on u.batchnumber = b.batchnumber
join questions q
on b.firstquestion = q.pkid
where u.randomstring = '$key'
Since your WHERE clause filters on the USER table, start with that in the FROM clause. Next, apply your joins backwards.
In order to do this correctly, you need distinct in the subquery. Otherwise, you might multiply rows in the join version:
SELECT q.PKID, q.QuestionText, q.Type
FROM Questions q join
(select distinct FirstQuestion
from Batch b join user u
on b.batchnumber = u.batchnumber and
u.RandomString = '$key'
) fq
on q.pkid = fq.FirstQuestion
As to whether the in or join version is better . . . that depends. In some cases, particularly if the fields are indexed, the in version might be fine.
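The row-multiplication concern can be checked on a small example. This sketch assumes SQLite through Python's sqlite3 module as a stand-in for MySQL. The sample rows are hypothetical, and a bound parameter replaces '$key' only to keep the snippet self-contained. Two users share the same key and batch, so the plain join returns the question twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data: two users share the same key and batch.
    CREATE TABLE Questions (PKID INTEGER, QuestionText TEXT, Type TEXT);
    CREATE TABLE Batch (BatchNumber INTEGER, FirstQuestion INTEGER);
    CREATE TABLE User (BatchNumber INTEGER, RandomString TEXT);
    INSERT INTO Questions VALUES (1, 'First?', 'text');
    INSERT INTO Batch VALUES (7, 1);
    INSERT INTO User VALUES (7, 'abc'), (7, 'abc');
""")
key = ("abc",)

# Nested IN: each question appears at most once.
in_rows = conn.execute("""
    SELECT PKID FROM Questions WHERE PKID IN (
        SELECT FirstQuestion FROM Batch WHERE BatchNumber IN (
            SELECT BatchNumber FROM User WHERE RandomString = ?))
""", key).fetchall()

join_sql = """
    SELECT {distinct} q.PKID
    FROM Questions q
    JOIN Batch b ON q.PKID = b.FirstQuestion
    JOIN User u ON u.BatchNumber = b.BatchNumber
    WHERE u.RandomString = ?
"""
# Plain join: one output row per matching user.
join_rows = conn.execute(join_sql.format(distinct=""), key).fetchall()
# DISTINCT collapses the duplicates again.
distinct_rows = conn.execute(join_sql.format(distinct="DISTINCT"), key).fetchall()

print(in_rows, join_rows, distinct_rows)
```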
I'm writing this complex query to return a large dataset of about 100,000 records. The query runs fine until I add this OR condition to the WHERE clause:
AND (responses.StrategyFk = strategies.Id OR responses.StrategyFk IS NULL)
Now I understand that putting the OR condition in there adds a lot of overhead.
Without that statement and just:
AND responses.StrategyFk = strategies.Id
The query runs within 15 seconds, but doesn't return any records that have no FK linking to a strategy.
I would like these records as well. Is there an easier way to find both kinds of records with a simple WHERE condition? I can't just add another AND condition for NULL records because that would break the previous condition. I'm unsure where to go from here.
Here's the lower half of my query.
FROM
responses, subtestinstances, students, schools, items,
strategies, subtests
WHERE
subtestinstances.Id = responses.SubtestInstanceFk
AND subtestinstances.StudentFk = students.Id
AND students.SchoolFk = schools.Id
AND responses.ItemFk = items.Id
AND (responses.StrategyFk = strategies.Id Or responses.StrategyFk IS Null)
AND subtests.Id = subtestinstances.SubtestFk
try:
SELECT ... FROM
responses
JOIN subtestinstances ON subtestinstances.Id = responses.SubtestInstanceFk
JOIN students ON subtestinstances.StudentFk = students.Id
JOIN schools ON students.SchoolFk = schools.Id
JOIN items ON responses.ItemFk = items.Id
JOIN subtests ON subtests.Id = subtestinstances.SubtestFk
LEFT JOIN strategies ON responses.StrategyFk = strategies.Id
That's it. No OR condition is really needed, because that's what a LEFT JOIN does in this case. Any row where responses.StrategyFk IS NULL has no match in the strategies table, and the LEFT JOIN will still return that row, with NULLs in the strategies columns.
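A minimal sketch of that behaviour, assuming SQLite through Python's sqlite3 module as a stand-in for MySQL. The tables are cut down to two columns each, and the Name column and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified tables; Name is an invented column.
    CREATE TABLE strategies (Id INTEGER, Name TEXT);
    CREATE TABLE responses (Id INTEGER, StrategyFk INTEGER);
    INSERT INTO strategies VALUES (1, 'counting');
    INSERT INTO responses VALUES (100, 1), (101, NULL);
""")

# Inner join: the response without a strategy is dropped.
inner_rows = conn.execute("""
    SELECT responses.Id, strategies.Name
    FROM responses
    JOIN strategies ON responses.StrategyFk = strategies.Id
    ORDER BY responses.Id
""").fetchall()

# Left join: the same response is kept, with NULL for the strategy columns.
left_rows = conn.execute("""
    SELECT responses.Id, strategies.Name
    FROM responses
    LEFT JOIN strategies ON responses.StrategyFk = strategies.Id
    ORDER BY responses.Id
""").fetchall()

print(inner_rows, left_rows)
```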
See this link for a simple explanation of joins: http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html
After that, if you're still having performance issues then you can start looking at the EXPLAIN SELECT ... ; output and looking for indexes that may need to be added. Optimizing Queries With Explain -- MySQL Manual
Try using explicit JOINs:
...
FROM responses a
INNER JOIN subtestinstances b
ON b.id = a.subtestinstancefk
INNER JOIN students c
ON c.id = b.studentfk
INNER JOIN schools d
ON d.id = c.schoolfk
INNER JOIN items e
ON e.id = a.itemfk
INNER JOIN subtests f
ON f.id = b.subtestfk
LEFT JOIN strategies g
ON g.id = a.strategyfk
I am using the following MySQL SELECT query with several joins. I am wondering if this is what a reasonably good SELECT statement should look like:
SELECT *
FROM table_news AS a
INNER JOIN table_cat AS b ON a.cat_id = b.id
INNER JOIN table_countries AS c ON a.country_id = c.id
INNER JOIN table_addresses AS d ON a.id = d.news_id
WHERE a.deleted = 0
AND a.hidden = 0
AND a.cat_id = ".$search_cat."
AND a.country_id = ".$search_country."
AND a.title LIKE '%".$search_string."%'
OR a.deleted = 0
AND a.hidden = 0
AND a.cat_id = ".$search_cat."
AND a.country_id = ".$search_country."
AND a.subtitle LIKE '%".$search_string."%'"
It seems to be a lot of joins. Even though table b and table c contain only 3 or 4 fields, I wonder if the number of joins would clearly slow down the search on the starting-page?
Would it be better to put the fields from table d (street, city and so on) back into the main-table, as they should be needed most of the time this query is executed?
Thanx in advance,
Jayden
I don't think there is necessarily anything wrong with having three joins. There are a couple of things you can do to make sure the query is optimised.
Firstly, you should never do SELECT * - instead explicitly state what fields you want to return from the database.
Also, I would create indexes on all the fields you have in the WHERE clause, and on all of the fields you are joining on. This can be a bit of a trade-off: for example, if you are doing a lot of write operations then there is a hit, because the index has to be updated on every write.
I've found info on how to optimize MySQL queries, but most of the tips seem to suggest avoiding things MySQL isn't built for (e.g., calculations, validation, etc.) My query on the other hand is very straight forward but joins a lot of tables together.
Is there an approach to speeding up simple queries with many INNER JOINS? How would I fix my query below?
SELECT t_one.id FROM table_one t_one
INNER JOIN entr_to_state st
INNER JOIN entr_to_country ct
INNER JOIN entr_to_domain dm
INNER JOIN entr_timing t
INNER JOIN entr_to_weather a2w
INNER JOIN entr_to_imp_num a2i
INNER JOIN entr_collection c
WHERE t_one.type='normal'
AND t_one.campaign_id = c.id
AND t_one.status='running'
AND c.status='running'
AND (c.opt_schedule = 'continuous' OR (c.opt_schedule = 'schedulebydate'
AND (c.start_date <= '2011-03-06 14:25:52' AND c.end_date >= '2011-03-06 14:25:52')))
AND t.entr_id = t_one.id AND ct.entr_id = t_one.id
AND st.entr_id = t_one.id AND a2w.entr_id = t_one.id
AND (t_one.targeted_gender = 'male' OR t_one.targeted_gender = 'both')
AND t_one.targeted_min_age <= 23.1 AND t_one.targeted_max_age > 23.1
AND (ct.abbreviation = 'US' OR ct.abbreviation = 'any')
AND (st.abbreviation = 'CO' OR st.abbreviation = 'any')
AND t.sun = 1 AND t.hour_14 = 1
AND (a2w.weather_category_id = 1 OR a2w.weather_category_id = 0)
AND t_one.targeted_min_temp <= 46
AND t_one.targeted_max_temp > 46 GROUP BY t_one.id
Index all relevant fields, of course, which I'm sure you have
Then find which joins are the most costly ones by running EXPLAIN SELECT...
Consider splitting them off into a separate query, i.e. narrow down the record(s) you're looking for, then perform the joins on those records rather than on all the records
i.e.
SELECT c.*, ....
FROM (SELECT x, y, z .... ) AS c
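The "narrow down first, then join" shape might look like the sketch below. It assumes SQLite through Python's sqlite3 module as a stand-in for MySQL, and uses hypothetical cut-down versions of table_one and entr_timing with invented sample rows. Whether this form is actually faster depends on the optimizer, so it is worth comparing the EXPLAIN output of both forms on the real tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified tables and sample rows.
    CREATE TABLE table_one (id INTEGER, status TEXT, type TEXT);
    CREATE TABLE entr_timing (entr_id INTEGER, sun INTEGER, hour_14 INTEGER);
    INSERT INTO table_one VALUES
        (1, 'running', 'normal'), (2, 'paused', 'normal'), (3, 'running', 'normal');
    INSERT INTO entr_timing VALUES (1, 1, 1), (2, 1, 1), (3, 0, 1);
""")

# The derived table c keeps only the candidate rows; the join then runs on those.
narrowed_ids = conn.execute("""
    SELECT c.id
    FROM (SELECT id
          FROM table_one
          WHERE status = 'running' AND type = 'normal') AS c
    JOIN entr_timing t ON t.entr_id = c.id
    WHERE t.sun = 1 AND t.hour_14 = 1
    ORDER BY c.id
""").fetchall()

print(narrowed_ids)
```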
You would need to run EXPLAIN SELECT on the query, check which parts of the query are not using indexes, and then attempt to index those columns. If possible, break the query down into smaller parts as well.
If you really cannot in any way optimize the underlying DB or your query, you could resort to a flat table that has the data you need for fast access. Then just hook up the main query to update the flat table to run as often as needed.