This query has multiple JOINs and aggregate functions.
Executing it for approximately 6,000 users took 20 seconds.
Is there any way to run this query faster?
SELECT users.id,
       SUM(orders.totalCost) AS bought,
       COUNT(comment.id) AS commentsCount,
       COUNT(topics.id) AS topicsCount,
       COUNT(users_login.id) AS loginCount,
       COUNT(users_download.id) AS downloadsCount
FROM users
LEFT JOIN orders ON users.id=orders.userID AND orders.status=1
LEFT JOIN comment ON users.id=comment.userID
LEFT JOIN topics ON users.id=topics.userID
LEFT JOIN users_login ON users.id=users_login.userID
LEFT JOIN users_download ON users.id=users_download.userID
WHERE users.id='$userID'
GROUP BY users.id
ORDER BY `bought` DESC
The result of running EXPLAIN is shown in the attached screenshot (omitted here).
The EXPLAIN output shows you are doing full-table scans on everything except users. You need to create secondary (non-unique) indexes on userID on all the other tables in the join. That will speed up queries on individual users.
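For example (the index names are illustrative):

ALTER TABLE orders ADD INDEX idx_orders_userID (userID);
ALTER TABLE comment ADD INDEX idx_comment_userID (userID);
ALTER TABLE topics ADD INDEX idx_topics_userID (userID);
ALTER TABLE users_login ADD INDEX idx_users_login_userID (userID);
ALTER TABLE users_download ADD INDEX idx_users_download_userID (userID);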
However, if you're going to process all users in one pass, then do a single SELECT without the WHERE users.id = clause. Your aggregation returns only one row per user, so you should create a single result set containing all the rows and iterate over that, instead of reissuing the query once per user. In this case the secondary indexes may still help, as the counts can be determined from the index alone without looking at the tables themselves.
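That is, the same query as above, minus the per-user filter:

SELECT users.id,
       SUM(orders.totalCost) AS bought,
       COUNT(comment.id) AS commentsCount,
       COUNT(topics.id) AS topicsCount,
       COUNT(users_login.id) AS loginCount,
       COUNT(users_download.id) AS downloadsCount
FROM users
LEFT JOIN orders ON users.id = orders.userID AND orders.status = 1
LEFT JOIN comment ON users.id = comment.userID
LEFT JOIN topics ON users.id = topics.userID
LEFT JOIN users_login ON users.id = users_login.userID
LEFT JOIN users_download ON users.id = users_download.userID
GROUP BY users.id
ORDER BY bought DESC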
Related
I have a query with a long list (> 2000 ids) in a WHERE IN clause in mysql (InnoDB):
SELECT id
FROM table
WHERE user_id IN ('list of >2000 ids')
I tried to optimize this by using an INNER JOIN instead of the WHERE IN, like this (both id and user_id are indexed):
SELECT table.id
FROM table
INNER JOIN users ON table.user_id = users.id WHERE users.type = 1
Surprisingly, however, the first query is much faster (by a factor of 5 to 6). Why is this the case? And could the second query outperform the first once the number of ids in the WHERE IN clause becomes much larger?
This is not a direct answer to your question, but as an alternative you can often improve performance by replacing the IN clause with EXISTS, since EXISTS tends to perform better than IN (ref: Here):
SELECT t.id
FROM table t
WHERE EXISTS (SELECT 1
              FROM users
              WHERE users.id = t.user_id
                AND users.type = 1) -- the type filter your id list presumably came from
This is an unfair comparison between the two queries.
In the 1st query you provide a list of constants as the search criterion, so MySQL has to open and search only one table and/or one index.
In the 2nd query you instruct MySQL to obtain the list dynamically from another table and join that list back to the main table. It is also not clear whether indexes were used for the join or whether a full table scan was needed.
To have a fair comparison, time the query that you used to obtain the list in the 1st query together with the query itself. Or try:
SELECT table.id
FROM table
WHERE user_id IN (SELECT users.id FROM users WHERE users.type = 1)
The above fetches the list of ids dynamically in a subquery.
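To check whether indexes were actually used, run EXPLAIN on each variant; the type, key, and rows columns show how each table is accessed:

EXPLAIN SELECT table.id
FROM table
INNER JOIN users ON table.user_id = users.id
WHERE users.type = 1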
I have a MySQL query of this form:
SELECT
employee.name,
totalpayments.totalpaid
FROM
employee
JOIN (
SELECT
paychecks.employee_id,
SUM(paychecks.amount) totalpaid
FROM
paychecks
GROUP BY
paychecks.employee_id
) totalpayments on totalpayments.employee_id = employee.id
I've recently found that it runs MUCH faster in this form:
SELECT
employee.name,
(
SELECT
SUM(paychecks.amount)
FROM
paychecks
WHERE
paychecks.employee_id = employee.id
) totalpaid
FROM
employee
It surprises me that there would be a difference in speed, and that the lower query would be faster. I prefer the upper form for development, because I can run the subquery independently.
Is there a way to get the "best of both worlds": speedy results return AND being able to run the subquery in isolation?
Likely, the correlated subquery is able to make effective use of an index, which is why it's fast, even though that subquery has to be executed multiple times.
For the first query, the inline view causes MySQL to create a derived table, and for large sets that's effectively a MyISAM table.
In MySQL 5.6.x and later, the optimizer may choose to add an index on the derived table, if that would allow a ref operation and the estimated cost of the ref operation is lower than the nested loops scan.
I recommend you try using EXPLAIN to see the access plan. (Based on your report of performance, I suspect you are running on MySQL version 5.5 or earlier.)
The two statements are not entirely equivalent in the case where there are rows in employee for which there are no matching rows in paychecks.
An equivalent result could be obtained entirely avoiding a subquery:
SELECT e.name
, SUM(p.amount) AS total_paid
FROM employee e
JOIN paychecks p
ON p.employee_id = e.id
GROUP BY e.id
(Use an inner join to get a result equivalent to the first query, use a LEFT outer join to be equivalent to the second query. Wrap the SUM() aggregate in an IFNULL function if you want to return a zero rather than a NULL value when no matching row with a non-null value of amount is found in paychecks.)
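Whichever form you use, a composite index covering both columns would likely help here, since the per-employee SUM can then be computed from the index alone without touching the table rows (the index name below is illustrative):

CREATE INDEX idx_paychecks_employee_amount ON paychecks (employee_id, amount);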
A join is conceptually a Cartesian product: every record of table A is combined with every record of table B. The intermediate result has
number of records in table A * number of records in table B = rows
e.g. 10 * 10 = 100
and out of those 100 records, only the ones that match the join and filter conditions are returned by the query.
With a nested query, the inner query runs first, and only its result set becomes the input to the outer query; that is why nested queries can be faster than joins in cases like this.
I have two inner joins in my SQL query:
SELECT `M`.`msg_id`,
`U`.`username`,
`U`.`seo_username`
FROM `newdb2`.`users` AS `U`
INNER JOIN (SELECT subscriber_to_id
FROM subscriptions
WHERE subscriber_id = 434) AS subscriber
ON id = subscriber_to_id
INNER JOIN `newdb2`.`messages` AS `M`
ON (`M`.`uid_fk` = `U`.`id`)
ORDER BY id DESC LIMIT 10
When I execute this query, I see that it is really slow.
How can I modify this query to make it faster?
Quick fixes for things like this are adding indexes, which allow your database server to quickly look up the columns you are searching on. For more info on how to add indexes to columns, see the manual.
In this query, those columns are:
subscriptions.subscriber_id
subscriptions.subscriber_to_id
users.id
messages.uid_fk
The ORDER BY id should be OK, as I assume your id column already has a primary key index on it, but ordering will still slow the query down somewhat.
Subselect queries will also slow the query down. In this particular query, the derived table aliased as subscriber (containing the results of your subquery, which is inner joined on) is not referenced by that alias anywhere, so remove that derived-table join completely.
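One way to do that while keeping the subscriber_id = 434 filter is to fold the subquery into an ordinary join (a sketch, assuming subscriptions.subscriber_to_id references users.id, with the ORDER BY column qualified explicitly):

SELECT M.msg_id,
       U.username,
       U.seo_username
FROM newdb2.users AS U
INNER JOIN subscriptions AS S
        ON S.subscriber_to_id = U.id
       AND S.subscriber_id = 434
INNER JOIN newdb2.messages AS M
        ON M.uid_fk = U.id
ORDER BY U.id DESC
LIMIT 10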
I have two tables: people with 16,500 rows and visits with 17,000 rows.
My query contains a LEFT JOIN because I have to link visits to people. I'm aware that if there is a people record without a matching visits record, the visits columns will be NULL.
This simple query works like a charm.
SELECT * FROM people LEFT JOIN visits ON people.id = visits.id_people;
But when I try to count the returned rows, MySQL hangs (or keeps counting) for 30+ seconds, or until I kill it. That is not acceptable in a production environment.
Here are the different methods I tried for counting the resulting rows, but all of them hang the same way.
SELECT COUNT(*) FROM people LEFT JOIN visits ON people.id = visits.id_people;
SELECT SQL_CALC_FOUND_ROWS * FROM people LEFT JOIN visits ON people.id = visits.id_people;
SELECT FOUND_ROWS();
The strange thing is that those methods work fine on small test tables (5 and 5 rows).
Can anyone help?
If you are creating a new MySQL table, you can specify a column to index by using the INDEX keyword. Indexes are something extra that you can enable on your MySQL tables to increase performance.
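For example, for the visits table in this question, the join column could be indexed at creation time or added afterwards (the column types here are assumptions):

CREATE TABLE visits (
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    id_people INT NOT NULL,
    INDEX idx_id_people (id_people)
);

-- or, on the existing table:
ALTER TABLE visits ADD INDEX idx_id_people (id_people);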
http://www.databasejournal.com/features/mysql/article.php/1382791/Optimizing-MySQL-Queries-and-Indexes.htm
http://www.tutorialspoint.com/mysql/mysql-indexes.htm
Have a look at these; they give a much better idea.
cheers
I would like to know the impact on performance if I run this query in the following conditions.
Query:
select `players`.*, count(`clicks`.`id`) as `clicks_count`
from `players` left join `clicks` on `clicks`.`player_id` = `players`.`id`
group by `players`.`id`
order by `clicks_count` desc
limit 1
Conditions:
The clicks table gets about 1000 inserts per minute.
The clicks table will contain more than 1,000,000 rows.
The players table will contain 10,000 rows.
The players table gets inserted into every 5 minutes.
I would like to know what to expect performance-wise if I run the query 1000 times in 1 minute.
Thanks
That query will never run in milliseconds with any meaningful amounts of data in your tables. It'll run two full table scans, join the two together, aggregate the mess, and fetch the top row from that.
Use a trigger to store the total in the players table, and index that field. You'll then be able to avoid the join altogether:
select p.* from players p order by clicks_count desc limit 1
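A minimal sketch of that trigger approach, assuming a clicks_count column is added to players (all names are illustrative):

ALTER TABLE players ADD COLUMN clicks_count INT NOT NULL DEFAULT 0;
ALTER TABLE players ADD INDEX idx_players_clicks_count (clicks_count);

DELIMITER $$
CREATE TRIGGER trg_clicks_after_insert
AFTER INSERT ON clicks
FOR EACH ROW
BEGIN
    -- keep the denormalized counter in sync on every click
    UPDATE players
    SET clicks_count = clicks_count + 1
    WHERE id = NEW.player_id;
END$$
DELIMITER ;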
First and foremost, you should look at your schema if you want decent performance with that number of records and frequent writes; i.e. proper indexes and constraints must be created if they are not already in place.
Next, for the query itself, select the minimum number of fields needed (so if you do not need ALL the players fields, avoid using "players.*").
Personal preference: I'd restructure the tables (e.g. playerID in place of id) and write the query like so:
SELECT p.*, COUNT(c.id) as clicks_count
FROM players p
JOIN clicks c USING(playerID)
GROUP BY p.playerID
ORDER BY clicks_count desc
LIMIT 1
Again, see if you really need ALL player table fields; if not, omit "p.*" and replace with p.foo, p.bar, etc.