mysql query and performance - mysql

I would like to know the impact on performance if I run this query in the following conditions.
Query:
select `players`.*, count(`clicks`.`id`) as `clicks_count`
from `players` left join `clicks` on `clicks`.`player_id` = `players`.`id`
group by `players`.`id`
order by `clicks_count` desc
limit 1
Conditions:
In the clicks table I expect to get
insert 1000 times in a 1 minute
The clicks table will contain more
then 1,000,000 rows
The players table will contain
10,000 rows
The players table get inserted into every 5
minutes
I would like to know what to expect performance-wise if I run the query 1000 times in 1 minute.
Thanks

That query will never run in milliseconds with any meaningful amounts of data in your tables. It'll run two full table scans, join the two together, aggregate the mess, and fetch the top row from that.
Use a trigger to store the total in the players, and index that field. You'll then be able to avoid the join altogether:
select p.* from players p order by clicks_count desc limit 1

First & foremost, you should worry about your schema if you want decent performance with that number of records and frequent writes; i.e. proper indexes and constraints must be created if not already in place.
Next, the query itself, select the minimum number of fields needed (so if you do not need ALL players field, avoid using "players.*").
Personal pref, I'd restructure tables (e.g. playerID in place of id) and query like so:
SELECT p.*, COUNT(c.id) as clicks_count
FROM players p
JOIN clicks c USING(playerID)
GROUP BY p.playerID
ORDER BY clicks_count desc
LIMIT 1
Again, see if you really need ALL player table fields; if not, omit "p.*" and replace with p.foo, p.bar, etc.

Related

Mysql: Why is WHERE IN much faster than JOIN in this case?

I have a query with a long list (> 2000 ids) in a WHERE IN clause in mysql (InnoDB):
SELECT id
FROM table
WHERE user_id IN ('list of >2000 ids')
I tried to optimize this by using an INNER JOIN instead of the wherein like this (both ids and the user_id use an index):
SELECT table.id
FROM table
INNER JOIN users ON table.user_id = users.id WHERE users.type = 1
Surprisingly, however, the first query is much faster (by the factor 5 to 6). Why is this the case? Could it be that the second query outperforms the first one, when the number of ids in the where in clause becomes much larger?
This is not Ans to your Question but you may use as alternative to your first query, You can better increase performance by replacing IN Clause with EXISTS since EXISTS performance better than IN ref : Here
SELECT id
FROM table t
WHERE EXISTS (SELECT 1 FROM USERS WHERE t.user_id = users.id)
This is an unfair comparison between the 2 queries.
In the 1st query you provide a list of constants as a search criteria, therefore MySQL has to open and search only table and / or 1 index file.
In the 2nd query you instruct MySQL to obtain the list dynamically from another table and join that list back to the main table. It is also not clear, if indexes were used to create a join or a full table scan was needed.
To have a fair comparison, time the query that you used to obtain the list in the 1st query along with the query itself. Or try
SELECT table.id FROM table WHERE user_id IN (SELECT users.id FROM users WHERE users.type = 1)
The above fetches the list of ids dynamically in a subquery.

Executing extremely slow MySQL query

this query has multiple JOIN including aggregate functions
executing this query for approximately 6000 users took 20 seconds.
is there any other method to run this query faster?
SELECT users.id, SUM(orders.totalCost) AS bought, COUNT(comment.id) AS commentsCount, COUNT(topics.id) AS topicsCount, COUNT(users_login.id) AS loginCount, COUNT(users_download.id) AS downloadsCount
FROM users
LEFT JOIN orders ON users.id=orders.userID AND orders.status=1
LEFT JOIN comment ON users.id=comment.userID
LEFT JOIN topics ON users.id=topics.userID
LEFT JOIN users_login ON users.id=users_login.userID
LEFT JOIN users_download ON users.id=users_download.userID
WHERE users.id='$userID'
GROUP BY users.id
ORDER BY `bought` DESC
The result of running explain
The EXPLAIN output shows you are doing full-table scans on everything except users. You need to create secondary (non-unique) indexes on userID on all the other tables in the join. That will speed up queries on individual users.
However, if you're going to process all users in one pass then do a single select without a WHERE users.id= clause. Your aggregation returns only one row per user and you should create a single resultset containing all the rows and iterate over that, instead of reissuing the query once per user. In this case the secondary indexes may still help as counts can be determined from the index alone without looking at the tables themselves.

What SQL indexes to put for big table

I have two big tables from which I mostly select but complex queries with 2 joins are extremely slow.
First table is GameHistory in which I store records for every finished game (I have 15 games in separate table).
Fields: id, date_end, game_id, ..
Second table is GameHistoryParticipants in which I store records for every player participated in certain game.
Fields: player_id, history_id, is_winner
Query to get top players today is very slow (20+ seconds).
Query:
SELECT p.nickname, count(ghp.player_id) as num_games_today
FROM `GameHistory` as gh
INNER JOIN GameHistoryParticipants as ghp ON gh.id=ghp.history_id
INNER JOIN Players as p ON p.id=ghp.player_id
WHERE TIMESTAMPDIFF(DAY, gh.date_end, NOW())=0 AND gh.game_id='scrabble'
GROUP BY ghp.player_id ORDER BY count(ghp.player_id) DESC LIMIT 10
First table has 1.5 million records and the second one 3.5 million.
What indexes should I put ? (I tried some and it was all slow)
You are only interested in today's records. However, you search the whole GameHistory table with TIMESTAMPDIFF to detect those records. Even if you have an index on that column, it cannot be used, due to the fact that you use a function on the field.
You should have an index on both fields game_id and date_end. Then ask for the date_end value directly:
WHERE gh.date_end >= DATE(NOW())
AND gh.date_end < DATE_ADD(DATE(NOW()), INTERVAL 1 DAY)
AND gh.game_id = 'scrabble'
It would even be better to have an index on date_end's date part rather then on the whole time carrying date_end. This is not possible in MySQL however. So consider adding another column trunc_date_end for the date part alone which you'd fill with a before-insert trigger. Then you'd have an index on trunc_date_end and game_id, which should help you find the desired records in no time.
WHERE gh.trunc_date_end = DATE(NOW())
AND gh.game_id = 'scrabble'
add 'EXPLAIN' command at the beginning of your query then run it in a database viewer(ex: sqlyog) and you will see the details about the query, look for the 'rows' column and you will see different integer values. Now, index the table columns indicated in the EXPLAIN command result that contain large rows.
-i think my explanation is kinda messy, you can ask for clarification

MySQL Selecting things where a condition on a row is met 2 or more times, but showing the two or more results

If I use GROUP BY then I will get just 1 row per group. For example
Sessions table: SessionId (other things)
Actions table: ActionId, SessionId, (other things)
With:
SELECT S.*, A.* FROM ActionList A JOIN SessionList S ON A.SessionId
=S.SessionId
WHERE 1 /*various criteria to filter*/
ORDER BY S.SessionId DESC, ActionId DESC;
Thus showing me the most recent session at the top. Now I want to look at only sessions with 2 or more actions.
If I use GROUP BY A.SessionId then I can get COUNT(ActionId) and use HAVING to look at rows only with the required count, but I wont get both rows (or more) rows, just the one.
I suspect I can do this by JOINing a table with SessionIds and the count of action IDs but I'm fairly new to joins (I could do this via a subquery any ANY).
If a view would help, I would create a view of the form:
SELECT SessionId, COUNT(*) FROM Actions GROUP BY SessionId;
Or put this in brackets and JOIN on it (but I confess I'd have to loop up 3 table joins)
What is the neatest way to do this?
Also is this where "Foreign keys" come into play? That'd probably stop the "ambiguity errors" I get if I don't qualify SessionId. I've avoided them for fear of TRIGGERs, I also didn't know about JOINs and just used subqueries until recently. I've realised it is stupid to avoid things that were added to help.
Additionally I'm quite timid with joins because I know what it does, well worst case. If I JOIN on a table with m rows, and another with n I end up with m*n rows. That could be VERY large! I'm dealing with large tables (as in: schema wont fit in RAM large) so that is quite scary. I do know MySQL optimises well (able to move stuff from HAVING to WHERE and so forth) but still!
If you want to look at sessions with two or more actions, then use a join:
select sl.*
from SessionList sl join
(select SessionId, count(*) as cnt
from Actions
group by SessionId
) a
on sl.SessionId = a.SessionId and cnt > 1;

Getting last few rows from more than one tables in MySql?

i am pretty much stucked in an Sql Query from past few hours . i need to get latest few elements from four tables as follows..
table names are -- events , contactinfo , video , news
i need last 3 results from events and news and last single result from video and contactinfo..
i tried following query but as expected it didnt worked ..
SELECT * FROM
((SELECT * FROM EVENTS ORDER BY eventid DESC LIMIT 3)EV) INNER JOIN
((SELECT * FROM NEWS ORDER BY newsid DESC LIMIT 3)NE) INNER JOIN
((SELECT * FROM VIDEOS ORDER BY videoid DESC LIMIT 1)VI) INNER JOIN
((SELECT * FROM CONTACTINFO ORDER BY cid DESC LIMIT 1)AB);
Actually i am not a DB Expert i am a Developer and i really dont know much about MySql.
Any Help Would be Appreciated.
If these tables have the same columns you can do a UNION (instead of your INNER JOIN). If not, I suggest doing 4 queries.
JOINs suggests that the data that is joined correlates to each other and if that's not the case than doing an JOIN seams like the wrong solution.
If you need result as a single table then use SELECT and UNION to union data, providing same column numbers and their data types in each query (CAST column and provide default values if need). Otherwise, if you need results with different structures then run 4 queries.
JOINs don't make sense for your task as last N rows from one table unlikely have corresponding rows within last N rows of another table.
UPDATE
See example:
SELECT * FROM
(SELECT TOP 5 n.ID, n.Content, n.CreatedOn as CreatedOn, n.UserID as NewsUserID, 1 as SourceType FROM News n ORDER BY n.CreatedOn DESC) t1
UNION ALL
SELECT * FROM
(SELECT TOP 5 e.ID, e.Description as Content, e.CreatedAt as CreatedOn, NULL as NewsUserID, 2 as SourceType FROM Events e ORDER BY e.CreatedAt DESC) t2
ORDER BY SourceType, CreatedOn DESC
So i decided i want to have ID, Content and CreatedOn from every source, and also want to have UserID from News table. I built 2 queries so they return same columns of same datatypes. Each query takes only first 5 rows from source (TOP 5 is MS SQL syntax, please use your database's). Also i added an extra field SourceType that keeps type of entity. In the main query i union all results and order by source type first, then by CreatedDate.
This is not a logical way to get four table data in one call, since all tables are independent.
I think you wants to minimise database call,
In order to minimise database hits, you should use memcache instead of using such query.
Memcache :
It save data as key value pair, for each key you will get result set.
Its very fast.