mysql count rows in query containing LEFT JOIN hangs server

I have two tables: people with 16,500 rows and visits with 17,000 rows.
My query contains a LEFT JOIN because I have to link visits to people. I'm aware that if there is a people record without a matching visits record, those visits columns will be NULL.
This simple query works like a charm.
SELECT * FROM people LEFT JOIN visits ON people.id = visits.id_people;
But when I try to count the returned rows, MySQL hangs (or keeps counting) for 30+ seconds, or until I kill it. That is not acceptable in a production environment.
Here are the different methods I tried for counting the resulting rows; all of them hang in the same way.
SELECT COUNT(*) FROM people LEFT JOIN visits ON people.id = visits.id_people;
SELECT SQL_CALC_FOUND_ROWS * FROM people LEFT JOIN visits ON people.id = visits.id_people;
SELECT FOUND_ROWS();
The strange thing is that these methods work fine on small test tables (5 rows each).
Can anyone help?

If you are creating a new MySQL table, you can specify a column to index by using the INDEX keyword. Indexes are something extra that you can enable on your MySQL tables to increase performance.
http://www.databasejournal.com/features/mysql/article.php/1382791/Optimizing-MySQL-Queries-and-Indexes.htm
http://www.tutorialspoint.com/mysql/mysql-indexes.htm
Have a look at these; they give a good idea of how indexes work.
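For an existing table like the one in the question, the index can be added afterwards. A minimal sketch, assuming the column names from the question (the index name is made up; people.id is presumably already the primary key, so the missing piece is usually the foreign-key column used in the ON clause):
-- Add an index on the column the LEFT JOIN matches against:
ALTER TABLE visits ADD INDEX idx_visits_id_people (id_people);
-- Then check that the index is actually used before re-running the count:
EXPLAIN SELECT COUNT(*)
FROM people
LEFT JOIN visits ON people.id = visits.id_people;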
cheers

Related

AWS RDS MySQL fetching not being completed

I am writing queries against tables in my AWS RDS MySQL server and I can't get the query to finish fetching. The duration of the query is 5.328 seconds, but then the fetching just doesn't end. I have LEFT JOINed a subquery. When I run the subquery separately it runs very quickly and has almost no fetch time. When I run the main query it works great. The main query does return about 97,000 rows. I'm new to AWS RDS servers and wonder if there is a parameter adjustment I need to make? I feel as if the query is pretty simple.
We are in the middle of switching from BigQuery and BigQuery runs it just fine with the same data and same query.
Any ideas of what I can do to get it to fetch and speed up the fetching?
I've tried indexing and changing the buffer pool size, but still no luck.
FROM
    project__c P
    LEFT JOIN contact C ON C.id = P.salesperson__c
    LEFT JOIN account A ON A.id = P.sales_team_account_at_sale__c
    LEFT JOIN contact S ON S.id = P.rep_contact__c
    LEFT JOIN (
        SELECT
            U.name,
            C.rep_id__c EE_Code
        FROM
            user U
            LEFT JOIN profile P ON P.id = U.profileid
            LEFT JOIN contact C ON C.email = U.email
        WHERE
            (P.name LIKE "%Inside%" OR P.name LIKE "%rep%")
            AND C.active__c = TRUE
            AND C.rep_id__c IS NOT NULL
            AND C.recordtypeid = "############"
    ) LC ON LC.name = P.is_rep_formula__c
You can analyze the query by adding EXPLAIN [your query] and running that to see what indexes are being used and how many rows are examined for each joined table. It might be joining a lot more rows than you expect.
You can try:
SELECT SQL_CALC_FOUND_ROWS [your query] limit 1;
If the problem is in sending too much data (i.e. more rows than you think it's trying to return), this will return quickly. It would prove that the problem does lie in sending the data, not in the query stage. If this is what you find, run the following next:
select FOUND_ROWS();
This will tell you how many total rows matched your query. If it's bigger than you expect, your joins are wrong in some way. The explain analyzer mentioned above should provide some insight. While the outer query has 97000 rows, each of those rows could join with more than one row from the subquery. This can happen if you expect the left side of the join to always have a value, but find out there are rows in which it is empty/null. This can lead to a full cross join where every row from the left joins with every row on the right.
If the LIMIT 1 query also hangs, the problem is in the query itself. Again, the EXPLAIN analyzer should tell you where the problem lies. Most likely it's a missing index causing very slow table scans instead of fast lookups, in both the joins and the WHERE clauses.
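For example (a sketch only; which columns are worth indexing depends on what EXPLAIN reports, and some of these may already be covered by primary keys), indexes on the columns used in the ON and WHERE clauses would look like:
-- Sketch: index the equi-join and filter columns visible in the query.
ALTER TABLE contact ADD INDEX idx_contact_email (email);
ALTER TABLE user ADD INDEX idx_user_email (email);
ALTER TABLE contact ADD INDEX idx_contact_recordtype (recordtypeid);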
The inner query might be fine/fast on its own. When you join with another query, if not indexed/joined properly, it could lead to a result set and/or query time many times larger.
If missing indexes on derived tables are the problem, read up on how MySQL can optimize them via different settings: https://dev.mysql.com/doc/refman/5.7/en/derived-table-optimization.html
As mentioned in a comment on your question, creating a temp table and joining against it directly gives you control, instead of relying on (and hoping for) MySQL optimizing your query in a way that uses fast indexes.
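A sketch of that approach, reusing the inner query exactly as posted (the temp table and index names here are made up; if name ends up as a TEXT column you would need a prefix length on the index):
CREATE TEMPORARY TABLE tmp_lc (INDEX idx_lc_name (name)) AS
SELECT
    U.name,
    C.rep_id__c EE_Code
FROM
    user U
    LEFT JOIN profile P ON P.id = U.profileid
    LEFT JOIN contact C ON C.email = U.email
WHERE
    (P.name LIKE "%Inside%" OR P.name LIKE "%rep%")
    AND C.active__c = TRUE
    AND C.rep_id__c IS NOT NULL
    AND C.recordtypeid = "############";
-- Then, in the main query, replace the whole LEFT JOIN ( ... ) LC block with:
-- LEFT JOIN tmp_lc LC ON LC.name = P.is_rep_formula__c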
I'm not versed in BigQuery, but unless it runs the same core MySQL engine under the hood, it's not really a good comparison.
Why all the LEFT JOINs? Your subquery seems to target a small set of results, yet you still want to return 90k+ rows. If you are using this query to render any sort of list in a web application, apply reasonable limits and pagination.

Executing extremely slow MySQL query

This query has multiple JOINs as well as aggregate functions.
Executing this query for approximately 6,000 users took 20 seconds.
Is there any other method to run this query faster?
SELECT users.id,
       SUM(orders.totalCost) AS bought,
       COUNT(comment.id) AS commentsCount,
       COUNT(topics.id) AS topicsCount,
       COUNT(users_login.id) AS loginCount,
       COUNT(users_download.id) AS downloadsCount
FROM users
LEFT JOIN orders ON users.id=orders.userID AND orders.status=1
LEFT JOIN comment ON users.id=comment.userID
LEFT JOIN topics ON users.id=topics.userID
LEFT JOIN users_login ON users.id=users_login.userID
LEFT JOIN users_download ON users.id=users_download.userID
WHERE users.id='$userID'
GROUP BY users.id
ORDER BY `bought` DESC
(The result of running EXPLAIN was attached as a screenshot.)
The EXPLAIN output shows you are doing full-table scans on everything except users. You need to create secondary (non-unique) indexes on userID on all the other tables in the join. That will speed up queries on individual users.
However, if you're going to process all users in one pass, then do a single SELECT without the WHERE users.id= clause. Your aggregation returns only one row per user, so you should create a single result set containing all the rows and iterate over that, instead of reissuing the query once per user. In this case the secondary indexes may still help, as the counts can be determined from the index alone without touching the tables themselves.
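A sketch of the index part, using the column names visible in the query (the index names are made up, and putting status into the orders index is an assumption about your access pattern):
ALTER TABLE orders ADD INDEX idx_orders_user (userID, status);
ALTER TABLE comment ADD INDEX idx_comment_user (userID);
ALTER TABLE topics ADD INDEX idx_topics_user (userID);
ALTER TABLE users_login ADD INDEX idx_users_login_user (userID);
ALTER TABLE users_download ADD INDEX idx_users_download_user (userID);
For the all-users variant, the query stays the same except that the WHERE users.id='$userID' line is dropped and the GROUP BY users.id is kept.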

MySQL Selecting things where a condition on a row is met 2 or more times, but showing the two or more results

If I use GROUP BY then I will get just 1 row per group. For example
Sessions table: SessionId (other things)
Actions table: ActionId, SessionId, (other things)
With:
SELECT S.*, A.*
FROM ActionList A
JOIN SessionList S ON A.SessionId = S.SessionId
WHERE 1 /* various criteria to filter */
ORDER BY S.SessionId DESC, ActionId DESC;
Thus showing me the most recent session at the top. Now I want to look at only sessions with 2 or more actions.
If I use GROUP BY A.SessionId then I can get COUNT(ActionId) and use HAVING to look only at rows with the required count, but I won't get both (or more) rows, just the one.
I suspect I can do this by JOINing a table with SessionIds and the count of action IDs, but I'm fairly new to joins (I could do this via a subquery and ANY).
If a view would help, I would create a view of the form:
SELECT SessionId, COUNT(*) FROM Actions GROUP BY SessionId;
Or put this in brackets and JOIN on it (but I confess I'd have to look up 3-table joins).
What is the neatest way to do this?
Also, is this where "Foreign keys" come into play? That'd probably stop the "ambiguity errors" I get if I don't qualify SessionId. I've avoided them for fear of TRIGGERs; I also didn't know about JOINs and just used subqueries until recently. I've realised it is stupid to avoid things that were added to help.
Additionally, I'm quite timid with joins because I know what they do in the worst case: if I JOIN a table with m rows to another with n rows, I can end up with m*n rows. That could be VERY large! I'm dealing with large tables (as in: the schema won't fit in RAM large), so that is quite scary. I do know MySQL optimises well (it can move conditions from HAVING to WHERE and so forth), but still!
If you want to look at sessions with two or more actions, then use a join:
select sl.*
from SessionList sl join
     (select SessionId, count(*) as cnt
      from Actions
      group by SessionId
     ) a
     on sl.SessionId = a.SessionId and cnt > 1;
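If you also need the individual action rows rather than one row per session, a sketch built on the same idea is to filter the sessions with the derived table and then join Actions back in (here the count filter is moved into a HAVING clause):
select sl.*, act.*
from SessionList sl join
     (select SessionId, count(*) as cnt
      from Actions
      group by SessionId
      having count(*) > 1
     ) a
     on sl.SessionId = a.SessionId join
     Actions act
     on act.SessionId = sl.SessionId
order by sl.SessionId desc, act.ActionId desc;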

does LIMIT have effect before or after JOIN?

It's a simple question - I just found different answers to this so I'm not sure. Let me describe:
If you have a query, like this one:
SELECT logins.timestamp, users.name
FROM logins
LEFT JOIN users
ON users.id = logins.user_id
LIMIT 10
This basically would list the last 10 entries of the logins table, replacing the user_id with the username over a JOIN.
Now my question is: does LIMIT take effect while the JOIN happens (so that it only joins the first 10 entries), or after the JOIN (where it would join the whole table and then cut out the first 10 entries)?
I'm asking this because the logins table in this example will have many entries, and I'm not sure whether the JOIN is too costly performance-wise. If LIMIT only caused 10 joins to happen, that wouldn't be a problem.
A second question that came up with this: is the behaviour the same if a DISTINCT is added? Will it still stop at 10 entries? And no, this isn't going to be ordered with ORDER BY.
Don't worry. The LIMIT is applied at the same time as the join: MySQL will not read through the entire logins table, but fetches row by row (joining each time against users) until it has found 10 rows.
Do note that if a users.id appears twice in that table, the JOIN will duplicate the logins row, once for each matching users row. The total number of rows will still be 10, but you'll only have 9 logins.
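If you want to be sure all 10 logins are represented even when a duplicated users.id inflates the row count, one option (a sketch, same columns as your example) is to limit logins in a derived table first and join afterwards:
SELECT l.timestamp, users.name
FROM (SELECT * FROM logins LIMIT 10) AS l
LEFT JOIN users ON users.id = l.user_id;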

mysql query and performance

I would like to know the impact on performance if I run this query in the following conditions.
Query:
select `players`.*, count(`clicks`.`id`) as `clicks_count`
from `players` left join `clicks` on `clicks`.`player_id` = `players`.`id`
group by `players`.`id`
order by `clicks_count` desc
limit 1
Conditions:
In the clicks table I expect about 1,000 inserts per minute.
The clicks table will contain more than 1,000,000 rows.
The players table will contain 10,000 rows.
The players table gets inserted into every 5 minutes.
I would like to know what to expect performance-wise if I run the query 1000 times in 1 minute.
Thanks
That query will never run in milliseconds with any meaningful amounts of data in your tables. It'll run two full table scans, join the two together, aggregate the mess, and fetch the top row from that.
Use a trigger to store the total in the players table, and index that field. You'll then be able to avoid the join altogether:
select p.* from players p order by clicks_count desc limit 1
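A sketch of that, using the clicks_count name from the query above (the trigger and index names are made up, and this assumes the column doesn't exist yet):
-- denormalized counter on players, indexed so ORDER BY ... LIMIT 1 is cheap
alter table players
  add column clicks_count int not null default 0,
  add index idx_players_clicks_count (clicks_count);

-- keep the counter current on every new click
create trigger clicks_after_insert
after insert on clicks
for each row
  update players
  set clicks_count = clicks_count + 1
  where players.id = NEW.player_id;

-- backfill existing rows once; after that the trigger maintains it
update players p
set p.clicks_count = (select count(*) from clicks c where c.player_id = p.id);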
First & foremost, you should worry about your schema if you want decent performance with that number of records and frequent writes; i.e. proper indexes and constraints must be created if not already in place.
Next, the query itself: select the minimum number of fields needed (so if you do not need ALL of the players fields, avoid using "players.*").
Personal pref, I'd restructure tables (e.g. playerID in place of id) and query like so:
SELECT p.*, COUNT(c.id) as clicks_count
FROM players p
JOIN clicks c USING(playerID)
GROUP BY p.playerID
ORDER BY clicks_count desc
LIMIT 1
Again, see if you really need ALL player table fields; if not, omit "p.*" and replace with p.foo, p.bar, etc.
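Whichever naming you settle on, also make sure the click-to-player join column itself is indexed; a one-liner for the schema as currently posted (index name made up):
alter table `clicks` add index idx_clicks_player (`player_id`);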