I'm trying to execute the following query
SELECT * FROM person
WHERE id IN
( SELECT user_id FROM participation
WHERE activity_id = '1' AND application_id = '1'
)
The outer query returns about 4000 rows on its own, whilst the inner returns 29. When executed on my web server nothing happened, and when I tested it locally MySQL ended up using 100% CPU and still achieved nothing. Could the size be the cause?
Specifically, it causes the server to hang forever. I'm fairly sure the web server I ran the query on is in the process of crashing because of it (whoops).
Why don't you use an INNER JOIN for this query? I think that would be faster (and easier to read), and maybe it solves your problem (although I can't find a fault in your query).
EDIT: the inner join solution would look like this:
SELECT person.*
FROM person
INNER JOIN participation ON person.id = participation.user_id
WHERE participation.activity_id = '1'
  AND participation.application_id = '1'
How many rows are there in the participation table, and what indexes does it have?
A multi-column index on (user_id, activity_id, application_id) could help here.
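For example (the index name is illustrative):
CREATE INDEX idx_participation_user_act_app
ON participation (user_id, activity_id, application_id);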
Re comments: IN isn't slow. Subqueries within IN can be slow if they're correlated with the outer query.
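For illustration: older versions of MySQL (before 5.6) would often execute even an uncorrelated IN subquery like this one as a dependent subquery, roughly equivalent to the EXISTS form below, re-running it for every one of the ~4000 outer rows:
SELECT * FROM person
WHERE EXISTS
( SELECT 1 FROM participation
  WHERE activity_id = '1' AND application_id = '1'
    AND user_id = person.id
)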
Related
I am writing queries against tables in my AWS RDS MySQL server and I can't get the query to finish fetching. The query's duration is 5.328 seconds, but then the fetching just doesn't end. I have LEFT JOINed a subquery. When I run the subquery separately it runs very quickly and has almost no fetch time, and when I run the main query on its own it works great. The main query does return about 97,000 rows. I'm new to AWS RDS servers and wonder if there is a parameter adjustment that needs to be made? I feel as if the query is pretty simple.
We are in the middle of switching from BigQuery, and BigQuery runs it just fine with the same data and the same query.
Any ideas of what I can do to get it to fetch, and to speed up the fetching?
I've tried indexing and changing the buffer pool size, but still no luck.
FROM
project__c P
LEFT JOIN contact C ON C.id = P.salesperson__c
LEFT JOIN account A ON A.id = P.sales_team_account_at_sale__c
LEFT JOIN contact S ON S.id = P.rep_contact__c
LEFT JOIN (
SELECT
U.name,
C.rep_id__c EE_Code
FROM
user U
LEFT JOIN profile P ON P.id = U.profileid
LEFT JOIN contact C ON C.email = U.email
WHERE
(P.name LIKE "%Inside%" OR P.name LIKE "%rep%")
AND C.active__c = TRUE
AND C.rep_id__c IS NOT NULL
AND C.recordtypeid = "############"
) LC ON LC.name = P.is_rep_formula__c
You can analyze the query by prefixing it with EXPLAIN and running that to see which indexes are being used and how many rows are examined for each joined table. It might be joining a lot more rows than you expect.
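For example, a minimal sketch with just the first of your joins (in practice, prefix your full statement):
EXPLAIN
SELECT *
FROM project__c P
LEFT JOIN contact C ON C.id = P.salesperson__c;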
You can try:
SELECT SQL_CALC_FOUND_ROWS [your query] LIMIT 1;
If the problem is in sending too much data (i.e. more rows than you think it's trying to return), this will return quickly. It would prove that the problem does lie in sending the data, not in the query stage. If this is what you find, run the following next:
SELECT FOUND_ROWS();
This will tell you how many total rows matched your query. If it's bigger than you expect, your joins are wrong in some way. The explain analyzer mentioned above should provide some insight. While the outer query has 97000 rows, each of those rows could join with more than one row from the subquery. This can happen if you expect the left side of the join to always have a value, but find out there are rows in which it is empty/null. This can lead to a full cross join where every row from the left joins with every row on the right.
If the LIMIT 1 query also hangs, the problem is in the query itself. Again, the EXPLAIN output should tell you where the problem lies. Most likely it's a missing index causing very slow table scans instead of fast lookups, in both joins and WHERE clauses.
The inner query might be fine/fast on its own. When you join it with another query, if it is not indexed/joined properly, it can lead to a result set and/or query time many times larger.
If missing indexes on derived tables are the problem, read up on how MySQL can optimize derived tables via different settings: https://dev.mysql.com/doc/refman/5.7/en/derived-table-optimization.html
As seen in a comment on your question, creating a temp table and joining it directly gives you control, instead of relying on (and hoping for) MySQL to optimize your query in a way that uses fast indexes.
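A minimal sketch of that approach, reusing the names from your query (the temp table name and index are mine):
CREATE TEMPORARY TABLE rep_contacts AS
SELECT U.name, C.rep_id__c AS EE_Code
FROM user U
LEFT JOIN profile P ON P.id = U.profileid
LEFT JOIN contact C ON C.email = U.email
WHERE (P.name LIKE "%Inside%" OR P.name LIKE "%rep%")
  AND C.active__c = TRUE
  AND C.rep_id__c IS NOT NULL
  AND C.recordtypeid = "############";

-- Index the join column so the outer query can do fast lookups
ALTER TABLE rep_contacts ADD INDEX (name);

-- Then, in the main query, replace the whole derived table with:
-- LEFT JOIN rep_contacts LC ON LC.name = P.is_rep_formula__c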
I'm not versed in BigQuery, but unless it's running the same core MySQL engine under the hood, it's not really a good comparison.
Why all the LEFT JOINs? Your subquery seems to target a small set of results, yet you still want to fetch 90k+ rows? If you are using this query to render any sort of list in a web application, apply reasonable limits and pagination.
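For example (assuming project__c has an auto-increment id you can order by):
SELECT *
FROM project__c
ORDER BY id
LIMIT 100 OFFSET 0;   -- page 1; increase OFFSET for later pages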
I have an issue with creating tables via SELECT (it runs very slowly). The query is meant to take only the details of the animal with the latest entry date; that query will then be inner joined to another query.
SELECT *
FROM amusementPart a
INNER JOIN (
SELECT DISTINCT name, type, cageID, dateOfEntry
FROM bigRegistrations
GROUP BY cageID
) r ON a.type = r.cageID
But because of the slow performance, someone suggested steps to improve it: 1) use a temporary table, 2) store the result and join it to the other statement.
USE myzoo;
CREATE TABLE animalRegistrations AS
SELECT DISTINCT name, type, cageID, MAX(dateOfEntry) as entryDate
FROM bigRegistrations
GROUP BY cageID
Unfortunately, it is still slow. If I only run the SELECT statement, the result shows in 1-2 seconds. But if I add CREATE TABLE, the query takes ages (approx. 25 minutes).
Any good approach to improve the query time?
Edit: the size of the bigRegistrations table is around 3.5 million rows.
Can you please try the query below? Your current query is not fetching records per your requirement (to take only the details of the animal with the latest entry date, for use in an inner join with another query), and this version should also be faster:
SELECT a.*, b.name, b.type, b.cageID, b.dateOfEntry
FROM amusementPart a
INNER JOIN bigRegistrations b ON a.type = b.cageID
INNER JOIN (SELECT c.cageID, max(c.dateOfEntry) dateofEntry
FROM bigRegistrations c
GROUP BY c.cageID) t ON t.cageID = b.cageID AND t.dateofEntry = b.dateofEntry
I'd suggest indexing on cageID and dateOfEntry.
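For example (the index name is mine):
CREATE INDEX idx_bigRegistrations_cage_date
ON bigRegistrations (cageID, dateOfEntry);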
This is a multipart question.
Use a temporary table.
Don't use DISTINCT; GROUP BY all the columns instead to deduplicate (don't forget to check for an index).
Check the SQL execution plans.
Here you are not creating a temporary table. Try the following...
CREATE TEMPORARY TABLE IF NOT EXISTS animalRegistrations AS
SELECT name, type, cageID, MAX(dateOfEntry) as entryDate
FROM bigRegistrations
GROUP BY cageID
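Then join to it directly, using the join condition from your original query:
SELECT a.*, r.name, r.type, r.cageID, r.entryDate
FROM amusementPart a
INNER JOIN animalRegistrations r ON a.type = r.cageID;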
Have you tried running EXPLAIN to see how the plan differs from one execution to the next?
Also, I have found that there can be locking issues in some databases when doing INSERT ... SELECT and CREATE TABLE ... SELECT. I ran the following in MySQL, and it solved some deadlock issues I was having.
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
The reason the query runs so slowly is probably that it is creating the temp table based on all 3.5 million rows, when really you only need a subset of those, i.e. the bigRegistrations rows that match your join to amusementPart. The single SELECT statement is faster because the optimizer is smart enough to know it only needs to compute the bigRegistrations rows where a.type = r.cageID.
I'd suggest that you don't need a temp table; your first query is quite simple. Rather, you may just need an index. You can determine this manually by studying the estimated execution plan, or by running your query through a database tuning advisor. My guess is you need to create an index similar to the one below. Notice I index by cageID first, since that is what you join to amusementPart on, so that would help the optimizer narrow the results down the quickest. But I'm guessing a bit - view the query plan or tuning advisor to be sure.
CREATE INDEX IX_bigRegistrations ON bigRegistrations
(cageID, name, type, dateOfEntry)
Also, if you want the animal with the latest entry date, I think you want this query instead of the one you're using. I'm assuming the PK is all 4 columns.
SELECT name, type, cageID, dateOfEntry
FROM bigRegistrations BR
WHERE BR.dateOfEntry =
(SELECT MAX(BR1.dateOfEntry)
FROM bigRegistrations BR1
WHERE BR1.name = BR.name
AND BR1.type = BR.type
AND BR1.cageID = BR.cageID)
I had an issue with an SQL query that was slow and led to a 500 Internal Server Error.
After playing a bit with the query, I found that moving the join conditions out of the ON clause makes the query work and the error vanish.
Now I wonder:
Is the second query equivalent to the first one?
What was my mistake, and why was it so slow?
Here are the queries.
Original (slow) SQL:
SELECT *
FROM `TABLE_1` AS `main_table`
INNER JOIN `TABLE_2` AS `w`
ON main_table.entity_id = w.product_id
LEFT JOIN `TABLE_3` AS `ur`
ON main_table.entity_id = ur.product_id
AND ur.category_id IS NULL
AND ur.store_id = 1
AND ur.is_system = 1
WHERE (w.website_id = '1')
Faster SQL:
SELECT *
FROM `TABLE_1` AS `main_table`
INNER JOIN `TABLE_2` AS `w`
ON main_table.entity_id = w.product_id
LEFT JOIN `TABLE_3` AS `ur`
ON main_table.entity_id = ur.product_id
WHERE (w.website_id = '1')
AND ur.category_id IS NULL
AND ur.store_id = 1
AND ur.is_system = 1
Is the 2nd query equivalent to the 1st?
Depending on the data in TABLE_3, the 2nd query is NOT equivalent to the first, since you're using a LEFT JOIN.
Consider what happens when you have a record in TABLE_1, with an entity_id that does not match any rows in TABLE_3: Your first query would still return this record from TABLE_1, with NULL values for all the columns of TABLE_3. The 2nd query applies filters to the columns from TABLE_3, but since these are all NULLs, the record now gets filtered out.
In general, query 2 would thus return fewer records than query 1 (unless, of course, all your entity_id's match a product_id from TABLE_3).
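A tiny self-contained illustration of that difference (tables and values are made up):
CREATE TEMPORARY TABLE t1 (entity_id INT);
CREATE TEMPORARY TABLE t3 (product_id INT, store_id INT);
INSERT INTO t1 VALUES (42);   -- 42 has no matching row in t3

-- Filter in the ON clause (like query 1): the row survives with NULLs
SELECT * FROM t1 LEFT JOIN t3 ON t1.entity_id = t3.product_id AND t3.store_id = 1;
-- returns: 42, NULL, NULL

-- Filter in WHERE (like query 2): NULL = 1 is not TRUE, so the row is dropped
SELECT * FROM t1 LEFT JOIN t3 ON t1.entity_id = t3.product_id WHERE t3.store_id = 1;
-- returns: empty set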
What was my mistake, and why was it so slow?
Both queries are valid SQL, so neither of them should give you a 500 Internal Server Error. This does not even look like a MySQL error, so maybe the bug arose elsewhere? Perhaps you have a web application that is unable to handle the NULLs returned by the 1st query? It's impossible to answer the question without more details about the error.
As for the speed of the queries, this depends a lot on what indexes are defined. Generally, I would not expect the 2nd query to be faster than the 1st, but that depends. As I mentioned above, the 1st query might return more records than the 2nd. If you're not querying the database directly, but instead through some application, could it be the application slowing down your query?
I think I'm having an issue with my MySQL server or the query I'm using; I'm not sure which.
Server is a VM: Ubuntu 12.04, 4 cores / 16 GB RAM
MySQL 5.5.24 x86
My query:
INSERT INTO `NEWTEXT`.`Order_LineDetails`
( OrderLineItem_ID, Customer_ID, Order_ID, ProductName )
SELECT
`Order_Details`.`OrderDetailID`,
`Orders`.`CustomerID`,
`Order_Details`.`OrderID`,
`prods`.`ProductName`
FROM Order_Details
JOIN Orders ON Orders.OrderID = Order_Details.OrderID
JOIN Products prods ON prods.ProductID = Order_Details.ProductID
WHERE Orders.OrderID = 500000
I'm not really sure where to start looking for the problem. The above query takes 9+ seconds to complete, and the Order_Details table contains 1,800,000+ records.
The thing that is bugging me is that when I run a plain SELECT query it is also slow. BUT I have another server running Win2k and MS SQL, and it is almost instant with the same SELECT query.
I'm hoping someone can point me in the right direction here.
EDIT
Well, sorry for the trouble, and thanks for your help.
I found that the problem was that after I finished the import, I skipped the step where I would normally assign the new tables a primary key. I know, :( dumb.
Anyway! Don't forget to assign your primary keys!
Back to the start:
http://dev.mysql.com/doc/refman/5.5/en/optimizing-primary-keys.html
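e.g., using the column names from the query above (adjust to your schema):
ALTER TABLE Order_Details ADD PRIMARY KEY (OrderDetailID);
ALTER TABLE Orders ADD PRIMARY KEY (OrderID);
ALTER TABLE Products ADD PRIMARY KEY (ProductID);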
I'm unfamiliar with MySQL, and I don't know where to begin with solving this query issue.
SELECT *
FROM `rmedspa`.`clients` c
inner join `rmedspa`.`tickets` t on c.clientid = t.clientid
where c.fldclass is not null
AND t.ticketID > 0
This query returns just fine in MySQL Workbench, in 30 seconds, with the IDE limiting the results to 1000 records. The database is not on my own machine but on a server in a different location (in other words, the connection goes over the internet and is slow). If I add an ORDER BY at the end, the query never returns.
SELECT *
FROM `rmedspa`.`clients` c
inner join `rmedspa`.`tickets` t on c.clientid = t.clientid
where c.fldclass is not null
AND t.ticketID > 0
ORDER BY t.ticketid
There are "many" tickets for one client. t.ticketID is an int, and clientid is an int, too.
I don't know where to begin to find out why the ORDER BY causes this query to never return. It doesn't fail; it just doesn't return.
This post is a short review of the previous comments, plus some hints on your SQL query.
Query building
Whenever the column names of a join condition match, you can rewrite the join statement as follows:
INNER JOIN {database}.{table} t USING({joinColumn})
Check your indexes - in your case: are the columns in the join condition and the WHERE clause indexed?
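For example (index names are mine):
ALTER TABLE `rmedspa`.`tickets` ADD INDEX idx_tickets_client (clientid, ticketID);
ALTER TABLE `rmedspa`.`clients` ADD INDEX idx_clients_class (fldclass);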
MySQL result
The ORDER BY statement often requires reordering the result set in memory. If you expect large results (many rows) and/or large rows (many columns) but see no result at all, the MySQL server is not properly configured for that case - see How MySQL Uses Memory and Optimizing Queries with EXPLAIN.
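For example, you can confirm the sort is the bottleneck and experiment with the session sort buffer (the 8 MB value below is an assumption, not a recommendation):
EXPLAIN
SELECT *
FROM `rmedspa`.`clients` c
INNER JOIN `rmedspa`.`tickets` t ON c.clientid = t.clientid
WHERE c.fldclass IS NOT NULL
AND t.ticketID > 0
ORDER BY t.ticketid;
-- look for "Using filesort" in the Extra column

SET SESSION sort_buffer_size = 8 * 1024 * 1024;  -- 8 MB, assumed value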