I'm trying to find the most recent entry time at a bunch of specific dates. When I run
select max(ts) as maxts from factorprice where ts <= '2011-1-5'
It returns very quickly.
EXPLAIN gives select_type SIMPLE and "Select tables optimized away".
But when I run
select (select max(ts) from factorprice where ts <= dates.dt) as maxts, dates.dt
from trends.dates
where dates.dt in ('2011-1-6');
It takes a long time to return (~10 seconds).
Explain gives:
select_type=PRIMARY table=dates rows=506 Extra=Using where
select_type=DEPENDENT SUBQUERY table=factorprice type=index possible_keys=PRIMARY key=PRIMARY keylen=8 rows=26599224 Extra=Using where; Using index
This query also takes a long time (10 sec)
select dt, max(ts) as maxts from factorprice as f inner join trends.dates as d
where ts <= dt and dt in ('2011-1-6')
group by dt;
Explain gives:
select_type=SIMPLE table=d type=ALL rows=509 Extra=Using where
select_type=SIMPLE table=f type=range possible_keys=PRIMARY key=PRIMARY keylen=8 rows=26599224 Extra=Using where; Using index
I'd like to do this same operation on many different dates. Is there a way I can do that efficiently?
It looks like this bug:
http://bugs.mysql.com/bug.php?id=32665
Maybe if you create an index on dates.dt, it will go away.
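A minimal sketch of that suggested index, assuming trends.dates has no index on dt yet (the index name is made up):

-- Lets the optimizer locate the requested dates without scanning the whole dates table.
ALTER TABLE trends.dates ADD INDEX idx_dt (dt);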
This part of your SQL is a dependent subquery:
select max(ts) from factorprice where ts <= dates.dt
which is executed once for each row in the result set. So the total time is approximately the time of the standalone query multiplied by the number of rows in the result set.
Judging from your EXPLAIN output, this query visits 506 rows in the dates table and then, for each of those rows, scans over 26 million rows in the factorprice table. 10 seconds to do all of that isn't too bad.
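Given that, one workaround is to issue the fast standalone form once per date and stack the results yourself, rather than relying on the optimizer to decorrelate the subquery. A sketch for a short, hand-written list of dates (the second date literal is just a placeholder):

-- Each SELECT is the fast "Select tables optimized away" form; UNION ALL combines the rows.
SELECT '2011-1-6' AS dt, MAX(ts) AS maxts FROM factorprice WHERE ts <= '2011-1-6'
UNION ALL
SELECT '2011-1-7' AS dt, MAX(ts) AS maxts FROM factorprice WHERE ts <= '2011-1-7';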
My guess is that you are inadvertently creating a CROSS JOIN situation where every row of one table is matched up with every row in another table.
I have this query, which takes around 29 seconds to run, and I need to make it faster. I have created an index on the aggregate_date column but still see no real improvement. Each aggregate_date has almost 26k rows in the table.
One more thing: the query will run starting from 1/1/2018 until yesterday's date.
select MAX(os.aggregate_date) as lastMonthDay,
os.totalYTD
from (
SELECT aggregate_date,
Sum(YTD) AS totalYTD
FROM tbl_aggregated_tables
WHERE subscription_type = 'Subcription Income'
GROUP BY aggregate_date
) as os
GROUP by MONTH(os.aggregate_date), YEAR(os.aggregate_date);
I used EXPLAIN SELECT ... and received the following.
Update
Most of the query time is consumed by the inner query, so, as scaisEdge suggested below, I tested it on its own and the time was reduced to almost 8 s.
The inner query looks like:
select agt.aggregate_date,SUM(YTD)
from tbl_aggregated_tables as agt
FORCE INDEX(idx_aggregatedate_subtype_YTD)
WHERE agt.subscription_type = 'Subcription Income'
GROUP by agt.aggregate_date
I have noticed that the comparison WHERE agt.subscription_type = 'Subcription Income' takes most of the time. Is there any way to change that? To be mentioned: the subscription_type column has only 2 values, 'Subcription Income' and 'Subcription Unit'.
The index on the aggregate_date column is not useful for performance because it is not involved in the WHERE condition.
Looking at your code, a useful index should be on the subscription_type column.
You could try a redundant (covering) index that also adds the columns involved in the SELECT clause, so the query can obtain all its data from the index and avoid accessing the table. Your index could be:
CREATE INDEX idx1 ON tbl_aggregated_tables (subscription_type, aggregate_date, YTD);
The meaning of the last GROUP BY does not seem coherent with the SELECT clause.
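If the intent of that outer query is one row per month, showing the last aggregated day of the month together with that day's YTD total, a coherent shape might look like the sketch below (this is a guess at the intent; table and column names are taken from the question):

-- os: one row per day with that day's YTD total (the fast inner query from the update).
-- lastdays: the last aggregated day of each month.
-- Joining them keeps only the last day per month, so every selected column
-- is tied to the grouping and the result is well defined.
SELECT os.aggregate_date AS lastMonthDay,
       os.totalYTD
FROM (
    SELECT aggregate_date, SUM(YTD) AS totalYTD
    FROM tbl_aggregated_tables
    WHERE subscription_type = 'Subcription Income'
    GROUP BY aggregate_date
) AS os
JOIN (
    SELECT MAX(aggregate_date) AS lastMonthDay
    FROM tbl_aggregated_tables
    WHERE subscription_type = 'Subcription Income'
    GROUP BY YEAR(aggregate_date), MONTH(aggregate_date)
) AS lastdays ON os.aggregate_date = lastdays.lastMonthDay;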
In MySQL, I have a table with two columns (id, uuid), into which I inserted 30 million rows. (PS: the uuid values can repeat.)
Now I want to find the repeated values in the table using MySQL, but the SQL takes too much time.
I want to search all the rows, but that takes too long, so I tried querying the first million rows, which took 8 seconds.
Then I tried 10 million rows, which took 5 minutes;
then, with 20 million rows, the server seemed to die.
select count(uuid) as cnt
from uuid_test
where id between 1 and 1000000
group by uuid
having cnt > 1;
Can anyone help me optimize this SQL? Thanks.
Try this query,
SELECT uuid, count(*) cnt FROM uuid_test GROUP BY 1 HAVING cnt>1;
Hope it helps.
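If that aggregation over all 30 million rows is still slow, an index on uuid alone may help, since the GROUP BY can then walk the index instead of sorting the whole table (a sketch; the index name is made up):

-- Lets GROUP BY uuid read pre-sorted values straight from the index.
ALTER TABLE uuid_test ADD INDEX idx_uuid (uuid);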
Often the fastest way to find duplicates uses a correlated subquery rather than aggregation:
select ut.*
from uuid_test ut
where exists (select 1
              from uuid_test ut2
              where ut2.uuid = ut.uuid and
                    ut2.id <> ut.id
             );
This can take advantage of an index on uuid_test(uuid, id).
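A sketch of that index, assuming it does not exist yet (the name is arbitrary):

-- Covers both the uuid equality and the id inequality inside the EXISTS probe.
CREATE INDEX idx_uuid_id ON uuid_test (uuid, id);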
I don't know why the query I've written doesn't give me any output and just can't finish. Here http://sqlfiddle.com/#!9/8656d2/1 is a sample of my database; in reality it has about 260k records (rows). You can see that the query works with the table in the link, but on my whole database something is wrong. I waited almost 30 minutes for any results, but the query runs interminably. I don't know what I should do now. What could be the reason for the described problem?
I don't actually know why the query isn't completing on your MySQL with 260K records. But I can speculate that the correlated subquery you have in the SELECT statement is the culprit. Let's look closely at that guy:
SELECT DATE_SUB(MAX(EVENT_TIME), INTERVAL 12 HOUR)
FROM my_table mt
WHERE
EVENT_TYPE = '2' AND
mt.ID = my_table.ID
You are basically telling MySQL to do a MAX() calculation across the entire table, for every record in your my_table table. Note that because the subquery is correlated, it might have to be run fresh for literally all 260K records. Hopefully you can see that 260K x 260K operations would be a bit slow.
If I am correct, then a possible fix would be to rephrase your query using a join to a subquery that finds the max event time for each ID in your table. That subquery would be run once, and only once, and then it would be up to MySQL to find an efficient way to join back to your original table. In any case, this approach should be eons faster than what you were using.
SELECT t1.*
FROM my_table t1
INNER JOIN
(
SELECT ID, MAX(EVENT_TIME) AS max_event_time
FROM my_table
WHERE EVENT_TYPE = '2'
GROUP BY ID
) t2
ON t1.ID = t2.ID AND
t1.EVENT_TIME BETWEEN
DATE_SUB(t2.max_event_time, INTERVAL 12 HOUR) AND
t2.max_event_time
WHERE t1.EVENT_TYPE != 3
ORDER BY t1.ID;
Here is a link to your updated Fiddle:
Demo
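If the rewritten query is still slow on the full 260K rows, indexes along these lines might help. This is only a sketch (index names are made up, and whether the optimizer actually uses them depends on your data):

-- Serves the derived table: equality on EVENT_TYPE, grouping by ID, MAX(EVENT_TIME) read from the index.
CREATE INDEX idx_type_id_time ON my_table (EVENT_TYPE, ID, EVENT_TIME);
-- Serves the join back: equality on ID plus the EVENT_TIME range.
CREATE INDEX idx_id_time ON my_table (ID, EVENT_TIME);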
I have query like this
UPDATE linksupload as lu SET lu.progress = (SELECT SUM(l.submitted)
FROM links l
WHERE l.upload_id = lu.id)
It takes 10 sec to execute. linksupload contains 10 rows, links contains 700k rows.
Query:
UPDATE linksupload as lu SET lu.progress = (SELECT count(*)
FROM links l
WHERE l.upload_id = lu.id)
takes 0.0003 sec to execute. Also, the SELECT with SUM and GROUP BY from the first query is fast on its own. upload_id and id are indexed. Why does the first query take so long to execute? How can I speed it up?
Indexes allow the database to find data fast, without reading the whole table.
The second query only needs a count, so it does not have to read the table. But the first query has to sum the submitted column, so it does read the table.
That is why the first query is slower.
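Following that reasoning, a covering index that includes the summed column would let the SUM be answered from the index as well. A sketch, assuming links currently has only a single-column index on upload_id:

-- upload_id drives the correlation; including submitted makes SUM(submitted) index-only.
ALTER TABLE links ADD INDEX idx_upload_submitted (upload_id, submitted);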
I have a join query that seems to fetch slowly. How can I optimize it, or is this reasonable?
time to execute
29 total, Query took 1.6956 sec
MySQL query:
SELECT SQL_CALC_FOUND_ROWS
t2.AuctionID ,t2.product_name ,t3.user_name ,t1.date_time ,t1.owned_price
,t2.specific_product_id
FROM table_user_ownned_auction AS t1
INNER JOIN table_product AS t2 ON t1.specific_product_id=t2.specific_product_id
INNER JOIN table_user_information AS t3 ON t3.user_id=t1.user_id
ORDER BY ownned_id DESC
Here's the explain output
Looking at the explain output, your problem is in the second line: the join with table t1.
Put an index on t1.specific_product_id and t2.specific_product_id.
The first line has only 3 rows in it; using filesort on that is actually faster than using an index because it saves on I/O time.
The following code will add an index to t2.specific_product_id.
ALTER TABLE table_product ADD INDEX spi(specific_product_id);
Because you only have 29 rows of output, using the index should speed up your query to instantaneous.
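The answer also mentions indexing t1.specific_product_id; a sketch of that second statement, assuming the column exists on table_user_ownned_auction:

-- Same idea for the other side of the join.
ALTER TABLE table_user_ownned_auction ADD INDEX spi2(specific_product_id);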
If you want to understand the performance issues of a query, just use the EXPLAIN keyword in front of your query:
EXPLAIN SELECT SQL_CALC_FOUND_ROWS
t2.AuctionID, t2.product_name, t3.user_name, t1.date_time, t1.owned_price,
t2.specific_product_id
FROM table_user_ownned_auction AS t1
INNER JOIN table_product AS t2 ON t1.specific_product_id=t2.specific_product_id
INNER JOIN table_user_information AS t3 ON t3.user_id=t1.user_id
ORDER BY ownned_id DESC
It will tell you important information about your query. The most important columns are "key" and "Extra".
If "key" is NULL, you need an index, mostly on columns that are used in WHERE, GROUP BY, or ORDER BY clauses. "Extra" tells you about resource-consuming (CPU or memory) operations.
So, add an index on "ownned_id" (which I presume should be "owned_id") and EXPLAIN it again. Then look at the performance gain.
If you have problems, I can help you better if you paste the EXPLAIN output.
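A sketch of that suggested index, assuming ownned_id lives on table_user_ownned_auction (the ORDER BY does not qualify it):

-- Supports ORDER BY ownned_id DESC without a filesort.
ALTER TABLE table_user_ownned_auction ADD INDEX idx_ownned_id (ownned_id);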
Looking at your explain table, the type is ALL for every table, which is very bad if you have more than 10,000 rows in a table. I strongly advise indexing these columns in your tables (see the sketch after this list):
t1.specific_product_id
t2.specific_product_id
t3.user_id
t1.user_id
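A sketch of those index statements, assuming none of them already exist (t3.user_id may already be the primary key of table_user_information, in which case that one is unnecessary):

-- Join and lookup columns from the list above.
ALTER TABLE table_user_ownned_auction ADD INDEX idx_spec_prod (specific_product_id);
ALTER TABLE table_product ADD INDEX idx_spec_prod (specific_product_id);
ALTER TABLE table_user_information ADD INDEX idx_user_id (user_id);
ALTER TABLE table_user_ownned_auction ADD INDEX idx_user_id (user_id);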
Should your tables reach 10,000 rows, you should be able to see a performance boost. For more information, please watch the video below from 00:00 to 02:04. As you can see in the video, before indexing the query has to search more than 90,000 rows of data, and after indexing it searches fewer than 5 rows. Hope it helps.
https://www.youtube.com/edit?o=U&video_id=ojyEcNMAj8k