I have this query which takes around 29 second to perform and need to make it faster. I have created index on aggregate_date column and still no real improvement. Each aggregate_date has almost 26k rows within the table.
One more thing the query will run starting from 1/1/2018 till yesterday date
select MAX(os.aggregate_date) as lastMonthDay,
os.totalYTD
from (
SELECT aggregate_date,
Sum(YTD) AS totalYTD
FROM tbl_aggregated_tables
WHERE subscription_type = 'Subcription Income'
GROUP BY aggregate_date
) as os
GROUP by MONTH(os.aggregate_date), YEAR(os.aggregate_date);
I used Explain Select ... and received the following
update
The most of query time is consumed by the inner Query, so as scaisEdge suggested bellow i have tested the query and the time is reduced to almost 8s.
the Inner Query will look like:
select agt.aggregate_date,SUM(YTD)
from tbl_aggregated_tables as agt
FORCE INDEX(idx_aggregatedate_subtype_YTD)
WHERE agt.subscription_type = 'Subcription Income'
GROUP by agt.aggregate_date
I have noticed that this comparison "WHERE agt.subscription_type = 'Subcription Income'" takes the most of time. So is there any way to change that and to be mentioned the column of subscription_type only have 2 values which is 'Subcription Income' and 'Subcription Unit'
The index on aggregate_date column is not useful for performance because in not involved in where condition
looking to your code an useful index should be on column subscription_type
you could try using a redundant index adding also the column involved in select clause (for try to obtain all the data in query from index avoiding access to table)so you index could be
create idx1 on tbl_aggregated_tables (subscription_type, aggregate_date, YTD )
the meaning of the last group by seems not coherent with the select clause
Related
I want to move back from any given date. Help me in getting 3 months back date from given date.
I have tried DATE_SUB, DATE_ADD functions which do not return any result an keep running the query.
SELECT * FROM Table a
INNER JOIN Tabel e
on a.MobileNumber=e.Phone_number
AND e.bill_date<a.invoice_date
AND e.bill_date>DATE_ADD(a.invoice_date,INTERVAL -3 MONTH)
Query keep running and never stops. If i just remove the last condition, it shows the results in less than a second.
Adding an index along the lines of the following might help the performance:
CREATE INDEX idx ON table_e (Phone_number, bill_date, col1, col2);
Here col1 and col2 are the other two columns which might appear in the SELECT clause. The strategy of this index, if used, would be to scan the relatively small table_a, which only has 309 records. For each record in a, MySQL would then use the above index to rapidly find the matching records in the e table.
If you can update then it is better to create index on a.invoice_date and . Please find the link for the same.
I have this query
SELECT `PR_CODIGO`, `PR_EXIBIR`, `PR_NOME`, `PRC_DETALHES` FROM `PROPRIETARIOS` LEFT JOIN `PROPRIETARIOSCONTATOS` ON `PROPRIETARIOSCONTATOS`.`PRC_COD_CAD` = `PROPRIETARIOS`.`PR_CODIGO` WHERE `PR_EXIBIR` = 'T' LIMIT 20
It runs very fast, less than 1 second.
If i add GROUP BY, it takes several seconds (5+) to run. Even the Group By field being index.
I'm using group by because the query above returns repeated rows (i search for a name and his contacts on another table, show's 4 times same name).
How do i fix this?
With the GROUP BY clause, the LIMIT clause isn't applied until after the rows are collapsed by the group by operation.
To get an understanding of the operations that MySQL is performing and which indexes are being considered and chosen by the optimizer, we use EXPLAIN.
Unstated in the question is what "field" (columns or expressions) are in the GROUP BY clause. So we are only guessing.
Based on the query shown in the question...
SELECT pr.pr_codigo
, pr.pr_exibir
, pr.pr_nome
, prc.prc_detalhes
FROM `PROPRIETARIOS` pr
LEFT
JOIN `PROPRIETARIOSCONTATOS` prc
ON prc.prc_cod_cad = pr.pr_codigo
WHERE pr.pr_exibir = 'T'
LIMIT 20
Our guess at the most appropriate indexes...
... ON PROPRIETARIOSCONTATOS (prc_cod_cad, prc_detalhes)
... ON PROPRIETARIOS (pr_exibir, pr_codigo, pr_exibir, pr_nome)
Our guess is going to change depending on what column(s) are listed in the GROUP BY clause. And we might also suggest an alternative query to return an equivalent result.
But without knowing the GROUP BY clause, without knowing if our guesses about which table each column is from are correct, without knowing the column datatypes, without any estimates of cardinality, and without example data and expected output, ... we're flying blind and just making guesses.
I have a query that works, but it is slow. Is there a way to speed this up? Basically I have a table with timecard entries, and then a second table with time breakdowns of that entry, related by the TimecardID. What I am looking for is timeblocks that there are no breakdowns for. I thought if I cut the criteria down to 2 months that it would speed it up. Thanks for your help
SELECT * FROM Timecards
WHERE NOT EXISTS (SELECT TimeCardID FROM TimecardBreakdown WHERE Timecards.ID = TimecardBreakdown.TimeCardID)
AND Status <> 0
AND DateIn >= CURRENT_DATE() - INTERVAL 2 MONTH
It seems you want to know the TimecardIDs which do not exist in the TimecardBreakdown table, in which case you can use the left outer join.
SELECT a.*
FROM Timecards a
LEFT OUTER JOIN TimecardBreakdown b ON a.TimecardID = b.TimecardID
WHERE b.TimecardID IS NULL
This would get rid of the subquery (which is expensive) and use join (which is more efficient).
MySQL stinks doing correlated subqueries fast. Try to make your subqueries independent and join them. You can use the LEFT JOIN ... IS NULL pattern to replace WHERE NOT EXISTS.
SELECT tc.*
FROM Timecards tc
LEFT JOIN TimecardBreakdown tcb ON tc.ID = tcb.TimeCardId
WHERE tc.DateIn >= CURRENT_DATE() - INTERVAL 2 MONTH
AND tc.Status <> 0
AND tcb.TimeCardId IS NULL
Some optimization points.
First, if you can change tc.Status <> 0 to tc.Status > 0 it makes an index range scan possible on that column.
Second, when you're optimizing stuff, SELECT * is considered harmful. Instead, if you can give the names of just the columns you need, things will be quicker. The database server has to sling around all the data you ask for; it can't tell if you're going to ignore some of it.
Third, this query will be helped by a compound index on Timecards (DateIn, Status, ID). That compound index can be used to do the heavy lifing of satisfying your query conditions.
That's called a covering index; it contains the data needed to satisfy much of your query. If you were to index just the DateIn column, then the query handler would have to bounce back to the main table to find the values of Status and ID. When those columns appear in the index, it saves that extra operation.
If you SELECT a certain set of columns rather than doing SELECT *, including those columns in the covering index can dramatically improve query performance. That's one of several reasons SELECT * is considered harmful.
(Some makes and model of DBMS have ways to specify lists of columns to ride along on indexes without actually indexing them. MySQL requires you to index them. But covering indexes still help.)
Read this: http://use-the-index-luke.com/
I have a problem optimizing a really slow SQL query. I think is an index problem, but I can´t find which index I have to apply.
This is the query:
SELECT
cl.ID, cl.title, cl.text, cl.price, cl.URL, cl.ID AS ad_id, cl.cat_id,
pix.file_name, area.area_name, qn.quarter_name
FROM classifieds cl
/*FORCE INDEX (date_created) */
INNER JOIN classifieds_pix pix ON cl.ID = pix.classified_id AND pix.picture_no = 0
INNER JOIN zip_codes zip ON cl.zip_id = zip.zip_id AND zip.area_id = 132
INNER JOIN area_names area ON zip.area_id = area.id
LEFT JOIN quarter_names qn ON zip.quarter_id = qn.id
WHERE
cl.confirmed = 1
AND cl.country = 'DE'
AND cl.date_created <= NOW() - INTERVAL 1 DAY
ORDER BY
cl.date_created
desc LIMIT 7
MySQL takes about 2 seconds to get the result, and start working in pix.picture_no, but if I force index to "date_created" the query goes much faster, and takes only 0.030 s. But the problem is that the "INNER JOIN zip_codes..." is not always in the query, and when is not, the forced index make the query slow again.
I've been thinking in make a solution by PHP conditions, but I would like to know what is the problem with indexes.
These are several suggestions on how to optimize your query.
NOW Function - You're using the NOW() function in your WHERE clause. Instead, I recommend to use a constant date / timestamp, to allow the value to be cached and optimized. Otherwise, the value of NOW() will be evaluated for each row in the WHERE clause. An alternative to a constant value in case you need a dynamic value, is to add the value from the application (for example calculate the current timestamp and inject it to the query as a constant in the application before executing the query.
To test this recommendation before implementing this change, just replace NOW() with a constant timestamp and check for performance improvements.
Indexes - in general, I would suggest adding an index the contains all columns of your WHERE clause, in this case: confirmed, country, date_created. Start with the column that will cut the amount of data the most and move forward from there. Make sure you adjust the WHERE clause to the same order of the index, otherwise the index won't be used.
I used EverSQL SQL Query Optimizer to get these recommendations (disclaimer: I'm a co-founder of EverSQL and humbly provide these suggestions).
I would actually have a compound index on all elements of your where such as
(country, confirmed, date_created)
Having the country first would keep your optimized index subset to one country first, then within that, those that are confirmed, and finally the date range itself. Don't query on just the date index alone. Since you are ordering by date, the index should be able to optimize it too.
Add explain in front of the query and run it again. This will show you the indexes that are being used.
See: 13.8.2 EXPLAIN Statement
And for an explanation of explain see MySQL Explain Explained. Or: Optimizing MySQL: Queries and Indexes
I have a table with ~3M rows. The rows are date, time, msec, and some other columns with int data. Some unknown fraction of these rows are considered 'invalid' based on their existence in a separate table outages (based on date ranges).
Currently the query does a select * and then uses a huge WHERE to remove the invalid date ranges ( lots of 'and not ( RecordDate >'2008-08-05' and RecordDate < '2008-08-10' )') and so on. This blows away any chance of using an index.
Im looking for a better way to limit the results. As it stands now the query takes several minutes to run.
DELETE b FROM bigtable b
INNER JOIN outages o ON (b.`date` BETWEEN o.datestart AND o.dateend)
WHERE (1=1) //In some modes MySQL demands a `where` clause or it will not run.
Make sure you have an index on all fields involved in the query.