Query takes too long in MySQL

I have 3 models called stores, customers, and subscriptions.
A subscription has two foreign keys, one to the store model and one to the customer model, and also has start_date and end_date.
The tables are pretty simple: store has only id and name, and so does customer.
I'm running this query:
SELECT subscription_subscription.store_id, COUNT(*) AS sub_store
FROM subscription_subscription
WHERE CURRENT_DATE() <= subscription_subscription.end_date
GROUP BY subscription_subscription.store_id
ORDER BY sub_store DESC
The result: 621760 total, query took 9.6737 seconds.
Each table has 1 million rows.
But when I remove the WHERE CURRENT_DATE() <= subscription_subscription.end_date condition, the query takes 0.3177 seconds.
How can I optimize the date comparison?

You can try these two things:
Store CURRENT_DATE() in a variable and use that variable in the query instead of calling the function.
Create an index on end_date that also includes store_id.
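Both suggestions can be sketched together as follows. This is a minimal sketch, not a measured fix: the variable name @today and the index name idx_subscription_end_store are made up for illustration, and whether the optimizer actually uses the index depends on how many rows satisfy the date condition.

```sql
-- Hypothetical variable name; evaluates CURRENT_DATE() once up front.
SET @today = CURRENT_DATE();

-- Hypothetical index name. With both columns in the index, the query
-- below can be answered from the index alone (a covering index).
CREATE INDEX idx_subscription_end_store
    ON subscription_subscription (end_date, store_id);

SELECT store_id, COUNT(*) AS sub_store
FROM subscription_subscription
WHERE end_date >= @today
GROUP BY store_id
ORDER BY sub_store DESC;
```

Running EXPLAIN on the SELECT before and after creating the index shows whether the new index is chosen.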

Related

How to speed up query for datetime in Mysql

SELECT *
FROM LOGS
WHERE datetime > DATE_SUB(NOW(), INTERVAL 1 MONTH)
I have a big table LOGS (InnoDB). When I try to get last month's data, the query takes too long.
I created an index on the datetime column, but it does not seem to help. How can I speed up this query?
Since the records are inserted from oldest to newest, you could split this into 2 calls. The first call gets the ID of the oldest record in the range:
SELECT MIN(id) INTO @oldestRecordID
FROM LOGS
WHERE datetime > DATE_SUB(NOW(), INTERVAL 1 MONTH);
Then, with that ID, request all records where ID >= @oldestRecordID:
SELECT *
FROM LOGS
WHERE id >= @oldestRecordID;
It's multiple calls, but it could be faster. I am sure you could combine those 2 calls into one query, too.
Probably the only thing you can do is create a clustered index on datetime (in InnoDB the clustered index is the primary key, so datetime would have to be its leading column). This will ensure that the values are co-located.
However, I don't think this will solve your real problem. Why are you bringing back all the records from a month? This is a lot of data.
In all likelihood, you could summarize the data in the database and only bring back the information you need rather than all the data.
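As an illustration of summarizing in the database, the sketch below returns one row per day instead of every log line. The choice of a daily count is only an assumption about what the application needs; log_day and entries are made-up aliases.

```sql
-- One row per day for the last month, rather than every log record.
SELECT DATE(`datetime`) AS log_day,
       COUNT(*)         AS entries
FROM LOGS
WHERE `datetime` > DATE_SUB(NOW(), INTERVAL 1 MONTH)
GROUP BY log_day
ORDER BY log_day;
```

The WHERE clause still compares the bare column, so an index on datetime remains usable; DATE() is applied only in the select list and the grouping.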

How to generate faster mysql query with 1.6M rows

I have a table that has 1.6M rows. Whenever I use the query below, I get an average of 7.5 seconds.
select * from table
where pid = 170
and cdate between '2017-01-01 0:00:00' and '2017-12-31 23:59:59';
I tried adding LIMIT 1000 or 10000, and changing the date range to filter for 1 month, but it still takes an average of 7.5s. I tried adding a composite index on pid and cdate, but that made the query 1 second slower.
Here is the INDEX list:
https://gist.github.com/primerg/3e2470fcd9b21a748af84746554309bc
Can I still make it faster? Is this an acceptable performance considering the amount of data?
It looks like the index is missing. Create this index and see if it helps.
CREATE INDEX cid_date_index ON table_name (pid, cdate);
Also modify your query as below.
select * from table
where pid = 170
and cdate between CAST('2017-01-01 0:00:00' AS DATETIME) and CAST('2017-12-31 23:59:59' AS DATETIME);
Please provide SHOW CREATE TABLE clicks.
How many rows are returned? If it is 100K rows, the effort to shovel that many rows is significant. And what will you do with that many rows? If you then summarize them, consider summarizing in SQL!
Do declare cdate as DATETIME.
Do you use id for anything? Perhaps this would be better:
PRIMARY KEY (pid, cdate, id) -- to get benefit from clustering
INDEX(id) -- if still needed (and to keep AUTO_INCREMENT happy)
This smells like Data Warehousing. DW benefits significantly from building and maintaining Summary table(s), such as one that has the daily click count (etc), from which you could very rapidly sum up 365 counts to get the answer.
CAST is unnecessary. Furthermore 0:00:00 is optional -- it can be included or excluded for either DATE or DATETIME. I prefer
cdate >= '2017-01-01'
AND cdate < '2017-01-01' + INTERVAL 1 YEAR
to avoid leap year, midnight, date arithmetic, etc.
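The summary-table idea mentioned above could look roughly like this. It is a sketch under assumptions: the source table is taken to be clicks (the name used in the SHOW CREATE TABLE request), and the summary table clicks_daily and its columns are hypothetical.

```sql
-- Hypothetical summary table: one row per pid per day.
CREATE TABLE clicks_daily (
    pid         INT          NOT NULL,
    click_date  DATE         NOT NULL,
    click_count INT UNSIGNED NOT NULL,
    PRIMARY KEY (pid, click_date)
);

-- Populate it once for 2017 (a real setup would refresh it daily).
INSERT INTO clicks_daily (pid, click_date, click_count)
SELECT pid, DATE(cdate), COUNT(*)
FROM clicks
WHERE cdate >= '2017-01-01'
  AND cdate <  '2017-01-01' + INTERVAL 1 YEAR
GROUP BY pid, DATE(cdate);

-- The yearly total now sums at most 366 small rows for one pid.
SELECT SUM(click_count) AS clicks_2017
FROM clicks_daily
WHERE pid = 170
  AND click_date >= '2017-01-01'
  AND click_date <  '2017-01-01' + INTERVAL 1 YEAR;
```

This only helps if the application needs counts or other aggregates; it does not replace the detail table when individual rows are required.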

How can I make this sql query faster?

I have a table user_notifications that has 1100000 records. I have to run the query below, but it takes more than 3 minutes to complete. What can I do to improve the fetch time?
SELECT `user_notifications`.`user_id`
FROM `user_notifications`
WHERE `user_notifications`.`notification_template_id` = 175
AND (DATE(sent_at) >= DATE_SUB(CURDATE(), INTERVAL 4 day))
AND `user_notifications`.`user_id` IN (
1203, 1282, 1499, 2244, 2575, 2697, 2828, 2900, 3085, 3989,
5264, 5314, 5368, 5452, 5603, 6133, 6498..
)
The IN list sometimes contains up to 1k user ids.
For optimisation I have indexed the user_id and notification_template_id columns in the user_notifications table.
Big IN() lists are inherently slow. Create a temporary table with an index and put the values from the IN() list into that temporary table instead; then you'll get the power of an indexed join instead of a giant IN() list.
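A minimal sketch of the temporary-table approach, assuming user_id is an integer column. The table name tmp_user_ids is made up, and only the first few ids from the question are shown; the full list would be inserted the same way.

```sql
-- Hypothetical temporary table; the primary key provides the index.
CREATE TEMPORARY TABLE tmp_user_ids (
    user_id INT NOT NULL PRIMARY KEY
);

-- Only a few of the ids from the question, for illustration.
INSERT INTO tmp_user_ids (user_id)
VALUES (1203), (1282), (1499), (2244), (2575);

-- Join against the temporary table instead of using a long IN() list.
SELECT un.user_id
FROM user_notifications AS un
JOIN tmp_user_ids AS t
  ON t.user_id = un.user_id
WHERE un.notification_template_id = 175
  AND un.sent_at >= CURDATE() - INTERVAL 4 DAY;

DROP TEMPORARY TABLE tmp_user_ids;
```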
You seem to be querying for a small date range. How about having an index based on SENT_AT column? Do you know what index the current query is using?
(1) Don't hide columns in functions if you might need to use an index:
AND (DATE(sent_at) >= DATE_SUB(CURDATE(), INTERVAL 4 day))
-->
AND sent_at >= CURDATE() - INTERVAL 4 day
(2) Use a "composite" index for
WHERE `notification_template_id` = 175
AND sent_at >= ...
AND `user_id` IN (...)
The first column should be the one with '='. It is unclear what to put next, so I suggest adding both of these indexes:
INDEX(notification_template_id, user_id, sent_at)
INDEX(notification_template_id, sent_at)
The Optimizer will probably pick between them correctly.
Composite indexes are not the same as indexes on the individual columns.
(3) Yes, you could try putting the IN list in a tmp table, but the cost of doing such might outweigh the benefit. I don't think of 1K values in IN() as being "too many".
(4) My cookbook on building indexes.
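Points (1) and (2) combined might look like the following sketch. The index names are made up, and the IN list is shortened to three of the ids from the question.

```sql
-- Hypothetical index names for the two suggested composite indexes.
ALTER TABLE user_notifications
    ADD INDEX idx_template_user_sent (notification_template_id, user_id, sent_at),
    ADD INDEX idx_template_sent (notification_template_id, sent_at);

-- sent_at is compared directly, without DATE(), so an index can be used.
EXPLAIN
SELECT user_id
FROM user_notifications
WHERE notification_template_id = 175
  AND sent_at >= CURDATE() - INTERVAL 4 DAY
  AND user_id IN (1203, 1282, 1499);
```

The key column of the EXPLAIN output shows which of the two indexes the optimizer picked; the unused one can then be dropped.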

Getting last 30 days of records

I have a table called 'Articles'. Two of its columns are essential to the query I want to create. The first is the dateStamp column, which is a datetime column. The second is the Counter column, which is an int(255) column and holds the number of views for that particular article.
I am trying to create a query that returns the records from the last 30 days, ordered by most viewed, and limited to 10 records. The current query I have is this:
SELECT *
FROM Articles
WHERE DATEDIFF(day, dateStamp, getdate()) BETWEEN 0 and 30
LIMIT 10
) TOP10
ORDER BY Counter DESC
This query is not displaying any records, but I don't understand what I am doing wrong. Any suggestions?
The MySQL version of the query would look like this:
SELECT a.*
FROM Articles a
WHERE a.dateStamp >= CURDATE() - interval 30 day
ORDER BY a.counter DESC
LIMIT 10;
Your query is generating an error. You should look at that error before fixing the query.
The query would look different in SQL Server.

How to optimize query with date calculation

This is my table structure (about 1 million records):
I need to select a few indices at certain dates, but only the year and month are relevant:
SELECT `index_name`,`results` FROM `mst_ind` WHERE
((`index_name`='MSCI EAFE Mid NR USD' AND MONTH(`date`) = 3 AND YEAR(`date`) = 2003) OR
(`index_name`='MSCI Morocco PR USD' AND MONTH(`date`) = 3 AND YEAR(`date`) = 2003))
AND `time_period`='M1'
It works fine, but the performance is horrible. I ran the query through the profiler, but it could not suggest any possible keys.
The primary key contains index_id, date and time_period.
How can I optimize/improve this query?
Thanks!
Update: the explain report:
You are probably preventing the use of an index, because functions such as MONTH and YEAR transform the column that would be indexed.
You could:
Write the WHERE clause so that it doesn't use the MONTH and YEAR functions, such as:
date >= '2003-03-01' and date < '2003-04-01'
Edit: just realized you probably don't have any indexes on this table. Consider adding indexes on the index_name, date and time_period fields.
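Put together, the rewritten query could look like this sketch. The composite index and its column order are an assumption (equality columns first, the date range last), not something taken from the table definition, and the index name is made up.

```sql
-- Hypothetical composite index: equality columns first, range column last.
ALTER TABLE mst_ind
    ADD INDEX idx_name_period_date (index_name, time_period, `date`);

-- Both OR branches in the original filter on March 2003, so they can be
-- merged into one IN list plus a half-open date range.
SELECT index_name, results
FROM mst_ind
WHERE index_name IN ('MSCI EAFE Mid NR USD', 'MSCI Morocco PR USD')
  AND time_period = 'M1'
  AND `date` >= '2003-03-01'
  AND `date` <  '2003-04-01';
```

If different indices need different months, the OR form has to stay, with each branch using its own date range in place of MONTH() and YEAR().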