slow sql query - selecting data depending on date interval - mysql

I have a query that is causing me some trouble. I'm wondering if there is a better way to write this SQL:
SELECT * FROM report
WHERE blogid = 1769577
AND DATE_SUB(CURDATE(),INTERVAL 30 DAY) <= datetime
so that it fetches the results faster.
Thanks in advance.

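I don't see anything wrong with the query, but make sure you have indexes on the blogid and datetime columns. A composite index on (blogid, datetime) fits this query best, since it filters on blogid by equality and on datetime by range.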
If your table is huge, you might consider horizontal partitioning, which can have a significant impact on performance. See http://dev.mysql.com/tech-resources/articles/performance-partitioning.html

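In Oracle SQL, you can do date arithmetic with the + and - operators. In MySQL, CURDATE() - 30 is evaluated numerically (the date is treated as the integer YYYYMMDD), so it does not give the date 30 days ago. MySQL needs the INTERVAL syntax with the operator, as in datetime >= (CURDATE() - INTERVAL 30 DAY).
SELECT * FROM report WHERE blogid = 1769577 AND datetime >= (CURDATE() - INTERVAL 30 DAY)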
Edit: This blog entry seems to confirm my suggestion: http://mysql-tips.blogspot.com/2005/04/mysql-date-calculations.html
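A small Python sketch of why plain integer subtraction is not date arithmetic in MySQL. The helper mysql_numeric_date is hypothetical; it only mimics the YYYYMMDD integer that MySQL produces when a DATE is used in a numeric context, and the "current date" is an arbitrary example value.

```python
from datetime import date, timedelta


def mysql_numeric_date(d: date) -> int:
    """Mimic a DATE used in numeric context in MySQL: the integer YYYYMMDD."""
    return d.year * 10000 + d.month * 100 + d.day


# Hypothetical "current date", chosen only for illustration.
today = date(2024, 1, 15)

# What CURDATE() - 30 computes: integer subtraction on YYYYMMDD.
naive = mysql_numeric_date(today) - 30

# What CURDATE() - INTERVAL 30 DAY computes: real calendar arithmetic.
proper = today - timedelta(days=30)

print(naive)   # an integer that is not a valid YYYYMMDD date
print(proper)  # the calendar date 30 days earlier
```

The integer result has "85" in the day position, so it does not correspond to any calendar date, while the interval form crosses the month and year boundary correctly.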

I suggest using the built-in datediff() function instead of subtracting the interval, like this:
datediff(curdate(), datetime) < 30
so the query is:
SELECT * FROM report WHERE blogid = 1769577 AND datediff(curdate(), datetime) < 30

Guess 1: your blogid is not an int column. Then you can read the MySQL manual:
"Comparison of dissimilar columns may prevent use of indexes if values cannot be compared directly without conversion. Suppose that a numeric column is compared to a string column. For a given value such as 1 in the numeric column, it might compare equal to any number of values in the string column such as '1', ' 1', '00001', or '01.e1'. This rules out use of any indexes for the string column."
Guess 2: your blogid is not indexed.
Guess 3: the report table is MyISAM. In that case MySQL takes a table-level lock on the whole report table whenever data is modified. You say that a new record is added every time a blog is viewed. These frequent writes may cause lock contention and slow down your SELECT queries.
Otherwise your query is fine.
Cheers!

Related

Add minutes to datetime in SQL SELECT from other column

I have an SQL query, which selects a datetime from a database. In another column called "add_minutes" are minutes I want to add to the datetime within the query. It should look like this:
SELECT * from availability WHERE ? < (datetime_start + add_minutes)
Any hints how to solve this?
Thank you!

SELECT *
FROM availability
WHERE ? < DATE_ADD(datetime_start, INTERVAL add_minutes MINUTE)
Also:
SELECT *
FROM availability
WHERE ? < ADDTIME(datetime_start, SEC_TO_TIME(add_minutes * 60))
Note that MySQL is dumb, and both DATE_ADD() and ADDTIME() work with string expressions. Because of potential localization/formatting issues, converting between numbers and strings can be a surprisingly expensive operation, especially if you have to do it for every row in a table.
Additionally, what we're doing here breaks any possibility of using indexes you might have on these columns. You can improve performance considerably like this:
SELECT *
FROM availability
WHERE ADDTIME(?, SEC_TO_TIME(add_minutes * -60)) < datetime_start
This inverts the interval and adds it to the source value instead. It still needs to look at every value in the add_minutes column, regardless of index, but now datetime_start is unchanged, and therefore indexes on that column can still be used.
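The equivalence behind this rewrite can be checked with a short Python sketch. The function names and sample rows are made up for illustration; probe stands for the ? parameter of the query.

```python
from datetime import datetime, timedelta


def matches_original(probe, datetime_start, add_minutes):
    # Original form: the stored column value is modified before comparing.
    return probe < datetime_start + timedelta(minutes=add_minutes)


def matches_inverted(probe, datetime_start, add_minutes):
    # Inverted form: the interval is subtracted from the probe instead,
    # so datetime_start is compared unchanged.
    return probe - timedelta(minutes=add_minutes) < datetime_start


# Hypothetical (datetime_start, add_minutes) rows.
rows = [
    (datetime(2024, 3, 1, 9, 0), 30),
    (datetime(2024, 3, 1, 9, 0), 90),
    (datetime(2024, 3, 1, 12, 0), 0),
]
probe = datetime(2024, 3, 1, 10, 0)

original = [matches_original(probe, start, minutes) for start, minutes in rows]
inverted = [matches_inverted(probe, start, minutes) for start, minutes in rows]
print(original == inverted)
```

Both forms select the same rows; only the side of the comparison that carries the arithmetic changes.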
Just for fun, here's how Sql Server does it:
SELECT *
FROM availability
WHERE DATEADD(minute, add_minutes * -1, ?) < datetime_start
SQL Server is less dumb about its DATEADD() function. Everything here is numeric; there are no messy conversions between strings and numbers or dates. SQL Server also supports computed columns with indexes. So you could include a column in the table defined as DATEADD(minute, add_minutes, datetime_start), and have an index on that column. MySQL 5.7 and later also support generated columns, and those columns can be indexed; older versions do not.

What keys should be indexed here to make this query optimal

I have a query that looks like the following:
SELECT * from foo
WHERE days >= DATEDIFF(CURDATE(), last_day)
In this case, days is an INT. last_day is a DATE column.
So do I need two separate indexes here, one for days and one for last_day?

This query predicate, days >= DATEDIFF(CURDATE(), last_day), is inherently not sargable: it cannot be satisfied by a range lookup on an index.
If you keep the present table design you'll probably benefit from a compound index on (last_day, days). Nevertheless, satisfying the query will require a full scan of that index.
Single-column indexes on either one of those columns, or both, will be useless or worse for improving this query's performance.
If you must have this query perform very well, you need to reorganize your table a bit. Let's figure that out. It looks like you are trying to exclude "overdue" records: you want expiration_date >= CURDATE(), where expiration_date is last_day plus days. That is a sargable search predicate.
So if you added a new column expiration_date to your table, and then set it as follows:
UPDATE foo SET expiration_date = last_day + INTERVAL days DAY
and then indexed it, you'd have a well-performing query.
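As a quick check of the date arithmetic, the predicate days >= DATEDIFF(CURDATE(), last_day) holds exactly when last_day plus days falls on or after the current date. A minimal Python sketch, with made-up sample rows and a fixed stand-in for CURDATE():

```python
from datetime import date, timedelta


def original_predicate(days, last_day, today):
    # days >= DATEDIFF(CURDATE(), last_day)
    return days >= (today - last_day).days


def expiration_predicate(days, last_day, today):
    # expiration_date = last_day + INTERVAL days DAY, then compared to CURDATE()
    expiration_date = last_day + timedelta(days=days)
    return expiration_date >= today


# Fixed stand-in for CURDATE() and hypothetical (days, last_day) rows.
today = date(2024, 6, 10)
samples = [
    (5, date(2024, 6, 1)),
    (9, date(2024, 6, 1)),
    (10, date(2024, 6, 1)),
    (0, date(2024, 6, 10)),
    (3, date(2024, 6, 20)),
]

before = [original_predicate(d, ld, today) for d, ld in samples]
after = [expiration_predicate(d, ld, today) for d, ld in samples]
print(before == after)
```

Because the stored expiration_date is compared directly against a constant, an index on that column can answer the query with a range lookup.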

Be careful with indexes: they speed up reads, but they can slow down inserts.
You might consider partitioning the table on the last_day field.
I would try an index on the last_day field only, but I think the best approach is to run performance tests with different configurations.

Since you are using an expression in the WHERE criteria, MySQL will not be able to use indexes on either of the two fields. If you use this expression regularly and you have at least MySQL v5.7.8, then you can create a generated column and create an index on it.
The other option is to create a regular column and set its value to the result of this expression and index this column. You will need triggers to keep it updated.

Why does MySQL drop my index when using DATE(`table`.`column`)

I have a MySQL InnoDB table with a few columns.
One of them is named "dateCreated"; it is a DATETIME column and it is indexed.
My query:
SELECT
*
FROM
`table1`
WHERE
DATE(`dateCreated`) BETWEEN '2014-8-7' AND '2013-8-7'
MySQL for some reason refuses to use the index on the dateCreated column (even with USE INDEX or FORCE INDEX).
However, if I change the query to this:
SELECT
*
FROM
`table1`
WHERE
`dateCreated` BETWEEN '2014-8-7' AND '2013-8-7'
note the DATE(...) removal
MySQL uses the index just fine.
I could manage without using the DATE() function, but this is just weird to me.
I understand that maybe MySQL indexes the full date and time and when searching only a part of it, it gets confused or something. But there must be a way to use a partial date (let's say MONTH(...) or DATE(...)) and still benefit from the indexed column and avoid the full table scan.
Any thoughts..?
Thanks.

As you have observed, once you apply a function to that field you lose access to the index.
It will also help if you don't use BETWEEN. The rationale for applying the function to the data is so you can get the data to match the parameters. There are just 2 parameter dates and several hundred? thousand? million? rows of data. Why not reverse this and change the parameters to suit the data (making it a "sargable" predicate)?
SELECT
*
FROM
`table1`
WHERE
( `dateCreated` >= '2013-08-07' AND `dateCreated` < '2014-08-07' )
;
Note that 2013-08-07 is used first, and this needs to be true if using BETWEEN also. You will not get any results using BETWEEN if the first date is later than the second date.
Also note that exactly 12 months of data is contained >= '2013-08-07' AND < '2014-08-07', I presume this is what you are seeking.
Using the combination of date(dateCreated) and between would include 1 too many days as all events during '2014-08-07' would be included. If you deliberately wanted one year and 1 day then add 1 day to the higher date i.e. so it would be < '2014-08-08'
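The effect can be observed with a small Python sketch. It uses SQLite from the standard library as a stand-in, not MySQL, so the plan text differs from MySQL's EXPLAIN output; the index name idx_dateCreated and the payload column are made up. The principle is the same: a function wrapped around the column hides it from the index, while a bare range comparison does not.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (id INTEGER PRIMARY KEY, dateCreated TEXT, payload TEXT)"
)
conn.execute("CREATE INDEX idx_dateCreated ON table1 (dateCreated)")


def plan(sql):
    """Return the detail text of SQLite's query plan for the given statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# Function applied to the column: the index on dateCreated cannot be used.
wrapped = plan(
    "SELECT * FROM table1 "
    "WHERE date(dateCreated) BETWEEN '2013-08-07' AND '2014-08-07'"
)

# Bare column with a half-open range: the index can be used for a range search.
bare = plan(
    "SELECT * FROM table1 "
    "WHERE dateCreated >= '2013-08-07' AND dateCreated < '2014-08-07'"
)

print(wrapped)
print(bare)
```

The first plan reports a scan of the whole table and the second a search using the index. In MySQL the same comparison can be made by prefixing each query with EXPLAIN.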

Maximize efficiency of SQL SELECT statement

Assume that we have a very large table, for example 3000 rows of data.
And we need to select all the rows whose status field is < 4.
We know that the relevant rows will be at most 2 months old (of course, each row has a date column).
Is this query the most efficient?
SELECT * FROM database.tableName WHERE status<4
AND DATE< '".date()-5259486."' ;
(date() is the PHP function; 5259486 is two months in seconds.)

Assuming you're storing dates as DATETIME, you could try this:
SELECT * FROM database.tableName
WHERE status < 4
AND DATE < DATE_SUB(NOW(), INTERVAL 2 MONTH)
Also, for optimizing search queries you could use EXPLAIN ( http://dev.mysql.com/doc/refman/5.6/en/explain.html ) like this:
EXPLAIN [your SELECT statement]
Another point where you can tweak response times is by carefully placing appropriate indexes.
Indexes are used to find rows with specific column values quickly. Without an index, MySQL must begin with the first row and then read through the entire table to find the relevant rows. The larger the table, the more this costs. If the table has an index for the columns in question, MySQL can quickly determine the position to seek to in the middle of the data file without having to look at all the data.
Here are some explanations & tutorials on MySQL indexes:
http://www.tutorialspoint.com/mysql/mysql-indexes.htm
http://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html
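However, keep in mind that TIMESTAMP is more compact than DATETIME: TIMESTAMP takes 4 bytes, while DATETIME takes 8 bytes (5 bytes plus fractional-second storage from MySQL 5.6.4 on). They hold equivalent information, except that TIMESTAMP is stored in UTC and converted to the session time zone, and it only covers 1970 to 2038.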

3,000 rows of data is not large for a database. In fact, it is on the small side.
The query:
SELECT *
FROM database.tableName
WHERE status < 4 ;
Should run pretty quickly on 3,000 rows, unless each row is very, very large (say 10k). You can always put an index on status to make it run faster.
The query suggested by cassi.iup makes more sense:
SELECT *
FROM database.tableName
WHERE status < 4 AND DATE < DATE_SUB(NOW(), INTERVAL 2 MONTH);
It will perform better with a composite index on (status, date). My question is: do you want all rows with a status below 4, or do you want all rows with a status below 4 in the past two months? In the first case, you would have to continually change the query. You would be better off with:
SELECT *
FROM database.tableName
WHERE status < 4 AND DATE < date('2013-06-19');
(as of the date when I am writing this.)

Timestamp as int field, query performance

I'm storing the timestamp as an int field. On a large table it takes too long to get the rows inserted on a given date, because I'm using the MySQL function FROM_UNIXTIME.
SELECT * FROM table WHERE FROM_UNIXTIME(timestamp_field, '%Y-%m-%d') = '2010-04-04'
Is there any way to speed up this query? Maybe I should query for rows using timestamp_field >= x AND timestamp_field < y?
Thank you
EDITED: This query works great, but make sure there is an index on timestamp_field.
SELECT * FROM table WHERE
timestamp_field >= UNIX_TIMESTAMP('2010-04-14 00:00:00')
AND timestamp_field <= UNIX_TIMESTAMP('2010-04-14 23:59:59')

Use UNIX_TIMESTAMP on the constant instead of FROM_UNIXTIME on the column:
SELECT * FROM table
WHERE timestamp_field
BETWEEN UNIX_TIMESTAMP('2010-04-14 00:00:00')
AND UNIX_TIMESTAMP('2010-04-14 23:59:59')
This can be faster because it allows the database to use an index on the column timestamp_field, if one exists. It is not possible for the database to use the index when you use a non-sargable function like FROM_UNIXTIME on the column.
If you don't have an index on timestamp_field then add one.
Once you have done this you can also try to further improve performance by selecting the columns you need instead of using SELECT *.
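The range bounds can also be computed in application code and passed to the query as plain integers. A minimal Python sketch, assuming the stored timestamps are meant to be read as UTC (MySQL's UNIX_TIMESTAMP() interprets its argument in the session time zone, so the numbers differ if that is not UTC). The helper name day_bounds_utc is made up. It returns a half-open range, to be used as timestamp_field >= start AND timestamp_field < end, which avoids spelling out 23:59:59.

```python
from datetime import datetime, timedelta, timezone


def day_bounds_utc(day_string):
    """Return (start, end) Unix timestamps covering one UTC day as [start, end)."""
    start = datetime.strptime(day_string, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    end = start + timedelta(days=1)
    return int(start.timestamp()), int(end.timestamp())


start_ts, end_ts = day_bounds_utc("2010-04-14")

# Parameters for a query of the form:
#   SELECT ... FROM table
#   WHERE timestamp_field >= %s AND timestamp_field < %s
params = (start_ts, end_ts)
print(params)
```

Since both bounds are constants, the comparison runs against the bare timestamp_field column and an index on it remains usable.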

If you're able to, it would be faster to either store the date as a proper datetime field, or, in the code running the query, to convert the date you're after to a unix timestamp before sending it to the query.
The FROM_UNIXTIME would have to convert every record in the table before it can check it which, as you can see, has performance issues. Using a native datatype that is closest to what you're actually using in your queries, or querying with the column's data type, is the fastest way.
So, if you need to continue using an int field for your time, then yes, using < and > on a plain integer would boost performance greatly, assuming you store times to the second rather than just the timestamp for midnight of that day.
