Timestamp as int field, query performance - mysql

I'm storing timestamps in an int field. On a large table it takes too long to get the rows inserted on a given date, because I'm using the MySQL function FROM_UNIXTIME.
SELECT * FROM table WHERE FROM_UNIXTIME(timestamp_field, '%Y-%m-%d') = '2010-04-04'
Is there any way to speed up this query? Maybe I should query for rows using timestamp_field >= x AND timestamp_field < y?
Thank you
EDITED: This query works great, but make sure there is an index on timestamp_field.
SELECT * FROM table WHERE
timestamp_field >= UNIX_TIMESTAMP('2010-04-14 00:00:00')
AND timestamp_field <= UNIX_TIMESTAMP('2010-04-14 23:59:59')

Use UNIX_TIMESTAMP on the constant instead of FROM_UNIXTIME on the column:
SELECT * FROM table
WHERE timestamp_field
BETWEEN UNIX_TIMESTAMP('2010-04-14 00:00:00')
AND UNIX_TIMESTAMP('2010-04-14 23:59:59')
This can be faster because it allows the database to use an index on the column timestamp_field, if one exists. It is not possible for the database to use the index when you use a non-sargable function like FROM_UNIXTIME on the column.
If you don't have an index on timestamp_field then add one.
Once you have done this you can also try to further improve performance by selecting the columns you need instead of using SELECT *.

If you're able to, it would be faster to either store the date as a proper datetime field, or, in the code running the query, to convert the date you're after to a unix timestamp before sending it to the query.
The FROM_UNIXTIME would have to convert every record in the table before it can check it which, as you can see, has performance issues. Using a native datatype that is closest to what you're actually using in your queries, or querying with the column's data type, is the fastest way.
So, if you need to continue using an int field for your time, then yes, using < and > on a plain integer would boost performance greatly, assuming you store things to the second rather than storing the timestamp for midnight of that day.
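As a rough sketch of the "convert the date in the calling code" option, the bounds for one day can be computed before the query is sent. This assumes the int column holds seconds since the epoch and that days are meant in UTC (MySQL's UNIX_TIMESTAMP() interprets its argument in the session time zone, so results can differ). The function name and the parameterised query string are illustrative, not from the original posts.

```python
from datetime import datetime, timedelta, timezone


def day_bounds_utc(day):
    """Return (start, end) Unix timestamps for a 'YYYY-MM-DD' day in UTC.

    The range is half-open: start <= ts < end.
    """
    start = datetime.strptime(day, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    end = start + timedelta(days=1)
    return int(start.timestamp()), int(end.timestamp())


start_ts, end_ts = day_bounds_utc("2010-04-14")

# Hypothetical parameterised query: the column is compared as a plain
# integer, so an index on timestamp_field remains usable.
query = (
    "SELECT * FROM `table` "
    "WHERE timestamp_field >= %s AND timestamp_field < %s"
)
params = (start_ts, end_ts)
```

Using < on the start of the next day, rather than <= on 23:59:59, avoids having to think about the last second of the day.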

Related

Add minutes to datetime in SQL SELECT from other column

I have an SQL query, which selects a datetime from a database. In another column called "add_minutes" are minutes I want to add to the datetime within the query. It should look like this:
SELECT * from availability WHERE ? < (datetime_start + add_minutes)
Any hints how to solve this?
Thank you!
SELECT *
FROM availability
WHERE ? < DATE_ADD(datetime_start, INTERVAL add_minutes MINUTE)
Also:
SELECT *
FROM availability
WHERE ? < ADDTIME(datetime_start, SEC_TO_TIME(add_minutes * 60))
Note that MySql is dumb, and both DATE_ADD() and ADDTIME() work with string expressions. Because of potential localization/formatting issues, converting between numbers and strings can be a surprisingly expensive operation, especially if you have to do this for every row in a table.
Additionally, what we're doing here breaks any possibility of using indexes you might have on these columns. You can improve performance considerably like this:
SELECT *
FROM availability
WHERE ADDTIME(?, SEC_TO_TIME(add_minutes * -60)) < datetime_start
This inverts the interval and adds it to the source value instead. It still needs to look at every value in the add_minutes column, regardless of index, but now datetime_start is unchanged, and therefore indexes on that column can still be used.
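The inversion relies on the two predicates being equivalent: ? < datetime_start + add_minutes selects the same rows as ? - add_minutes < datetime_start. A small check of that reasoning, using made-up sample rows (the values are purely illustrative):

```python
from datetime import datetime, timedelta

# Made-up sample rows: (datetime_start, add_minutes)
rows = [
    (datetime(2024, 1, 1, 9, 0), 30),
    (datetime(2024, 1, 1, 9, 0), 90),
    (datetime(2024, 1, 1, 10, 0), 0),
    (datetime(2024, 1, 1, 11, 0), 15),
]
probe = datetime(2024, 1, 1, 10, 0)  # stands in for the ? parameter

# Original form: ? < datetime_start + add_minutes
original = [r for r in rows if probe < r[0] + timedelta(minutes=r[1])]

# Inverted form: ? - add_minutes < datetime_start
inverted = [r for r in rows if probe - timedelta(minutes=r[1]) < r[0]]

print(original == inverted)  # True
```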
Just for fun, here's how Sql Server does it:
SELECT *
FROM availability
WHERE DATEADD(minute, add_minutes * -1, ?) < datetime_start
Sql Server is less dumb about its DATEADD() function. Everything here is numeric; there are no messy conversions between strings and numbers or dates. Sql Server also supports computed columns with indexes. So you could include a column in the table defined as DATEADD(minute, add_minutes, datetime_start), and have an index on that column. MySql has supported generated columns, including indexes on them, since version 5.7; older versions do not have them.

Why does MySQL drop my index when using DATE(`table`.`column`)

I have a MySQL InnoDB table with a few columns.
One of them is named "dateCreated"; it is a DATETIME column and it is indexed.
My query:
SELECT
*
FROM
`table1`
WHERE
DATE(`dateCreated`) BETWEEN '2014-8-7' AND '2013-8-7'
MySQL for some reason refuses to use the index on the dateCreated column (even with USE INDEX or FORCE INDEX).
However, if I change the query to this:
SELECT
*
FROM
`table1`
WHERE
`dateCreated` BETWEEN '2014-8-7' AND '2013-8-7'
(note the removal of DATE(...))
then MySQL uses the index just fine.
I could manage without using the DATE() function, but this is just weird to me.
I understand that maybe MySQL indexes the full date and time and when searching only a part of it, it gets confused or something. But there must be a way to use a partial date (let's say MONTH(...) or DATE(...)) and still benefit from the indexed column and avoid the full table scan.
Any thoughts..?
Thanks.
As you have observed, once you apply a function to that field you destroy access to the index.
It will also help if you don't use BETWEEN. The rationale for applying the function to the data is so you can get the data to match the parameters. There are just 2 parameter dates and several hundred? thousand? million? rows of data. Why not reverse this and change the parameters to suit the data (making it a "sargable" predicate)?
SELECT
*
FROM
`table1`
WHERE
( `dateCreated` >= '2013-08-07' AND `dateCreated` < '2014-08-07' )
;
Note that 2013-08-07 is used first, and this must also be the case if using BETWEEN. You will not get any results from BETWEEN if the first date is later than the second date.
Also note that exactly 12 months of data is contained in >= '2013-08-07' AND < '2014-08-07'; I presume this is what you are seeking.
Using the combination of DATE(dateCreated) and BETWEEN would include 1 day too many, as all events during '2014-08-07' would be included. If you deliberately wanted one year and 1 day, then add 1 day to the higher date, i.e. so it would be < '2014-08-08'.
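One way to prepare such parameters in the calling code is sketched below. The helper name and its include_last_day flag are hypothetical; it only assumes the dates arrive as zero-padded 'YYYY-MM-DD' strings.

```python
from datetime import date, timedelta


def half_open_range(first, last, include_last_day=False):
    """Return (start, end) ISO date strings for: start <= col AND col < end.

    The two dates may be given in either order. With include_last_day=True
    the upper bound moves one day forward so the whole last day matches.
    """
    start, end = sorted((date.fromisoformat(first), date.fromisoformat(last)))
    if include_last_day:
        end += timedelta(days=1)
    return start.isoformat(), end.isoformat()


# Dates supplied in the "wrong" order, as in the question:
print(half_open_range("2014-08-07", "2013-08-07"))
# ('2013-08-07', '2014-08-07')
print(half_open_range("2014-08-07", "2013-08-07", include_last_day=True))
# ('2013-08-07', '2014-08-08')
```

The two strings can then be bound to `dateCreated` >= ? AND `dateCreated` < ?, which leaves the indexed column untouched.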

Performance of MYSQL WHERE DATE(time) = 'yyyy-mm-dd'

Suppose I have a table 'Tasks' with a DATETIME column approve_time. I have an index on said column.
If I were to write a query of the form:
SELECT task_id, task_desc, task_owner, approve_time
FROM Tasks
WHERE DATE(approve_time) = '2011-08-31'
My question is about the performance of such a query:
Does MYSQL index DATETIME columns in a way that allows constraining by the date component to be fast?
Or does MYSQL know how to optimize the query into something like the following?
WHERE approve_time >= '2011-08-31 00:00:00'
AND approve_time < '2011-09-01 00:00:00'
Or does the query incur a tablescan?
Does MYSQL index DATETIME columns in a way that allows constraining by the date component to be fast?
NO
Or does MYSQL know how to optimize the query into something like the following?
YES
The second query (the explicit range) will lead to a range filter on the index. Try:
explain extended query_1; <--- the number of rows sent for scan is higher,
which should be ALL rows
vs
explain extended query_2;
The value of DATE(approve_time) can only be determined after the function has been applied to approve_time in every row, which means the first query is not going to make use of the index.

slow sql query - selecting data depending on date interval

I have a query that is causing me some trouble. I'm wondering if there is a better way to write this SQL:
SELECT * FROM report
WHERE blogid = 1769577
AND DATE_SUB(CURDATE(),INTERVAL 30 DAY) <= datetime
so that it's faster to fetch the results.
Thanks in advance.
I don't see anything wrong with the query, but you should make sure you have indexes on the blogid and datetime columns.
If your table is huge, you might consider horizontal partitioning, which can have a significant impact on performance. See http://dev.mysql.com/tech-resources/articles/performance-partitioning.html
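In Oracle SQL, you can do date arithmetic through +/- operators. I don't know if it would work in MySQL, but you might as well try doing datetime >= (CURDATE() - 30).
SELECT * FROM report WHERE blogid = 1769577 AND datetime >= (curdate() - 30)
Edit: This blog entry seems to confirm my suggestion: http://mysql-tips.blogspot.com/2005/04/mysql-date-calculations.html
Be careful, though: in MySQL, CURDATE() used in a numeric context evaluates to a number of the form YYYYMMDD, so CURDATE() - 30 subtracts 30 from that number rather than 30 days. CURDATE() - INTERVAL 30 DAY does the date arithmetic.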
I suggest using the built-in datediff() function instead of subtracting the interval, like this:
datediff(curdate(), datetime) < 30
so the query is:
SELECT * FROM report WHERE blogid = 1769577 AND datediff(curdate(), datetime) < 30
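Another option, sketched here under the assumption that the query is issued from application code, is to compute the cutoff once and pass it as a parameter, so the datetime column appears bare in the WHERE clause. The function name is illustrative. The second half shows why subtracting 30 from a YYYYMMDD number is not date arithmetic.

```python
from datetime import date, timedelta


def cutoff(today, days=30):
    """Date `days` before `today`, as an ISO string for a query parameter."""
    return (today - timedelta(days=days)).isoformat()


today = date(2010, 4, 14)  # fixed date so the example is repeatable
print(cutoff(today))  # 2010-03-15

# Treating the date as the number YYYYMMDD and subtracting 30 does not
# give a valid date:
as_number = int(today.strftime("%Y%m%d")) - 30
print(as_number)  # 20100384
```

The resulting string would be bound to a condition such as datetime >= ?, alongside blogid = ?.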
Guess 1: your blogid is not an int column. Then you can read the MySQL manual:
"Comparison of dissimilar columns may prevent use of indexes if values cannot be compared directly without conversion. Suppose that a numeric column is compared to a string column. For a given value such as 1 in the numeric column, it might compare equal to any number of values in the string column such as '1', ' 1', '00001', or '01.e1'. This rules out use of any indexes for the string column."
Guess 2: your blogid is not indexed.
Guess 3: the reports table is MyISAM. In this case, when you modify data, MySQL uses table-level locking on the whole reports table. You say that every time a blog is viewed a new record is added. These frequent updates may cause table-level locking and slow down your select queries.
Otherwise your query is fine.
Cheers!

Better to use two columns or DATETIME

I'm working on a MySQL database which will create a "Today at" list and send it to subscribers. I'm wondering if it's better to use the DATETIME data type on the start and end fields, or to have two columns, startDate and startTime (with the appropriate data types). My first thought was to use DATETIME, but that makes subsequent use of the system a bit awkward, since you can no longer write:
SELECT * FROM event_list WHERE startAt='2009-04-20';
Instead, the best I found was:
SELECT * FROM event_list WHERE startAt LIKE '2009-04-20%';
and I don't like the hack or its potential impact on performance.
Just use the DATE() function.
SELECT * FROM event_list WHERE DATE(startAt) = '2009-04-20'
SELECT * FROM event_list WHERE startAt >= '2009-04-20' AND startAt < '2009-04-21'
This range query will use an index on startAt efficiently and handle the boundary conditions correctly. (A WHERE clause that applies a function to the column won't be able to use an index on it: the database has no way to know that the expression result has the same ordering as the column values.)
Using two columns is a bit like having columns for the integer and decimal parts of real numbers. If you don't need the time, just don't save it in the first place.
You can try something like this:
select * from event_list where date(startAt) = '2009-04-20'
How about the best of both worlds -- have a table that uses a single datetime column and a view of that table that gives you both date and time fields.
create view vw_event_list
as select ..., date(startAt) as startDate, time(startAt) as startTime
from event_list;
select * from vw_event_list where startDate = '2009-04-20';
The real consideration between separate date and time fields or 1 datetime field is indexing. You do not want to do this:
select * from event_list where date(startAt) = '2009-04-20'
on a datetime field because it won't use an index. MySQL will convert the startAt data to a date in order to compare it, which means it can't use the index.
You want to do this:
select * from event_list where startAt BETWEEN '2009-04-20 00:00:00' AND '2009-04-20 23:59:59'
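One caveat with a closed upper bound of 23:59:59 is worth checking: if the column stores fractional seconds (an assumption; the answer does not say), values in the final second of the day fall outside it, whereas a half-open range up to the next day still catches them. A small illustration with a made-up value:

```python
from datetime import datetime, timedelta

day_start = datetime(2009, 4, 20)
next_day = day_start + timedelta(days=1)
last_second = datetime(2009, 4, 20, 23, 59, 59)

# A made-up value in the final second of the day, with a fractional part.
late = datetime(2009, 4, 20, 23, 59, 59, 500000)

in_closed = day_start <= late <= last_second  # BETWEEN ... AND '... 23:59:59'
in_half_open = day_start <= late < next_day   # >= day AND < next day
print(in_closed, in_half_open)  # False True
```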
The problem with a datetime field is that you can't really use it in a compound index, since the value is fairly unique. For example, a compound index on startAt+event isn't going to allow you to search on date+event, only datetime+event.
But if you split the data between date and time fields, you can index startDate+event and search on it efficiently.
That's just an example for discussion purposes, you could obviously index on event+startAt instead and it would work. But you may find yourself wanting to search/summarize based on date plus another field. Creating a compound index on that data would make it very efficient.
Just one more thing to add: beware of time zones. If you're offering an online service, they'll come up sooner or later, and it's really difficult to handle them retroactively.
Daylight Saving Time is especially bad.
(DAMHIK)