What is the unit of the difference of two timestamps in MySQL? - mysql

There are two columns (t0 and t1) whose types are timestamp (t0 = 2021-11-18 20:25:09 and t1 = 2021-11-18 20:36:41)
I want to find t1 - t0 (expecting ~11 minutes, or roughly 700 seconds), but the result is 1132.
I was wondering how - is evaluated between two timestamps and what the unit of the result is.

Use the TIMESTAMPDIFF function for that purpose.
As for your question: MySQL converts each timestamp to a number and then subtracts. That is deterministic, but it is not the result you want.
SELECT TIMESTAMPDIFF(MINUTE, '2021-11-18 20:25:09', '2021-11-18 20:36:41');
-- returns 11
SELECT TIMESTAMPDIFF(SECOND, '2021-11-18 20:25:09', '2021-11-18 20:36:41');
-- returns 692

> SELECT TIMEDIFF(t1, t2) FROM t;
+------------------+
| TIMEDIFF(t1, t2) |
+------------------+
| -00:11:32        |
+------------------+
1 row in set (0.000 sec)
Please see the example here
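TIMEDIFF returns a TIME value. If you would rather have the difference as a plain number of seconds, you can wrap it in TIME_TO_SEC; here is a small sketch assuming a table t with the columns t0 and t1 from the question:
-- TIME_TO_SEC converts the TIME returned by TIMEDIFF into a number of seconds
SELECT TIME_TO_SEC(TIMEDIFF(t1, t0)) AS diff_seconds FROM t;
-- 692 for the sample values in the question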

The - operator does nothing useful for timestamps, except that the sign of the result will tell you which is greater (assuming month and day are not 0), and that is more easily tested with a comparison operator.
In general, if you cast a timestamp to a number, you get a number formed by putting the parts of the timestamp together. For instance:
select timestamp('2021-11-18 20:25:09.012345')+0
gives
20211118202509.012345
The - operator effectively casts both operands to numbers. So differences in days become differences in millions, differences in months become differences in hundreds of millions, and differences in years become differences in tens of billions. This doesn't provide any useful measure of the difference between two timestamps.
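To connect this to the numbers in the question, here is a small check: subtracting the two sample timestamps directly reproduces the puzzling 1132, while TIMESTAMPDIFF gives the real difference of 692 seconds.
-- Both operands are treated as the number YYYYMMDDHHMMSS, so the raw subtraction
-- is 20211118203641 - 20211118202509 = 1132, not a count of seconds.
SELECT TIMESTAMP('2021-11-18 20:36:41') - TIMESTAMP('2021-11-18 20:25:09') AS raw_diff,
       TIMESTAMPDIFF(SECOND, '2021-11-18 20:25:09', '2021-11-18 20:36:41') AS real_seconds;
-- raw_diff = 1132, real_seconds = 692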

LEFT vs. DATE_FORMAT in mysql

LEFT or DATE_FORMAT, which is faster to re-format date in SELECT query in mysql?
I'll show an example of this problem.
Info of the PERSON table:
Column name | Type
NAME | VARCHAR(20)
YEAR | DATETIME
Data in PERSON:
NAME | YEAR
Travis | 2020-01-01
Sam | 2021-01-01
If I execute the query SELECT YEAR FROM PERSON, I see the result below.
YEAR
2020-01-01 00:00:00
2021-01-01 00:00:00
But I want a result like the one below.
YEAR
2020-01-01
2021-01-01
So I wanted to use one of the queries below.
SELECT LEFT(YEAR,10) FROM PERSON
SELECT DATE_FORMAT(YEAR, '%Y-%m-%d') FROM PERSON
However, I wonder which query performs better.
Please help me.
Technically, LEFT() is nearly four times faster, based on this test on my M1 Macbook. Your result might vary.
mysql> select benchmark(100000000, left(year, 10)) from person;
+--------------------------------------+
| benchmark(100000000, left(year, 10)) |
+--------------------------------------+
| 0 |
+--------------------------------------+
1 row in set (1.75 sec)
mysql> select benchmark(100000000, date_format(year, '%Y-%m-%d')) from person;
+-----------------------------------------------------+
| benchmark(100000000, date_format(year, '%Y-%m-%d')) |
+-----------------------------------------------------+
| 0 |
+-----------------------------------------------------+
1 row in set (6.81 sec)
But given that I had to execute both expressions 100 million times to observe a significant difference, both of them are so fast that I wouldn't worry about it. It's likely that other parts of the query will be of far greater influence on performance.
Worrying about which of these two functions has better performance is like worrying if it's better to use one finger or two fingers to lift a 100kg barbell.
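If what you actually want is a DATE value rather than a formatted string, a third option worth considering (not one of the two in the question) is casting the DATETIME down to DATE:
-- Returns a DATE value (displayed as 2020-01-01) instead of a character string
SELECT DATE(YEAR) FROM PERSON;
SELECT CAST(YEAR AS DATE) FROM PERSON;  -- equivalent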

Mysql subtraction operator on timestamps [duplicate]

This question already has answers here:
What is the behavior for the minus operator between two datetimes in MySQL?
(2 answers)
Closed 1 year ago.
I wrote a query like
select endtime - begintime ....
and it looked like it should be the difference in seconds. But it turns out to be a very odd number (both columns are of type timestamp; no time zones are mentioned).
select timestampdiff(second, begintime, endtime)
works.
But I am more than a little curious as to what the subtraction operator does! I could not find any documentation. It is certainly a booby trap for new users.
(And nobody really understands time zones. There is what is stored vs. what is displayed in different time zones, which drivers etc. muck with, and lots and lots of false information and confusion. I don't know what "with time zone" really means, but I only use the server's single time zone, although my browser is in a different time zone, so phpMyAdmin might be lying to me.)
When used as a number, a timestamp like '2021-01-02 03:04:05' will be treated as 20210102030405. You can see this with e.g. select timestamp('2021-01-02 03:04:05')-0;. Subtracting two such "numbers" isn't going to be meaningful, except that the sign of the result will tell you which time was later.
This doesn't apply if you use the special INTERVAL syntax to adjust a timestamp by an interval, e.g. select '2021-01-02 03:04:05' - INTERVAL 1 WEEK;.
Here's a demo:
mysql> create table mytable (endtime datetime, begintime datetime);
Query OK, 0 rows affected (0.05 sec)
mysql> insert into mytable values (now(), '2021-05-01');
mysql> select endtime - begintime from mytable;
+---------------------+
| endtime - begintime |
+---------------------+
| 6011403 |
+---------------------+
What's up with this weird value? Well, when you put datetime values into an integer arithmetic expression, the values are converted to integers, but not in units of seconds. You can also force these values to be integers this way:
mysql> select endtime+0 as e, begintime+0 as b from mytable;
+----------------+----------------+
| e | b |
+----------------+----------------+
| 20210507011403 | 20210501000000 |
+----------------+----------------+
Here we see that the values are integers, but they are based on converting the datetime values to YYYYMMDDHHMMSS format.
Guess what the difference is?
mysql> select e-b from (select endtime+0 as e, begintime+0 as b from mytable) as t;
+---------+
| e-b |
+---------+
| 6011403 |
+---------+
But this is not the actual time difference, because there are not 100 seconds in a minute, 100 minutes in an hour, 100 hours in a day, and so on.
mysql> select timestampdiff(second, begintime, endtime) as timestampdiff from mytable;
+---------------+
| timestampdiff |
+---------------+
| 522843 |
+---------------+
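As a small follow-up sketch, SEC_TO_TIME turns that second count into something human-readable, using the same table as above (522843 seconds comes out as 145 hours, 14 minutes, 3 seconds):
-- SEC_TO_TIME converts a second count into hours:minutes:seconds
select sec_to_time(timestampdiff(second, begintime, endtime)) as readable_diff from mytable;
-- 145:14:03 for the rows above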

SQL calculating difference between columns

I'm a bit of a newbie at SQL and I don't really understand what to do here, so any help is really appreciated. I have a table full of readings from different readers; there are about 500,000 of them, so I can't do this by hand.
I received the table without the difference in it. I managed to calculate it, but there's a bit of a problem there...
It looks a bit like this:
reader_id | date | reading | difference
1 | 01-01-2013 | 205 | 0
1 | 02-01-2013 | 210 | 5
1 | 03-01-2013 | 213 | 3
... | ... | ... | ...
1 | 31-12-2013 | 2451 | 4
2 | 01-01-2013 | 8543 | 6092
2 | 02-01-2013 | 8548 | 5
reader_id and date form the primary key. The combination is unique.
How can I make sure I don't get the difference calculated when the last column contained a different reader_id?
When querying my data with a query like this one, the data get skewed by the incorrect difference between the two reader_ids:
SELECT AVG(difference), reader_id FROM table GROUP BY reader_id
For
"I just want to get the average difference for each reader."
your query is perfectly good. I think you got something wrong in your difference calculation, though. The first value for reader_id = 2, 6092, is the difference between the last reading from reader 1 and the first reading from reader 2; I don't think that makes sense. If I'm not mistaken, the difference value is the current day's reading minus the previous day's reading. Therefore you should set the difference value of the first reading of each reader to 0.
You can do this with the following query:
UPDATE table t
INNER JOIN (SELECT reader_id, MIN(date) AS first_day FROM table GROUP BY reader_id) AS tmp
        ON tmp.reader_id = t.reader_id AND tmp.first_day = t.date
SET t.difference = 0
Then
SELECT AVG(difference), reader_id FROM table GROUP BY reader_id
will do what you expect.
If you simply want the average difference, you can use the following query:
SELECT
reader_id,
(MAX(reading) - MIN(reading)) / COUNT(*) AS average_difference
FROM table
GROUP BY reader_id
ORDER BY reader_id;
It works on the logic that the total difference for a given reader_id should be equal to MAX(reading) - MIN(reading).
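If you are on MySQL 8.0 or later, you can also compute the per-row difference on the fly with the LAG() window function instead of storing and patching a difference column. A sketch (the table name readings is a stand-in for your real table):
SELECT reader_id,
       AVG(diff) AS average_difference
FROM (
    SELECT reader_id,
           reading - LAG(reading) OVER (PARTITION BY reader_id ORDER BY date) AS diff
    FROM readings   -- hypothetical table name; substitute your own
) AS per_day
WHERE diff IS NOT NULL   -- the first row of each reader has no previous reading
GROUP BY reader_id;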

How to store very old dates in database?

It's not actually a problem I'm having, but imagine someone's building a website about the medieval times and wants to store dates, how would they go about it?
The spec for MySQL's DATE says it won't go below the year 1000, which makes sense when the format is YYYY-MM-DD. How can you store information about the death of Kenneth II of Scotland in 995? Of course you can store it as a string, but are there real date-type options?
Actually, you can store dates below the year 1000 in MySQL, despite what the documentation says:
mysql> describe test;
+-------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| id | int(11) | YES | | NULL | |
| birth | date | YES | | NULL | |
+-------+---------+------+-----+---------+-------+
- you still need to enter the year in four-digit YYYY format:
mysql> insert into test values (1, '0995-03-05');
Query OK, 1 row affected (0.02 sec)
mysql> select * from test;
+------+------------+
| id | birth |
+------+------------+
| 1 | 0995-03-05 |
+------+------------+
1 row in set (0.00 sec)
- and you'll be able to operate on it as a date:
mysql> select birth + interval 5 day from test;
+------------------------+
| birth + interval 5 day |
+------------------------+
| 0995-03-10 |
+------------------------+
1 row in set (0.03 sec)
As for safety: I've never seen a case where this did not work in MySQL 5.x (that, of course, does not mean it is guaranteed to work, but it has been reliable in my experience).
As for BC dates (before Christ): that is simple - in MySQL there is no way to store negative dates either, so you would need to store the year separately as a signed integer field:
mysql> select '0001-05-04' - interval 1 year as above_bc, '0001-05-04' - interval 2 year as below_bc;
+------------+----------+
| above_bc | below_bc |
+------------+----------+
| 0000-05-04 | NULL |
+------------+----------+
1 row in set, 1 warning (0.00 sec)
mysql> show warnings;
+---------+------+--------------------------------------------+
| Level | Code | Message |
+---------+------+--------------------------------------------+
| Warning | 1441 | Datetime function: datetime field overflow |
+---------+------+--------------------------------------------+
1 row in set (0.00 sec)
But I think that in either case (below or above year 0) it is better to store the date parts as integers - that way you do not rely on an undocumented feature. However, you will then be operating on three integer fields rather than a date (so, in some sense, that is not a solution to your problem).
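A minimal sketch of that separate-columns idea (table and column names are just illustrative, and the CHECK constraints are only enforced on MySQL 8.0.16 and later):
CREATE TABLE historical_event (
    id          INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    event_year  SMALLINT NOT NULL,    -- signed, so negative values can represent BC years
    event_month TINYINT UNSIGNED NOT NULL CHECK (event_month BETWEEN 1 AND 12),
    event_day   TINYINT UNSIGNED NOT NULL CHECK (event_day BETWEEN 1 AND 31),
    description VARCHAR(255)
);
INSERT INTO historical_event (event_year, event_month, event_day, description)
VALUES (995, 1, 1, 'Death of Kenneth II of Scotland (month and day are placeholders)');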
Choose a dbms that supports what you want to do. Among other free database management systems, PostgreSQL supports a timestamp range from 4713 BC to 294276 AD.
If you break up the date into separate columns for year, month, and day, you also need more tables and constraints to guarantee that values in those columns represent actual dates. If those columns let you store the value {2013, 2, 29}, your table is broken. A dbms that supports dates in your range entirely avoids this kind of problem.
Other problems you might run into
Incorrect date arithmetic on dates that are out of range.
Incorrect locale-specific formatting on dates that are out of range.
Surprising behavior from date and time functions on dates that are out of range.
Gregorian calendar weirdness.
Gregorian calendar weirdness? In Great Britain, the day after Sep 2, 1752 is Sep 14, 1752. PostgreSQL documents their rationale for ignoring that as follows.
PostgreSQL uses Julian dates for all date/time calculations. This has
the useful property of correctly calculating dates from 4713 BC to far
into the future, using the assumption that the length of the year is
365.2425 days.
Date conventions before the 19th century make for interesting reading,
but are not consistent enough to warrant coding into a date/time
handler.
Sadly, I think that currently the easiest option is to store year, month and day in separate fields with year as smallint.
To quote from http://dev.mysql.com/doc/refman/5.6/en/datetime.html
For the DATE and DATETIME range descriptions, “supported” means that although earlier values might work, there is no guarantee.
So there's a good chance that a wider range will work given a suitably configured MySQL installation.
Make sure not to use TIMESTAMP, which seems to have a non-negative range.
The TIMESTAMP data type is used for values that contain both date and time parts. TIMESTAMP has a range of '1970-01-01 00:00:01' UTC to '2038-01-19 03:14:07' UTC.
Here is a JavaScript example of how far before the Unix epoch(1) you can get with -2^36 seconds (multiplied by 1000 to get milliseconds for JavaScript).
d = new Date((Math.pow(2, 36) - 1) * -1000)
Sun May 13 -208 18:27:45 GMT+0200 (Westeuropäische Sommerzeit)
So I would suggest storing historical dates as a BIGINT relative to the epoch.
See http://dev.mysql.com/doc/refman/5.6/en/integer-types.html for MySQL 5.6.
(1)
epoch = new Date(0)
Thu Jan 01 1970 01:00:00 GMT+0100 (Westeuropäische Normalzeit)
epoch.toUTCString()
"Thu, 01 Jan 1970 00:00:00 GMT"

Query database in weekly interval

I have a database with a created_at column containing the datetime in Y-m-d H:i:s format.
The latest datetime entry is 2011-09-28 00:10:02.
I need the query to be relative to the latest datetime entry.
The first value in the query should be the latest datetime entry.
The second value in the query should be the entry closest to 7 days from the first value.
The third value should be the entry closest to 7 days from the second value.
REPEAT #3.
What I mean by "closest to 7 days from": the following are dates; the interval I desire is a week, which is 604800 seconds.
7 days before the first value is 1316578202 (1317183002 - 604800).
The value closest to 1316578202 (7 days back) is 1316571974.
unix timestamp | Y-m-d H:i:s
1317183002 | 2011-09-28 00:10:02 -> appear in query (first value)
1317101233 | 2011-09-27 01:27:13
1317009182 | 2011-09-25 23:53:02
1316916554 | 2011-09-24 22:09:14
1316836656 | 2011-09-23 23:57:36
1316745220 | 2011-09-22 22:33:40
1316659915 | 2011-09-21 22:51:55
1316571974 | 2011-09-20 22:26:14 -> closest to 7 days from 1317183002 (first value)
1316499187 | 2011-09-20 02:13:07
1316064243 | 2011-09-15 01:24:03
1315967707 | 2011-09-13 22:35:07 -> closest to 7 days from 1316571974 (second value)
1315881414 | 2011-09-12 22:36:54
1315794048 | 2011-09-11 22:20:48
1315715786 | 2011-09-11 00:36:26
1315622142 | 2011-09-09 22:35:42
I would really appreciate any help; I have not been able to do this with MySQL, and no online resources seem to deal with relative date manipulation like this. I would like the query to be modular enough to change the interval to weekly, monthly, or yearly. Thanks in advance!
Answer #1 Reply:
SELECT
UNIX_TIMESTAMP(created_at)
AS unix_timestamp,
(
SELECT MIN(UNIX_TIMESTAMP(created_at))
FROM my_table
WHERE created_at >=
(
SELECT max(created_at) - 7
FROM my_table
)
)
AS `random_1`,
(
SELECT MIN(UNIX_TIMESTAMP(created_at))
FROM my_table
WHERE created_at >=
(
SELECT MAX(created_at) - 14
FROM my_table
)
)
AS `random_2`
FROM my_table
WHERE created_at =
(
SELECT MAX(created_at)
FROM my_table
)
Returns:
unix_timestamp | random_1 | random_2
1317183002 | 1317183002 | 1317183002
Answer #2 Reply:
RESULT SET:
This is the result set for a yearly interval:
id | created_at | period_index | period_timestamp
267 | 2010-09-27 22:57:05 | 0 | 1317183002
1 | 2009-12-10 15:08:00 | 1 | 1285554786
I desire this result:
id | created_at | period_index | period_timestamp
626 | 2011-09-28 00:10:02 | 0 | 0
267 | 2010-09-27 22:57:05 | 1 | 1317183002
I hope this makes more sense.
It's not exactly what you asked for, but the following example is pretty close....
Example 1:
select
floor(timestampdiff(SECOND, tbl.time, most_recent.time)/604800) as period_index,
unix_timestamp(max(tbl.time)) as period_timestamp
from
tbl
, (select max(time) as time from tbl) most_recent
group by period_index
gives results:
+--------------+------------------+
| period_index | period_timestamp |
+--------------+------------------+
| 0 | 1317183002 |
| 1 | 1316571974 |
| 2 | 1315967707 |
+--------------+------------------+
This breaks the dataset into groups based on "periods", where (in this example) each period is 7-days (604800 seconds) long. The period_timestamp that is returned for each period is the 'latest' (most recent) timestamp that falls within that period.
The period boundaries are all computed based on the most recent timestamp in the database, rather than computing each period's start and end time individually based on the timestamp of the period before it. The difference is subtle - your question requests the latter (iterative approach), but I'm hoping that the former (approach I've described here) will suffice for your needs, since SQL doesn't lend itself well to implementing iterative algorithms.
If you really do need to determine each period based on the timestamp in the previous period, then your best bet is going to be an iterative approach -- either using a programming language of your choice (like php), or by building a stored procedure that uses a cursor.
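As a quick sanity check of the bucketing math, using two timestamps from the question's data (1317183002 is the most recent entry; 1316571974 is the one wanted as the second value):
-- 1317183002 - 1316571974 = 611028 seconds, and floor(611028 / 604800) = 1,
-- so that entry lands in period 1, as desired.
select floor(timestampdiff(SECOND,
                            from_unixtime(1316571974),
                            from_unixtime(1317183002)) / 604800) as period_index;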
Edit #1
Here's the table structure for the above example.
CREATE TABLE `tbl` (
`id` int(10) unsigned NOT NULL auto_increment PRIMARY KEY,
`time` datetime NOT NULL
)
Edit #2
Ok, first: I've improved the original example query (see revised "Example 1" above). It still works the same way, and gives the same results, but it's cleaner, more efficient, and easier to understand.
Now... the query above is a group-by query, meaning it shows aggregate results for the "period" groups as I described above - not row-by-row results like a "normal" query. With a group-by query, you're limited to using aggregate columns only. Aggregate columns are those columns that are named in the group by clause, or that are computed by an aggregate function like MAX(time). It is not possible to extract meaningful values for non-aggregate columns (like id) from within the projection of a group-by query.
Unfortunately, mysql doesn't generate an error when you try to do this. Instead, it just picks a value at random from within the grouped rows, and shows that value for the non-aggregate column in the grouped result. This is what's causing the odd behavior the OP reported when trying to use the code from Example #1.
Fortunately, this problem is fairly easy to solve. Just wrap another query around the group query, to select the row-by-row information you're interested in...
Example 2:
SELECT
entries.id,
entries.time,
periods.idx as period_index,
unix_timestamp(periods.time) as period_timestamp
FROM
tbl entries
JOIN
(select
floor(timestampdiff( SECOND, tbl.time, most_recent.time)/31536000) as idx,
max(tbl.time) as time
from
tbl
, (select max(time) as time from tbl) most_recent
group by idx
) periods
ON entries.time = periods.time
Result:
+-----+---------------------+--------------+------------------+
| id | time | period_index | period_timestamp |
+-----+---------------------+--------------+------------------+
| 598 | 2011-09-28 04:10:02 | 0 | 1317183002 |
| 996 | 2010-09-27 22:57:05 | 1 | 1285628225 |
+-----+---------------------+--------------+------------------+
Notes:
Example 2 uses a period length of 31536000 seconds (365-days). While Example 1 (above) uses a period of 604800 seconds (7-days). Other than that, the inner query in Example 2 is the same as the primary query shown in Example 1.
If a matching period_time belongs to more than one entry (i.e. two or more entries have the exact same time, and that time matches one of the selected period_time values), then the above query (Example 2) will include multiple rows for the given period timestamp (one for each match). Whatever code consumes this result set should be prepared to handle such an edge case.
It's also worth noting that these queries will perform much, much better if you define an index on your datetime column. For my example schema, that would look like this:
ALTER TABLE tbl ADD INDEX idx_time ( time )
If you're willing to take the closest value that falls after the week is out, then this will work. You can extend it to find the truly closest value, but it'll look so disgusting it's probably not worth it.
select unix_timestamp
, ( select min(unix_tstamp)
from my_table
where sql_tstamp >= ( select max(sql_tstamp) - 7
from my_table )
)
, ( select min(unix_tstamp)
from my_table
where sql_tstamp >= ( select max(sql_tstamp) - 14
from my_table )
)
from my_table
where sql_tstamp = ( select max(sql_tstamp)
from my_table )
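One note on that query: max(sql_tstamp) - 7 subtracts 7 in the numeric YYYYMMDDHHMMSS representation discussed in the subtraction-operator question above, not 7 days. If you take this approach, a variant using INTERVAL arithmetic is probably what you want; a sketch, keeping the answer's my_table / unix_tstamp / sql_tstamp names:
select unix_tstamp
     , ( select min(unix_tstamp)
           from my_table
          where sql_tstamp >= ( select max(sql_tstamp) - interval 7 day
                                  from my_table )
       ) as week_1
     , ( select min(unix_tstamp)
           from my_table
          where sql_tstamp >= ( select max(sql_tstamp) - interval 14 day
                                  from my_table )
       ) as week_2
  from my_table
 where sql_tstamp = ( select max(sql_tstamp)
                        from my_table )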