I have a table that contains 3 columns: day_id, start_date, end_date. start_date and end_date are varchar(8) in a format like this: HH:II:SS. Sometimes dates can go over 24h in order to represent that something happened day after, for example: 25:20:01 is 01:20:01 but in a new day. day_id is not unique, it repeats. I need to get the first and last event of each day, and this is my code:
SELECT day_id,
MIN(start_date) as start_time,
MAX(end_date) as end_date
FROM events WHERE day_id IN ('day_1', 'day_2', 'day_3')
GROUP BY day_id ORDER BY start_time ASC
It works as intended, but I can't figure out why. How does MySQL know that 25:01:45 is larger than 20:21:09, since they are both varchars? The whole table uses the utf8mb4_0900_ai_ci collation, running on MySQL server version 8.
It is a string comparison: MySQL compares the two values character by character, left to right, according to the collation, and for digits that order matches their ASCII order. It works mainly because every time component, whether it has one or two digits, is stored as two digits. For example:
1:20:1 is represented as 01:20:01
2:5:7 is represented as 02:05:07
So 10:02:07 can never sort before 2:5:7 (which is what would happen with the unpadded form, since '1' < '2'), because 2:5:7 is stored as 02:05:07 and '1' > '0'. Hence, it always works.
Sometimes dates can go over 24h in order to represent that something
happened day after, for example: 25:20:01 is 01:20:01
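So, if the hour part ever needs more than two digits (for example 100:00:00), the string comparison will give wrong results. To be safe, use the correct datatype to store it - TIME, which in MySQL can hold values from -838:59:59 to 838:59:59.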
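The padding argument can be checked outside the database. The following is only a sketch in Python, not MySQL: Python compares strings by code point rather than by a MySQL collation, but for digits and colons in fixed positions the outcome is the same. The helper name to_seconds is made up for this illustration.

```python
def to_seconds(hms: str) -> int:
    """Convert 'H:M:S' (hours may exceed 23) to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

# Zero-padded, two-digit fields: string order and numeric order agree.
padded = ["20:21:09", "25:01:45", "02:05:07", "10:02:07"]
assert sorted(padded) == sorted(padded, key=to_seconds)

# Without padding, '10:02:07' sorts before '2:5:7' because '1' < '2'.
unpadded = ["2:5:7", "10:02:07"]
print(sorted(unpadded))                  # ['10:02:07', '2:5:7']
print(sorted(unpadded, key=to_seconds))  # ['2:5:7', '10:02:07']

# A three-digit hour breaks the string order as well.
print(max(["99:00:00", "100:00:00"]))    # 99:00:00
```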
how does MySQL know that 25:01:45 is larger than 20:21:09?
Databases compare strings using a collation, which defines the order of the characters. Strings are then compared character by character, much like alphabetical ordering.
So, MySQL knows that '25' > '20' in exactly the same way that we know that the word 'BE' comes after 'BA' in the dictionary.
This is a question from LeetCode. Using the second query I got the question wrong, but I could not identify why:
SELECT
user_id,
max(time_stamp) as "last_stamp"
from
logins
where
year(time_stamp) = '2020'
group by
user_id
and
select
user_id,
max(time_stamp) as "last_stamp"
from
logins
where
time_stamp between '2020-01-01' and '2020-12-31'
group by
user_id
The first query uses a function on every row to extract the year (an integer) and compares that to a string. (It would be preferable to use an integer instead.) Whilst this may be sub-optimal, this query would accurately locate all rows that fall into the year 2020.
The second query could fail to locate all rows that fall into 2020. Here it is important to remember that days have a 24 hour duration, and that each day starts at midnight and concludes at midnight 24 hours later. That is, a day does have a start-point (midnight) and an end-point (midnight + 24 hours).
However, a single date used in SQL code cannot be both the start-point and the end-point of the same day, so every date in SQL represents only the start-point. Also note that between does NOT magically change the second given date into "the end of that day" - it simply cannot (and does not) do that.
So, when you use time_stamp between '2020-01-01' and '2020-12-31' you need to think of it as meaning "from the start of 2020-01-01 up to and including the start of 2020-12-31". Hence, this excludes the 24 hours duration of 2020-12-31.
The safest way to deal with this is to NOT use between at all, instead write just a few characters more code which will be accurate regardless of the time precision used by any date/datetime/timestamp column:
where
time_stamp >= '2020-01-01' and time_stamp < '2021-01-01'
with the second date being "the start-point of the next day"
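To see the difference between the two predicates without a database, here is a small Python sketch. The sample rows are invented for illustration, and the comparisons only mimic how a date literal is treated as midnight; it is not the query engine itself.

```python
from datetime import datetime

# Hypothetical login timestamps, made up for this example.
rows = [
    datetime(2020, 1, 1, 0, 0, 0),
    datetime(2020, 12, 31, 0, 0, 0),
    datetime(2020, 12, 31, 18, 30, 0),  # during the last day of 2020
    datetime(2021, 1, 1, 0, 0, 0),
]

lower = datetime(2020, 1, 1)

# BETWEEN '2020-01-01' AND '2020-12-31': both bounds are midnight, inclusive.
upper_between = datetime(2020, 12, 31)
between_hits = [r for r in rows if lower <= r <= upper_between]

# >= '2020-01-01' AND < '2021-01-01': half-open range covering the whole year.
upper_exclusive = datetime(2021, 1, 1)
half_open_hits = [r for r in rows if lower <= r < upper_exclusive]

print(len(between_hits))    # 2
print(len(half_open_hits))  # 3
```

The 18:30 row on 2020-12-31 is the one the between version misses.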
See answer to SQL "between" not inclusive
I'm having a problem working with a Navicat database. I have a column in SQL called fechaNacimiento (birthdate) that should be a Date type, but instead it's stored as integers (mostly negative integers):
SELECT fechaNacimiento FROM Registrados
And I'm getting:
fechaNacimiento
-1451678400
-2082829392
-1798746192
-1199221200
-1356984000
-694299600
-1483214400
-1924976592
-1830368592
-2019670992
-1678909392
239252400
1451617200
-879541200
I don't know how these dates were loaded. I just know that inside each negative integer there's a date, and nobody here has any clue about SQL, so I have nobody to ask. If I just cast it to DATETIME, I get all of them as NULL values. Any idea how to convert this data to a Date type?
Numbers like that make me think of Unix times, number of seconds since 1970. If so, you might be able to do:
select dateadd(second, <column>, '1970-01-01')
This would put the negative values sometime before 1970 (for instance, -1678909392 is 1916-10-19). If you have older dates, then that might be the format being used.
These might also be represented as milliseconds. If so:
select dateadd(second, <column>/1000, '1970-01-01')
In this case, -1678909392 represents 1969-12-12.
In MySQL, the milliseconds version would be:
select '1970-01-01' + interval floor(column / 1000) second
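The two interpretations can be sanity-checked with a short Python sketch. It assumes the values are offsets from 1970-01-01 00:00:00 with no time zone adjustment; the function names are made up for this example.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

def from_unix_seconds(value: int) -> datetime:
    """Interpret value as seconds since the epoch (negative = before 1970)."""
    return EPOCH + timedelta(seconds=value)

def from_unix_millis(value: int) -> datetime:
    """Interpret value as milliseconds since the epoch."""
    return EPOCH + timedelta(milliseconds=value)

print(from_unix_seconds(-1678909392).date())  # 1916-10-19
print(from_unix_millis(-1678909392).date())   # 1969-12-12
print(from_unix_seconds(239252400).date())    # 1977-08-01
```

For a birthdate column, the seconds interpretation gives the more plausible spread of dates.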
SELECT * FROM table WHERE '2016-03-31' > (SELECT MAX(year) from table where bill_id = 'somevalue')
I am using the above query to check whether 2016-03-31 is greater than all years present in the table for a given bill_id. It is working fine, but is this the correct approach to compare dates? The dates will always be in the above format. Is there any need to convert the date format for the comparison? The value 2016-03-31 will change dynamically, but it will always be in Y-m-d format.
Note : year is column name which contains full date in Y-m-d format like 2016-05-20
You are not comparing dates. You are comparing a string '2016-03-31' with a number, e.g. 2015.
In order to compare, MySQL silently converts the string to a number. One would expect this to crash, as '2016-03-31' certainly isn't a number. MySQL, however, reads from left to right and takes from there all that can be considered a number, i.e. '2016'. Well, one could argue that some people put a minus sign at the end of a number, so this should be '2016-', i.e. -2016. Anyway, MySQL stops before the minus sign, gets 2016 and uses this for the comparison.
I don't know if all this is guaranteed to work in the future. I would not rely on this.
What result would you expect anyway? Is the 31st of March 2016 greater than the year 2016? That's a queer question, don't you think?
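As a rough model of that conversion (a simplified Python sketch, not MySQL's actual parser; exponent notation and other details are ignored, and leading_number is a made-up name), the string is cut down to its longest leading numeric prefix:

```python
import re

def leading_number(text: str) -> float:
    """Keep the longest leading numeric prefix of text, or 0 if there is none."""
    match = re.match(r"\s*[+-]?(\d+(\.\d*)?|\.\d+)", text)
    return float(match.group(0)) if match else 0.0

print(leading_number("2016-03-31"))         # 2016.0
print(leading_number("2016-03-31") > 2015)  # True
print(leading_number("2016-03-31") > 2016)  # False
```

The last line shows why the result can surprise: every date in 2016 collapses to 2016, so month and day play no part in the comparison.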
Try this. But do you really have a column year that stores only year?
SELECT * FROM table WHERE year(STR_TO_DATE('2016-03-31', '%Y-%m-%d'))
> (SELECT MAX(year) from table where bill_id = 'somevalue')
SELECT * FROM table WHERE YEAR('2016-03-31') > (SELECT MAX(year) from table where bill_id = 'somevalue')
MySQL YEAR() returns the year for a given date or timestamp. The return value is in the range of 1000 to 9999 or 0 for 'zero' date.
So I have to store a particular date of the year, any year. So I will only be needing the date and month part of a date.
I can either store it with any year and just ignore it on the programming side but that feels dirty. Any better way to handle this?
Closest I could find was this one but that includes time component as well and goes on some different tangent.
If you're using MySQL >= 5.7.6, you could use a generated column. A trivial example table would look like this (untested as, ironically, I don't have access to a recent MySQL server right now):
CREATE TABLE myTable (
the_date DATE,
month_date VARCHAR(5) AS (CONCAT(MONTH(the_date), '-', DAY(the_date)))
);
Of course, change the generated value according to your needs (different separator, padding with zeroes on the month, etc.)
If you're stuck with an older version, you could perform a similar conversion using a view.
What is wrong with just having a DATE column value, in this example called my_date:
SELECT MONTH(my_date) AS myMonth, DAY(my_date) AS myDay, <othercolumns>
FROM table WHERE id = 1
Then you can use your PHP to get your row and output $row['myMonth'], etc.
You can also output the MONTH / DAY values in any format string you like using MySQL DATE_FORMAT.
You can also CONCAT these two values if you need them in a single column.
SELECT CONCAT(MONTH(my_date),' ',DAY(my_date)) as month_day, <othercolumns>
FROM table WHERE id = 1
Warning:
Storing dates as 0000-00-00 is perfectly valid, but MySQL year 0000 is not a leap year, so you cannot store 0000-02-29; it will instead be saved as the default 0000-00-00.
You might as well use a default year that is leap-year safe (such as 2000, i.e. 2000-XX-XX) if you're sure you're never going to use the year value.
You could store the Julian Date as an integer, and convert to Gregorian Month/Day when you need it. This can keep things quite clean. Keep your eyes peeled for the leap year.
The notion of Julian here is truly, just "Day of year" and converting when needed.
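function getDateFromDay($dayOfYear, $year) {
    // 'z' is the zero-based day of the year (0 = January 1).
    // The year is parsed first so that leap years resolve correctly.
    $date = DateTime::createFromFormat('Y z', strval($year) . ' ' . strval($dayOfYear));
    return $date;
}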
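The leap-year caveat is easy to see with a small Python sketch of the same idea. It assumes a zero-based day of year (0 = January 1); the function name is made up for this example.

```python
from datetime import date, timedelta

def date_from_day_of_year(day_of_year: int, year: int) -> date:
    """Map a zero-based day of year onto a calendar date in the given year."""
    return date(year, 1, 1) + timedelta(days=day_of_year)

# The same stored integer lands on different month/day pairs
# depending on whether the year is a leap year.
print(date_from_day_of_year(59, 2000))  # 2000-02-29
print(date_from_day_of_year(59, 2001))  # 2001-03-01
```

So a stored day-of-year only identifies a fixed month and day if it is always converted against the same kind of year.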
I'm trying to figure out what MySQL is doing during the math operation of timestamps.
Picture of resulting problem:
You'll see on the left I have two timestamps, start and end, and I need to find the duration from start to end so I just do this:
end - start
I was getting some really weird results. You can see that for a duration of only 3 hours I was getting a result back that indicated 2 to 3 times that amount.
When I convert to UTC first, the math works out fine.
Can anyone explain what SQL is doing with the timestamps on the left? I've always been under the impression that all timestamps are UTC under the hood, which is why things like min, max, less than, etc work without converting.
Thanks!
Code:
select
min(timestamp) start,
max(timestamp) end,
max(timestamp) - min(timestamp) start_to_end,
unix_timestamp(min(timestamp)) startUTC,
unix_timestamp(max(timestamp)) endUTC,
unix_timestamp(max(timestamp)) - unix_timestamp(min(timestamp)) start_to_end_UTC
from post_snapshots group by permalink;
These examples have nothing to do with timezone conversions. When you subtract one date directly from the other, MySQL builds an integer from all the date parts and then does the arithmetic on those integers. For example, this query:
select now()+1;
returns (it was '2013-02-26 14:38:31' + 1):
+----------------+
| now()+1        |
+----------------+
| 20130226143832 |
+----------------+
So the difference between "2013-02-19 16:49:21" and "2013-02-19 19:07:31" turns out to be:
20130219190731 - 20130219164921 = 25810
The correct way for getting this subtraction is to either convert the dates to timestamps (like you did) or to use TIMESTAMPDIFF(SECOND, start_date, end_date), which would return 8290.
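The two numbers can be reproduced with a few lines of Python. This is only a sketch that imitates the digit-crunching described above; the helper name crunch is made up for this example.

```python
from datetime import datetime

start = datetime(2013, 2, 19, 16, 49, 21)
end = datetime(2013, 2, 19, 19, 7, 31)

def crunch(dt: datetime) -> int:
    """Glue the datetime digits together into one integer, YYYYMMDDHHMMSS."""
    return int(dt.strftime("%Y%m%d%H%M%S"))

# Subtracting the glued integers gives a number with no time meaning.
print(crunch(end) - crunch(start))         # 25810

# Subtracting the actual points in time gives the elapsed seconds.
print(int((end - start).total_seconds()))  # 8290
```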
This isn't a DATETIME vs. TIMESTAMP or a time zone problem.
MySQL handles datetimes as operands to a subtraction (or other mathematical operation) by converting each value to a number, but it's not the number of seconds, it's just the datetime digits crunched together. Take an example from your data:
2013-02-19 16:49:21 becomes 20130219164921
2013-02-19 19:07:31 becomes 20130219190731
The difference between those two numbers is... 25810, which is the value you're seeing as the result of your subtraction operation. That's not a result in seconds, as you noted. It really doesn't mean much useful at all.
In contrast, TIMESTAMPDIFF() (or pre-converting to Unix timestamps as you did) performs the subtraction using time-appropriate math, which is what you need if the difference has to mean anything beyond sort order:
SELECT TIMESTAMPDIFF(SECOND, '2013-02-19 16:49:21', '2013-02-19 19:07:31')
>>> 8290
What happens is that you cannot subtract dates/datetimes directly in MySQL. For all math operations, the MySQL timestamp data type behaves like the datetime data type.
You could use instead
select
TIMESTAMPDIFF(SECOND,min(timestamp),max(timestamp))
from post_snapshots group by permalink;