I'm currently storing various metadata about videos and one of those bits of data is the length of a video.
So if a video is 10 minutes 35 seconds long, it's saved as "10:35" in the database.
But what I'd like to do is retrieve a listing of videos by length (longest first, shortest last).
The problem I'm having is that if a video is "2:56", it's coming up as longest because the number 2 is more than the number 1 in "10:35".
So, how can I order data based on that length field so that "10:35" is recognized as being longer than "2:56" (as per my example)?
SELECT * FROM table ORDER BY STR_TO_DATE(meta_time, '%i:%s') DESC
You can find the specific formatters on the MySQL Website.
For example:
%k -> Hour (0..23)
%l -> Hour (1..12)
%i -> Minutes (00..59)
%s -> Seconds (00..59)
The easiest choice is to store an integer (seconds) or a float (minutes) instead of a string. So 10:35 would be 635 in seconds or 10.583 in minutes. You can sort by these numerically very easily, and you can output them in the format you'd like with some simple math and string functions.
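As a rough illustration of the "simple math and string functions" mentioned above, here is a minimal Python sketch. The function names length_to_seconds and seconds_to_length are made up for this example, and it assumes lengths are written as "M:SS" or "H:MM:SS".

```python
def length_to_seconds(length: str) -> int:
    """Convert an "M:SS" or "H:MM:SS" string to a total number of seconds."""
    total = 0
    for part in length.split(":"):
        total = total * 60 + int(part)
    return total


def seconds_to_length(seconds: int) -> str:
    """Format a number of seconds back into "M:SS" for display."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}:{secs:02d}"


lengths = ["2:56", "10:35", "0:45"]
# Sorting on the numeric value puts "10:35" ahead of "2:56".
longest_first = sorted(lengths, key=length_to_seconds, reverse=True)
print(longest_first)            # ['10:35', '2:56', '0:45']
print(seconds_to_length(635))   # 10:35
```

Once the column holds the integer, the database can do the same ordering with a plain numeric ORDER BY.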
Some options:
Save it as an integer representing the total number of seconds. "10:35" => 635
Save it as a TIME value with no date component. "10:35" => MAKETIME(0, 10, 35)
Save it with leading zeros or spaces. "2:25" => " 2:25"
My preference would be for the first option.
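To see why the padding option works at all, here is a small Python sketch comparing a plain string sort with a sort on zero-padded strings. It assumes every length is under 100 minutes, so a fixed width of 5 characters is enough; the variable names are illustrative only.

```python
lengths = ["2:56", "10:35", "1:07"]

# Plain string comparison goes character by character, so "2:56" wins.
plain = sorted(lengths, reverse=True)

# Padding to a fixed width makes string order agree with numeric order.
padded = sorted((s.rjust(5, "0") for s in lengths), reverse=True)

print(plain)   # ['2:56', '1:07', '10:35']
print(padded)  # ['10:35', '02:56', '01:07']
```

The padded form only stays correct while every value fits the chosen width, which is one reason to prefer the integer option.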
You could try to see if
ORDER BY TIME_TO_SEC(timefield)
would parse it correctly. However, storing time as strings in the database is not an optimal approach, and I suggest that you store them as TIME if you are able to. Then you can use standard formatting functions to present them as you like.
I had the same problem: storing video lengths in a database.
I solved it by using the MySQL TIME type, which takes care of all the ordering and conversion issues.
I have a GETDATE() value and I want to convert it into this format: 20210211T172650Z. How do I do it in an SSIS expression?
In SSIS, we have data types for strings, numbers and dates. Dates have no format, and when one is converted to a string value, you get whatever format the localization rules dictate.
If you want a particular format, you need to control it, and the only way to control it is by using a string data type.
The pattern we're going to use here, for each element, is:
extract the digit(s)
convert the digits to string
left pad/prepend a leading zero
extract the last 2 characters from our string
When we extract digits, they're numbers and numbers don't have leading zeroes. We convert to string which will allow us to then add the character zero in front of it because we're just concatenating strings. If the number was less than 10, then this prepending of a zero will result in exactly what we want. 9 -> 09 If it was greater than 9, then we have an extraneous value in there. 11 -> 011. We don't care that we went too big because we're then going to take the right 2 most characters making 09 -> 09 and 011 -> 11. This is the shortest logic to making a leading zero string in SSIS.
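The same pad-then-take-the-right-two trick can be sketched outside SSIS. This is a Python rendering of the four steps, not SSIS code; two_digits is a name invented for the example, and it assumes the input is between 0 and 99.

```python
def two_digits(n: int) -> str:
    """Prepend a zero, then keep only the last two characters."""
    as_text = str(n)          # convert the digits to a string
    padded = "0" + as_text    # 9 -> "09", 11 -> "011"
    return padded[-2:]        # keep the rightmost two characters


print(two_digits(9))   # 09
print(two_digits(11))  # 11
```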
Using that logic, we're going to create a variable for each element of our formatted string: year, month, day, hour, minute, second.
What's the starting date?
I created a variable called StartDate of type DateTime and hard-coded it to a starting point. This allows me to test various conditions. If I used GetDate(), I'd have to adjust my computer's clock to ensure my code works on 2001-01-01 at 01:01:01 as well as 2021-12-31 at 23:59:59. When you're satisfied your code passes all the tests, you can set the StartDate property EvaluateAsExpression to True and then use GetDate(). But I wouldn't use GetDate().
GetDate() is evaluated every time you inspect it. When your package starts, it will show 2021-02-12 @ 11:16 AM. But if your package takes 5 minutes to run, then when you go to re-use the value that is built on GetDate(), you will get 2021-02-12 @ 11:21 AM.
In your case, those keys won't match if you send it more than once to your Amazon thing. Instead, use a System scoped variable like @[System::StartTime]. That is updated to the time the package starts executing and remains constant for the duration of the SSIS package execution. So when you're satisfied the expression you've built matches the business rules, change @[User::StartDate] over to use @[System::StartTime]. It provides the updated time but without the challenges of drifting time.
Extract the digit(s)
The SSIS expression language has YEAR, MONTH and DAY defined but no shorthand methods for time components. But, it does have the DATEPART function in which you can ask for any named date part. I'm going to use that for all of my access methods as it makes it nice and consistent.
As an example, this is how I get the hour: the string literal "HOUR" names the date part, and our variable supplies the date.
DATEPART("HOUR",@[User::StartDate])
Convert the digits to string
The previous step gave us a number, but we've got that leading zero problem to solve, so convert it to a string
(DT_WSTR, 2)DATEPART("HOUR",@[User::StartDate])
Cast to string, two characters wide max, the number we generated
left pad/prepend a leading zero
String concatenation is the + operator and since we can't concatenate a string to a number, we make sure we have the correct operand types on both sides
"0" + (DT_WSTR, 2)DATEPART("HOUR",@[User::StartDate])
extract the last 2 characters from our string
Since we might have a 2 or 3 character string at this point, we're going to use the RIGHT function to only get the last N characters.
RIGHT("0" + (DT_WSTR, 2)DATEPART("HOUR",@[User::StartDate]), 2)
Final concatenation
Now that we have our happy little variables and we've checked our boundary conditions, the only thing left is to make one last variable, DateAsISO8601, of type String, with EvaluateAsExpression = True
@[User::Year] + @[User::Month] + @[User::Day] + "T" + @[User::Hour] + @[User::Minute] + @[User::Second] + "Z"
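For checking the boundary conditions outside the package, the whole concatenation can be mirrored in a few lines of Python. This is only a sketch of the same logic, not SSIS code: two_digits and to_compact_iso8601 are hypothetical names, and it assumes the timestamp is already in UTC, since the trailing "Z" claims as much.

```python
from datetime import datetime


def two_digits(n: int) -> str:
    """Left-pad with a zero and keep the last two characters."""
    return ("0" + str(n))[-2:]


def to_compact_iso8601(ts: datetime) -> str:
    """Concatenate year, month, day, "T", hour, minute, second and "Z"."""
    return (
        str(ts.year) + two_digits(ts.month) + two_digits(ts.day)
        + "T"
        + two_digits(ts.hour) + two_digits(ts.minute) + two_digits(ts.second)
        + "Z"
    )


# The two boundary dates from the text, plus the target example.
print(to_compact_iso8601(datetime(2001, 1, 1, 1, 1, 1)))       # 20010101T010101Z
print(to_compact_iso8601(datetime(2021, 12, 31, 23, 59, 59)))  # 20211231T235959Z
print(to_compact_iso8601(datetime(2021, 2, 11, 17, 26, 50)))   # 20210211T172650Z
```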
I've run into the limit myself, but despite lots of chatter online, I've never seen an explanation for why the upper and lower limit for the TIME data type is what it is. The official reference at http://dev.mysql.com/doc/refman/5.7/en/time.html says
TIME values may range from '-838:59:59' to '838:59:59'. The hours part may be so large because the TIME type can be used not only to represent a time of day (which must be less than 24 hours), but also elapsed time or a time interval between two events (which may be much greater than 24 hours, or even negative).
But I'm wondering not why the hours part is allowed to be "so large", but why it's cut off where it is. There doesn't seem to be any significance to that many hours in regards to days, or if I try to imagine possible cutoffs for how many seconds could be stored as an integer. So why the range?
TIME values have always been stored in 3 bytes in MySQL, but the format changed in version 5.6.4. I suspect this was not the first time it changed, but the other change, if there was one, happened a long time ago and there is no public evidence of it. The MySQL source code history on GitHub starts with version 5.5 (the oldest commit is from May 2008), but the change I am looking for happened somewhere around 2001-2002 (MySQL 4 was launched in 2003).
The current format, as described in the documentation, uses 6 bits for seconds (possible values: 0 to 63), 6 bits for minutes, 10 bits for hours (possible values: 0 to 1023), 1 bit for sign (add the negative values of the already mentioned intervals) and 1 bit is unused and labelled "reserved for future extensions".
It is optimized for working with time components (hours, minutes, seconds) and doesn't waste much space. Using this format it's possible to store values between -1023:59:59 and +1023:59:59. However MySQL limits the number of hours to 838, probably for backward compatibility with applications that were written a while ago, when I think this was the limit.
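To make the bit layout concrete, here is a simplified Python sketch that packs a non-negative h:m:s value into 24 bits using the field widths described above (sign, reserved, 10-bit hour, 6-bit minute, 6-bit second). The names pack_time and unpack_time are invented for the example, negative values are left out, and this is not claimed to match the server's byte-for-byte storage.

```python
def pack_time(hours: int, minutes: int, seconds: int) -> int:
    """Pack a non-negative h:m:s as sign(1) | reserved(1) | hour(10) | minute(6) | second(6)."""
    if not (0 <= hours <= 1023 and 0 <= minutes <= 59 and 0 <= seconds <= 59):
        raise ValueError("component out of range for this layout")
    sign = 1  # 1 = non-negative, as in the documented layout
    return (sign << 23) | (hours << 12) | (minutes << 6) | seconds


def unpack_time(packed: int) -> tuple:
    """Extract (hours, minutes, seconds) from the packed 24-bit value."""
    hours = (packed >> 12) & 0x3FF   # 10 bits
    minutes = (packed >> 6) & 0x3F   # 6 bits
    seconds = packed & 0x3F          # 6 bits
    return hours, minutes, seconds


print(unpack_time(pack_time(838, 59, 59)))    # (838, 59, 59)
print(pack_time(1023, 59, 59) < 2 ** 24)      # True: 1023 hours still fits in 3 bytes
```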
Until version 5.6.4, the TIME values were also stored on 3 bytes and the components were packed as days * 24 * 3600 + hours * 3600 + minutes * 60 + seconds. This format was optimized for working with timestamps (because it was, in fact, a timestamp). Using this format it would be possible to store values in the range of about -2330 to +2330 hours. While having this big range of values available, MySQL was still limiting the values to -838 to +838 hours.
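The "about 2330 hours" figure follows from simple arithmetic, sketched here in Python under the assumption that the seconds count is kept as a signed 3-byte integer.

```python
# Largest positive value of a signed 24-bit integer.
MAX_SIGNED_24 = 2 ** 23 - 1          # 8388607 seconds

max_hours, leftover = divmod(MAX_SIGNED_24, 3600)
leftover_minutes, leftover_seconds = divmod(leftover, 60)

print(max_hours)                            # 2330
print(leftover_minutes, leftover_seconds)   # 10 7
```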
There was bug #11655 on MySQL 4. It was possible to return TIME values outside the -838..+838 range using nested SELECT statements. It was not a feature but a bug and it was fixed.
The only reason to limit the values to this range and to actively change any piece of code that produces TIME values outside it was backward compatibility.
I suspect MySQL 3 used a different format that, due to the way the data was packed, limited the valid values to the range -838..+838 hours.
By looking into the current MySQL's source code I found this interesting formula:
#define TIME_MAX_VALUE (TIME_MAX_HOUR*10000 + TIME_MAX_MINUTE*100 + TIME_MAX_SECOND)
Let's ignore for the moment the MAX part of the names used above and let's remember only that TIME_MAX_MINUTE and TIME_MAX_SECOND are numbers between 00 and 59. The formula just concatenates the hours, minutes and seconds in a single integer number. For example, the value 170:29:45 becomes 1702945.
This formula raises the following question: given that the TIME values are stored on 3 bytes with sign, what is the maximum positive value that can be represented this way?
The value we are looking for is 0x7FFFFF, which in decimal notation is 8388607. Since the last four digits (8607) should be read as minutes (86) and seconds (07), and the maximum valid value for each is 59, the greatest value that can be stored in 3 bytes with sign using the formula above is 8385959. As a TIME, that is +838:59:59. Ta-da!
Guess what? The fragment of C code listed above was extracted from this:
/* Limits for the TIME data type */
#define TIME_MAX_HOUR 838
#define TIME_MAX_MINUTE 59
#define TIME_MAX_SECOND 59
#define TIME_MAX_VALUE (TIME_MAX_HOUR*10000 + TIME_MAX_MINUTE*100 + TIME_MAX_SECOND)
I am sure this is how MySQL 3 used to keep the TIME values internally. This format imposed the limitation of the range, and the backward compatibility requirement on the subsequent versions propagated the limitation to the present day.
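The argument can be checked by brute force. This Python sketch walks down from the largest signed 3-byte value and stops at the first number whose minute and second digits are both valid; is_valid_hhmmss is a name made up for the example.

```python
MAX_SIGNED_24 = 0x7FFFFF  # 8388607


def is_valid_hhmmss(value: int) -> bool:
    """True if the MM and SS digit pairs of an HHHMMSS integer are both <= 59."""
    minutes = value // 100 % 100
    seconds = value % 100
    return minutes <= 59 and seconds <= 59


# First valid value at or below the 3-byte signed maximum.
largest = next(v for v in range(MAX_SIGNED_24, 0, -1) if is_valid_hhmmss(v))

print(largest)            # 8385959
print(largest // 10000)   # 838
```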
DATETIME is stored based on a base of 10, see Date and Time Data Type Representation:
DATETIME: Eight bytes: A four-byte integer for date packed as YYYY×10000 + MM×100 + DD and a four-byte integer for time packed as HH×10000 + MM×100 + SS
For convenience and some other reasons, the (old) time format was encoded in the same way, using 3 bytes:
Hours * 10000 + Minutes * 100 + Seconds
This means:
3 bytes = 24 bits: 2^24 = 16,777,216
with sign: 2^23 = 8,388,608
Using the encoding, the first three digits of that limit give the magical 838 hours. The remaining four digits (8608) are more than the 5959 needed for the minutes and seconds, so the largest valid time is 838:59:59. One nice thing about this is that the integer representation of that time, 8385959, is easily readable to a human. But this encoding of course leaves gaps: invalid (unused) integer values (like 8309999).
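The gaps are easy to see when decoding. Below is a small Python sketch that splits such an integer back into its parts and rejects values that fall into a gap; decode_old_time is a hypothetical helper, not a MySQL function.

```python
def decode_old_time(value: int):
    """Split an HHHMMSS integer into (hours, minutes, seconds), or None if it is a gap value."""
    hours, rest = divmod(value, 10000)
    minutes, seconds = divmod(rest, 100)
    if minutes > 59 or seconds > 59:
        return None  # e.g. 8309999 would mean 99 minutes and 99 seconds
    return hours, minutes, seconds


print(decode_old_time(8385959))  # (838, 59, 59)
print(decode_old_time(8309999))  # None
```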
As of MySQL 5.6.4, time format changed its encoding to
1 bit sign (1= non-negative, 0= negative)
1 bit unused (reserved for future extensions)
10 bits hour (0-838)
6 bits minute (0-59)
6 bits second (0-59)
---------------------
24 bits = 3 bytes
Even though it could now store more hours, for compatibility it still just allows 838 hours.
Obviously, it's hard to answer these types of questions without getting direct feedback from the designers of the database.
But there is some documentation regarding how the different data types are stored internally, and, to an extent, it can help us understand this a little bit.
For instance, regarding the TIME data type, notice how it's stored internally according to the documentation:
TIME encoding for nonfractional part:
1 bit sign (1= non-negative, 0= negative)
1 bit unused (reserved for future extensions)
10 bits hour (0-838)
6 bits minute (0-59)
6 bits second (0-59)
---------------------
24 bits = 3 bytes
So, as you can see, the goal is to fit the information within 3 bytes. And, of those 3 bytes, 10 bits are reserved for the hours, which pretty much determines the overall range.
That said, 10 bits does allow values up to 1023, so I guess, technically, without any changes to the storage size, the range could have been -1023:59:59 to 1023:59:59. Why they didn't do that and they chose 838 as the cutoff, I have no idea.
I have the following sample numbers, which are stored in a MySQL database in DECIMAL(10,2) format:
1499.3927125506 - 1499.39 -> this is how it is saved in the database
384.41295546559 - 384.41
278.74493927126 - 278.74
537.44939271255 - 537.45
The actual total before saving into the database is 1700; however, after saving, the total becomes 1699.99.
How can I make the total 1700 NOT 1699.99?
With numbers like that, you should use DOUBLE, not DECIMAL(.., ..).
(Do not use DOUBLE(.., ..); that will just add to your problems.)
You will need to change your MySQL datatype to match the precision of your data if you want to avoid rounding errors. In your presented data, the longest fractional part has 11 digits and the largest integer part has 4, so you will need a precision (total digits) of 15 with a scale (digits after the decimal point) of 11, which would be DECIMAL(15,11), for the stored values to sum exactly.
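The effect of the scale can be reproduced with Python's decimal module. This sketch rounds each of the four values exactly as listed in the question to a given scale before summing, roughly what a DECIMAL column does on insert. The store helper is invented for the example, and note that the four values as written add up to 2700 rather than 1700; the lost cent is the same effect either way.

```python
from decimal import Decimal, ROUND_HALF_UP

values = [
    Decimal("1499.3927125506"),
    Decimal("384.41295546559"),
    Decimal("278.74493927126"),
    Decimal("537.44939271255"),
]


def store(value: Decimal, scale: int) -> Decimal:
    """Round to `scale` fractional digits, half away from zero."""
    return value.quantize(Decimal(10) ** -scale, rounding=ROUND_HALF_UP)


total_scale_2 = sum(store(v, 2) for v in values)
total_scale_11 = sum(store(v, 11) for v in values)

print(total_scale_2)    # 2699.99
print(total_scale_11)   # 2700.00000000000
```

If the column has to stay at two decimals for other reasons, the alternative is to accept per-row rounding and not expect the rounded rows to add up to the rounded total.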
I have an application in which users are typing measured amounts. I would like to respect the precision they are entering when storing the values in MySQL, i.e. if they type 0.050 I don't want that to become 0.05, since that loses information on how exact the measurement was. Is there a way other than storing the value as a string?
0.050 is equal to 0.05. If you want 3 digits after the decimal point, you have to implement this feature in the application.
As far as I know, precision is handled with the FLOAT, DOUBLE or DECIMAL data types. In your case, if you are not using any function like SUM(), AVG(), etc., then you can use VARCHAR.
Always add a "control" digit, let's say "1", before saving to the database; then, when you read the data back, always trim that "control" digit exactly ONCE from what you got.
example:
user inputs 0,000050
save as 0,0000501
get the 0,0000501
trim the last "1" (only the last one, be careful)
k.i.s.s. people :D
edit: proper solution of course, add a column to store precision and right-pad zeroes if needed
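The precision-column idea from the edit above might look like this on the application side. It is a sketch only: split_measurement and render_measurement are hypothetical helpers, and it assumes the value is kept in a numeric column and the number of decimals in a separate integer column.

```python
from decimal import Decimal


def split_measurement(text: str):
    """Return (numeric value, number of decimals the user typed)."""
    value = Decimal(text)
    scale = max(0, -value.as_tuple().exponent)
    return value, scale


def render_measurement(value: Decimal, scale: int) -> str:
    """Right-pad with zeroes back to the precision the user entered."""
    return f"{value:.{scale}f}"


value, scale = split_measurement("0.050")
print(scale)                                        # 3
# A numeric column may hand the value back as 0.05; the stored scale restores it.
print(render_measurement(Decimal("0.05"), scale))   # 0.050
```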
I need to calculate the time a user spends on the site. It is the difference between logout time and login time, to give me something like "Mr X spent 4 hours and 43 minutes online". So to store the 4 hours and 43 minutes, I declared it like this:
duration time NOT NULL
Is this valid, or is there a better way to store this? I need to store it in the DB because I have other calculations I need to use this for, plus other use cases.
Storing it as an integer number of seconds will be the best way to go.
The UPDATE will be clean and simple - i.e. duration = duration + $increment
As Tristram noted, there are limitations to using the TIME field - e.g. "TIME values may range from '-838:59:59' to '838:59:59'"
The days/hours/minutes/seconds display formatting won't be hardcoded.
The execution of your other calculations will almost surely be clearer when working with an integer "number of seconds" field.
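As a sketch of how little code the integer approach needs, here is a Python version of the increment and the display formatting. describe_duration is a name invented for the example, and the wording of the output string is just one possible format.

```python
def describe_duration(total_seconds: int) -> str:
    """Format a number of seconds as "H hours and M minutes"."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes = remainder // 60
    return f"{hours} hours and {minutes} minutes"


duration = 0
duration += 4 * 3600   # same idea as: duration = duration + $increment
duration += 43 * 60

print(describe_duration(duration))     # 4 hours and 43 minutes
print(describe_duration(900 * 3600))   # 900 hours and 0 minutes
```

The second call shows a total beyond the 838-hour TIME limit, which an integer column handles without any special casing.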
I wouldn't use TIME, as you would be limited to 838:59:59. The easiest would be just to store an integer in minutes (or seconds, depending on the resolution you need).
Consider storing both values as a UNIX-epoch-delta.
I generally prefer to use a signed (64b) bigint (for one-second resolution), or a (64b) double (if fractional seconds are needed), or a signed (32b) int (if scaled down to minute or hour resolution).
Make the unit explicit in the name of the column, for example with a suffix like "_epoch_minutely", for example: "started_epoch_minutely", "finished_epoch_minutely".
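A minimal Python sketch of the epoch-delta idea at minute resolution follows. The function to_epoch_minutes is hypothetical, the variable names follow the suffix convention suggested above, and the timestamps are assumed to be timezone-aware UTC values.

```python
from datetime import datetime, timezone


def to_epoch_minutes(ts: datetime) -> int:
    """Whole minutes elapsed since 1970-01-01T00:00:00Z."""
    return int(ts.timestamp()) // 60


started_epoch_minutely = to_epoch_minutes(
    datetime(2021, 2, 11, 17, 26, tzinfo=timezone.utc))
finished_epoch_minutely = to_epoch_minutes(
    datetime(2021, 2, 11, 22, 9, tzinfo=timezone.utc))

# The duration is a plain integer subtraction.
print(finished_epoch_minutely - started_epoch_minutely)  # 283, i.e. 4 hours 43 minutes
```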