Problems handling significant digits in MySQL when converting FLOAT to DOUBLE


I am inserting data from one table into another in a MariaDB database, where the column in the first table is FLOAT, and in the second it's DOUBLE. The data can have values of any size, precision and decimal places.
Here is what happens to the values when I do a straight-forward copy:
INSERT INTO data2 (value) SELECT value FROM data1
The values are given random extra significant figures:
FLOAT in data1           DOUBLE in data2
-0.000000000000454747    -0.0000000000004547473508864641
-122.319                 -122.31932830810547
14864199700              14864220160
CAST(value AS DECIMAL(65,30)) generates exactly the same values as col 2 above, except I see trailing zeroes.
Yet when I just do
UPDATE data2 SET value = 14867199700 WHERE id = 133025046;
the DOUBLE value is accepted.
Do I have to export all the values to an SQL script and re-import them? Isn't there a better way?
Despite hours of experimenting with the issue, I'm not much closer to a solution, despite its limited nature. I can see this is a problem that besets all technologies, not just MariaDB or databases, so I have probably just missed the answer somewhere. Stack Overflow is desperately trying to guide me to a solution with new suggestion features I hadn't seen before, but unfortunately they are no help, like the other suggested answers.

Your test case is flawed. You are feeding in decimal digits, and not testing just the transfer of FLOAT to DOUBLE.
UPDATE tbl SET double_col = float_col will always copy exactly the same value. This is because the DOUBLE representation is a superset of the FLOAT representation (53 vs 24 bits of precision; etc).
Literal, with decimal places: UPDATE tbl SET double_col = 123.456 will mangle the number because of rounding from decimal to DOUBLE. Ditto for float_col. Furthermore, the mangled results will be different!
Whole-number literal: UPDATE tbl SET double_col = 14867199700 will be stored exactly. But if you put that same literal into a FLOAT, it will be rounded to 24 bits, so it cannot be stored exactly. You lose exactness at about 7 significant digits for FLOAT and about 16 for DOUBLE. The literal in this example has 9 significant digits (after ignoring trailing zeros).
That's just a sampling of the nightmares you can get into.
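The claim that a FLOAT-to-DOUBLE copy is exact can be checked outside the database. What follows is a minimal Python sketch, not MySQL code: the helper name to_float32 is introduced here for illustration, and it uses the standard struct module to round a Python float (an IEEE 754 double) to single precision. It assumes the FLOAT column is displayed with roughly 6 significant digits, which would explain why data1 shows -122.319 while the copied DOUBLE shows the full stored value.

```python
import struct


def to_float32(x: float) -> float:
    """Round a double to the nearest IEEE 754 single, returned as a double."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# The DOUBLE value from the question's second column.
widened = -122.31932830810547

# It is already exactly representable in single precision, so rounding it
# to float32 and back changes nothing: the copy added no digits.
is_exact_single = to_float32(widened) == widened

# Shown with 6 significant digits (an assumption about FLOAT display),
# the same value looks like the short number in the first column.
short_display = "%.6g" % widened

print(is_exact_single, short_display)
```

Under that assumption the "extra" digits were in the FLOAT all along; the wider column merely displays more of them.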
You must consider FLOAT and DOUBLE to be approximate. You should never compare for equality; you don't know what might have messed with the last bit of the value.
Also, you should not try to guess when MySQL will perform expressions in DECIMAL instead of DOUBLE.
And, keep in mind that division is usually imprecise due to rounding to some number of bits or decimals.
The "mantissa" of 14864199700 is
1.10111010111111001101100 (binary of FLOAT: 24 bits including the 'hidden' leading bit)
1.1011101011111100110110000000101000000000000000000000 (binary of DOUBLE)
                              ^ ^ (lost in FLOAT)
Each of those is multiplied by the same power of 2. The DOUBLE gets exactly 14864199700. The FLOAT lost the bits pointed to, so it holds 14864199680 instead.
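The lost bits can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, assuming the usual IEEE 754 single and double formats; to_float32 is a helper name made up for this example.

```python
import struct


def to_float32(x: float) -> float:
    """Round a double to the nearest IEEE 754 single, returned as a double."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


value = 14864199700
as_double = float(value)            # 53-bit significand: stored exactly
as_single = to_float32(as_double)   # 24-bit significand: rounded

# Between 2**33 and 2**34, consecutive singles are 2**(33 - 23) = 1024 apart.
spacing = 2 ** (33 - 23)

# The two low-order 1 bits (16 + 4) are what the single cannot keep.
lost = int(as_double) - int(as_single)

print(int(as_double), int(as_single), lost)
```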
You can play around with such at https://gregstoll.dyndns.org/~gregstoll/floattohex/
Believe it or not, things used to be worse. People would be billed for $0.00 -- due to rounding errors. Or results of what should have been 1+1 showed as 1.99999999.

Related

When to use float vs decimal

I'm building this API, and the database will store values that represent one of the following:
percentage
average
rate
I honestly have no idea how to represent something whose range is between 0 and 100% in numbers. Should it be
0.00 - 1.00
0.00 - 100.00
any other alternative that I don't know
Is there a clear choice for that? A global way of representing in databases something that goes from 0 to 100 percent? Going further, what's the correct type for it, float or decimal?
Thank you.

I'll take the opposite stance.
FLOAT is for approximate numbers, such as percentages, averages, etc. You should do formatting as you display the values, either in app code or using the FORMAT() function of MySQL.
Don't ever test float_value = 1.3; there are many reasons why that will fail.
DECIMAL should be used for monetary values. DECIMAL avoids a second rounding when a value needs to be rounded to dollars/cents/euros/etc. Accountants don't like fractions of cents.
MySQL's implementation of DECIMAL allows 65 significant digits; FLOAT gives about 7 and DOUBLE about 16. 7 is usually more than enough for sensors and scientific computations.
As for "percentage" -- Sometimes I have used TINYINT UNSIGNED when I want to consume only 1 byte of storage and don't need much precision; sometimes I have used FLOAT (4 bytes). There is no datatype tuned specifically for percentage. (Note also, that DECIMAL(2,0) cannot hold the value 100, so technically you would need DECIMAL(3,0).)
Or sometimes I have used a FLOAT that held a value between 0 and 1. But then I would need to make sure to multiply by 100 before displaying the "percentage".
More
All three of "percentage, average, rate" smell like floats, so that would be my first choice.
One criterion for deciding on datatype... How many copies of the value will exist?
If you have a billion-row table with a column for a percentage, consider that TINYINT would take 1 byte (1GB total), but FLOAT would take 4 bytes (4GB total). OTOH, most applications do not have that many rows, so this may not be relevant.
As a 'general' rule, "exact" values should use some form of INT or DECIMAL. Inexact things (scientific calculations, square roots, division, etc) should use FLOAT (or DOUBLE).
Furthermore, the formatting of the output should usually be left to the application front end. That is, even though an "average" may compute to "14.6666666...", the display should show something like "14.7"; this is friendlier to humans. Meanwhile, you have the underlying value to later decide that "15" or "14.667" is preferable output formatting.
The range "0.00 - 100.00" could be done either with FLOAT and use output formatting or with DECIMAL(5,2) (3 bytes) with the pre-determination that you will always want the indicated precision.
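As a small illustration of leaving the formatting to the front end, here is a Python sketch. The value 44 / 3 is a hypothetical average chosen because it computes to 14.666...; the stored value stays untouched and only the display changes.

```python
# Hypothetical average as it might come back from a FLOAT or DOUBLE column.
average = 44 / 3

# The same underlying value rendered at three display precisions.
one_decimal = f"{average:.1f}"
three_decimals = f"{average:.3f}"
whole = f"{average:.0f}"

print(one_decimal, three_decimals, whole)
```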

I would generally recommend against using float. Floating-point numbers represent values in base 2, which causes some (exact decimal) numbers to be rounded in operations or comparisons, because they simply cannot be stored accurately in base 2. This may lead to surprising behavior.
Consider the following example:
create table t (num float);
insert into t values(1.3);
select * from t;
| num |
| --: |
| 1.3 |
select * from t where num = 1.3;
| num |
| --: |
Base-2 comparison of number 1.3 fails. This is tricky.
In comparison, decimal provides an accurate representation of finite numbers within its range. If you change float to decimal(2, 1) in the above example, you do get the expected results.
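The failed comparison can be reproduced outside the database. The Python sketch below assumes the column holds an IEEE 754 single and that the literal 1.3 is evaluated as a double; to_float32 is a helper name invented for the example. It also shows the usual workaround of comparing within a tolerance instead of testing for equality.

```python
import math
import struct


def to_float32(x: float) -> float:
    """Round a double to the nearest IEEE 754 single, returned as a double."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


stored = to_float32(1.3)   # what a 4-byte FLOAT would hold
literal = 1.3              # the literal, evaluated as a double

# The two roundings of 1.3 land on different binary values.
exact_match = stored == literal

# A tolerance of about 6 significant digits absorbs the difference.
close_match = math.isclose(stored, literal, rel_tol=1e-6)

print(stored, exact_match, close_match)
```

In SQL the same idea would be a range test such as WHERE ABS(num - 1.3) < 0.00001, with the tolerance chosen to suit the data.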

I recommend using decimal(5,2) if you're going to store it in the same way you'll display it since decimal is for preserving the exact precision. (See https://dev.mysql.com/doc/refman/8.0/en/fixed-point-types.html)
Because floating-point values are approximate and not stored as exact values, attempts to treat them as exact in comparisons may lead to problems. They are also subject to platform or implementation dependencies.
(https://dev.mysql.com/doc/refman/8.0/en/floating-point-types.html)
A floating-point value as written in an SQL statement may not be the same as the value represented internally.
For DECIMAL columns, MySQL performs operations with a precision of 65 decimal digits, which should solve most common inaccuracy problems.
https://dev.mysql.com/doc/refman/8.0/en/problems-with-float.html

Decimal:
For financial applications it is better to use Decimal types, because they give you a high level of accuracy and make it easy to avoid rounding errors.
Double:
Double types are probably the most commonly used data type for real values, except when handling money.
Float:
It is used mostly in graphics libraries, because of their very high demands for processing power, and in other situations that can tolerate rounding errors.
Reference: http://net-informations.com/q/faq/float.html

The difference between float and decimal is the precision. Decimal can represent with 100% accuracy any number within the precision of the decimal format, whereas float cannot accurately represent all numbers.
Use Decimal for e.g. finance-related values and float for e.g. graphics-related values.

mysql> create table numbers (a decimal(10,2), b float);
mysql> insert into numbers values (100, 100);
mysql> select @a := (a/3), @b := (b/3), @a + @a + @a, @b + @b + @b from numbers \G
*********************************************************************
@a := (a/3): 33.333333333
@b := (b/3): 33.333333333333
@a + @a + @a: 99.999999999000000000000000000000
@b + @b + @b: 100
The decimal did exactly what it's supposed to do in this case: it truncated the rest, thus losing the 1/3 part.
So for sums, the decimal is better, but for divisions, the float is better, up to some point, of course. I mean, using DECIMAL will not give you "fail-proof arithmetic" by any means.
I hope this will help.
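The same effect can be sketched in Python with the standard decimal module. This is an analogy rather than a reproduction of MySQL's internals: the scale of 9 decimal places is an assumption chosen to match the output shown above, and Python floats are doubles rather than 4-byte FLOATs.

```python
from decimal import Decimal, ROUND_HALF_UP

# Fixed point: the quotient is cut off at 9 decimal places (assumed scale).
third_fixed = (Decimal(100) / Decimal(3)).quantize(
    Decimal("0.000000001"), rounding=ROUND_HALF_UP
)
sum_fixed = third_fixed + third_fixed + third_fixed

# Binary floating point: the quotient is rounded to 53 bits instead,
# and the rounding in the additions happens to land back on 100.
third_float = 100.0 / 3
sum_float = third_float + third_float + third_float

print(sum_fixed, sum_float)
```

The floating-point result returning to exactly 100 is a property of this particular input, not a guarantee for division in general.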

In tsql:
Float: 0.0 is stored as 0, and you don't need to specify the number of digits after the decimal point, e.g. you don't need to write Float(4,2).
Decimal: 0.0 is stored as 0.0, and you can specify the digits, e.g. decimal(4,2). I would suggest 0.00-1.00; that way you can use the value directly in calculations without scaling by 100, and when you report, set the column's format to percent, since MS Excel and other platforms display 0.5 as 50%.

What are the constraints on the values that can be entered in a column declared as FLOAT?

I am building a web app, and in some section in it a teacher inserts the expected results of a scientific experiment. These results must be very accurate, they might come like this 0.4933546522886728. And after searching for a while, FLOAT seems to be the right datatype to store these answers in the database. As known FLOAT columns in mysql can be declared like this FLOAT(n, d), where n is the total number of digits in the number and d is the number of digits after the decimal point. So, I do not know the number of digits the teacher will enter. So, what would happen if I declared it like this FLOAT. The thing that made me think of this is this quote from the mysql documentation.
For maximum portability, code requiring storage of approximate numeric data values should use FLOAT or DOUBLE PRECISION with no specification of precision or number of digits.
And what would be the maximum and minimum of the values to be entered in this FLOAT column.
I also thought of using VARCHAR to store the exact number that the teacher enters. Then, depending on the number stored in the database, the number that the student enters would be adjusted to the same number of digits before being compared with the right answer.
For example if the teacher enters 1.23451 and the student enters 1.4235123, my code will make it 1.42351.

The (n,d) on the end of FLOAT and DOUBLE does not make sense. All it does is cause an extra rounding.
FLOAT provides about 7 significant decimal digits of precision and a modestly big exponent range. 0.4933546522886728 will be stored as about 0.4933546xxxxx, with the extra digits being noise.
That number can be stored in a DOUBLE, with a rounding error after 53 bits (about 16 digits) of precision.
There are very few scientific measurements that need more digits than available in the precision of FLOAT.
You can INSERT ... VALUES ( 0.4933546522886728 ) and put that into a FLOAT. It will get rounded to 24 significant bits. Ditto for 4933546522886.728 . Or 0.0000000004933546522886728 . Or 4.933546522886728e20 or 4.933546522886728e-20 .
Take whatever numbers you are given and simply put them in the INSERT without worrying about precision or scaling.
VARCHAR is the wrong way to go for numbers and dates, unless you want to store the raw input before it has been converted into the internal format.
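To see how much of 0.4933546522886728 survives in a 4-byte FLOAT, the rounding can be imitated in Python. This is a sketch assuming IEEE 754 single precision; to_float32 is a helper name made up here, and the database's own display may show fewer digits than Python does.

```python
import struct


def to_float32(x: float) -> float:
    """Round a double to the nearest IEEE 754 single, returned as a double."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


entered = 0.4933546522886728          # held exactly enough by a DOUBLE
as_single = to_float32(entered)       # what a FLOAT would keep

# Singles in [0.25, 0.5) are 2**-25 apart, so the rounding error
# can be at most half of that.
half_spacing = 2.0 ** -26
error = abs(as_single - entered)

print(repr(entered), repr(as_single), error)
```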

MySQL JSON stores different floating point value

How are floating point values in JSON data columns rounded in MySQL (5.7)?
I am having trouble finding a good resource to know how to solve my issue.
Here's what happens:
CREATE TABLE someTable (jdoc JSON);
INSERT INTO someTable VALUES('{"data":14970.911769838869}');
Then select the rows:
SELECT * from someTable;
I get data back with a different final digit:
'{"data": 14970.911769838867}'
Any idea why this happens? Can I adjust the data in a way to prevent this or is there a rounding precision issue?

Double precision floating point has about 16 decimal digits of precision. Your number has 17 digits, so it can't be represented exactly in floating point, and you get round-off error in the last digit.
See How many significant digits have floats and doubles in java?
The question is about Java, but just about everything uses the same IEEE 754 floating point format, so the answer applies pretty generally.
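The digit counts can be explored with a short Python sketch. It shows only the general IEEE 754 double behaviour (15 significant digits always survive, 17 are enough to round-trip any double); it makes no claim about how MySQL 5.7 parses or prints JSON numbers.

```python
original = "14970.911769838869"   # the 17-digit value from the question
as_double = float(original)

# 15 significant digits are always safe for a double.
fifteen = format(as_double, ".15g")

# 17 significant digits are enough to recover the same double again.
seventeen = format(as_double, ".17g")
round_trips = float(seventeen) == as_double

print(fifteen, seventeen, round_trips)
```

If the last digits matter, storing the number as a JSON string is one way to keep the exact text.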

Storing statistical data, do I need DECIMAL, FLOAT or DOUBLE?

I am creating for fun, but I still want to approach it seriously, a site which hosts various tests. With these tests I hope to collect statistical data.
Some of the data will include the percentage of the completeness of the tests as they are timed. I can easily compute the percentage of the tests but I would like true data to be returned as I store the various different values concerning the tests on completion.
Most of the values are, in PHP floats, so my question is, if I want true statistical data should I store them in MYSQL as FLOAT, DOUBLE or DECIMAL.
I would like to utilize MYSQL'S functions such as AVG() and LOG10() as well as TRUNCATE(). For MYSQL to return true data based off of my values that I insert, what should I use as the database column choice.
I ask because some numbers may or may not be floats such as, 10, 10.89, 99.09, or simply 0.
But I would like true and valid statistical data to be returned.
Can I rely on floating point math for this?
EDIT
I know this is a generic question, and I apologise extensively, but for non mathematicians like myself, also I am not a MYSQL expert, I would like an opinion of an expert in this field.
I have done my research but I still feel I have a clouded judgement on the matter. Again I apologise if my question is off topic or not suitable for this site.

This link does a good job of explaining what you are looking for. Here is what it says:
All these three Types, can be specified by the following Parameters (size, d). Where size is the total size of the String, and d represents precision. E.g To store a Number like 1234.567, you will set the Datatype to DOUBLE(7, 3) where 7 is the total number of digits and 3 is the number of digits to follow the decimal point.
FLOAT and DOUBLE, both represent floating point numbers. A FLOAT is for single-precision, while a DOUBLE is for double-precision numbers. A precision from 0 to 23 results in a 4-byte single-precision FLOAT column. A precision from 24 to 53 results in an 8-byte double-precision DOUBLE column. FLOAT is accurate to approximately 7 decimal places, and DOUBLE upto 14.
Decimal’s declaration and functioning is similar to Double. But there is one big difference between floating point values and decimal (numeric) values. We use DECIMAL data type to store exact numeric values, where we do not want precision but exact and accurate values. A Decimal type can store a Maximum of 65 Digits, with 30 digits after decimal point.
So, for the most accurate and precise value, Decimal would be the best option.

Unless you are storing decimal data (i.e. currency), you should use a standard floating point type (FLOAT or DOUBLE). DECIMAL is a fixed point type, so can overflow when computing things like SUM, and will be ridiculously inaccurate for LOG10.
There is nothing "less precise" about binary floating point types, in fact, they will be much more accurate (and faster) for your needs. Go with DOUBLE.

Decimal : Fixed-Point Types (Exact Value). Use it when you care about exact precision like money.
Example: salary DECIMAL(8,2), 8 is the total number of digits, 2 is the number of decimal places. salary will be in the range of -999999.99 to 999999.99
Float, Double : Floating-Point Types (Approximate Value). Float uses 4 bytes to represent value, Double uses 8 bytes to represent value.
Example: percentage FLOAT(5,2), same as the type decimal, 5 is total digits and 2 is the decimal places. percentage will store values between -999.99 to 999.99.
Note that they are approximate value, in this case:
Value like 1 / 3.0 = 0.3333333... will be stored as 0.33 (2 decimal place)
Value like 33.009 will be stored as 33.01 (rounding to 2 decimal place)
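The rounding to 2 decimal places described above can be imitated with Python's decimal module. This is a sketch of the arithmetic only; to_scale_2 is a hypothetical helper, and half-up rounding is assumed for the illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def to_scale_2(value: Decimal) -> Decimal:
    """Round a value to 2 decimal places, as a column with scale 2 would."""
    return value.quantize(CENT, rounding=ROUND_HALF_UP)


third = to_scale_2(Decimal(1) / Decimal("3.0"))   # 0.3333... becomes 0.33
rounded = to_scale_2(Decimal("33.009"))           # becomes 33.01

# DECIMAL(8,2): 8 digits in total, 2 of them after the decimal point.
largest = Decimal("999999.99")

print(third, rounded, largest)
```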

Put simply, FLOAT and DOUBLE are not as precise as DECIMAL. DECIMAL is recommended for money-related input (currency and salary).
Another point worth making: do NOT compare float numbers using "=" or "<>", because float numbers are not precise.

Linger: The website you mention and quote has IMO some imprecise info that made me confused. In the docs I read that when you declare a float or a double, the decimal point is in fact NOT included in the number. So it is not the number of chars in a string but all digits used.
Compare the docs:
"DOUBLE PRECISION(M,D).. Here, “(M,D)” means than values can be stored with up to M digits in total, of which D digits may be after the decimal point. For example, a column defined as FLOAT(7,4) will look like -999.9999 when displayed"
http://dev.mysql.com/doc/refman/5.1/en/floating-point-types.html
Also, the nomenclature is misleading: according to the docs, M is 'precision' and D is 'scale', whereas the website takes 'scale' for 'precision'.
Thought it would be useful in case somebody like me was trying to get a picture.
Correct me if I'm wrong, hope I haven't read some outdated docs:)

Float and Double are Floating point data types, which means that the numbers they store can be precise up to a certain number of digits only.
For example for a table with a column of float type if you store 7.6543219 it will be stored as 7.65432.
Similarly the Double data type approximates values but it has more precision than Float.
When creating a table with a column of Decimal data type, you specify the total number of digits and number of digits after decimal to store, and if the number you store is within the range you specified it will be stored exactly.
When you want to store exact values, Decimal is the way to go, it is what is known as a fixed data type.

Simply use FLOAT. And do not tack on '(m,n)'. Do display numbers to a suitable precision with formatting options. Do not expect to get correct answers with "="; for example, float_col = 0.12 will always return FALSE.
For display purposes, use formatting to round the results as needed.
Percentages, averages, etc. are all rounded (at least in some cases). That is, any choice you make will sometimes have issues.
Use DECIMAL(m,n) for currency; use ...INT for whole numbers; use DOUBLE for scientific stuff that needs more than 7 digits of precision; use FLOAT for everything else.
Transcendentals (such as the LOG10 that you mentioned) will do their work in DOUBLE; they will essentially never be exact. It is OK to feed it a FLOAT arg and store the result in FLOAT.
This Answer applies not just to MySQL, but to essentially any database or programming language. (The details may vary.)
PS: (m,n) has been deprecated for FLOAT and DOUBLE (as of MySQL 8.0.17). It only added extra rounding and other things that were essentially no benefit.

how many digits in FLOAT?

I've looked all over and can't find this answer.
How many actual digits are there for a MySQL FLOAT?
I know (think?) that it truncates what's in excess of the FLOAT's 4 byte limit, but what exactly is that?

From the manual (emphasis mine):
For FLOAT, the SQL standard permits an optional specification of the
precision (but not the range of the exponent) in bits following the
keyword FLOAT in parentheses. MySQL also supports this optional
precision specification, but the precision value is used only to
determine storage size. A precision from 0 to 23 results in a 4-byte
single-precision FLOAT column. A precision from 24 to 53 results in an
8-byte double-precision DOUBLE column.
So up to 23 bits of precision for the mantissa can be stored in a FLOAT, which is equivalent to about 7 decimal digits because 2^23 ~ 10^7 (8,388,608 vs 10,000,000). I tested it here. You can see that 12 decimal digits are returned, of which only the first 7 are really accurate.
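The bits-to-digits conversion is a one-line calculation. The sketch below assumes the IEEE 754 layouts (23 stored bits plus one hidden bit for single, 52 plus one for double) and simply multiplies the bit count by log10(2).

```python
import math

SINGLE_BITS = 24   # 23 stored bits plus the hidden leading bit
DOUBLE_BITS = 53   # 52 stored bits plus the hidden leading bit

# Each bit is worth log10(2), about 0.301 decimal digits.
single_digits = SINGLE_BITS * math.log10(2)
double_digits = DOUBLE_BITS * math.log10(2)

print(round(single_digits, 2), round(double_digits, 2))
```

That gives a little over 7 digits for FLOAT and just under 16 for DOUBLE, in line with the "about 7" and "about 16" quoted elsewhere.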

for those of you who think that MySQL treats floats the same as, for example JAVA, I got some SHOCKING news: MySQL degrades the available accuracy which is possible to a float, in order to hide from you decimal places which might be incorrect! Check this out:
JAVA:
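import java.text.DecimalFormat;

public class FloatDemo {
    public static void main(String[] args) {
        long i = 16777225;
        DecimalFormat myFormatter = new DecimalFormat("##,###,###");
        // Converting to float rounds to the nearest value with a 24-bit significand
        float iAsFloat = Float.parseFloat("" + i);
        System.out.println("long i = " + i + " becomes " + myFormatter.format(iAsFloat));
    }
}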
the output is
long i = 16777225 becomes 16,777,224
So far, so normal. Our example integer is just above 2^24 = 16777216. Due to the 23 bit mantissa, between 2^23 and 2^24, a float can hold every integer. Then from 2^24 to 2^25, it can hold only even numbers, from 2^25 to 2^26 only numbers divisible by 4 and so on (also in the other direction: from 2^22 to 2^23, it can hold all multiples of 0.5). As long as the exponent isn't out of range, that's the rule of what a float can store.
16777225 is odd, so the "float version" is one off, because in that range (from 2^24 to 2^25) the "step size" of the float is 2.
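The step-size rule can be checked with a few lines of Python. This is a sketch assuming IEEE 754 single precision and normal (not subnormal) values; to_float32 and spacing are helper names invented for the example.

```python
import math
import struct


def to_float32(x: float) -> float:
    """Round a double to the nearest IEEE 754 single, returned as a double."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def spacing(x: float) -> float:
    """Gap between consecutive singles around x (normal values only)."""
    exponent = math.frexp(x)[1] - 1      # floor(log2(x)) for x > 0
    return 2.0 ** (exponent - 23)


n = 16777225.0
as_single = to_float32(n)

print(as_single, spacing(n), spacing(8388608.0), spacing(4194304.0))
```

The last two values show the step of 1 from 2^23 upward and the step of 0.5 from 2^22 upward.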
And now, what does MySQL make of it.
Here is the fiddle, in case you don't believe me (I wouldn't)
http://www.sqlfiddle.com/#!2/a42e79/1
CREATE TABLE IF NOT EXISTS `test` (
`test` float NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `test`(`test`) VALUES (16777225);
SELECT * FROM `test`;
result:
16777200
the result is off by 25 rather than 1, but has the "advantage" of being divisible by 100. Thanks a lot.
I think I understand the "philosophy" behind this utter nonsense, but I can't say I approve. Here is the "reason":
They don't want you to see the decimal places which could be wrong, which they accomplish by rounding the "actual" float value (as it is in JAVA and according to the industry standard) to some suitable power of ten.
In the example, if we leave it as it is, the last digit is wrong, without being a zero, and we can't have that.
Then, if we round to multiples of ten, the correct value would be 16777230, while the "actual" float would be rounded to 16777220. Now, the 7th digit is wrong (it wasn't wrong before, but now it is.) And it's not zero. We can't have that. Better round to multiples of 100. Now both the correct value and the "actual" float value round to 16777200. So you see only the 6 correct digits. You don't want to know the "24" at the end, telling you (since the step size is 2 in that range) that your original number must have been between 16777223 and 16777225. No, you don't want to know that; those 2 numbers differ in the 7th digit after rounding to the 7th digit, so you can't know the "proper" 7th digit, and hence you want to stop at the 6th digit. Anyway, that's what they think you want at MySQL.
So their goal is to round to some number of decimal digits (namely, 6), such that those 6 digits are always "correct", in that you'd have gotten the same 6 digits if you'd rounded the original exact number (before converting it to a float) to 6 digits. And since log_base10(2^23) = 6.92, rounded down to 6, I can see why they think that this will always work. Tragically, not even that is true.
example:
i = 33554450
the number is between 2^25 and 2^26, so the "float version" (that is the JAVA float version, not the MySQL float version) of it is the closest multiple of 4 (when it's right in the middle, the tie goes to the neighbour whose last mantissa bit is 0, here the smaller one), so that is
i_as_float = 33554448
i rounded to 6 decimals (i.e. to multiples of 100, since it's an 8 digit number) gives 33554500.
i_as_float rounded to 6 decimals gives 33554400
Oops! those differ at the 6th digit! But don't tell the MySQL people. They might just start "improving" 16777200 towards 16777000.
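Both examples can be replayed in Python. The sketch models the 6-digit display as rounding an 8-digit integer half-up to a multiple of 100, following the description above; that is an assumption about the behaviour, not MySQL's actual code, and the helper names are invented for the example.

```python
import struct


def to_float32(x: float) -> float:
    """Round a double to the nearest IEEE 754 single, returned as a double."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def round_to_hundreds(n: int) -> int:
    """Round an integer half-up to a multiple of 100 (6 of 8 digits kept)."""
    return (n + 50) // 100 * 100


results = {}
for n in (16777225, 33554450):
    as_single = int(to_float32(float(n)))
    # Pair: rounding the exact number vs rounding its float32 version.
    results[n] = (round_to_hundreds(n), round_to_hundreds(as_single))

print(results)
```

For 16777225 the two roundings agree; for 33554450 they differ in the 6th digit, which is the counterexample given above.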
UPDATE
other databases don't do it like that.
fiddle: http://www.sqlfiddle.com/#!15/d9144/1
