Storing statistical data, do I need DECIMAL, FLOAT or DOUBLE? - mysql

I am creating a site which hosts various tests. It is for fun, but I still want to approach it seriously. With these tests I hope to collect statistical data.
Some of the data will include the percentage of completeness of the tests, as they are timed. I can easily compute the percentages, but I would like accurate data to be returned when I store the various values concerning the tests on completion.
Most of the values are floats in PHP, so my question is: if I want true statistical data, should I store them in MySQL as FLOAT, DOUBLE or DECIMAL?
I would like to use MySQL's functions such as AVG() and LOG10() as well as TRUNCATE(). For MySQL to return true data based on the values that I insert, what should I use as the column type?
I ask because some numbers may or may not be floats, such as 10, 10.89, 99.09, or simply 0.
But I would like true and valid statistical data to be returned.
Can I rely on floating point math for this?
EDIT
I know this is a generic question, and I apologise, but as a non-mathematician who is also not a MySQL expert, I would like the opinion of an expert in this field.
I have done my research but I still feel I have a clouded judgement on the matter. Again I apologise if my question is off topic or not suitable for this site.

This link does a good job of explaining what you are looking for. Here is what it says:
All three of these types can be specified with the parameters (size, d), where size is the total size of the String and d represents precision. E.g., to store a number like 1234.567, you would set the datatype to DOUBLE(7, 3), where 7 is the total number of digits and 3 is the number of digits to follow the decimal point.
FLOAT and DOUBLE both represent floating point numbers. A FLOAT is for single-precision, while a DOUBLE is for double-precision numbers. A precision from 0 to 23 results in a 4-byte single-precision FLOAT column. A precision from 24 to 53 results in an 8-byte double-precision DOUBLE column. FLOAT is accurate to approximately 7 decimal places, and DOUBLE up to 14.
Decimal’s declaration and functioning is similar to Double. But there is one big difference between floating point values and decimal (numeric) values. We use DECIMAL data type to store exact numeric values, where we do not want precision but exact and accurate values. A Decimal type can store a Maximum of 65 Digits, with 30 digits after decimal point.
So, for the most accurate and precise value, Decimal would be the best option.
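The digit counts quoted above can be checked outside the database. The following is a minimal sketch, assuming that a MySQL FLOAT behaves like a 4-byte IEEE-754 single and a DOUBLE like an 8-byte IEEE-754 double; the helper name as_float32 is made up for illustration, and Python's struct module stands in for the column storage.

```python
import struct

def as_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

value = 1234.56789012345        # 15 significant digits
single = as_float32(value)      # what a 4-byte single-precision value can hold
double = float(repr(value))     # a Python float is already an 8-byte double

print(repr(single))             # agrees with `value` only in the leading digits
print(repr(double))             # round-trips all 15 digits
print(abs(single - value) / value)  # relative error, at most 2**-24
```

The relative error of the single-precision copy is bounded by 2**-24, which is where the figure of roughly 7 significant decimal digits comes from.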

Unless you are storing decimal data (i.e. currency), you should use a standard floating point type (FLOAT or DOUBLE). DECIMAL is a fixed point type, so it can overflow when computing things like SUM, and it will be ridiculously inaccurate for LOG10.
There is nothing "less precise" about binary floating point types; in fact, they will be much more accurate (and faster) for your needs. Go with DOUBLE.

DECIMAL: fixed-point type (exact value). Use it when you care about exact precision, as with money.
Example: salary DECIMAL(8,2), where 8 is the total number of digits and 2 is the number of decimal places. salary will be in the range of -999999.99 to 999999.99.
FLOAT, DOUBLE: floating-point types (approximate value). FLOAT uses 4 bytes to represent a value; DOUBLE uses 8 bytes.
Example: percentage FLOAT(5,2). As with DECIMAL, 5 is the total number of digits and 2 is the number of decimal places. percentage will store values between -999.99 and 999.99.
Note that these are approximate values. In this case:
A value like 1 / 3.0 = 0.3333333... will be stored as 0.33 (2 decimal places)
A value like 33.009 will be stored as 33.01 (rounded to 2 decimal places)
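The two rounding examples above can be reproduced with a short sketch. It only emulates the rounding to D decimal places using Python's decimal module; it is not how MySQL stores the value, and the function name round_to_scale and the choice of half-up rounding are assumptions made for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_to_scale(value: str, scale: int) -> Decimal:
    """Round a decimal string to `scale` digits after the decimal point."""
    quantum = Decimal(1).scaleb(-scale)   # scale=2 gives a quantum of 0.01
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)

print(round_to_scale("0.3333333", 2))  # 0.33
print(round_to_scale("33.009", 2))     # 33.01
```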

Put simply, FLOAT and DOUBLE are not as precise as DECIMAL. DECIMAL is recommended for money-related input (currency and salary).
Another point worth making: do NOT compare float numbers using "=" or "<>", because float numbers are not precise.
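The warning about "=" is not specific to MySQL; any language that uses binary floating point shows the same effect. A minimal Python sketch of the problem and of the usual workaround, comparing within a tolerance (the tolerance of 1e-9 is an arbitrary choice for illustration):

```python
import math

total = 0.1 + 0.2

print(total == 0.3)               # False: the sum carries binary rounding noise
print(math.isclose(total, 0.3))   # True: equal within a relative tolerance
print(abs(total - 0.3) < 1e-9)    # True: the same idea written by hand
```

The hand-written form on the last line translates directly into a SQL condition of the shape ABS(a - b) < some_tolerance.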

Linger: The website you mention and quote has, IMO, some imprecise info that confused me. In the docs I read that when you declare a FLOAT or a DOUBLE, the decimal point is in fact NOT counted. So M is not the number of characters in a string but the number of digits used.
Compare the docs:
"DOUBLE PRECISION(M,D).. Here, “(M,D)” means that values can be stored with up to M digits in total, of which D digits may be after the decimal point. For example, a column defined as FLOAT(7,4) will look like -999.9999 when displayed"
http://dev.mysql.com/doc/refman/5.1/en/floating-point-types.html
Also, the nomenclature is misleading. According to the docs, M is the 'precision' and D is the 'scale', whereas the website uses 'precision' for what the docs call 'scale'.
I thought this would be useful in case somebody like me was trying to get a picture.
Correct me if I'm wrong; I hope I haven't read some outdated docs :)

Float and Double are Floating point data types, which means that the numbers they store can be precise up to a certain number of digits only.
For example, in a table with a column of FLOAT type, if you store 7.6543219 it will be stored as 7.65432.
Similarly, the DOUBLE data type approximates values, but it has more precision than FLOAT.
When creating a table with a column of DECIMAL data type, you specify the total number of digits and the number of digits after the decimal point. If the number you store is within the range you specified, it will be stored exactly.
When you want to store exact values, DECIMAL is the way to go; it is what is known as a fixed-point data type.

Simply use FLOAT. And do not tack on '(m,n)'. Do display numbers to a suitable precision with formatting options. Do not expect to get correct answers with "="; for example, float_col = 0.12 will always return FALSE.
For display purposes, use formatting to round the results as needed.
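Both pieces of advice, never testing with "=" and rounding only for display, can be illustrated with a sketch that emulates a 4-byte FLOAT using Python's struct module. The helper name as_float32 is hypothetical, and the comparison is done in double precision, which is an assumption about how the database evaluates float_col = 0.12.

```python
import struct

def as_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

stored = as_float32(0.12)   # what a single-precision column would hold

print(stored == 0.12)       # False: the stored value differs from the double 0.12
print(f"{stored:.2f}")      # 0.12: formatting recovers the intended display value
```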
Percentages, averages, etc. are all rounded (at least in some cases), so any choice you make will sometimes have issues.
Use DECIMAL(m,n) for currency; use ...INT for whole numbers; use DOUBLE for scientific stuff that needs more than 7 digits of precision; use FLOAT for everything else.
Transcendentals (such as the LOG10 that you mentioned) will do their work in DOUBLE; they will essentially never be exact. It is OK to feed it a FLOAT arg and store the result in FLOAT.
This Answer applies not just to MySQL, but to essentially any database or programming language. (The details may vary.)
PS: (m,n) has been deprecated for FLOAT and DOUBLE (as of MySQL 8.0.17). It only added extra rounding and other things that were of essentially no benefit.

Related

Data type for price with positive number max. 4 digits before and after the comma [duplicate]

What will be the constraints of the values that can be entered in the column that was declared as FLOAT?

I am building a web app, and in one section a teacher inserts the expected results of a scientific experiment. These results must be very accurate; they might look like 0.4933546522886728. After searching for a while, FLOAT seems to be the right datatype to store these answers in the database. FLOAT columns in MySQL can be declared as FLOAT(n, d), where n is the total number of digits in the number and d is the number of digits after the decimal point. However, I do not know the number of digits the teacher will enter. So what would happen if I declared the column as plain FLOAT? What made me think of this is the following quote from the MySQL documentation:
For maximum portability, code requiring storage of approximate numeric data values should use FLOAT or DOUBLE PRECISION with no specification of precision or number of digits.
And what would be the maximum and minimum values that can be entered in this FLOAT column?
I also thought of using VARCHAR to store the exact number that the teacher enters. The number that the student enters would then be manipulated to match the stored number's precision before being compared with the right answer.
For example, if the teacher enters 1.23451 and the student enters 1.4235123, my code will make the latter 1.42351.
The (n,d) on the end of FLOAT and DOUBLE does not make sense. All it does is cause an extra rounding.
FLOAT provides about 7 significant decimal digits of precision and a modestly big exponent range. 0.4933546522886728 will be stored as about 0.4933546xxxxx, with the extra digits being noise.
That number can be stored in a DOUBLE, with a rounding error after 53 bits (about 16 digits) of precision.
There are very few scientific measurements that need more digits than available in the precision of FLOAT.
You can INSERT ... VALUES ( 0.4933546522886728 ) and put that into a FLOAT. It will get rounded to 24 significant bits. Ditto for 4933546522886.728 . Or 0.0000000004933546522886728 . Or 4.933546522886728e20 or 4.933546522886728e-20 .
Take whatever numbers you are given and simply put them in the INSERT without worrying about precision or scaling.
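To see that the position of the decimal point does not matter, the literals listed above can be pushed through a 32-bit float and compared with the original. This is a sketch that emulates FLOAT with Python's struct module (the helper name as_float32 is hypothetical); it measures the relative error, which stays within the 24-bit bound at every magnitude.

```python
import struct

def as_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

literals = [
    0.4933546522886728,
    4933546522886.728,
    4.933546522886728e20,
    4.933546522886728e-20,
]

relative_errors = []
for literal in literals:
    stored = as_float32(literal)
    relative_errors.append(abs(stored - literal) / literal)
    print(f"{literal!r:>24} -> {stored!r}")

# 24 significant bits means the relative error is at most 2**-24 each time.
print(max(relative_errors) <= 2 ** -24)  # True
```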
VARCHAR is the wrong way to go for numbers and dates, unless you want to store the raw input before it has been converted into the internal format.

Best data type to store money values in MySQL

I want to store many records in a MySQL database. All of them contains money values. But I don't know how many digits will be inserted for each one.
Which data type do I have to use for this purpose?
VARCHAR or INT (or other numeric data types)?
Since money needs an exact representation, don't use data types that are only approximate, like FLOAT. You can use a fixed-point numeric data type for that, like
decimal(15,2)
15 is the precision (total number of digits, including decimal places)
2 is the number of digits after the decimal point
See MySQL Numeric Types:
These types are used when it is important to preserve exact precision, for example with monetary data.
You can use DECIMAL or NUMERIC; both are the same. From the MySQL documentation:
The DECIMAL and NUMERIC types store exact numeric data values. These types are used when it is important to preserve exact precision, for example with monetary data. In MySQL, NUMERIC is implemented as DECIMAL, so the following remarks about DECIMAL apply equally to NUMERIC.
e.g. DECIMAL(10,2)
Good read
I prefer to use BIGINT and store the values multiplied by 100, so that they become integers.
For example, to represent a currency value of 93.49, the value is stored as 9349; when displaying the value, we divide by 100. This will occupy less storage space.
Caution:
We mostly don't perform currency * currency multiplication, but if we do, the result must be divided by 100 before it is stored, so that it returns to the proper precision.
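The scaled-integer approach and its multiplication caveat can be sketched as follows. The names SCALE, to_minor_units and format_minor_units are made up for illustration, the formatter assumes non-negative amounts, and the integer division truncates rather than rounds, which a real application would need to decide on explicitly.

```python
from decimal import Decimal

SCALE = 100  # hypothetical: two decimal places, i.e. cents

def to_minor_units(amount: str) -> int:
    """Convert a decimal string such as '93.49' to an integer number of cents."""
    return int(Decimal(amount) * SCALE)

def format_minor_units(units: int) -> str:
    """Format non-negative integer cents for display."""
    return f"{units // SCALE}.{units % SCALE:02d}"

price = to_minor_units("93.49")      # stored as 9349
factor = to_minor_units("1.50")      # stored as 150

# Multiplying two scaled values scales twice, so divide once to compensate.
product = price * factor // SCALE    # 1402350 // 100

print(price)                         # 9349
print(format_minor_units(price))     # 93.49
print(format_minor_units(product))   # 140.23 (the exact product is 140.235)
```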
It depends on your need.
Using DECIMAL(10,2) is usually enough, but if you need slightly more precise values you can use DECIMAL(10,4).
If you work with big values replace 10 with 19.
If your application needs to handle money values up to 99,999,999,999.99 (just under a hundred billion), then this should work: DECIMAL(13,2)
If you need to comply with GAAP (Generally Accepted Accounting Principles), then use: DECIMAL(13,4)
Usually you should sum your money values at DECIMAL(13,4) before rounding the output to DECIMAL(13,2).
At the time this question was asked, nobody was thinking about the Bitcoin price. In the case of BTC, DECIMAL(15,2) is probably insufficient. If Bitcoin rises to $100,000 or more, we will need at least DECIMAL(18,9) to support cryptocurrencies in our apps.
DECIMAL(18,9) takes 8 bytes of space in MySQL (4 bytes for each group of 9 digits, counted separately for the integer and fractional parts).
We use double.
*gasp*
Why?
Because it can represent any 15 digit number with no constraints on where the decimal point is. All for a measly 8 bytes!
So it can represent:
0.123456789012345
123456789012345.0
...and anything in between.
This is useful because we're dealing with global currencies, and double can store the various numbers of decimal places we'll likely encounter.
A single double field can represent 999,999,999,999,999 Japanese yen, 9,999,999,999,999.99 US dollars and even 9,999,999.99999999 bitcoins.
If you try doing the same with decimal, you need decimal(30, 15), which costs 14 bytes.
Caveats
Of course, using double isn't without caveats.
However, it's not loss of accuracy as some tend to point out. Even though double itself may not be internally exact to the base 10 system, we can make it exact by rounding the value we pull from the database to its significant decimal places. If needed that is. (e.g. If it's going to be outputted, and base 10 representation is required.)
The caveats are, any time we perform arithmetic with it, we need to normalize the result (by rounding it to its significant decimal places) before:
Performing comparisons on it.
Writing it back to the database.
Another caveat is that, unlike decimal(m, d), where the database will prevent programs from inserting a number with more than m digits, no such validation exists with double. A program could insert a user-entered value of 20 digits and it would be silently recorded as an inaccurate amount.
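Both caveats can be demonstrated with Python floats, which are 8-byte doubles. This is a sketch only: rounding to 2 places stands in for "normalizing to the significant decimal places", and the 20-digit number is an arbitrary example.

```python
# Arithmetic on doubles can leave binary noise in the last digits,
# so normalize by rounding before comparing or writing back.
balance = 0.10 + 0.20
print(balance == 0.30)             # False
print(round(balance, 2) == 0.30)   # True

# A 20-digit amount does not fit in the digits a double can carry,
# and nothing complains when it is converted.
typed = 12345678901234567890
stored = float(typed)
print(int(stored) == typed)        # False: the low digits were silently lost
```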
If GAAP Compliance is required or you need 4 decimal places:
DECIMAL(13, 4)
Which supports a max value of:
$999,999,999.9999
Otherwise, if 2 decimal places are enough:
DECIMAL(13,2)
src: https://rietta.com/blog/best-data-types-for-currencymoney-in/
Indeed, this depends on the programmer's preferences. I personally use numeric(15,4) to conform to the Generally Accepted Accounting Principles (GAAP).
Try using
Decimal(19,4)
this usually works with every other DB as well
Storing money as BIGINT multiplied by 100 or more in order to use less storage space makes no sense in all "normal" situations.
To stay aligned with GAAP, it is sufficient to store currencies in DECIMAL(13,4).
The MySQL manual states that DECIMAL needs 4 bytes per 9 digits.
https://dev.mysql.com/doc/refman/8.0/en/precision-math-decimal-characteristics.html
DECIMAL(13,4) represents 9 digits + 4 fraction digits (decimal places) => 4 + 2 bytes = 6 bytes
compare to 8 bytes required to store BIGINT.
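The byte counts quoted in this thread can be recomputed with a small helper. This is a sketch of the storage rule as described in the MySQL manual page linked above: the integer and fractional parts are counted separately, each full group of 9 digits takes 4 bytes, and leftover digits take 1 to 4 bytes. The function name and the lookup table layout are made up for illustration.

```python
# Bytes needed for the digits left over after packing full groups of 9.
LEFTOVER_BYTES = {0: 0, 1: 1, 2: 1, 3: 2, 4: 2, 5: 3, 6: 3, 7: 4, 8: 4}

def decimal_storage_bytes(m: int, d: int) -> int:
    """Estimate the storage size of DECIMAL(m, d) in bytes."""
    def part_bytes(digits: int) -> int:
        full_groups, leftover = divmod(digits, 9)
        return full_groups * 4 + LEFTOVER_BYTES[leftover]

    # Integer digits (m - d) and fractional digits (d) are packed separately.
    return part_bytes(m - d) + part_bytes(d)

print(decimal_storage_bytes(13, 4))   # 6
print(decimal_storage_bytes(30, 15))  # 14
print(decimal_storage_bytes(19, 4))   # 9
```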
There are 2 valid options:
use integer amount of currency minor units (e.g. cents)
represent amount as decimal value of the currency
In both cases you should use decimal data type to have enough significant digits. The difference can be in precision:
even for an integer amount of minor units, it's better to have extra precision for accumulators (to account for accumulating 10% fees from 1-cent operations)
different currencies have different number of decimals, cryptocurrencies have up to 18 decimals
The number of decimals can change over time due to inflation
Source and more caveats and facts.
Multiply by 10000 and store as BIGINT, like "Currency" in Visual Basic and Office. See https://msdn.microsoft.com/en-us/library/office/gg264338.aspx

MySQL - which datatype is best for values from 0 to 30.25 with no more than 2 decimal places of precision?

I set up a database/website recently where the members have points scored against them.
There are 3 points fields (corresponding to different activities), and the sum of those 3 fields is their total points.
Initially, I understood they'd always be whole numbers totalling no more than 30, so I set the points fields to INT.
Now they need to be able to have quarter (.25) and half (.5) points assigned.
Am I best to change these points fields to FLOAT(2,2)?
I would use a DECIMAL(4,2). 4 is the precision (the total number of digits); 2 is the scale (the number of digits to the right of the decimal point).
From the MySQL Reference:
Fixed-Point (Exact-Value) Types
The DECIMAL and NUMERIC types store exact numeric data values. These types are used when it is important to preserve exact precision, for example with monetary data. In MySQL, NUMERIC is implemented as DECIMAL, so the following remarks about DECIMAL apply equally to NUMERIC.
Alternatively, you could just store an int that represents 4 times the "actual" score.
Example: 4.25 would be represented in the database as 17.
Depending on what you are doing, it may be easier to store the points as .25->1, .5->2, 1->4 (as in the number of quarters). That way you can still use an int and just format the output when it is displayed.
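The quarter-point encoding suggested above might look like this in application code. It is a minimal sketch: the function names are hypothetical, and rejecting values that are not multiples of 0.25 is a design choice made here rather than something the answers specify.

```python
def to_quarters(points: float) -> int:
    """Convert a score such as 4.25 to a whole number of quarter points."""
    quarters = points * 4
    if quarters != int(quarters):
        raise ValueError(f"{points} is not a multiple of 0.25")
    return int(quarters)

def format_points(quarters: int) -> str:
    """Turn stored quarter points back into a display string."""
    return f"{quarters / 4:.2f}"

stored = [to_quarters(p) for p in (4.25, 10.5, 7.0)]

print(stored)                       # [17, 42, 28]
print(format_points(sum(stored)))   # 21.75
```

Because the stored values are plain integers, summing the three fields is exact and the division by 4 happens only at display time.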
Short answer: Yes.
Yes, if you want to have decimal points you can use FLOAT(M,D), REAL(M,D) or DOUBLE PRECISION(M,D). However, there are some known issues with MySQL FLOAT, which depend more or less on the platform.
There is automatic rounding with a FLOAT field, which could be a good or a bad thing depending on what you want. For example, if you insert 999.00009 into a FLOAT(7,4) column, the approximate result is 999.0001.
You can use DECIMAL(M,D) (fixed-point representation) for accuracy; otherwise FLOAT is also a good choice.

MySQL FLOAT & decimals

Datatype of field in the DB is FLOAT and the value is 18.7. I'd like to store and display this on page as 18.70. Whenever I enter the extra 0 it still only stores it as 18.7
How can I store the extra 0 ? I can change the data type of the field.
In a FLOAT column, what MySQL stores for 18.7 is actually:
01000001 10010101 10011001 10011010
which, being retrieved from the DB and converted back into your display format, is 18.7.
In reality, the stored value is a binary fraction represented by the decimal number 18.70000076293945 which you can see by issuing this query:
CREATE TABLE t_f (value FLOAT);
INSERT
INTO t_f
VALUES (18.7);
SELECT CAST(value AS DECIMAL(30, 16))
FROM t_f;
The IEEE-754 representation stores numbers as binary fractions, so a value like 0.1 would need an infinitely repeating binary fraction and hence cannot be stored exactly.
DECIMAL, on the other hand, stores decimal digits, packing 9 digits into 4 bytes.
Floating point types do not store insignificant zeros, either on the left of the number before the decimal point or on the right of the number after the last decimal digit. You'll need to use a string-based type (or store the precision in a separate field) if you want to store the exact numeric string entered by the user and be able to distinguish 12.7 from 12.70. You can, however, round the values you display to two digits in your application.
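The bit pattern and the stored value shown above can be reproduced without a database. The sketch below uses Python's struct module to pack 18.7 as a big-endian IEEE-754 single, which is assumed to match what a FLOAT column holds, and then formats the value to two places for display.

```python
import struct

raw = struct.pack(">f", 18.7)                    # 4 bytes, big-endian single
bits = " ".join(f"{byte:08b}" for byte in raw)
stored = struct.unpack(">f", raw)[0]             # the value actually held

print(bits)               # 01000001 10010101 10011001 10011010
print(f"{stored:.14f}")   # 18.70000076293945
print(f"{stored:.2f}")    # 18.70
```

The last line is the application-side rounding mentioned above: the trailing zero is produced when formatting, not stored.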
If two decimal places are needed, use:
decimal(n,2), where n >= 2
The DECIMAL data type will preserve the decimal-place formatting and gives more accurate results than the FLOAT and DOUBLE data types.
Are you attempting to store a currency as a float? If so, please use a decimal with more than 2 decimal digits.
You really want fixed-point arithmetic on currencies.
This is just a very broad rule of thumb and my own observation, but in regular business logic as serialized in a database, you almost never want floating point. I know there are lots of exceptions, but I'm suspicious whenever I see a float-typed column in a table because of this. I'd be interested in what others have found.