I have the following SQL query.
SELECT SUM(final_insurance_total) as total
FROM `leads`
GROUP BY leads.status
I have a single row of data in the leads table with a value for final_insurance_total of 458796. The data type for final_insurance_total is FLOAT.
For some reason, MySQL is summing a single row as "458796.375".
If I change the query to
SELECT (final_insurance_total) as total
FROM `leads`
GROUP BY leads.status
the correct value is returned. What in the world is going on?
The FLOAT and DOUBLE types in MySQL (as in other databases and programming language runtimes) use a binary floating-point representation, so the values stored are approximations, not exact values. See the MySQL docs, as well as general information on floating-point arithmetic.
In order to store and operate with exact values, use the type DECIMAL (see https://dev.mysql.com/doc/refman/5.1/en/precision-math-decimal-characteristics.html).
EDIT: I have run some tests, and while floating-point precision errors are quite common, this particular one looks to be specific to the implementation of SUM() in MySQL. In other words, it is a bug that has been there for a long time. In any case, you should use DECIMAL as your field type.
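To make the approximation point concrete outside MySQL, here is a small Python sketch. It assumes the column is a 4-byte single-precision float and that a plain SELECT prints it with about six significant digits; both are assumptions about the setup in the question, and to_float32 is a made-up helper, not a MySQL function. Under those assumptions a stored value of 458796.375 would display as 458796 until something widens it to double precision.

```python
import struct

def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float, then widen it back."""
    return struct.unpack("f", struct.pack("f", x))[0]

# What a 4-byte FLOAT would hold if 458796.375 had been inserted (assumed value).
stored = to_float32(458796.375)

# The value fits in single precision without any rounding.
exactly_representable = stored == 458796.375

# Assumed display rule for a plain SELECT: six significant digits.
shown_by_select = format(stored, ".6g")

# An addition in double precision (0 + value) keeps every stored digit.
shown_by_sum = repr(0.0 + stored)

print(exactly_representable, shown_by_select, shown_by_sum)
```

If that is what is happening, the fractional part was in the column all along and SUM() is only making it visible.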
FLOAT does not guarantee precision where any calculation is made. If you use a simple SELECT, no calculation is made, so you get the original value. But if you use SUM(), even with one row, at least one addition is executed (0 + current_value).
Do you really need FLOAT? For example, if you have 2 decimal digits, you could use INT and multiply all values by 100 before all INSERTs. When SELECTing results, you will divide by 100.
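A rough sketch of the multiply-by-100 idea, written in Python rather than SQL. The helper names to_cents and from_cents are made up for illustration, and the amounts are assumed to have at most two decimal digits.

```python
from decimal import Decimal

def to_cents(amount: str) -> int:
    """Turn a two-decimal amount such as '19.99' into an integer number of cents."""
    return int(Decimal(amount) * 100)

def from_cents(cents: int) -> str:
    """Turn an integer number of cents back into a two-decimal string."""
    return str(Decimal(cents).scaleb(-2))

prices = ["19.99", "0.10", "0.20"]

# Integer addition is exact, so the aggregate cannot drift.
total_cents = sum(to_cents(p) for p in prices)
total = from_cents(total_cents)

# The same small amounts added as binary floats do not land on 0.30.
float_is_exact = (0.10 + 0.20) == 0.30

print(total_cents, total, float_is_exact)
```

The values would be stored in an INT column in cents and only converted back when they are displayed.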
If you are not a sysadmin and cannot change the data type of a FLOAT column, you can use CAST in the query to produce the desired output.
I have members on my team who are tossing around the idea of making every column in the database a string, including numeric columns.
I know that sorting becomes an issue.
What are the other downsides of making a numeric column a string?
The major issue is that users can put broken data into the columns -- data that is not numeric. That is simply not possible with the correct type. Although you could add a check constraint for every numeric column, that seems like a lot of work.
The scenario is: You have a query that works and has worked for a long time. All of a sudden, someone puts a non-numeric value into the column. The query breaks. And because the query was (probably) using implicit conversion, it is really hard to tell where the problem is.
Let me just say: I am speaking from experience here.
Other problems are:
Comparisons don't work as expected: '0' <> '0.0'.
Comparisons don't work as expected: '9' > '100'.
Comparisons don't work as expected: '.1' < '0.01'.
Sorting doesn't work as expected.
The code is filled with (unnecessary and typically implicit) conversions.
Some databases, such as SQL Server, overload operators, so '1' + '1' is the string '11', not 2.
Some databases overload operators, so current_timestamp + 1 is valid but current_timestamp + '1' is invalid.
A comparison in a query can affect index usage. So, strcol = 1 ends up converting strcol to a number, which typically precludes the use of an index. On the other hand, intcol = '1' ends up converting the constant to a number, which still allows the index to be used. I do not recommend mixing types in comparisons, though.
Space is a wash, because in many cases the string representation might be smaller than the number representation; it depends on the values. There is a slight hit on indexing, because fixed-length keys are usually more efficient.
If you mix types, things get worse -- because that affects the optimizer.
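The comparison and sorting pitfalls listed above are easy to reproduce with plain strings. This is a Python sketch, not SQL; it assumes a simple character-by-character comparison, which is what a typical collation does for these characters.

```python
from decimal import Decimal

# Each comparison from the list, evaluated on strings.
text_says = {
    "'0' <> '0.0'": "0" != "0.0",
    "'9' > '100'": "9" > "100",
    "'.1' < '0.01'": ".1" < "0.01",
}

# The same comparisons evaluated on exact numeric values.
number_says = {
    "0 <> 0.0": Decimal("0") != Decimal("0.0"),
    "9 > 100": Decimal("9") > Decimal("100"),
    ".1 < 0.01": Decimal(".1") < Decimal("0.01"),
}

values = ["10", "9", "2", "100"]
sorted_as_text = sorted(values)             # character order
sorted_as_numbers = sorted(values, key=int) # numeric order

print(text_says)
print(number_says)
print(sorted_as_text, sorted_as_numbers)
```

Every string comparison comes out true and every numeric one comes out false, which is exactly the kind of surprise that stays hidden until the data happens to contain such values.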
Some things that are composed of numbers are not necessarily numeric. You can usually tell the difference easily: does it make sense to perform arithmetic operations on the value? Or another indicator: do leading zeros make sense?
It will take more space.
Indexes will also take more space and be less efficient.
Ordering will not work correctly (e.g. "10" < "2").
Any numeric operations will not work correctly (e.g. 10% more than x).
Having said all this, fields like SSN, phone number, etc. that appear numeric but are not really numbers should be strings.
In general, if the numeric column is an ID and never used for calculations, it is probably OK. If the numbers are "measures", like amount or quantity, I would not recommend it, as you will most likely want to do calculations at some point (like SUM, AVG, etc.).
I ran into this with an externally designed database and faced a lot of challenges:
Date and numeric columns had to be converted during querying.
Indexes took more space and performed more slowly.
I save very large numbers in my database (usually with 50+ digits) and then need to run queries like this (id is the name of the column where the numbers are saved):
WHERE id % 2 = 0
I tried to use varchar data type for this column and although no error is generated while running the query, the returned result is mathematically wrong (the returned ids are not even).
Does MySQL convert the varchar to int while running my query, and is overflow the reason for the wrong results?
What is the best choice for saving such large numbers so that I can do arithmetic on them later? If decimal is the best candidate, then what if I need to save numbers with 100 digits?
Your only choice is to cast to a numeric/decimal value explicitly. In MySQL, that supports up to 65 digits of precision.
Here is a db<>fiddle with an example. Or an example using %.
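To see why an implicit conversion breaks the parity test while an explicit decimal cast does not, here is a Python sketch. It assumes the string is implicitly converted to a double-precision float before the % is applied; that is my reading of the behaviour, not something confirmed above. The 65-digit precision mirrors the DECIMAL limit mentioned in the answer.

```python
from decimal import Decimal, localcontext

# A 51-digit odd number, stored as text like the varchar column.
big_odd = "1" + "0" * 49 + "1"

# Assumed implicit path: text -> double. Doubles this large are all even
# integers, so the last digit is lost and the remainder is always zero.
double_remainder = float(big_odd) % 2

# Explicit path: text -> exact decimal with up to 65 significant digits.
with localcontext() as ctx:
    ctx.prec = 65
    exact_remainder = Decimal(big_odd) % 2

print(double_remainder, exact_remainder)
```

With the float path every id looks even, which matches the symptom of odd ids being returned by id % 2 = 0.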
I have a table with a high-precision value stored as FLOAT. When I query the table for that value, it returns a rounded value, rounded to the 1st digit. But when I run the query below, I get the value that I stored:
SELECT MY_FLOAT_COL*1 FROM MY_TABLE;
What's going on inside MySQL?
If you want to store exact values, you'd use the DECIMAL data types.
From the manual on FLOAT:
The FLOAT and DOUBLE types represent approximate numeric data values. MySQL uses four bytes for single-precision values
The key word here is "approximate".
You can read more on floats here: http://dev.mysql.com/doc/internals/en/floating-point-types.html
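To give a feel for how approximate four bytes are compared with eight, this Python sketch measures how far the single-precision and double-precision versions of 0.1 are from the exact decimal value. The value 0.1 is just an example I picked; it does not come from the question.

```python
import struct
from decimal import Decimal

# 0.1 rounded to a 4-byte float, then widened back so Python can hold it.
single = struct.unpack("f", struct.pack("f", 0.1))[0]
# 0.1 as an 8-byte float (Python's native float).
double = 0.1

# Decimal(float) converts the binary value exactly, so these are true errors.
error_single = abs(Decimal(single) - Decimal("0.1"))
error_double = abs(Decimal(double) - Decimal("0.1"))

print(error_single, error_double)
```

Neither error is zero, and the single-precision one is much larger, which is why a FLOAT column can hold something slightly different from what was inserted.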
I'm trying to store a currency value in MySQL (InnoDB) and I need to write queries to aggregate some records (sum), and I'm having a problem with the precision of the output!
I've set the field's type to double and my values are somewhat precise, but MySQL's operators are not as precise as I need. For what it is worth, PHP's default operators are not precise enough either, but there are bc* functions in PHP which can do the trick.
I was wondering if there's any way to tune the precision of MySQL operators? Including aggregation functions?
For the record, storing to and retrieving from MySQL won't affect my values which means double is an ideal type for my fields.
Since money needs an exact representation, don't use data types that are only approximate, like double, which is a floating-point type. You can use a fixed-point numeric data type for that, like
numeric(15,2)
15 is the precision (total length of value including decimal places)
2 is the number of digits after decimal point
See MySQL Numeric Types:
These types are used when it is important to preserve exact precision, for example with monetary data.
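The difference between approximate and fixed-point aggregation shows up with very little data. This is a Python sketch using the decimal module as a stand-in for a fixed-point column; the ten amounts of 0.10 are made up for the example.

```python
from decimal import Decimal

amounts = ["0.10"] * 10

# Approximate: add the values one by one as binary doubles.
float_total = 0.0
for a in amounts:
    float_total += float(a)

# Exact: add the same values as fixed-point decimals.
decimal_total = Decimal("0.00")
for a in amounts:
    decimal_total += Decimal(a)

print(float_total == 1.0, decimal_total)
```

The double total misses 1.0 by a tiny amount, while the decimal total is exactly 1.00 with the scale preserved.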
Good day, I am confused with the datatype for MySQL.
I am using decimal as apparently it is the safest bet for accuracy in a business application. However, I find that when fields are returned I have values of 999999999.99, where my datatype is DECIMAL(10,2). So the actual value has overflowed outside the (10, 2) parameter.
Would it be correct that, even though I have specified 10 places before the comma and 2 places after the comma, MySQL still stores the complete number?
Also would it be possible to turn off the maximum amount of digits displayed before and after the comma?
Would it be correct that, even though I have specified 10 places before the comma and 2 places after the comma, MySQL still stores the complete number?
No, it wouldn't.
First, you specified 10 digits altogether; two are to the right of the decimal point, and eight are to the left.
The MySQL manual illustrates this with a salary column declared as DECIMAL(5,2): standard SQL requires that DECIMAL(5,2) be able to store any value with five digits and two decimals, so values that can be stored in the salary column range from -999.99 to 999.99.
Second, MySQL will silently convert the least significant digits to scale if there are more than two. That will probably look like MySQL truncates, but the actual behavior is platform-dependent. It will raise an error if you supply too many of the most significant digits.
Finally, when you're working with databases, the number of digits displayed has little to do with what a data type is or with what range of values it stores.
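The precision and scale arithmetic can be checked with a few lines of Python. decimal_range is a made-up helper for illustration; it only computes the bounds implied by "precision digits in total, scale of them after the decimal point".

```python
from decimal import Decimal

def decimal_range(precision: int, scale: int):
    """Return (lowest, highest) value for a DECIMAL(precision, scale) column."""
    # Largest value is all nines, with the decimal point moved left by `scale`.
    highest = Decimal(10 ** precision - 1).scaleb(-scale)
    return -highest, highest

low_5_2, high_5_2 = decimal_range(5, 2)
low_10_2, high_10_2 = decimal_range(10, 2)

# The value from the question has nine digits before the decimal point.
fits = abs(Decimal("999999999.99")) <= high_10_2

print(high_5_2, high_10_2, fits)
```

So DECIMAL(10,2) tops out at 99999999.99, and the value quoted in the question does not fit in that range.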