How to avoid double calculations within CASE statements in MySQL?

I am wondering if it is possible to avoid a second equal calculation within a CASE statement of MySQL 5.7?
CASE
WHEN char_length(cat.DESCRIPTION) > 0 THEN char_length(cat.DESCRIPTION)
ELSE ''
END AS D_LENGTH
The second char_length seems redundant to me and might be reducing query performance. Is there a way to improve this?

Since you seem to want to display an empty string when the character length of the column is zero, you could try a TRIM trick here:
SELECT TRIM(LEADING '0' FROM CHAR_LENGTH(cat.DESCRIPTION)) AS D_LENGTH
FROM yourTable cat;
This works because whenever the character length of the description is greater than zero, it never has any leading zeroes. When that length is actually zero, the call to TRIM above just strips off the single zero, leaving behind an empty string. One difference from the CASE version: if DESCRIPTION is NULL, CHAR_LENGTH returns NULL and so does TRIM, whereas the CASE expression falls through to ELSE and returns ''.
Regarding your current approach, no, there isn't much you can do directly to avoid the double call to CHAR_LENGTH. But, as shown above, there are alternatives which avoid the duplication completely.
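To see why stripping leading zeroes only affects a length of zero, here is a minimal Python sketch of the same idea. It is an analogy, not MySQL code: `length_or_blank` is a hypothetical helper, Python's `len` stands in for CHAR_LENGTH, and NULL handling is not modeled.

```python
def length_or_blank(description: str) -> str:
    """Mimic TRIM(LEADING '0' FROM CHAR_LENGTH(description)) for non-NULL input."""
    # str(len(...)) never starts with '0' unless the length itself is 0,
    # so stripping leading zeroes only changes the zero-length case.
    return str(len(description)).lstrip("0")


print(repr(length_or_blank("")))         # ''
print(repr(length_or_blank("abc")))      # '3'
print(repr(length_or_blank("a" * 100)))  # '100' (trailing zeroes are kept)
```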

You may try to use an intermediate user-defined variable:
CASE
WHEN (@tmp:=char_length(cat.DESCRIPTION)) > 0
THEN @tmp
ELSE ''
END AS D_LENGTH
On a "clean" test model (a table with one varchar column, data lengths 20-250, 1 million rows, no indexes, the whole table cached) it takes ~15% less time to execute on my system (5.11-5.32s against 5.97-6.23s).
The query produces the warning "1287 Setting user variables within expressions is deprecated and will be removed in a future release. Consider alternatives: 'SET variable=expression, ...', or 'SELECT expression(s) INTO variables(s)'." You can ignore it for now, but keep in mind that the feature is deprecated.

First, the performance cost of evaluating any expression is much less than the cost of the rest of the query. So don't bother with such optimization.
Second, to rise to the challenge:
mysql> SELECT COALESCE(NULLIF(CHAR_LENGTH('asdf'), 0), 'empty');
+---------------------------------------------------+
| COALESCE(NULLIF(CHAR_LENGTH('asdf'), 0), 'empty') |
+---------------------------------------------------+
| 4 |
+---------------------------------------------------+
1 row in set (0.00 sec)
mysql> SELECT COALESCE(NULLIF(CHAR_LENGTH(''), 0), 'empty');
+-----------------------------------------------+
| COALESCE(NULLIF(CHAR_LENGTH(''), 0), 'empty') |
+-----------------------------------------------+
| empty |
+-----------------------------------------------+
1 row in set (0.00 sec)

Related

How to store a decimal calculation result in MySQL and retrieve it exactly as it was in memory

MySQL documentation says :
The DECIMAL and NUMERIC types store exact numeric data values. These types are used when it is important to preserve exact precision, for example with monetary data.
I did this test on that column decimal_column DECIMAL(31,30).
insert into tests (decimal_column) values(1/3);
then inspecting what has been stored gives this
select * from tests ;
> result : 0.333333333000000000000000000000
then reversing the math operation with this query gives this
select decimal_column*3 from tests;
> result: 0.999999999000000000000000000000
I was expecting to get the integer "1", as we do on a calculator or in an Excel sheet, like this:
#calculator or excel sheet
>input: 1 / 3
>result: 0.3333333333333333333333333333333333
>input: * 3
>result: 1
1- Why didn't MySQL store the exact binary representation of (1 / 3), so that I can reuse the result in my calculations as it is in memory, like a calculator or an Excel sheet?
2- How can I store the result of (1/3) in MySQL as it is in memory at calculation time, so that I can retrieve the exact value and do something like 3 * $storedValue to get the integer 1, as in a calculator or an Excel sheet?
The problem is not in storage. In your example, the value was broken before storing it in the table.
Unfortunately, if you write 1/3, that will be calculated (and inserted) using the default representation:
SELECT 1/3
0.333333333
which, as you can see, has insufficient precision.
An additional problem is that when you send a constant (1, or 3) to the server, you do so using a library or a connector, which can take liberties with the value. For example, it might believe that "1" and "3" are integers and their result is to be treated as an integer. So you get "1/3 = 0", but "1./3 = 0.333333", because the dot in "1." makes the connector realize that it needs to use its default floating point. And you get only six 3's because the "default floating point" of the connector has 6 digits. Then you store it into the database, but it is too late. You're storing with high precision a value that has been truncated to low precision.
You can try casting constants from the beginning. Instead of "1", you use the casting of 1 as a sufficiently large decimal. I'm using your 31,30 here, but check that you don't need to store larger numbers. Possibly, "31,20" would be better.
mysql> SELECT 1/3 UNION SELECT CAST(1 AS DECIMAL(31,30))/CAST(3 AS DECIMAL(31,30));
+----------------------------------+
| 1/3 |
+----------------------------------+
| 0.333333333000000000000000000000 |
| 0.333333333333333333333333333333 |
+----------------------------------+
2 rows in set (0.00 sec)
It is very awkward, but results should be better. Also, I think that it's only necessary to cast one value in an expression; MySQL will then promote all involved quantities as necessary. So, adding CAST(0 AS DECIMAL(x,y)) to sums and CAST(1 AS DECIMAL(x,y)) to multiplications might be enough.
mysql> SELECT 3*CAST(1 AS DECIMAL(31,30))/CAST(3 AS DECIMAL(31,30));
+-------------------------------------------------------+
| 3*CAST(1 AS DECIMAL(31,30))/CAST(3 AS DECIMAL(31,30)) |
+-------------------------------------------------------+
| 1.000000000000000000000000000000 |
+-------------------------------------------------------+
1 row in set (0.00 sec)
mysql> SELECT CAST(1 AS DECIMAL(31,30))*1/3;
+----------------------------------+
| CAST(1 AS DECIMAL(31,30))*1/3 |
+----------------------------------+
| 0.333333333333333333333333333333 |
+----------------------------------+
1 row in set (0.00 sec)
Note that this doesn't work, because division has higher precedence than addition, so 1/3 is evaluated first at the default precision and only then added to the cast value:
mysql> SELECT CAST(0 AS DECIMAL(31,30))+1/3;
+----------------------------------+
| CAST(0 AS DECIMAL(31,30))+1/3 |
+----------------------------------+
| 0.333333333000000000000000000000 |
+----------------------------------+
1 row in set (0.00 sec)
It depends on what you need the information for.
If it is stored for calculation and storage, calculate the result (0.33 instead of 1/3) and store that as a decimal with enough scale, e.g. decimal(6,5). But you can't easily calculate it backwards if you want to display it.
If it is stored to be displayed at a later time but never to be modified again (or at least not fast) you can store it as varchar but that will break sorting.
Or you could store the different elements (positives, negatives, totals, ... whatever) as decimal(5,0) and display / calculate it while using it.
And of course you can combine the above if you want to get the edge out of the computing time while selecting.
MySQL does its approximate-value (FLOAT and DOUBLE) arithmetic using 64-bit IEEE 754 floating point numbers.
They are generally approximations. It's entirely unreasonable to expect IEEE floating point, or decimal arithmetic for that matter, to achieve exact equality when doing
3*(1/3) == 1
That's just not the way computer arithmetic works.
There's also no way to store an exact representation of the value 1/3, unless you happen to be using an exotic computing system that stores rational (fractional) numbers in the form of (numerator,denominator) pairs. MySQL isn't such a system.
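The contrast between fixed-precision decimals and (numerator, denominator) pairs can be sketched in Python. This is only an analogy: Python's `decimal` context precision counts significant digits and is not the same thing as MySQL's DECIMAL(31,30), and the variable names are illustrative.

```python
from decimal import Decimal, localcontext
from fractions import Fraction

# Fixed-precision decimal: 1/3 is cut off after 30 digits,
# so multiplying back by 3 cannot reach exactly 1.
with localcontext() as ctx:
    ctx.prec = 30
    third = Decimal(1) / Decimal(3)
    back = third * 3

# Rational arithmetic keeps the exact pair (1, 3), so the round trip is exact.
exact_third = Fraction(1, 3)

print(third)                 # 0.333333333333333333333333333333
print(back)                  # 0.999999999999999999999999999999
print(back == 1)             # False
print(exact_third * 3 == 1)  # True
```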
Most calculators implicitly round their results to the number of digits they can display. Excel, too, contains a formatting module. You can choose the format for a cell by pressing Ctrl+1. The formatting module rounds these floating point numbers. You can achieve the same effect in MySQL using the ROUND() function. That function doesn't change the stored value, but it does render it in a way that conceals the tiny errors inherent in IEEE 754 floating arithmetic.
SELECT ROUND(decimal_column, 2) AS decimal_column
(Don't accounting people have to learn this stuff in school?)
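The rounding-for-display idea can be illustrated with a small Python sketch. It assumes the truncated value 0.333333333 from the question and uses `Decimal.quantize` as a stand-in for MySQL's ROUND(); the stored value itself is not changed.

```python
from decimal import Decimal, ROUND_HALF_UP

stored = Decimal("0.333333333")  # the truncated value from the question
product = stored * 3             # still carries the truncation error

# Round only for display, as ROUND(decimal_column * 3, 2) would.
shown = product.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

print(product)  # 0.999999999
print(shown)    # 1.00
```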

How to know CAST failed in MySQL

Could somebody tell me how I can detect if a cast failed in MySQL using CAST() function?
These two lines return the same value: 0.
SELECT CAST('Banana' AS UNSIGNED INTEGER) AS 'CAST1';
SELECT CAST('0' AS UNSIGNED INTEGER) AS 'CAST2';
You can use regular expressions to validate the data before the conversion:
select (case when val regexp '^[0-9]+$' then cast(val as unsigned integer) end)
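The same validate-then-convert pattern, sketched in Python for comparison. `to_unsigned` is a hypothetical helper; it returns `None` where the CASE expression above returns NULL, so a failed conversion can be told apart from a genuine 0.

```python
import re
from typing import Optional

# Same pattern as the REGEXP above: one or more ASCII digits, nothing else.
UNSIGNED_INT = re.compile(r"[0-9]+")


def to_unsigned(val: str) -> Optional[int]:
    """Return the integer value, or None when val is not a plain unsigned integer."""
    if UNSIGNED_INT.fullmatch(val):
        return int(val)
    return None


print(to_unsigned("0"))       # 0
print(to_unsigned("Banana"))  # None
print(to_unsigned("13a"))     # None
```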
The SHOW WARNINGS statement and the @@warning_count system variable are the built-in methods to do this. There is no mechanism to automatically upgrade all warnings to errors, but there are some things you can do.
You may want to start MySQL with the --show-warnings option, although that might just display the count of warnings with the row count. I can't recall anymore. I don't know if there is a my.ini option for this option. There's also the --log-warnings option, which I believe does have an option in the ini/cnf file. If you're executing a script or using the CLI, the \W command turns show warnings on and \w turns them off for (IIRC) the current connection.
You may also want to look at the SQL mode. TRADITIONAL is probably the most like a normal RDBMS, but it's kind of a rats nest of options. The STRICT modes are what you're most likely after, but read through that page. Most apps built on MySQL take advantage of the (non-deterministic) GROUP BY extensions that bite just about everybody moving to or away from MySQL, and TRADITIONAL enables ONLY_FULL_GROUP_BY, which effectively disables those extensions and the RDBMS doesn't support the OVER() clause. I don't know if silently succeeding at typecasting will abort a transaction even in traditional/strict mode, however.
MySQL is kind of a mine field of these kinds of issues (e.g., zero dates) so it kind of has a poor reputation with DBAs, especially those who worked with v3.x or v4.x.
You could e.g. check the warning_count variable:
MySQL [test]> SELECT CAST(0 AS UNSIGNED INTEGER) AS 'CAST1', @@warning_count;
+-------+-----------------+
| CAST1 | @@warning_count |
+-------+-----------------+
| 0 | 0 |
+-------+-----------------+
1 row in set (0.01 sec)
MySQL [test]> SELECT CAST('Banana' AS UNSIGNED INTEGER) AS 'CAST1', @@warning_count;
+-------+-----------------+
| CAST1 | @@warning_count |
+-------+-----------------+
| 0 | 1 |
+-------+-----------------+
1 row in set, 1 warning (0.00 sec)
There's a caveat though: the warning count is only reset per statement, not per result row,
so if CAST() gets executed multiple times, e.g. for each result row, the counter will go up
on each failed invocation.
Also, warnings don't seem to get reset by successful queries that don't touch any tables,
so in the example above a second
SELECT CAST(0 AS UNSIGNED INTEGER) AS 'CAST1', @@warning_count;
will still show 1 warning, while e.g.
SELECT CAST(0 AS UNSIGNED INTEGER) AS 'CAST1', @@warning_count
FROM mysql.user LIMIT 1;
will correctly reset it to 0 ...
Well, you could use @@warning_count, but you need a workaround for its quirky reset behavior.
Take a look at this code below. Yes, I know it's ugly, but it works.
SELECT
IF(WarningCount = 0, ConversionResult, NULL)
FROM (
SELECT
CAST('banana' AS DECIMAL(10, 6)) AS ConversionResult
, @@warning_count AS WarningCount
FROM <any non empty table>
LIMIT 1
) AS i;
In the inner SELECT I'm getting 1 row (LIMIT 1) from any existing table. I'm converting the string ('banana') and reading WarningCount. In the outer SELECT I'm checking WarningCount and, if it's equal to 0 (conversion successful), returning the converted value.
I would suggest using a function like this:
drop function if exists to_number;
delimiter $$
-- Returns the value as a number, or raises SQLSTATE 45000 if it is not numeric.
-- Note: the return type is int, so fractional input is converted to an integer.
create function to_number (number varchar(10)) returns int
deterministic
begin
declare error_message varchar(45);
if (number regexp ('^[+-]?[0-9]*([0-9]\\.|[0-9]|\\.[0-9])[0-9]*(e[+-]?[0-9]+)?$'))
then
return number;
else
set error_message = concat('The given value "', number, '" is not a number.');
signal sqlstate '45000' set message_text = error_message;
end if;
end$$
delimiter ;
It raises an error if the given value is not a number; otherwise it returns the numeric value.
If you are trying to determine how many values in a varchar are numbers you can try:
select count(*),
sum(is_num)
from (select case when cast(cast(ar_number as unsigned) as char) = ar_number then 1 else 0 end as is_num
from the_table) as t1;
SQL Server supports the try_cast function

DOUBLE vs DECIMAL in MySQL

OK, so I know there are tons of articles stating I shouldn't use DOUBLE to store money in a MySQL database, or I'll end up with tricky precision bugs. The point is I am not designing a new database; I have been asked to find ways to optimise an existing system. The newer version contains 783 DOUBLE typed columns, most of them used to store money or formulas to compute money amounts.
So my first opinion on the subject was that I should strongly recommend a conversion from DOUBLE to DECIMAL in the next version, because the MySQL docs and everybody else say so. But then I couldn't find any good argument to justify this recommendation, for three reasons:
We do not perform any calculation on the database. All operations are done in Java using BigDecimal, and MySQL is just used as plain storage for results.
The 15 digits of precision a DOUBLE offers are plenty, since we mainly store amounts with 2 decimal digits, and occasionally small numbers with 8 decimal digits for formula arguments.
We have a 6-year record in production with no known bug due to a loss of precision on the MySQL side.
Even by performing operations such as SUM and complex multiplications on an 18 million row table, I couldn't produce a bug caused by lack of precision. And we don't actually do this sort of thing in production. I can show the precision loss by doing something like
SELECT columnName * 1.000000000000000 FROM tableName;
But I can't figure out a way to turn it into a bug at the 2nd decimal digit. Most of the real issues I found on the internet are forum entries from 2005 and older, and I couldn't reproduce any of them on a 5.0.51 MySQL server.
So as long as we do not perform any SQL arithmetic operations, which we do not plan to do, are there any issues we should expect from only storing and retrieving a money amount in a DOUBLE column?
Actually it's quite different. DOUBLE causes rounding issues. And if you do something like 0.1 + 0.2 it gives you something like 0.30000000000000004. I personally would not trust financial data that uses floating point math. The impact may be small, but who knows. I would rather have what I know is reliable data than data that were approximated, especially when you are dealing with money values.
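The 0.1 + 0.2 effect is a property of IEEE 754 doubles rather than of MySQL, so it can be reproduced in any language that uses them. A minimal Python sketch, with `Decimal` standing in for an exact decimal type such as DECIMAL or Java's BigDecimal:

```python
from decimal import Decimal

# Binary floating point: neither 0.1 nor 0.2 is exactly representable.
float_sum = 0.1 + 0.2

# Exact decimal arithmetic on the same literals.
decimal_sum = Decimal("0.1") + Decimal("0.2")

print(float_sum)                      # 0.30000000000000004
print(float_sum == 0.3)               # False
print(decimal_sum == Decimal("0.3"))  # True
```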
Here is the example from the MySQL documentation, http://dev.mysql.com/doc/refman/5.1/en/problems-with-float.html (I have shortened it; the documentation for this section is the same for 5.5):
mysql> create table t1 (i int, d1 double, d2 double);
mysql> insert into t1 values (2, 0.00 , 0.00),
(2, -13.20, 0.00),
(2, 59.60 , 46.40),
(2, 30.40 , 30.40);
mysql> select
i,
sum(d1) as a,
sum(d2) as b
from
t1
group by
i
having a <> b; -- a != b
+------+-------------------+------+
| i | a | b |
+------+-------------------+------+
| 2 | 76.80000000000001 | 76.8 |
+------+-------------------+------+
1 row in set (0.00 sec)
Basically if you sum a you get 0-13.2+59.6+30.4 = 76.8. If we sum up b we get 0+0+46.4+30.4=76.8. The sum of a and b is the same but MySQL documentation says:
A floating-point value as written in an SQL statement may not be the same as the value represented internally.
If we repeat the same with decimal:
mysql> create table t2 (i int, d1 decimal(60,30), d2 decimal(60,30));
Query OK, 0 rows affected (0.09 sec)
mysql> insert into t2 values (2, 0.00 , 0.00),
(2, -13.20, 0.00),
(2, 59.60 , 46.40),
(2, 30.40 , 30.40);
Query OK, 4 rows affected (0.07 sec)
Records: 4 Duplicates: 0 Warnings: 0
mysql> select
i,
sum(d1) as a,
sum(d2) as b
from
t2
group by
i
having a <> b;
Empty set (0.00 sec)
The result as expected is empty set.
So as long as you do not perform any SQL arithmetic operations you can use DOUBLE, but I would still prefer DECIMAL.
Another thing to note about DECIMAL is rounding if fractional part is too large. Example:
mysql> create table t3 (d decimal(5,2));
Query OK, 0 rows affected (0.07 sec)
mysql> insert into t3 (d) values(34.432);
Query OK, 1 row affected, 1 warning (0.10 sec)
mysql> show warnings;
+-------+------+----------------------------------------+
| Level | Code | Message |
+-------+------+----------------------------------------+
| Note | 1265 | Data truncated for column 'd' at row 1 |
+-------+------+----------------------------------------+
1 row in set (0.00 sec)
mysql> select * from t3;
+-------+
| d |
+-------+
| 34.43 |
+-------+
1 row in set (0.00 sec)
We have just been going through this same issue, but the other way around. That is, we store dollar amounts as DECIMAL, but now we're finding that, for example, MySQL was calculating a value of 4.389999999993, but when storing this into the DECIMAL field, it was storing it as 4.38 instead of 4.39 like we wanted it to. So, though DOUBLE may cause rounding issues, it seems that DECIMAL can cause some truncating issues as well.
"are there any issue we should expect from only storing and retreiving a money amount in a DOUBLE column ?"
It sounds like no rounding errors can be produced in your scenario and if there were, they would be truncated by the conversion to BigDecimal.
So I would say no.
However, there is no guarantee that some change in the future will not introduce a problem.
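The store-and-retrieve-only claim can be checked outside the database with a short Python sketch. It assumes amounts between 0.00 and 9999.99 written with exactly two decimals, and models "storing in a DOUBLE" as a conversion to a Python float (also an IEEE 754 double); `survives_double` is a hypothetical helper.

```python
def survives_double(amount: str) -> bool:
    """True if a 2-decimal amount comes back unchanged after a trip through a double."""
    return f"{float(amount):.2f}" == amount


# Every amount from 0.00 to 9999.99, built from integer cents to avoid float input.
amounts = [f"{cents // 100}.{cents % 100:02d}" for cents in range(1_000_000)]
failures = [a for a in amounts if not survives_double(a)]

print(len(failures))  # 0
```

This only covers the pure round trip; it says nothing about arithmetic performed on the stored doubles.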
From your comments,
the tax amount rounded to the 4th decimal and the total price rounded
to the 2nd decimal.
Using the example in the comments, I might foresee a case where you have 400 sales of $1.47. Sales-before-tax would be $588.00, and sales-after-tax would sum to $636.51 (accounting for $0.121275 * 400 = $48.51 in taxes). However, if the tax on each sale is first rounded to the 4th decimal, the total sales tax would be $0.1213 * 400 = $48.52.
This was one way, albeit contrived, to force a penny's difference.
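The one-penny gap can be reproduced with exact decimal arithmetic. The sketch below takes the per-sale tax of $0.121275 and the 400 sales from the example, and assumes half-up rounding; the helper `to_cents` and the variable names are illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

SALES = 400
tax_per_sale = Decimal("0.121275")


def to_cents(amount: Decimal) -> Decimal:
    """Round a dollar amount to the 2nd decimal."""
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Round each sale's tax to the 4th decimal first, then sum.
per_sale_rounded = tax_per_sale.quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)
tax_rounded_first = to_cents(per_sale_rounded * SALES)

# Sum the unrounded tax and round only the total.
tax_rounded_last = to_cents(tax_per_sale * SALES)

print(tax_rounded_first)                     # 48.52
print(tax_rounded_last)                      # 48.51
print(tax_rounded_first - tax_rounded_last)  # 0.01
```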
I would note that there are payroll tax forms from the IRS where they do not care if an error is below a certain amount (if memory serves, $0.50).
Your big question is: does anybody care if certain reports are off by a penny? If your specs say: yes, be accurate to the penny, then you should go through the effort to convert to DECIMAL.
I have worked at a bank where a one-penny error was reported as a software defect. I tried (in vain) to cite the software specifications, which did not require this degree of precision for this application. (It was performing many chained multiplications.) I also pointed to the user acceptance test. (The software was verified and accepted.)
Alas, sometimes you just have to make the conversion. But I would encourage you to A) make sure that it's important to someone and then B) write tests to show that your reports are accurate to the degree specified.

MySQL LIKE + Wildcard vs Equals Operator

I recently just fixed a bug in some of my code and was hoping someone could explain to me why the bug occurred.
I had a query like this:
SELECT * FROM my_table WHERE my_field=13
Unexpectedly, this was returning rows where my_field was equal to either 13 or 13a. The fix was simple, I changed the query to:
SELECT * FROM my_table WHERE my_field='13'
My question is, is this supposed to be the case? I've always thought that to return a similar field, you would use something like:
SELECT * FROM my_table WHERE my_field LIKE '13%'
What is the difference between LIKE + a Wild Card vs an equals operator with no quotes?
This statement returns rows for my_field = '13a':
SELECT * FROM my_table WHERE my_field=13
Because MySQL performs type conversion from string to number during the comparison, turning '13a' to 13. More on that in this documentation page.
Adding quotes turns the integer to a string, so MySQL only performs string comparison. Obviously, '13' cannot be equal to '13a'.
The LIKE clause always performs string comparison (unless either one of the operands is NULL, in which case the result is NULL).
My guess would be that since you didn't enclose it in quotes, and the column was a char/varchar column, MySQL tried to do an implicit conversion of the varchar column to an int.
If one of the rows in that table contained a value that couldn't be converted to an int, you would probably get an error. Also, because of the conversion, any indexes you might have had on that column would not be used either.
This has to do with types and type conversion. With my_field=13, 13 is an integer, while my_field is in your case likely some form of text/string. In such a case, MySQL will try to convert both to a floating point number and compare those.
So MySQL tries to convert e.g. "13a" to a float, which will be 13, and 13 = 13.
In my_field = '13' , both operands are text and will be compared as text using =
In my_field like '13%' both operands are also text and will be compared as such using LIKE, where the special % means a wildcard.
You can read about the type conversion mysql uses here.
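A rough Python model of this behaviour may make the difference between the two comparisons clearer. `lenient_number` is a hypothetical helper that approximates MySQL's string-to-number conversion by keeping only the leading numeric prefix; it is a sketch of the idea, not a faithful reimplementation of MySQL's rules.

```python
import re

# Leading numeric prefix: optional sign, digits with optional fraction, optional exponent.
NUMERIC_PREFIX = re.compile(r"\s*[+-]?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)(?:[eE][+-]?[0-9]+)?")


def lenient_number(text: str) -> float:
    """Convert the leading numeric part of text to a float; 0.0 if there is none."""
    match = NUMERIC_PREFIX.match(text)
    return float(match.group(0)) if match else 0.0


rows = ["13", "13a", "130", "Banana"]

# my_field = 13   : both sides are compared as numbers.
numeric_matches = [r for r in rows if lenient_number(r) == 13]
# my_field = '13' : both sides are compared as strings.
string_matches = [r for r in rows if r == "13"]

print(numeric_matches)  # ['13', '13a']
print(string_matches)   # ['13']
```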
This is because the MySQL type conversion works this way. See here: http://dev.mysql.com/doc/refman/5.0/en/type-conversion.html
It raises a warning as well. See the code below:
mysql> select 12 = '12bibo';
+---------------+
| 12 = '12bibo' |
+---------------+
| 1 |
+---------------+
1 row in set, 1 warning (0.00 sec)
mysql> show warnings;
+---------+------+--------------------------------------------+
| Level | Code | Message |
+---------+------+--------------------------------------------+
| Warning | 1292 | Truncated incorrect DOUBLE value: '12bibo' |
+---------+------+--------------------------------------------+
1 row in set (0.00 sec)
Looks like someone raised a bug as well: http://bugs.mysql.com/bug.php?id=42241

ActiveRecord / MySQL Select Condition Comparing String Components

I have a string that is defined as one or more dot-separated integers like 12345, 543.21, 109.87.654, etc. I'm storing values in a MySQL database and then need to find the rows that compare with a provided value. What I want is to select rows by comparing each component of the string against the corresponding component of the input string. With standard string comparison in MySQL, here's where this breaks down:
mysql> SELECT '543.21' >= '500.21'
-> 1
mysql> SELECT '543.21' >= '5000.21'
-> 1
This is natural because the string comparison is a "dictionary" comparison that doesn't account for string length, but I want a 0 result on the second query.
Is there a way to provide some hint to MySQL on how to compare these? Otherwise, is there a way to hint to ActiveRecord how to do this for me? Right now, the best solution I have come up with is to select all the rows and then filter the results using Ruby's split and reject methods. (The entire data set is quite small and not likely to grow terribly much for the foreseeable future, so it is a reasonable option, but if there's a simpler way I'm not considering I'd be glad to know it.)
You can use REPLACE to remove dots and CAST to convert string to integer:
SELECT CAST(REPLACE("543.21", ".", "") AS SIGNED) >= CAST(REPLACE("5000.21", ".", "") AS SIGNED)
mysql> SELECT '543.21' >= '5000.21';
+-----------------------+
| '543.21' >= '5000.21' |
+-----------------------+
| 1 |
+-----------------------+
1 row in set (0.00 sec)
mysql> SELECT '543.21'+0 >= '5000.21'+0;
+---------------------------+
| '543.21'+0 >= '5000.21'+0 |
+---------------------------+
| 0 |
+---------------------------+
1 row in set (0.00 sec)
This indeed only works for valid floats. Doing it for more than 1 dot would require a LOT of comparing of SUBSTRING_INDEX(SUBSTRING_INDEX(field, '.', <position number you're comparing>), '.', -1) (with a manual repeat for the maximum number of positions you are comparing).
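For the application-side filtering mentioned in the question, the component-wise comparison itself is short. Here is a minimal Python sketch of the idea (the question uses Ruby, where the same split-and-compare approach applies); `components` and `at_least` are hypothetical helper names, and every component is assumed to be a non-negative integer.

```python
from typing import Tuple


def components(dotted: str) -> Tuple[int, ...]:
    """Split '109.87.654' into (109, 87, 654) so parts compare numerically."""
    return tuple(int(part) for part in dotted.split("."))


def at_least(value: str, threshold: str) -> bool:
    """Component-wise value >= threshold; tuples compare left to right."""
    return components(value) >= components(threshold)


print(at_least("543.21", "500.21"))      # True
print(at_least("543.21", "5000.21"))     # False
print(at_least("109.87.654", "109.87"))  # True (a longer tuple with an equal prefix is greater)
```

Comparing component by component also avoids a pitfall of removing the dots and comparing the remaining digits as one integer: '1.10' would become 110 and compare greater than '2.1' (21), even though its first component is smaller.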