Broken unicode after simple concat + left mysql commands - mysql

I found some very strange mysql behavior.
If I run the following command:
mysql> select left(concat("A", "B®"), 3);
Then the output is as expected:
+-----------------------------+
| left(concat("A", "B®"), 3) |
+-----------------------------+
| AB® |
+-----------------------------+
1 row in set (0.00 sec)
However, if I replace "A" with a number (1 in this case):
mysql> select left(concat(1, "B®"), 3);
The unicode character "®" becomes corrupted:
+---------------------------+
| left(concat(1, "B®"), 3) |
+---------------------------+
| 1B? |
+---------------------------+
1 row in set (0.00 sec)
Does anybody know how to explain this strange behavior and how to avoid it?
The example above is only a reproduction; in real life it's a concat of numbers with strings that are not known ahead of time (not hard-coded strings).
Thanks a lot!

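MySQL does not convert the integer to a character string here; it converts it to a binary string, which is not the same thing. LEFT() on a binary string counts bytes rather than characters, so in a multi-byte encoding such as UTF-8 the first 3 bytes of "1B®" end in the middle of the two-byte "®". From the documentation: "If the arguments include any binary strings, the result is a binary string. A numeric argument is converted to its equivalent binary string form; if you want to avoid that, you can use an explicit type cast, as in this example:"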
SELECT CONCAT(CAST(int_col AS CHAR), char_col);
Refer to the MySQL documentation for CONCAT() for details.
I would also like to hear from others if someone has a different opinion.
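As a rough sketch, the cast can be applied to the reproduction from the question like this. The literals stand in for real columns, and the byte counts depend on the connection character set (a multi-byte one such as utf8mb4 is assumed here):

```sql
-- Cast the number to a character string first, so CONCAT() returns a
-- nonbinary string and LEFT() counts characters instead of bytes.
SELECT LEFT(CONCAT(CAST(1 AS CHAR), 'B®'), 3);

-- LENGTH() counts bytes and CHAR_LENGTH() counts characters; comparing the
-- two shows how many bytes "®" occupies in the connection character set.
SELECT LENGTH('B®') AS byte_count, CHAR_LENGTH('B®') AS char_count;
```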

Related

What's wrong with the query below? Select CONVERT(xml,'<x>' + Replace(A.name,':','</x><x>')+'</x>' ) as xDim from Erecharge;

I want to convert a string to an XML column.
I used the query below for that:
Select CONVERT(xml,'<x>' + Replace(A.name,':','</x><x>')+'</x>' ) as xDim from Erecharge;
but it reports an incorrect SQL syntax error.
I want to know what's wrong with the query above.
I also tried this:
Select Cast('<x>' + Replace(A.name,':','</x><x>')+'</x>' as XML) as xDim from Erecharge;
check the manual that corresponds to your MySQL server version for the right syntax to use near 'XML) as xDim from Erecharge'
This means that XML is incorrect in an expression like this:
CAST('foo' AS XML)
As per the docs, the values allowed for CAST type do not include XML.
Additionally, using the + operator on strings is just a convoluted way to render zero:
mysql> SELECT 'a' + 'b';
+-----------+
| 'a' + 'b' |
+-----------+
| 0 |
+-----------+
1 row in set, 2 warnings (0.00 sec)
It's not entirely clear what you're trying to do. MySQL has XML Functions but it doesn't have XML data types. If you just want to produce a string that happens to contain XML code then you need to CONCAT():
mysql> SELECT CONCAT('<date>', CURRENT_TIMESTAMP, '</date>') AS foo;
+----------------------------------+
| foo |
+----------------------------------+
| <date>2018-10-12 11:44:29</date> |
+----------------------------------+
1 row in set (0.00 sec)
... but of course you still need to ensure that angle brackets and similar stuff don't break the XML. CDATA may help. (No idea about XML functions, I'm not familiar with them.)
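Applied to the query from the question, a minimal sketch could look like the following. The alias A is an assumption: the original query refers to A.name without ever defining A, so it is declared on the table here. The result is an ordinary string, not an XML value.

```sql
-- Build the markup with CONCAT() instead of '+', and skip the XML cast,
-- since MySQL has no XML data type to cast to.
SELECT CONCAT('<x>', REPLACE(A.name, ':', '</x><x>'), '</x>') AS xDim
FROM Erecharge AS A;
```

Names containing characters such as < or & would still need escaping before the string is treated as XML.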

MySQL Precision Issues in DECIMAL NUMERIC data type

In writing a function for a scientific application, I ran into issues. I traced them back to MySQL's lack of precision.
Here is the page from the official documentation, which claims that the maximum number of digits for DECIMAL is 65 - http://dev.mysql.com/doc/refman/5.6/en/fixed-point-types.html . It also describes how the value will be rounded if it exceeds the specified precision.
Here is reproducible code (a mysql stored function) to test it -
DELIMITER $$
DROP FUNCTION IF EXISTS test$$
CREATE FUNCTION test
(xx DECIMAL(30,25)
)
RETURNS DECIMAL(30,25)
DETERMINISTIC
BEGIN
DECLARE result DECIMAL(30,25);
SET result = 0.339946499848118887e-4;
RETURN(result);
END$$
DELIMITER ;
If you save the code above in a file called test.sql, you can run it by executing the following in mysql prompt -
source test.sql;
select test(0);
It produces the output -
+-----------------------------+
| test(0) |
+-----------------------------+
| 0.0000339946499848118900000 |
+-----------------------------+
1 row in set (0.00 sec)
As you can see, the number is getting rounded at the 20th digit, and then five zeroes are being added to it to get to the required/specified precision. That is cheating.
Am I mistaken, or is the documentation wrong?
This happens because MySQL treats 0.339946499848118887e-4 as a float and treats 0.0000339946499848118887 as fixed point.
mysql> select cast( 0.339946499848118887e-4 as DECIMAL(30, 25));
+----------------------------------------------------+
| cast( 0.339946499848118887e-4 as DECIMAL(30, 25)) |
+----------------------------------------------------+
| 0.0000339946499848118900000 |
+----------------------------------------------------+
1 row in set (0.00 sec)
mysql> select cast( 0.0000339946499848118887 as DECIMAL(30, 25));
+-----------------------------------------------------+
| cast( 0.0000339946499848118887 as DECIMAL(30, 25)) |
+-----------------------------------------------------+
| 0.0000339946499848118887000 |
+-----------------------------------------------------+
1 row in set (0.00 sec)
As described in the MySQL documentation on precision math (expression handling):
If any approximate values are present, the expression is approximate and is evaluated using floating-point arithmetic.
Quoting from the documentation on numeric types:
Two numbers that look similar may be treated differently. For example, 2.34 is an exact-value (fixed-point) number, whereas 2.34E0 is an approximate-value (floating-point) number.
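Following the distinction quoted above, one way to keep the value exact inside the stored function is to write the constant without an exponent. This is only a sketch; the function name test_exact is made up for illustration and the unused parameter from the question is dropped:

```sql
DELIMITER $$
DROP FUNCTION IF EXISTS test_exact$$
CREATE FUNCTION test_exact()
RETURNS DECIMAL(30,25)
DETERMINISTIC
BEGIN
  DECLARE result DECIMAL(30,25);
  -- No exponent: this literal is an exact-value (fixed-point) number,
  -- so it is not routed through floating-point arithmetic.
  SET result = 0.0000339946499848118887;
  RETURN result;
END$$
DELIMITER ;

SELECT test_exact();
```

The result should match the second CAST example above, with all digits of the constant preserved.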
I don't know anything about SQL, but my guess would be this line:
SET result = 0.339946499848118887e-4;
If MySQL is anything like other languages I know, then this will first evaluate the right-hand side, and then assign the value to result. No matter what type result is declared to be or what precision it's declared to have, it wouldn't matter if the right-hand side has already lost precision when being evaluated. This is almost surely what is happening here.
I can reproduce your results, but if I change that line to
SET result = cast('0.339946499848118887e-4' as decimal(30, 25));
(casting from a string instead of from a floating-point constant of unspecified precision) then I correctly get
+-----------------------------+
| test(0) |
+-----------------------------+
| 0.0000339946499848118887000 |
+-----------------------------+
1 row in set (0.00 sec)
as desired. So that's your fix.
BTW, the documentation that scale in DECIMAL(precision, scale) cannot be greater than 30 seems to be in section 12.19.2. DECIMAL Data Type Changes:
The declaration syntax for a DECIMAL column is DECIMAL(M,D). The
ranges of values for the arguments in MySQL 5.6 are as follows:
M is the maximum number of digits (the precision). It has a range of 1
to 65. (Older versions of MySQL permitted a range of 1 to 254.)
D is the number of digits to the right of the decimal point (the
scale). It has a range of 0 to 30 and must be no larger than M.

MySQL concat() and lower() weirdness

Any idea why this works sensibly*:
mysql> select lower('AB100c');
+-----------------+
| lower('AB100c') |
+-----------------+
| ab100c |
+-----------------+
1 row in set (0.00 sec)
But this doesn't?
mysql> select lower(concat('A', 'B', 100,'C'));
+----------------------------------+
| lower(concat('A', 'B', 100,'C')) |
+----------------------------------+
| AB100C |
+----------------------------------+
1 row in set (0.00 sec)
*sensibly = 'the way I think it should work.'
As stated in the MySQL string functions documentation:
LOWER(str)
LOWER() is ineffective when applied to binary strings (BINARY, VARBINARY, BLOB).
CONCAT(str1,str2,...)
Returns the string that results from concatenating the arguments. May have one or more arguments. If all arguments are nonbinary strings, the result is a nonbinary string. If the arguments include any binary strings, the result is a binary string. A numeric argument is converted to its equivalent binary string form; if you want to avoid that, you can use an explicit type cast.
In your code you are passing 100 as a numeric value, so CONCAT() returns a binary string. LOWER() is ineffective when applied to binary strings, which is why the result is not converted. If you want it converted, you can try this:
select lower(concat('A', 'B', '100','C'));
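When the number comes from a column or an expression rather than a literal, quoting it is not an option. On servers where CONCAT() behaves as described above, the explicit cast mentioned in the documentation is a possible alternative (the literal 100 stands in for a real numeric column here):

```sql
-- The explicit cast keeps every argument a nonbinary string,
-- so the CONCAT() result is nonbinary and LOWER() can act on it.
SELECT LOWER(CONCAT('A', 'B', CAST(100 AS CHAR), 'C'));
```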
LOWER() is used to convert strings to lowercase, but your value 100 is treated as numeric. If you still want the lowercase conversion, enclose the number in quotes like this:
select lower(concat('A', 'B', '100','C'));
I've tested this and it works fine.
And here is another example with CONCAT and LIKE:
LOWER(CONCAT(firstname, ' ', lastname)) LIKE LOWER('%my name%')

MySQL regex: how do I search for and return a substring using a MySQL regex?

My table field's value is "<script type="text/javascript"src="http://localhost:8080/db/widget/10217EN/F"></script>".
I want to parse this string and fetch the id 10217. How can I do this with a MySQL regex?
I know Python's regex group function can return the id 10217, but I'm not familiar with MySQL regex.
Please help me. Thank you very much.
MySQL regular expressions do not support subpattern extraction. You will probably have better luck iterating over all of the rows in your database and storing the results in a new column.
As far as I know, you can't use MySQL's REGEXP for substring retrieval; it is designed for use in WHERE clauses and is limited to returning 0 or 1 to indicate failure or success at a match.
Since your pattern is pretty well defined, you can probably retrieve the id with a query that uses SUBSTR and LOCATE. It will be a bit of a mess since SUBSTR wants the start index and the length of the substring (it would be easier if it took the end index). Perhaps you could use TRIM to chop off the unwanted trailing part.
This query gets the id from the field:
SELECT substring_index(SUBSTRING_INDEX(testvar,'/',-3),'EN',1) from testtab;
where testtab is the table name and testvar is the field name.
The inner SUBSTRING_INDEX keeps everything after the third-to-last '/', which gives
mysql> SELECT SUBSTRING_INDEX(testvar,'/',-3) from testtab;
+---------------------------------+
| SUBSTRING_INDEX(testvar,'/',-3) |
+---------------------------------+
| 10217EN/F">                     |
| 10222EN/F">                     |
+---------------------------------+
2 rows in set (0.00 sec)
The outer SUBSTRING_INDEX then keeps the part before 'EN', which gives
mysql> SELECT substring_index(SUBSTRING_INDEX(testvar,'/',-3),'EN',1) from testtab;
+---------------------------------------------------------+
| substring_index(SUBSTRING_INDEX(testvar,'/',-3),'EN',1) |
+---------------------------------------------------------+
| 10217                                                   |
| 10222                                                   |
+---------------------------------------------------------+
2 rows in set (0.00 sec)
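The answers above predate MySQL 8.0, which added REGEXP_SUBSTR(). On such a server the id could probably be pulled out directly. This is a sketch under two assumptions that come from the sample value only: the id is a run of digits immediately followed by 'EN/', and '/widget/' appears once in the string. The alias widget_id is made up for illustration.

```sql
-- Requires MySQL 8.0 or later. Match a run of digits that is directly
-- followed by 'EN/' (the lookahead keeps 'EN/' out of the result).
SELECT REGEXP_SUBSTR(testvar, '[0-9]+(?=EN/)') AS widget_id
FROM testtab;

-- Without regex: take what follows '/widget/', then what precedes 'EN/'.
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(testvar, '/widget/', -1), 'EN/', 1) AS widget_id
FROM testtab;
```

Anchoring on '/widget/' rather than counting slashes from the end keeps the second query independent of whatever follows the URL in the field.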

ActiveRecord / MySQL Select Condition Comparing String Components

I have a string that is defined as one or more dot-separated integers like 12345, 543.21, 109.87.654, etc. I'm storing values in a MySQL database and then need to find the rows that compare with a provided value. What I want is to select rows by comparing each component of the string against the corresponding component of the input string. With standard string comparison in MySQL, here's where this breaks down:
mysql> SELECT '543.21' >= '500.21'
-> 1
mysql> SELECT '543.21' >= '5000.21'
-> 1
This is natural because the string comparison is a "dictionary" comparison that doesn't account for string length, but I want a 0 result on the second query.
Is there a way to provide some hint to MySQL on how to compare these? Otherwise, is there a way to hint to ActiveRecord how to do this for me? Right now, the best solution I have come up with is to select all the rows and then filter the results using Ruby's split and reject methods. (The entire data set is quite small and not likely to grow terribly much for the foreseeable future, so it is a reasonable option, but if there's a simpler way I'm not considering I'd be glad to know it.)
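You can use REPLACE to remove the dots and CAST to convert the string to an integer. Note that this compares the concatenated digits, so it can give wrong answers when components differ in length (for example, '5.43' versus '54.2' compares 543 with 542):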
SELECT CAST(REPLACE("543.21", ".", "") AS SIGNED) >= CAST(REPLACE("5000.21", ".", "") AS SIGNED)
mysql> SELECT '543.21' >= '5000.21';
+-----------------------+
| '543.21' >= '5000.21' |
+-----------------------+
| 1 |
+-----------------------+
1 row in set (0.00 sec)
mysql> SELECT '543.21'+0 >= '5000.21'+0;
+---------------------------+
| '543.21'+0 >= '5000.21'+0 |
+---------------------------+
| 0 |
+---------------------------+
1 row in set (0.00 sec)
This indeed only works for valid floats. Doing it for more than one dot would require a LOT of comparisons of SUBSTRING_INDEX(SUBSTRING_INDEX(field, '.', <positionnumber you're comparing>), '.', -1) (repeated manually for the maximum number of positions you are comparing).
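The repeated SUBSTRING_INDEX approach can be sketched as follows for up to three components. The user variables @a and @b are placeholders for a column and a bound parameter, and treating a missing component as 0 (so '543' equals '543.0') is an assumption that may not match the intended ordering:

```sql
-- Hypothetical inputs; in a real query these would be a column and a parameter.
SET @a = '543.21';
SET @b = '5000.21';

-- Pad each value with '.0.0' so that a missing component reads as 0,
-- then compare the first three components as integers, left to right,
-- using a row comparison.
SELECT (
    CAST(SUBSTRING_INDEX(CONCAT(@a, '.0.0'), '.', 1) AS UNSIGNED),
    CAST(SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(@a, '.0.0'), '.', 2), '.', -1) AS UNSIGNED),
    CAST(SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(@a, '.0.0'), '.', 3), '.', -1) AS UNSIGNED)
) >= (
    CAST(SUBSTRING_INDEX(CONCAT(@b, '.0.0'), '.', 1) AS UNSIGNED),
    CAST(SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(@b, '.0.0'), '.', 2), '.', -1) AS UNSIGNED),
    CAST(SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(@b, '.0.0'), '.', 3), '.', -1) AS UNSIGNED)
) AS a_ge_b;
```

For the two values above, the first components 543 and 5000 decide the comparison, so the expected result is 0. Supporting more components means extending both the padding and the list of extracted parts.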