MySQL 8 convert 7-bit to text - mysql

I have 7-bit packed hex and I need to show it as a string.
My 7-bit value: cdb27b1e569701
It should be:
converted to 8-bit: 4d656e73616a65
converted to text (UTF-8): Mensaje (using the following: SELECT CONVERT(x'4d656e73616a6520' USING utf8);)
I couldn't find how to convert from cdb27b1e569701 to 4d656e73616a65.
thanks,
Koby
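For reference, the packing here looks like GSM 03.38 7-bit septet packing (as used in SMS PDUs). Below is a minimal sketch of unpacking it, assuming plain ASCII characters and done in Python rather than inside MySQL itself:

# Hedged sketch: assumes GSM 03.38-style 7-bit septet packing and that every
# septet maps to a plain ASCII character (no GSM alphabet table needed).
def unpack_7bit(hex_str: str, num_chars: int) -> str:
    data = bytes.fromhex(hex_str)
    # Treat the octets as a little-endian bit stream; each character is 7 bits.
    value = int.from_bytes(data, "little")
    septets = [(value >> (7 * i)) & 0x7F for i in range(num_chars)]
    return "".join(chr(s) for s in septets)

# 7 characters expected (in SMS PDUs this count normally comes from the user-data length).
print(unpack_7bit("cdb27b1e569701", 7))  # -> 'Mensaje'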

Related

CICS TS(DFHJS2LS): Chinese characters are getting corrupted when received into MAINFRAME from POSTMAN tool

We have developed a web service with CICS as the HTTP server (service provider). This web service takes the input JSON (which has both English and Chinese characters) from any client/POSTMAN tool, and the request is processed on the mainframe (CICS).
DFHJS2LS: JSON schema to high-level language conversion for request-response services
We are using this proc, DFHJS2LS, to enable web services on the mainframe. This IBM-provided procedure converts JSON to an MF copybook and vice versa. It also converts the UTF-8 code units to UTF-16 when the data reaches the mainframe copybook.
Issue:
The issue we face now is with the Chinese characters. The Chinese characters we pass in the JSON are not converted properly and arrive corrupted on the mainframe. My suspicion is that the conversion from UTF-8 to UTF-16 is not happening.
市 - this is the Chinese character passed in the JSON (POSTMAN).
The expected value in the mainframe copybook is 5E02 (UTF-16 hex value),
but we got 00E5 00B8 0082 (the UTF-8 bytes, each widened to a 16-bit value).
We have tried all header values and still no luck.
content type = application/json
charset=UTF-8 / UTF-16
Your inputs are much appreciated in addressing this DBCS/unicode/chinese character issue.
In the COBOL, are you declaring the field that will receive the Chinese characters as PIC G?
01 China-Test-Message.
03 Msg-using-pic-x Pic X(10).
03 Msg-using-pic-g Pic G(4) Usage Display-1.
Try "USAGE NATIONAL" which shoul dmap to UTF-16 which is probably the code page for the chinese character.
RTFM here:-
https://www.ibm.com/support/knowledgecenter/SS6SG3_6.3.0/pg/concepts/cpuni01.html
The Chinese character conversion was resolved once we changed our HTTP header to this:
Content-Type = application/json;charset=UTF-8
Thanks everyone for the support.
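For illustration, a hedged client-side sketch of the fix above, assuming the Python requests library and a placeholder endpoint URL: the body is sent as real UTF-8 bytes and the Content-Type header declares the charset explicitly.

import json
import requests  # assumption: requests is available; URL and payload are placeholders

payload = {"city": "市"}  # JSON containing a Chinese character

# Send the body as actual UTF-8 bytes and declare the charset, mirroring the
# header value that resolved the corruption.
resp = requests.post(
    "https://mainframe.example.com/cics/service",  # hypothetical endpoint
    data=json.dumps(payload, ensure_ascii=False).encode("utf-8"),
    headers={"Content-Type": "application/json;charset=UTF-8"},
)
print(resp.status_code)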

R: How can I correctly get special characters from MySQL?

I'm a newbie in R and I'm creating statistics graphics with ggplot2. Everything is OK, but I have found that some characters, like ÁÉÍÓÚ, Ç, Ñ or Ü, are not printed correctly.
When I get the data from MySQL, I get incorrect characters: "Ñ" and "Ç" come back garbled.
How can I get these special characters correctly from MySQL?
Edit I:
A screencap from my database shows that the special characters are stored correctly.

Postgres psql output text wrapping when converting to json

I see strange behavior in psql: long base64-encoded text gets broken with a newline when converted to a JSON string.
I encode my text in base64 as follows:
db=> select encode('-------------------------------------------------------------------'::bytea, 'base64');
encode
------------------------------------------------------------------------------
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t+
LS0tLS0tLS0tLQ==
Here the output text is wrapped; that is not actually a problem.
But then if I convert this base64 encoded text to a json string using to_json():
db=> select to_json(encode('-------------------------------------------------------------------'::bytea, 'base64')::text);
to_json
--------------------------------------------------------------------------------------------------
"LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t\nLS0tLS0tLS0tLQ=="
Here the base64-encoded text in the JSON has a newline character (\n) near the end, which totally breaks the base64 when decoding (thanks Laurenz Albe for the correction!). The newline character gives me trouble later in my program, and I'm searching for a solution to fix it in psql.
I've tried using the \pset format command and setting PAGER="less -SF" psql ... (from another Stack Overflow question: Disable wrapping in Psql output), but without success.
The only solution I've found (and a very dirty one) is to do:
db=> select to_json(regexp_replace((select to_json(encode('----------------------------------------------------------'::bytea, 'base64')::text))::text, '(\\n|")', '', 'g'));
to_json
------------------------------------------------------------------------------------
"LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLQ=="
Here I convert to a JSON string (with to_json()), then remove the JSON string quotes and newline characters (with regexp_replace()), and then convert to JSON again to get the expected result.
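For completeness, a hedged client-side sketch (assuming psycopg2 and placeholder connection settings, so not the psql-only fix asked for) showing that the newline is part of the value returned by encode() and stripping it before any JSON conversion:

import psycopg2  # assumption: psycopg2 is installed; connection details are placeholders

conn = psycopg2.connect(dbname="db")
cur = conn.cursor()
cur.execute("SELECT encode(%s::bytea, 'base64')", (b"-" * 67,))
b64 = cur.fetchone()[0]

# encode(..., 'base64') wraps its output at 76 characters, so the newline is
# part of the returned value itself, not just psql display formatting.
clean = b64.replace("\n", "")
print(clean)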

How to detect parts of unicode in a string in Python

I call an API to get some info, and sometimes the response has values like the one below.
"address": "BOULEVARD DU MÃ\u0089ROU - SN PEÃ\u008fRE, "
How can I detect these and convert them to the correct Latin letters? I want to upload this data to a MySQL database. Right now it throws the following warning.
Warning: (1366, "Incorrect string value: '\\xC2\\x88ME A...' for column 'address' at row 1")
I'm using pymysql to insert this info into the DB.
The example data was originally encoded as UTF-8 but decoded as latin1. You can reverse the process to fix it, or read it from the source as UTF-8 to begin with:
>>> s = "BOULEVARD DU MÃ\u0089ROU - SN PEÃ\u008fRE, "
>>> s.encode('latin1').decode('utf8')
'BOULEVARD DU MÉROU - SN PEÏRE, '
You can use the str .encode() method (followed by a UTF-8 decode):
>>> "BOULEVARD DU MÃ\u0089ROU - SN PEÃ\u008fRE, ".encode("latin-1").decode("utf-8")
'BOULEVARD DU MÉROU - SN PEÏRE, '
Be aware, though, that if the API response contains any characters that cannot be encoded in latin-1, you'll hit a UnicodeEncodeError.
If at all possible, rather than doing this, you'll probably want to change the character set of your MySQL database to UTF-8.
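To illustrate that last suggestion, a hedged sketch of opening the pymysql connection with a UTF-8 character set so correctly decoded strings can be inserted directly (connection details and the places table are placeholders; only the address column comes from the warning above):

import pymysql  # assumption: connection details and table name are placeholders

conn = pymysql.connect(
    host="localhost",
    user="app",
    password="secret",
    database="mydb",
    charset="utf8mb4",  # full UTF-8 support on the connection
)

# Repair the mojibake once, then store real UTF-8 text.
address = "BOULEVARD DU MÃ\u0089ROU - SN PEÃ\u008fRE, ".encode("latin1").decode("utf8")
with conn.cursor() as cur:
    cur.execute("INSERT INTO places (address) VALUES (%s)", (address,))
conn.commit()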
It looks like you have multiple errors, "double encoding" and Unicode "codepoints", so it is hard to unravel exactly what went wrong.
It would be better to go back to the source and fix the encoding at each stage, not to try to encode/decode after the mess is made. In almost all cases no conversion code is needed if you specify UTF-8 at every stage.
Here are some notes on what to do in Python: http://mysql.rjweb.org/doc.php/charcoll#python
The hex for É should be C389 and the hex for Ï should be C38F. There should be no \uxxxx except in HTML. Even in HTML, it is normally better to simply use the utf8 encoding, since HTML can handle it.

How to decode base64 unicode string using T-SQL

I can't decode Turkish characters in a base64 string.
Base64 string = "xJ/DvGnFn8Onw7bDlsOHxLDEnsOcw5w="
When decoded, it should be: 'ğüişçöÖÇİĞÜÜ'
I try to decode it like this:
SELECT CAST(
CAST(N'' AS XML).value('xs:base64Binary("xJ/DvGnFn8Onw7bDlsOHxLDEnsOcw5w=")' , 'VARBINARY(MAX)')
AS NVARCHAR(MAX)
) UnicodeEncoding ;
Based on this answer: Base64 encoding in SQL Server 2005 T-SQL
But I get a response like this: '鿄볃앩쎟쎧쎶쎖쒇쒰쎞쎜'
The Base64 string is correct, because when I decode it on Base64decode.org it works.
Is there any way to decode the Turkish characters?
Your base64-encoded data contains a UTF-8 string. MS SQL doesn't support UTF-8, only UTF-16, so it fails for any characters outside of ASCII.
The solution is either to send the data as nvarchar right away, or to encode the string as UTF-16 (and send it as varbinary or base64, as needed).
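As an illustration of the second option, a hedged Python sketch (done outside SQL Server) that decodes the base64 as UTF-8 and re-encodes it as UTF-16LE, which is what an NVARCHAR cast expects:

import base64

b64_utf8 = "xJ/DvGnFn8Onw7bDlsOHxLDEnsOcw5w="
text = base64.b64decode(b64_utf8).decode("utf-8")
print(text)  # ğüişçöÖÇİĞÜÜ

# Re-encode as UTF-16LE so a CAST(... AS NVARCHAR(MAX)) on the SQL Server side
# interprets the bytes correctly.
b64_utf16 = base64.b64encode(text.encode("utf-16-le")).decode("ascii")
print(b64_utf16)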