Can't encode Chinese properly in the console using RStudio - MySQL

I am using RODBC to connect to MySQL from R, like below:
library(RODBC)
channel=odbcConnect("MySQL_ODBC_AIRFORECASTSYSTEM",uid="root",pwd = "3896123ray")
sql="select * from region_station"
ttt = sqlQuery(channel, query = sql)
As you can see, I've put the result into a data frame, "ttt", and when I use View(ttt) to show the contents, the Chinese displays properly.
However, when I use ttt[,2] to show the second column in the console, the text comes out garbled (see below).
Any help would be greatly appreciated.
The first column is the Chinese text, the second column is the output of MySQL's HEX(), and the third column is what RStudio's console shows.
二林站 E4BA8CE69E97E7AB99 鈭\x9e\xab\x99
南投站 E58D97E68A95E7AB99 \xe5\x8d\x8a\xab\x99
埔里站 E59F94E9878CE7AB99 \xe5\x9f\x87\xab\x99
大里站 E5A4A7E9878CE7AB99 憭折\x87\xab\x99
彰化站 E5BDB0E58C96E7AB99 敶啣\x8c\xab\x99
忠明站 E5BFA0E6988EE7AB99 敹\x98\xab\x99

Code page 950 (which is Big5) seems to be what the console is decoding the bytes as. For example:
CONVERT(BINARY('大里站') USING big5) --> 憭折
which agrees with one of your dumps.
So...
SET NAMES big5;
(or however you specify the CHARACTER SET to MySQL from RStudio),
or change the LC (locale) values to UTF-8.
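For example, a minimal check-and-fix sketch, assuming the statements are sent over the same ODBC session (e.g. via RODBC's sqlQuery, or paired with odbcConnect's DBMSencoding argument); the column name here is a placeholder:
-- Inspect what character set the ODBC session negotiated
SHOW VARIABLES LIKE 'character_set%';
-- Tell the server the client side speaks Big5 (or use utf8 and fix the console locale instead)
SET NAMES big5;
-- Compare the stored bytes against what the console renders (station_name is hypothetical)
SELECT station_name, HEX(station_name) FROM region_station;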


Unable to retrieve values from JSON data on iSeries (IBM Midrange)

Apologies in advance for my lack of knowledge in the area of JSON.
I have created a table called DSITATR on iSeries using SQL, which has 2 columns:
UPC# (12 char),
ATRDTA (CLOB 4M)
I then manually inserted a row into the table using the following SQL:
insert into DSITATR values ('123456789991', '
{"PACKS":6, "KG" : 0.367, "DESC" : "Buzz Toy "}')
I have written a program in SQLRPGLE to retrieve the whole CLOB value, and it seems to come back as expected:
{"PACKS":6, "KG" : 0.367, "DESC" : "Buzz Toy "}
However, I cannot seem to extract a value using JSON_VALUE, e.g.:
SELECT JSON_VALUE(ATRDTA, '$.DESC') as Description FROM DSITATR
All I get back is a NULL value. I'm probably doing something stupid but as far as I can see the JSON is correct. Any ideas anyone?
Andre Hill
Worked fine for me...
Where are you trying to run the SELECT?
EDIT
I'm also at 7.2, so that's not your issue.
JSON_VALUE() defaults to RETURNING CLOB(2G) CCSID 1208
CCSID 1208 is UTF-8 (Unicode)
If you're using the older Client Access 5250 emulator, the green-screen STRSQL will have problems showing that.
The "Run SQL Scripts" component of either the old Client Access or the current Access Client Solutions handles Unicode just fine, as shown in my screenshot.
RPG IV (ILE RPG) handles Unicode just fine, but you need to declare that the data is in Unicode:
dcl-s myJson varchar(200000) ccsid(1208);
Again, trying to display/debug using an old 5250 emulator would be problematic. The PC-based RDi debugger wouldn't have a problem.
Last but not least... you could have the system return the data in your preferred EBCDIC CCSID (37 = US English):
SELECT
JSON_VALUE(ATRDTA, '$.DESC'
returning varchar(100) ccsid(37)
) as Description
FROM DSITATR
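To separate a display problem from a genuine NULL, a hedged diagnostic sketch is to run the same path against a literal and then against the column, asking JSON_VALUE to raise errors instead of folding them into NULL (assuming the ERROR ON ERROR clause is available at your release level):
-- Sanity check against a literal; SYSIBM.SYSDUMMY1 is the standard one-row table
SELECT JSON_VALUE('{"PACKS":6, "KG":0.367, "DESC":"Buzz Toy"}', '$.DESC')
FROM SYSIBM.SYSDUMMY1
-- Against the column, surfacing any parsing/CCSID problem as an error rather than NULL
SELECT
JSON_VALUE(ATRDTA, '$.DESC'
returning varchar(100) ccsid(37)
ERROR ON ERROR
) as Description
FROM DSITATR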

SSIS MSSQL to MySQL with different collation is not copying Finnish letter å

I don't think the title could describe it any better as a tl;dr, because the problem is a bit deeper.
I've got two databases (finnish language):
MSSQL (collation: SQL_Latin1_General_CP437_CI_AI)
MySQL (collation: utf8_general_ci)
I've created a BI project in VS2017, connected the two databases and transferred tables from one to the other, no problem. Except for one letter: "å" - instead it came out as "?". I cannot change either database's collation, so I am trying to find a way to transfer words containing this letter.
What I've tried:
OLD DB Source -> ODBC Destination
Point "1" with "Data Conversion" block in between (with code page 1252)
Script Component, in which I have tried:
Insert with "_latin"
sql= "INSERT INTO db.words(Name) VALUES(_latin1'å')";
byte[] b = Encoding.UTF8.GetBytes(sql);
odbcCmd = new OdbcCommand(Encoding.UTF8.GetString(b), odbcConn);
odbcCmd.ExecuteNonQuery();
Insert without it:
sql= "INSERT INTO db.words(Name) VALUES('å')";
byte[] b = Encoding.UTF8.GetBytes(sql);
odbcCmd = new OdbcCommand(Encoding.UTF8.GetString(b), odbcConn);
odbcCmd.ExecuteNonQuery();
Different ways of encoding:
byte[] bytes = Encoding.GetEncoding(1252).GetBytes("å");
var myString = Encoding.GetEncoding(1252).GetString(bytes);
byte[] bytes2 = Encoding.Default.GetBytes("å");
var myString2 = Encoding.Default.GetString(bytes2);
Insert with COLLATE, which got me an error:
insert into db.words(Name) values ("å" COLLATE latin1_swedish_ci) ;
and the error:
System.Data.Odbc.OdbcException: „ERROR [HY000] [MySQL][ODBC 5.3(a) Driver][mysqld-5.7.21-log]COLLATION 'latin1_swedish_ci' is not valid for CHARACTER SET 'cp1250'”
Here is the interesting part:
I can do an insert with this letter in MySQL Workbench without a problem, and it gets inserted, but when I try to pass it from one database to the other it is lost. I've set Data Viewers around the Data Conversion step and the letter was still there, and when debugging the script the letter was also still present in the encoded string that was inserted into the database.
Maybe someone has an idea of what else I can try, because I feel like I have tried everything, and I feel the solution to this problem is really close, but I just don't see it.
CP1250 does not include å; CP437 and utf8 do include it.
COLLATE is irrelevant -- it applies only to comparing and sorting.
Don't use any encode/conversion functions; instead, specify how the data is encoded.
I see 'code' -- but what is the encoding for the source in that language and/or editor?
Show us the hex of any strings in question.
Which direction are you trying to transfer?
What are the connection parameters for each database?
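In practice, that means looking at the bytes that actually reached each side and pinning down the session character set rather than converting in code. A hedged sketch on the MySQL end (table and column taken from the snippets above; CHARSET is the Connector/ODBC option that controls the session character set, which is presumably where cp1250 is coming from):
-- 'å' should show up as C3A5 if stored as utf8, E5 as latin1, 86 as cp437
SELECT Name, HEX(Name) FROM db.words;
-- In the ODBC DSN / connection string, something like CHARSET=utf8 would make the
-- session use utf8 instead of the cp1250 mentioned in the error message.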

SAS pass-through - Extract from MySQL does not work

I'm trying to build a Data Integration job that uses pass-through to extract data from a view in a MySQL database.
We've been using pass-through a lot in the project, mostly extracting data from Redshift;
however, with MySQL I was not able to make it work properly.
It keeps complaining that a table is missing, even though when pass-through is off the view is found and the data is extracted...
I tried every trick I know, from enabling case-sensitive DBMS object names to manually removing single/double quotes from the statement, just in case MySQL confuses them with something else...
No luck.
The ODBC driver is [MySQL][ODBC 5.3(a) Driver][mysqld-5.5.53],
running in a Windows environment.
Any idea how to solve this?
Thank you in advance.
EDIT
So, first of all, one correction (even though not that important): I extract from a view, not a table.
This is the code generated by the SAS Create Table transformation, with pass-through enabled. In the outer select I've put an asterisk instead of the full list of columns:
proc sql;
connect to ODBC
(
READBUFF=10000 DATASRC="cmp.web_api" AUTHDOMAIN="MYSQL_CMP_Auth"
);
create table work."W7ZZZKOC"n as
select
*
from connection to ODBC
(
select
V_BI_ACCOUNT.ACCOUNT_NAME,
V_BI_ACCOUNT.ACQUISITION_SOURCE__C,
V_BI_ACCOUNT.ZUORA__ACTIVE__C,
V_BI_ACCOUNT.ADDRESS_LINE_1__C,
V_BI_ACCOUNT.ADDRESS_LINE_2__C,
V_BI_ACCOUNT.ADDRESS_LINE_3__C,
V_BI_ACCOUNT.AGREEMENT_DATE,
V_BI_ACCOUNT.AGREEMENT_LEGAL_CLAUSE_1__C,
V_BI_ACCOUNT.AGREEMENT_LEGAL_CLAUSE_2__C,
V_BI_ACCOUNT.PERSONBIRTHDATE,
V_BI_ACCOUNT.BLOCKED_REASON__C,
V_BI_ACCOUNT.BRAND__C,
V_BI_ACCOUNT.CPN__C,
V_BI_ACCOUNT.ACCCREATEDBYID,
V_BI_ACCOUNT.ACCCREATEDDATE,
V_BI_ACCOUNT.CURRENCY_PREFERENCE__C,
V_BI_ACCOUNT.CUSTOMER_FULL_NAME__PC,
V_BI_ACCOUNT.ACCOUNTID,
V_BI_ACCOUNT.ZUORA__CUSTOMERPRIORITY__C,
V_BI_ACCOUNT.DELIVERY_SALUTATION__C,
V_BI_ACCOUNT.DISPLAY_NAME,
V_BI_ACCOUNT.PERSONEMAIL,
V_BI_ACCOUNT.EMAILKEY__C,
V_BI_ACCOUNT.FACEBOOKKEY,
V_BI_ACCOUNT.FIRSTNAME,
V_BI_ACCOUNT.GENDER__C,
V_BI_ACCOUNT.PHONE,
V_BI_ACCOUNT.ACCLASTACTIVITYDATE,
V_BI_ACCOUNT.ACCLASTMODIFIEDDATE,
V_BI_ACCOUNT.LASTNAME,
V_BI_ACCOUNT.OTHER_EMAIL__C,
V_BI_ACCOUNT.PI_TYPE__C,
V_BI_ACCOUNT.ACCPARENTID,
V_BI_ACCOUNT.POSTCODE__C,
V_BI_ACCOUNT.PRIMARY_ACCOUNT_OF_THIS_CUSTOMER,
V_BI_ACCOUNT.ACCPRIMARY__C,
V_BI_ACCOUNT.ACCREASON_FOR_STATUS__C,
V_BI_ACCOUNT.ZUORA__SLA__C,
V_BI_ACCOUNT.ZUORA__SLASERIALNUMBER__C,
V_BI_ACCOUNT.SALUTATION,
V_BI_ACCOUNT.ACCSYSTEMMODSTAMP,
V_BI_ACCOUNT.PERSONTITLE,
V_BI_ACCOUNT.ZUORA__UPSELLOPPORTUNITY__C,
V_BI_ACCOUNT.X_CODE__C,
V_BI_ACCOUNT.ZUORA__ACCOUNT_ID__C,
V_BI_ACCOUNT.ZUORA__PAYMENTMETHODID__C,
V_BI_ACCOUNT.CITY,
V_BI_ACCOUNT.ORIGINAL_CREATED_DATE,
V_BI_ACCOUNT.SOURCE_SYSTEM_ID,
V_BI_ACCOUNT.STATUS,
V_BI_ACCOUNT.ZUORA__CONTACT_ID,
V_BI_ACCOUNT.ACCISDELETED,
V_BI_ACCOUNT.BILLING_ACCOUNT_NAME,
V_BI_ACCOUNT.ACZCREATEDDATE,
V_BI_ACCOUNT.ACZSYSTEMMODSTAMP,
V_BI_ACCOUNT.ACZLASTACTIVITYDATE,
V_BI_ACCOUNT.ZUORA__ACCOUNT__C,
V_BI_ACCOUNT.ZUORA__ACCOUNTNUMBER__C,
V_BI_ACCOUNT.ZUORA__AUTOPAY__C,
V_BI_ACCOUNT.ZUORA__BALANCE__C,
V_BI_ACCOUNT.ZUORA__CREDITCARDEXPIRATION__C,
V_BI_ACCOUNT.ZUORA__CURRENCY__C,
V_BI_ACCOUNT.ZUORA__MRR__C,
V_BI_ACCOUNT.ZUORA__PAYMENTTERM__C,
V_BI_ACCOUNT.ZUORA__PURCHASEORDERNUMBER__C,
V_BI_ACCOUNT.ZUORA__LASTINVOICEDATE__C,
V_BI_ACCOUNT.COUNTRY_NAME,
V_BI_ACCOUNT.COUNTRY_CODE,
V_BI_ACCOUNT.FAVOURITE_FOOTBALL_CLUB,
V_BI_ACCOUNT.COUNTY
from
web_api.V_BI_ACCOUNT as V_BI_ACCOUNT
);
%rcSet(&sqlrc);
disconnect from ODBC;
quit;
And again, when I extract the data without pass-through, it works successfully.
I found out the problem was a column name exceeding 32 characters.
As SAS only supports column names of up to 32 characters,
the query fails to find PRIMARY_ACCOUNT_OF_THIS_CUSTOMER, because the original column name is PRIMARY_ACCOUNT_OF_THIS_CUSTOMER__C.
EDIT
One more thing I found out is that MySQL doesn't like the schema name or table aliases being specified.
Therefore:
the FROM clause should only specify the table name, i.e. 'from v_bi_account' rather than 'from web_api.v_bi_account',
and it should not use aliases, i.e. 'from v_bi_account' rather than 'from v_bi_account as v_bi_account' (see the sketch below).
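A minimal sketch of how the corrected inner pass-through query might look under those constraints, with the over-long column aliased down to the 32-character name SAS generated (column list abbreviated here):
select
ACCOUNT_NAME,
/* alias the 35-character source column to a 32-character name SAS can handle */
PRIMARY_ACCOUNT_OF_THIS_CUSTOMER__C as PRIMARY_ACCOUNT_OF_THIS_CUSTOMER,
COUNTY
from v_bi_account /* no schema prefix, no table alias */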
Thank you guys so much for your help.

MySQL CONCAT returns incorrect result when used with variable in DBeaver

Is there any known issue about using variables in CONCAT or am I making a mistake in below query?
set @m := '2016';
select concat('2015','-',@m);
Expected result is 2015-2016, but strangely it returns
2015F201
I tested many other variations with and without variables; it works as expected without variables, but returns similar 'unexpected' results when used with variables.
I'm using DBeaver as the SQL client; it somehow thinks that the result of this query is binary:
select concat('2015','-',@m);
and shows it incorrectly: 2015F201 (not exactly hexadecimal).
When I change the setting under the Preferences window, Common / Result Sets / Binaries / Binary Data formatter, to String, it shows correctly.
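For what it's worth, a hedged SQL-side alternative (instead of changing the client setting) is to give the variable an explicit character set, so the driver reports a text type rather than a binary one; whether this is needed depends on the driver:
set @m := '2016';
-- CONVERT ... USING forces a character result, which DBeaver should then render as text
select concat('2015', '-', convert(@m using utf8));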

MySQL: WHERE condition doesn't seem to work properly

I have a table called Traduction with these two rows:
francais |espagnol |allemand |anglais
-------------+-----------------+---------------+----------------
ORANGE litée |NARANJA ENCAJADA |ORANGEN GELEGT |ORANGE 1 LAYER
ORANGE LITEE |NARANJA ENCAJADA |ORANGEN GELEGT |ORANGE 1 LAYER
My query is:
SELECT * FROM T_TRADUCTION where francais= 'ORANGE LITEE';
This query returns both rows of the table, whereas it should return only the record with the value ORANGE LITEE (not ORANGE litée).
I don't understand why.
Change your database collation to latin1_general_cs.
Set your database DEFAULT CHARACTER SET to latin1.
Now execute your query:
SELECT * FROM T_TRADUCTION where francais= 'ORANGE LITEE';
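If changing the database collation is not an option, a hedged per-query alternative is to force a binary comparison for just this predicate, which makes both case and accents significant:
-- BINARY compares byte-for-byte, so 'ORANGE litée' no longer matches 'ORANGE LITEE'
SELECT * FROM T_TRADUCTION WHERE francais = BINARY 'ORANGE LITEE';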
Try to correct it like this:
SELECT * FROM T_TRADUCTION where francais='ORANGE litée';
Best Regards.
Getting encoding right is tricky; there are too many layers:
Browser,
Page,
PHP,
MySQL.
You need to check in what encoding the data flow at each layer.
Check the HTTP headers and the page's meta headers.
Check what's really sent in the body of the request.
Don't forget that MySQL has encoding almost everywhere:
Database
Tables
Columns
Server as a whole
Client
Make sure the right one is set everywhere.
From the manual:
SET NAMES indicates what character set the client will use to send SQL statements to the server. Thus, SET NAMES 'cp1251' tells the server, “future incoming messages from this client are in character set cp1251.” It also specifies the character set that the server should use for sending results back to the client. (For example, it indicates what character set to use for column values if you use a SELECT statement.)
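As a sketch of the corresponding check-and-set routine (utf8 is assumed here; substitute whatever character set your application actually sends):
-- Declare the client-side character set for this session
SET NAMES utf8;
-- Then verify what is actually stored, independent of any display layer
SELECT francais, HEX(francais) FROM T_TRADUCTION;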