JdbcTemplate does not map result to Map: related to Windows ASCII characters (MySQL)

I got a dump file from MSSQL. It is encoded in EUC-KR and contains some Windows ASCII control characters such as ^F, ^D, and ^M.
What I am trying to do is:
LOAD DATA LOCAL INFILE '{My CSV FILE}' INTO TABLE '{TARGET TABLE}' CHARACTER SET euckr FIELDS TERMINATED BY '|:' - push the CSV into MySQL
Read the data from MySQL with JdbcTemplate in Java source code.
After the LOAD ..., I can see the data in Workbench and it looks normal (it does not display any of the special characters mentioned above).
However, when I execute jdbcTemplate.queryForMap, it cannot push the result into a Map, and I assume the MS ASCII characters are the reason.
The error message is as follows (typed by hand, since the Windows console does not allow copying):
org.springframework.dao.TransientDataAccessResourceException:
PreparedStatementCallback; SQL [SELECT * FROM TARGET_TABLE];
Value '^A4 data1 1999-00-00^Fabc^D0000^A0^#...'
How can I eliminate these special characters?
Should I request a new MSSQL dump file without them? (I do not know whether that is possible in MSSQL; I have no experience with it.)
Is there any way to do some processing before JdbcTemplate maps the result?
Thanks.
FYI, the MySQL encoding is UTF-8 and the version is 5.6.35.

I am not sure, but in my experiments, LOAD DATA LOCAL INFILE on Windows introduces weird characters like these. Executing the same query on OS X or Linux (in my case, the CentOS mysql client) looks fine (it does not insert characters like ^M).
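A minimal sketch of the "process before mapping" idea, assuming the TARGET_TABLE from the error message and that only string columns carry the control characters: replace queryForMap with a query using a custom RowMapper that strips ASCII control characters before building each row's map.

import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import org.springframework.jdbc.core.JdbcTemplate;

public class ControlCharCleanup {

    // Removes ASCII control characters (0x00-0x1F and 0x7F) except tab, LF and CR.
    static String stripControlChars(String s) {
        return s == null ? null : s.replaceAll("[\\x00-\\x08\\x0B\\x0C\\x0E-\\x1F\\x7F]", "");
    }

    static List<Map<String, Object>> readCleaned(JdbcTemplate jdbcTemplate) {
        return jdbcTemplate.query("SELECT * FROM TARGET_TABLE", (rs, rowNum) -> {
            Map<String, Object> row = new LinkedHashMap<>();
            int columns = rs.getMetaData().getColumnCount();
            for (int i = 1; i <= columns; i++) {
                Object value = rs.getObject(i);
                // Sanitize string values; leave numbers, dates, etc. untouched.
                row.put(rs.getMetaData().getColumnLabel(i),
                        value instanceof String ? stripControlChars((String) value) : value);
            }
            return row;
        });
    }
}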

Related

Issue with retrieving special characters of SQL-inserted entities from MySQL in a Spring Boot application

I created a MySQL database from a script, along with inserting rows of data that include special characters such as č, š and ž.
For that reason I set the default character set on the schema in the script:
CREATE SCHEMA IF NOT EXISTS schemaname DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci ;
Now if I go through MySQL console and try retrieving data from a table with these special characters:
SELECT name FROM tablename;
I'll get a response with these special characters:
+--------------------------+
| name                     |
+--------------------------+
| Test of š, and č, and ž  |
+--------------------------+
I've also created a Spring Boot application (Hibernate with Spring Boot Starter Data JPA) in which I implemented a REST API for the entities in my database.
The problem arises when I try to retrieve (previously inserted) entities using JpaRepository (EntityRepository.findAll()). Instead of the correctly written special characters, which are normal in the MySQL console, I get:
Å¡, ž and such characters (should be š, ž)
However, if I store a new entity through the REST API with special characters and retrieve that one, the special characters are fine (š, ž).
Any idea why there is a difference between these, and how I might solve the issue without having to insert every entity through the REST API instead of MySQL inserts?
My application.properties file contains the following (I don't think the issue is here, since entities created through the REST API are fine):
spring.datasource.url=jdbc:mysql://localhost:3306/schemaname?useSSL=false&allowPublicKeyRetrieval=true&useUnicode=true&characterEncoding=UTF-8
A possible cause: the .sql files were created on Windows, but I am running the SQL server (in Docker) and my application on Linux (all .sql files have UTF-8 character encoding).
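For what it's worth, the mojibake shown above (Å¡ for š) is exactly what appears when UTF-8 bytes are decoded as Latin-1; a minimal Java sketch to confirm the diagnosis:

import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    public static void main(String[] args) {
        String original = "š";
        // Encode as UTF-8 (two bytes: 0xC5 0xA1), then decode as Latin-1.
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        String garbled = new String(utf8, StandardCharsets.ISO_8859_1);
        System.out.println(garbled); // prints "Å¡", matching the symptom above
    }
}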
The solution I found was actually really simple; I can't believe I missed it.
I just had to put N in front of every string with special characters in the .sql file. Example:
INSERT INTO X(FIELD1) VALUE (N'ščž');
This simply means I'm passing an NCHAR/NVARCHAR/NTEXT value instead of a normal CHAR/VARCHAR/TEXT one.
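For comparison, a minimal sketch of the same insert done from the Spring Boot application with a parameterized query; with a prepared statement the driver sends the value in the connection's encoding (UTF-8 here), so no N prefix is needed. The table and column names are the hypothetical X/FIELD1 from the example above:

import org.springframework.jdbc.core.JdbcTemplate;

public class UnicodeInsert {
    static void insert(JdbcTemplate jdbcTemplate) {
        // The driver transmits the parameter using the connection's
        // characterEncoding, so š, č and ž survive without N'...'.
        jdbcTemplate.update("INSERT INTO X (FIELD1) VALUES (?)", "ščž");
    }
}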

LOAD DATA FROM S3 command failing because of timestamp

I'm running the "LOAD DATA FROM S3" command to load a CSV file from S3 into Aurora MySQL. The command works fine if I run it in MySQL Workbench (it reports the exception below as a warning, but still inserts the dates fine), but when I run it from Java I get the following exception:
com.mysql.cj.jdbc.exceptions.MysqlDataTruncation:
Data truncation: Incorrect datetime value: '2018-05-16T00:31:14-07:00'
Is there a workaround? Is there something I need to set up on the MySQL side or in my app to make this transformation seamless? Should I somehow run a REPLACE() on the timestamp?
Update 1:
When I use REPLACE to remove the "-07:00" from the original timestamp (2018-05-16T00:31:14-07:00), it loads the data appropriately. Here's my load statement:
LOAD DATA FROM S3 's3://bucket/object.csv'
REPLACE
INTO TABLE sample
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(@myDate)
SET `created-date` = replace(@myDate, '-07:00', ' ');
For obvious reasons this is not a good solution. Why would the LOAD statement work in MySQL Workbench and not in my Java code? Can I set some parameter to make it work? Any help is appreciated!
The way I solved it was by using MySQL's SUBSTRING function in the SET part of the LOAD DATA query (instead of REPLACE):
SUBSTRING(@myDate, 1, 10)
This way the trailing '-07:00' is removed. (I actually opted to drop the time as well, since I didn't need it, but you can use the same approach for TIMESTAMP columns.)
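For reference, a sketch of that working statement issued from Java via JdbcTemplate (bucket, table, and column names as in the question; JdbcTemplate itself is an assumption, any plain Statement.execute would do the same):

import org.springframework.jdbc.core.JdbcTemplate;

public class LoadFromS3 {
    static void load(JdbcTemplate jdbcTemplate) {
        // SUBSTRING(@myDate, 1, 10) keeps only the yyyy-MM-dd part, so the
        // trailing UTC offset never reaches the DATETIME column.
        jdbcTemplate.execute(
            "LOAD DATA FROM S3 's3://bucket/object.csv' " +
            "REPLACE INTO TABLE sample " +
            "FIELDS TERMINATED BY '\\t' " +
            "LINES TERMINATED BY '\\n' " +
            "IGNORE 1 LINES " +
            "(@myDate) " +
            "SET `created-date` = SUBSTRING(@myDate, 1, 10)");
    }
}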

Converting to UTF-8 in Notepad++ Causes Errors with Integers in MySQL Imports?

I am currently working on a project that requires a large data migration for a company. We are in the process of planning and testing data imports from an existing Access database to a MySQL database for a CRM they will be using.
We encountered errors when importing (using LOAD DATA INFILE) exported data in .csv format whenever records contain accented or special characters, because the exported files are in ANSI format (the rest of the MySQL database is all in UTF-8). I managed to fix this by using the Convert to UTF-8 function in Notepad++, but that was before I knew we also needed to import the existing primary key IDs from the Access database.
Doing the same process with the added IDs causes MySQL to throw an error:
Error Code: 1366. Incorrect integer value: '135' for column 'id' at row 1
Is there a way to convert all this data to UTF8 without having integer values throw errors?
Convert the file to UTF-8 without BOM and try again :)
The trick is that at the beginning of a UTF-8 file there is a BOM sequence, so your number 135 at the start of the file is actually 0xEF 0xBB 0xBF 1 3 5, which causes the error in a TSV importer unaware of UTF-8.
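If you'd rather strip the BOM programmatically than re-save every file in Notepad++, a minimal sketch (the file name is hypothetical):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

public class BomStripper {
    public static void main(String[] args) throws IOException {
        Path csv = Paths.get("export.csv"); // hypothetical file name
        byte[] bytes = Files.readAllBytes(csv);
        // The UTF-8 BOM is the three-byte sequence 0xEF 0xBB 0xBF.
        if (bytes.length >= 3
                && (bytes[0] & 0xFF) == 0xEF
                && (bytes[1] & 0xFF) == 0xBB
                && (bytes[2] & 0xFF) == 0xBF) {
            // Rewrite the file without the leading three BOM bytes.
            Files.write(csv, Arrays.copyOfRange(bytes, 3, bytes.length));
        }
    }
}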

MySQL to SPSS via ODBC: cannot retrieve strings

I am using MySQL 5.6, SPSS 22 and ODBC GUI with Actual ODBC Pack (for Mac OS X) 3.2.1 on Mavericks OS.
I am able to connect to the database, select the table, and even the fields. The table has about 20 string variables and 10 numeric ones. All looks normal as I go through each step.
When I retrieve the data into SPSS, all the numeric variables import fine, but the strings are a garbled mess (see attachment). However, in the variable view you can see that all the string variable names are fine.
I rebooted and restarted both MySQL and SPSS and got the same results.
Any suggestions?
I can't make out what the strings look like from the picture, but your description sounds like an encoding problem. Try changing the Unicode and locale settings (Edit > Options > Language) in Statistics, or find out what the encoding is in the database and try to match it.
It is an encoding issue. In SPSS, without a data set loaded, go to Edit > Options > Language and change the "character encoding for data" setting to "Locale's writing system". Then run your database query again.

MySQL LOAD DATA INFILE with fields terminated by non-ASCII character

I have a lowercase-thorn-separated file that I need to load into a MySQL database (5.1.54) using a LOAD DATA INFILE ... query.
The file I'm trying to load is located on the same server as the MySQL database, and I'm issuing the query from a Windows machine using SQLYog, which uses the MySQL C client library.
I'm having some major issues. I've tried the FIELDS TERMINATED BY 0x00FE syntax using all the variations of the thorn character I can think of, and I've tried changing the character set of the connection (SET NAMES ...), but I consistently get the warning...
Warning Code : 1638
Non-ASCII separator arguments are not fully supported
...and all the data loads into the first column.
Is there any way around this at all? Or am I resigned to pre-processing the file with sed to replace all the thorns with a more sensible character before loading?
I succeeded in loading this data with the Data Import tool (CSV format) in dbForge Studio for MySQL; I just set 'Þ' as a custom delimiter. Import from the CSV format is fully supported in the free Express Edition.
I decided to fix the file by replacing the non-ASCII character with a character that MySQL's LOAD DATA INFILE ... would understand.
Use od to get the octal byte value of the offending character - od -b file.log - in this case it's 376.
Use grep to make sure the character you want to replace it with doesn't already exist in the file - grep -n '|' file.log.
Use sed and printf to replace the non-ASCII character - sed -i 's/'$(printf '\376')'/|/g' file.log.
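If a shell isn't available (for example, when the preprocessing has to happen inside a Java loader), the same byte-level replacement can be done with a short sketch; the file name and pipe replacement come from the od/grep/sed steps above:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ThornReplacer {
    public static void main(String[] args) throws IOException {
        Path file = Paths.get("file.log");
        byte[] bytes = Files.readAllBytes(file);
        // Octal 376 == 0xFE, the thorn byte reported by od above. Working on
        // raw bytes avoids any charset-decoding surprises.
        for (int i = 0; i < bytes.length; i++) {
            if ((bytes[i] & 0xFF) == 0xFE) {
                bytes[i] = '|'; // the replacement character checked with grep
            }
        }
        Files.write(file, bytes);
    }
}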