MySQL LOAD DATA INFILE with fields terminated by non-ASCII character

I have a file whose fields are separated by the lowercase thorn character (þ) that I need to load into a MySQL database (5.1.54) using a LOAD DATA INFILE ... query.
The file I'm trying to load is located on the same server as the MySQL database, and I'm issuing the query from a Windows machine using SQLYog, which uses the MySQL C client library.
I'm having some major issues. I've tried the FIELDS TERMINATED BY 0x00FE syntax with every variation of the thorn character I can think of, and I've tried changing the character set of the connection (SET NAMES ...), but I consistently get the warning...
Warning Code : 1638
Non-ASCII separator arguments are not fully supported
...and all the data loads into the first column.
Is there any way around this at all? Or am I resigned to pre-processing the file with sed to replace all the thorns with a more sensible character before loading?
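For reference, a minimal sketch of the kind of statement being attempted (file path, table name, and character set are placeholders):

LOAD DATA INFILE '/path/to/file.log'
INTO TABLE my_table
CHARACTER SET latin1
FIELDS TERMINATED BY X'FE';

(Thorn is byte 0xFE in latin1; spelling the separator as 0xFE, X'FE', or a literal 'þ' produces the same warning.)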

I succeeded in loading this data with the Data Import tool (CSV format) in dbForge Studio for MySQL; I just set 'Þ' as a custom delimiter. CSV import is fully supported in the free Express Edition.

I decided to fix the file by replacing the non-ASCII character with one that MySQL's LOAD DATA INFILE ... would understand:
1. Use od to get the octal value of the offending byte: od -b file.log. In this case it's 376 (hex 0xFE).
2. Use grep to make sure the intended replacement character doesn't already exist in the file: grep -n '|' file.log.
3. Use sed and printf to replace the non-ASCII character: sed -i 's/'$(printf '\376')'/|/g' file.log.
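With the separators rewritten, the load itself becomes straightforward (path and table name are placeholders):

LOAD DATA INFILE '/path/to/file.log'
INTO TABLE my_table
FIELDS TERMINATED BY '|';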


JdbcTemplate not mapping result to Map: related to Windows ASCII characters

I got a dump file from MSSQL. It is encoded in euckr and contains some Windows ASCII control characters such as ^F, ^D, and ^M.
What I am trying to do is:
1. Push the CSV into MySQL: LOAD DATA LOCAL INFILE '{My CSV FILE}' INTO TABLE `{TARGET TABLE}` CHARACTER SET euckr FIELDS TERMINATED BY '|:'
2. Read the data from MySQL with jdbcTemplate in Java code.
After the LOAD ..., I can see the data in Workbench and it looks normal (it does not display any of the special characters mentioned above).
However, when I execute jdbcTemplate.queryForMap, it cannot map the result, and I assume the MS ASCII characters are the reason.
The error message is (retyped by hand, since the Windows console does not allow copying):
org.springframework.dao.TransientDataAccessResourceException:
PreparedStatementCallback; SQL [SELECT * FROM TARGET_TABLE];
Value '^A4 data1 1999-00-00^Fabc^D0000^A0^#...'
How can I eliminate these special characters?
Should I request a new MSSQL dump file without them? (I do not know whether that is possible in MSSQL; I have no experience with it.)
Is there any way to do some processing before jdbcTemplate maps the result?
Thanks.
FYI, the MySQL encoding is UTF8 and the version is 5.6.35.
I am not sure, but in my experiments, LOAD DATA LOCAL INFILE on Windows produces weird characters like these. Executing the same query on OS X or Linux (in my case, a CentOS mysql client) looks fine and does not insert characters like ^M.
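One thing worth trying (a sketch only, untested against this data; the user variables and column names are made up): read each field into a user variable during the load and strip the control bytes (^A = 0x01, ^D = 0x04, ^F = 0x06) in a SET clause.

LOAD DATA LOCAL INFILE '{My CSV FILE}'
INTO TABLE `{TARGET TABLE}`
CHARACTER SET euckr
FIELDS TERMINATED BY '|:'
(@c1, @c2)
SET col1 = REPLACE(REPLACE(REPLACE(@c1, X'01', ''), X'04', ''), X'06', ''),
    col2 = REPLACE(REPLACE(REPLACE(@c2, X'01', ''), X'04', ''), X'06', '');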

How to load a column value that contains the enclosing character with MySQL LOAD DATA INFILE

I'm using mysqlimport, which uses the LOAD DATA INFILE command. My question is the following: assume I have --fields-enclosed-by='"' and a column whose values contain a double quote, such as "5" object" (which stands for 5 inches). The problem is that when MySQL encounters the double quote after the 5, it treats it as the enclosing character, and things get messed up. How do I use mysqlimport with such values? I don't want to just switch to another enclosing character, because that character may occur in the data as well. So what is a general solution for this?
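For what it's worth, LOAD DATA INFILE (which mysqlimport wraps) does define an escaping convention for exactly this case: within an enclosed field, a doubled enclosing character ("5"" object") or a FIELDS ESCAPED BY escape ("5\" object", with the default backslash) is read back as a single literal quote. A sketch, assuming the file can be regenerated to follow that convention (path and table name are placeholders):

LOAD DATA INFILE '/path/to/data.csv'
INTO TABLE my_table
FIELDS TERMINATED BY ',' ENCLOSED BY '"' ESCAPED BY '\\';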
I guess importing it as CSV is going to be difficult this way. To work around the issue another way:
Export or convert the old data into SQL format rather than CSV format.
Import that SQL data using the mysql command-line tool:
mysql -hservername -uusername -p'password' dbname < 'path to your sql file.sql'

Not able to display Chinese characters after loading them into a Postgres DB

I have a source file which contains Chinese characters. After loading that file into a table in the Postgres DB, all the characters are garbled and I'm not able to see the Chinese characters. The encoding of the Postgres DB is UTF-8. I'm using the psql utility on my local macOS machine to check the output. The source file was generated from a MySQL db using mysqldump and contains only INSERT statements.
INSERT INTO "trg_tbl" ("col1", "col2", "col3", "col4", "col5", "col6", "col7", "col7",
"col8", "col9", "col10", "col11", "col12", "col13", "col14",
"col15", "col16", "col17", "col18", "col19", "col20", "col21",
"col22", "col23", "col24", "col25", "col26", "col27", "col28",
"col29", "col30", "col31", "col32", "col33")
VALUES ( 1, 1, '与é<U+009D>žç½‘_首页&频é<U+0081>“页顶部广告ä½<U+008D>(946×90)',
'通æ <U+008F>广告(Leaderboard Banner)',
0,3,'',946,90,'','','','',0,'f',0,'',NULL,NULL,NULL,NULL,NULL,
'2011-08-19 07:29:56',0,0,0,'',NULL,0,NULL,'CPM',NULL,NULL,0);
What can I do to resolve this issue?
The text was mangled before that SQL statement was produced. You probably wanted the text to start with 与 rather than the "Mojibake" version, ä¸Ž. I suggest you fix the dump to produce either utf8 characters or hex. Then the load may work, or there may be more places where utf8 needs to be specified, such as SET NAMES or the equivalent.
Also, for Chinese, CHARACTER SET utf8mb4 is preferred in MySQL.
é<U+009D>ž is so mangled I don't want to figure out the second character.
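A sketch of that fix, assuming the dump can be regenerated (the database name here is a placeholder; trg_tbl is from the question):

mysqldump --default-character-set=utf8mb4 mydb trg_tbl > dump.sql

and, on the Postgres side, before replaying the file:

SET client_encoding = 'UTF8';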

Issue with MySQL and Greek characters using MySQL for Excel

I used MySQL for Excel to import data from Excel into a MySQL DB.
When I ran a SELECT in MySQL Workbench, I realised that the Greek characters appeared as question marks ("?").
Then I saved the Excel file as .csv and opened it with Notepad++ in order to encode it as UTF-8.
Then I used the following command, and the problem with the Greek characters got even worse:
LOAD DATA LOCAL INFILE 'c:/working.csv'
INTO TABLE tablexxx
CHARACTER SET UTF8
FIELDS TERMINATED BY ';' ENCLOSED BY '"' LINES TERMINATED BY '\r\n';
Can you please help me out? Dead end here!
Try the Greek character set:
CHARACTER SET CP869;
But usually it should work with UTF-8; maybe you need to adjust the settings in Notepad++ or in MySQL Workbench and the other programs you use.
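If UTF-8 is the goal, it may also be worth confirming that the connection, table, and column character sets really are UTF-8 before re-running the load; a quick check, using the table name from the question:

SHOW VARIABLES LIKE 'character_set%';
SHOW CREATE TABLE tablexxx;
ALTER TABLE tablexxx CONVERT TO CHARACTER SET utf8mb4;  -- only if the table turns out not to be UTF-8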

Change the column data delimiter on mysqldump output

I'm looking to change the formatting of the output produced by the mysqldump command in the following way:
(data_val1,data_val2,data_val3,...)
to
(data_val1|data_val2|data_val3|...)
The change here being a different delimiter. This would allow me to parse the data lines in Python with line.split("|") and end up with correctly split values (as opposed to using line.split(",") and having values that contain commas split into multiple pieces).
I've tried the --fields-terminated-by flag, but it requires the --tab flag as well, and I don't want to use --tab because it splits the dump into several files. Does anyone know how to alter the delimiter mysqldump uses?
This is not a good idea. Instead of using string.split() in Python, use the csv module to properly parse CSV-style data, which may be enclosed in quotes and may contain internal commas that aren't delimiters; a minimal sketch (with a made-up input line) follows.
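import csv
import io

# A made-up mysqldump-style VALUES tuple; a real dump line needs the
# parenthesized body isolated first.
line = "(1,'hello, world','another value')"
inner = line.strip().lstrip('(').rstrip(');,')

# mysqldump quotes strings with single quotes and escapes with backslashes
row = next(csv.reader(io.StringIO(inner), quotechar="'", escapechar='\\'))
print(row)  # ['1', 'hello, world', 'another value']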
MySQL dump files are intended to be used as input back into MySQL. If you really want pipe-delimited output, use the SELECT INTO OUTFILE syntax instead with the FIELDS TERMINATED BY '|' option.
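A sketch of that alternative (table name and output path are placeholders; note the file is written on the MySQL server host):

SELECT *
INTO OUTFILE '/tmp/my_table.txt'
FIELDS TERMINATED BY '|' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
FROM my_table;

Even then, string fields can come back quoted because of OPTIONALLY ENCLOSED BY, so the csv module (with delimiter='|') remains the safer way to read the result in Python.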