Inserting CSV files into MySQL DB - mysql

I have a CSV file with the delimiter ';'. It looks like the following (the third column should be empty):
id;theme_name;description
5cbde2fe-b70a-5245-bbde-c2504a4bced1;DevTools;allow web developers to test and debug their code. They are different from website builders and integrated development environments (IDEs) in that they do not assist in the direct creation of a webpage, rather they are tools used for testing the user interface of a website or web application.
c8bfc406-aaa6-5cf9-94c3-09fc54b934e7;AI;
Here is my script for inserting the data from the CSV into the database:
mysql -u ${MYSQL_USER} --password=${MYSQL_PASSWORD} ${MYSQL_DATABASE} --local-infile=1 -h ${MYSQL_HOST} -e"LOAD DATA LOCAL INFILE '/tmp/init_data/$file' INTO TABLE $table_name FIELDS TERMINATED BY ';' OPTIONALLY ENCLOSED BY '\"' IGNORE 1 LINES";
When I run a SELECT statement, I get a carriage return (\r) in the last column of the response.
Here is the response from MySQL:
{
"themeName": "DevTools",
"description": "allow web developers to test and debug their code. They are different from website builders and integrated development environments (IDEs) in that they do not assist in the direct creation of a webpage, rather they are tools used for testing the user interface of a website or web application.\r"
}, {
"themeName": "AI",
"description": "\r"
}
When I add the delimiter ';' after the last column in the CSV file, the carriage return disappears from the response.
For example: c8bfc406-aaa6-5cf9-94c3-09fc54b934e7;AI;;
Why does MySQL add \r into the third column?
Is there any way to solve it (other than a REPLACE in the SELECT statement)?
Thanks

I bet your CSV file comes from Windows. Those files have \r\n at the end of every line.
Add this to your command ...
LINES TERMINATED BY '\\r\\n'
and things should work. If they don't, try this:
LINES TERMINATED BY '\r\n'
On UNIX-derived systems (like Linux) the default LINES TERMINATED BY '\n' works.
On Mac systems you need LINES TERMINATED BY '\r'.
If you add a trailing ; column separator you create a fourth column in your CSV. But the table you're loading only has three columns, so LOAD DATA INFILE ignores that fourth column, which has a \r in it.
Why this difference, you ask? Old-timey teletype devices (the original terminals for MS-DOS and UNIX) needed a Return code to move the print head back to the first column, and a Newline code to move the paper up one line. The UNIX Bell Labs team decided their tty driver would add the extra Return code so lines ended with a particular single character. MS-DOS's team (umm, Bill Gates) left them both in.
Why Mac with just Return? Maybe somebody knows.
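Before choosing a LINES TERMINATED BY value, it can help to inspect the file's actual line endings; a small shell sketch (sample.csv is a stand-in for the real file):

```shell
# Build a sample file with Windows-style CRLF endings, for demonstration.
printf 'id;theme_name\r\n1;AI\r\n' > sample.csv

# Show the control characters: CRLF lines end in \r \n, Unix lines in \n only.
od -c sample.csv | head

# Count the lines that end with a carriage return.
crlf_count=$(grep -c $'\r$' sample.csv)
echo "$crlf_count lines end with CR"
```

If the count is non-zero, the Windows '\r\n' terminator above is the one to use.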

Answer:
Per O.Jones's answer I needed to add LINES TERMINATED BY '\r', but I also needed to add \n:
mysql -u ${MYSQL_USER} --password=${MYSQL_PASSWORD} ${MYSQL_DATABASE} --local-infile=1 -h ${MYSQL_HOST} -e"LOAD DATA LOCAL INFILE '/tmp/init_data/$file' INTO TABLE $table_name FIELDS TERMINATED BY ';' ENCLOSED BY '\"' LINES TERMINATED BY '\r\n' IGNORE 1 LINES";

Related

Merging types of EOL (End of Line)

I have some text files that I imported into a database table via the MySQL LOAD DATA command, but I have a problem with some of them.
All of them are 6236 lines:
$ wc -l ber.mensur.txt
6236 ber.mensur.txt
When I import ber.mensur.txt, I only get 1611 records in my table, but the other files produce all 6236 rows.
My LOAD DATA command is :
LOAD DATA INFILE '/home/mohsen/codes/haq/translation-tmp/ber.mensur.txt'
INTO TABLE tr_tmp
FIELDS TERMINATED BY ''
ENCLOSED BY '"'
LINES TERMINATED BY '\n' (aya);
I use Linux, so I'm forced to use \n for the end of line (EOL).
When I examine my database, some records contain more than one line. I think there is a problem with my line endings.
Do you have a solution?
UPDATE:
My file is here
By the way, vim sees my txt file as having 6236 lines.
You can count the lines via:
fd = open(YOURFILE, 'r')
lines = fd.readlines()
print(len(lines))
It works well.
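If the missing rows come from stray \r characters (old Mac CR-only endings make LOAD DATA with '\n' see fewer lines than wc does), normalizing everything to Unix \n before importing is one fix; a sketch on a throwaway file:

```shell
# Sample input mixing CRLF and bare-CR line endings.
printf 'line1\r\nline2\rline3\n' > mixed.txt

# Turn every CR into LF, then drop the empty lines that CRLF pairs produce.
tr '\r' '\n' < mixed.txt | sed '/^$/d' > unix.txt

wc -l < unix.txt   # 3 clean lines
```

Note the sed step also removes any intentionally blank lines, which is usually fine for data files like this one.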

How to LOAD csv file with extra empty row?

I've got a CSV file which I need to load into a MySQL database. The problem is that every row of data is followed by an empty row:
open_test_uuid|open_uuid|asn|bytes_download
O0037c645-0c7b-4bd0-a6dc-1983e6d0f814|Pf13e1f22-92f6-4a49-9bd3-2882373d0266|25255|11704458
O0037c645-0c7b-4bd0-a6dc-1983e6d0f814|Pf13e1f22-92f6-4a49-9bd3-2882373d0266|25255|11704458
I tried different combinations of the LOAD command but nothing works.
LOAD DATA LOCAL INFILE '/Users/BauchMAC/Desktop/details_201301.csv'
INTO TABLE netztest
FIELDS TERMINATED BY '|'
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n\r' <--HERE should be some magic line
IGNORE 1 ROWS;
I have no idea how to solve that...
Thank you for any ideas.
You could try
LINES TERMINATED BY '\n\n'
or
LINES TERMINATED BY '\r\n\r\n'
and if that doesn't work you could pre-process the file before feeding it to MySQL.
If you're on linux you could run this:
cat file.csv | grep -v '^\s*$' > fixed.csv
UPDATE
Perhaps a little Perl will do the trick:
perl -e '$_=`cat file.csv`; s/[\r\n]+/\n/g; print'
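To sanity-check the pre-processing, you can count what survives the blank-line filter; a small sketch with a made-up file:

```shell
# A made-up CSV where every data row is followed by an empty row.
printf 'open_test_uuid|bytes_download\n1|100\n\n2|200\n\n' > details.csv

# POSIX spelling of the blank-line filter; \s is a GNU grep extension.
grep -v '^[[:space:]]*$' details.csv > fixed.csv

wc -l < fixed.csv   # 3 non-blank lines remain (header plus two data rows)
```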

First load of a txt file using the sql command line

I work locally with WAMPSERVER2.5
I built my database in OpenOffice Calc: db.txt
- Character set: Unicode (UTF-8)
- Field separator: ,
- Text separator: "
I want to import "db.txt" in a table named "db_pma" which is in the database named "hct" (in PhpMyAdmin)
"db.txt" contains 2 fields:
- string_field (character string, max:30)
- numeric_field (sample: 5862314523685.256325632)
Questions:
1) Do I first need to create the structure of the table ("db_pma"), which will contain the data, in phpMyAdmin?
2) I tried this code to import "db.txt"
mysql -h localhost -u root
LOAD DATA LOCAL INFILE 'db.txt'
INTO TABLE dp_pma
FIELDS TERMINATED BY ','
LINES TERMINATED BY 'AUTO';
It doesn't work.
Could you help me?
Use LINES TERMINATED BY '\r\n' (for Windows) or LINES TERMINATED BY '\n' (for Linux) instead.
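As for question 1: yes, the table has to exist before LOAD DATA can fill it. A sketch of both steps, assuming the column names from the question (the VARCHAR/DECIMAL sizes are guesses based on the samples, and '\r\n' matches the Windows case above since WAMP runs on Windows):

```sql
-- Create the target table first (column sizes are assumptions).
CREATE TABLE db_pma (
  string_field  VARCHAR(30),
  numeric_field DECIMAL(25,10)
);

-- Then load the OpenOffice Calc export.
LOAD DATA LOCAL INFILE 'db.txt'
INTO TABLE db_pma
CHARACTER SET utf8
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n';
```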

How to import a csv file containing backslashes into MySQL

I'm trying to import a large CSV file with 27797 rows into MySQL. Here is my code:
load data local infile 'foo.csv' into table bar fields terminated by ',' enclosed by '"' lines terminated by '\n' ignore 1 lines;
It works fine. However, some rows of this file containing backslashes (\), for example:
"40395383771234304","40393156566585344","84996340","","","2011-02-23 12:59:44 +0000","引力波宇宙广播系统零号控制站","#woiu 太好了"
"40395151830421504","40392270645563392","23063222","","","2011-02-23 12:58:49 +0000","引力波宇宙广播系统零号控制站","#wx0 确切地讲安全电压是\""不高于36V\""而不是\""36V\"", 呵呵. 话说要如何才能测它的电压呢?"
"40391869477158912","40390512645124096","23063222","","","2011-02-23 12:45:46 +0000","引力波宇宙广播系统零号控制站","#wx0 这是别人的测量结果, 我没验证过. 不过麻麻的感觉的确是存在的, 而且用适配器充电时麻感比用电脑的前置USB接口充电高"
"15637769883","15637418359","35192559","","","2010-06-07 15:44:15 +0000","强互作用力宇宙探测器","#Hc95 那就不是DOS程序啦,只是个命令行程序,就像Android里的adb.exe。$ adb push d:\hc95.tar.gz /tmp/ $ adb pull /system/hc95/eyes d:\re\"
After importing, the lines with backslashes are broken.
How could I fix it? Should I use sed or awk to substitute all \ with \\ (across 27797 rows...)? Or can this be fixed by just modifying the SQL query?
This is a bit more of a discussion than a direct answer. Do you need the double quotes in the middle of the values in the final data (in the DB)? The fact that you have a large amount of data to munge doesn't present any problems at all.
The "" thing is what Oracle does for quotes inside strings. I think whatever built that file attempted to escape the quote sequence. This is the string manual for MySQL. Either of these is valid:
select "hel""lo", "\"hello";
I would tend to do the editing separately to the import, so it easier/faster to see if things worked. If your text file is less than 10MB, it shouldn't take more than a minute to update it via sed.
sed -e 's/\\//g' foo.csv > fixed.csv
From your comments, you can set the escape char to be something other than '\'.
ESCAPED BY 'char'
This means the loader will add the values verbatim. If it gets too complicated, you can base64() the data before inserting it, which will stop any tools from breaking the UTF-8 sequences.
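Alternatively, if you keep the default ESCAPED BY '\\', doubling the backslashes in a pre-processing pass makes them survive the load; a sketch on a throwaway file:

```shell
# Sample line containing a Windows-style path with a backslash.
printf 'd:\\hc95.tar.gz\n' > foo.csv

# Double every backslash so LOAD DATA's default ESCAPED BY '\' keeps them.
sed 's/\\/\\\\/g' foo.csv > escaped.csv

cat escaped.csv   # d:\\hc95.tar.gz
```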
What I did in a similar situation was to build the statement as a Java string first in a test application, then compile the test class and fix any errors I found.
For example:
String me = "LOAD DATA LOCAL INFILE 'X:/access.log/' REPLACE INTO TABLE `logrecords`\n"+
"FIELDS TERMINATED BY \'|\'\n"+
"ENCLOSED BY \'\"\'\n"+
"ESCAPED BY \'\\\\\'\n"+
"LINES TERMINATED BY \'\\r\\n\'(\n"+
"`startDate` ,\n"+
"`IP` ,\n"+
"`request` ,\n"+
"`threshold` ,\n"+
"`useragent`\n"+
")";
System.out.println("" +me);

SQL Remote Connection with file read

So I am trying to access my server's database remotely and have it run commands to export several tables to individual CSV files. What I have is a command line that looks like this:
mysql -h 198.xxx.xxx.xxx -u user -p < file.txt
The contents of file.txt looks like this:
SELECT * FROM log
INTO OUTFILE 'C:\USERS\username\Desktop\log.csv'
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
SELECT * FROM permission_types
INTO OUTFILE 'C:\USERS\username\Desktop\permission_types.csv'
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
SELECT * FROM personal_info_options
INTO OUTFILE 'C:\USERS\username\Desktop\personal_info_options.csv'
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
I am not sure that I have the syntax right, or if this is even possible. I have been doing a bunch of research trying to find examples, but people usually explain the concept and never seem to give you the code you need to test it; it's always something like:
mysql -h localhost -u user -p < somefile
and they don't show you the contents of the file.
I am running Windows 7. I installed WAMPServer, which has MySQL version 5.5.24, and I access it via the command line. I am not sure about the FIELDS TERMINATED BY or the ENCLOSED BY or LINES TERMINATED BY... do I need those at all? Will that actually save to my local machine? I am nervous about running this script; I don't want to make a mistake and mess up the database. Also, is .txt OK for the script file?
Any help you can give would be great.
I am not sure that I have the syntax for this right or if this is even possible I have been doing a bunch of research trying to get examples.
Your syntax is correct, except that each SELECT statement should be terminated with a semicolon. Note that you will also need to specify the database in which your tables reside—it's easiest to do this as an argument to mysql:
mysql -h 198.xxx.xxx.xxx -u user -p mydb < file.txt
I am not sure about the FILEDS TERMINATED BY or the ENCLOSED BY or LINES TERMINATED BY... do I need those at all?
As documented under SELECT ... INTO Syntax:
Here is an example that produces a file in the comma-separated values (CSV) format used by many programs:
SELECT a,b,a+b INTO OUTFILE '/tmp/result.txt'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
FROM test_table;
As explained under LOAD DATA Syntax:
If you specify no FIELDS or LINES clause, the defaults are the same as if you had written this:
FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\'
LINES TERMINATED BY '\n' STARTING BY ''
It goes on to explain:
Conversely, the defaults cause SELECT ... INTO OUTFILE to act as follows when writing output:
Write tabs between fields.
Do not enclose fields within any quoting characters.
Use “\” to escape instances of tab, newline, or “\” that occur within field values.
Write newlines at the ends of lines.
Note that because LINES TERMINATED BY '\n' is the default, you could omit that clause; but the FIELDS clauses are necessary for CSV output.
Will that actually save to my local machine?
No. As documented under SELECT ... INTO Syntax:
The SELECT ... INTO OUTFILE 'file_name' form of SELECT writes the selected rows to a file. The file is created on the server host, so you must have the FILE privilege to use this syntax. file_name cannot be an existing file, which among other things prevents files such as /etc/passwd and database tables from being destroyed. The character_set_filesystem system variable controls the interpretation of the file name.
The SELECT ... INTO OUTFILE statement is intended primarily to let you very quickly dump a table to a text file on the server machine. If you want to create the resulting file on some other host than the server host, you normally cannot use SELECT ... INTO OUTFILE since there is no way to write a path to the file relative to the server host's file system.
However, if the MySQL client software is installed on the remote machine, you can instead use a client command such as mysql -e "SELECT ..." > file_name to generate the file on the client host.
It is also possible to create the resulting file on a different host other than the server host, if the location of the file on the remote host can be accessed using a network-mapped path on the server's file system. In this case, the presence of mysql (or some other MySQL client program) is not required on the target host.
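Since the mysql -e route prints tab-separated rows to stdout, turning them into CSV is a local text transform; a sketch that fakes the client output with printf so no server is needed here:

```shell
# Simulate `mysql -e "SELECT ..."` batch output: tab-separated, header first.
printf 'id\taction\n1\tlogin\n2\tlogout\n' > result.tsv

# Convert tabs to commas, quoting each field, to get a simple CSV.
# (This does not handle embedded quotes or tabs inside values.)
awk 'BEGIN { FS="\t"; OFS="," } { for (i = 1; i <= NF; i++) $i = "\"" $i "\""; print }' result.tsv > log.csv

head -n 1 log.csv   # "id","action"
```

In the real case you would replace the printf line with something like mysql -h host -u user -p -e "SELECT * FROM log" mydb > result.tsv.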
I am nervous about running this script I don't want to make a mistake and mess up the database.
SELECT statements only read data from the database and do not make any changes to its content: thus they cannot "mess up the database".
Also is .txt ok for the script file?
Any extension will work: neither the MySQL client nor the server software sees it (your operating system reads the file and sends its content to the client program, which in turn sends that content to the server; the operating system is indifferent to the file extension here).
On Windows, a .txt extension will associate the file with a text editor (e.g. Notepad) so that it can be readily opened for editing. Personally I would prefer .sql as it more accurately describes the file's content, and I would then associate that extension with a suitable editor—but none of that is necessary.