I have a large .csv file which I want to import into a MySQL database. I want to use the LOAD DATA INFILE statement because of its speed.
Fields are terminated by -|-. Lines are terminated by |--. Currently I am using the following statement:
LOAD DATA LOCAL INFILE 'C:\\test.csv' INTO TABLE mytable FIELDS TERMINATED BY '-|-' LINES TERMINATED BY '|--'
Most rows look something like this: (Note that the strings are not enclosed by any characters.)
goodstring-|--|-goodstring-|-goodstring-|-goodstring|--
goodstring-|--|-goodstring-|-goodstring-|-|--
goodstring-|-goodstring-|-goodstring-|-goodstring-|-|--
goodstring is a string that does not contain - as a character. As you can see the second or last column might be empty. Rows like the above do not cause any problems. However the last column may contain - characters. There might be a row that looks something like this:
goodstring-|--|-goodstring-|-goodstring-|---|--
The string -- in the last column causes problems. MySQL detects six instead of five columns. It inserts a single - character into the fifth column and truncates the sixth. The correct DB row should be ("goodstring", NULL, "goodstring", "goodstring", "--").
A solution would be to tell MySQL to treat everything after the fourth field terminator as part of the fifth column (up to the line terminator). Is this possible with LOAD DATA INFILE? Are there methods that yield the same result, do not require the source file to be edited and perform about as fast as LOAD DATA INFILE?
This is my solution:
LOAD DATA
LOCAL INFILE 'C:\\test.csv'
INTO TABLE mytable
FIELDS TERMINATED BY '-|-'
LINES TERMINATED BY '-\r\n'
(col1, col2, col3, col4, @col5, col6)
SET col5 = (SELECT CASE WHEN col6 IS NOT NULL THEN CONCAT(@col5, '-') ELSE LEFT(@col5, LENGTH(@col5) - 2) END);
It will turn a row like this one:
goodstring-|--|-goodstring-|-goodstring-|-|--
Into this:
("goodstring", "", "goodstring", "goodstring", NULL)
And a bad row like this one:
goodstring-|--|-goodstring-|-goodstring-|---|--
Into this:
("goodstring", "", "goodstring", "goodstring", "")
I simply drop the last column after the import.
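To check what the import is supposed to produce, the intended parse can be sketched outside MySQL. The Python snippet below is only an illustration and not part of the LOAD DATA solution; the function name split_row is hypothetical, and it assumes every raw row ends with the |-- terminator from the question. It splits on the first four -|- separators only, so whatever follows stays in the fifth column. Note that it returns empty strings where the question expects NULL.

```python
def split_row(row, field_sep="-|-", line_end="|--", n_cols=5):
    """Split one raw row into at most n_cols fields (hypothetical helper)."""
    # Drop the line terminator first so it cannot be mistaken for a separator.
    if row.endswith(line_end):
        row = row[: -len(line_end)]
    # maxsplit = n_cols - 1: everything after the fourth separator
    # is kept verbatim as the last field.
    return row.split(field_sep, n_cols - 1)


if __name__ == "__main__":
    # The problematic row from the question: last column is "--".
    print(split_row("goodstring-|--|-goodstring-|-goodstring-|---|--"))
    # ['goodstring', '', 'goodstring', 'goodstring', '--']
```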
I have imported a CSV file where a specific column has a decimal number.
In the original Excel file (before saving it as a CSV), the first number of the column shows up as 218,790. When I select the cell, the number shows up as 218790.243077911.
In the CSV file the number shows up as 218790 and when I select the cell it is 218,790.
When I import the file into MySQL and show the table I created, the number shows up as 218.000000000.
Here is the code I used:
create table Apolo_Test(
Leads decimal (15,9)
);
LOAD DATA LOCAL INFILE 'C:/Users/SCRIPTS/file.csv'
INTO TABLE Apolo_Test
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 7 ROWS
;
I tried updating the format with this:
update Apolo_Test set Leads = format(Leads, 10, 'de_DE');
but it did not work. I have never had a case before where the numbers in a file contained a comma. I guess it is the UK format for numeric fields.
How can I make this work in MySQL without using any macros in Excel?
UPD:
It works, but I get some warnings even though I double-checked the CSV file and the fields:
create table Apolo_Test(
Ad_Group varchar(50),
Impacts int,
Leads decimal (10,3)
);
LOAD DATA LOCAL INFILE 'C:/Users/me/Desktop/SCRIPTS/11/Adalyser.csv'
INTO TABLE Apolo_Test
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 7 ROWS
(Ad_Group, Impacts, @Leads)
SET Leads = replace(@Leads, ',', '');
alter table Apolo_Test ADD IPL decimal (10,6) after Leads;
update Apolo_Test set IPL=Impacts/Leads;
select * from Apolo_Test;
You have to use this syntax:
LOAD DATA LOCAL INFILE 'C:/path/to/mytable.txt' IGNORE
INTO TABLE mytable
FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\r\n'
(int_col, @float_col)
SET float_col = replace(@float_col, ',', '.');
For more information read here
The thousands-separator should not matter when moving data around -- Excel internal values and CSV files and MySQL internal values do not include it. Only "formatted" output includes it. And you should not use formatted output for moving numbers around.
Be careful with locale, such as de_DE.
The German "218.790" is the same as English "218,790".
"218790.243077911" is likely to be what Excel had internally for the number.
"218,790" is likely to be the English representation on the screen; note the English thousands separator.
In the CSV file the number shows up as 218790 and when I choose the cell it is 218,790.
What do you mean? Perhaps that there is no comma or dot in the file itself? And what do you mean by "choose the cell"?
I can't see how to get "218.000000000" without truncation going on somewhere.
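The two REPLACE variants shown earlier encode different assumptions about what the comma means, and it may help to see them side by side. The following Python sketch is an illustration only; the function names parse_english and parse_german are hypothetical. The first mirrors replace(@Leads, ',', '') by treating the comma as a thousands separator, the second treats the dot as the thousands separator and the comma as the decimal point.

```python
from decimal import Decimal


def parse_english(text):
    """Treat ',' as a thousands separator (hypothetical helper)."""
    return Decimal(text.replace(",", ""))


def parse_german(text):
    """Treat '.' as thousands separator and ',' as decimal point."""
    return Decimal(text.replace(".", "").replace(",", "."))


if __name__ == "__main__":
    print(parse_english("218,790"))  # 218790
    print(parse_german("218.790"))   # 218790
    print(parse_german("218,790"))   # 218.790
```

The same text "218,790" gives 218790 under one convention and 218.790 under the other, so the right REPLACE depends on which convention the file actually uses.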
I have a text file which has the following content (I have only shown the first few lines to illustrate that). They are in the form of key-value pair.
FIELD_A="Peter Kibbon",FIELD_B=31,FIELD_C="SCIENCE"
FIELD_A="James Gray",FIELD_B=28,FIELD_C="ARTS"
FIELD_A="Michelle Fernado",FIELD_B=25,FIELD_C="SCIENCE"
I want to import this data into a MySQL database using the LOAD DATA INFILE syntax to speed up the process. Is there any way to specify something like a field prefix so that only the "value" part of each field is read?
I do not want to use MULTIPLE insert by parsing each line and each field, as this would slow down the process quite a bit.
If you know that all fields will be specified on each row and they are always in the same order, you can do something like this:
LOAD DATA INFILE 'your_file'
INTO TABLE table_name
FIELDS TERMINATED BY ','
(@col1_variable, @col2_variable, @col3_variable)
SET column1 = REPLACE(@col1_variable, 'FIELD_A=', ''),
column2 = REPLACE(@col2_variable, 'FIELD_B=', ''),
column3 = REPLACE(@col3_variable, 'FIELD_C=', '');
You load the content of the file in variables first, then operate on those variables and assign the result to your columns.
Read more about it here.
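The prefix-stripping idea can be tried on the sample lines outside MySQL. The Python sketch below is only an illustration; parse_line is a hypothetical name, and it assumes that all three fields are present in the same order and that no value contains a comma. It also removes the surrounding double quotes, an extra step that the REPLACE calls above do not perform.

```python
def parse_line(line, prefixes=("FIELD_A=", "FIELD_B=", "FIELD_C=")):
    """Return the value part of each KEY=value field (hypothetical helper)."""
    values = []
    fields = line.rstrip("\r\n").split(",")
    for field, prefix in zip(fields, prefixes):
        # Same idea as REPLACE(@var, 'FIELD_A=', ''): drop the known prefix.
        value = field.replace(prefix, "", 1)
        # Extra step, not in the SQL: strip surrounding double quotes.
        values.append(value.strip('"'))
    return values


if __name__ == "__main__":
    print(parse_line('FIELD_A="Peter Kibbon",FIELD_B=31,FIELD_C="SCIENCE"'))
    # ['Peter Kibbon', '31', 'SCIENCE']
```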
I have a problem loading a CSV file into a MySQL database.
The CSV file looks like this:
stuID,stuName,degreeProg
6902101,A001,null
6902102,A002,null
6902103,A003,null
6902104,A004,null
6902105,A005,null
I have written a script like this:
LOAD DATA LOCAL INFILE 'demo.csv' INTO TABLE `table`
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
(`col1`, `col2`, `col3`)
What troubles me is that:
the third column in the file is null, but when loaded into the table it becomes 'null' (the string)
at the end of the file there is an extra empty line, which is also loaded and assigned null
How should I write the script to deal with these two problems? (Modifying the CSV file is not allowed, and it would be good to reduce the warnings MySQL produces when running this script.)
1) One option is to have LOAD DATA assign the value of the third field (i.e. the string 'null') to a user-defined variable, and use the "SET col = expr" form to assign a value to the column col3.
As an example:
(`col1`, `col2`, @field3)
SET col3 = IF(@field3='null',NULL,@field3)
2) There's no way to have MySQL LOAD DATA "skip" the last record in the file. To have MySQL ignore the last line, that would be better handled outside MySQL. For example, have MySQL LOAD DATA read from a named pipe, and have a separate concurrent process read the CSV file and write to that named pipe.
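Both points can be illustrated with a small filter of the kind a separate process could apply before the data reaches MySQL. This is a Python sketch under assumptions, not a tested pipeline: read_rows is a hypothetical name, and the input is taken as a string rather than read from a real pipe. It maps the string 'null' to None, as the IF(...) expression does, and drops completely empty lines.

```python
import csv
import io


def read_rows(text):
    """Return data rows with 'null' mapped to None (hypothetical helper)."""
    reader = csv.reader(io.StringIO(text))
    next(reader, None)  # skip the header line, like IGNORE 1 LINES
    rows = []
    for fields in reader:
        if not fields:  # csv.reader yields [] for a completely empty line
            continue
        rows.append([None if f == "null" else f for f in fields])
    return rows


if __name__ == "__main__":
    sample = "stuID,stuName,degreeProg\r\n6902101,A001,null\r\n\r\n"
    print(read_rows(sample))
    # [['6902101', 'A001', None]]
```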
If you could modify the CSV file, simply add FIELDS ENCLOSED BY '"' and change null to NULL (upper case) to get them to load as NULL. Alternatively, use \N to load in NULL.
Also, obviously, delete the empty line at the end (which is most likely causing the warnings):
stuID,stuName,degreeProg
6902101,A001,\N
6902102,A002,\N
6902103,A003,\N
6902104,A004,\N
6902105,A005,\N
I'm trying to load data into a MySQL table with LOAD DATA LOCAL INFILE, using the code below.
Mysql:
LOAD DATA INFILE '/var/www/vhosts/domain.com/httpdocs/test1.csv' INTO TABLE temp_table FIELDS TERMINATED BY ',' ENCLOSED BY '\"' LINES TERMINATED BY '\r\n' IGNORE 1 LINES (recloc,client_acc)
Edit: changed LOAD DATA LOCAL INFILE to LOAD DATA INFILE, removed SET id=null, added IGNORE 1 LINES
I'm getting no errors and no imported records. I believe the issue is related to the column names, but I'm having a hard time fully understanding what those names should be. Should they be the actual column names within the CSV, or the field names in the DB table? I would also like to have an auto-incremented primary key (id).
CSV:
recloc,client_acc
"NLGSX3","CORPORATE"
"7SC3BA","QUALITY ASSURANCE"
"3B9OHF","90717-6710"
Any suggestions to what I may be doing wrong? thanks!
Column names in the CSV are not needed, so you should add an IGNORE 1 LINES clause to skip the header row.
Columns in your query (recloc,client_acc) need to match columns in table.
First column from CSV will be inserted into recloc, second into client_acc.
If you don't specify the AUTO_INCREMENT column in the statement, but there is one in the table, it should be filled in automatically.
A short solution for importing Excel data into MySQL; it works well for text files.
In detail:
Table name: t1
Fields: name varchar, email varchar
The text file text.txt has the table's column names on its first line:
name, email
"n1", "e1"
"n2", "e2"
"n3", "e3"
"n4", "e4"
"n5", "e5"
"n6", "e6"
"n7", "e7"
SQL query in WAMP:
LOAD DATA INFILE 'c:/wamp/www/touch/text.txt' INTO TABLE t1 FIELDS TERMINATED BY ',' ENCLOSED BY '\"' LINES TERMINATED BY '\r\n' IGNORE 1 LINES(name,email)
For this command to run successfully, we had to create the folders separately.
The physical file path is:
C:\wamp\mysql\data\wamp\www\touch\text.txt
But in the statement we specify c:/wamp/touch/text.txt
I have created a database and a table. I have also created all the fields I will be needing. I have created 46 fields including one that is my ID for the row. The CSV doesn't contain the ID field, nor does it contain the headers for the columns. I am new to all of this but have been trying to figure this out. I'm not on here being lazy asking for the answer, but looking for directions.
I'm trying to figure out how to import the CSV but have it start importing data starting at the 2nd field, since I'm hoping the auto_increment will fill in the ID field, which is the first field I created.
I tried these instructions with no luck. Can anyone offer some insight?
The column names of your CSV file must match those of your table
Browse to your required .csv file
Select CSV using LOAD DATA options
Check box 'ON' for Replace table data with file
In Fields terminated by box, type ,
In Fields enclosed by box, "
In Fields escaped by box, \
In Lines terminated by box, auto
In Column names box, type column name separated by , like column1,column2,column3
Check box ON for Use LOCAL keyword.
Edit:
The CSV file is 32.4kb
The first row of my CSV is:
Test Advertiser,23906032166,119938,287898,,585639051,287898 - Engager - 300x250,88793551,Running,295046551,301624551,2/1/2010,8/2/2010,Active,,Guaranteed,Publisher test,Maintainer test,example-site.com,,All,All,,Interest: Dental; custom geo zones: City,300x250,-,CPM,$37.49 ,"4,415","3,246",3,0,$165.52 ,$121.69 ,"2,895",805,0,0,$30.18 ,$37.49 ,0,$0.00 ,IMPRESSIONBASED,NA,USD
You can have MySQL set values for certain columns during import. If your id field is set to auto increment, you can set it to null during import and MySQL will then assign incrementing values to it. Try putting something like this in the SQL tab in phpMyAdmin:
LOAD DATA INFILE 'path/to/file.csv' INTO TABLE your_table FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' SET id=null;
Please look at this page and see if it has what you are looking for. Should be all you need since you are dealing with just one table. MYSQL LOAD DATA INFILE
So for example you might do something like this:
LOAD DATA INFILE 'filepath' INTO TABLE tablename FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' (column2, column3, column4);
That should give you an idea. There are of course more options that can be added as seen in the above link.
Be sure to use LOAD DATA LOCAL INFILE if the import file is on the client machine rather than on the server. :)
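One thing that may be worth checking before running such a statement: the sample row in the question contains quoted values such as "4,415", and a plain split on commas cuts those in two. The Python sketch below shows the difference on a short excerpt of that row; the helper names are hypothetical, and it only illustrates why an ENCLOSED BY '"' clause probably matters for this file, not how MySQL parses it internally.

```python
import csv

# Short excerpt of the sample row from the question.
excerpt = 'CPM,$37.49 ,"4,415","3,246",3'


def split_naive(line):
    """Split on every comma, ignoring quotes."""
    return line.split(",")


def split_quoted(line):
    """Split on commas, but keep quoted fields together."""
    return next(csv.reader([line]))


if __name__ == "__main__":
    print(len(split_naive(excerpt)))  # 7
    print(split_quoted(excerpt))
    # ['CPM', '$37.49 ', '4,415', '3,246', '3']
```

If the field count of the import does not match the table, comparing these two counts on a few rows is a quick way to see whether quoting is the cause.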