Is it possible to perform functions during LOAD DATA INFILE without creating a stored procedure? - mysql

I have two columns, q1 and q2, that I'd like to sum together and put in the destination column, q.
The way I do it now, I load the data into an intermediate table and then sum while inserting into the destination table, but I'm wondering if it's possible to do the sum during the load itself?
Here's my script:
LOAD DATA INFILE 'C:\\temp\\foo.csv'
INTO TABLE foo_temp
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(q1,q2)
INSERT INTO foo (q) SELECT q1+q2 AS q
FROM foo_temp;

Try:
LOAD DATA INFILE 'C:\\temp\\foo.csv'
INTO TABLE `foo`
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(@`q1`, @`q2`)
SET `q` = @`q1` + @`q2`;

Related

ignore first two characters on a column while importing csv to mysql

I am trying to import a csv file to mysql table, But I need to remove First two characters on particular column before importing to mysql.
This is my statement:
string strLoadData = "LOAD DATA LOCAL INFILE 'E:/park/Export.csv' INTO TABLE tickets FIELDS terminated by ',' ENCLOSED BY '\"' lines terminated by '\n' IGNORE 1 LINES (SiteId,DateTime,Serial,DeviceId,AgentAID,VehicleRegistration,CarPark,SpaceNumber,GpsAddress,VehicleType,VehicleMake,VehicleModel,VehicleColour,IssueReasonCode,IssueReason,NoticeLocation,Points,Notes)";
The column IssueReasonCode has data like 'LU12', but I need to remove the first 2 characters so it contains only the integer part and not the alphanumeric prefix.
I need to remove 'LU' from that column.
Is it possible to write something like left(IssueReasonCode, 2) for this? The column is varchar(45) and can't be changed now because of the large amount of data already in it.
Thanks
LOAD DATA INFILE has the ability to perform a function on the data for each column as you read it in (q.v. here). In your case, if you wanted to remove the first two characters from the IssueReasonCode column, you could use:
RIGHT(IssueReasonCode, CHAR_LENGTH(IssueReasonCode) - 2)
to remove the first two characters. You specify such column mappings at the end of the LOAD DATA statement using SET. Your statement should look something like the following:
LOAD DATA LOCAL INFILE 'E:/park/Export.csv' INTO TABLE tickets
FIELDS terminated by ','
ENCLOSED BY '\"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(SiteId, DateTime, Serial, DeviceId, AgentAID, VehicleRegistration, CarPark, SpaceNumber,
GpsAddress, VehicleType, VehicleMake, VehicleModel, VehicleColour, IssueReasonCode,
IssueReason, NoticeLocation, Points, Notes)
SET IssueReasonCode = RIGHT(IssueReasonCode, CHAR_LENGTH(IssueReasonCode) - 2)
Referencing this and quoting this example, you can try the below to see if it works:
User variables in the SET clause can be used in several ways. The
following example uses the first input column directly for the value
of t1.column1, and assigns the second input column to a user variable
that is subjected to a division operation before being used for the
value of t1.column2:
LOAD DATA INFILE 'file.txt' INTO TABLE t1 (column1, @var1) SET
column2 = @var1/100;
string strLoadData = "LOAD DATA LOCAL INFILE 'E:/park/Export.csv' INTO TABLE tickets FIELDS terminated by ',' ENCLOSED BY '\"' lines terminated by '\n' IGNORE 1 LINES (SiteId,DateTime,Serial,DeviceId,AgentAID,VehicleRegistration,CarPark,SpaceNumber,GpsAddress,VehicleType,VehicleMake,VehicleModel,VehicleColour,@IRC,IssueReason,NoticeLocation,Points,Notes) SET IssueReasonCode = substr(@IRC,3) ;";

if exists update else insert csv data MySQL

I am populating a MySQL table with a csv file pulled from a third party source. Every day the csv is updated and I want to update rows in MySQL table if an occurrence of column a, b and c already exists, else insert the row. I used load data infile for the initial load but I want to update against a daily csv pull. I am familiar with INSERT...ON DUPLICATE, but not in the context of a csv import. Any advice on how to nest LOAD DATA LOCAL INFILE within INSERT...ON DUPLICATE a, b, c - or if that is even the best approach would be greatly appreciated.
LOAD DATA LOCAL INFILE 'C:\\Users\\nick\\Desktop\\folder\\file.csv'
INTO TABLE db.tbl
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 lines;
Since you use LOAD DATA LOCAL INFILE, it is equivalent to specifying IGNORE: i.e. duplicates would be skipped.
But if you specify REPLACE, input rows replace existing rows: in other words, rows that have the same value for a primary key or unique index as an existing row are replaced.
So your update-import could be:
LOAD DATA LOCAL INFILE 'C:\\Users\\nick\\Desktop\\folder\\file.csv'
REPLACE
INTO TABLE db.tbl
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 lines;
https://dev.mysql.com/doc/refman/5.6/en/load-data.html
If you need a more complicated merge-logic, you could import CSV to a temp table and then issue INSERT ... SELECT ... ON DUPLICATE KEY UPDATE
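For example, a minimal sketch of that approach, assuming db.tbl has a unique index on (a, b, c) and a value column val (these column names are placeholders, not from the question):
CREATE TEMPORARY TABLE tmp_tbl LIKE db.tbl;

LOAD DATA LOCAL INFILE 'C:\\Users\\nick\\Desktop\\folder\\file.csv'
INTO TABLE tmp_tbl
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 lines;

-- merge: update val on existing (a, b, c) combinations, insert new ones
INSERT INTO db.tbl (a, b, c, val)
SELECT a, b, c, val FROM tmp_tbl
ON DUPLICATE KEY UPDATE val = VALUES(val);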
I found that the best way to do this is to insert the file with the standard LOAD DATA LOCAL INFILE
LOAD DATA LOCAL INFILE 'C:\\Users\\nick\\Desktop\\folder\\file.csv'
INTO TABLE db.table
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 lines;
And use the following to delete duplicates. Note that the command below compares db.table to itself by aliasing it as both a and b.
delete a.* from db.table a, db.table b
where a.id > b.id
and a.field1 = b.field1
and a.field2 = b.field2
and a.field3 = b.field3;
To use this method it is essential that the id field is an auto-increment primary key. The above command then deletes rows that contain duplication on field1 AND field2 AND field3. In this case it will delete the row with the higher of the two auto-increment ids; this works just as well if we were to use < instead of >.

Different number of rows exporting a table to a csv file in MySQL

I have loaded a table in MySQL (XAMPP) with around 40,000,000 rows. From it I created another table with around 6,000,000 rows, and I exported that table to a CSV file using:
(SELECT ...)
UNION
(SELECT ...
FROM ctr_train0
INTO OUTFILE 'C:/.../file.csv'
FIELDS ENCLOSED BY '"' TERMINATED BY ',' ESCAPED BY '"'
LINES TERMINATED BY '\n');
There are no errors, but this command creates a CSV file with around 200,000 fewer rows than the original table. What happened? How can I export all 6,000,000 rows? Thanks in advance.
My best guess -- given the limited information -- is the use of union. This removes duplicates from the output, and the duplicates are both between tables and within tables. So, if your data has duplicates, this removes them.
Try running the query with union all instead:
(SELECT ...)
UNION ALL
(SELECT ...
FROM ctr_train0
INTO OUTFILE 'C:/.../file.csv'
FIELDS ENCLOSED BY '"' TERMINATED BY ',' ESCAPED BY '"'
LINES TERMINATED BY '\n');

LOAD DATA INFILE id

I ran the following command:
LOAD DATA INFILE '/Users/Tyler/Desktop/players_20120318.txt' INTO TABLE players FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';
On this data:
PlayerId,IsActive,IsVisible,FirstName,LastName,HeightFeet,HeightInches,Weight,Birthday,Gender,HometownCity,HometownState,HometownZip,HometownCountry,HighSchoolId,HighSchoolIdTemp,HighSchoolGradYear,CollegeYear,Redshirted,Transferred,CollegeId,CollegeIdTemp,CollegeGradYear,OtherAccountId,PreviousCollegeId,CurrentTeamId,LateralRecommendationReason,LateralRecommendationLink,CreationDate,CreatedBy,LastModifiedDate,LastModifiedBy,TwitterLink,FacebookLink,PersonalWebsite,PlayerImage,FirstNameNickName,NeulionID,OtherTeamID,OtherSportTypeID,SourceDataTypeID,PlayerTypeID,LoadID,SameNameTeammate,SameNameSchoolMate,SD_SportID,SD_PlayerID,ZeroNCAAStats,ModifiedByPythonGame,Missing2011,Transfer2011,RecruitingClass
21,True,True,John,Frost,6,1,185,,M,Decatur,AL,35603,,{A0AD8B45-47E1-4039-85DF-756301035073},7453,2009,JR,False,False,{299F909C-88D9-4D26-8ADC-3EC1A66168BB},844,2013,{EBA5A9E6-E03E-4AE5-B9B8-264339EE9259},,0,,,2011-02-16 20:53:34.877000000,,2012-03-08 01:43:37.593000000,{5EBB0160-E69A-4EA2-89D5-932DD4D58632},,,,,,,45759,1,1,5,,,,,,,,,,
1344,True,True,Zach,Alvord,6,0,173,,M,Alpharetta,GA,30022,,{379BF463-67A9-480E-8FFB-9B50AD494953},11597,2010,SO,False,False,{7208C8FB-6780-4379-BC25-5DC5064C85FD},36,2014,{CDACD2C7-7667-406C-9662-02B378B00032},,0,,,2011-02-16 20:53:34.970000000,,2012-03-07 23:28:17.343000000,{5EBB0160-E69A-4EA2-89D5-932DD4D58632},,,,,,,45710,1,1,5,,,,,,,,,,
And MySQL was taking that first column (PlayerId) and assigning it to the id column. It was also shifting everything over one column (first name was filled in with last name).
Is this the expected behavior?
I believe that MySQL will properly insert the data by skipping the id column as long as it's set to auto_increment. Otherwise you can specify the columns individually as Bobby pointed out.
To avoid this problem, specify the columns you're loading data into and leave out the id field:
LOAD DATA INFILE '/Users/Tyler/Desktop/players_20120318.txt' INTO TABLE players FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' (col1, col2, col3...);

How to insert selected columns from a CSV file to a MySQL database using LOAD DATA INFILE

I have a CSV file which contains 10 columns. I want to select only some columns from that file and load them into a MySQL database using the LOAD DATA INFILE command.
Load data into a table in MySQL and specify columns:
LOAD DATA LOCAL INFILE 'file.csv' INTO TABLE t1
FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
(@col1,@col2,@col3,@col4) set name=@col4,id=@col2 ;
@col1 to @col4 are user variables that hold the CSV file's columns (assuming there are 4); name and id are the table columns.
Specify the names of the target columns in the LOAD DATA INFILE statement.
The code is like this:
LOAD DATA INFILE '/path/filename.csv'
INTO TABLE table_name
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\r\n'
(column_name3, column_name5);
This adds data to only two columns of the table (you choose them by column name).
The only thing you have to take care of is that the CSV file (filename.csv) has two values per line (row). Otherwise, please mention it; I have a different solution.
Thank you.
LOAD DATA INFILE 'file.csv'
INTO TABLE t1
FIELDS TERMINATED BY ',' ENCLOSED BY '"' ESCAPED BY '"'
LINES TERMINATED BY '\r\n'
(column1, @dummy, column2, @dummy, column3, ...);
Just replace column1, column2, etc. with your column names, and put @dummy anywhere there's a column in the CSV you want to ignore.
Full details here.
Example:
contents of the ae.csv file:
"Date, xpto 14"
"code","number","year","C"
"blab","15885","2016","Y"
"aeea","15883","1982","E"
"xpto","15884","1986","B"
"jrgg","15885","1400","A"
CREATE TABLE Tabletmp (
rec VARCHAR(9)
);
To load only column 3:
LOAD DATA INFILE '/local/ae.csv'
INTO TABLE Tabletmp
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 2 LINES
(@col1, @col2, @col3, @col4, @col5)
set rec = @col3;
select * from Tabletmp;
2016
1982
1986
1400
If your database table has more columns than your CSV file, you can proceed like this:
LOAD DATA LOCAL INFILE 'pathOfFile.csv'
INTO TABLE youTable
CHARACTER SET latin1 FIELDS TERMINATED BY ';' # use ',' if your file is comma-separated
OPTIONALLY ENCLOSED BY '"'
ESCAPED BY '\\'
LINES TERMINATED BY '\r\n'
(yourcolumn,yourcolumn2,yourcolumn3,yourcolumn4,...);
For those who have the following error:
Error Code: 1290. The MySQL server is running with the
--secure-file-priv option so it cannot execute this statement
You can simply run this command to see which folder you can load files from:
SHOW VARIABLES LIKE "secure_file_priv";
After that, you can either copy the files into that folder, or run the query with LOAD DATA LOCAL INFILE instead of LOAD DATA INFILE.
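For example, a minimal sketch, assuming the query above reports /var/lib/mysql-files/ as the secure folder (that path and the table t1 are placeholders for your own values):
SHOW VARIABLES LIKE "secure_file_priv";
-- suppose it returns /var/lib/mysql-files/ ; copy file.csv there, then:
LOAD DATA INFILE '/var/lib/mysql-files/file.csv'
INTO TABLE t1
FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
IGNORE 1 LINES;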