MySQL Format Date with a Load Data command - mysql

Hello I am trying to load data from a csv file but need to format the date when loading the file.
The date format of my csv file is: 1/1/2008 0:00
i'd like to format the date and to just the year with the load command
load data infile 'C:/mysql/ca_pop_educational_attainment.csv'
into table ca_pop.educational_attainment
fields terminated by ','
enclosed by '"'
ignore line 1
set year = STR_TO_DATE(#year,"%Y")
(ea_id, #year, age, educational_attainment, personal_income,pop_count)

The list of columns must come before the SET clause. That is, you list the columns or variables to import into, then following that, you can set individual columns.
Also, you need to use STR_TO_DATE() to parse the whole string into a date. Then you can wrap that in another function or expression to extract individual parts like the year.
Something like the following, but I have not tested it:
load data infile 'C:/mysql/ca_pop_educational_attainment.csv'
into table ca_pop.educational_attainment
fields terminated by ','
enclosed by '"'
ignore line 1
(ea_id, #year, age, educational_attainment, personal_income, pop_count)
set year = YEAR(STR_TO_DATE(#year,'%m/%d/%Y %k:%i'))

Related

How can I load blank/NULL values with LOAD DATA INFILE from the MySQL command line

I am using the following command from the MySQL command line to try and import a csv file into one of my tables:
LOAD DATA INFILE 'file path' INTO TABLE table
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(BASEID, BIGID, StartDate, EndDate)
Some of my rows have no value for EndDate which I want to be represented as NULL values in the database.
Unfortunately when I execute the above command I get the following error:
for column 'EndDate' at row 141lue:
If I remove the rows with blank cells the command works, so it is clearly the blank values for EndDate which are causing the problem.
I have also tried changing my csv file so that the blank cells say NULL or \N. I have also tried the following command instead but I still get the same error message:
LOAD DATA INFILE 'file path' INTO TABLE Table
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(BASEID, BIGID, StartDate, #EndDate)
SET EndDate = nullif(#EndDate, ' ')
How can I load csv files which have some blank values? The suggest solutions I have seen on other posts don't seem to work as outlined above.
Is the issue that the value for the end date is missing, or that the column itself is missing? These are not the same thing. For the former case, I think LOAD DATA should be able to handle this, assuming that the target column for the end date can tolerate missing/null values.
What I suspect here is that some of your input lines look like this:
1,1,'2020-10-03'
That is, there is no fourth column present at all. If this be the case, then the most prudent thing to do here might be to run a simple regex over your input CSV flat file to fix these missing fourth column edge cases. You may try:
Find: ^([^,]+,[^,]+,'[^,]+')$
Replace: $1,
This would turn the sample line above into:
1,1,'2020-10-03',
Now, the date value is still missing, but at least LOAD DATA should detect that the line has four columns, instead of just three.

sql loader not loading all the needed rows due to new line character and enclosement character

I have a problem with how SQL loader manage the end of a column value. I was hoping to manage CR LF, the enclosement character and the separator character but it seems I can't find a solution!
The data I receive from the .csv file looks like this:
"C","I","FLAGS","LASTUPDATEDATE","BOEVERSION","C_OSUSER_UPDATEDBY","I_OSUSER_UPDATEDBY","C_OSUSER_PWF","DESCRIPTION","DURATION","ENDDATE","I_OSUSER_PWF","LASTSTATUSCHA","STARTDATE","DURATIONUNIT","TYPE","STATUS","C_BNFTRGHT_CONDITIONS","I_BNFTRGHT_CONDITIONS","C_CNTRCT1_CONDITION","I_CNTRCT1_CONDITION","EXTBLOCKTYPE","EXTBLOCKDURATIONUNIT","EXTBLOCKDURATION","EXTBLOCKDESCRIPTION","PARTITIONID"
"7680","423","PE","2015-07-06 11:42:10","0","1000","1506","","No benefits are payable for a Total Disability period during a Parental or Family-Related Leave, for a Total Disability occurring during this period.
","0","","","","","69280000","69312015","71328000","7285","402","","","","","","","1"
"7680","426","PE","2015-07-06 11:42:10","0","1000","1506","","""Means to be admitted to a Hospital as an in-patient for more than 18 consecutive hours.
""
","0","","","","","69280000","69312021","71328000","7285","402","","","","","","","1"
My ctl file is as follow:
Load Data
infile 'C:\2020-07-29-03-04-48-TolCondition.csv'
CONTINUEIF LAST != '"'
into table TolCondition
REPLACE
FIELDS TERMINATED BY "," ENCLOSED by '"'
(
C,
I,
FLAGS,
LASTUPDATEDATE DATE "YYYY-MM-DD HH24:MI:SS",
BOEVERSION,
C_OSUSER_UPDATEDBY,
I_OSUSER_UPDATEDBY,
C_OSUSER_PWF,
DESCRIPTION CHAR(1000),
DURATION,
ENDDATE DATE "YYYY-MM-DD HH24:MI:SS",
I_OSUSER_PWF,
LASTSTATUSCHA DATE "YYYY-MM-DD HH24:MI:SS",
STARTDATE DATE "YYYY-MM-DD HH24:MI:SS",
DURATIONUNIT,
TYPE,
STATUS,
C_BNFTRGHT_CONDITIONS,
I_BNFTRGHT_CONDITIONS,
C_CNTRCT1_CONDITION,
I_CNTRCT1_CONDITION,
EXTBLOCKTYPE,
EXTBLOCKDURATIONUNIT,
EXTBLOCKDURATION,
EXTBLOCKDESCRIPTION,
PARTITIONID)
Here is what I tried in the control file:
CONTINUEIF LAST != '"'
CONTINUEIF THIS PRESERVE (1:2) != '",'
"str X'220D0A'"
Here is the result I currently have with "CONTINUEIF LAST != '"'
Record 2: Rejected - Error on table TOLCONDITION, column DESCRIPTION.
second enclosure string not present
Record 3: Rejected - Error on table TOLCONDITION, column C.
no terminator found after TERMINATED and ENCLOSED field
Table TOLCONDITION:
1 Row successfully loaded.
2 Rows not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
Is there any way to manage line break and enclosement character in SQL Loader? I dont understand why we can`t change how it sees rows. Instead of seeing a new row when there is a CR LF, can we tell it to concacenate values until the last enclosement character (chr34 in my case) + the separator character (, in my case) has been seen.
I really hope to find a way to resolve this issue without having to change the .csv file.
Thank you
If you are on 12c you can use the "FIELDS CSV WITH EMBEDDED" clause to tell sqlldr that some columns have the data file's end of line character embedded within. This will cause the column to be inserted as it is in the data file. More info here
Load Data
infile 'C:\2020-07-29-03-04-48-TolCondition.csv'
into table TolCondition
REPLACE
FIELDS CSV WITH EMBEDDED TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'

How to convert string "3.82384E+11" to BIGINT with MySQL?

I'm trying to save some ID values from CSV that are automatically converted to exponent numbers by Excel.
Like 382383816413 becomes 3.82384E+11. So I'm doing a full import into my MySQL database with:
LOAD DATA LOCAL INFILE
'file.csv'
INTO TABLE my_table
FIELDS TERMINATED BY ';'
ENCLOSED BY '"'
ESCAPED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(#`item_id`,
#`colum2`,
#`colum3`)
SET
item_id = #`item_id`;
I've tried using cast like:
CAST('3.82384E+11' as UNSIGNED) and it gives me just 3.
CAST('3.82384E+11' as BIGINT) and it doesn't work.
CAST('3.82384E+11' as UNSIGNED BIGINT) and gives me 3 again.
So, what's the better way to convert string exponent numbers to real big integers in MySQL?
Set column format as text instead of number in excel. Refer below link.
PHPExcel - set cell type before writing a value in it
My option was to convert the column with 3.82384E+11 to number in the excel file, so it get back to the original value. Then I export to CSV and use SQL query to import it fine.

Convert Numerical Fields in MySQL

I have imported a CSV file where a specific column has a decimal number.
In the original excel file (before saving it to a CSV), the first number of the column shows up as 218,790. When I choose the cell, the number shows up as 218790.243077911.
In the CSV file the number shows up as 218790 and when I choose the cell it is 218,790.
When I import the file on mySQL and show the table I created, the number shows up as 218.000000000.
Here is the code I used:
create table Apolo_Test(
Leads decimal (15,9)
);
LOAD DATA LOCAL INFILE 'C:/Users/SCRIPTS/file.csv'
INTO TABLE Apolo_Test
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 7 ROWS
;
I tried updating the format with this :
update Apolo_Test set Leads = format(Leads, 10, 'de_DE');
but it did not work. I have never had a case where files had a comma before. I guess it is the UK version of numerical fields.
How is it possible to make it work on mySQL without using any MACROS in excel?
UPD:
It works but I get some warnings although I double checked the csv file and the fields :
create table Apolo_Test(
Ad_Group varchar(50),
Impacts int,
Leads decimal (10,3)
);
LOAD DATA LOCAL INFILE 'C:/Users/me/Desktop/SCRIPTS/11/Adalyser.csv'
INTO TABLE Apolo_Test
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 7 ROWS
(Ad_Group, Impacts, #Leads)
SET Leads = replace(#Leads, ',', '');
;
alter table Apolo_Test ADD IPL decimal (10,6) after Leads;
update Apolo_Test set IPL=Impacts/Leads;
select * from Apolo_Test;
You have to use this syntax:
LOAD DATA LOCAL INFILE 'C:/path/to/mytable.txt' IGNORE
INTO TABLE mytable
FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\r\n'
(int_col, #float_col)
SET float_col = replace(#float_col, ',', '.');
For more information read here
The thousands-separator should not matter when moving data around -- Excel internal values and CSV files and MySQL internal values do not include it. Only "formatted" output includes it. And you should not use formatted output for moving numbers around.
Be careful with locale, such as de_DE.
The German "218.790" is the same as English "218,790".
"218790.243077911" is likely to be what Excel had internally for the number.
"218,790" is likely to be the English representation on the screen; note the English thousands separator.
In the CSV file the number shows up as 218790 and when I choose the cell it is 218,790.
What do you mean? Perhaps that there no comma or dot in the file, itself? But what you mean by "choose the cell"?
I can't see how to get "218.000000000" without truncation going on somewhere.

How to convert date in .csv file into SQL format before mass insertion

I have a csv file with a couple thousand game dates in it, but they are all in the MM/DD/YYYY format
2/27/2011,3:05 PM,26,14
(26 and 14 are team id #s), and trying to put them into SQL like that just results in 0000-00-00 being put into the date field of my table. This is the command I tried using:
LOAD DATA LOCAL INFILE 'c:/scheduletest.csv' INTO TABLE game
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(`date`, `time`, `awayteam_id`, `hometeam_id`);
but again, it wouldn't do the dates right. Is there a way I can have it convert the date as it tries to insert it? I found another SO question similar to this, but I couldn't get it to work.
Have you tried the following:
LOAD DATA LOCAL INFILE 'c:/scheduletest.csv' INTO TABLE game
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(#DATE_STR, `time`, `awayteam_id`, `hometeam_id`)
SET `date` = STR_TO_DATE(#DATE_STR, '%c/%e/%Y');
For more information, the documentation has details about the use of user variables with LOAD DATA (about half-way down - search for "User variables in the SET clause" in the page)
You can use variables to load the data from the csv into and run functions on them before inserting, like:
LOAD DATA INFILE 'file.txt'
INTO TABLE t1
(#datevar, #timevar, awayteam_id, hometeam_id)
SET date = STR_TO_DATE(#datevar, '%m/%d/%Y'),
SET time = etc etc etc;
My suggestion would be to insert the file into a temporary holding table where the date column is a character datatype. Then write a query with theSTR_TO_DATE conversion to move the data from the holding table to your final destination.
Convert field that you are using for the date to varchar type so it will play friendly with any format
Import CSV
Convert the dates to a valid mysql date format using something like:
UPDATE table SET field = STR_TO_DATE(field, '%c/%e/%Y %H:%i');
Then revert field type to date
Use a function to convert the format as needed.
I'm not an expert on MySQL, but http://dev.mysql.com/doc/refman/5.0/en/date-and-time-functions.html#function_str-to-date looks promising.
If you can't do that in the load command directly, you may try creating a table that allows you to load all the values as VARCHAR and then to do an insert into your game table with a select statement with the appropriate conversion instead.
If you file is not too big, you can use the Excel function TEXT. If, for example, your date is in cell A2, then the formula in a temporary column next to it would be =TEXT(A2,"yyyy-mm-dd hh:mm:ss"). This will do it and then you can paste the values of the formula's result back into the column and then delete the temporary column.