Restrict invalid date format data using sql loader - sql-loader

I want to load data with dates in the format 'MM/DD/YYYY', but I'm facing a challenge with the data below in the input file.
For example, the code below accepts both 09/01/20 and 09/11/2020, but I want to reject 09/01/20 because the year is not 4 digits.
How can I reject this data?
Code:
load data
infile 'file1.csv'
append
into table my_table
fields terminated by ',' TRAILING NULLCOLS
(Business_function,
Case_reference,
Sub_sequence,
Dialler_Master_Stream ,
Dialler_Call_Stream,
Dialler_Super_Stream,
Attempt_Number,
Dialled_Number,
Date_Called DATE "MM/DD/YYYY")

Are you sure you don't want the date format to be "MM/DD/RRRR"? It handles the century part for you and makes the year 4 digits.
See The RR Datetime Format Element for more detailed information.
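To make the difference concrete, here is a small sketch that could be run in SQL*Plus or any Oracle client. It is not part of the original answer. The literals are the two values from the question. The FX (format exact) modifier in the last two queries is an assumption about what might give the rejection asked for, so try it against your own data before relying on it.

```sql
-- YYYY with a 2-digit year: accepted, but read as year 0020 rather than 2020
SELECT TO_CHAR(TO_DATE('09/01/20', 'MM/DD/YYYY'), 'MM/DD/YYYY') AS yyyy_mask FROM dual;

-- RRRR with a 2-digit year: the century is inferred from the current date
SELECT TO_CHAR(TO_DATE('09/01/20', 'MM/DD/RRRR'), 'MM/DD/YYYY') AS rrrr_mask FROM dual;

-- FX asks for an exact match against the mask, so a 4-digit year parses...
SELECT TO_DATE('09/11/2020', 'FXMM/DD/YYYY') AS strict_ok FROM dual;

-- ...while a 2-digit year should raise an error instead of being accepted
SELECT TO_DATE('09/01/20', 'FXMM/DD/YYYY') AS strict_bad FROM dual;
```

If the strict mask behaves the same way inside your SQL*Loader version, a column definition such as Date_Called DATE "FXMM/DD/YYYY" might send the short-year rows to the bad file. That part is untested here.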

Related

In MYSQL, how to upload a csv file that contains a date in the format of '1/1/2020' properly into a DATE data type format (standard YYYY-MM-DD)

I have a column of data, let's call it bank_date, that I receive from an external vendor as a csv file every day. As such the dates in that column show as '1/1/2020'.
I am trying to upload that raw csv file directly to SQL daily. We used to store bank_date as text, but we have converted it to a DATE data type, and now it keeps zeroing out every time, with some sort of truncate / "datetime value incorrect" error.
I have now tested 17 different versions using STR_TO_DATE (mostly), CAST, and CONVERT, and feel like I'm close, but I'm not quite getting the syntax right.
Also for reference, I did find 2 other workarounds that are successful, but my boss specifically wants it uploaded and converted directly through the import process (not manipulating the raw csv data) for safety reasons. For reference:
Workaround 1: Convert csv date column to the YYYY-MM-DD format and save file. The issue with this is that if you try to open that CSV file again, it auto-changes the date format back to the standard mm/dd/yyyy. If someone doesn't know to watch out for this and is re-opening the csv file to double check something, they're gonna find an error when they upload, and the problem is not easy to identify.
Workaround 2: Create an extra dummy_date column in the table with a text data type and upload as normal. Then copy the data into the correct bank_date column using the STR_TO_DATE function as follows: UPDATE bank_table SET bank_date = STR_TO_DATE(dummy_date, '%c/%e/%Y'); The issue with this is that it just creates extra unnecessary data that can be confusing when other people may not know that one of the columns is not intended for querying.
Here is my current code:
USE database_name;
LOAD DATA LOCAL INFILE 'C:/Users/Shelly/Desktop/Date Import.csv'
INTO TABLE bank_table
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS
(bank_date, bank_amount)
SET bank_date = str_to_date(bank_date,'%Y-%m-%d');
The "SET" line is where I cannot work out the syntax to convert a csv's '1/5/2020' to SQL's 2020-1-5 format. Every test I've made either produces 0000-00-00 or nulls the column cells. I'm thinking maybe I need to tell SQL how to understand the csv's format in order for it to know how to convert it. Newbie here and stuck.
You need to specify the format of the date as it appears in the file, not the format you want to end up with:
SET bank_date = str_to_date(bank_date,'%c/%e/%Y');
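A quick way to check a format string before running the whole import is to try it on a literal. This is only a sketch using the sample value from the question. The outputs in the comments are what I would expect from MySQL.

```sql
-- The format describes the incoming text: month/day/4-digit year, leading zeros optional
SELECT STR_TO_DATE('1/5/2020', '%c/%e/%Y');   -- 2020-01-05

-- The format describes the desired output instead of the input, so nothing matches
SELECT STR_TO_DATE('1/5/2020', '%Y-%m-%d');   -- NULL
```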

date field values are wrong while importing csv into mysql

I am importing csv file into mysql using load data. My load data command is as mentioned below.
load data local infile 'D:/mydata.csv' into table mydb.mydata
fields terminated by ','
enclosed by '"'
lines terminated by '\r\n'
ignore 1 lines
(SrNo,SourceFrom,@var_updated,Title,First_Name,Middle_Name,Last_Name,Designation,Company_Name,@var_dob,Office_Mobile_No)
set updated = str_to_date(@var_updated,'%Y-%m-%d'), dob = str_to_date(@var_dob, '%Y-%m-%d');
I am getting values in my "Updated" and "DOB" columns that are different from the ones in my .csv file.
The first image is from MySQL Workbench and the other is of the csv.
Also, I set the "office_mobile_no" column's format to 'number' in the csv, but it's showing the number like this.
Only when I double click on it does it show the real number, like 9875461234. It imports the same way into mysql too. How do I get the original number in that column? Also, why do my imported date values differ from the csv's date columns?
A couple of points that I can see:
It looks from your screenshot like the data in your CSV file for "updated" is in d-m-Y format, but you're telling the import to look for Y-m-d. I think you need to change
set updated = str_to_date(@var_updated,'%Y-%m-%d')
to
set updated = str_to_date(@var_updated,'%d-%m-%Y')
And the same for the DOB field as well, assuming your CSV has that in the same format.
You said you set the "office_mobile_no" column's format to 'number' in the csv. CSV is a text file format; it doesn't store any information about how to display data. What you're seeing is just how Excel decides to display large numbers by default. You can change that, but your changes won't be saved when you save it to CSV, because the CSV file format doesn't include that sort of information. Try opening the file in Notepad++ and seeing the real format of the file.
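To see the effect of the mask order without re-running the import, the two format strings can be compared side by side on one value. The sample value below is made up to illustrate a d-m-Y date. It is not taken from the asker's file.

```sql
-- Hypothetical sample value in day-month-year order
SET @var_updated = '25-12-2016';

SELECT STR_TO_DATE(@var_updated, '%Y-%m-%d') AS wrong_order,  -- mask does not describe the text
       STR_TO_DATE(@var_updated, '%d-%m-%Y') AS right_order;  -- mask describes the text
```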

select * INTO outfile generates date in wrong format

I am using the below sql to export data to a csv file:
select * INTO outfile 'customer.csv'
FIELDs terminated by ',' enclosed by '"'
LINES terminated by '\n'
from CUSTOMER customer
where customer.date_created < '2015-10-22 10:00:00';
I get this result in csv:
The problem is that the data doesn't import from this generated csv because the date format is different from the one in the DB.
The DB date format is yyyy-mm-dd hh:mm:ss. Also, null values are replaced with \N, which also fails when importing.
How can I generate the csv columns with the correct (i.e. yyyy-mm-dd hh:mm:ss) date format and null/empty values?
Errors:
Incorrect datetime value: '23/07/2015 11:55' for column 'DATE_CREATED' at row 1
Incorrect integer value: '\N' for column 'column_name' at row 1
Note:
I am using mysql workbench to import the file.
I don't want to change the format/data directly in csv file.
Thanks
UPDATE:
Thanks to AdrianBR I realised I was opening the file with excel first which was overriding the date format hence wrong date format was showing even with notepad++.
\n is still a problem.
When opened with notepad++ for the first time it looks like this:
"100","0","2015-12-02 10:16:36","2015-12-02 10:16:36","0",\N,
The issue is most likely not coming from mysql.
It is most likely coming from the way Excel displays and later saves the dates.
To troubleshoot:
Open the file in a text editor such as notepad or notepad++ and check what the date looks like, if it's in ISO or not. It will probably be fine.
Now, if you open it in excel, it will be displayed in local format.
If you save the file now, you are likely to overwrite the ISO date format, with excel's local date format, making it not a valid importable mysql date anymore.
Moral: don't use Excel when working with data, only use it to display charts. Excel makes assumptions about your data and messes with it in the most unexpected ways. Remember that 1.19 VAT tax rate? Excel seems to think it's the same as Jan 19. That integer ID? Excel thinks it's better off written in scientific notation and rounded to the first 4 digits. That ISO date? Excel thinks you are better off guessing which is the month and which is the day. That decimal point? Surely you wanted a comma as the decimal and a dot as the thousands separator instead. FTFY!
Maybe specify the columns explicitly, and include the format in the select list like so:
TO_CHAR( mydate, 'yyyy-mm-dd hh24:mi:ss' )
edit:
as in:
SELECT my_id, my_val1, TO_CHAR( mydate, 'yyyy-mm-dd hh24:mi:ss' )
FROM mytable
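TO_CHAR is Oracle-style syntax that, as far as I know, stock MySQL does not provide. The closest MySQL counterpart is DATE_FORMAT. Below is a rough sketch of the same idea applied to the export in the question. customer_id and status are hypothetical column names. Only date_created and the table name come from the original post.

```sql
-- customer_id and status are placeholders for the real column list
SELECT customer_id,
       status,
       DATE_FORMAT(date_created, '%Y-%m-%d %H:%i:%s') AS date_created
INTO OUTFILE 'customer.csv'
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n'
FROM CUSTOMER
WHERE date_created < '2015-10-22 10:00:00';
```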

MySql load data infile STR_TO_DATE returning blank?

I'm importing 1m+ records into my table from a csv file.
It works great using the load data local infile method.
However, the dates are all in different formats.
A quick google led me to this function:
STR_TO_DATE
However, when I implement that, I get nothing, an empty insert. Here's my SQL, cut down to include one date (I have 4 with the same issue) and generic column names:
load data local infile 'myfile.csv' into table `mytable`
fields terminated by '\t'
lines terminated by '\n'
IGNORE 1 LINES
( `column name 1`
, `my second column`
, @temp_date
, `final column`)
SET `Get Date` = STR_TO_DATE(@temp_date, '%c/%e/%Y')
If I do:
SET `Get Date` = @temp_date
The date from the csv is captured in the format it was in the file.
However, when I try the first method, my table column is empty. I've changed the column type from timestamp to varchar(255) to capture whatever is going in, but ultimately I want to capture y-m-d H:i:s (not sure if STR_TO_DATE can do that?).
I'm also unsure as to why I need the @ symbol. Google failed me there.
So, my questions are:
Why do I need the @ symbol to use this function?
Should the data format ('%c/%e/%Y') be the format of the inputted data or my desired output?
Can I capture time in this way too?
Sorry for the large post!
Back to Google for now...
Why do I need the @ symbol to use this function?
The @ symbol means that you are using a variable, so the string that is read isn't put into the table right away but into a piece of memory that lets you operate on it before inserting it. More info in http://dev.mysql.com/doc/refman/5.0/en/user-variables.html
Should the data format ('%c/%e/%Y') be the format of the inputted data or my desired output?
It's the format of the inputted data, more info in http://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html#function_str-to-date
Can I capture time in this way too?
You should be able to as long as you choose the correct format, something like
STR_TO_DATE(@temp_date,'%c/%e/%Y %h:%i:%s');
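One detail worth checking when adding the time part is which hour specifier fits the file. The sketch below uses made-up sample values. %H is the 24-hour clock, while %h and %l are 12-hour and normally go together with %p for AM/PM.

```sql
-- 24-hour input: use %H
SELECT STR_TO_DATE('4/29/2012 14:05:30', '%c/%e/%Y %H:%i:%s');        -- 2012-04-29 14:05:30

-- 12-hour input with AM/PM: use %l (or %h) together with %p
SELECT STR_TO_DATE('4/29/2012 2:05:30 PM', '%c/%e/%Y %l:%i:%s %p');   -- 2012-04-29 14:05:30
```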
I had this problem. What solved it for me was making sure I accounted for whitespace in my load file that wasn't part of a delimiter. So if ',' is the delimiter:
..., 4/29/2012, ...
might be interpreted as " 4/29/2012"
So should be
...,4/29/2012,...
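If editing the file is not an option, one possible guard is to trim the value after it has been read into a user variable. This is a sketch built on the statement from the question, with a comma delimiter assumed to match this answer's example. Whether the stray spaces are actually the cause in a given file is something to verify first.

```sql
LOAD DATA LOCAL INFILE 'myfile.csv' INTO TABLE `mytable`
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(`column name 1`, `my second column`, @temp_date, `final column`)
-- TRIM strips leading and trailing spaces before the value is parsed
SET `Get Date` = STR_TO_DATE(TRIM(@temp_date), '%c/%e/%Y');
```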

How to convert date in .csv file into SQL format before mass insertion

I have a csv file with a couple thousand game dates in it, but they are all in the MM/DD/YYYY format
2/27/2011,3:05 PM,26,14
(26 and 14 are team id #s), and trying to put them into SQL like that just results in 0000-00-00 being put into the date field of my table. This is the command I tried using:
LOAD DATA LOCAL INFILE 'c:/scheduletest.csv' INTO TABLE game
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(`date`, `time`, `awayteam_id`, `hometeam_id`);
but again, it wouldn't do the dates right. Is there a way I can have it convert the date as it tries to insert it? I found another SO question similar to this, but I couldn't get it to work.
Have you tried the following:
LOAD DATA LOCAL INFILE 'c:/scheduletest.csv' INTO TABLE game
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(@DATE_STR, `time`, `awayteam_id`, `hometeam_id`)
SET `date` = STR_TO_DATE(@DATE_STR, '%c/%e/%Y');
For more information, the documentation has details about the use of user variables with LOAD DATA (about half-way down - search for "User variables in the SET clause" in the page)
You can use variables to load the data from the csv into and run functions on them before inserting, like:
LOAD DATA INFILE 'file.txt'
INTO TABLE t1
FIELDS TERMINATED BY ','
(@datevar, @timevar, awayteam_id, hometeam_id)
-- one SET clause, assignments separated by commas
SET date = STR_TO_DATE(@datevar, '%m/%d/%Y'),
    time = STR_TO_DATE(@timevar, '%l:%i %p');
My suggestion would be to insert the file into a temporary holding table where the date column is a character datatype. Then write a query with the STR_TO_DATE conversion to move the data from the holding table to your final destination.
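A minimal sketch of that holding-table approach might look like the following. game_staging and its column names are hypothetical. The target column list is taken from the question, and the time mask assumes values like '3:05 PM' going into a TIME column.

```sql
-- Hypothetical holding table: every column is plain text
CREATE TABLE game_staging (
  date_str     VARCHAR(20),
  time_str     VARCHAR(20),
  awayteam_id  VARCHAR(10),
  hometeam_id  VARCHAR(10)
);

LOAD DATA LOCAL INFILE 'c:/scheduletest.csv' INTO TABLE game_staging
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n';

-- Convert while copying into the real table
INSERT INTO game (`date`, `time`, awayteam_id, hometeam_id)
SELECT STR_TO_DATE(date_str, '%c/%e/%Y'),
       STR_TO_DATE(time_str, '%l:%i %p'),
       awayteam_id,
       hometeam_id
FROM game_staging;
```

The advantage of this route is that rows whose dates fail to parse can be inspected in the holding table before anything reaches the final table.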
Convert field that you are using for the date to varchar type so it will play friendly with any format
Import CSV
Convert the dates to a valid mysql date format using something like:
UPDATE table SET field = STR_TO_DATE(field, '%c/%e/%Y %H:%i');
Then revert field type to date
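The steps above could be sketched roughly as follows. my_table and my_field are hypothetical names standing in for "table" and "field", and the VARCHAR length is an arbitrary choice.

```sql
-- 1. Temporarily store the column as text so any format loads
ALTER TABLE my_table MODIFY my_field VARCHAR(32);

-- 2. Import the CSV here (LOAD DATA or the import tool of your choice)

-- 3. Rewrite the text into a format MySQL recognises
UPDATE my_table SET my_field = STR_TO_DATE(my_field, '%c/%e/%Y %H:%i');

-- 4. Switch the column back; DATETIME keeps the time part, DATE would drop it
ALTER TABLE my_table MODIFY my_field DATETIME;
```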
Use a function to convert the format as needed.
I'm not an expert on MySQL, but http://dev.mysql.com/doc/refman/5.0/en/date-and-time-functions.html#function_str-to-date looks promising.
If you can't do that in the load command directly, you may try creating a table that allows you to load all the values as VARCHAR and then to do an insert into your game table with a select statement with the appropriate conversion instead.
If your file is not too big, you can use the Excel function TEXT. If, for example, your date is in cell A2, then the formula in a temporary column next to it would be =TEXT(A2,"yyyy-mm-dd hh:mm:ss"). Then paste the values of the formula's result back into the original column and delete the temporary column.