In MySQL, how do you properly upload a csv file that contains a date in the format '1/1/2020' into a DATE data type column (standard YYYY-MM-DD)?

I have a column of data, let's call it bank_date, that I receive from an external vendor as a csv file every day. As such the dates in that column show as '1/1/2020'.
I am trying to upload that raw csv file directly to SQL daily. We used to store bank_date as text, but we have converted it to a DATE data type, and now it keeps zeroing out every time with a truncation / "Incorrect datetime value" error.
I have now tested 17 different variations using STR_TO_DATE (mostly), CAST, and CONVERT, and feel like I'm close, but I'm not quite getting the syntax right.
Also for reference, I did find 2 other workarounds that are successful, but my boss specifically wants it uploaded and converted directly through the import process (not by manipulating the raw csv data) for safety reasons:
Workaround 1: Convert the csv date column to the YYYY-MM-DD format and save the file. The issue with this is that if you open that csv file again, Excel auto-changes the date format back to the standard mm/dd/yyyy. If someone doesn't know to watch out for this and re-opens the csv to double-check something, they're going to get an error when they upload, and the problem is not easy to identify.
Workaround 2: Create an extra dummy_date column in the table, formatted as a text data type, and upload as normal. Then copy the data into the correct bank_date column using STR_TO_DATE as follows: UPDATE bank_table SET bank_date = STR_TO_DATE(dummy_date, '%c/%e/%Y'); The issue with this is that it creates extra unnecessary data and can confuse other people who may not know that one of the columns is not intended for querying.
Here is my current code:
USE database_name;
LOAD DATA LOCAL INFILE 'C:/Users/Shelly/Desktop/Date Import.csv'
INTO TABLE bank_table
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS
(bank_date, bank_amount)
SET bank_date = str_to_date(bank_date,'%Y-%m-%d');
The "SET" line is what I cannot work out on syntax to convert a csv's 1/5/2020' to SQL's 2020-1-5 format. Every test I've made either produces 0000-00-00 or nulls the column cells. I'm thinking maybe I need to tell SQL how to understand the csv's format in order for it to know how to convert it. Newbie here and stuck.

You need to specify the format of the date as it appears in the file, not the "required" target format:
SET bank_date = str_to_date(bank_date,'%c/%e/%Y');
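Putting it together, the whole load would look something like this (a sketch based on the statement in the question; reading the raw field into a user variable such as @raw_date, a name introduced here, is the documented LOAD DATA idiom for transforming input before insert):
USE database_name;
LOAD DATA LOCAL INFILE 'C:/Users/Shelly/Desktop/Date Import.csv'
INTO TABLE bank_table
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS
-- read the raw text into a user variable, then convert it in SET
(@raw_date, bank_amount)
SET bank_date = STR_TO_DATE(@raw_date, '%c/%e/%Y');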

Related

date field values are wrong while importing csv into mysql

I am importing a csv file into mysql using load data. My load data command is as below.
load data local infile 'D:/mydata.csv' into table mydb.mydata
fields terminated by ','
enclosed by '"'
lines terminated by '\r\n'
ignore 1 lines
(SrNo,SourceFrom,@var_updated,Title,First_Name,Middle_Name,Last_Name,Designation,Company_Name,@var_dob,Office_Mobile_No)
set updated = str_to_date(@var_updated,'%Y-%m-%d'), dob = str_to_date(@var_dob, '%Y-%m-%d');
I am getting different values in my "Updated" and "DOB" columns than the ones in my .csv file.
The first image is from MySQL Workbench, the other is from the csv.
Also, I set the "office_mobile_no" column's format to 'number' in the csv, but it displays the number in scientific notation.
Only when I double-click on it does it show the real number, like 9875461234, and it imports the same way into mysql too. How do I get the original number in that column? And why do my imported date values differ from the csv's date columns?
A couple of points that I can see:
It looks from your screenshot like the data in your CSV file for "updated" is in d-m-Y format, but you're telling the import to look for Y-m-d. I think you need to change
set updated = str_to_date(@var_updated,'%Y-%m-%d')
to
set updated = str_to_date(@var_updated,'%d-%m-%Y')
And the same for the DOB field as well, assuming your CSV has that in the same format.
You said I set "office_mobile_no" column's format to 'number' in csv. CSV is a text file format; it doesn't store any information about how to display data. What you're seeing is just how Excel decides to display large numbers by default. You can change that, but your changes won't be saved when you save to CSV, because the CSV file format doesn't include that sort of information. Try opening the file in Notepad++ to see the real content of the file.
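For the date columns, the corrected load would look something like this (a sketch, assuming the DOB values use the same d-m-Y format as "updated"):
load data local infile 'D:/mydata.csv' into table mydb.mydata
fields terminated by ','
enclosed by '"'
lines terminated by '\r\n'
ignore 1 lines
(SrNo,SourceFrom,@var_updated,Title,First_Name,Middle_Name,Last_Name,Designation,Company_Name,@var_dob,Office_Mobile_No)
-- parse the d-m-Y strings from the file into proper DATE values
set updated = str_to_date(@var_updated,'%d-%m-%Y'), dob = str_to_date(@var_dob,'%d-%m-%Y');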

select * INTO outfile generates date in wrong format

I am using the below sql to export data to a csv file:
select * INTO outfile 'customer.csv'
FIELDS terminated by ',' enclosed by '"'
LINES terminated by '\n'
from CUSTOMER customer
where customer.date_created < '2015-10-22 10:00:00';
I get this result in the csv (screenshot in the original post): the data doesn't import back from this generated csv because the date format is different than in the DB.
The DB date format is yyyy-mm-dd hh:mm:ss. Also, null values are written as \N, which also fails when importing.
How can I generate the csv columns with the correct (i.e. yyyy-mm-dd hh:mm:ss) date format and with usable null/empty values?
Errors:
Incorrect datetime value: '23/07/2015 11:55' for column 'DATE_CREATED' at row 1
Incorrect integer value: '\N' for column 'column_name' at row 1
Note:
I am using mysql workbench to import the file.
I don't want to change the format/data directly in csv file.
Thanks
UPDATE:
Thanks to AdrianBR I realised I was opening the file with Excel first, which was overriding the date format, hence the wrong date format was showing even in Notepad++.
\n is still a problem.
When opened with notepad++ for the first time it looks like this:
"100","0","2015-12-02 10:16:36","2015-12-02 10:16:36","0",\N,
The issue is most likely not coming from mysql.
It is most likely coming from the way Excel displays and later saves the dates.
To troubleshoot:
Open the file in a text editor such as Notepad or Notepad++ and check whether the date is in ISO format or not. It will probably be fine.
Now, if you open it in excel, it will be displayed in local format.
If you save the file now, you are likely to overwrite the ISO date format with excel's local date format, making it not a valid importable mysql date anymore.
Moral: don't use Excel when working with data; only use it to display charts. Excel makes assumptions about your data and messes with it in the most unexpected ways. Remember that 1.19 VAT tax rate? Excel seems to think it's the same as Jan 19. That integer ID? Excel thinks it's better off written in scientific notation and rounded to the first 4 digits. That ISO date? Excel thinks you are better off guessing which is the month and which is the day. That decimal point? Surely you wanted a comma as the decimal separator and a dot for thousands instead. FTFY!
maybe specify the columns explicitly, and include the format in the select list like so:
DATE_FORMAT( mydate, '%Y-%m-%d %H:%i:%s' )
edit:
as in:
SELECT my_id, my_val1, DATE_FORMAT( mydate, '%Y-%m-%d %H:%i:%s' )
FROM mytable
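A fuller sketch of the export, also taking care of the \N problem by writing NULLs as empty strings (the id and name columns are hypothetical; date_created is from the question):
SELECT id,
       name,
       -- format the datetime explicitly and turn NULLs into empty strings
       IFNULL(DATE_FORMAT(date_created, '%Y-%m-%d %H:%i:%s'), '')
INTO OUTFILE 'customer.csv'
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n'
FROM CUSTOMER
WHERE date_created < '2015-10-22 10:00:00';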

Unable to import 3.4GB csv into redshift because values contains free-text with commas

We have a 3.6GB csv that we have uploaded onto S3 and now want to import into Redshift, then do the querying and analysis from iPython.
Problem 1:
This comma-delimited file contains free-text values that themselves contain commas, which interferes with the delimiting, so we can't upload it to Redshift.
When we tried opening the sample dataset in Excel, Excel surprisingly put the values into columns correctly.
Problem 2:
A column that is supposed to contain integers has some records containing letters to indicate some other scenario.
So the only way to get the import through is to declare this column as varchar. But then we can't do calculations on it later on.
Problem 3:
The datetime data type requires the value to be in the format YYYY-MM-DD HH:MM:SS, but the csv doesn't contain the :SS part, and the database is rejecting the import.
We can't manipulate the data on a local machine because it is too big, and we can't upload it onto the cloud for computing because it is not in the correct format.
The last resort would be to scale the instance running iPython all the way up so that we can read the big csv directly from S3, but this approach doesn't make sense as a long-term solution.
Your suggestions?
Train: https://s3-ap-southeast-1.amazonaws.com/bucketbigdataclass/stack_overflow_train.csv (3.4GB)
Train Sample: https://s3-ap-southeast-1.amazonaws.com/bucketbigdataclass/stack_overflow_train-sample.csv (133MB)
Try a different delimiter, or use escape characters.
http://docs.aws.amazon.com/redshift/latest/dg/r_COPY_preparing_data.html
For the second issue, if you want to extract only the numbers from the column after loading it as char, use regexp_replace or other string functions.
For the third issue, you can load it into a VARCHAR field in a staging table and then use something like cast(left(column_name, 10) || ' ' || right(column_name, 5) || ':00' as timestamp) (adjust the offsets to the actual string width) to load it into the final table.
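Alternatively, if the free-text fields are quoted in the file, a COPY in CSV mode together with an explicit TIMEFORMAT may handle issues 1 and 3 directly. A sketch, in which the target table name and IAM role are placeholders and the seconds-less TIMEFORMAT should be verified against the actual data:
COPY stack_overflow_train
FROM 's3://bucketbigdataclass/stack_overflow_train.csv'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
CSV QUOTE '"'
-- the source timestamps carry no seconds component
TIMEFORMAT 'YYYY-MM-DD HH:MI';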
For the first issue, you need to find a way to differentiate between the two types of commas: the delimiters and the commas inside the text. Once you have done that, replace the delimiters with a different character and use that as the delimiter in the COPY command for Redshift.
For the second issue, you need to first figure out whether this column needs to be available for numerical aggregations once loaded. If yes, you need to get the data cleaned up before loading. If no, you can load it directly as a char/varchar field. All your queries will still work, but you will not be able to do any aggregations (sum/avg and the like) on this field.
For problem 3, you can use the TEXT(date, "yyyy-mm-dd hh:mm:ss") function in Excel to do a mass conversion of this field.
Let me know if this works out.

Importing an excel .csv file and adding it to a column in phpMyAdmin

I've read through some other posts and nothing quite answers my question specifically.
I have an existing database in phpMyAdmin - a set of pin codes we use to collect contest entries.
The DB has about 10,000 pin codes in it.
I need to add 250 "New" codes to it. I have an excel file that is stripped down to a single column .csv, no header - just codes.
What I need to do is import these into the table named "pin2", adding them to the column called "pin".
The other columns are where entrants would add names and phone numbers, so they are all "null".
I've uploaded a screen grab of the structure.
DB Structure http://www.redpointdesign.ca/sql.png
any help would be appreciated!
You need to use a LOAD DATA query similar to this:
LOAD DATA INFILE 'pincodes.csv'
INTO TABLE pin2 (pin)
If the pin codes in the csv file are enclosed in quotes, you may also need to include an ENCLOSED BY clause.
LOAD DATA INFILE 'pincodes.csv'
INTO TABLE pin2
FIELDS ENCLOSED BY '"'
(pin)
If you want to do it using phpMyAdmin's csv import,
then you need to follow these steps:
Manually define the auto-incremented value in the first column.
In the other columns you have to explicitly define the value as NULL,
otherwise you will get "Invalid column count in CSV input on line 1",
because a column with no value is not accepted by phpMyAdmin.
Then click on Import in phpMyAdmin and you are done.
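For example, a row in such a csv could look like this (a sketch; the column layout of an auto-increment id, pin, name, and phone is assumed here, not taken from the screenshot):
1,"ABC123",NULL,NULL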

MySql load data infile STR_TO_DATE returning blank?

I'm importing 1m+ records into my table from a csv file.
Works great using the load data local infile method.
However, the dates are all different formats.
A quick google led me to this function:
STR_TO_DATE
However, when I implement that, I get nothing, an empty insert. Here's my SQL, cut down to include one date (I have 4 with the same issue) and generic column names:
load data local infile 'myfile.csv' into table `mytable`
fields terminated by '\t'
lines terminated by '\n'
IGNORE 1 LINES
( `column name 1`
, `my second column`
, @temp_date
, `final column`)
SET `Get Date` = STR_TO_DATE(@temp_date, '%c/%e/%Y')
If I do:
SET `Get Date` = @temp_date
The date from the csv is captured in the format it was in the file.
However, when I try the first method, my table column is empty. I've changed the column type from timestamp to varchar(255) to capture whatever is going in, but ultimately I want to capture Y-m-d H:i:s (not sure if STR_TO_DATE can do that?).
I'm also unsure why I need the @ symbol; Google failed me there.
So, my questions are:
Why do I need the @ symbol to use this function?
Should the data format ('%c/%e/%Y') be the format of the inputted data or my desired output?
Can I capture time in this way too?
sorry for the large post!
Back to Google for now...
Why do I need the @ symbol to use this function?
The @ symbol means that you are using a user variable, so the read string isn't put straight into the table but into a piece of memory that lets you operate on it before inserting it. More info at http://dev.mysql.com/doc/refman/5.0/en/user-variables.html
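A minimal illustration of the idea (the sample value is taken from the answer further down this page):
-- hold the raw string in a user variable, then convert it on the way in
SET @temp_date = '4/29/2012';
SELECT STR_TO_DATE(@temp_date, '%c/%e/%Y');  -- returns 2012-04-29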
Should the data format ('%c/%e/%Y') be the format of the inputted data or my desired output?
It's the format of the inputted data; more info at http://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html#function_str-to-date
Can I capture time in this way too?
You should be able to, as long as you choose the correct format for the input, something like:
STR_TO_DATE(@temp_date, '%c/%e/%Y %H:%i:%s');
I had this problem. What solved it for me was making sure I accounted for whitespace that wasn't part of the delimiters in my load file. So if ',' is the delimiter:
..., 4/29/2012, ...
might be interpreted as " 4/29/2012"
So it should be
...,4/29/2012,...
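Rather than editing the file, the stray whitespace can also be trimmed during the load itself; a sketch, reusing the SET line from the question above:
SET `Get Date` = STR_TO_DATE(TRIM(@temp_date), '%c/%e/%Y')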