Dear All,
I am working on a Java script that loads historical market data from files into MySQL tables, where each data record contains 'date' and 'timestamp' fields. A typical record looks like this:
01/11/2010,10:00:00.007,P,210.8,210.86,6,2,R,,2486662,P,P,,0,,N
01/11/2010,10:00:00.577,W,210.51,211.61,1,1,R,,2487172,W,W,,0,,N
where the first column represents 'date' and the second represents 'timestamp' (with millisecond granularity). Since MySQL does not support millisecond granularity, I am using a BIGINT (long) representation of the 'timestamp' field.
Currently the load query is as follows:
LOAD DATA LOCAL INFILE 'path to file'
INTO TABLE table
FIELDS TERMINATED BY ','
(@date, @time, remaining columns ...)
SET date=STR_TO_DATE(@date, '%m/%d/%Y'),
SET time=UNIX_TIMESTAMP( STR_TO_DATE(@time, '%H:%i:%S.%f') )
The query works fine, apart from the last line (which throws SQLExceptions). I've tried different format patterns but am having problems working out the conversion from string to timestamp before converting to a Unix timestamp. Google isn't helping either. I would really appreciate it if anyone could suggest a reasonable solution. Unfortunately, switching from MySQL to another database is not an option.
Regards,
UNIX_TIMESTAMP wants a "date time" string - you're giving it a "time" string.
select UNIX_TIMESTAMP(CONCAT(STR_TO_DATE("01/11/2010", "%m/%d/%Y"), " ", STR_TO_DATE("10:00:00.577", "%H:%i:%S.%f")));
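To tie this back to the BIGINT column in the question, here is a minimal sketch that turns the two fields into epoch milliseconds. The literal values stand in for the @date and @time user variables, the sketch assumes the fractional part always has exactly three digits, and the result depends on the session time zone. It splits off the milliseconds by hand rather than using %f, so it does not rely on fractional-second support in UNIX_TIMESTAMP.

```sql
-- Sample values standing in for the @date and @time variables filled by LOAD DATA.
SET @date = '01/11/2010', @time = '10:00:00.577';

SELECT
  -- whole seconds since the epoch for the combined date and time
  UNIX_TIMESTAMP(
    STR_TO_DATE(
      CONCAT(@date, ' ', SUBSTRING_INDEX(@time, '.', 1)),
      '%m/%d/%Y %H:%i:%s'
    )
  ) * 1000
  -- plus the three millisecond digits after the dot (assumed always present)
  + CAST(SUBSTRING_INDEX(@time, '.', -1) AS UNSIGNED) AS epoch_millis;
```

Inside LOAD DATA the same expression would go on the right-hand side of the time assignment. The SET keyword appears only once there, with further assignments separated by commas. MySQL 5.6.4 and later can also store fractional seconds directly, for example in a DATETIME(3) column.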
I have a column of data, let's call it bank_date, that I receive from an external vendor as a csv file every day. As such the dates in that column show as '1/1/2020'.
I am trying to upload that raw csv file directly to SQL daily. We used to store bank_date as text, but we have converted the column to a DATE data type, and now it zeroes out every time, with some sort of truncate / "datetime value incorrect" error.
I have now tested 17 different versions using STR_TO_DATE (mostly), CAST, and CONVERT, and feel like I'm close, but I'm not quite getting the syntax right.
Also for reference, I did find 2 other workarounds that are successful, but my boss specifically wants it uploaded and converted directly through the import process (not manipulating the raw csv data) for safety reasons. For reference:
Workaround 1: Convert csv date column to the YYYY-MM-DD format and save file. The issue with this is that if you try to open that CSV file again, it auto-changes the date format back to the standard mm/dd/yyyy. If someone doesn't know to watch out for this and is re-opening the csv file to double check something, they're gonna find an error when they upload, and the problem is not easy to identify.
Workaround 2: Create an extra dummy_date column in the table with a text data type and upload as normal. Then copy the data into the correct bank_date column using STR_TO_DATE as follows: UPDATE bank_table SET bank_date = STR_TO_DATE(dummy_date, '%c/%e/%Y'); The issue with this is that it creates extra, unnecessary data that can confuse other people who may not know that one of the columns is not intended for querying.
Here is my current code:
USE database_name;
LOAD DATA LOCAL INFILE 'C:/Users/Shelly/Desktop/Date Import.csv'
INTO TABLE bank_table
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS
(bank_date, bank_amount)
SET bank_date = str_to_date(bank_date,'%Y-%m-%d');
The "SET" line is where I cannot work out the syntax to convert the csv's '1/5/2020' to SQL's 2020-1-5 format. Every test I've made either produces 0000-00-00 or nulls the column cells. I'm thinking maybe I need to tell SQL how to understand the csv's format in order for it to know how to convert it. Newbie here and stuck.
You need to specify the format of the date as it appears in the file, not the format you want to end up with:
SET bank_date = str_to_date(bank_date,'%c/%e/%Y');
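Put together with the statement from the question, this might look like the sketch below. It reads the date field into a user variable (here @raw_bank_date, a name made up for this example) rather than straight into the DATE column, so STR_TO_DATE sees the original text from the file.

```sql
LOAD DATA LOCAL INFILE 'C:/Users/Shelly/Desktop/Date Import.csv'
INTO TABLE bank_table
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS
-- @raw_bank_date is a hypothetical variable holding the untouched text, e.g. '1/5/2020'
(@raw_bank_date, bank_amount)
-- %c = month without leading zero, %e = day without leading zero, %Y = four-digit year
SET bank_date = STR_TO_DATE(@raw_bank_date, '%c/%e/%Y');
```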
I'm currently learning/testing Hive and can't seem to find a suitable solution to this problem:
I have log files which look like this:
IP, Date, Time, URL, Useragent
Which I have currently in a Table with these Columns. These Columns are delimited by '\t' but URL has been given some specific client information looking somewhat like this:
example.org/log.gif?userID=xxx&sex=m&age=y&subscriber=y&lastlogin=ddd
I want to create a new table from these value-pairs: userID, sex, age, subscriber, lastlogin. Another problem is that the value-pairs are not always complete, and some are missing. Like this:
example.org/log.gif?userID=xxx&sex=m&age=y&subscriber=y&lastlogin=ddd
example.org/log.gif?userID=xxx&sex=m&age=y&lastlogin=
As far as I know, this makes Hive's ... format delimited fields terminated by '&'; useless in this case, because it would put values into the wrong columns.
Is there a way to solve this problem in Hive with SQL and regex?
This can be done, albeit with two Hive tables. You first load data into one table with the columns:
IP, Date, Time, URL, Useragent
Here I recommend using an EXTERNAL Hive table - you aren't parsing the data and this Hive table doesn't need to exist for very long, so simply place Hive metadata on top of it:
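-- Backticks guard against date/time clashing with Hive keywords;
-- ROW FORMAT matches the tab-delimited files described in the question.
CREATE EXTERNAL TABLE raw_log (
  ip string,
  `date` string,
  `time` string,
  url string,
  useragent string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '<hdfs_location_of_the_raw_log_folder>';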
Then use an INSERT INTO query with the Hive regexp_extract(string subject, string pattern, int index) method (see https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF) to load it into the "final" table with the correct columns.
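As a rough sketch of that second step: the parsed_log table and its column names below are invented for illustration, and the patterns assume each value runs up to the next '&' or the end of the URL. Because every key is looked up by name, a missing or empty pair should not shift the remaining values into the wrong column.

```sql
-- Hypothetical target table; the column names mirror the keys in the URL.
CREATE TABLE parsed_log (
  ip string,
  log_date string,
  log_time string,
  useragent string,
  userid string,
  sex string,
  age string,
  subscriber string,
  lastlogin string
);

-- Group 1 of each pattern is the text after "key=" up to the next '&'.
INSERT INTO TABLE parsed_log
SELECT
  ip,
  `date`,
  `time`,
  useragent,
  regexp_extract(url, '[?&]userID=([^&]*)', 1),
  regexp_extract(url, '[?&]sex=([^&]*)', 1),
  regexp_extract(url, '[?&]age=([^&]*)', 1),
  regexp_extract(url, '[?&]subscriber=([^&]*)', 1),
  regexp_extract(url, '[?&]lastlogin=([^&]*)', 1)
FROM raw_log;
```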
You can also write your own UDF which would enable you to better-handle the incomplete/missing values that you mention, albeit with the tradeoff that you have to re-compile and re-deploy a JAR every time the input data's format changes (see https://cwiki.apache.org/confluence/display/Hive/HivePlugins).
I'm importing 1m+ records into my table from a csv file.
Works great using the load data local infile method.
However, the dates are all different formats.
A quick Google search led me to this function:
STR_TO_DATE
However, when I implement that, I get nothing, an empty insert. Here's my SQL, cut down to include one date (I have 4 with the same issue) and generic column names:
load data local infile 'myfile.csv' into table `mytable`
fields terminated by '\t'
lines terminated by '\n'
IGNORE 1 LINES
( `column name 1`
, `my second column`
, @temp_date
, `final column`)
SET `Get Date` = STR_TO_DATE(@temp_date, '%c/%e/%Y')
If I do:
SET `Get Date` = @temp_date
The date from the csv is captured in the format it was in the file.
However, when I try the first method, my table column is empty. I've changed the column type from timestamp to varchar(255) to capture whatever is going in, but ultimately I want to capture y-m-d H:i:s (not sure if STR_TO_DATE can do that?).
I'm also unsure as to why I need the @ symbol. Google failed me there.
So, my questions are:
Why do I need the @ symbol to use this function?
Should the data format ('%c/%e/%Y') be the format of the inputted data or my desired output?
Can I capture time in this way too?
sorry for the large post!
Back to Google for now...
Why do I need the @ symbol to use this function?
The @ symbol means that you are using a variable, so the string read from the file isn't put straight into the table but into a user variable that lets you operate on it before inserting it. More info at http://dev.mysql.com/doc/refman/5.0/en/user-variables.html
Should the data format ('%c/%e/%Y') be the format of the inputted data or my desired output?
It's the format of the inputted data. More info at http://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html#function_str-to-date
Can I capture time in this way too?
You should be able to, as long as you choose the correct format, something like
STR_TO_DATE(@temp_date,'%c/%e/%Y %h:%i:%s');
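As a quick check outside of LOAD DATA, the two statements below sketch how a time part can be parsed along with the date. The sample strings are made up. %H expects a 24-hour clock, while %l together with %p handles a 12-hour clock with an AM/PM marker.

```sql
-- 24-hour clock: %H:%i:%s
SELECT STR_TO_DATE('4/29/2012 21:20:05', '%c/%e/%Y %H:%i:%s');

-- 12-hour clock with an AM/PM marker: %l (hour 1-12) and %p
SELECT STR_TO_DATE('4/29/2012 9:20:05 PM', '%c/%e/%Y %l:%i:%s %p');
```

When the format contains both date and time parts, the result is a DATETIME, which MySQL displays in the y-m-d H:i:s form asked about.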
I had this problem. What solved it for me was making sure I accounted for whitespace in my load file that wasn't part of the delimiter. So if ',' is the delimiter:
..., 4/29/2012, ...
might be interpreted as " 4/29/2012"
So should be
...,4/29/2012,...
I have a csv file with a couple thousand game dates in it, but they are all in the MM/DD/YYYY format
2/27/2011,3:05 PM,26,14
(26 and 14 are team id #s), and trying to put them into SQL like that just results in 0000-00-00 being put into the date field of my table. This is the command I tried using:
LOAD DATA LOCAL INFILE 'c:/scheduletest.csv' INTO TABLE game
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(`date`, `time`, `awayteam_id`, `hometeam_id`);
but again, it wouldn't do the dates right. Is there a way I can have it convert the date as it tries to insert it? I found another SO question similar to this, but I couldn't get it to work.
Have you tried the following:
LOAD DATA LOCAL INFILE 'c:/scheduletest.csv' INTO TABLE game
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(@DATE_STR, `time`, `awayteam_id`, `hometeam_id`)
SET `date` = STR_TO_DATE(@DATE_STR, '%c/%e/%Y');
For more information, the documentation has details about the use of user variables with LOAD DATA (about half-way down - search for "User variables in the SET clause" in the page)
You can use variables to load the data from the csv into and run functions on them before inserting, like:
LOAD DATA INFILE 'file.txt'
INTO TABLE t1
(@datevar, @timevar, awayteam_id, hometeam_id)
SET date = STR_TO_DATE(@datevar, '%m/%d/%Y'),
    time = etc etc etc;
My suggestion would be to insert the file into a temporary holding table where the date column is a character datatype. Then write a query with the STR_TO_DATE conversion to move the data from the holding table to your final destination.
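A rough sketch of that holding-table approach, using the game table and file from the question. The game_staging table and its column names are invented for illustration, and the sketch assumes the time column in game is a TIME and that times in the file look like '3:05 PM'.

```sql
-- Hypothetical holding table: dates and times are kept as plain text.
CREATE TABLE game_staging (
  date_str    VARCHAR(20),
  time_str    VARCHAR(20),
  awayteam_id INT,
  hometeam_id INT
);

-- Load the file unchanged.
LOAD DATA LOCAL INFILE 'c:/scheduletest.csv' INTO TABLE game_staging
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n';

-- Convert while copying into the real table.
INSERT INTO game (`date`, `time`, awayteam_id, hometeam_id)
SELECT
  STR_TO_DATE(date_str, '%c/%e/%Y'),
  -- parse date and time together, then keep only the time part
  TIME(STR_TO_DATE(CONCAT(date_str, ' ', time_str), '%c/%e/%Y %l:%i %p')),
  awayteam_id,
  hometeam_id
FROM game_staging;

-- The holding table is no longer needed once the rows are copied.
DROP TABLE game_staging;
```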
Convert the field that you are using for the date to a varchar type so it will accept any format
Import the CSV
Convert the dates to a valid MySQL date format using something like:
UPDATE table SET field = STR_TO_DATE(field, '%c/%e/%Y %H:%i');
Then revert the field type to date
Use a function to convert the format as needed.
I'm not an expert on MySQL, but http://dev.mysql.com/doc/refman/5.0/en/date-and-time-functions.html#function_str-to-date looks promising.
If you can't do that in the load command directly, you may try creating a table that allows you to load all the values as VARCHAR and then to do an insert into your game table with a select statement with the appropriate conversion instead.
If your file is not too big, you can use the Excel function TEXT. If, for example, your date is in cell A2, then the formula in a temporary column next to it would be =TEXT(A2,"yyyy-mm-dd hh:mm:ss"). You can then paste the values of the formula's result back into the original column and delete the temporary column.
In the CSV file, the date field is in this format:
2/9/2010 7:32
3/31/2011 21:20
I am using php + mysql for development.
I need to read it and store into mysql db.
The final value stored in MySQL should be formatted as below:
2010-02-09 07:32:00
What's the correct way to do this?
Can MySQL syntax alone handle the conversion easily?
Use the STR_TO_DATE() function.
Example
SELECT STR_TO_DATE('3/31/2011 21:20', '%c/%e/%Y %H:%i');
I faced the same issue, and after a little research this is how I resolved it:
LOAD DATA LOCAL INFILE 'D:/dataupload.csv' INTO TABLE table1
FIELDS TERMINATED BY ',' ENCLOSED BY '' LINES TERMINATED BY '\r\n' (@d1,col2,col3,col4)
SET col1 = date_format(str_to_date(@d1, '%m/%d/%Y'), '%Y-%m-%d')
Details:
'%m/%d/%Y' - the format of the date in my CSV file
'%Y-%m-%d' - the MySQL format I want the CSV date converted to while inserting data
col1 - the actual column of my table (with a date data type)
@d1 - a dummy variable to use in the SET statement; you can use any variable name
I had the same problem (with DATE). Another solution is to use the native MySQL format YYYYMMDD, i.e. 20120209.
I haven't tried it with DATETIME, but I guess YYYYMMDDhhmmss will work.
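A small sketch of that idea: MySQL accepts unseparated digit strings as date and datetime values, so the conversion can be checked with CAST before loading a whole file. The values are just examples.

```sql
-- YYYYMMDD read as a DATE
SELECT CAST('20120209' AS DATE);

-- YYYYMMDDhhmmss read as a DATETIME
SELECT CAST('20120209073200' AS DATETIME);
```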