I have a csv file containing timestamps like:
2018-01-01T12:13:14.000+01:00
I would like to store them as timestamps in Hive. Is it possible to do this directly, or should I preprocess the CSV file in order to have "better" timestamps?
The following query is not able to correctly store them:
CREATE EXTERNAL TABLE IF NOT EXISTS test_timestamps(
timestamp TIMESTAMP,
name STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
location '/test_timestamps/';
Thank you
If you want to retain the format, store the column as STRING and use the date functions to convert it to the required format when you select from the table.
Note: All Hive keywords are case-insensitive, so you might want to use a proper name for the column instead of "timestamp".
select date_format(timestamp, "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"),name from test_timestamps;
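If preprocessing the CSV turns out to be the easier route, here is a minimal Python sketch of that step. It assumes every value has a fractional-seconds part and a numeric UTC offset, and that normalizing everything to UTC is acceptable; the function name to_hive_timestamp is made up for illustration. The output uses the yyyy-MM-dd HH:mm:ss.SSS text form that Hive accepts for a TIMESTAMP column in a text file.

```python
from datetime import datetime, timezone


def to_hive_timestamp(value: str) -> str:
    """Convert an ISO 8601 string with offset to 'yyyy-MM-dd HH:mm:ss.SSS' in UTC."""
    # %z accepts offsets written as +01:00 on Python 3.7 and later.
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%f%z")
    utc = parsed.astimezone(timezone.utc)
    # strftime has no millisecond directive, so build that part by hand.
    return utc.strftime("%Y-%m-%d %H:%M:%S") + ".%03d" % (utc.microsecond // 1000)


print(to_hive_timestamp("2018-01-01T12:13:14.000+01:00"))  # prints 2018-01-01 11:13:14.000
```

If the original offset matters, it would have to be kept in a separate column, since this conversion discards it.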
Related
I have a CSV file that I want to create a database from. Everything works fine except for the date, which is now stored in a VARCHAR column.
Is it possible to have the date stored in a DATE column when I import it through phpMyAdmin? The CSV looks like this:
Date,HomeTeam,AwayTeam,FTHG,FTAG,FTR,HTHG,HTAG,HTR
29-09-2017,Excelsior,Vitesse,0,3,A,0,1,A
30-09-2017,Heracles,Feyenoord,2,4,A,0,3,A
30-09-2017,Willem II,Den Haag,1,2,A,0,1,A
You may use LOAD DATA and inline a call to STR_TO_DATE to convert your string dates on the fly:
LOAD DATA LOCAL INFILE 'yourfile.csv'
INTO TABLE yourTable
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
(@Date, HomeTeam, AwayTeam, FTHG, FTAG, FTR, HTHG, HTAG, HTR)
SET Date = STR_TO_DATE(@Date, '%d-%m-%Y');
By the way, you are absolutely making the right design decision by storing your date information as actual dates, and not just as text. Storing dates as text opens the door for problems later on when you actually go to use your database table.
I'm importing a CSV into MonetDB. I create a table called fx:
CREATE TABLE fx(ticktime timestamp,broker varchar(6),pair varchar(10),side varchar(1),price float,size tinyint,level tinyint)
and now I am trying to upload a large CSV file that does not have a header.
My sample.csv:
20150828 00:00:00.023,BRK1,EUR/USD,A,1.12437,1,1
20150828 00:00:00.023,BRK1,EUR/USD,A,1.12439,5,2
20150828 00:00:00.023,BRK1,EUR/USD,A,1.12441,9,3
My command:
sql>copy into fx from 'c:\fx\sample.csv' using delimiters ',','\n';
Failed to import table line 1 field 1 'timestamp(7)' expected in '20150828 00:00:00.023'
How do I upload this csv?
The timestamp format in your file is not the one MonetDB likes. So two options:
1) Change the type of ticktime to string:
CREATE TABLE fx(ticktime string, broker varchar(6),pair varchar(10),side varchar(1),price float,size tinyint,level tinyint);
COPY INTO ...
However, you would then need to convert the string column ticktime to a new column ticktimet of type timestamp using string manipulation, for example:
ALTER TABLE fx add column ticktimet timestamp;
UPDATE fx SET ticktimet=str_to_timestamp(ticktime , '%Y%m%d %H:%M:%S');
Note that this solution will discard the subsecond part (e.g. .023) from the timestamp, as this is currently not supported in str_to_timestamp.
2) Change the CSV to use a date format MonetDB likes, e.g.
2015-08-28 00:00:00.023,BRK1,EUR/USD,A,1.12437,1,1
2015-08-28 00:00:00.023,BRK1,EUR/USD,A,1.12439,5,2
2015-08-28 00:00:00.023,BRK1,EUR/USD,A,1.12441,9,3
Then, COPY INTO should work directly.
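Option 2 can be scripted. The following is a rough Python sketch that rewrites the first field of each row from 20150828 00:00:00.023 to 2015-08-28 00:00:00.023. It assumes the timestamp is always the first column and always carries a fractional part; the names fix_ticktime and convert are invented for this example.

```python
import csv
import io
from datetime import datetime


def fix_ticktime(value: str) -> str:
    """Turn '20150828 00:00:00.023' into '2015-08-28 00:00:00.023'."""
    parsed = datetime.strptime(value, "%Y%m%d %H:%M:%S.%f")
    # Keep millisecond precision, matching the input file.
    return parsed.strftime("%Y-%m-%d %H:%M:%S") + ".%03d" % (parsed.microsecond // 1000)


def convert(src, dst):
    """Copy CSV rows from src to dst, rewriting the first field of each row."""
    writer = csv.writer(dst, lineterminator="\n")
    for row in csv.reader(src):
        if not row:  # skip blank lines
            continue
        row[0] = fix_ticktime(row[0])
        writer.writerow(row)


# In-memory demonstration using one line of the sample data.
sample = io.StringIO("20150828 00:00:00.023,BRK1,EUR/USD,A,1.12437,1,1\n")
out = io.StringIO()
convert(sample, out)
print(out.getvalue(), end="")
```

For a real file, src and dst would be file objects opened with newline="" so that the csv module controls the line endings.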
I'm importing 1m+ records into my table from a csv file.
Works great using the load data local infile method.
However, the dates are all different formats.
A quick google led me to this function:
STR_TO_DATE
However, when I implement that, I get nothing, an empty insert. Here's my SQL, cut down to include one date (I have 4 with the same issue) and generic column names:
load data local infile 'myfile.csv' into table `mytable`
fields terminated by '\t'
lines terminated by '\n'
IGNORE 1 LINES
( `column name 1`
, `my second column`
, @temp_date
, `final column`)
SET `Get Date` = STR_TO_DATE(@temp_date, '%c/%e/%Y')
If I do:
SET `Get Date` = @temp_date
The date from the csv is captured in the format it was in the file.
However, when I try the first method, my table column is empty. I've changed the column type from timestamp to varchar(255) to capture whatever is going in, but ultimately I want to capture y-m-d H:i:s (not sure if STR_TO_DATE can do that?)
I'm also unsure as to why I need the @ symbol... google failed me there.
So, my questions are:
Why do I need the @ symbol to use this function?
Should the data format ('%c/%e/%Y') be the format of the inputted data or my desired output?
Can I capture time in this way too?
sorry for the large post!
Back to Google for now...
Why do I need the @ symbol to use this function?
The @ symbol means that you are using a user variable, so the string that is read isn't put straight into the table but into a piece of memory that lets you operate on it before inserting it. More info at http://dev.mysql.com/doc/refman/5.0/en/user-variables.html
Should the data format ('%c/%e/%Y') be the format of the inputted data or my desired output?
It's the format of the inputted data, more info at http://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html#function_str-to-date
Can I capture time in this way too?
You should be able to, as long as you choose the correct format, something like
STR_TO_DATE(@temp_date,'%c/%e/%Y %h:%i:%s');
I had this problem. What solved it for me was making sure I accounted for whitespace in my load file that wasn't part of a delimiter. So if ',' is the delimiter:
..., 4/29/2012, ...
might be interpreted as " 4/29/2012"
So should be
...,4/29/2012,...
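One way to remove such stray whitespace before loading is a small cleanup pass over the file. This is only a sketch in Python, assuming a comma-delimited file where no field needs its leading or trailing spaces preserved; strip_fields is a hypothetical helper name.

```python
import csv
import io


def strip_fields(src, dst):
    """Copy CSV rows from src to dst with whitespace trimmed from each field."""
    writer = csv.writer(dst, lineterminator="\n")
    for row in csv.reader(src):
        writer.writerow([field.strip() for field in row])


# In-memory demonstration: the padded date field comes out without spaces.
raw = io.StringIO("1, 4/29/2012, foo\n")
clean = io.StringIO()
strip_fields(raw, clean)
print(clean.getvalue(), end="")  # prints 1,4/29/2012,foo
```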
I have a csv file with a couple thousand game dates in it, but they are all in the MM/DD/YYYY format
2/27/2011,3:05 PM,26,14
(26 and 14 are team id #s), and trying to put them into SQL like that just results in 0000-00-00 being put into the date field of my table. This is the command I tried using:
LOAD DATA LOCAL INFILE 'c:/scheduletest.csv' INTO TABLE game
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(`date`, `time`, `awayteam_id`, `hometeam_id`);
but again, it wouldn't do the dates right. Is there a way I can have it convert the date as it tries to insert it? I found another SO question similar to this, but I couldn't get it to work.
Have you tried the following:
LOAD DATA LOCAL INFILE 'c:/scheduletest.csv' INTO TABLE game
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(@DATE_STR, `time`, `awayteam_id`, `hometeam_id`)
SET `date` = STR_TO_DATE(@DATE_STR, '%c/%e/%Y');
For more information, the documentation has details about the use of user variables with LOAD DATA (about half-way down - search for "User variables in the SET clause" in the page)
You can use variables to load the data from the csv into and run functions on them before inserting, like:
LOAD DATA INFILE 'file.txt'
INTO TABLE t1
FIELDS TERMINATED BY ','
(@datevar, @timevar, awayteam_id, hometeam_id)
SET date = STR_TO_DATE(@datevar, '%m/%d/%Y'),
    time = STR_TO_DATE(@timevar, '%l:%i %p');
My suggestion would be to insert the file into a temporary holding table where the date column is a character datatype. Then write a query with the STR_TO_DATE conversion to move the data from the holding table to your final destination.
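The holding-table pattern can be illustrated end to end with a small Python sketch. It uses an in-memory SQLite database purely as a stand-in so that it runs anywhere; SQLite has no STR_TO_DATE, so the conversion happens in Python here, whereas in MySQL it would sit inside an INSERT ... SELECT. The table and column names are invented for the example.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
# Holding table: the date is plain text, so any format loads without loss.
conn.execute("CREATE TABLE game_staging (date_str TEXT, away INTEGER, home INTEGER)")
# Final table: the date is stored in ISO form (YYYY-MM-DD).
conn.execute("CREATE TABLE game (date TEXT, away INTEGER, home INTEGER)")
conn.execute("INSERT INTO game_staging VALUES ('2/27/2011', 26, 14)")

# Move rows across, converting M/D/YYYY to YYYY-MM-DD on the way.
staged = conn.execute("SELECT date_str, away, home FROM game_staging").fetchall()
for date_str, away, home in staged:
    iso = datetime.strptime(date_str, "%m/%d/%Y").date().isoformat()
    conn.execute("INSERT INTO game VALUES (?, ?, ?)", (iso, away, home))

rows = conn.execute("SELECT * FROM game").fetchall()
print(rows)  # prints [('2011-02-27', 26, 14)]
```

Keeping the raw strings in the holding table also makes it possible to inspect rows whose dates fail to parse before anything reaches the final table.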
Convert the field that you are using for the date to a VARCHAR type so that it accepts any format
Import CSV
Convert the dates to a valid mysql date format using something like:
UPDATE table SET field = STR_TO_DATE(field, '%c/%e/%Y %H:%i');
Then revert field type to date
Use a function to convert the format as needed.
I'm not an expert on MySQL, but http://dev.mysql.com/doc/refman/5.0/en/date-and-time-functions.html#function_str-to-date looks promising.
If you can't do that in the load command directly, you may try creating a table that allows you to load all the values as VARCHAR and then to do an insert into your game table with a select statement with the appropriate conversion instead.
If your file is not too big, you can use the Excel function TEXT. If, for example, your date is in cell A2, then the formula in a temporary column next to it would be =TEXT(A2,"yyyy-mm-dd hh:mm:ss"). You can then paste the values of the formula's result back into the original column and delete the temporary column.
In the CSV file, the date field is in this format:
2/9/2010 7:32
3/31/2011 21:20
I am using php + mysql for development.
I need to read it and store into mysql db.
The final value stored in MySQL should be formatted as below:
2010-02-09 07:32:00
What's the correct way to do it?
Can MySQL syntax alone handle the conversion easily?
Use the STR_TO_DATE() function.
Example
STR_TO_DATE('3/31/2011 21:20', '%c/%e/%Y %H:%i');
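The same parse-then-format split can be checked outside the database. Below is a small Python sketch, assuming 24-hour times without seconds as in the sample values; to_mysql_datetime is a made-up name. The first format string describes the input, the second the desired output.

```python
from datetime import datetime


def to_mysql_datetime(value: str) -> str:
    """Convert 'M/D/YYYY H:MM' to 'YYYY-MM-DD HH:MM:SS'."""
    # strptime accepts month, day and hour with or without a leading zero.
    parsed = datetime.strptime(value, "%m/%d/%Y %H:%M")
    return parsed.strftime("%Y-%m-%d %H:%M:%S")


print(to_mysql_datetime("2/9/2010 7:32"))    # prints 2010-02-09 07:32:00
print(to_mysql_datetime("3/31/2011 21:20"))  # prints 2011-03-31 21:20:00
```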
I faced the same issue, and after a little research this is how I resolved it:
LOAD DATA LOCAL INFILE 'D:/dataupload.csv' INTO TABLE table1
FIELDS TERMINATED BY ',' ENCLOSED BY '' LINES TERMINATED BY '\r\n' (@d1,col2,col3,col4)
SET col1 = date_format(str_to_date(@d1, '%m/%d/%Y'), '%Y-%m-%d')
Details:
'%m/%d/%Y' - this is the format of date in my CSV file
'%Y-%m-%d' - this is the MySQL format into which I want to convert my CSV date field while inserting data
col1 - is the actual column of my table (having date data type)
@d1 - is the dummy variable used in the SET statement; you can use any variable name
I had the same problem (with DATE). Another solution is to use the native MySQL format YYYYMMDD, i.e. 20120209.
I haven't tried with DATETIME but I guess YYYYMMDDhhmmss will work.