Loading timestamp format using sqlldr - sql-loader

I have the following timestamp data in my CSV file:
CREATE_DATE = 17-SEP-14 03.26.26.000000000 PM
I tried the following field definition when loading with SQL*Loader:
(CREATE_DATE "to_timestamp(:CREATE_DATE,'DD-MON-YY HH.MI.SS AM')"
But it fails with:
AM/A.M. or PM/P.M. required

Your control file entry for that field should look like this:
CREATE_DATE TIMESTAMP "dd-MON-yy hh.mi.ss.ff9 PM"
The ff9 element lets the mask consume the nine fractional-second digits, so the parser can reach the trailing PM token; your original mask stopped at the seconds, which is why Oracle complained that AM/PM was required.
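For context, a minimal control file sketch around that entry (the file name, table name, and ID column here are hypothetical):
LOAD DATA
INFILE 'mydata.csv'
APPEND
INTO TABLE my_table
FIELDS TERMINATED BY ','
TRAILING NULLCOLS
(
ID,
-- TIMESTAMP field with a mask that consumes the fractional seconds
CREATE_DATE TIMESTAMP "dd-MON-yy hh.mi.ss.ff9 PM"
)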

Angular - using a date pipe with ngModel

I am getting the following error in my console - "Error: InvalidPipeArgument: 'Unable to convert "[object Object]" into a date' for pipe 'DatePipe'".
I have a calendar input, which should allow the user to select a date and then pass on that date in a certain format, 'dd/MM/yyyy'. I want the date selected to show in the calendar input once they have selected a date.
I realise I cannot have two-way binding on [ngModel] if I have a pipe there, so I'm using (ngModelChange). If I remove the #createdByCutOffDate="ngModel" then the error goes away, but I cannot see the selected date in the calendar input.
I also tried having the updateCreatedByCutOffDate() method take a Date or a string.
this.createdByCutOffDate is in the following format - 'Thu Feb 17 2022 00:00:00 GMT+0000 (Greenwich Mean Time)'
component.html
<input type="date"
id="createdByCutOffDate"
[ngModel]="createdByCutOffDate | date:'dd/MM/yyyy'"
#createdByCutOffDate="ngModel"
(ngModelChange)="updateCreatedByCutOffDate($event)" />
component.ts
createdByCutOffDate: Date;
updateCreatedByCutOffDate(date: string) {
this.createdByCutOffDate = new Date(date);
}
Inside the template, createdByCutOffDate no longer refers to your Date property: the template reference variable #createdByCutOffDate="ngModel" has the same name and shadows it, so the pipe receives the NgModel directive instance instead of a Date, which is the "[object Object]" the error complains about.
So to solve your problem, rename the template reference variable so it no longer collides with the component property. Note also that a native <input type="date"> only displays values in 'yyyy-MM-dd' format, so use "createdByCutOffDate | date:'yyyy-MM-dd'" for the binding and apply 'dd/MM/yyyy' only where you pass the date onward.
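A sketch of the corrected template under those assumptions (the #cutOffDateInput name is mine, not from your code):
<!-- renamed reference variable: no longer shadows the component property -->
<input type="date"
id="createdByCutOffDate"
[ngModel]="createdByCutOffDate | date:'yyyy-MM-dd'"
#cutOffDateInput="ngModel"
(ngModelChange)="updateCreatedByCutOffDate($event)" />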

How to get the output of a BigQuery query in a specific JSON format

I have a BigQuery table in this format:
DataProvider,Id,Name,Time
ABC,f8453e99-516f-4f15-a3bd-8749089b6934,"xyz",43200
ABC,f8453e99-516f-4f15-a3bd-8749089b6934,"123",43200
ABC,00453e99-516f-4f15-a3bd-8749089b6934,"xyz",43200
I want to generate the output in this format (JSON):
{"dataProviderId":"ABC","items":[{"Id":"f8453e99-516f-4f15-a3bd-8749089b6934","data":[{"Name":"xyz","Time":43200},{"Name":"123","Time":43200}]},{"Id":"00453e99-516f-4f15-a3bd-8749089b6934","data":[{"Name":"xyz","Time":43200}]}]}
In the CLI, you can use the bq command with the --format flag, where you can pass the prettyjson format (an easy-to-read JSON format).
bq query --format=prettyjson --use_legacy_sql=false 'SELECT * FROM `project_id.dataset.table`' > output.json
By putting > at the end of the command, the output is saved to a new file; you will find the query results in output.json.
I hope it helps.
Below is for BigQuery Standard SQL
#standardSQL
SELECT TO_JSON_STRING(t) json
FROM (
SELECT dataProvider, ARRAY_AGG(STRUCT(id, data)) items
FROM (
SELECT dataProvider, id, ARRAY_AGG(STRUCT(name, time)) data
FROM `project.dataset.table` t
GROUP BY dataProvider, id
)
GROUP BY dataProvider
) t
Applied to the sample data in your question, the output is:
Row json
1 {"dataProvider":"ABC","items":[{"id":"f8453e99-516f-4f15-a3bd-8749089b6934","data":[{"name":"xyz","time":43200},{"name":"123","time":43200}]},{"id":"00453e99-516f-4f15-a3bd-8749089b6934","data":[{"name":"xyz","time":43200}]}]}

Loading Json Array File in Hive

I have a file which contains the data as follows
[{"col1":"col1","col2":1}
,{"col1":"col11","col2":11}
,{"col1":"col111","col2":2}
]
I am trying to load this into a Hive table.
I am using the following Hive SerDe:
CREATE EXTERNAL TABLE my_table (
my_array ARRAY<struct<col1:string,col2:int>>
)ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
WITH SERDEPROPERTIES ( "ignore.malformed.json" = "true")
LOCATION "MY_LOCATION";
I get an error when I try to run select * after running the create command:
org.apache.hive.service.cli.HiveSQLException: java.io.IOException: org.apache.hadoop.hive.serde2.SerDeException: java.io.IOException: Start token not found where expected
    at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:499)
    at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:307)
    at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:878)
    at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
    at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
    at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
    at com.sun.proxy.$Proxy35.fetchResults(Unknown Source)
    at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:559)
    at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:751)
    at org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1717)
    at org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1702)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: org.apache.hadoop.hive.serde2.SerDeException: java.io.IOException: Start token not found where expected
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:521)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:428)
    at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:147)
    at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2207)
    at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:494)
Caused by: org.apache.hadoop.hive.serde2.SerDeException: java.io.IOException: Start token not found where expected
    at org.apache.hive.hcatalog.data.JsonSerDe.deserialize(JsonSerDe.java:184)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:502)
Caused by: java.io.IOException: Start token not found where expected
    at org.apache.hive.hcatalog.data.JsonSerDe.deserialize(JsonSerDe.java:170)
I tried several things, none of which worked as expected. I can't change the input data format, as someone else is providing the data.
This is a malformed-JSON issue as far as this SerDe is concerned: it expects each record to be a single JSON object, i.e. a line that starts and ends with curly braces. So change your JSON file to look something like below.
{"my_array":[{"col1":"col1","col2":1},{"col1":"col11","col2":11},{"col1":"col111","col2":2}]}
Create your table in the exact same way as you are doing it already.
CREATE EXTERNAL TABLE my_table
(
my_array ARRAY<struct<col1:string,col2:int>>
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
WITH SERDEPROPERTIES ( "ignore.malformed.json" = "true")
LOCATION "MY_LOCATION";
Now fire a select * on your newly created table to see following results.
[{"col1":"col1","col2":1},{"col1":"col11","col2":11},{"col1":"col111","col2":2}]
Use select my_array.col1 from my_table; to see the values for col1 from your array.
["col1","col11","col111"]
PS - Not the most efficient way to store the data. Consider transforming the data and storing it as ORC/Parquet.
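For instance, a minimal sketch of that transformation, reusing the table above (the my_table_orc name is made up):
-- rewrite the JSON-backed table into columnar ORC storage
CREATE TABLE my_table_orc STORED AS ORC AS
SELECT * FROM my_table;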
Hope that helps!
Looks like the issue is with your JSON data. Can you try the example below?
Create an employee JSON file with the content below and place it in HDFS.
[root@quickstart spark]# hadoop fs -cat /user/cloudera/spark/employeejson/*
{"Name":"Vinayak","age":35}
{"Name":"Nilesh","age":37}
{"Name":"Raju","age":30}
{"Name":"Karthik","age":28}
{"Name":"Shreshta","age":1}
{"Name":"Siddhish","age":2}
Add the jar below (execute this only if you get an error):
hive> ADD JAR /usr/lib/hive-hcatalog/lib/hive-hcatalog-core.jar;
hive>
CREATE TABLE employeefromjson(name string, age int)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS TEXTFILE
LOCATION '/user/cloudera/hive/employeefromjson'
;
hive> LOAD DATA INPATH '/user/cloudera/spark/employeejson' OVERWRITE INTO TABLE employeefromjson;
hive> select * from employeefromjson;
OK
Vinayak 35
Nilesh 37
Raju 30
Karthik 28
Shreshta 1
Siddhish 2
Time taken: 0.174 seconds, Fetched: 6 row(s)
For this SerDe, a JSON record should always start with '{' and not with '['. That is the problem. JSON has a structure of {'key':'value'}, and what you have given in your file is a value which does not have any key. So, change your JSON to the below format:
{"my_array":[{"col1":"col1","col2":1},{"col1":"col11","col2":11},{"col1":"col111","col2":2}]}
Your Create table statement should work fine.
If you want to get the data for each column for all the rows, use the below query.
select my_array.col1, my_array.col2 from my_table;
The above command will give you the below result.
OK
["col1","col11","col111"] [1,11,2]
If you want to get the result column-wise for each row separately, use the below query.
select a.* from my_table m lateral view outer inline (m.my_array) a;
The above command will give you the below result.
OK
col1 1
col11 11
col111 2
Hope this helps!

Loading JSON file in HIVE table

I have a JSON file like the one below, which I want to load into a Hive table in parsed form. What options can I go for?
If it were Avro, I could have used AvroSerDe directly. But the source file in this case is JSON.
{
"subscriberId":"vfd1234-07e1-4054-9b64-83a5a20744db",
"cartId":"1234edswe-6a9c-493c-bcd0-7fb71995beef",
"cartStatus":"default",
"salesChannel":"XYZ",
"accountId":"12345",
"channelNumber":"12",
"timestamp":"Dec 12, 2013 8:30:00 AM",
"promotions":[
{
"promotionId":"NEWID1234",
"promotionContent":{
"has_termsandconditions":[
"TC_NFLMAXDEFAULT16R103578"
],
"sequenceNumber":"305",
"quantity":"1",
"promotionLevel":"basic",
"promotionDuration":"1",
"endDate":"1283142400000",
"description":"Regular Season One Payment",
"active":"true",
"disableInOfferPanel":"true",
"displayInCart":"true",
"type":"promotion",
"frequencyOfCharge":"weekly",
"promotionId":"NEWID1234",
"promotionIndicator":"No",
"shoppingCartTitle":"Regular Season One Payment",
"discountedPrice":"0",
"preselectedInOfferPanel":"false",
"price":"9.99",
"name":"Regular Season One Payment",
"have":[
"CatNFLSundayMax"
],
"ID":"NEWID1234",
"startDate":"1451365600000",
"displayInOfferPanel":"true"
}
}
]
}
I tried to create a table using org.openx.data.jsonserde.JsonSerDe, but it is not showing me the data.
CREATE EXTERNAL TABLE test1
(
SUBSCRIBER_ID string,
CART_ID string,
CART_STAT_NAME string,
SLS_CHAN_NAME string,
ACCOUNT_ID string,
CHAN_NBR string,
TX_TMSTMP string,
PROMOTION ARRAY<STRING>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION '<HDFS location where the json file is place in single line>';
Not sure about the JsonSerDe you are using, but here is one you can use: Hive-JSON-Serde.
hive> add jar /User/User1/json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar;
Added [/User/User1/json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar] to class path
Added resources: [/User/User1/json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar]
hive> use default;
OK
Time taken: 0.021 seconds
hive> CREATE EXTERNAL TABLE IF NOT EXISTS json_poc (
> alertHistoryId bigint, entityId bigint, deviceId string, alertTypeId int, AlertStartDate string
> )
> ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
> LOCATION '/User/User1/sandeep_poc/hive_json';
OK
Time taken: 0.077 seconds
hive> select * from json_poc;
OK
123456 123 123 1 jan 04, 2017 2:46:48 PM
Time taken: 0.052 seconds, Fetched: 1 row(s)
How to build the jar:
Maven should be installed on your PC; then run a command like this:
C:\Users\User1\Downloads\Hive-JSON-Serde-develop\Hive-JSON-Serde-develop>mvn -Phdp23 clean package
-Phdp23 means HDP 2.3; replace it to match your Hadoop distribution version.
Or, if you want to use the built-in JSON functions instead of a SerDe, look at get_json_object and json_tuple.
If you are looking for an example of how to use them, see this blog: Hive-JSON-Serde example.
I would also recommend validating your JSON file: JSON Validator.
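For example, with the raw JSON loaded into a single string column (the raw_json table and json_col column names here are hypothetical), the built-ins can be used like this:
-- get_json_object pulls one key per call, using a JSONPath-style expression
SELECT get_json_object(json_col, '$.subscriberId') AS subscriberId,
       get_json_object(json_col, '$.cartId') AS cartId
FROM raw_json;
-- json_tuple extracts several top-level keys in a single pass
SELECT t.subscriberId, t.cartId
FROM raw_json j
LATERAL VIEW json_tuple(j.json_col, 'subscriberId', 'cartId') t AS subscriberId, cartId;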
If you read the official documentation: when you are using Hive 0.12 and later, use hive-hcatalog-core.
Note: For Hive releases prior to 0.12, Amazon provides a JSON SerDe available at s3://elasticmapreduce/samples/hive-ads/libs/jsonserde.jar.
You should first add the hive-hcatalog-core jar:
ADD JAR /path/to/jar/;
You can either download it from the Maven repository or find it manually.
Then the Hive table should look like this:
CREATE EXTERNAL TABLE test1
(
SUBSCRIBER_ID string,
CART_ID string,
CART_STAT_NAME string,
SLS_CHAN_NAME string,
ACCOUNT_ID string,
CHAN_NBR string,
TX_TMSTMP string,
PROMOTION ARRAY<STRING>
)
ROW FORMAT SERDE
'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION '<HDFS location where the json file is place in single line>';
Steps to load JSON file data into a Hive table:
1] Create a table in Hive:
hive> create table JsonTableExample(data string);
2] Load the JSON file into the Hive table:
hive> load data inpath '/home/cloudera/testjson.json' into table JsonTableExample;
3] A plain select * from JsonTableExample; returns the raw JSON strings. That is not an effective solution, so we follow step 4.
4] Select data using the get_json_object() function:
hive> select get_json_object(data,'$.id') as id,
get_json_object(data,'$.name') as name from JsonTableExample;
For many versions of Hive, perhaps the best way to enable JSON processing is using org.apache.hive.hcatalog.data.JsonSerDe as previously mentioned. This is the out-of-the-box capability. However, for some versions of CDH6 and HDP3, there is a new feature where JSON is a first-class citizen. This exists in Apache Hive 4.0 and higher.
CREATE TABLE ... STORED AS JSONFILE;
Please note that each JSON object must be on its own line (without line breaks):
{"name":"john","age":30}
{"name":"sue","age":32}
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL

Move data from MySQL to Oracle where the data contains HTML, using SQL*Loader

Please help me move data from MySQL to Oracle where the data contains HTML, using SQL*Loader.
I have exported the MySQL data to a CSV file.
Sample CSV data:
±14044±©±1±©±1±©±1±©±MailManager Attachment±©±image001.gif±©±6416-01-11 11:30:06±©±6416-01-11 11:30:06±©±null±©±null±©±0±©±1±©±0±©±null±
±14045±©±1±©±1±©±1±©±MailManager Attachment±©±image002.jpg±©±6416-01-11 11:30:06±©±6416-01-11 11:30:06±©±null±©±null±©±0±©±1±©±0±©±null±
±14046±©±1±©±1±©±1±©±Emails±©±"
<p>"
 </p>"
<p style=""margin:0;padding:0;"">"
On 02-20-2014 13:26:49, crmtelesales@fecredit.com.vn, wrote:</p>"
<blockquote style=""border:0;margin:0;border-left:1px solid #808080;padding:0 0 0 2px;"">"
<div style=""font-size:13px;font-family:tahoma;color:rgb(0,0,0);font-weight:normal;font-style:normal;background-image:none;background-attachment:scroll;background-position:0% 0%;"">"
do not reply</div>"
<br />"
 </blockquote>"
<br />±©±2014-03-03 10:11:39±©±2014-03-03 10:11:39±©±null±©±null±©±0±©±1±©±0±©±Re: tests±
My control file
LOAD DATA
INFILE '/home/ggt/csv/vtiger_crmentity.csv'
TRUNCATE INTO TABLE DWVTIGER.VTIGER_CRMENTITY
FIELDS TERMINATED BY "," ENCLOSED BY '|'
TRAILING NULLCOLS
(
CRMID ,
SMCREATORID ,
SMOWNERID ,
MODIFIEDBY ,
SETYPE ,
DESCRIPTION NULLIF DESCRIPTION='null',
CREATEDTIME date "yyyy-mm-dd hh24:mi:ss" ,
MODIFIEDTIME date "yyyy-mm-dd hh24:mi:ss" ,
VIEWEDTIME date "yyyy-mm-dd hh24:mi:ss" NULLIF VIEWEDTIME='null',
STATUS NULLIF STATUS='null',
VERSION ,
PRESENCE NULLIF PRESENCE='null',
DELETED ,
LABEL NULLIF LABEL='null'
)
SQL*Loader cannot load data that is in HTML format. You will need to get the data out of the markup and make a true CSV before SQL*Loader can load it: it is made to load flat files, where each row is a record and all records are alike (basically).
You need to read up on this: https://docs.oracle.com/cd/B28359_01/server.111/b28319/ldr_concepts.htm#g1013706
and this: https://docs.oracle.com/cd/B28359_01/server.111/b28319/ldr_control_file.htm#i1006645