Loading timestamp format using sqlldr - sql-loader

I have the following timestamp data in my CSV file:
CREATE_DATE = 17-SEP-14 03.26.26.000000000 PM
I tried the following field definition when loading with SQL*Loader:
(CREATE_DATE "to_timestamp(:CREATE_DATE,'DD-MON-YY HH.MI.SS AM')"
But it fails with:
AM/A.M. or PM/P.M. required

Your control file entry for that field should look like this:
CREATE_DATE TIMESTAMP "dd-MON-yy hh.mi.ss.ff9 PM"
The ff9 element lets the mask consume the nine fractional-second digits, so the parser can reach the trailing PM token; your original mask stopped at the seconds, which is why Oracle complained that AM/PM was required.
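For context, a minimal control file sketch around that entry (the file name, table name, and ID column here are hypothetical):
LOAD DATA
INFILE 'mydata.csv'
APPEND
INTO TABLE my_table
FIELDS TERMINATED BY ','
TRAILING NULLCOLS
(
ID,
-- TIMESTAMP field with a mask that consumes the fractional seconds
CREATE_DATE TIMESTAMP "dd-MON-yy hh.mi.ss.ff9 PM"
)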

Angular - using a date pipe with ngModel

I am getting the following error in my console - "Error: InvalidPipeArgument: 'Unable to convert "[object Object]" into a date' for pipe 'DatePipe'".
I have a calendar input, which should allow the user to select a date and then pass on that date in a certain format, 'dd/MM/yyyy'. I want the date selected to show in the calendar input once they have selected a date.
I realise I cannot have two-way binding on [ngModel] if I have a pipe there, so I'm using (ngModelChange). If I remove the #createdByCutOffDate="ngModel" then the error goes away, but I cannot see the selected date in the calendar input.
I also tried having the updateCreatedByCutOffDate() method take a Date or a string.
this.createdByCutOffDate is in the following format - 'Thu Feb 17 2022 00:00:00 GMT+0000 (Greenwich Mean Time)'
component.html
<input type="date"
id="createdByCutOffDate"
[ngModel]="createdByCutOffDate | date:'dd/MM/yyyy'"
#createdByCutOffDate="ngModel"
(ngModelChange)="updateCreatedByCutOffDate($event)" />
component.ts
createdByCutOffDate: Date;
updateCreatedByCutOffDate(date: string) {
this.createdByCutOffDate = new Date(date);
}
Inside the template, createdByCutOffDate no longer refers to your Date property: the template reference variable #createdByCutOffDate="ngModel" has the same name and shadows it, so the pipe receives the NgModel directive instance instead of a Date, which is the "[object Object]" the error complains about.
So to solve your problem, rename the template reference variable so it no longer collides with the component property. Note also that a native <input type="date"> only displays values in 'yyyy-MM-dd' format, so use "createdByCutOffDate | date:'yyyy-MM-dd'" for the binding and apply 'dd/MM/yyyy' only where you pass the date onward.
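A sketch of the corrected template under those assumptions (the #cutOffDateInput name is mine, not from your code):
<!-- renamed reference variable: no longer shadows the component property -->
<input type="date"
id="createdByCutOffDate"
[ngModel]="createdByCutOffDate | date:'yyyy-MM-dd'"
#cutOffDateInput="ngModel"
(ngModelChange)="updateCreatedByCutOffDate($event)" />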

How to get the output of a BigQuery query in a specific JSON format

I have a BigQuery table in this format:
DataProvider,Id,Name,Time
ABC,f8453e99-516f-4f15-a3bd-8749089b6934,"xyz",43200
ABC,f8453e99-516f-4f15-a3bd-8749089b6934,"123",43200
ABC,00453e99-516f-4f15-a3bd-8749089b6934,"xyz",43200
I want to generate the output in this format (JSON):
{"dataProviderId":"ABC","items":[{"Id":"f8453e99-516f-4f15-a3bd-8749089b6934","data":[{"Name":"xyz","Time":43200},{"Name":"123","Time":43200}]},{"Id":"00453e99-516f-4f15-a3bd-8749089b6934","data":[{"Name":"xyz","Time":43200}]}]}
In the CLI, you can use the bq command with the --format flag, where you can pass the prettyjson format (an easy-to-read JSON format).
bq query --format=prettyjson --use_legacy_sql=false 'SELECT * FROM `project_id.dataset.table`' > output.json
By putting > at the end of the command, the output is saved to a new file; you will find the query results in output.json.
I hope it helps.
Below is for BigQuery Standard SQL
#standardSQL
SELECT TO_JSON_STRING(t) json
FROM (
SELECT dataProvider, ARRAY_AGG(STRUCT(id, data)) items
FROM (
SELECT dataProvider, id, ARRAY_AGG(STRUCT(name, time)) data
FROM `project.dataset.table` t
GROUP BY dataProvider, id
)
GROUP BY dataProvider
) t
Applied to the sample data in your question, the output is:
Row json
1 {"dataProvider":"ABC","items":[{"id":"f8453e99-516f-4f15-a3bd-8749089b6934","data":[{"name":"xyz","time":43200},{"name":"123","time":43200}]},{"id":"00453e99-516f-4f15-a3bd-8749089b6934","data":[{"name":"xyz","time":43200}]}]}

Loading Json Array File in Hive

I have a file which contains the data as follows
[{"col1":"col1","col2":1}
,{"col1":"col11","col2":11}
,{"col1":"col111","col2":2}
]
I am trying to load this into a Hive table.
I am using the following Hive SerDe:
CREATE EXTERNAL TABLE my_table (
my_array ARRAY<struct<col1:string,col2:int>>
)ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
WITH SERDEPROPERTIES ( "ignore.malformed.json" = "true")
LOCATION "MY_LOCATION";
I get an error when I try to run select * after running the create command:
org.apache.hive.service.cli.HiveSQLException: java.io.IOException: org.apache.hadoop.hive.serde2.SerDeException: java.io.IOException: Start token not found where expected
    at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:499)
    at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:307)
    at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:878)
    at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
    at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
    at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
    at com.sun.proxy.$Proxy35.fetchResults(Unknown Source)
    at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:559)
    at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:751)
    at org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1717)
    at org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1702)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: org.apache.hadoop.hive.serde2.SerDeException: java.io.IOException: Start token not found where expected
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:521)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:428)
    at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:147)
    at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2207)
    at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:494)
Caused by: org.apache.hadoop.hive.serde2.SerDeException: java.io.IOException: Start token not found where expected
    at org.apache.hive.hcatalog.data.JsonSerDe.deserialize(JsonSerDe.java:184)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:502)
Caused by: java.io.IOException: Start token not found where expected
    at org.apache.hive.hcatalog.data.JsonSerDe.deserialize(JsonSerDe.java:170)
I tried several things, none of which worked as expected. I can't change the input data format, as someone else is providing the data.
This is a malformed-JSON issue as far as this SerDe is concerned: it expects each record to be a single JSON object, i.e. a line that starts and ends with curly braces. So change your JSON file to look something like below.
{"my_array":[{"col1":"col1","col2":1},{"col1":"col11","col2":11},{"col1":"col111","col2":2}]}
Create your table in the exact same way as you are doing it already.
CREATE EXTERNAL TABLE my_table
(
my_array ARRAY<struct<col1:string,col2:int>>
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
WITH SERDEPROPERTIES ( "ignore.malformed.json" = "true")
LOCATION "MY_LOCATION";
Now fire a select * on your newly created table to see following results.
[{"col1":"col1","col2":1},{"col1":"col11","col2":11},{"col1":"col111","col2":2}]
Use select my_array.col1 from my_table; to see the values for col1 from your array.
["col1","col11","col111"]
PS - Not the most efficient way to store the data. Consider transforming the data and storing it as ORC/Parquet.
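For instance, a minimal sketch of that transformation, reusing the table above (the my_table_orc name is made up):
-- rewrite the JSON-backed table into columnar ORC storage
CREATE TABLE my_table_orc STORED AS ORC AS
SELECT * FROM my_table;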
Hope that helps!
Looks like the issue is with your JSON data. Can you try the example below?
Create an employee JSON file with the content below and place it in HDFS.
[root@quickstart spark]# hadoop fs -cat /user/cloudera/spark/employeejson/*
{"Name":"Vinayak","age":35}
{"Name":"Nilesh","age":37}
{"Name":"Raju","age":30}
{"Name":"Karthik","age":28}
{"Name":"Shreshta","age":1}
{"Name":"Siddhish","age":2}
Add the jar below (execute this only if you get an error):
hive> ADD JAR /usr/lib/hive-hcatalog/lib/hive-hcatalog-core.jar;
hive>
CREATE TABLE employeefromjson(name string, age int)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS TEXTFILE
LOCATION '/user/cloudera/hive/employeefromjson'
;
hive> LOAD DATA INPATH '/user/cloudera/spark/employeejson' OVERWRITE INTO TABLE employeefromjson;
hive> select * from employeefromjson;
OK
Vinayak 35
Nilesh 37
Raju 30
Karthik 28
Shreshta 1
Siddhish 2
Time taken: 0.174 seconds, Fetched: 6 row(s)
For this SerDe, a JSON record should always start with '{' and not with '['. That is the problem. JSON has a structure of {'key':'value'}, and what you have given in your file is a value which does not have any key. So, change your JSON to the below format:
{"my_array":[{"col1":"col1","col2":1},{"col1":"col11","col2":11},{"col1":"col111","col2":2}]}
Your Create table statement should work fine.
If you want to get the data for each column for all the rows, use the below query.
select my_array.col1, my_array.col2 from my_table;
The above command will give you the below result.
OK
["col1","col11","col111"] [1,11,2]
If you want to get the result column-wise for each row separately, use the below query.
select a.* from my_table m lateral view outer inline (m.my_array) a;
The above command will give you the below result.
OK
col1 1
col11 11
col111 2
Hope this helps!

Loading JSON file in HIVE table

I have a JSON file like the one below, which I want to load into a Hive table in parsed form. What options can I go for?
If it were Avro, I could have used AvroSerDe directly. But the source file in this case is JSON.
{
"subscriberId":"vfd1234-07e1-4054-9b64-83a5a20744db",
"cartId":"1234edswe-6a9c-493c-bcd0-7fb71995beef",
"cartStatus":"default",
"salesChannel":"XYZ",
"accountId":"12345",
"channelNumber":"12",
"timestamp":"Dec 12, 2013 8:30:00 AM",
"promotions":[
{
"promotionId":"NEWID1234",
"promotionContent":{
"has_termsandconditions":[
"TC_NFLMAXDEFAULT16R103578"
],
"sequenceNumber":"305",
"quantity":"1",
"promotionLevel":"basic",
"promotionDuration":"1",
"endDate":"1283142400000",
"description":"Regular Season One Payment",
"active":"true",
"disableInOfferPanel":"true",
"displayInCart":"true",
"type":"promotion",
"frequencyOfCharge":"weekly",
"promotionId":"NEWID1234",
"promotionIndicator":"No",
"shoppingCartTitle":"Regular Season One Payment",
"discountedPrice":"0",
"preselectedInOfferPanel":"false",
"price":"9.99",
"name":"Regular Season One Payment",
"have":[
"CatNFLSundayMax"
],
"ID":"NEWID1234",
"startDate":"1451365600000",
"displayInOfferPanel":"true"
}
}
]
}
I tried to create a table using org.openx.data.jsonserde.JsonSerDe, but it is not showing me the data.
CREATE EXTERNAL TABLE test1
(
SUBSCRIBER_ID string,
CART_ID string,
CART_STAT_NAME string,
SLS_CHAN_NAME string,
ACCOUNT_ID string,
CHAN_NBR string,
TX_TMSTMP string,
PROMOTION ARRAY<STRING>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION '<HDFS location where the json file is place in single line>';
Not sure about the JsonSerDe you are using, but here is one you can use: Hive-JSON-Serde.
hive> add jar /User/User1/json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar;
Added [/User/User1/json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar] to class path
Added resources: [/User/User1/json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar]
hive> use default;
OK
Time taken: 0.021 seconds
hive> CREATE EXTERNAL TABLE IF NOT EXISTS json_poc (
> alertHistoryId bigint, entityId bigint, deviceId string, alertTypeId int, AlertStartDate string
> )
> ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
> LOCATION '/User/User1/sandeep_poc/hive_json';
OK
Time taken: 0.077 seconds
hive> select * from json_poc;
OK
123456 123 123 1 jan 04, 2017 2:46:48 PM
Time taken: 0.052 seconds, Fetched: 1 row(s)
How to build the jar:
Maven should be installed on your PC; then run a command like this:
C:\Users\User1\Downloads\Hive-JSON-Serde-develop\Hive-JSON-Serde-develop>mvn -Phdp23 clean package
-Phdp23 means HDP 2.3; replace it to match your Hadoop distribution version.
Or, if you want to use the built-in JSON functions instead of a SerDe, look at get_json_object and json_tuple.
If you are looking for an example of how to use them, see this blog: Hive-JSON-Serde example.
I would also recommend validating your JSON file: JSON Validator.
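For example, with the raw JSON loaded into a single string column (the raw_json table and json_col column names here are hypothetical), the built-ins can be used like this:
-- get_json_object pulls one key per call, using a JSONPath-style expression
SELECT get_json_object(json_col, '$.subscriberId') AS subscriberId,
       get_json_object(json_col, '$.cartId') AS cartId
FROM raw_json;
-- json_tuple extracts several top-level keys in a single pass
SELECT t.subscriberId, t.cartId
FROM raw_json j
LATERAL VIEW json_tuple(j.json_col, 'subscriberId', 'cartId') t AS subscriberId, cartId;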
If you read the official documentation: when you are using Hive 0.12 and later, use hive-hcatalog-core.
Note: For Hive releases prior to 0.12, Amazon provides a JSON SerDe available at s3://elasticmapreduce/samples/hive-ads/libs/jsonserde.jar.
You should first add the hive-hcatalog-core jar:
ADD JAR /path/to/jar/;
You can either download it from the Maven repository or find it manually.
Then the Hive table should look like this:
CREATE EXTERNAL TABLE test1
(
SUBSCRIBER_ID string,
CART_ID string,
CART_STAT_NAME string,
SLS_CHAN_NAME string,
ACCOUNT_ID string,
CHAN_NBR string,
TX_TMSTMP string,
PROMOTION ARRAY<STRING>
)
ROW FORMAT SERDE
'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION '<HDFS location where the json file is place in single line>';
Steps to load JSON file data into a Hive table:
1] Create a table in Hive:
hive> create table JsonTableExample(data string);
2] Load the JSON file into the Hive table:
hive> load data inpath '/home/cloudera/testjson.json' into table JsonTableExample;
3] A plain select * from JsonTableExample; returns the raw JSON strings. That is not an effective solution, so we follow step 4.
4] Select data using the get_json_object() function:
hive> select get_json_object(data,'$.id') as id,
get_json_object(data,'$.name') as name from JsonTableExample;
For many versions of Hive, perhaps the best way to enable JSON processing is using org.apache.hive.hcatalog.data.JsonSerDe as previously mentioned. This is the out-of-the-box capability. However, for some versions of CDH6 and HDP3, there is a new feature where JSON is a first-class citizen. This exists in Apache Hive 4.0 and higher.
CREATE TABLE ... STORED AS JSONFILE;
Please note that each JSON object must be on its own line (without line breaks):
{"name":"john","age":30}
{"name":"sue","age":32}
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL

Move data from MySQL to Oracle where the data contains HTML, using SQL*Loader

Please help me move data from MySQL to Oracle where the data contains HTML, using SQL*Loader.
I have exported the MySQL data to a CSV file.
Sample CSV data:
±14044±©±1±©±1±©±1±©±MailManager Attachment±©±image001.gif±©±6416-01-11 11:30:06±©±6416-01-11 11:30:06±©±null±©±null±©±0±©±1±©±0±©±null±
±14045±©±1±©±1±©±1±©±MailManager Attachment±©±image002.jpg±©±6416-01-11 11:30:06±©±6416-01-11 11:30:06±©±null±©±null±©±0±©±1±©±0±©±null±
±14046±©±1±©±1±©±1±©±Emails±©±"
<p>"
 </p>"
<p style=""margin:0;padding:0;"">"
On 02-20-2014 13:26:49, crmtelesales@fecredit.com.vn, wrote:</p>"
<blockquote style=""border:0;margin:0;border-left:1px solid #808080;padding:0 0 0 2px;"">"
<div style=""font-size:13px;font-family:tahoma;color:rgb(0,0,0);font-weight:normal;font-style:normal;background-image:none;background-attachment:scroll;background-position:0% 0%;"">"
do not reply</div>"
<br />"
 </blockquote>"
<br />±©±2014-03-03 10:11:39±©±2014-03-03 10:11:39±©±null±©±null±©±0±©±1±©±0±©±Re: tests±
My control file
LOAD DATA
INFILE '/home/ggt/csv/vtiger_crmentity.csv'
TRUNCATE INTO TABLE DWVTIGER.VTIGER_CRMENTITY
FIELDS TERMINATED BY "," ENCLOSED BY '|'
TRAILING NULLCOLS
(
CRMID ,
SMCREATORID ,
SMOWNERID ,
MODIFIEDBY ,
SETYPE ,
DESCRIPTION NULLIF DESCRIPTION='null',
CREATEDTIME date "yyyy-mm-dd hh24:mi:ss" ,
MODIFIEDTIME date "yyyy-mm-dd hh24:mi:ss" ,
VIEWEDTIME date "yyyy-mm-dd hh24:mi:ss" NULLIF VIEWEDTIME='null',
STATUS NULLIF STATUS='null',
VERSION ,
PRESENCE NULLIF PRESENCE='null',
DELETED ,
LABEL NULLIF LABEL='null'
)
SQL*Loader cannot load data that is in HTML format. You will need to get the data out of the markup and make a true CSV before SQL*Loader can load it: it is made to load flat files, where each row is a record and all records are alike (basically).
You need to read up on this: https://docs.oracle.com/cd/B28359_01/server.111/b28319/ldr_concepts.htm#g1013706
and this: https://docs.oracle.com/cd/B28359_01/server.111/b28319/ldr_control_file.htm#i1006645