I'm a newbie to SQL Server. I have a table Accounts which is defined as:
OrganizationId int,
AccountDetails varchar(max)
The AccountDetails column contains XML data.
The data in the table looks like this:
1 | <Account><Id>100</Id><Name>A</Name></Account>
2 | <Account><Id>200</Id><Name>B</Name></Account>
3 | <Account><Id>300</Id><Name>C</Name></Account>
4 | <Account><Id>400</Id><Name>D</Name></Account>
I need to write a SQL query to get the records from this table where the account Id is 200 or 400.
The query should return two rows (#2 and #4) in JSON format, like this:
result1 : { "account_id": 200, "account_name": "B" }
result2 : { "account_id": 400, "account_name": "D" }
How do I go about this?
Thank you.
For #1 above, should I cast the AccountDetails column to the XML type and then use the nodes() method for querying and filtering?
For #2, should I write a SQL function to convert the XML to JSON first, and query the XML to build the JSON as needed?
As already mentioned, it is much better to use a proper XML data type for the AccountDetails column.
Please try the following solution.
It works on SQL Server 2016 and later.
SQL
-- DDL and sample data population, start
DECLARE @tbl TABLE (OrganizationId INT IDENTITY PRIMARY KEY, AccountDetails NVARCHAR(MAX));
INSERT @tbl (AccountDetails) VALUES
('<Account><Id>100</Id><Name>A</Name></Account>'),
('<Account><Id>200</Id><Name>B</Name></Account>'),
('<Account><Id>300</Id><Name>C</Name></Account>'),
('<Account><Id>400</Id><Name>D</Name></Account>');
-- DDL and sample data population, end

-- Cast the text to XML once per row, then shred the values out of it
;WITH rs AS
(
    SELECT t.OrganizationId
        , account_id = x.value('(/Account/Id/text())[1]', 'INT')
        , account_name = x.value('(/Account/Name/text())[1]', 'VARCHAR(20)')
    FROM @tbl AS t
        CROSS APPLY (VALUES(TRY_CAST(AccountDetails AS XML))) AS t1(x)
)
-- Build one JSON object per matching row
SELECT *
    , JSONData = (SELECT rs.account_id, rs.account_name FOR JSON PATH, WITHOUT_ARRAY_WRAPPER)
FROM rs
WHERE rs.account_id IN (200, 400);
Output
OrganizationId | account_id | account_name | JSONData
2 | 200 | B | {"account_id":200,"account_name":"B"}
4 | 400 | D | {"account_id":400,"account_name":"D"}
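If the column is switched to the XML data type as recommended above, the cast is no longer needed and the filter can be expressed in XQuery with the exist() method. The following is only a sketch under that assumption: the table variable and its two sample rows are illustrative, and the comparison is done on the text of the Id element, so it expects the values to be stored without surrounding whitespace.

```sql
-- Assumption: AccountDetails is declared as XML instead of NVARCHAR(MAX).
DECLARE @tbl TABLE (OrganizationId INT IDENTITY PRIMARY KEY, AccountDetails XML);
INSERT @tbl (AccountDetails) VALUES
(N'<Account><Id>100</Id><Name>A</Name></Account>'),
(N'<Account><Id>200</Id><Name>B</Name></Account>');

SELECT t.OrganizationId
    , JSONData = (SELECT account_id   = t.AccountDetails.value('(/Account/Id/text())[1]', 'INT')
                       , account_name = t.AccountDetails.value('(/Account/Name/text())[1]', 'VARCHAR(20)')
                  FOR JSON PATH, WITHOUT_ARRAY_WRAPPER)
FROM @tbl AS t
-- exist() returns 1 when the XQuery expression finds at least one node
WHERE t.AccountDetails.exist('/Account/Id[text() = "200" or text() = "400"]') = 1;
```

Filtering with exist() keeps the predicate inside the XML engine, which may also let an XML index be used if one is defined on the column.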
I have a Hive table that loads JSON data. Two of the values in my JSON are strings. If I declare those columns as bigint, a select on the table fails with the error below:
Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: org.codehaus.jackson.JsonParseException: Current token (VALUE_STRING) not numeric, can not use numeric value accessors
at [Source: java.io.ByteArrayInputStream@3b6c740b; line: 1, column: 21]
If I change them to string, it works.
Now, because these columns are strings, I cannot use the from_unixtime function on them.
If I try to alter the data types of these columns from string to bigint, I get the error below:
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. The following columns have types incompatible with the existing columns in their respective positions : uploadtimestamp
Below is my create table statement:
create table ABC
(
uploadTimeStamp bigint
,PDID string
,data array
<
struct
<
Data:struct
<
unit:string
,value:string
,heading:string
,loc:string
,loc1:string
,loc2:string
,loc3:string
,speed:string
,xvalue:string
,yvalue:string
,zvalue:string
>
,Event:string
,PDID:string
,`Timestamp`:string
,Timezone:string
,Version:string
,pii:struct<dummy:string>
>
>
)
row format serde 'org.apache.hive.hcatalog.data.JsonSerDe'
stored as textfile;
My JSON:
{"uploadTimeStamp":"1488793268598","PDID":"123","data":[{"Data":{"unit":"rpm","value":"100"},"EventID":"E1","PDID":"123","Timestamp":1488793268598,"Timezone":330,"Version":"1.0","pii":{}},{"Data":{"heading":"N","loc":"false","loc1":"16.032425","loc2":"80.770587","loc3":"false","speed":"10"},"EventID":"Location","PDID":"skga06031430gedvcl1pdid2367","Timestamp":1488793268598,"Timezone":330,"Version":"1.1","pii":{}},{"Data":{"xvalue":"1.1","yvalue":"1.2","zvalue":"2.2"},"EventID":"AccelerometerInfo","PDID":"skga06031430gedvcl1pdid2367","Timestamp":1488793268598,"Timezone":330,"Version":"1.0","pii":{}},{"EventID":"FuelLevel","Data":{"value":"50","unit":"percentage"},"Version":"1.0","Timestamp":1488793268598,"PDID":"skga06031430gedvcl1pdid2367","Timezone":330},{"Data":{"unit":"kmph","value":"70"},"EventID":"VehicleSpeed","PDID":"skga06031430gedvcl1pdid2367","Timestamp":1488793268598,"Timezone":330,"Version":"1.0","pii":{}}]}
Is there any way to convert this string Unix timestamp to a standard time, or to work with bigint for these columns?
If you are talking about Timestamp and Timezone, then you can define them as int/bigint types.
If you look at their definitions, you'll see that there are no quotes (") around the values, so they are numeric types within the JSON doc:
"Timestamp":1488793268598,"Timezone":330
create external table myjson
(
uploadTimeStamp string
,PDID string
,data array
<
struct
<
Data:struct
<
unit:string
,value:string
,heading:string
,loc3:string
,loc:string
,loc1:string
,loc4:string
,speed:string
,x:string
,y:string
,z:string
>
,EventID:string
,PDID:string
,`Timestamp`:bigint
,Timezone:smallint
,Version:string
,pii:struct<dummy:string>
>
>
)
row format serde 'org.apache.hive.hcatalog.data.JsonSerDe'
stored as textfile
location '/tmp/myjson'
;
+------------------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| myjson.uploadtimestamp | myjson.pdid | myjson.data |
+------------------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 1486631318873 | 123 | [{"data":{"unit":"rpm","value":"0","heading":null,"loc3":null,"loc":null,"loc1":null,"loc4":null,"speed":null,"x":null,"y":null,"z":null},"eventid":"E1","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.0","pii":{"dummy":null}},{"data":{"unit":null,"value":null,"heading":"N","loc3":"false","loc":"14.022425","loc1":"78.760587","loc4":"false","speed":"10","x":null,"y":null,"z":null},"eventid":"E2","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.1","pii":{"dummy":null}},{"data":{"unit":null,"value":null,"heading":null,"loc3":null,"loc":null,"loc1":null,"loc4":null,"speed":null,"x":"1.1","y":"1.2","z":"2.2"},"eventid":"E3","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.0","pii":{"dummy":null}},{"data":{"unit":"percentage","value":"50","heading":null,"loc3":null,"loc":null,"loc1":null,"loc4":null,"speed":null,"x":null,"y":null,"z":null},"eventid":"E4","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.0","pii":null},{"data":{"unit":"kmph","value":"70","heading":null,"loc3":null,"loc":null,"loc1":null,"loc4":null,"speed":null,"x":null,"y":null,"z":null},"eventid":"E5","pdid":"123","timestamp":1486631318873,"timezone":330,"version":"1.0","pii":{"dummy":null}}] |
+------------------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Even if you have defined Timestamp as a string you can still cast it to a bigint before using it in a function that requires a bigint.
cast (`Timestamp` as bigint)
hive> with t as (select '0' as `timestamp`) select from_unixtime(`timestamp`) from t;
FAILED: SemanticException [Error 10014]: Line 1:45 Wrong arguments
'timestamp': No matching method for class
org.apache.hadoop.hive.ql.udf.UDFFromUnixTime with (string). Possible
choices: FUNC(bigint) FUNC(bigint, string) FUNC(int)
FUNC(int, string)
hive> with t as (select '0' as `timestamp`) select from_unixtime(cast(`timestamp` as bigint)) from t;
OK
1970-01-01 00:00:00
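One more point worth checking: from_unixtime expects seconds, while the 13-digit values in the sample JSON (for example 1488793268598) look like epoch milliseconds. A minimal sketch, assuming the myjson table defined above with uploadTimeStamp as string and the nested Timestamp as bigint; the aliases upload_time, event_time, d and e are just illustrative names:

```sql
-- Top-level column: string -> bigint, then milliseconds -> seconds
select from_unixtime(cast(cast(uploadTimeStamp as bigint) / 1000 as bigint)) as upload_time
from   myjson;

-- Nested column: explode the array so each struct becomes its own row
select e.eventid
     , from_unixtime(cast(e.`timestamp` / 1000 as bigint)) as event_time
from   myjson
       lateral view explode(data) d as e;
```

The division yields a double in Hive, hence the outer cast back to bigint before calling from_unixtime.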
My column family structure is:
create table mykeyspc."test" (
id int PRIMARY KEY,
val set<frozen<map<text,text>>>
);
When I insert data through the CQL shell:
insert into "test" JSON '{"id":1,"val":{"ab","bc"}}';
Error: INVALIDREQUEST: code=2200 [Invalid query] message="Counld not decode JSon string as
map:org.codehaus.jackson.jsonParseException: Unexpected character{'{'{ code 123})
or
insert into "test" (id,val) values (1,{{'ab','bc'},{'sdf','name'}});
Error: INVALIDREQUEST: code=2200 [Invalid query] message="INVALID SET LITERAL FOR
VAL:value{'a','b'} is not of type frozen<map<text,text>>"
In your second example, try separating the map keys and values with colons (:) instead of commas.
aploetz@cqlsh:stackoverflow> INSERT INTO mapOfSet (id,val)
    VALUES (1,{{'ab':'bc'},{'sdf':'name'}});
aploetz@cqlsh:stackoverflow> SELECT * FROM mapofset WHERE id=1;
id | val
----+---------------------------------
1 | {{'ab': 'bc'}, {'sdf': 'name'}}
(1 rows)
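The first example (INSERT ... JSON) fails for a related reason: the payload has to be valid JSON, and {"ab","bc"} is not. As far as I know, CQL's JSON support expects a set to be written as a JSON array and a map as a JSON object, so a sketch for the table from the question, under that assumption, would be:

```sql
-- Assumption: set<...> is encoded as a JSON array, map<text,text> as a JSON object
INSERT INTO mykeyspc."test" JSON '{"id": 1, "val": [{"ab": "bc"}, {"sdf": "name"}]}';

SELECT * FROM mykeyspc."test" WHERE id = 1;
```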