Parsing a JSON file in Hive

I am trying to load a JSON file in Hive. Below is a sample JSON file.
{"Result":[
{"Col1":"Key1","Col2":"abc#gmail.com","Col3":"7"},
{"Col1":"Key2","Col2":"abc#gmail.com","Col3":"7"},
{"Col1":"Key3","Col2":"abc#gmail.com","Col3":"7"},
{"Col1":"Key4","Col2":"abc#gmail.com","Col3":"7"}
]}
I have tried the below create statement in Hive.
create table if not exists sample_json (A Array<struct<"Col1":String,"Col2":string,"Col3":string>>) ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' LOCATION '/a/b/c'
I am not able to retrieve each column's data from the table. I have tried to explode the array, but it returns only the 1st record. Can anyone please suggest what is wrong with it?

Create
CREATE EXTERNAL TABLE testJson (
Result ARRAY <struct<Col1:String ,
Col2 : string ,
Col3 : string > >)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3n://temp_db.db/testjsonstring';
Query
SELECT
t.col1
,t.col2
,t.col3
FROM
testJson LATERAL VIEW explode (result) r AS t LIMIT 100
Result
Col1 Col2 Col3
Key1 abc#gmail.com 7
Key2 abc#gmail.com 7
Key3 abc#gmail.com 7
Key4 abc#gmail.com 7
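The corrected DDL drops the quotes around the struct field names, and `LATERAL VIEW explode(result)` then emits one row per array element. The flattening it performs can be sketched roughly in Python, using the sample data from the question:

```python
import json

# Sample file from the question: one object whose "Result" key
# holds an array of structs.
raw = '''{"Result":[
{"Col1":"Key1","Col2":"abc#gmail.com","Col3":"7"},
{"Col1":"Key2","Col2":"abc#gmail.com","Col3":"7"},
{"Col1":"Key3","Col2":"abc#gmail.com","Col3":"7"},
{"Col1":"Key4","Col2":"abc#gmail.com","Col3":"7"}
]}'''

doc = json.loads(raw)
# LATERAL VIEW explode(result) emits one output row per array element;
# t.col1, t.col2, t.col3 then address the fields of each struct.
rows = [(e["Col1"], e["Col2"], e["Col3"]) for e in doc["Result"]]
```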

Related

How to convert table data into a JSON array in MySQL

I have sample Data
ID VAL LINK
12 5335.1 2
12 5336.1 2
12 5337.1 2
Initially I tried using GROUP_CONCAT and applied JSON_OBJECT on top of it:
Select JSON_OBJECT('VAL', VAL, 'LINK', LINK) AS COL
from (
Select GROUP_CONCAT("", VAL, "") VAL, LINK from Table GROUP BY VAL, LINK ) T
I'm getting output like this:
[{"VAL": "5335.1,5336.1,5337.1", "LINK": 1}]
How can I convert this into a JSON array?
Required output
[{
"VAL":["5335.1","5336.1","5337.1"],
"LINK":1
}]
SELECT JSON_OBJECT('VAL', JSON_ARRAYAGG(VAL), 'LINK', LINK) output
FROM source_table
GROUP BY LINK
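What JSON_ARRAYAGG does per group can be illustrated outside MySQL. Here is a rough Python sketch over the sample rows (the LINK value 2 comes from the question's sample data):

```python
import json
from collections import defaultdict

# Sample rows from the question: (ID, VAL, LINK); all share LINK = 2.
rows = [(12, "5335.1", 2), (12, "5336.1", 2), (12, "5337.1", 2)]

# GROUP BY LINK, aggregating every VAL in the group into a JSON array,
# which is what JSON_ARRAYAGG(VAL) does inside JSON_OBJECT(...).
groups = defaultdict(list)
for _id, val, link in rows:
    groups[link].append(val)

output = [json.dumps({"VAL": vals, "LINK": link})
          for link, vals in groups.items()]
```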

How to search JSON data in MySQL by key and value?

I am inserting my data into a database json_encoded. Now I want to search in "feature", but the results are not accurate.
MySQL query:
select *
from `modul_69`
where `extmod` like '%"68":%'
and `extmod` like '%"4"%'
and `extmod` not like '%"4":%'
Results:
row1 data:
{"68":["1","4","7"],"67":["14"],"75":["28"]} - true
row2 data:
{"68":["59"],"67":["1","11","13"],"75":["3","4","5","27"]} - false
I want to select only row1, by key 68 and value 4.
Please help.
Here is one way to do it using MySQL JSON functions, available since version 5.7:
select *
from t
where json_search(js -> '$."68"', 'one', '4') is not null
What this does is get the array that corresponds to the outer key '68' (using ->, which is syntactic sugar for json_extract()), and then search its content with json_search(); if a non-null value is returned, we have a match.
To find if the value '"4"' is contained in the member '"68"', you can first extract the array using JSON_EXTRACT():
SELECT JSON_EXTRACT(m.extmod, '$."68"')
FROM modul_69 m;
This outputs
["1", "4", "7"]
["59"]
To search a JSON array for a specific value, you can use JSON_CONTAINS():
SELECT JSON_CONTAINS('["1", "4", "7"]', '"4"', '$'); -- output is 1
SELECT JSON_CONTAINS('["59"]', '"4"', '$'); -- output is 0
Now you can combine both functions to get the rows that contain the expected value:
Schema (MySQL v5.7)
CREATE TABLE modul_69
(
extmod JSON
);
INSERT INTO modul_69 VALUES ('{"68":["1","4","7"],"67":["14"],"75":["28"]}'), ('{"68":["59"],"67":["1","11","13"],"75":["3","4","5","27"]}');
Query #1
SELECT *
FROM modul_69 m
WHERE JSON_CONTAINS(JSON_EXTRACT(m.extmod, '$."68"'),
'"4"',
'$') = 1;
Output
| extmod |
| --------------------------------------------------- |
| {"67": ["14"], "68": ["1", "4", "7"], "75": ["28"]} |
View on DB Fiddle
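To sanity-check the logic of the JSON_EXTRACT + JSON_CONTAINS combination, it can be emulated outside MySQL. A minimal Python sketch over the two sample rows:

```python
import json

# The two rows from the question, as stored JSON text.
rows = [
    '{"68":["1","4","7"],"67":["14"],"75":["28"]}',
    '{"68":["59"],"67":["1","11","13"],"75":["3","4","5","27"]}',
]

def contains(extmod: str, key: str, value: str) -> bool:
    # JSON_EXTRACT(extmod, '$."68"') -> the array under the outer key;
    # JSON_CONTAINS(array, '"4"')    -> membership test on that array.
    arr = json.loads(extmod).get(key, [])
    return value in arr

matches = [r for r in rows if contains(r, "68", "4")]
```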

Map two columns into one on Athena using SerDe properties

I'm trying to map two columns into one on Athena using JsonSerDe properties.
In this case, I want to map both columns conversionsRate and cr from jsons 1 and 2 to column cr_new (doing like a coalesce).
json1
{
"deviceType": "TABLET",
"day": "2018-10-27",
"conversionsRate": 0,
"clicksCount": 3
}
json2
{
"deviceType": "TABLET",
"day": "2018-10-29",
"cr": 2,
"clicksCount": 5
}
The expected result on Athena:
|deviceType|day |cr_new|clicksCount|
|TABLET |2018-10-27|0 |3 |
|TABLET |2018-10-29|2 |5 |
Is it possible to achieve such a result with Athena table mapping using a SerDe?
The JSON SerDe does not require that a column defined in the table DDL exist in the JSON record. If there is no such attribute, the JsonSerDe will return NULL. So, you can define both columns and apply coalesce in the query:
CREATE EXTERNAL TABLE json_table (
devicetype string,
`day` date,
cr int,
conversionsrate int,
clickscount int
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://bucket/path/'; -- JSON files location
-- make sure each JSON record is on its own line
select deviceType, `day`, coalesce(conversionsRate ,cr) as cr_new, clicksCount
from json_table ;
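The behavior the answer relies on, missing JSON attributes reading as NULL so coalesce can pick whichever is present, can be sketched in Python (file layout and values taken from the question):

```python
import json

# The two records from the question, one JSON object per line,
# as Athena expects them in the S3 files.
lines = [
    '{"deviceType":"TABLET","day":"2018-10-27","conversionsRate":0,"clicksCount":3}',
    '{"deviceType":"TABLET","day":"2018-10-29","cr":2,"clicksCount":5}',
]

result = []
for line in lines:
    rec = json.loads(line)
    # Like the SerDe, a missing attribute reads as NULL (None here);
    # coalesce(conversionsRate, cr) takes the first non-null value,
    # so the legitimate 0 in the first record is kept.
    rate = rec.get("conversionsRate")
    cr_new = rate if rate is not None else rec.get("cr")
    result.append((rec["deviceType"], rec["day"], cr_new, rec["clicksCount"]))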

insert and fetch strings and matrices to/from MySQL with Matlab

I need to store data in a database. I have installed and configured a MySQL database (and an SQLite database) in Matlab. However I cannot store and retrieve anything other than scalar numeric values.
% create an empty database called test_data base with MySQL workbench.
% connect to it in Matlab
conn=database('test_database','root','XXXXXX','Vendor','MySQL');
% create a table to store values
create_test_table=['CREATE TABLE test_table (testID NUMERIC PRIMARY KEY, test_string VARCHAR(255), test_vector BLOB, test_scalar NUMERIC)'];
curs=exec(conn,create_test_table)
Result is good so far (curs.Message is an empty string)
% create a new record
datainsert(conn,'test_table',{'testID','test_string','test_vector','test_scalar'},{1,'string1',[1,2],1})
% try to read out the new record
sqlquery='SELECT * FROM test_table';
data_to_view=fetch(conn,sqlquery)
Result is bad:
data_to_view =
1 NaN NaN 1
From the documentation for "fetch" I would expect:
data_to_view =
1×4 table
testID test_string test_vector test_scalar
_____________ ___________ ______________ ________
1 'string1' 1x2 double 1
Until I learn how to read blobs I'd even be willing to accept:
data_to_view =
1×4 table
testID test_string test_vector test_scalar
_____________ ___________ ______________ ________
1 'string1' NaN 1
I get the same thing with an SQLite database. How can I store and then read out strings and blobs, and why isn't the data returned in table format?
Matlab does not document that the default options for SQLite and MySQL database retrieval are to attempt to return everything as a numeric array. One only needs this line:
setdbprefs('DataReturnFormat','cellarray')
or
setdbprefs('DataReturnFormat','table')
in order to get results with differing datatypes. However! now my result is:
data_to_view =
1×4 cell array
{[2]} {'string1'} {11×1 int8} {[1]}
If instead I input:
datainsert(conn,'test_table',{'testID','test_string','test_vector','test_scalar'},{1,'string1',typecast([1,2],'int8'),1})
Then I get:
data_to_view =
1×4 cell array
{[2]} {'string1'} {16×1 int8} {[1]}
which I can convert like so:
typecast(data_to_view{3},'double')
ans =
1 2
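The byte-level round trip that the two typecast calls perform, reinterpreting doubles as raw bytes and back, can be checked in Python with the struct module (assuming little-endian IEEE 754 doubles, which is what Matlab uses on common platforms):

```python
import struct

# typecast([1,2],'int8') reinterprets two doubles as their 16 raw bytes,
# which is what gets stored in the BLOB column.
blob = struct.pack("<2d", 1.0, 2.0)

# typecast(blob,'double') reverses the reinterpretation losslessly,
# recovering the original vector.
restored = struct.unpack("<2d", blob)
```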
Unfortunately this does not work for SQLite. I get:
data_to_view =
1×4 cell array
{[2]} {'string1'} {' �? #'} {[1]}
and I can't convert the third part correctly:
typecast(unicode2native(data_to_view{1,3}),'double')
ans =
0.0001 2.0000
So I still need to learn how to read an SQLite blob in Matlab but that is a different question.

How to search multiple items in JSON array in Postgres 9.3

I have a scenario where I need to search for multiple values in a JSON array. Below is my schema.
ID DATA
1 {"bookIds" : [1,2,3,5], "storeIds": [2,3]}
2 {"bookIds" : [1,2], "storeIds": [1,3]}
3 {"bookIds" : [11,12,10,9], "storeIds": [4,3]}
I want all the rows with the values 1, 2. Below is the query I am using (this query was written by fellow Stack Overflow user klin; credit to him).
select t.*
from JSONTest t, json_array_elements(data->'bookIds') books
where books::text::int in (1, 2);
However, I am getting duplicate rows, as shown below.
id data
1 {"bookIds" : [1,2,3,5], "storeIds": [2,3]}
1 {"bookIds" : [1,2,3,5], "storeIds": [2,3]}
2 {"bookIds" : [1,2], "storeIds": [1,3]}
2 {"bookIds" : [1,2], "storeIds": [1,3]}
I want only two rows in the output, that is, ids 1 and 2. How can I do that? I don't want to use DISTINCT due to other constraints.
SQL Fiddle : http://sqlfiddle.com/#!15/6457a/2
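The duplication itself is easy to reproduce outside Postgres: json_array_elements cross-joins each row with its array elements, so a row whose bookIds contain both 1 and 2 survives the filter twice. A rough Python sketch of the same join:

```python
import json

# The three rows from the question: (id, data).
rows = [
    (1, '{"bookIds":[1,2,3,5],"storeIds":[2,3]}'),
    (2, '{"bookIds":[1,2],"storeIds":[1,3]}'),
    (3, '{"bookIds":[11,12,10,9],"storeIds":[4,3]}'),
]

# json_array_elements(data->'bookIds') yields one joined row per array
# element, so the WHERE ... IN (1, 2) filter keeps one row per MATCH,
# not one row per source row.
output = [(rid, data)
          for rid, data in rows
          for book in json.loads(data)["bookIds"]
          if book in (1, 2)]
```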
Unfortunately there is no direct conversion function from a JSON array to a "real" Postgres array. (data ->'bookIds')::text returns something that is nearly a Postgres array literal: e.g. [1,2,3,5]. If you replace the [] with {}, the value can be cast to an integer array. Once we have a proper integer array we can use the @> operator to test whether it contains another array:
select *
from jsontest
where translate((data ->'bookIds')::text, '[]', '{}')::int[] @> array[1,2];
translate((data ->'bookIds')::text, '[]', '{}') converts [1,2,3,5] to {1,2,3,5}, which is then cast to an integer array with ::int[].
SQLFiddle: http://sqlfiddle.com/#!15/6457a/4
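The string surgery in the answer can be mirrored in Python to see each step: translate the brackets, split into integers, then a set-containment test standing in for the @> operator:

```python
# Emulating:
#   translate((data->'bookIds')::text, '[]', '{}')::int[] @> array[1,2]
raw = "[1,2,3,5]"                                      # (data->'bookIds')::text

# translate(..., '[]', '{}') -> a Postgres array literal.
pg_literal = raw.translate(str.maketrans("[]", "{}"))  # "{1,2,3,5}"

# ::int[] -> an integer collection; a set is enough for containment.
ids = {int(x) for x in pg_literal.strip("{}").split(",")}

# @> : does the left array contain every element of the right one?
match = ids >= {1, 2}
```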