Convert JSON to separate columns in Hive

I have 4 columns in a Hive table. The first two columns are of type string; the 3rd and 4th contain JSON. How can I extract the JSON data into separate columns?
The SerDe available in Hive seems to handle only rows that are entirely JSON, but I have both plain (STRING) and JSON data in the same row. How can I extract the data into separate columns here?
Example:
abc 2341 {max:2500e0,value:"20",Type:"1",ProviderType:"ABC"} {Name:"ABC",minA:1200e0,StartDate:1483900200000,EndDate:1483986600000,Flags:["flag4","flag3","flag2","flag1"]}
xyz 6789 {max:1300e0,value:"10",Type:"0",ProviderType:"foo"} {Name:"foo",minA:3.14159e0,StartDate:1225864800000,EndDate:1225864800000,Flags:["foo","foo"]}

Given a fixed JSON structure, you can store the JSON columns as strings and extract the fields with json_tuple:
create table mytable (str string,i int,jsn1 string, jsn2 string);
insert into mytable values
('abc',2341,'{"max":2500e0,"value":"20","Type":"1","ProviderType":"ABC"}','{"Name":"ABC","minA":1200e0,"StartDate":1483900200000,"EndDate":1483986600000,"Flags":["flag4","flag3","flag2","flag1"]}')
,('xyz',6789,'{"max":1300e0,"value":"10","Type":"0","ProviderType":"foo"}','{"Name":"foo","minA":3.14159e0,"StartDate":1225864800000,"EndDate":1225864800000,"Flags":["foo","foo"]}')
;
select str,i
,jsn1_max,jsn1_value,jsn1_type,jsn1_ProviderType
,jsn2_Name,jsn2_minA,jsn2_StartDate,jsn2_EndDate
,jsn2_Flags
from mytable
lateral view json_tuple (jsn1,'max','value','Type','ProviderType')
j1 as jsn1_max,jsn1_value,jsn1_type,jsn1_ProviderType
lateral view json_tuple (jsn2,'Name','minA','StartDate','EndDate','Flags')
j2 as jsn2_Name,jsn2_minA,jsn2_StartDate,jsn2_EndDate,jsn2_Flags
;
+-----+------+----------+------------+-----------+-------------------+-----------+-----------+----------------+---------------+-----------------------------------+
| str | i | jsn1_max | jsn1_value | jsn1_type | jsn1_providertype | jsn2_name | jsn2_mina | jsn2_startdate | jsn2_enddate | jsn2_flags |
+-----+------+----------+------------+-----------+-------------------+-----------+-----------+----------------+---------------+-----------------------------------+
| abc | 2341 | 2500.0 | 20 | 1 | ABC | ABC | 1200.0 | 1483900200000 | 1483986600000 | ["flag4","flag3","flag2","flag1"] |
| xyz | 6789 | 1300.0 | 10 | 0 | foo | foo | 3.14159 | 1225864800000 | 1225864800000 | ["foo","foo"] |
+-----+------+----------+------------+-----------+-------------------+-----------+-----------+----------------+---------------+-----------------------------------+
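json_tuple returns every extracted field as a string, so typed columns need an explicit cast. As a rough sketch against the same mytable, the query below uses get_json_object instead, which also accepts a path into the Flags array. The output aliases first_flag and start_ts are made up for illustration, and the conversion assumes StartDate is an epoch value in milliseconds.

```sql
-- Sketch only: first_flag and start_ts are hypothetical aliases.
select str, i
      -- extracted JSON values come back as strings, so cast where a number is needed
      ,cast(get_json_object(jsn1, '$.max') as double) as jsn1_max
      -- a path can index into the array, which json_tuple cannot do
      ,get_json_object(jsn2, '$.Flags[0]') as first_flag
      -- assumes StartDate is epoch milliseconds; from_unixtime expects seconds
      ,from_unixtime(cast(cast(get_json_object(jsn2, '$.StartDate') as bigint) / 1000 as bigint)) as start_ts
from mytable
;
```

get_json_object parses the JSON string once per call, so json_tuple is usually the better fit when many top-level fields are needed from the same column.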

Related

How to parse JSON array that has no element/property names using Oracle

I am new to JSON and am trying to parse the data returned by the following URL:
https://api.binance.com/api/v3/klines?symbol=LTCBTC&interval=5m
The data is public if you want to see the exact output
I am in an Oracle 18c database trying to use json_table but I am not sure how to format the query or reference the columns as the JSON has no names, just values.
If I just paste in one record from the array as follows then I can get a column with all the values, but I need to parse the entire array and get the output into a table
SELECT *
FROM json_table( '[1617210000000,"0.00325500","0.00326600","0.00325400","0.00326600","780.81000000",1617210299999,"2.54374363",210,"569.58000000","1.85545803","0"]' , '$[*]'
COLUMNS (value PATH '$' ))
I have been searching Google for days and have not found an example of what I am trying to do; all the examples use JSON with name:value pairs.
Thank you in advance.
The raw data is an array of arrays, so you can use $[*] to get the individual arrays, and then numbered positions to get the values from each of those arrays:
SELECT *
FROM json_table(
'[[...], [...], ...]', -- use actual data, as CLOB?
'$[*]'
COLUMNS (
open_time PATH '$[0]',
open PATH '$[1]',
high PATH '$[2]',
low PATH '$[3]',
close PATH '$[4]',
volume PATH '$[5]',
close_time PATH '$[6]',
quote_av PATH '$[7]',
number_of_trades PATH '$[8]',
taker_buy_base_av PATH '$[9]',
taker_buy_quote_av PATH '$[10]',
ignore PATH '$[11]'
)
)
I've taken the column names from the API documentation. I'm not sure why some values are strings, presumably a precision thing, but you can specify the data types for the columns. (And there are lots of examples of converting epoch timestamps to Oracle dates/timestamps if you want to do that.)
db<>fiddle with four entries, and an additional column for ordinality, which you might not want/need.
IDX | OPEN_TIME | OPEN | HIGH | LOW | CLOSE | VOLUME | CLOSE_TIME | QUOTE_AV | NUMBER_OF_TRADES | TAKER_BUY_BASE_AV | TAKER_BUY_QUOTE_AV | IGNORE
--: | :------------ | :--------- | :--------- | :--------- | :--------- | :----------- | :------------ | :--------- | :--------------- | :---------------- | :----------------- | :-----
1 | 1617423900000 | 0.00356800 | 0.00357100 | 0.00356400 | 0.00356800 | 358.71000000 | 1617424199999 | 1.27964866 | 90 | 313.96000000 | 1.12008826 | 0
2 | 1617424200000 | 0.00356800 | 0.00357000 | 0.00356600 | 0.00356800 | 349.47000000 | 1617424499999 | 1.24704741 | 105 | 283.05000000 | 1.01005077 | 0
3 | 1617424500000 | 0.00357000 | 0.00357900 | 0.00357000 | 0.00357400 | 412.32000000 | 1617424799999 | 1.47359944 | 127 | 53.73000000 | 0.19203676 | 0
4 | 1617424800000 | 0.00357500 | 0.00357500 | 0.00356500 | 0.00356600 | 910.58000000 | 1617425099999 | 3.25045272 | 198 | 463.30000000 | 1.65400945 | 0
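As a rough sketch of the typed-column and epoch-conversion ideas mentioned above, the query below wraps the single record from the question in an outer array and declares data types for the two values that are JSON numbers. The alias jt and the output column open_ts are made up for illustration, and the conversion assumes the epoch values are milliseconds in UTC.

```sql
-- Sketch only: jt and open_ts are hypothetical names.
SELECT jt.idx,
       -- assumes epoch milliseconds; the resulting timestamp is in UTC
       TIMESTAMP '1970-01-01 00:00:00'
         + NUMTODSINTERVAL(jt.open_time / 1000, 'SECOND') AS open_ts,
       jt.number_of_trades
FROM json_table(
       '[[1617210000000,"0.00325500","0.00326600","0.00325400","0.00326600","780.81000000",1617210299999,"2.54374363",210,"569.58000000","1.85545803","0"]]',
       '$[*]'
       COLUMNS (
         idx              FOR ORDINALITY,     -- position of the inner array
         open_time        NUMBER PATH '$[0]',
         number_of_trades NUMBER PATH '$[8]'
       )
     ) jt
```

The price and volume values are JSON strings in the source data, so they are left out here; whether to keep them as strings or convert them to NUMBER depends on how much precision you need.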

MySQL JSON wildcards

I'm trying to obtain all IDs from table A of the elements which are in a JSON array in table B. My problem is that I don't know how to use wildcard symbols with JSON.
Table A looks like this
+-----------------------+--------------------+
| Identifier (TINYTEXT) | Filter (JSON) |
+-----------------------+--------------------+
| Obj1 | ['Test1', 'Test2'] |
| Obj2 | ['Test3', 'Test4'] |
+-----------------------+--------------------+
and table B looks like this
+-----------+--------------------+
| UID (INT) | Object (TINYTEXT) |
+-----------+--------------------+
| 1 | xyzTest1, abc |
| 2 | xyzTest2, abc |
| 3 | xyzTest3, abc |
| 4 | xyzTest4, abc |
+-----------+--------------------+
I want to use A.Identifier as the input to get B.UID as the output, applying the filter A.Filter on B.Object using wildcard symbols.
Example:
I have A.Identifier = 'Obj1' and want to find all B.UID for the corresponding B.Object that contain Test1 or Test2 (A.Filter). In this case, the output would be 1 and 2.
The SQL code without the inner join I would manually use for this is
SELECT UID FROM B WHERE Object LIKE '%Test1%' OR Object LIKE '%Test2%';

MariaDB 10.1 JsonGet_string

In one of our columns we store this example json string:
[{"Name":"Pay Amount","Value":"0.00"},{"Name":"Period","Value":"3"},{"Name":"Client","Value":"TestClient"},{"Name":"Our Reference","Value":""},{"Name":"Pay Type","Value":"Test"}]
The Names repeat in every row; only the Values differ.
I've tried querying this data using JsonGet_string :
SELECT
JSONGET_string(Header, "Name") Name
FROM tbl
but this only selects the first Name, i.e. PayAmount, so it displays a list of PayAmount values and never returns anything for Period, Client, etc.
The result that it returns looks like this:
| Name |
|----------|
| |
| PayAmount|
| PayAmount|
And it should return this:
| Name |
|-------------|
| |
| PayAmount |
| Period |
| Client |
| OutReference|
| Pay Type |
Any ideas?

export mysql table with data spread over multiple rows to csv

I have a MySQL table which is filled with inputs from a webform on my website. The form has fields for first name, last name, email, phone, address, etc., and when a user submits the form the data is stored in a MySQL table in a rather strange way.
my table looks like this:
submission# | value | field | tstamp | and |many |more |columns
=====================================================================================
1 |john#server.com |email |1448898875 | | | |
1 |john |firstname|1448898875 | | | |
1 |doe |lastname |1448898875 | | | |
1 |london |city |1448898875 | | | |
2 |jane#aol.com |email |1448898870 | | | |
2 |jane |firstname|1448898870 | | | |
2 |doe |lastname |1448898870 | | | |
2 |new york |city |1448898870 | | | |
3 |tim#aol.com |email |1448838571 | | | |
3 |tim |firstname|1448838571 | | | |
3 |smith |lastname |1448838571 | | | |
3 |paris |city |1448838571 | | | |
I need to export these data to a csv file in order to import it to a newsletter script on some other server, but the server expects these data in a different format:
submission#,email,firstname,lastname,tstamp,.....
1,john#server.com,john,doe,london,1448898875,,,,
2,jane#aol.com,jane,doe,1448898870,,,,
Exporting to CSV is not the problem, but how do I get all the data of one submission# into one row? Can anyone please point me in the right direction on how to accomplish this with SQL?
You can achieve the desired output by concatenating the field contents into a single comma-separated field with the concat() and group_concat() functions.
The only issue arises if any of the properties is missing for a particular submission. If that's the case, you will need a helper table which lists all properties, and you need to left join on that table. Since this is not the case for your sample data, I'm not providing the code for this scenario.
select concat(submission, ',', group_concat(`value` order by `field` asc), ',',tstamp)
from table group by submission, tstamp
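Note that `order by field asc` emits the values in alphabetical order of the field names (city, email, firstname, lastname for the sample data). If you need the field names in the first row, create a separate query that concatenates the field names, separated by commas and in that same order, and combine the two with union.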
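A different technique from the group_concat() approach above is conditional aggregation, which gives each field its own column and a fixed column order, and yields NULL where a submission lacks a field. This is only a sketch: the table name form_data and the key column name submission are assumptions, since the question does not give the real names.

```sql
-- Sketch only: form_data and submission are assumed names.
SELECT submission,
       -- each MAX(CASE ...) picks the single value stored for that field
       MAX(CASE WHEN `field` = 'email'     THEN `value` END) AS email,
       MAX(CASE WHEN `field` = 'firstname' THEN `value` END) AS firstname,
       MAX(CASE WHEN `field` = 'lastname'  THEN `value` END) AS lastname,
       MAX(CASE WHEN `field` = 'city'      THEN `value` END) AS city,
       tstamp
FROM form_data
GROUP BY submission, tstamp;
```

The trade-off is that every field name has to be listed explicitly, so a new form field means editing the query.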

Populate values from one table

I am trying to populate an empty table (t) from another table (t2) based on a flag field being set. Here is my attempt, followed by the table data.
UPDATE 2014PriceSheetIssues AS t
JOIN TransSalesAvebyType2013Combined AS t2
SET t.`Tran_Type`=t2.`Tran_Type` WHERE t.`rflag`='1';
When I run the script, I get zero (0) records affected.
+-----------+----------------+-------------------+-------+-------+
| Tran_Type | RetailAvePrice | WholesaleAvePrice | Rflag | Wflag |
+-----------+----------------+-------------------+-------+-------+
| 125C | 992 | 650 | 1 | NULL |
| 2004R | 1500 | NULL | 1 | NULL |
| 4EAT | 1480 | 1999 | 1 | 1 |
+-----------+----------------+-------------------+-------+-------+
I think you should just do the following:
INSERT INTO `2014PriceSheetIssues`
( `fldX`, `fldY` )
-- INSERT ... SELECT takes the query directly, with no VALUES keyword
SELECT `fldX`, `fldY`
FROM TransSalesAvebyType2013Combined
-- filter on the source table's flag; the target table is still empty
WHERE `rflag`='1';
The select query gets the values and the insert puts them into the (empty) other table.