Issue with KSQL STRUCT with VALUE_FORMAT='JSON'

Unable to create a KSQL stream with a STRUCT for the following data:
{
"_id": {"$oid": "62d79f3f63000ed99fa46f27"},
"CustomerID": "TT-21070",
"CustomerName": "TedTrevino",
"Segment": "Consumer",
"Country": "UnitedStates",
"City": "Akron",
"State": "Ohio",
"PostalCode": 44312,
"Region": "East"
}
Here is the topic:
ksql> print 'Mongo.Sample_SuperStore.People' from beginning;
Key format: JSON or HOPPING(KAFKA_STRING) or TUMBLING(KAFKA_STRING) or KAFKA_STRING
Value format: does not match any supported format. It may be a STRING with encoding other than UTF8, or some other format.
rowtime: 2022/07/20 07:33:55.826 Z, key: {"schema":{"type":"string","optional":false},"payload":"{\"_id\": {\"_data\": \"8262D7AFE0000000072B022C0100296E5A1004308C1145AA9245EB958ACBB9EA8ECEDF46645F6964006462D7AFE063000ED99FA470D00004\"}}"}, value: {"schema":{"type":"string","optional":false},"payload":"{\"_id\": {\"$oid\": \"62d7afe063000ed99fa470d0\"}, \"CustomerID\": \"SM-20320\", \"CustomerName\": \"SeanMiller\", \"Segment\": \"HomeOffice\", \"Country\": \"UnitedStates\", \"City\": \"Jacksonville\", \"State\": \"Florida\", \"PostalCode\": 32216, \"Region\": \"South\"}"}, partition: 0
Then I created this stream:
CREATE STREAM STREAM_SUPERSTORE_PEOPLE (
payload STRUCT<
_id STRUCT<`$oid` VARCHAR>,
CustomerID VARCHAR,
CustomerName VARCHAR,
Segment VARCHAR,
Country VARCHAR,
City VARCHAR,
State VARCHAR,
PostalCode INT,
Region VARCHAR>
)
WITH (KAFKA_TOPIC='Mongo.Sample_SuperStore.People', VALUE_FORMAT='JSON');
The output is blank:
ksql> select * from STREAM_SUPERSTORE_PEOPLE emit changes;
+---------------------------------------------------------------------------------------------------------------------------------------+
|PAYLOAD |
+---------------------------------------------------------------------------------------------------------------------------------------+
But the following approach, declaring the payload as VARCHAR instead of STRUCT, works:
CREATE STREAM STREAM_SUPERSTORE_PEOPLE (payload varchar)
WITH (KAFKA_TOPIC='Mongo.Sample_SuperStore.People', VALUE_FORMAT='JSON');
-- Extracting fields one by one from PAYLOAD
SELECT
EXTRACTJSONFIELD(PAYLOAD, '$.CustomerID') as CustomerID,
EXTRACTJSONFIELD(PAYLOAD, '$.CustomerName') as CustomerName,
EXTRACTJSONFIELD(PAYLOAD, '$.Segment') as Segment,
EXTRACTJSONFIELD(PAYLOAD, '$.Country') as Country,
EXTRACTJSONFIELD(PAYLOAD, '$.City') as City,
EXTRACTJSONFIELD(PAYLOAD, '$.State') as State,
EXTRACTJSONFIELD(PAYLOAD, '$.PostalCode') as PostalCode,
EXTRACTJSONFIELD(PAYLOAD, '$.Region') as Region from STREAM_SUPERSTORE_PEOPLE
emit changes limit 1;
Output
+---------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+
|CUSTOMERID |CUSTOMERNAME |SEGMENT |COUNTRY |CITY |STATE |POSTALCODE |REGION |
+---------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+
|TT-21070 |TedTrevino |Consumer |UnitedStates |Akron |Ohio |44312 |East |
Limit Reached
Query terminated
I want to do this the STRUCT way. Please help.

What you've posted is not the actual data in your topic. Look at the print output: the value is {"schema":{"type":"string","optional":false},"payload":"{\"_id\": ..., so the inner JSON is just a string wrapped in a schema/payload envelope. That is why the EXTRACTJSONFIELD function works on a VARCHAR column while the STRUCT declaration matches nothing.
In other words, the data in your topic may look like JSON, but it is not shaped the way VALUE_FORMAT='JSON' with a STRUCT schema expects.
You need to reconfigure your Kafka Connect source connector to use JsonConverter with value.converter.schemas.enable=false, so that records are written without the schema and payload wrapper fields.
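The difference between the two record shapes can be sketched in Python (a minimal illustration only; the envelope mirrors the print output above, and the field values are taken from the question's sample):

```python
import json

# With schemas.enable=true, JsonConverter writes a schema/payload envelope
# whose payload is the document serialized *as a string*.
enveloped = json.dumps({
    "schema": {"type": "string", "optional": False},
    "payload": json.dumps({"_id": {"$oid": "62d79f3f63000ed99fa46f27"},
                           "CustomerID": "TT-21070"}),
})

outer = json.loads(enveloped)
assert isinstance(outer["payload"], str)   # inner JSON is just a string
inner = json.loads(outer["payload"])       # a second parse is required
assert inner["CustomerID"] == "TT-21070"

# With schemas.enable=false, the value is the document itself, which a
# JSON deserializer can map onto a structured schema directly.
plain = json.dumps({"_id": {"$oid": "62d79f3f63000ed99fa46f27"},
                    "CustomerID": "TT-21070"})
assert json.loads(plain)["CustomerID"] == "TT-21070"
```

This is why the STRUCT stream returned nothing: ksqlDB was looking for a payload STRUCT, but the topic held a payload string.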

Related

How to Convert Oracle to Postgresql [duplicate]

I'm trying to migrate Oracle 12c queries to Postgres11.5.
Here is the json:
{
"cost": [{
"spent": [{
"ID": "HR",
"spentamount": {
"amount": 2000.0,
"country": "US"
}
}]
}],
"time": [{
"spent": [{
"ID": "HR",
"spentamount": {
"amount": 308.91,
"country": "US"
}
}]
}]
}
Here is the query that has to be migrated to Postgres 11.5:
select js.*
from P_P_J r,
json_table(r.P_D_J, '$.*[*]'
COLUMNS(NESTED PATH '$.spent[*]'
COLUMNS(
ID VARCHAR2(100 CHAR) PATH '$.ID',
amount NUMBER(10,4) PATH '$.spentamount.amount',
country VARCHAR2(100 CHAR) PATH '$.spentamount.country'))
) js
The result:
ID, amount, country
HR, 2000.0,US
HR,308.91,US
I have two questions here:
What does $.*[*] mean?
How can we migrate this query in Postgres so that it directly looks at 'spent' instead of navigating 'cost'->'spent' or 'time'->'spent'
There is no direct replacement for json_table in Postgres. You will have to combine several calls to explode the JSON structure.
You didn't show us your expected output, but as far as I can tell, the following should do the same:
select e.item ->> 'ID' as id,
(e.item #>> '{spentamount, amount}')::numeric as amount,
e.item #>> '{spentamount, country}' as country
from p_p_j r
cross join jsonb_each(r.p_d_j) as a(key, val)
cross join lateral (
select *
from jsonb_array_elements(a.val)
where jsonb_typeof(a.val) = 'array'
) as s(element)
cross join jsonb_array_elements(s.element -> 'spent') as e(item)
;
The JSON path expression '$.*[*]' means: iterate over all top-level keys, then iterate over all array elements found there; the nested path '$.spent[*]' then iterates over the array elements in each of those. These steps are reflected in the three JSON function calls needed to get there.
With Postgres 12 this would be a bit easier, as it can be done with a single call to jsonb_path_query(), which accesses the elements using a very similar JSON path expression:
select e.item ->> 'ID' as id,
(e.item #>> '{spentamount, amount}')::numeric as amount,
e.item #>> '{spentamount, country}' as country
from p_p_j r
cross join jsonb_path_query(r.p_d_j, '$.*[*].spent[*]') as e(item)
;
Online example
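For readers without a database at hand, the same traversal can be sketched in Python (using the JSON from the question; the three nested loops mirror the three jsonb calls above):

```python
import json

doc = json.loads("""
{"cost": [{"spent": [{"ID": "HR", "spentamount": {"amount": 2000.0, "country": "US"}}]}],
 "time": [{"spent": [{"ID": "HR", "spentamount": {"amount": 308.91, "country": "US"}}]}]}
""")

rows = []
for key, val in doc.items():                    # $.*  : every top-level key
    if isinstance(val, list):                   # jsonb_typeof(a.val) = 'array'
        for element in val:                     # [*]  : every array element
            for item in element.get("spent", []):   # $.spent[*]
                rows.append((item["ID"],
                             item["spentamount"]["amount"],
                             item["spentamount"]["country"]))

# rows == [('HR', 2000.0, 'US'), ('HR', 308.91, 'US')]
```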

Oracle Parse JSON Variable into table

I need to write a procedure that accepts a parameter of type CLOB, which will actually be a JSON string, parses it, and inserts it into a table. The fields in the JSON are in the same order as the columns in the table.
The string would look like this:
{
"signal_id": "1",
"ts_id": "3",
"add_price": "0",
"qty": "1",
"stops": "0.00",
"yield": "0.00",
"close_date": "NULL",
"close_price": "0.00",
"ticker": "IBM",
"option_ticker": "NULL",
"signal_date": "2012-07-25",
"estimated_reporting_date": "NULL",
"signal_val": "1",
"comp_name": "INTERNATIONA",
"lt_price": "190.34",
"sell_target": "NULL",
"high_target": "NULL",
}
What is the best way to parse that, and insert into the table?
Use JSON_TABLE:
CREATE PROCEDURE insert_json (i_json IN CLOB)
IS
BEGIN
INSERT INTO your_table (
signal_id, ts_id, add_price, qty, stops, yield, close_date, close_price,
ticker, option_ticker, signal_date, estimated_reporting_date
/*...*/
)
SELECT *
FROM JSON_TABLE(
i_json,
'$'
COLUMNS(
signal_id NUMBER PATH '$.signal_id',
ts_id NUMBER PATH '$.ts_id',
add_price NUMBER PATH '$.add_price',
qty NUMBER PATH '$.qty',
stops NUMBER PATH '$.stops',
yield NUMBER PATH '$.yield',
close_date DATE PATH '$.close_date',
close_price NUMBER PATH '$.close_price',
ticker VARCHAR2(10) PATH '$.ticker',
option_ticker VARCHAR2(10) PATH '$.option_ticker',
signal_date DATE PATH '$.signal_date',
estimated_reporting_date DATE PATH '$.estimated_reporting_date'
-- ...
)
);
END insert_json;
/
db<>fiddle here
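Outside the database, the same mapping from JSON keys to insert columns can be sketched in Python (key names come from the question's sample; `your_table` is the hypothetical target, and a real loader would bind the values as SQL parameters):

```python
import json

payload = """{
  "signal_id": "1", "ts_id": "3", "add_price": "0", "qty": "1",
  "stops": "0.00", "yield": "0.00", "close_date": "NULL",
  "close_price": "0.00", "ticker": "IBM", "option_ticker": "NULL",
  "signal_date": "2012-07-25", "estimated_reporting_date": "NULL",
  "signal_val": "1", "comp_name": "INTERNATIONA", "lt_price": "190.34",
  "sell_target": "NULL", "high_target": "NULL"
}"""

record = json.loads(payload)
# The sample uses the literal string "NULL" for missing values; map it to
# None so a parameterized INSERT would write SQL NULL.
cleaned = {k: (None if v == "NULL" else v) for k, v in record.items()}

columns = list(cleaned)                     # same order as the JSON keys
values = [cleaned[c] for c in columns]
placeholders = ", ".join(":" + c for c in columns)
sql = f"INSERT INTO your_table ({', '.join(columns)}) VALUES ({placeholders})"
```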

Oracle select JSON column as key / value table [duplicate]

This question already has an answer here:
Query json dictionary data in SQL
(1 answer)
Closed 1 year ago.
In Oracle 12c, having a column with JSON data in this format:
{
"user_name": "Dave",
"phone_number": "13326415",
"married": false,
"age": 18
}
How can I select it in this format:
key val
-------------- ----------
"user_name" "Dave"
"phone_number" "13326415"
"married" "false"
"age" "18"
As stated in the comment, there is no way to get the keys of a JSON object using just SQL. With PL/SQL you can create a pipelined function to get the information you need. Below is a very simple pipelined function that returns each key's JSON type, the key name, and the value.
First, you will need to create the types that will be used by the function
CREATE OR REPLACE TYPE key_value_table_rec FORCE AS OBJECT
(
TYPE VARCHAR2 (100),
key VARCHAR2 (200),
VALUE VARCHAR2 (200)
);
/
CREATE OR REPLACE TYPE key_value_table_t AS TABLE OF key_value_table_rec;
/
Next, create the pipelined function that will return the information in the format of the types defined above.
CREATE OR REPLACE FUNCTION get_key_value_table (p_json CLOB)
RETURN key_value_table_t
PIPELINED
AS
l_json json_object_t;
l_json_keys json_key_list;
l_json_element json_element_t;
BEGIN
l_json := json_object_t (p_json);
l_json_keys := l_json.get_keys;
FOR i IN 1 .. l_json_keys.COUNT
LOOP
l_json_element := l_json.get (l_json_keys (i));
PIPE ROW (key_value_table_rec (
CASE
WHEN l_json_element.is_null THEN 'null'
WHEN l_json_element.is_boolean THEN 'boolean'
WHEN l_json_element.is_number THEN 'number'
WHEN l_json_element.is_timestamp THEN 'timestamp'
WHEN l_json_element.is_date THEN 'date'
WHEN l_json_element.is_string THEN 'string'
WHEN l_json_element.is_object THEN 'object'
WHEN l_json_element.is_array THEN 'array'
ELSE 'unknown'
END,
l_json_keys (i),
l_json.get_string (l_json_keys (i))));
END LOOP;
RETURN;
EXCEPTION
WHEN OTHERS
THEN
CASE SQLCODE
WHEN -40834
THEN
--JSON format is not valid
NULL;
ELSE
RAISE;
END CASE;
END;
/
Finally, you can call the pipelined function from a SELECT statement
SELECT * FROM TABLE (get_key_value_table (p_json => '{
"user_name": "Dave",
"phone_number": "13326415",
"married": false,
"age": 18
}'));
TYPE KEY VALUE
__________ _______________ ___________
string user_name Dave
string phone_number 13326415
boolean married false
number age 18
If your JSON values are stored in a column in a table, you can view the keys/values using CROSS JOIN
WITH
sample_table (id, json_col)
AS
(SELECT 1, '{"key1":"val1","key_obj":{"nested_key":"nested_val"},"key_bool":false}'
FROM DUAL
UNION ALL
SELECT 2, '{"key3":3.14,"key_arr":[1,2,3]}' FROM DUAL)
SELECT t.id, j.*
FROM sample_table t CROSS JOIN TABLE (get_key_value_table (p_json => t.json_col)) j;
ID TYPE KEY VALUE
_____ __________ ___________ ________
1 string key1 val1
1 object key_obj
1 boolean key_bool false
2 number key3 3.14
2 array key_arr
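The same key/type/value expansion can be sketched outside the database in Python (a minimal analogue of the pipelined function above, not Oracle code; the type names follow JSON's own categories):

```python
import json

def key_value_rows(json_text):
    """Yield (type, key, value) for each top-level key of a JSON object."""
    obj = json.loads(json_text)
    for key, val in obj.items():
        if val is None:
            kind = "null"
        elif isinstance(val, bool):        # must check bool before number
            kind = "boolean"
        elif isinstance(val, (int, float)):
            kind = "number"
        elif isinstance(val, str):
            kind = "string"
        elif isinstance(val, dict):
            kind = "object"
        else:
            kind = "array"
        # Mirror get_string(): render only scalars, leaving the VALUE
        # column blank for objects/arrays as in the output above.
        if isinstance(val, (dict, list)):
            value = None
        elif isinstance(val, bool):
            value = str(val).lower()
        else:
            value = str(val)
        yield kind, key, value

rows = list(key_value_rows(
    '{"user_name": "Dave", "phone_number": "13326415", "married": false, "age": 18}'))
# rows == [('string', 'user_name', 'Dave'), ('string', 'phone_number', '13326415'),
#          ('boolean', 'married', 'false'), ('number', 'age', '18')]
```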

SQL json_extract returns null

I am attempting to extract from my json object
hits = [{"title": "Facebook",
"domain": "facebook.com"},
{"title": "Linkedin",
"domain": "linkedin.com"}]
When I use:
json_extract(hits,'$.title') as title,
nothing is returned. I would like the result to be: [Facebook, Linkedin].
However, when I extract by a scalar value, ex.:
json_extract_scalar(hits,'$[0].title') as title,
it works and Facebook is returned.
hits contains a lot of values, so I need to use json_extract to get all of them; I can't do each scalar individually. Any suggestions to fix this would be greatly appreciated.
For $.title I get the error INVALID_FUNCTION_ARGUMENT: Invalid JSON path: '$.title' (likewise with double stars). When I try unnest I get INVALID_FUNCTION_ARGUMENT: Cannot unnest type: varchar and INVALID_FUNCTION_ARGUMENT: Cannot unnest type: json. When I try double quotes I get SYNTAX_ERROR: line 26:19: Column '$.title' cannot be resolved.
The correct JSON path to extract all titles is $.[*].title (or $.*.title), though it is not supported by Athena. One option is to cast your JSON to an array of JSON and use transform on it:
WITH dataset AS (
SELECT * FROM (VALUES
(JSON '[{"title": "Facebook",
"domain": "facebook.com"},
{"title": "Linkedin",
"domain": "linkedin.com"}]')
) AS t (json_string))
SELECT transform(cast(json_string as ARRAY(JSON)), js -> json_extract_scalar(js, '$.title'))
FROM dataset
Output:
_col0
[Facebook, Linkedin]
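The cast-and-transform step can be mirrored in Python (a hypothetical stand-in for the Athena query; `transform` with a lambda becomes a list comprehension):

```python
import json

json_string = '''[{"title": "Facebook", "domain": "facebook.com"},
                  {"title": "Linkedin", "domain": "linkedin.com"}]'''

# cast(json_string as ARRAY(JSON)) -> a Python list of objects
array_of_json = json.loads(json_string)

# transform(..., js -> json_extract_scalar(js, '$.title'))
titles = [js["title"] for js in array_of_json]
# titles == ['Facebook', 'Linkedin']
```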
First, you have an array, so $.title doesn't exist; see below.
Second, your JSON is not valid: it must use double quotes " as the example shows.
SET @a := '[{
"title": "Facebook",
"domain": "facebook.com"
},
{
"title": "Linkedin",
"domain": "linkedin.com"
}
]';
SELECT json_extract(@a,'$[0]') as title
| title |
| :---------------------------------------------- |
| {"title": "Facebook", "domain": "facebook.com"} |
SELECT JSON_EXTRACT(@a, "$[0].title") AS 'from'
| from |
| :--------- |
| "Facebook" |
SELECT @a
| @a |
| :-- |
| [{ "title": "Facebook", "domain": "facebook.com" }, { "title": "Linkedin", "domain": "linkedin.com" }] |
db<>fiddle here
