I'm trying to retreive some specific data from a json stored in my database.
Here is my fidle : https://www.db-fiddle.com/f/5qZhsyddqJNej2NGj1x1hi/1
An exemple of a json string :
{
"complexProperties":[
{
"properties":{
"key":"Registred",
"Value":"123456789"
}
},
{
"properties":{
"key":"Urgency",
"Value":"Total"
}
},
{
"properties":{
"key":"ImpactScope",
"Value":"All"
}
}
]
}
In this case I need to retreive the value of Registred which is 123456789
Here is the request I tried to retreive first all value:
SELECT CAST(data AS jsonb)::json->>'complexProperties'->'properties' AS Registred FROM jsontesting
Query Error: error: operator does not exist: text -> unknown
You can use a JSON Path expression:
select jsonb_path_query_first(data, '$.complexProperties[*].properties ? (#.key == "Registred").Value')
from jsontesting;
This returns a jsonb value. If you need to convert that to a text value, use jsonb_path_query_first(...) #>> '{}'
Online example
An alternative that first flattens the JSON field (the arrj subquery) and then performs an old-school select. Using your jsontesting table -
select (j -> 'properties' ->> 'Value')
from
(
select json_array_elements(data::json -> 'complexProperties') as j
from jsontesting
) as arrj
where j -> 'properties' ->> 'key' = 'Registred';
Online example
Related
this is my test json file.
{
"item" : {
"fracData" : [ ],
"fractimeData" : [ {
"number" : "1232323232",
"timePeriods" : [ {
"validFrom" : "2021-08-03"
} ]
} ],
"Module" : [ ]
}
}
This is how I read the json file.
starhist_test_df = spark.read.json("/mapr/xxx/yyy/ttt/dev/rawdata/Test.json", multiLine=True)
starhist_test_df.createOrReplaceTempView("v_test_df")
This query works.
df_test_01 = spark.sql("""
select item.fractimeData.number from v_test_df""")
df_test_01.collect();
Result
[Row(number=['1232323232'])]
But this query doesn't work.
df_test_01 = spark.sql("""
select item.fractimeData.timePeriods.validFrom from v_test_df""")
df_test_01.collect();
Error
cannot resolve 'v_test_df.`item`.`fractimeData`.`timePeriods`['validFrom']' due to data type mismatch: argument 2 requires integral type, however, ''validFrom'' is of string type.; line 3 pos 0;
What do I have to change, to read the validFrom field?
dot notation to access values works with struct or array<struct> types.
The schema for field number in item.fractimeData is string and accessing it via dot notation returns an array<string> since fractimeData is an array.
Similarly, the schema for field timePeriods in item.fractimeData is <array<struct<validFrom>>, and accessing it via dot notation wraps it into another array, resulting in final schema of array<array<struct<validFrom>>>.
The error you get is because the dot notation can work on array<struct> but not on array<array>.
Hence, flatten the result from item.fractimeData.timePeriods to get back an array<struct<validFrom>> and then apply the dot notation.
df_test_01 = spark.sql("""
select flatten(item.fractimeData.timePeriods).validFrom as validFrom from v_test_df""")
df_test_01.collect()
"""
[Row(validFrom=['2021-08-03', '2021-08-03'])]
"""
I have some columns in my Oracle database that contains json and to extract it's data in a query, I use REGEXP_SUBSTR.
In the following example, value is a column in the table DOSSIER that contains json. The regex extract the value of the property client.reference in that json
SELECT REGEXP_SUBSTR(value, '"client"(.*?)"reference":"([^"]+)"', 1, 1, NULL, 2) FROM DOSSIER;
So if the json looks like this :
[...],
"client": {
"someproperty":"123",
"someobject": {
[...]
},
"reference":"ABCD",
"someotherproperty":"456"
},
[...]
The SQL query will return ABDC.
My problem is that some json have multiple instance of "client", for example :
[...],
"contract": {
"client":"Name of the client",
"supplier": {
"reference":"EFGH"
}
},
[...],
"client": {
"someproperty":"123",
"someobject": {
[...]
},
"reference":"ABCD",
"someotherproperty":"456"
},
[...]
You get the issue, now the SQL query will return EFGH, which is the supplier's reference.
How can I make sure that "reference" is contained in a json object "client" ?
EDIT : I'm on Oracle 11g so I can't use the JSON API and I would like to avoid using third-party package
Assuming you are using Oracle 12c or later then you should NOT use regular expressions and should use Oracle's JSON functions.
If you have the table and data:
CREATE TABLE table_name ( value CLOB CHECK ( value IS JSON ) );
INSERT INTO table_name (
value
) VALUES (
'{
"contract": {
"client":"Name of the client",
"supplier": {
"reference":"EFGH"
}
},
"client": {
"someproperty":"123",
"someobject": {},
"reference":"ABCD",
"someotherproperty":"456"
}
}'
);
Then you can use the query:
SELECT JSON_VALUE( value, '$.client.reference' ) AS reference
FROM table_name;
Which outputs:
REFERENCE
ABCD
db<>fiddle here
If you are using Oracle 11 or earlier then you could use the third-party PLJSON package to parse JSON in PL/SQL. For example, this question.
Or enable Java within the database and then use CREATE JAVA (or the loadjava utility) to add a Java class that can parse JSON to the database and then wrap it in an Oracle function and use that.
I faced similar issue recently. If "reference" is a property that is only present inside "client" object, this will solve:
SELECT reference FROM (
SELECT DISTINCT
REGEXP_SUBSTR(
DBMS_LOB.SUBSTR(
value,
4000
),
'"reference":"(.+?)"',
1, 1, 'c', 1) reference
FROM DOSSIER
) WHERE reference IS NOT null;
You can also try to adapt the regex to your need.
Edit:
In my case, column type is CLOB and that's why I use DBMS_LOB.SUBSTR function there. You can remove this function and pass column directly in REGEXP_SUBSTR.
I have the following json :
[
{
"transition":"random_word",
"from":"paris",
"to":"porto",
"date":{
"date":"2020-05-28 11:51:25.201864",
"timezone_type":3,
"timezone":"Europe\/Paris"
}
},
{
"transition":"rainbow",
"from":"porto",
"to":"faro",
"date":{
"date":"2020-06-06 23:10:06.878539",
"timezone_type":3,
"timezone":"Europe\/Paris"
}
},
{
"transition":"banana",
"from":"faro",
"to":"rio_de_janeiro",
"date":{
"date":"2020-06-06 23:14:10.975099",
"timezone_type":3,
"timezone":"Europe\/Paris"
}
},
{
"transition":"hello",
"from":"rio_de_janeiro",
"to":"buenos_aires",
"date":{
"date":"2020-06-06 23:14:15.314370",
"timezone_type":3,
"timezone":"Europe\/Paris"
}
}
]
Imagine I want to retrieve the last stop of my traveler (the value of the key "to" from the last json object. Here : buenos_aires) and the date (here :2020-06-06 23:14:15.314370).
How should I proceed knowing that I want to do that using PostgreSQL?
If with "last" you mean the order in which the elements show up in the array, you can use jsonb_array_length() to get the length of the array and use that to obtain the last element:
select (the_json_column -> jsonb_array_length(the_json_column) - 1) ->> 'to' as "to",
(the_json_column -> jsonb_array_length(the_json_column) - 1) #>> '{date,date}' as "date"
from the_table
The expression jsonb_array_length(the_json_column) - 1 calculates the index of the "last" element in the array.
If your column is defined as json rather than jsonb (which it should be) you need to use the equivalent json_array_length() instead.
I have JSON that resembles the following:
{
"ANNOTATIONS": [
{
"Label": "CommingledProduct",
"Text": "NBP"
},
{
"Label": "CommingledVenue",
"Text": "OTC"
}
]
}
I need to unpack this into a flat table with columns matching the annotation labels. So columns based on the above json become:
comingled_product
comingled_venue
The JSON is coming from a json field in a source table and being unpacked into another table.
I know that I could code as follows:
INSERT INTO my_target_table (comingled_product, comingled_venue)
SELECT
payload->'ANNOTATIONS'->0->>'Text',
payload->'ANNOTATIONS'->1->>'Text'
FROM my_source_table;
However, I would rather not use the ordinals of the annotations. I would prefer to use some syntax mirroring the psuedo-code below:
INSERT INTO my_target_table (comingled_product, comingled_venue)
SELECT
payload->'ANNOTATIONS'->'label="ComingledProduct"'->>'Text',
payload->'ANNOTATIONS'->'label="ComingledVenueID"'->>'Text'
FROM my_source_table;
Can anyone tell me if what I'm trying to ahcieve is possible and how to do it? There are more than the two annotations I have included in the sample, so anything that involves multiple joins is probably a no go.
Using PostGres 10.7
demo:db<>fiddle
WITH cte AS (
SELECT
elems.value
FROM
my_source_table,
json_array_elements(payload -> 'ANNOTATIONS') elems
)
SELECT
(SELECT value ->> 'Text' FROM cte WHERE value ->> 'Label' = 'CommingledProduct'),
(SELECT value ->> 'Text' FROM cte WHERE value ->> 'Label' = 'CommingledVenue')
Expanding the array into one row per array element and store this result for further usage into a CTE
This result can be used to query the expected values (without doing the expanding twice)
Could be a little bit faster:
demo:db<>fiddle
SELECT
payload,
MIN(the_text) FILTER (WHERE label = 'CommingledProduct'),
MIN(the_text) FILTER (WHERE label = 'CommingledVenue')
FROM (
SELECT
payload::text AS payload,
elems ->> 'Label' AS label,
elems ->> 'Text' AS the_text
FROM
my_source_table,
json_array_elements(payload -> 'ANNOTATIONS') elems
) s
GROUP BY payload
The answer from #S-Man is great and you should use that for your postgres 10.7. json_path will be added in postgres 12, which will allow you to do something a little bit closer to your pseudo-code, but only with jsonb (not json):
INSERT INTO my_target_table (comingled_product, comingled_venue)
SELECT jsonb_path_query(payload,
'$.ANNOTATIONS[*] ? (#.Label == "CommingledProduct")')->>'Text',
jsonb_path_query(payload,
'$.ANNOTATIONS[*] ? (#.Label == "CommingledVenue")')->>'Text'
FROM my_source_table;
The jsonb_path_query syntax takes a bit to figure out, but it is basically returning elements of the ANNOTATIONS array for which the Label equals either CommingledProduct or CommingledVenue. jsonb_path_query returns a jsonb object, so we can use the ->> operator to grab the value of 'Text' from the object.
I have a postgresql table called datasource with jsonb column called config. It has the following structure:
{
"url":"some_url",
"password":"some_password",
"username":"some_username",
"projectNames":[
"project_name_1",
...
"project_name_N"
]
}
I would like to transform nested json array projectNames into a map and add a default value for each element from the array, so it would look like:
{
"url":"some_url",
"password":"some_password",
"username":"some_username",
"projectNames":{
"project_name_1": "value",
...
"project_name_N": "value"
}
}
I have selected projectNames from the table using postgresql jsonb operator config#>'{projectNames}', but I have no idea how to perform transform operation.
I think, I should use something like jsonb_object_agg, but it converts all data into a single row.
I'm using PostgreSQL 9.6 version.
You need to first unnest the array, then build a new JSON document from that. Then you can put that back into the column.
update datasource
set config = jsonb_set(config, '{projectNames}', t.map)
from (
select id, jsonb_object_agg(pn.n, 'value') as map
from datasource, jsonb_array_elements_text(config -> 'projectNames') as pn (n)
group by id
) t
where t.id = datasource.id;
The above assumes that there is a primary (or at least unique) column named id. The inner select transforms the array into a map.
Online example: http://rextester.com/GPP85654
are you looking for smth like:
t=# with c(j) as (values('{
"url":"some_url",
"password":"some_password",
"username":"some_username",
"projectNames":[
"project_name_1",
"project_name_N"
]
}
'::jsonb))
, n as (select j,jsonb_array_elements_text(j->'projectNames') a from c)
select jsonb_pretty(jsonb_set(j,'{projectNames}',jsonb_object_agg(a,'value'))) from n group by j
;
jsonb_pretty
------------------------------------
{ +
"url": "some_url", +
"password": "some_password", +
"username": "some_username", +
"projectNames": { +
"project_name_1": "value",+
"project_name_N": "value" +
} +
}
(1 row)
Time: 19.756 ms
if so, look at:
https://www.postgresql.org/docs/current/static/functions-aggregate.html
https://www.postgresql.org/docs/current/static/functions-json.html