How to extract data from JSON using an Oracle Text index

I have a table with an Oracle Text index, which I created because I need very fast searches. The table contains JSON data. Oracle's json_textcontains performs very poorly, so I tried using CONTAINS directly (the query plan shows that json_textcontains is in fact rewritten to CONTAINS).
I want to find all JSON documents by a given class_type and the id of a value, but Oracle matches anywhere in the document without requiring that class_type and id sit in the same JSON section; i.e. it treats the JSON as one huge string rather than as structured data.
Well-formatted, the JSON looks like this:
{
"class":[
{
"class_type":"ownership",
"values":[{"nm":"id","value":"1"}]
},
{
"class_type":"country",
"values":[{"nm":"id","value":"640"}]
},
{
"class_type":"features",
"values":[{"nm":"id","value":"15"},{"nm":"id","value":"20"}]
}
]
}
The second document, which should not be found, looks like this:
{
"class":[
{
"class_type":"ownership",
"values":[{"nm":"id","value":"18"}]
},
{
"class_type":"country",
"values":[{"nm":"id","value":"11"}]
},
{
"class_type":"features",
"values":[{"nm":"id","value":"7"},{"nm":"id","value":"640"}]
}
]
}
Here is how to reproduce what I'm trying to achieve:
create table perso.json_data(id number, data_val blob);
insert into perso.json_data
values(
1,
utl_raw.cast_to_raw('{"class":[{"class_type":"ownership","values":[{"nm":"id","value":"1"}]},{"class_type":"country","values":[{"nm":"id","value":"640"}]},{"class_type":"features","values":[{"nm":"id","value":"15"},{"nm":"id","value":"20"}]}]}')
);
insert into perso.json_data values(
2,
utl_raw.cast_to_raw('{"class":[{"class_type":"ownership","values":[{"nm":"id","value":"18"}]},{"class_type":"country","values":[{"nm":"id","value":"11"}]},{"class_type":"features","values":[{"nm":"id","value":"7"},{"nm":"id","value":"640"}]}]}')
)
;
commit;
ALTER TABLE perso.json_data
ADD CONSTRAINT check_is_json
CHECK (data_val IS JSON (STRICT));
CREATE INDEX perso.json_data_idx ON perso.json_data (data_val)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS ('section group CTXSYS.JSON_SECTION_GROUP SYNC (ON COMMIT)');
select *
from perso.json_data
where ctxsys.contains(data_val, '(640 INPATH(/class/values/value)) and (country inpath (/class/class_type))')>0
The query returns 2 rows but I expect to get only the record where id = 1.
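To make the mismatch concrete, here is a minimal Python sketch (not Oracle code) of the two readings of the query against the two sample documents. The function names flat_match and correlated_match are hypothetical; the first tests each path independently over the whole document, which is how the CONTAINS query above behaves, and the second requires both conditions to hold inside the same element of "class".

```python
import json

doc1 = json.loads('{"class":[{"class_type":"ownership","values":[{"nm":"id","value":"1"}]},{"class_type":"country","values":[{"nm":"id","value":"640"}]},{"class_type":"features","values":[{"nm":"id","value":"15"},{"nm":"id","value":"20"}]}]}')
doc2 = json.loads('{"class":[{"class_type":"ownership","values":[{"nm":"id","value":"18"}]},{"class_type":"country","values":[{"nm":"id","value":"11"}]},{"class_type":"features","values":[{"nm":"id","value":"7"},{"nm":"id","value":"640"}]}]}')


def flat_match(doc, class_type, value):
    """Test each path on its own, anywhere in the document."""
    types = {c["class_type"] for c in doc["class"]}
    values = {v["value"] for c in doc["class"] for v in c["values"]}
    return class_type in types and value in values


def correlated_match(doc, class_type, value):
    """Require class_type and value inside the same element of "class"."""
    return any(
        c["class_type"] == class_type
        and any(v["value"] == value for v in c["values"])
        for c in doc["class"]
    )


docs = {1: doc1, 2: doc2}
flat_ids = [i for i, d in docs.items() if flat_match(d, "country", "640")]
correlated_ids = [i for i, d in docs.items() if correlated_match(d, "country", "640")]

print(flat_ids)        # [1, 2]
print(correlated_ids)  # [1]
```

Document 2 passes the flat test because "country" and "640" both occur somewhere in it, just not in the same section.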
How can I use a full-text index so that the search avoids the false match described above, without using JSON_TABLE?
There is no option to store the data in relational format.
Thanks in advance.

Please don't use the text index directly to try to solve this kind of problem. It's not what it's designed for.
In 12.2.0.1.0 this should work for you (and yes, it does use a specialized version of the text index under the covers, but it also applies selective post-filtering to ensure the results are correct).
SQL> create table json_data(id number, data_val blob)
2 /
Table created.
SQL> insert into json_data values(
2 1,utl_raw.cast_to_raw('{"class":[{"class_type":"ownership","values":[{"nm":"id","value":"1"}]},{"class_type":"country","values":[{"nm":"id","value":"640"}]},{"class_type":"features","values":[{"nm":"id","value":"15"},{"nm":"id","value":"20"}]}]}')
3 )
4 /
1 row created.
Execution Plan
----------------------------------------------------------
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 1 | 100 | 1 (0)| 00:00:01 |
| 1 | LOAD TABLE CONVENTIONAL | JSON_DATA | | | | |
--------------------------------------------------------------------------------------
SQL> insert into json_data values(
2 2,utl_raw.cast_to_raw('{"class":[{"class_type":"ownership","values":[{"nm":"id","value":"18"}]},{"class_type":"country","values":[{"nm":"id","value":"11"}]},{"class_type":"features","values":[{"nm":"id","value":"7"},{"nm":"id","value":"640"}]}]}')
3 )
4 /
1 row created.
Execution Plan
----------------------------------------------------------
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 1 | 100 | 1 (0)| 00:00:01 |
| 1 | LOAD TABLE CONVENTIONAL | JSON_DATA | | | | |
--------------------------------------------------------------------------------------
SQL> commit
2 /
Commit complete.
SQL> ALTER TABLE json_data
2 ADD CONSTRAINT check_is_json
3 CHECK (data_val IS JSON (STRICT))
4 /
Table altered.
SQL> CREATE SEARCH INDEX json_SEARCH_idx ON json_data (data_val) for JSON
2 /
Index created.
SQL> set autotrace on explain
SQL> --
SQL> set lines 256 trimspool on pages 50
SQL> --
SQL> select ID, json_query(data_val, '$' PRETTY)
2 from JSON_DATA
3 /
ID
----------
JSON_QUERY(DATA_VAL,'$'PRETTY)
------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------
----------------
1
{
"class" :
[
{
"class_type" : "ownership",
"values" :
[
{
"nm" : "id",
"value" : "1"
}
]
},
{
"class_type" : "country",
"values" :
[
{
"nm" : "id",
"value" : "640"
}
]
},
{
"class_type" : "features",
"values" :
[
{
"nm" : "id",
"value" : "15"
},
{
"nm" : "id",
"value" : "20"
}
]
}
]
}
2
{
"class" :
[
{
"class_type" : "ownership",
"values" :
[
{
"nm" : "id",
"value" : "18"
}
]
},
{
"class_type" : "country",
"values" :
[
{
"nm" : "id",
"value" : "11"
}
]
},
{
"class_type" : "features",
"values" :
[
{
"nm" : "id",
"value" : "7"
},
{
"nm" : "id",
"value" : "640"
}
]
}
]
}
Execution Plan
----------------------------------------------------------
Plan hash value: 3213740116
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 4030 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| JSON_DATA | 2 | 4030 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
SQL> select ID, to_clob(data_val)
2 from json_data
3 where JSON_EXISTS(data_val,'$?(exists(#.class?(#.values.value == $VALUE && #.class_type == $TYPE)))' passing '640' as "VALUE", 'country' as "TYPE")
4 /
ID TO_CLOB(DATA_VAL)
---------- --------------------------------------------------------------------------------
1 {"class":[{"class_type":"ownership","values":[{"nm":"id","value":"1"}]},{"class_
type":"country","values":[{"nm":"id","value":"640"}]},{"class_type":"features","
values":[{"nm":"id","value":"15"},{"nm":"id","value":"20"}]}]}
Execution Plan
----------------------------------------------------------
Plan hash value: 3248304200
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2027 | 4 (0)| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID| JSON_DATA | 1 | 2027 | 4 (0)| 00:00:01 |
|* 2 | DOMAIN INDEX | JSON_SEARCH_IDX | | | 4 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(JSON_EXISTS2("DATA_VAL" FORMAT JSON , '$?(exists(#.class?(#.values.value
== $VALUE && #.class_type == $TYPE)))' PASSING '640' AS "VALUE" , 'country' AS "TYPE"
FALSE ON ERROR)=1)
2 - access("CTXSYS"."CONTAINS"("JSON_DATA"."DATA_VAL",'{640} INPATH
(/class/values/value) and {country} INPATH (/class/class_type)')>0)
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
SQL> select ID, TO_CLOB(DATA_VAL)
2 from JSON_DATA d
3 where exists (
4 select 1
5 from JSON_TABLE(
6 data_val,
7 '$.class'
8 columns (
9 CLASS_TYPE VARCHAR2(32) PATH '$.class_type',
10 NESTED PATH '$.values.value'
11 columns (
12 "VALUE" VARCHAR2(32) path '$'
13 )
14 )
15 )
16 where CLASS_TYPE = 'country' and "VALUE" = '640'
17 )
18 /
ID TO_CLOB(DATA_VAL)
---------- --------------------------------------------------------------------------------
1 {"class":[{"class_type":"ownership","values":[{"nm":"id","value":"1"}]},{"class_
type":"country","values":[{"nm":"id","value":"640"}]},{"class_type":"features","
values":[{"nm":"id","value":"15"},{"nm":"id","value":"20"}]}]}
Execution Plan
----------------------------------------------------------
Plan hash value: 1621266031
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2027 | 32 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL | JSON_DATA | 2 | 4054 | 3 (0)| 00:00:01 |
|* 3 | FILTER | | | | | |
|* 4 | JSONTABLE EVALUATION | | | | | |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( EXISTS (SELECT 0 FROM JSON_TABLE( :B1, '$.class' COLUMNS(
"CLASS_TYPE" VARCHAR2(32) PATH '$.class_type' NULL ON ERROR , NESTED PATH
'$.values.value' COLUMNS( "VALUE" VARCHAR2(32) PATH '$' NULL ON ERROR ) ) )
"P" WHERE "CTXSYS"."CONTAINS"(:B2,'({country} INPATH (/class/class_type))
and ({640} INPATH (/class/values/value))')>0 AND "P"."CLASS_TYPE"='country'
AND "P"."VALUE"='640'))
3 - filter("CTXSYS"."CONTAINS"(:B1,'({country} INPATH
(/class/class_type)) and ({640} INPATH (/class/values/value))')>0)
4 - filter("P"."CLASS_TYPE"='country' AND "P"."VALUE"='640')
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
SQL>
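The JSON_EXISTS plan above has two steps: a DOMAIN INDEX access using the loose CONTAINS condition, then a JSON_EXISTS filter on the rows it returns. The following Python sketch is a toy model of that idea only, not of Oracle's actual implementation. The inverted index layout and the names index, exact_predicate and search are assumptions made for illustration.

```python
import json
from collections import defaultdict

rows = {
    1: '{"class":[{"class_type":"ownership","values":[{"nm":"id","value":"1"}]},{"class_type":"country","values":[{"nm":"id","value":"640"}]},{"class_type":"features","values":[{"nm":"id","value":"15"},{"nm":"id","value":"20"}]}]}',
    2: '{"class":[{"class_type":"ownership","values":[{"nm":"id","value":"18"}]},{"class_type":"country","values":[{"nm":"id","value":"11"}]},{"class_type":"features","values":[{"nm":"id","value":"7"},{"nm":"id","value":"640"}]}]}',
}

# Toy inverted index: (path, token) -> ids of the rows holding that token under that path.
index = defaultdict(set)
for row_id, text in rows.items():
    for cls in json.loads(text)["class"]:
        index[("/class/class_type", cls["class_type"])].add(row_id)
        for v in cls["values"]:
            index[("/class/values/value", v["value"])].add(row_id)


def exact_predicate(text, class_type, value):
    """Post-filter: both conditions must hold within one element of "class"."""
    return any(
        cls["class_type"] == class_type
        and any(v["value"] == value for v in cls["values"])
        for cls in json.loads(text)["class"]
    )


def search(class_type, value):
    # Step 1: cheap index lookup; may contain false positives.
    candidates = (index.get(("/class/class_type", class_type), set())
                  & index.get(("/class/values/value", value), set()))
    # Step 2: exact check on the candidate rows only.
    confirmed = [r for r in sorted(candidates)
                 if exact_predicate(rows[r], class_type, value)]
    return sorted(candidates), confirmed


candidates, result = search("country", "640")
print(candidates)  # [1, 2]
print(result)      # [1]
```

The index narrows the search to a small candidate set, and the exact predicate removes the false positive (row 2) without scanning every row.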

Related

Find matches in JSON array field in MySQL

Given a JSON object type column in table t, e.g.
| id | obj                                 |
| -- | ----------------------------------- |
| 1  | { "params": { "id": [13, 23]} }     |
| 2  | { "params": { "id": [13, 24]} }     |
| 3  | { "params": { "id": [11, 23, 45]} } |
and a list of numeric values, e.g. [12, 23, 45].
For every record, we need to find which of its values appear in the given list.
So, the desired result would be
| id | matches |
| -- | -------- |
| 1 | [23] |
| 3 | [23, 45] |
Could someone please help with such a query for MySQL 8?
Thank you!
You can use json_table:
select t2.id, t2.n_obj
from (
  select t1.id,
         (select json_arrayagg(ids.v)
          from json_table(t1.obj, "$.params.id[*]" columns(v text path '$')) ids
          where json_contains('[12, 23, 45]', ids.v, '$')) n_obj
  from t t1
) t2
where t2.n_obj is not null;
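The logic of the query can be sketched in plain Python as a cross-check of the expected result: unnest each row's id array, keep the elements that occur in the given list, and drop rows with no hits. This is an illustration of the idea, not a replacement for the SQL. The function name matches is hypothetical.

```python
rows = {
    1: {"params": {"id": [13, 23]}},
    2: {"params": {"id": [13, 24]}},
    3: {"params": {"id": [11, 23, 45]}},
}
wanted = [12, 23, 45]


def matches(rows, wanted):
    """Return {row id: values of params.id that also occur in wanted}."""
    wanted_set = set(wanted)
    result = {}
    for row_id, obj in rows.items():
        hits = [v for v in obj["params"]["id"] if v in wanted_set]
        if hits:  # mirrors "where n_obj is not null"
            result[row_id] = hits
    return result


result = matches(rows, wanted)
print(result)  # {1: [23], 3: [23, 45]}
```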

How can I convert this Firebase database to SQL database?

I am using Firebase for my project. Firebase database looks like that:
{
myObjects:{
1:{
index: '1',
body: 'foo1'
},
2:{
index: '1',
body: 'foo2'
},
3:{
index: '2',
body: 'foo3'
},
},
objectIndex: 1
}
As shown above, I have a myObjects object and an objectIndex variable. I retrieve the myObjects entries whose index equals the objectIndex variable. The objectIndex variable increments every 3 days, and when it reaches 50 it resets to 0. Because it is dynamic, I could not store it in the table itself.
Now I want to convert my Firebase database to MySQL.
MySQL will look like this:
|----|------|-------|
| id | body | index |
|----|------|-------|
| 1 | foo1 | 1 |
|----|------|-------|
| 2 | foo2 | 1 |
|----|------|-------|
| 3 | foo3 | 2 |
|----|------|-------|
Where can I store the objectIndex variable?
I can update my table structure according to your suggestions.
Thanks in advance.
You can keep a one-row, one-column "objectIndex" table whose value is updated by a scheduled SQL job (a cron job, for instance).
You can then build a query with a Cartesian product that returns your data as follows:
|----|------|-------|-------------|
| id | body | index | objectIndex |
|----|------|-------|-------------|
| 1 | foo1 | 1 | 25 |
|----|------|-------|-------------|
| 2 | foo2 | 1 | 25 |
|----|------|-------|-------------|
| 3 | foo3 | 2 | 25 |
|----|------|-------|-------------|
It is redundant but gets the job done. The query to retrieve these values can be written as follows (index is a reserved word in MySQL, so it is quoted with backticks):
SELECT id, body, `index`, objectIndex
FROM objectTable, objectIndexTable;
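As a rough, runnable illustration of the single-row table idea, the sketch below uses Python's built-in sqlite3 module instead of MySQL, so the quoting of "index" differs from the backticks MySQL uses. The WHERE clause that keeps only the rows matching the current objectIndex, the helper names current_objects and advance_index, and the exact wrap-around rule (50 becomes 0) are assumptions based on the question, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE objectTable (id INTEGER PRIMARY KEY, body TEXT, "index" INTEGER);
CREATE TABLE objectIndexTable (objectIndex INTEGER NOT NULL);  -- always exactly one row
INSERT INTO objectTable VALUES (1, 'foo1', 1), (2, 'foo2', 1), (3, 'foo3', 2);
INSERT INTO objectIndexTable VALUES (1);
""")


def current_objects(conn):
    """Rows whose index equals the single stored objectIndex value."""
    return conn.execute("""
        SELECT o.id, o.body
        FROM objectTable AS o
        CROSS JOIN objectIndexTable AS i
        WHERE o."index" = i.objectIndex
        ORDER BY o.id
    """).fetchall()


def advance_index(conn):
    """What the scheduled job would run: increment, wrapping 50 back to 0 (assumed rule)."""
    conn.execute("""
        UPDATE objectIndexTable
        SET objectIndex = CASE WHEN objectIndex >= 50 THEN 0 ELSE objectIndex + 1 END
    """)


before = current_objects(conn)
advance_index(conn)
after = current_objects(conn)
print(before)  # [(1, 'foo1'), (2, 'foo2')]
print(after)   # [(3, 'foo3')]
```

Because the second table holds exactly one row, the cross join adds no extra rows; it only makes the current value available to the filter.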

MySQL - Search JSON data column

I have a MySQL database column that contains JSON arrays encoded as strings. I would like to search the JSON array for objects whose "Elapsed" value is greater than a particular number and return the TaskID value of each object in which such a value was found. I have been trying combinations of the JSON_SEARCH, JSON_CONTAINS, and JSON_EXTRACT functions, but I am not getting the desired results.
[
{
"TaskID": "TAS00000012344",
"Elapsed": "25"
},
{
"TaskID": "TAS00000012345",
"Elapsed": "30"
},
{
"TaskID": "TAS00000012346",
"Elapsed": "35"
},
{
"TaskID": "TAS00000012347",
"Elapsed": "40"
}
]
Referencing the JSON above, if I search for "Elapsed" > "30", then two records should be returned:
'TAS00000012346'
'TAS00000012347'
I am using MySQL version 5.7.11 and am new to querying JSON data. Any help would be appreciated. Thanks.
With MySQL pre-8.0, there is no easy way to turn a JSON array into a recordset (i.e., the function JSON_TABLE() is not yet available).
So, one way or another, we need to iterate through the array manually to extract the relevant pieces of data (using JSON_EXTRACT()). Here is a solution that uses an inline query to generate a list of numbers; another classic approach is to use a numbers table.
Assuming a table called mytable with a column called js holding the JSON content:
SELECT
JSON_EXTRACT(js, CONCAT('$[', n.idx, '].TaskID')) TaskID,
JSON_EXTRACT(js, CONCAT('$[', n.idx, '].Elapsed')) Elapsed
FROM mytable t
CROSS JOIN (
SELECT 0 idx
UNION ALL SELECT 1
UNION ALL SELECT 2
UNION ALL SELECT 3
) n
WHERE JSON_EXTRACT(js, CONCAT('$[', n.idx, '].Elapsed')) * 1.0 > 30
NB: in the WHERE clause, the * 1.0 operation is there to force the conversion to a number.
Demo on DB Fiddle with your sample data:
| TaskID | Elapsed |
| -------------- | ------- |
| TAS00000012346 | 35 |
| TAS00000012347 | 40 |
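The iteration the query performs can be sketched in Python to show one property of the approach: the generated list of numbers must cover the longest array, otherwise trailing elements are silently skipped. This is a model of the logic only. The function name tasks_over and the max_items parameter (standing in for the size of the inline numbers list) are hypothetical.

```python
import json

js = """[
  {"TaskID": "TAS00000012344", "Elapsed": "25"},
  {"TaskID": "TAS00000012345", "Elapsed": "30"},
  {"TaskID": "TAS00000012346", "Elapsed": "35"},
  {"TaskID": "TAS00000012347", "Elapsed": "40"}
]"""


def tasks_over(js_text, threshold, max_items=4):
    """Probe positions 0..max_items-1, like the cross join on the numbers list."""
    arr = json.loads(js_text)
    found = []
    for idx in range(max_items):
        if idx >= len(arr):
            continue  # out-of-range position: nothing to extract, row is filtered out
        if float(arr[idx]["Elapsed"]) > threshold:  # explicit cast, like "* 1.0"
            found.append(arr[idx]["TaskID"])
    return found


result = tasks_over(js, 30)
truncated = tasks_over(js, 30, max_items=3)
print(result)     # ['TAS00000012346', 'TAS00000012347']
print(truncated)  # ['TAS00000012346']
```

With only three generated positions the fourth task is missed, which is why the numbers list (or numbers table) should be at least as long as the largest array stored in the column.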
Yes, you can definitely do it using the JSON_EXTRACT() function in MySQL.
Let's take a table that contains JSON (table client_services here):
+-----+-----------+--------------------------------------+
| id  | client_id | service_values                       |
+-----+-----------+--------------------------------------+
| 100 |      1000 | { "quota": 1,"data_transfer":160000} |
| 101 |      1000 | { "quota": 2,"data_transfer":800000} |
| 102 |      1000 | { "quota": 3,"data_transfer":70000}  |
| 103 |      1001 | { "quota": 1,"data_transfer":97000}  |
| 104 |      1001 | { "quota": 2,"data_transfer":1760}   |
| 105 |      1002 | { "quota": 2,"data_transfer":1060}   |
+-----+-----------+--------------------------------------+
Now let's say we want the client_id of every row with quota > 1; then use this query:
SELECT
id,client_id,
JSON_EXTRACT(service_values, '$.quota') AS quota
FROM client_services
WHERE JSON_EXTRACT(service_values, '$.quota') > 1;
This returns:
+-----+-----------+-------+
| id  | client_id | quota |
+-----+-----------+-------+
| 101 |      1000 | 2     |
| 102 |      1000 | 3     |
| 104 |      1001 | 2     |
| 105 |      1002 | 2     |
+-----+-----------+-------+
Hope this helps!

PostgreSQL Creating JSON Column From Existing Data

Given data like the below:
+----+---------+------------+
| id | change  | date       |
+----+---------+------------+
|  1 | name    | 2018-06-20 |
|  2 | address | 2018-06-20 |
|  3 | email   | 2018-06-20 |
|  4 | email   | 2018-06-21 |
|  5 | address | 2018-06-22 |
|  6 | address | 2018-06-23 |
+----+---------+------------+
I'm trying to create a view that summarises the above into a single json column with data like:
{"name":["2018-06-20"], "address":["2018-06-20","2018-06-22","2018-06-23"], "email":["2018-06-20","2018-06-21"]}
I have been trying to figure it out using the array_agg, array_to_json, json_agg, and json_build_object functions, but I can't seem to get it quite right.
I hope someone can help.
Cheers
You should use jsonb aggregates twice for two levels:
select jsonb_pretty(jsonb_object_agg(change, dates))
from (
select change, jsonb_agg(date) as dates
from my_table
group by change
) s
jsonb_pretty
-----------------------
{ +
"name": [ +
"2018-06-20" +
], +
"email": [ +
"2018-06-20",+
"2018-06-21" +
], +
"address": [ +
"2018-06-20",+
"2018-06-22",+
"2018-06-23" +
] +
}
(1 row)
Note that jsonb_pretty() is unnecessary, used only for a nice output.
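The two aggregation levels can be mirrored in a few lines of Python as an illustration of the shape of the result: the inner step collects the dates per change (like jsonb_agg with GROUP BY), and the outer step is the single object keyed by change (like jsonb_object_agg). The function name summarise is hypothetical.

```python
import json

rows = [
    (1, "name", "2018-06-20"),
    (2, "address", "2018-06-20"),
    (3, "email", "2018-06-20"),
    (4, "email", "2018-06-21"),
    (5, "address", "2018-06-22"),
    (6, "address", "2018-06-23"),
]


def summarise(rows):
    """Group dates by change, then return one object keyed by change."""
    dates_by_change = {}
    for _id, change, date in rows:
        dates_by_change.setdefault(change, []).append(date)
    return dates_by_change


summary = summarise(rows)
print(json.dumps(summary))
```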

Hive - Create a nested Hive table from another non-nested Hive table

I have a hive table - Table A as follows:
id | partner | recent_use | count |
1 | ab | 20160101 | 5 |
1 | cd | 20160304 | 12 |
2 | ab | 20160205 | 1 |
2 | cd | 20150101 | 2 |
3 | ab | 20150401 | 4 |
From Table A, I want to end up with a table like this - Table B:
id | partner |
1 | [ ab : { recent_use:20160101, count:5 } , cd : { recent_use:20160304, count:12 } ]
2 | [ ab : { recent_use:20160205, count:1 } , cd : { recent_use:20150101, count:2 } ]
3 | [ ab : { recent_use:20150401, count:4 } ]
Basically, Table B is a nested version of Table A such that for a given id, all the data from each of its partner is grouped into one column.
I have two questions:
How can I create Table B from Table A?
How can I convert Table B into a JSON document such that I can load the document into any NOSQL DB?
Would really appreciate any help on this. Thanks!
A simple way to achieve this is to use a UDAF (user-defined aggregation function); writing a custom function keeps things simple. Alternatively, here is something you can do using built-in functions. Give it a try.
select id,
       CONCAT("[",
              concat_ws(',', collect_set(CONCAT('"', partner, '":{ "recent_use":', recent_use, ', "count":', count, "}"))),
              "]") as collJ
from tableA
group by id
The SQL above returns id and collJ as a string in the shape you are looking for; after that, you can use the get_json_object function to work with it as a JSON object.
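For the second question (producing documents that a NoSQL store can load), the nesting step can be sketched in Python. One caveat that seems worth flagging: a list of "key": value pairs inside square brackets is not valid JSON, so this sketch nests the partners in an object (curly braces) instead. The function name nest and the output field names "id" and "partner" are assumptions taken from the Table B layout in the question.

```python
import json

# Rows of Table A: (id, partner, recent_use, count)
rows = [
    (1, "ab", 20160101, 5),
    (1, "cd", 20160304, 12),
    (2, "ab", 20160205, 1),
    (2, "cd", 20150101, 2),
    (3, "ab", 20150401, 4),
]


def nest(rows):
    """Group rows by id; each partner becomes a key holding its own fields."""
    nested = {}
    for id_, partner, recent_use, count in rows:
        nested.setdefault(id_, {})[partner] = {"recent_use": recent_use, "count": count}
    return nested


nested = nest(rows)
# One JSON document per id, ready to be written out line by line.
documents = [json.dumps({"id": id_, "partner": partners})
             for id_, partners in nested.items()]
for doc in documents:
    print(doc)
```

Each printed line parses back with any standard JSON parser, which is a quick way to check whatever the Hive query emits before loading it elsewhere.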
Reference
https://www.qubole.com/resources/cheatsheet/hive-function-cheat-sheet/
https://cwiki.apache.org/confluence/display/Hive/GenericUDAFCaseStudy