Apache Drill: Two Lateral Joins with Array Fails Only With Storage Plugins

I've hit a bit of a wall with Apache Drill. When the JSON is supplied inline as a string via convert_fromJSON, the following query works (you can paste it into a Drill query window as-is):
SELECT
  t_items.item['secName'] AS SecurityName
FROM
  --dfs.tmp.`4.json` a,
  (SELECT convert_fromjson('{
    "data": {
      "RefAccountAll": [
        {
          "valuations": [
            {
              "securityValuations": [{"secName":"abc"},{"secName":"def"}]
            }]
        }]
    }
  }') AS a) t,
  LATERAL
  (SELECT c1.v.valuations AS vals1
   FROM UNNEST(t.a.data.RefAccountAll) c1(v)
  ) t_orders,
  LATERAL
  (SELECT * FROM UNNEST(t_orders.vals1.securityValuations) _items(item)) t_items
If I run the same query against the same JSON stored in a file (dfs.tmp.`4.json`), Drill returns the following error:
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: UnsupportedOperationException: Unable to get new vector for minor type [LATE] and mode [OPTIONAL]
With the JSON in a file I can run other queries without problems; only this query fails. In the failing case, c1.v.valuations comes back as [], but when I run the query with the JSON inline, c1.v.valuations has the expected value.
Any ideas or assistance? Your help would be much appreciated!
Thanks,
Ron

What seems to be happening here is that Drill uses different reader settings for the convert_fromJSON function than it does when reading the raw JSON file.
I'm wondering, however, whether there might be a better way to query this data using Drill's native nested-data functions (such as FLATTEN) rather than the joins. Do you have a small sample file you could share?
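
To pin down what the two LATERAL/UNNEST steps are meant to return, here is a minimal sketch in plain Python. It is not Drill code and says nothing about why the file-based read fails; it only walks the same nesting (RefAccountAll, then valuations, then securityValuations) over the sample document from the question. The function name security_names is hypothetical.

```python
import json

# The same document the question embeds in convert_fromJSON.
SAMPLE = """
{
  "data": {
    "RefAccountAll": [
      {
        "valuations": [
          {
            "securityValuations": [{"secName": "abc"}, {"secName": "def"}]
          }
        ]
      }
    ]
  }
}
"""


def security_names(doc):
    """Flatten accounts -> valuations -> securityValuations into secName values."""
    names = []
    for account in doc["data"]["RefAccountAll"]:
        # Missing or empty arrays simply contribute no rows.
        for valuation in account.get("valuations", []):
            for item in valuation.get("securityValuations", []):
                names.append(item["secName"])
    return names


if __name__ == "__main__":
    print(security_names(json.loads(SAMPLE)))
```

Comparing this list with what Drill returns for the file-based query can help confirm whether the intermediate valuations array is really arriving empty.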

Related

MySQL using JSON_ARRAYAGG in a SELECT IN clause?

Our database solution is very JSON heavy, and as such, our SQL queries are all JSON based (for the most part). This includes extensive use of JSON_ARRAYAGG().
The problem I'm encountering is using a returned array of indexes in WHERE IN, which simply doesn't work. From what I can tell it's a simple formatting issue where MySQL wants an () encapsulation and a JSON array is a [] encapsulation.
For example:
SELECT COUNT(si.ID) AS item_count, JSON_ARRAYAGG(si.ID) AS item_array
FROM sourcing_item si;
Returns:
7, [1,2,3,4,5,6,7]
What I need to do is write a complex nested query that allows for selecting record IDs that are IN the JSON_ARRAYAGG result. Like:
SELECT si.item_name
FROM sourcing_item si
WHERE si.ID IN item_array
Of course the above doesn't work because MySQL doesn't recognize [] vs. ().
Is there a viable workaround for this issue? I'm surprised they haven't updated MySQL to allow the WHERE IN clause to work with a JSON array...
The MEMBER OF operator does this (it is available in MySQL 8.0.17 and later):
SELECT si.item_name
FROM sourcing_item si
WHERE si.ID MEMBER OF (item_array)
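
On servers where MEMBER OF is not available, one possible application-side workaround is to parse the JSON array and bind its elements as an ordinary IN list. The sketch below assumes that approach; it uses Python's built-in sqlite3 module purely as a stand-in so that it runs anywhere (a MySQL driver would typically use %s placeholders instead of ?), and the helper name in_clause_from_json is hypothetical.

```python
import json
import sqlite3


def in_clause_from_json(json_array_text, placeholder="?"):
    """Turn a JSON array such as '[1,2,3]' into ('(?, ?, ?)', [1, 2, 3])."""
    values = json.loads(json_array_text)
    if not isinstance(values, list):
        raise ValueError("expected a JSON array")
    if not values:
        # "IN ()" is not valid SQL; "IN (NULL)" never matches any row.
        return "(NULL)", []
    return "(" + ", ".join([placeholder] * len(values)) + ")", values


def demo():
    # Stand-in table shaped like the one in the question.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sourcing_item (ID INTEGER PRIMARY KEY, item_name TEXT)")
    conn.executemany(
        "INSERT INTO sourcing_item VALUES (?, ?)",
        [(i, f"item-{i}") for i in range(1, 8)],
    )
    clause, params = in_clause_from_json("[2, 5, 7]")
    rows = conn.execute(
        f"SELECT si.item_name FROM sourcing_item si WHERE si.ID IN {clause} ORDER BY si.ID",
        params,
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(demo())
```

Only the placeholders are interpolated into the SQL text; the values themselves are passed as bound parameters.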

Exporting MySql table into JSON Object

How can I convert a MySQL query result into a JSON object at the database level?
For example,
SELECT json_array(
group_concat(json_object('name', name, 'email', email)))
FROM ....
it will produce the result as
[
{
"name": "something",
"email": "someone@somewhere.net"
},
{
"name": "someone",
"email": "something@someplace.com"
}
]
But what I need is to supply my own query, which may contain functions, subqueries, etc.
In Postgres, for example, select row_to_json(select name,email,getcode(branch) from .....) returns the whole result as a JSON object.
Is there any way to do something like this in MySQL?
select jsonArray(select name,email,getcode(branch) from .....)
In the official MySQL 5.7 and 8 documentation I only found support for casting to the JSON type, along with the JSON_ARRAY function (MySQL 5.7 and 8) and the JSON_ARRAYAGG function (MySQL 8, and 5.7.22 onward). Please see the full JSON functions reference here.
This means there is no easy built-in MySQL solution to the problem.
Fortunately, our colleagues started a similar discussion here. Maybe you can find your solution there.
For anyone looking for JSON casting with well-defined attributes, the solution is here.
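
If doing this at the database level turns out to be impractical, one fallback is to run the arbitrary query as-is and build the JSON array in application code from the cursor's column names. The sketch below assumes that fallback; it uses Python's built-in sqlite3 module only as a runnable stand-in for MySQL, and the users table and the query_to_json helper are hypothetical.

```python
import json
import sqlite3


def query_to_json(conn, sql, params=()):
    """Run any SELECT and return its rows as a JSON array of objects."""
    cur = conn.execute(sql, params)
    # cursor.description holds one entry per result column; item 0 is its name.
    columns = [col[0] for col in cur.description]
    return json.dumps([dict(zip(columns, row)) for row in cur.fetchall()])


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("something", "someone@somewhere.net"), ("someone", "something@someplace.com")],
    )
    # The query may use functions or subqueries; only the result columns matter.
    result = query_to_json(
        conn, "SELECT name, lower(email) AS email FROM users ORDER BY name"
    )
    conn.close()
    return result


if __name__ == "__main__":
    print(demo())
```

Giving every computed column an alias (AS email above) keeps the JSON keys predictable.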

Postgres 10, select from json object containing json array of objects

In the database I have a text field that contains JSON with the following structure:
"Limits":{
"fields":[
{
"key":"DAILY_LIMIT",
"value":"1559",
"lastModified":1543857829148
},
{
"key":"MONTHLY_LIMIT",
"value":"25590",
"lastModified":1543857829148
}
]
}
I need to check whether DAILY_LIMIT exists. That's easy to do with LIKE '%DAILY_LIMIT%', but performance is not great, and I wouldn't have access to the value (I don't need it right now, but I may in the future). Is there a way to check whether this key exists without killing the database? I tried 'Limits'->'fields'-> but I don't know what should come next. It must be done in the query; I can't pass the object to the backend and check it there.
demo: db<>fiddle
If you want to do it the JSON way this could be a solution:
WITH data AS (
SELECT 'somedata' as somedata, '{"Limits":{"fields":[{"key":"DAILY_LIMITS","value":"1559","lastModified":1543857829148},{"key":"MONTHLY_LIMIT","value":"25590","lastModified":1543857829148}]}}'::jsonb as data
)
SELECT
d.*
FROM data d, jsonb_array_elements(data -> 'Limits' -> 'fields')
WHERE value ->> 'key' = 'DAILY_LIMITS'
jsonb_array_elements expands the array into one row per element. In the next step you can check the key's value.
But the demo shows that a simple LIKE would be much faster, as @404 correctly mentioned (have a look at the costs of both examples).
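
To illustrate what the expand-then-filter query does (not as a replacement for it, since the check has to stay in SQL), here is a small Python sketch of the same two steps: each element of Limits.fields becomes its own row, and rows are then filtered on their key. The function name rows_matching_key is hypothetical, and the sample uses the DAILY_LIMIT key from the question.

```python
import json


def rows_matching_key(records, wanted_key):
    """Mimic jsonb_array_elements plus a WHERE on value ->> 'key'."""
    matches = []
    for somedata, raw_json in records:
        doc = json.loads(raw_json)
        # Step 1: one "row" per element of the fields array.
        for element in doc.get("Limits", {}).get("fields", []):
            # Step 2: keep only the rows whose key matches.
            if element.get("key") == wanted_key:
                matches.append((somedata, element))
    return matches


RECORDS = [
    (
        "somedata",
        '{"Limits":{"fields":['
        '{"key":"DAILY_LIMIT","value":"1559","lastModified":1543857829148},'
        '{"key":"MONTHLY_LIMIT","value":"25590","lastModified":1543857829148}]}}',
    )
]

if __name__ == "__main__":
    for somedata, element in rows_matching_key(RECORDS, "DAILY_LIMIT"):
        print(somedata, element["value"])
```

Unlike the LIKE approach, the matching element's value is available directly from the expanded row.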

select distinct values from JSON using django-mysql

I'm using this library and my model looks like this:
class PhoneTest(Model):
    data = JSONField()
My JSON obj looks something like this (in a real obj there are way more fields):
{
"deviceStatus": true,
"officerCode": 123456,
"imei": 123456789123456
}
For instance, I want to get a list of all officerCodes. How do I do that? Nothing I've tried so far has worked. For example, this did not:
tests = PhoneTest.objects.all()
tests.distinct('data__mOfficerCode')
It gives me the following error:
NotSupportedError: DISTINCT ON fields is not supported by this database backend
I think that's because I'm using this new library rather than the native Django MySQL backend. What are possible workarounds?
I would greatly appreciate any help.
You can use the values_list method:
PhoneTest.objects.all().values_list('data__mOfficerCode').distinct()

Read/Write json objects from Postgres database through cache

I'm working with read-through/write-through caching via Apache Ignite and ran into the following problem:
Recent versions of Postgres have a dedicated json data type, plus jsonb, for working with JSON in the database.
As far as I know, these features are not implemented in Apache Ignite. Moreover, while working on this, I found that it is possible to read JSON from the database as a PGobject, but there is no way to add jsonb processing to the built-in SQL query parser.
For example, I'm trying to send the following query:
SELECT jdata->>'tag1' FROM jsontest;
And get exception:
Syntax error in SQL statement "SELECT JDATA-[*]>>'tag1' FROM JSONTEST; "; SQL statement:
SELECT jdata->>'tag1' FROM jsontest; [42000-195]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:345)
at org.h2.message.DbException.get(DbException.java:179)
at org.h2.message.DbException.get(DbException.java:155)
at org.h2.message.DbException.getSyntaxError(DbException.java:191)
at org.h2.command.Parser.getSyntaxError(Parser.java:533)
at org.h2.command.Parser.getSpecialType(Parser.java:3842)
at org.h2.command.Parser.read(Parser.java:3352)
at org.h2.command.Parser.readIf(Parser.java:3259)
at org.h2.command.Parser.readSum(Parser.java:2375)
at org.h2.command.Parser.readConcat(Parser.java:2341)
at org.h2.command.Parser.readCondition(Parser.java:2172)
at org.h2.command.Parser.readAnd(Parser.java:2144)
at org.h2.command.Parser.readExpression(Parser.java:2136)
at org.h2.command.Parser.parseSelectSimpleSelectPart(Parser.java:2047)
at org.h2.command.Parser.parseSelectSimple(Parser.java:2079)
at org.h2.command.Parser.parseSelectSub(Parser.java:1934)
at org.h2.command.Parser.parseSelectUnion(Parser.java:1749)
at org.h2.command.Parser.parseSelect(Parser.java:1737)
at org.h2.command.Parser.parsePrepared(Parser.java:448)
at org.h2.command.Parser.parse(Parser.java:320)
at org.h2.command.Parser.parse(Parser.java:296)
at org.h2.command.Parser.prepareCommand(Parser.java:257)
at org.h2.engine.Session.prepareLocal(Session.java:573)
at org.h2.engine.Session.prepareCommand(Session.java:514)
at org.h2.jdbc.JdbcConnection.prepareCommand(JdbcConnection.java:1204)
at org.h2.jdbc.JdbcPreparedStatement.<init>(JdbcPreparedStatement.java:73)
at org.h2.jdbc.JdbcConnection.prepareStatement(JdbcConnection.java:288)
at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.prepareStatement(IgniteH2Indexing.java:402)
at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.queryDistributedSqlFields(IgniteH2Indexing.java:1365)
... 9 more
When I looked into the internals of Ignite and the H2 database engine, I found no way to add handling for such queries.
So, maybe someone has run into this or a similar problem and can advise how to solve it?