Snowflake - Querying Nested JSON

I need help querying a JSON file I've ingested into a temp table in Snowflake. I've created a JSON_DATA variant column and plan to query it and then COPY INTO another table, but my query isn't working yet. I feel I'm close.
JSON layout:
{
"nextPage": "01",
"page": "0",
"status": "ok",
"transactions": [
{
"id": "65985",
"recordTp": "vendorbill",
"values": {
"account": [
{
"text": "14500 Deferred Expenses",
"value": "249"
}
],
"account.number": "1450",
"account.type": [
{
"text": "Deferred Expense",
"value": "DeferExpense"
}
],
"amount": "51733",
"classnohierarchy": [
{
"text": "901 Corporate",
"value": "139"
}
],
"currency": [
{
"text": "Canadian Dollar",
"value": "3"
}
],
"customer.altname": "V Sties expenses (Tor)",
"customer.custate": "12/31/2019",
"customer.custentient": "ada Inc.",
"customer.custendate": "1/1/2019",
"customer.entyid": "PR781",
"departmentnohierarchy": [
{
"text": "8rity",
"value": "37"
}
],
"fxamount": "689",
"location": [
{
"text": "Othad Projects",
"value": "48"
}
],
"postingperiod": [
{
"text": "Jan 2020",
"value": "1"
}
],
"subsidiary.custrecord_region": [
{
"text": "CANADA",
"value": "3"
}
],
"subsidiarynohierarchy": [
{
"text": "ada Inc.",
"value": "25"
}
]
}
},
I've been able to query the values that are not (deeply) nested, but I need help getting, for example, the values from 'classnohierarchy'. To get both the 'text' and the 'value' I tried:
transactions.value:"values".classnohierarchy.text::string as class_txt,
transactions.value:"values".classnohierarchy.value::string as class_val,
but it's returning NULL values.
Below is my entire query:
SELECT
JSON_DATA:status::string as connection_status,
transactions.value:id::string as id,
transactions.value:recordType::string as record_type,
transactions.value:"values"::variant as trans_val,
transactions.value:"values".account as acc,
transactions.value:"values".account.text as text,
transactions.value:"values".account.value as val,
transactions.value:"values"."account.number"::string as acc_num,
transactions.value:"values"."account.type".text::string as acc_type_txt,
transactions.value:"values"."account.type".value::string as acc_type_val,
transactions.value:"values".amount::string as amount,
transactions.value:"values".classnohierarchy.text::string as class_txt,
transactions.value:"values".classnohierarchy.value::string as class_val,
transactions.value:"values".currency.text::string as currency_text,
transactions.value:"values".currency.value::string as currency_val,
transactions.value:"values"."customer.altname"::string as customer_project_name,
transactions.value:"values"."customer.custate"::string as customer_end_date,
transactions.value:"values"."customer.custentient"::string as customer_end_client,
transactions.value:"values"."customer.custendate"::string as customer_start_date,
transactions.value:"values"."customer.entyid"::string as customer_project_id,
transactions.value:"values".departmentnohierarchy.text::string as department_name,
transactions.value:"values".departmentnohierarchy.value::string as department_value,
transactions.value:"values".fxamount::string as fx_amount,
transactions.value:"values".location.text::string as product_name,
transactions.value:"values".postingperiod.text::string as postingperiod,
transactions.value:"values".postingperiod.value::string as postingperiod,
transactions.value:"values"."subsidiary.custrecord_region".text::string as region_name,
transactions.value:"values"."subsidiary.custrecord_region".value::string as region_value,
transactions.value:"values".subsidiarynohierarchy.text::string as entity_name,
transactions.value:"values".subsidiarynohierarchy.value::string as entity_value,
FROM MY_TABLE,
LATERAL FLATTEN (JSON_DATA:transactions) as transactions
And here's a picture of what's showing in Snowflake:
SNOWFLAKE_SCREENSHOT

departmentnohierarchy is an array, so you need to specify the index, as below. The same applies to the other array-valued keys, such as classnohierarchy.
select *,transactions.VALUE:"values".departmentnohierarchy[0].value::text as department_name
FROM jsont1,
LATERAL FLATTEN (JSON_DATA:transactions) as transactions
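
To see why the un-indexed path comes back as NULL, here is a minimal Python sketch that mimics the path traversal on a trimmed copy of the sample record. The helper `get_path` is hypothetical and only approximates how a key lookup applied to an array yields NULL; it is not Snowflake code.

```python
# Trimmed copy of one element of "transactions" from the question.
transaction = {
    "id": "65985",
    "values": {
        "amount": "51733",
        "classnohierarchy": [{"text": "901 Corporate", "value": "139"}],
    },
}


def get_path(node, *steps):
    """Hypothetical helper: walk keys and indexes, returning None (NULL) on a mismatch."""
    for step in steps:
        if isinstance(step, int) and isinstance(node, list) and step < len(node):
            node = node[step]
        elif isinstance(step, str) and isinstance(node, dict) and step in node:
            node = node[step]
        else:
            return None
    return node


# Key lookup applied directly to the array: nothing matches, so the result is None.
without_index = get_path(transaction, "values", "classnohierarchy", "text")

# Index into the array first, then read the key.
class_txt = get_path(transaction, "values", "classnohierarchy", 0, "text")
class_val = get_path(transaction, "values", "classnohierarchy", 0, "value")

print(without_index, class_txt, class_val)
```

In the query itself, the equivalent change is to put `[0]` before `.text` or `.value` on each array-valued key.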


Change subelement with jq

I have a structure that looks like so
[
[
{
"ID": "grp1-001",
},
{
"ID": "grp1-002",
},
{
"ID": "grp1-003",
},
{
"ID": "grp1-004",
},
{
"ID": "grp1-005",
},
{
"ID": "grp1-006",
}
],
[
{
"ID": "grp2-001",
},
{
"ID": "grp2-002",
},
{
"ID": "grp2-003",
},
{
"ID": "grp2-004",
},
{
"ID": "grp2-005",
},
{
"ID": "grp2-006",
}
.......
what I need to get as a result of the modification is this
[
[
["1", "grp1-001"],
["2", "grp1-002"],
["3", "grp1-003"],
["4", "grp1-004"],
["5", "grp1-005"],
["6", "grp1-006"],
],
[
["1", "grp2-001"],
["2", "grp2-002"],
["3", "grp2-003"],
["4", "grp2-004"],
["5", "grp2-005"],
["6", "grp2-006"],
],
This means I need to keep the outer structure (the outside array and the internal grouping) but convert each inner object to an array and replace the "ID" key with a value (which will come from an external source such as --argjson). I am not even sure how to start; any ideas or resources are highly appreciated.
Assuming you're just taking the objects and transforming them to pairs of the index in the array and the ID value, you could do this:
map([to_entries[] | [.key + 1, .value.ID | tostring]])
https://jqplay.org/s/RBac7SPfdG
Using to_entries/0 on an array gives you an array of key/value (index/value) pairs. You could then shift the indices by 1 and convert to strings.
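
For readers more comfortable outside jq, the same transformation can be sketched in Python. This is only an illustration of the logic, assuming the input has already been parsed into nested lists of objects; the function name `to_pairs` and the shortened sample data are made up for the example.

```python
import json

# Shortened sample in the same shape as the question: groups of {"ID": ...} objects.
groups = [
    [{"ID": "grp1-001"}, {"ID": "grp1-002"}, {"ID": "grp1-003"}],
    [{"ID": "grp2-001"}, {"ID": "grp2-002"}],
]


def to_pairs(groups):
    """Replace each object with a [position, ID] pair, keeping the outer grouping."""
    return [
        # enumerate(..., start=1) plays the role of to_entries plus ".key + 1".
        [[str(index), str(item["ID"])] for index, item in enumerate(group, start=1)]
        for group in groups
    ]


pairs = to_pairs(groups)
print(json.dumps(pairs))
```

Both the position and the ID are converted to strings, matching the quoted numbers in the desired output.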

AWS Athena - Querying JSON - Searching for Values

I have nested JSON files on S3 and am trying to query them with Athena.
However, I am having trouble querying the nested JSON values.
My JSON file looks like this:
{
"id": "17842007980192959",
"acount_id": "17841401243773780",
"stats": [
{
"name": "engagement",
"period": "lifetime",
"values": [
{
"value": 374
}
],
"title": "Engagement",
"description": "Total number of likes and comments on the media object",
"id": "17842007980192959/insights/engagement/lifetime"
},
{
"name": "impressions",
"period": "lifetime",
"values": [
{
"value": 11125
}
],
"title": "Impressions",
"description": "Total number of times the media object has been seen",
"id": "17842007980192959/insights/impressions/lifetime"
},
{
"name": "reach",
"period": "lifetime",
"values": [
{
"value": 8223
}
],
"title": "Reach",
"description": "Total number of unique accounts that have seen the media object",
"id": "17842007980192959/insights/reach/lifetime"
},
{
"name": "saved",
"period": "lifetime",
"values": [
{
"value": 0
}
],
"title": "Saved",
"description": "Total number of unique accounts that have saved the media object",
"id": "17842007980192959/insights/saved/lifetime"
}
],
"import_date": "2017-12-04"
}
What I'm trying to do is to query the "stats" field value where name=impressions.
So ideally something like:
SELECT id, account_id, stats.values.value WHERE stats.name='engagement'
AWS example: https://docs.aws.amazon.com/athena/latest/ug/searching-for-values.html
Any help would be appreciated.
You can query the JSON with the following table definition:
CREATE EXTERNAL TABLE test(
id string,
acount_id string,
stats array<
struct<
name:string,
period:string,
values:array<
struct<value:string>>,
title:string
>
>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://bucket/';
Now, the value column is available through the following unnesting:
select id, acount_id, stat.name,x.value
from test
cross join UNNEST(test.stats) as st(stat)
cross join UNNEST(stat."values") as valx(x)
WHERE stat.name='engagement';
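
As a rough mental model of what the two UNNEST steps do, here is a small Python sketch that flattens a shortened copy of the sample record into one row per stat and value, then applies the same filter. The function name `unnest_stats` is hypothetical, and this only illustrates the row shape, not how Athena executes the query.

```python
# Shortened copy of the sample document from the question.
record = {
    "id": "17842007980192959",
    "acount_id": "17841401243773780",
    "stats": [
        {"name": "engagement", "period": "lifetime", "values": [{"value": 374}]},
        {"name": "impressions", "period": "lifetime", "values": [{"value": 11125}]},
    ],
}


def unnest_stats(rec):
    """Yield one flat row per (stat, value) pair, like the two CROSS JOIN UNNESTs."""
    for stat in rec["stats"]:          # first UNNEST: one row per element of stats
        for x in stat["values"]:       # second UNNEST: one row per element of values
            yield {
                "id": rec["id"],
                "acount_id": rec["acount_id"],
                "name": stat["name"],
                "value": x["value"],
            }


# Equivalent of WHERE stat.name = 'engagement'.
rows = [row for row in unnest_stats(record) if row["name"] == "engagement"]
print(rows)
```

Changing the filter value to "impressions" selects the other stat in the same way.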

ElasticSearch filtered query with operator AND and OR

I'm working on an existing app that interacts with an Elasticsearch server, and I'm seeing some weird responses, probably because I'm new to Elasticsearch.
I have the indexed item below:
"_id": "59773d268770541557000012",
"_score": 0.03282923,
"_source": {
"_id": "59773d268770541557000012",
"active": null,
"address": "dummy address",
"center_ids": [],
"consultation_site_ids": [],
"coordinates": null,
"created_at": "2017-07-25T14:44:22.270+02:00",
"death_declaration_form_step_id": "56ddb086f0e0103b44000000",
"end_of_pregnancy_form_step_id": "56c34e63f0e0105e65000000",
"fax": "06.95.40.58.84",
"form_step_ids": [
"55361b215342491667030000",
"5541f16252f131f6a125a375",
"55361ba05342491667040000",
"553610835342491667010000",
"55361d225342491667050000",
"5541f34a52f131f6a125a377"
],
"hospital_id": "57c004905c5393772c002a62",
"name": "test site d'encronologie",
"phone": "06.95.40.58.84",
"short_name": "test site d'encronologie d'endcronologie",
"sites_union_ids": [],
"state": "active",
"updated_at": "2017-07-25T14:44:22.270+02:00",
"url": "http://www.testurl.com",
"user_ids": [],
"warnings_threshold": null,
"_type": "Site
And I am querying the server with this query:
"query":{
"filtered":{
"query":{
"bool":{
"should":[
{
"multi_match":{
"fields":[
"name^5",
"name.edge^1",
"name.full^0.3"
],
"query":"enc",
"type":"cross_fields"
}
},
{
"match":{
"name":{
"query":"enc",
"type":"phrase_prefix",
"operator":"or"
}
}
},
{
"match":{
"name":{
"query":"enc",
"type":"boolean",
"boost":5
}
}
}
]
}
},
"filter":{
"and":[
{
"term":{
"hospital_id":"57c004905c5393772c002a62"
}
},
{
"term":{
"state":"active"
}
}
]
}
}
}}
Which returns nothing (no hits).
On the other hand, if I change the filter operator "AND" to "OR", I receive my 1 hit.
I am talking about the "and" on the "filter" branch:
"filter":{
"and":[
I really don't understand why OR works but AND does not.
Then again, when I change my query term from "enc" to "zzz_enc" in every query{} of the first branch while keeping the "OR", I get zero matches, even though the filter conditions on hospital_id and state are true for my item.
Why does the filter operator behave like this?
Thank you in advance.

Expand the linked list column in OrientDb

I'm using OrientDB 2.1.12 (DocumentDB) and having trouble expanding a linked list column.
Result of my OrientDB query:
{
"result": [
{
"#type": "d",
"#rid": "#28:0",
"#version": 7,
"#class": "testSuite",
"testSuiteName": "web",
"testCaseLink": [
"#20:0",
"#20:1",
"#20:2",
"#20:3",
"#20:4",
"#20:5"
],
"testingType": "Web",
"#fieldTypes": "testCaseLink=z"
}
],
"notification": "Query executed in 0.061 sec. Returned 1 record(s)"
}
testCaseLink is a linked-list property whose values are rids of another class. The query used to obtain the above result: select * from testSuite
Expected output:
{
"result": [
{
"#type": "d",
"#rid": "#28:0",
"#version": 7,
"#class": "testSuite",
"testSuiteName": "web",
"testCaseLink": [
{
"#type": "d",
"#rid": "#20:0",
"#version": 5,
"#class": "testCase",
"name": "testForBAsu",
"uiJson": "#18:0",
"testcaseType": "webWithCsv",
"isEdited": false,
"isDeleted": false,
"childtestCaseLink": [
"#20:3",
"#20:4"
],
"#fieldTypes": "uiJson=x,childtestCaseLink=z"
},
{
"#type": "d",
"#rid": "#20:1",
"#version": 6,
"#class": "testCase",
"name": "success",
"uiJson": "#18:1",
"testcaseType": "WebWithoutCsv",
"isEdited": true,
"isDeleted": false,
"eeJson": "#19:0",
"parentTestCaseLink": null,
"#fieldTypes": "uiJson=x,eeJson=x,parentTestCaseLink=x"
},
"#20:2",
"#20:3",
"#20:4",
"#20:5"
],
"testingType": "Web",
"#fieldTypes": "testCaseLink=z"
}
],
"notification": "Query executed in 0.061 sec. Returned 1 record(s)"
}
I need to expand the rids present in the list.
I tried: select testSuiteName,testingType,Expand(testCaseLink) from testSuite where testSuiteName='web'
But the query expands only testCaseLink. Note: testCaseLink contains rids of a different class.
You could use
select from testSuite FETCHPLAN *:1
Check documentation for more information.
Hope it helps.

Read JSON File for Records using Linq

Following is my JSON file. I have to get the fields listed for each page and each Type as a comma-separated string. Please help with how to proceed using LINQ.
Example: if I want "Type = customFields" defined for "page1", I need the output as a comma-separated string: ProjectID,EmployeeID,EmployeeName,HasExpiration, etc.
{
"Fields": {
"Pages": {
"Page": {
"-Name": "page1",
"Type": [
{
"-TypeID": "CUSTOMIZEDFIELDS",
"Field": [
"ProjectID",
"EmployeeID",
"EmployeeName",
"HasExpiration",
"EndDate",
"OTStrategy",
"Division",
"AddTimesheets",
"SubmitTimesheets",
"ManagerTimesheetApprovalRequired",
"OTAllowed",
"AddExpenses",
"SubmitExpenses",
"ManagerExpenseApprovalRequired",
"SendApprovalEmails"
]
},
{
"-TypeID": "CFDATASET",
"Field": [
"ProjectID",
"EmployeeID",
"EmployeeName",
"HasExpiration",
"EndDate",
"OTStrategy",
"Division",
"AddTimesheets",
"SubmitTimesheets",
"ManagerTimesheetApprovalRequired",
"OTAllowed",
"AddExpenses",
"SubmitExpenses",
"ManagerExpenseApprovalRequired",
"SendApprovalEmails"
]
},
{
"-TypeID": "CFDATASETCAPTION",
"Field": [
"ProjectID",
"EmployeeID",
"EmployeeName",
"HasExpiration",
"EndDate",
"OTStrategy",
"Division",
"AddTimesheets",
"SubmitTimesheets",
"ManagerTimesheetApprovalRequired",
"OTAllowed",
"AddExpenses",
"SubmitExpenses",
"ManagerExpenseApprovalRequired",
"SendApprovalEmails"
]
}
]
}
}
}
}
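
The lookup described above can be sketched as follows. This is written in Python rather than LINQ, purely to show the shape of the traversal, and uses a shortened copy of the sample data. The function name `fields_csv` is hypothetical, and the handling of "Page" as either a single object (as in the sample) or a list of objects is an assumption about how multiple pages would be represented.

```python
# Shortened copy of the sample document from the question.
document = {
    "Fields": {
        "Pages": {
            "Page": {
                "-Name": "page1",
                "Type": [
                    {"-TypeID": "CUSTOMIZEDFIELDS",
                     "Field": ["ProjectID", "EmployeeID", "EmployeeName"]},
                    {"-TypeID": "CFDATASET",
                     "Field": ["ProjectID", "EmployeeID"]},
                ],
            }
        }
    }
}


def fields_csv(doc, page_name, type_id):
    """Return the Field names for one page and one TypeID as a comma-separated string."""
    page = doc["Fields"]["Pages"]["Page"]
    # Assumption: several pages would appear as a list; the sample has a single object.
    pages = page if isinstance(page, list) else [page]
    for p in pages:
        if p["-Name"] != page_name:
            continue
        for entry in p["Type"]:
            if entry["-TypeID"] == type_id:
                return ",".join(entry["Field"])
    return ""  # no matching page or type


print(fields_csv(document, "page1", "CUSTOMIZEDFIELDS"))
```

The same filter-then-join shape would carry over to a LINQ query once the JSON has been parsed into objects.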