I need some help querying a JSON file I've ingested into a temp table in Snowflake. I've created a JSON_DATA variant column and plan to query it and then COPY INTO another table, but my query isn't working yet. I feel I'm close (possibly?).
JSON layout:
{
"nextPage": "01",
"page": "0",
"status": "ok",
"transactions": [
{
"id": "65985",
"recordTp": "vendorbill",
"values": {
"account": [
{
"text": "14500 Deferred Expenses",
"value": "249"
}
],
"account.number": "1450",
"account.type": [
{
"text": "Deferred Expense",
"value": "DeferExpense"
}
],
"amount": "51733",
"classnohierarchy": [
{
"text": "901 Corporate",
"value": "139"
}
],
"currency": [
{
"text": "Canadian Dollar",
"value": "3"
}
],
"customer.altname": "V Sties expenses (Tor)",
"customer.custate": "12/31/2019",
"customer.custentient": "ada Inc.",
"customer.custendate": "1/1/2019",
"customer.entyid": "PR781",
"departmentnohierarchy": [
{
"text": "8rity",
"value": "37"
}
],
"fxamount": "689",
"location": [
{
"text": "Othad Projects",
"value": "48"
}
],
"postingperiod": [
{
"text": "Jan 2020",
"value": "1"
}
],
"subsidiary.custrecord_region": [
{
"text": "CANADA",
"value": "3"
}
],
"subsidiarynohierarchy": [
{
"text": "ada Inc.",
"value": "25"
}
]
}
},
I've been able to query the values that are not (deeply) nested, but I need help getting, for example, the values from 'classnohierarchy'. To get both the 'text' and the 'value', I tried:
transactions.value:"values".classnohierarchy.text::string as class_txt,
transactions.value:"values".classnohierarchy.value::string as class_val,
but it's returning NULL values.
Below is my entire query:
SELECT
JSON_DATA:status::string as connection_status,
transactions.value:id::string as id,
transactions.value:recordType::string as record_type,
transactions.value:"values"::variant as trans_val,
transactions.value:"values".account as acc,
transactions.value:"values".account.text as text,
transactions.value:"values".account.value as val,
transactions.value:"values"."account.number"::string as acc_num,
transactions.value:"values"."account.type".text::string as acc_type_txt,
transactions.value:"values"."account.type".value::string as acc_type_val,
transactions.value:"values".amount::string as amount,
**transactions.value:"values".classnohierarchy.text::string as class_txt,
transactions.value:"values".classnohierarchy.value::string as class_val,**
transactions.value:"values".currency.text::string as currency_text,
transactions.value:"values".currency.value::string as currency_val,
transactions.value:"values"."customer.altname"::string as customer_project_name,
transactions.value:"values"."customer.custate"::string as customer_end_date,
transactions.value:"values"."customer.custentient"::string as customer_end_client,
transactions.value:"values"."customer.custendate"::string as customer_start_date,
transactions.value:"values"."customer.entyid"::string as customer_project_id,
transactions.value:"values".departmentnohierarchy.text::string as department_name,
transactions.value:"values".departmentnohierarchy.value::string as department_value,
transactions.value:"values".fxamount::string as fx_amount,
transactions.value:"values".location.text::string as product_name,
transactions.value:"values".postingperiod.text::string as postingperiod,
transactions.value:"values".postingperiod.value::string as postingperiod,
transactions.value:"values"."subsidiary.custrecord_region".text::string as region_name,
transactions.value:"values"."subsidiary.custrecord_region".value::string as region_value,
transactions.value:"values".subsidiarynohierarchy.text::string as entity_name,
transactions.value:"values".subsidiarynohierarchy.value::string as entity_value,
FROM MY_TABLE,
LATERAL FLATTEN (JSON_DATA:transactions) as transactions
And here's a picture of what's showing in Snowflake:
SNOWFLAKE_SCREENSHOT
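departmentnohierarchy is an array, so you need to specify the element index, as below. The same applies to classnohierarchy and the other fields whose value is wrapped in [ ] in your JSON.
select *, transactions.value:"values".departmentnohierarchy[0].value::text as department_value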
FROM jsont1,
LATERAL FLATTEN (JSON_DATA:transactions) as transactions
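The array-versus-object distinction can also be reproduced outside Snowflake. Below is a minimal Python sketch, not Snowflake code, that walks a trimmed copy of the sample document from the question. The helper name first_pair is made up for illustration, and returning None for a missing or empty array is an assumption chosen to resemble the NULL that an unresolved path produces.

```python
import json

# Trimmed copy of the sample document from the question.
doc = json.loads("""
{
  "status": "ok",
  "transactions": [
    {
      "id": "65985",
      "values": {
        "amount": "51733",
        "classnohierarchy": [{"text": "901 Corporate", "value": "139"}],
        "departmentnohierarchy": [{"text": "8rity", "value": "37"}]
      }
    }
  ]
}
""")


def first_pair(values, key):
    """Return (text, value) from the first element of an array-valued field.

    Hypothetical helper: gives (None, None) when the field is missing or
    the array is empty.
    """
    items = values.get(key) or []
    if not items:
        return (None, None)
    return (items[0].get("text"), items[0].get("value"))


rows = []
for txn in doc["transactions"]:  # one row per transaction, like the FLATTEN
    values = txn["values"]
    # values["classnohierarchy"] is a list, so it must be indexed with [0]
    # before "text" or "value" can be read.
    class_txt, class_val = first_pair(values, "classnohierarchy")
    rows.append({"id": txn["id"], "class_txt": class_txt, "class_val": class_val})

print(rows)
```

Reading values["classnohierarchy"]["text"] directly would fail here for the same structural reason the original path returned NULL: the field holds a one-element array, not an object.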
I have a structure that looks like so
[
[
{
"ID": "grp1-001",
},
{
"ID": "grp1-002",
},
{
"ID": "grp1-003",
},
{
"ID": "grp1-004",
},
{
"ID": "grp1-005",
},
{
"ID": "grp1-006",
}
],
[
{
"ID": "grp2-001",
},
{
"ID": "grp2-002",
},
{
"ID": "grp2-003",
},
{
"ID": "grp2-004",
},
{
"ID": "grp2-005",
},
{
"ID": "grp2-006",
}
.......
what I need to get as a result of the modification is this
[
[
["1", "grp1-001"],
["2", "grp1-002"],
["3", "grp1-003"],
["4", "grp1-004"],
["5", "grp1-005"],
["6", "grp1-006"],
],
[
["1", "grp2-001"],
["2", "grp2-002"],
["3", "grp2-003"],
["4", "grp2-004"],
["5", "grp2-005"],
["6", "grp2-006"],
],
Which means I need to keep the external structure (outside array and an internal grouping) but convert the inner dict to an array and replace the "ID" key with a value (that will come from external source like --argjson). I am not even sure how to start - any ideas/resources are highly appreciated.
Assuming you're just taking the objects and transforming them to pairs of the index in the array and the ID value, you could do this:
map([to_entries[] | [.key + 1, .value.ID | tostring]])
https://jqplay.org/s/RBac7SPfdG
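Using to_entries/0 on an array gives you an array of key/value (index/value) pairs. You can then shift the indices by 1 and convert to strings. Because the comma binds tighter than the pipe in jq, tostring is applied to both the shifted index and the ID.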
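For comparison, the same transformation can be sketched in Python. This is only an illustration of what the filter does, not part of the jq solution. The function name number_ids and the start parameter are made up here, and the sketch assumes every inner object has a string "ID" key.

```python
def number_ids(groups, start=1):
    """For each inner group, replace every {"ID": ...} object with a
    [position, id] pair, keeping the outer grouping intact."""
    return [
        # enumerate() supplies the index that to_entries exposes as .key;
        # start=1 does the same job as the "+ 1" in the jq filter.
        [[str(pos), item["ID"]] for pos, item in enumerate(group, start=start)]
        for group in groups
    ]


# Shortened version of the input from the question.
groups = [
    [{"ID": "grp1-001"}, {"ID": "grp1-002"}, {"ID": "grp1-003"}],
    [{"ID": "grp2-001"}, {"ID": "grp2-002"}],
]

result = number_ids(groups)
print(result)
```

If the replacement values should come from an external source rather than from the position, the enumerate() call is the part to swap out.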
I have nested JSON files on S3 and am trying to query them with Athena.
However, I am having trouble querying the nested JSON values.
My JSON file looks like this:
{
"id": "17842007980192959",
"acount_id": "17841401243773780",
"stats": [
{
"name": "engagement",
"period": "lifetime",
"values": [
{
"value": 374
}
],
"title": "Engagement",
"description": "Total number of likes and comments on the media object",
"id": "17842007980192959/insights/engagement/lifetime"
},
{
"name": "impressions",
"period": "lifetime",
"values": [
{
"value": 11125
}
],
"title": "Impressions",
"description": "Total number of times the media object has been seen",
"id": "17842007980192959/insights/impressions/lifetime"
},
{
"name": "reach",
"period": "lifetime",
"values": [
{
"value": 8223
}
],
"title": "Reach",
"description": "Total number of unique accounts that have seen the media object",
"id": "17842007980192959/insights/reach/lifetime"
},
{
"name": "saved",
"period": "lifetime",
"values": [
{
"value": 0
}
],
"title": "Saved",
"description": "Total number of unique accounts that have saved the media object",
"id": "17842007980192959/insights/saved/lifetime"
}
],
"import_date": "2017-12-04"
}
What I'm trying to do is to query the "stats" field value where name=impressions.
So ideally something like:
SELECT id, account_id, stats.values.value WHERE stats.name='engagement'
AWS example: https://docs.aws.amazon.com/athena/latest/ug/searching-for-values.html
Any help would be appreciated.
You can query the JSON with the following table definition:
CREATE EXTERNAL TABLE test(
id string,
acount_id string,
stats array<
struct<
name:string,
period:string,
values:array<
struct<value:string>>,
title:string
>
>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://bucket/';
Now, the value column is available through the following unnesting:
select id, acount_id, stat.name,x.value
from test
cross join UNNEST(test.stats) as st(stat)
cross join UNNEST(stat."values") as valx(x)
WHERE stat.name='engagement';
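To make the row-by-row effect of the two UNNEST steps easier to see, here is a rough Python sketch that emulates them with nested loops over a trimmed copy of the sample record. It is not Athena code, and the function name unnest_stats is made up for illustration.

```python
# Trimmed copy of the sample record from the question (two of the four stats).
record = {
    "id": "17842007980192959",
    "acount_id": "17841401243773780",
    "stats": [
        {"name": "engagement", "period": "lifetime", "values": [{"value": 374}]},
        {"name": "impressions", "period": "lifetime", "values": [{"value": 11125}]},
    ],
}


def unnest_stats(records, wanted_name):
    """Emulate the two CROSS JOIN UNNEST steps with nested loops."""
    rows = []
    for rec in records:
        for stat in rec["stats"]:        # UNNEST(test.stats) AS st(stat)
            for x in stat["values"]:     # UNNEST(stat."values") AS valx(x)
                if stat["name"] == wanted_name:  # the WHERE clause
                    rows.append(
                        (rec["id"], rec["acount_id"], stat["name"], x["value"])
                    )
    return rows


print(unnest_stats([record], "engagement"))
```

Each element of stats produces one intermediate row, and each element of its values array produces one output row. Note that the table definition above declares value as a string, whereas this sketch keeps the JSON number as it is.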
I'm working on an existing app which interacts with an Elasticsearch server, and I'm seeing some weird responses, probably because I'm new to Elasticsearch.
I have the indexed item below:
"_id": "59773d268770541557000012",
"_score": 0.03282923,
"_source": {
"_id": "59773d268770541557000012",
"active": null,
"address": "dummy address",
"center_ids": [],
"consultation_site_ids": [],
"coordinates": null,
"created_at": "2017-07-25T14:44:22.270+02:00",
"death_declaration_form_step_id": "56ddb086f0e0103b44000000",
"end_of_pregnancy_form_step_id": "56c34e63f0e0105e65000000",
"fax": "06.95.40.58.84",
"form_step_ids": [
"55361b215342491667030000",
"5541f16252f131f6a125a375",
"55361ba05342491667040000",
"553610835342491667010000",
"55361d225342491667050000",
"5541f34a52f131f6a125a377"
],
"hospital_id": "57c004905c5393772c002a62",
"name": "test site d'encronologie",
"phone": "06.95.40.58.84",
"short_name": "test site d'encronologie d'endcronologie",
"sites_union_ids": [],
"state": "active",
"updated_at": "2017-07-25T14:44:22.270+02:00",
"url": "http://www.testurl.com",
"user_ids": [],
"warnings_threshold": null,
"_type": "Site
And I am querying the server with this query:
"query":{
"filtered":{
"query":{
"bool":{
"should":[
{
"multi_match":{
"fields":[
"name^5",
"name.edge^1",
"name.full^0.3"
],
"query":"enc",
"type":"cross_fields"
}
},
{
"match":{
"name":{
"query":"enc",
"type":"phrase_prefix",
"operator":"or"
}
}
},
{
"match":{
"name":{
"query":"enc",
"type":"boolean",
"boost":5
}
}
}
]
}
},
"filter":{
"and":[
{
"term":{
"hospital_id":"57c004905c5393772c002a62"
}
},
{
"term":{
"state":"active"
}
}
]
}
}
}}
This returns nothing (no hits).
On the other hand, if I change the filter operator from "and" to "or", I receive my 1 hit.
I am talking about the "and" in the "filter" branch:
"filter":{
"and":[
I really don't understand why "or" works but "and" does not.
Then again, when I change the query term from "enc" to "zzz_enc" everywhere in the query{} of the first branch while keeping the "or", I get zero matches, even though the filter conditions on hospital_id and state are true for my item.
Why does the filter operator behave like this?
Thank you in advance.
I am using OrientDB version 2.1.12 (DocumentDB) and am facing issues expanding a linked list column.
Result of my OrientDB query:
{
"result": [
{
"#type": "d",
"#rid": "#28:0",
"#version": 7,
"#class": "testSuite",
"testSuiteName": "web",
"testCaseLink": [
"#20:0",
"#20:1",
"#20:2",
"#20:3",
"#20:4",
"#20:5"
],
"testingType": "Web",
"#fieldTypes": "testCaseLink=z"
}
],
"notification": "Query executed in 0.061 sec. Returned 1 record(s)"
}
testCaseLink is a link-list property whose values are the RIDs of records in another class. The query used to obtain the above result is: select * from testSuite
Expected output:
{
"result": [
{
"#type": "d",
"#rid": "#28:0",
"#version": 7,
"#class": "testSuite",
"testSuiteName": "web",
"testCaseLink": [
{
"#type": "d",
"#rid": "#20:0",
"#version": 5,
"#class": "testCase",
"name": "testForBAsu",
"uiJson": "#18:0",
"testcaseType": "webWithCsv",
"isEdited": false,
"isDeleted": false,
"childtestCaseLink": [
"#20:3",
"#20:4"
],
"#fieldTypes": "uiJson=x,childtestCaseLink=z"
},
{
"#type": "d",
"#rid": "#20:1",
"#version": 6,
"#class": "testCase",
"name": "success",
"uiJson": "#18:1",
"testcaseType": "WebWithoutCsv",
"isEdited": true,
"isDeleted": false,
"eeJson": "#19:0",
"parentTestCaseLink": null,
"#fieldTypes": "uiJson=x,eeJson=x,parentTestCaseLink=x"
},
"#20:2",
"#20:3",
"#20:4",
"#20:5"
],
"testingType": "Web",
"#fieldTypes": "testCaseLink=z"
}
],
"notification": "Query executed in 0.061 sec. Returned 1 record(s)"
}
I need to expand the RIDs present in the list.
I tried: select testSuiteName,testingType,Expand(testCaseLink) from testSuite where testSuiteName='web'
But the query expands only testCaseLink. Note: testCaseLink contains RIDs of a different class.
You could use
select from testSuite FETCHPLAN *:1
Check documentation for more information.
Hope it helps.
Following is my JSON file. I have to get the fields listed for each page and for each type as a comma-separated string. Please help with how to proceed using LINQ.
Example: if I want "Type = customFields" defined for "page1", I need the output as a comma-separated string: ProjectID,EmployeeID,EmployeeName,HasExpiration, etc.
{
"Fields": {
"Pages": {
"Page": {
"-Name": "page1",
"Type": [
{
"-TypeID": "CUSTOMIZEDFIELDS",
"Field": [
"ProjectID",
"EmployeeID",
"EmployeeName",
"HasExpiration",
"EndDate",
"OTStrategy",
"Division",
"AddTimesheets",
"SubmitTimesheets",
"ManagerTimesheetApprovalRequired",
"OTAllowed",
"AddExpenses",
"SubmitExpenses",
"ManagerExpenseApprovalRequired",
"SendApprovalEmails"
]
},
{
"-TypeID": "CFDATASET",
"Field": [
"ProjectID",
"EmployeeID",
"EmployeeName",
"HasExpiration",
"EndDate",
"OTStrategy",
"Division",
"AddTimesheets",
"SubmitTimesheets",
"ManagerTimesheetApprovalRequired",
"OTAllowed",
"AddExpenses",
"SubmitExpenses",
"ManagerExpenseApprovalRequired",
"SendApprovalEmails"
]
},
{
"-TypeID": "CFDATASETCAPTION",
"Field": [
"ProjectID",
"EmployeeID",
"EmployeeName",
"HasExpiration",
"EndDate",
"OTStrategy",
"Division",
"AddTimesheets",
"SubmitTimesheets",
"ManagerTimesheetApprovalRequired",
"OTAllowed",
"AddExpenses",
"SubmitExpenses",
"ManagerExpenseApprovalRequired",
"SendApprovalEmails"
]
}
]
}
}
}
}