JSON path parent object, or equivalent MongoDB query - json

I am selecting nodes in a JSON input but can't find a way to include parent object detail for each array entry that I am querying. I am using pentaho data integration to query the data using JSON input form a mongodb input.
I have also tried to create a mongodb query to achieve the same but cannot seem to do this either.
Here are the two fields/paths that display the data:
$.size_break_costs[*].size
$.size_break_costs[*].quantity
Here is the json source format:
{
"_id" : ObjectId("4f1f74ecde074f383a00000f"),
"colour" : "RAVEN-SMOKE",
"name" : "Authority",
"size_break_costs" : [
{
"quantity" : NumberLong("80"),
"_id" : ObjectId("518ffc0697eee36ff3000002"),
"size" : "S"
},
{
"quantity" : NumberLong("14"),
"_id" : ObjectId("518ffc0697eee36ff3000003"),
"size" : "M"
},
{
"quantity" : NumberLong("55"),
"_id" : ObjectId("518ffc0697eee36ff3000004"),
"size" : "L"
}
],
"sku" : "SK3579"
}
I currently get the following results:
S,80
M,14
L,55
I would like to get the SKU and Name as well as my source will have multiple products (SKU/Description):
SK3579,Authority,S,80
SK3579,Authority,M,14
SK3579,Authority,L,55
When I try To include using $.sku, I the process errors.
The end result i'm after is a report of all products and the available quantities of their various sizes. Possibly there's an alternative mongodb query that provides this.
EDIT:
It seems the issue may be due to the fact that not all lines have the same structure. For example the above contains 3 sizes - S,M,L. Some products come in one size - PACK. Other come in multiple sizes - 28,30,32,33,34,36,38 etc.
The error produced is:
*The data structure is not the same inside the resource! We found 1 values for json path [$.sku], which is different that the number retourned for path [$.size_break_costs[].quantity] (7 values). We MUST have the same number of values for all paths.
I have tried the following mongodb query separately which gives the correct results, but the corresponding export of this doesn't work. No values are returned for the Size and Quantity.
Query:
db.product_details.find( {}, {sku: true, "size_break_costs.size": true, "size_break_costs.quantity": true}).pretty();
Export:
mongoexport --db brandscope_production --collection product_details --csv --out Test01.csv --fields sku,"size_break_costs.size","size_break_costs.quantity" --query '{}';

Shortly after I added my own bounty, I figured out the solution. My problem has the same basic structure, which is a parent identifier, and some number N child key/value pairs for ratings (quality, value, etc...).
First, you'll need a JSON Input step that gets the SKU, Name, and size_break_costs array, all as Strings. The important part is that size_break_costs is a String, and is basically just a stringified JSON array. Make sure that under the Content tab of the JSON Input, that "Ignore missing path" is checked, in case you get one with an empty array or the field is missing for some reason.
For your fields, use:
Name | Path | Type
ProductSKU | $.sku | String
ProductName | $.name | String
SizeBreakCosts | $.size_break_costs | String
I added a "Filter rows" block after this step, with the condition "SizeBreakCosts IS NOT NULL", which is then passed to a second JSON Input block. This second JSON block, you'll need to check "Source is defined in a field?", and set the value of "Get source from field" to "SizeBreakCosts", or whatever you named it in the first JSON Input block.
Again, make sure "Ignore missing path" is checked, as well as "Ignore empty file". From this block, we'll want to get two fields. We'll already have ProductSKU and ProductName with each row that's passed in, and this second JSON Input step will further split it into however many rows are in the SizeBreakCosts input JSON. For fields, use:
Name | Path | Type
Quantity | $.[*].quantity | Integer
Size | $.[*].size | String
As you can see, these paths use "$.[*].FieldName", because the JSON string we passed in has an array as the root item, so we're getting every item in that array, and parsing out its quantity and size.
Now every row should have the SKU and name from the parent object, and the quantity and size from each child object. Dumping this example to a text file, I got:
ProductSKU;ProductName;Size;Quantity
SK3579;Authority;S; 80
SK3579;Authority;M; 14
SK3579;Authority;L; 55

Related

jq join on common key

I'm very new to jq and this post is a result of not understanding the mechanics behind jq.
I could develop a bash script, which does what I want but jq and it's JSON super-powers have intrigued me and I'd like to learn it by applying to real world scenarios. Here's one...
BTW, I've tried to make use of the existing jq related SO solutions for merging/joining JSONs but have failed.
The closest I came to what I needed was to use an INDEX and a concatenation of $x + . , however I was only getting the LAST item from my second (c2) json.
So, my problem is as follows:
There are Two JSON files:
JSON #1 will have unique "id" and "type" keys - among other key/value pairs, which I've removed for better clarity of my post.
JSON #2 will contain multiples/non-unique "type" keys, which I'd like to match these two JSON files on. This JSON #2 will also contain other key/value pairs, which are expected to be contained in the resultant output.
My output requirements are:
I'd like to obtain a (one per line or a single array) list of all combinations of matching key/values pairs between c1 and c2 array where the value of the "type" key (string) matches between c1 and c2 exactly.
One more question, how much more difficult would it be to scale the solution to perform similar matching/joining between three JSON files at once - again on the same value of a particular key?
Any assistance or even just hints on how to solve and understand how to solve this would be greatly appreciated!
1st input file: JSON #1, Array c1 (collection 1)
{ "c1":
[
{ "c1id":1, "type":"alpha" },
{ "c1id":2, "type":"beta" }
]
}
2nd input file: JSON #2, Array c2 (collection 2)
{
"c2":
[
{ "c2id":1,"type":"alpha","serial":"DDBB001"} ,
{ "c2id":2,"type":"beta","serial":"DDBB007"} ,
{ "c2id":3,"type":"alpha","serial":"DDTT005"} ,
{ "c2id":4,"type":"beta","serial":"DDAA002"} ,
{ "c2id":5,"type":"yotta","serial":"DDCC017"}
]
}
Expected output:
{"c1id":1,"type":"alpha","c2id":1,"serial":"DDBB001"}
{"c1id":1,"type":"alpha","c2id":3,"serial":"DDTT005"}
{"c1id":2,"type":"beta","c2id":2,"serial":"DDBB007"}
{"c1id":2,"type":"beta","c2id":4,"serial":"DDAA002"}
You will notice that type "yotta" from the c2 is not included in the output. This is expected. Only "types" which exist in c1 and match c2 are expected to be in the results. I guess this is implied by this being a matching/joining exercise - I added it just for clarity - I hope it worked.
Here's an example of using INDEX and JOIN:
jq --compact-output --slurpfile c1 c1.json '
INDEX(
$c1[0].c1[];
.type
) as $index |
JOIN(
$index;
.c2[];
.type;
reverse|add
)
' c2.json
The first argument to INDEX needs to produce a stream of items, which is why we apply [] to get the items from the array individually. The second argument selects our index key.
We use the four argument version of JOIN. The first argument is the index itself, the second is a stream of objects to be joined to the index, the third argument selects the lookup key from the streamed objects, and the fourth argument is an expression to assemble the join object. The input to that expression is a stream of two-item arrays, each looking something like this:
[{"c2id":1,"type":"alpha","serial":"DDBB001"},{"c1id":1,"type":"alpha"}]
Since we just want to combine all the keys and values from the objects we just use add, but we first reverse the array to nicely arrange the c1 fields before the c2 fields. The end result is as you hoped:
{"c1id":1,"type":"alpha","c2id":1,"serial":"DDBB001"}
{"c1id":2,"type":"beta","c2id":2,"serial":"DDBB007"}
{"c1id":1,"type":"alpha","c2id":3,"serial":"DDTT005"}
{"c1id":2,"type":"beta","c2id":4,"serial":"DDAA002"}

Get object inside more than one object MYSQL JSON

How can I select array "1" inside the "flavor" object from json code in mysql
Attribute name: settings
{"without":{"usd":{"new":"5","old":"8"},"weight":"5"},"color":{"2","3"},"flavor":{"1","2"}}
And how can I get a number inside the "usd" object inside "new" knowing these objects are inside the first object and they are variable, perhaps ["without" or "long" or ......]
Attribute name: settings
{"without":{"usd":{"new":"5","old":"8"},"weight":"5"},"color":{"2","3"},"flavor":{["1","2"}}
{"long":{"usd":{"new":"2","old":"3"},"weight":"2"},"medium":{"usd":{"new":"3","old":"4"},"weight":"3"},"short":{"usd":{"new":"4","old":"5"},"weight":"4"}}
{"short":{"usd":{"new":"4","old":"5"},"weight":"2"},"color":{"1","2"}}
LIKE
without = 5
long = 2
short = 4
I rebuilt the data format so that I could extract the required data
{"size":[{"id":1,"url":"without","weight":"5","price":{"usd":{"new":"5","old":"8"}}}],"color":[{"id":"2","url":"yellow"},{"id":"3","url":"green"}],"flavor":[{"id":"1","url":"berry"},{"id":"2","url":"strawberry"}]}
MYSQL
JSON_EXTRACT(details.settings, '$.color[*].url') LIKE '%yellow%'

Kusto KQL reference first object in an JSON array

I need to grab the value of the first entry in a json array with Kusto KQL in Microsoft Defender ATP.
The data format looks like this (anonymized), and I want the value of "UserName":
[{"UserName":"xyz","DomainName":"xyz","Sid":"xyz"}]
How do I split or in any other way get the "UserName" value?
In WDATP/MSTAP, for the "LoggedOnUsers" type of arrays, you want "mv-expand" (multi-value expand) in conjunction with "parsejson".
"parsejson" will turn the string into JSON, and mv-expand will expand it into LoggedOnUsers.Username, LoggedOnUsers.DomainName, and LoggedOnUsers.Sid:
DeviceInfo
| mv-expand parsejson(LoggedOnUsers)
| project DeviceName, LoggedOnUsers.UserName, LoggedOnUsers.DomainName
Keep in mind that if the packed field has multiple entries (like DeviceNetworkInfo's IPAddresses field often does), the entire row will be expanded once per entry - so a row for a machine with 3 entries in "IPAddresses" will be duplicated 3 times, with each different expansion of IpAddresses:
DeviceNetworkInfo
| where Timestamp > ago(1h)
| mv-expand parsejson(IPAddresses)
| project DeviceName, IPAddresses.IPAddress
to access the first entry's UserName property you can do the following:
print d = dynamic([{"UserName":"xyz","DomainName":"xyz","Sid":"xyz"}])
| extend result = d[0].UserName
to get the UserName for all entries, you can use mv-expand/mv-apply:
print d = dynamic([{"UserName":"xyz","DomainName":"xyz","Sid":"xyz"}])
| mv-apply d on (
project d.UserName
)
thanks for the reply, but the proposed solution didn't work for me. However instead I found the following solution:
project substring(split(split(LoggedOnUsers,',',0),'"',4),2,9)
The output of this is: UserName

What's the correct JsonPath expression to search a JSON root object using Newtonsoft.Json.NET?

Most examples deal with the book store example from Stefan Gössner, however I'm struggling to define the correct JsonPath expression for a simple object (no array):
{ "Id": 1, "Name": "Test" }
To check if this json contains Id = 1.
I tried the following expression: $..?[(#.Id == 1]), but this does find any matches using Json.NET?
Also tried Manatee.Json for parsing, and there it seems the jsonpath expression could be like $[?($.Id == 1)] ?
The path that you posted is not valid. I think you meant $..[?(#.Id == 1)] (some characters were out of order). My answer assumes this.
The JSON Path that you're using indicates that the item you're looking for should be in an array.
$ start
.. recursive search (1)
[ array item specification
?( item-based query
#.Id == 1 where the item is an object with an "Id" with value == 1 at the root
) end item-based query
] end array item specification
(1) the conditions following this could match a value no matter how deep in the hierarchy it exists
You want to just navigate the object directly. Using $.Id will return 1, which you can validate in your application.
All of that said...
It sounds to me like you want to validate that the Id property is 1 rather than to search an array for an object where the Id property is 1. To do this, you want JSON Schema, not JSON Path.
JSON Path is a query language for searching for values which meet certain conditions (e.g. an object where Id == 1.
JSON Schema is for validating that the JSON meet certain requirements (your data's in the right shape). A JSON Schema to validate that your object has a value of 1 could be something like
{
"properties": {
"Id": {"const":1}
}
}
Granted this isn't very useful because it'll only validate that the Id property is 1, which ideally should only be true for one object.

Having separate arrays, how to extract value based on the column name?

I am trying to extract data from some JSON with JQ - I have already got it down to the last level of data that I need to extract from, but I am completely stumped as to how to proceed with how this part of the data is formatted.
An example would be:
{
"values": [
[
1483633677,
42
]
],
"columns": [
"time",
"count_value"
],
"name": "response_time_error"
}
I would want to extract just the value for a certain column (e.g. count_value) and I can extract it by using [-1] in this specific case, but I want to select the column by its name in case they change in the future.
If you're only extracting a single value and the arrays will always correspond with eachother, you could find the index in the columns array then use that index into the values array.
It seems like values is an array of rows with those values. Assuming you want to output the values of all rows with the selected column:
$ jq --arg col 'count_value' '.values[][.columns | index($col)]' input.json
If the specified column name does not exist in .columns, then Jeff's filter will fail with a rather obscure error message. It might therefore be preferable to check whether the column name is found. Here is an illustration of how to do so:
jq --arg col count_value '
(.columns | index($col)) as $ix
| if $ix then .values[][$ix] else empty end' input.json
If you want an informative error message to be printed, then replace empty with something like:
error("specified column name, \($col), not found")