I am trying to extract 3 levels of data from JSON with tExtractJsonFields.
I know tHMap can do this but I am having trouble with that approach so I am pursuing a simpler approach for now.
I am working with a Smartsheet JSON response describing a sheet within Smartsheet.
There are 3 levels:
Lvl 1 - Sheet info[]
Lvl 2 - Column info[]
Lvl 2 - Row info[]
Lvl 3 - Cell info[]
Using tExtractJsonFields, I am able to retrieve information from Level 1 and Level 3.
I do not know the correct JsonQuery to retrieve Level 2.
My problem: I would like to extract the Level 2 fields Row.Id and Row.Value in the same tExtractJsonFields component. Any help would be appreciated.
[Screenshot: tExtractJsonFields configuration]
[Screenshot: tLogRow output]
Fields 2 and 3 are null.
Clearly, I am doing something wrong.
Sample JSON
{ "id": 8566480355780484,
"columns": [
{ "id": 7605383392978820,
"title": "Item #"
},
{ "id": 1975883858765700,
"title": "Indicator"
}
],
"rows": [
{ "id": 4808422210070404,
"rowNumber": 1,
"cells": [
{
"columnId": 7605383392978820,
"value": "0002",
"displayValue": "0002"
},
{
"columnId": 1975883858765700,
"value": "Draft",
"displayValue": "Draft"
}
]
},
{ "id": 2556622396385156,
"rowNumber": 2,
"cells": [
{ "columnId": 7605383392978820,
"value": "0003",
"displayValue": "0003"
}
]
}
]
}
Not sure if there is another way, but I did find a way using an approach Talend outlines in their documentation here.
The trick is to parse the higher levels in prior tExtractJsonFields components and then let that information flow through by simply leaving those JSON queries blank in the subsequent components.
The tFilterRow component is simply to exclude items that have only null values.
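The same idea, reading the higher levels first and carrying their fields down into each lower-level record, can be sketched outside Talend. The following is a minimal Python illustration using a trimmed copy of the sample JSON above; the function name `flatten_cells` and the output field names are assumptions made for this sketch, not anything Talend or Smartsheet defines.

```python
import json

# Trimmed copy of the sample sheet from the question.
sheet = json.loads("""
{ "id": 8566480355780484,
  "columns": [
    {"id": 7605383392978820, "title": "Item #"},
    {"id": 1975883858765700, "title": "Indicator"}
  ],
  "rows": [
    {"id": 4808422210070404, "rowNumber": 1,
     "cells": [
       {"columnId": 7605383392978820, "value": "0002", "displayValue": "0002"},
       {"columnId": 1975883858765700, "value": "Draft", "displayValue": "Draft"}
     ]},
    {"id": 2556622396385156, "rowNumber": 2,
     "cells": [
       {"columnId": 7605383392978820, "value": "0003", "displayValue": "0003"}
     ]}
  ]
}
""")


def flatten_cells(sheet):
    """Yield one flat record per cell, carrying sheet and row fields down."""
    # Level 2 (columns): lookup from column id to column title.
    titles = {col["id"]: col["title"] for col in sheet["columns"]}
    for row in sheet["rows"]:          # Level 2 (rows)
        for cell in row["cells"]:      # Level 3 (cells)
            yield {
                "sheet_id": sheet["id"],          # Level 1 value passed through
                "row_id": row["id"],              # Level 2 value passed through
                "row_number": row["rowNumber"],
                "column_title": titles.get(cell["columnId"]),
                "value": cell.get("value"),
            }


records = list(flatten_cells(sheet))
for record in records:
    print(record)
```

Each outer loop plays the role of one tExtractJsonFields step: it loops over one level and leaves the already extracted parent fields untouched for the next step.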
When returning a list of objects in a JSON response, say for a GET request to a /movies endpoint, is it more common to return a JSON array or an object that wraps a JSON array? I've seen both formats in APIs and I was wondering if there is a standard. If there isn't, which way is preferable?
i.e.
[
{
"name": "Harry Potter",
"year": 2000
}
]
vs.
{
"movies": [
{
"name": "Harry Potter",
"year": 2000
}
]
}
In general, if you have a service that only returns a list, the first option is perfectly fine:
[
{
"name": "Harry Potter",
"year": 2000
}
]
But if you want a more general approach, it is better to add more context data, such as a total item count, pagination variables or status values. So although the first one is perfectly fine, I always prefer the second one, but without the name of the collection/array/table and with more context info, for example:
{
"items": [
{
"name": "Harry Potter",
"year": 2000
}
],
"total": 1,
"page": 1,
"pages": 1,
"status": 1,
"timestamp": 121344
}
Nesting the array under a movies key is a bit redundant. But for me this is only a practical approach that, in my experience, is more readable and is used in all the projects I have worked on.
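As a rough sketch of how such an envelope could be built on the server side, here is a small Python helper. The keys `items`, `total`, `page` and `pages` come from the example above; the function name `paginate` and the `page_size` parameter are assumptions for illustration only.

```python
import math


def paginate(items, page=1, page_size=20):
    """Wrap one page of `items` in an envelope with pagination metadata."""
    total = len(items)
    # Report at least one page, even for an empty collection.
    pages = max(1, math.ceil(total / page_size))
    start = (page - 1) * page_size
    return {
        "items": items[start:start + page_size],
        "total": total,
        "page": page,
        "pages": pages,
    }


movies = [{"name": "Harry Potter", "year": 2000}]
envelope = paginate(movies)
print(envelope)
```

Because the list sits under a generic key, clients can reuse the same paging logic for every collection endpoint.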
I'm relatively new to Power Query, but I'm pulling in this basic JSON structure from a web API:
{
"report": "Cost History",
"dimensions": [
{
"time": [
{
"name": "2019-11",
"label": "2019-11",
…
},
{
"name": "2019-12",
"label": "2019-12",
…
},
{
"name": "2020-01",
"label": "2020-01",
…
},
…
]
},
{
"Category": [
{
"name": "category1",
"label": "Category 1",
…
},
{
"name": "category2",
"label": "Category 2",
…
},
…
]
}
],
"data": [
[
[
40419.6393798211
],
[
191.44
],
…
],
[
[
2299.652439184997
],
[
0.0
],
…
]
]
}
I actually have 112 categories and 13 "times". I figured out how to do multiple queries to turn the times into column headers and the categories into row labels (I think). But the data section is eluding me. Because each item is a list within a list, I'm not sure how to expand it all out. Each object in the data array will have 112 numbers, and there will be 13 objects, if that all makes sense.
So ultimately I want to make it look like
            2019-11  2019-12  2020-01  ...
Category 1  40419    2299
Category 2  191      0
...
First time asking a question on here, so hopefully this all makes sense and is clear. Thanks in advance for any help!
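To make the target shape concrete, here is a minimal Python sketch of the reshape (not Power Query M). It assumes that `data` is indexed as `data[time][category][0]`, which is what the sample and the desired table suggest, and it uses a trimmed two-by-two copy of the JSON above; the function name `pivot` is made up for this sketch.

```python
# Trimmed copy of the report: two times and two categories only.
report = {
    "report": "Cost History",
    "dimensions": [
        {"time": [
            {"name": "2019-11", "label": "2019-11"},
            {"name": "2019-12", "label": "2019-12"},
        ]},
        {"Category": [
            {"name": "category1", "label": "Category 1"},
            {"name": "category2", "label": "Category 2"},
        ]},
    ],
    "data": [
        [[40419.6393798211], [191.44]],
        [[2299.652439184997], [0.0]],
    ],
}


def pivot(report):
    """Return {category label: {time label: value}}."""
    times = [t["label"] for t in report["dimensions"][0]["time"]]
    categories = [c["label"] for c in report["dimensions"][1]["Category"]]
    table = {}
    for c_idx, category in enumerate(categories):
        # Assumption: the outer list follows the times, the inner list the
        # categories, and each innermost list wraps a single number.
        table[category] = {
            time: report["data"][t_idx][c_idx][0]
            for t_idx, time in enumerate(times)
        }
    return table


table = pivot(report)
for category, row in table.items():
    print(category, row)
```

The same indexing carries over to any tool: the position in the outer list selects the column (time) and the position in the inner list selects the row (category).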
I am also researching this exact thing and looking for a solution. In PQ, nested arrays are displayed as a list, and there is a function to extract the values using a separator character of your choice. [image]
So this becomes this: [image]
= Table.TransformColumns(#"Filtered Rows", {"aligned_to_ids", each Text.Combine(List.Transform(_, Text.From), ","), type text})
However, the problem I'm trying to solve is when the nested JSON has multiple values like this: [image]
When these lists are extracted, the following step raises an error:
= Table.TransformColumns(#"Extracted Values1", {"collaborators", each Text.Combine(List.Transform(_, Text.From), ","), type text})
Expression.Error: We cannot convert a value of type Record to type Text.
Details:
Value=
id=15890
goal_id=323
role_id=15
Type=[Type]
It seems the multiple values are not handled and PQ does not recognise the underlying structure to enable the columns to be expanded.
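The error arises because each list item is a record with several fields rather than a single value, so a field has to be chosen before the items can be joined into text. A minimal Python sketch of that idea follows (not Power Query M). The first record uses the values from the error details above; the second record and the helper name `join_field` are hypothetical.

```python
# First record taken from the error details; the second is hypothetical.
collaborators = [
    {"id": 15890, "goal_id": 323, "role_id": 15},
    {"id": 15891, "goal_id": 323, "role_id": 16},
]


def join_field(records, field, sep=","):
    """Pick one field from every record, then join the values as text."""
    return sep.join(str(record[field]) for record in records)


joined_ids = join_field(collaborators, "id")
print(joined_ids)
```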
So I basically have JSON output from the JIRA Insights API. I've been digging around and found jq for parsing the JSON. I'm struggling to wrap my head around how to parse the following so that it only returns values for the objectTypeAttributeIds that I am interested in.
For example, I'm only interested in the value of objectTypeAttributeId 887 provided that the status name of objectTypeAttributeId 911 is Active, but then I would also like to return the name value of another objectTypeAttributeId.
Can this be achieved using jq only? Or should I be using something else?
I can filter down to this level, which is the 'attributes' section of the JSON output, and print each value, but I'm struggling to find an example catering for my situation.
{
"id": 137127,
"objectTypeAttributeId": 887,
"objectAttributeValues": [
{
"value": "false"
}
],
"objectId": 9036,
"position": 16
},
{
"id": 137128,
"objectTypeAttributeId": 888,
"objectAttributeValues": [
{
"value": "false"
}
],
"objectId": 9036,
"position": 17
},
{
"id": 137296,
"objectTypeAttributeId": 911,
"objectAttributeValues": [
{
"status": {
"id": 1,
"name": "Active",
"category": 1
}
}
],
"objectId": 9036,
"position": 18
},
Can this be achieved using jq only?
Yes, jq was designed precisely for this kind of query. In your case, you could use any, select and if ... then ... else ... end, along the lines of:
if any(.[]; .objectTypeAttributeId == 911 and
any(.objectAttributeValues[]; .status.name == "Active"))
then map(select(.objectTypeAttributeId == 887))
else "whatever"
end
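For readers less familiar with jq, the logic of that filter can be sketched in Python. This is only an illustration of the same condition, using the three attributes from the question; the function name `values_if_active` and its parameters are assumptions, and it returns `None` where the jq filter returns "whatever".

```python
attributes = [
    {"id": 137127, "objectTypeAttributeId": 887,
     "objectAttributeValues": [{"value": "false"}],
     "objectId": 9036, "position": 16},
    {"id": 137128, "objectTypeAttributeId": 888,
     "objectAttributeValues": [{"value": "false"}],
     "objectId": 9036, "position": 17},
    {"id": 137296, "objectTypeAttributeId": 911,
     "objectAttributeValues": [
         {"status": {"id": 1, "name": "Active", "category": 1}}],
     "objectId": 9036, "position": 18},
]


def values_if_active(attributes, wanted_id=887, status_id=911,
                     status_name="Active"):
    """Return the wanted attributes only if the status attribute matches."""
    # Mirrors: any(.[]; .objectTypeAttributeId == 911 and any(...))
    is_active = any(
        attr["objectTypeAttributeId"] == status_id
        and any(
            value.get("status", {}).get("name") == status_name
            for value in attr["objectAttributeValues"]
        )
        for attr in attributes
    )
    if not is_active:
        return None
    # Mirrors: map(select(.objectTypeAttributeId == 887))
    return [a for a in attributes if a["objectTypeAttributeId"] == wanted_id]


selected = values_if_active(attributes)
print(selected)
```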
Is it possible to collect recursive-descent results into a single array with jq?
Would flatten help? It looks that way to me, but I just cannot get it working. Take a look at how far I have got at https://jqplay.org/s/6bxD-Wq0QE. Can anyone make it work?
BTW,
.data.search.edges[].node | {name, topics: ..|.topics?} works, but I want all topics from the same node to be in one array, instead of having same name in all different returned results.
flatten alone will give me Cannot iterate over null, and
that's why I'm trying to use map(select(.? != null)) to filter the nulls out. However, I'd get Cannot iterate over null as well for my map-select.
So now it all comes down to how to filter out those nulls?
UPDATE: by "collect into a single array" I meant to get something like this:
[
{
"name": "leumi-leumicard-bank-data-scraper",
"topics": ["banking", "leumi", "api", "puppeteer", "scraper", "open-api"]
}
]
instead of having the same name duplicated in all the different returned results. Recursive descent therefore seems to me to be the option, but I'm open to any solution as long as I can get a result like the one above. Is that possible? Thx.
One way to collect the non-falsey values:
.data.search.edges[].node
| {name, topics: [.. | .topics? | select(.)]}
The result would be:
{
"name": "leumi-leumicard-bank-data-scraper",
"topics": [
"banking",
"leumi",
"api",
"puppeteer",
"scraper",
"open-api"
]
}
{
"name": "echarts-scrappeteer",
"topics": []
}
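What the filter does, recursive descent followed by keeping every `topics` value that is neither null nor false, can be sketched in Python as follows. The input is hypothetical: its shape is inferred from the path `.repositoryTopics.nodes[].topic.topics` that appears elsewhere in this thread, and the helper names `descend` and `collect_topics` are made up for this sketch.

```python
# Hypothetical input; the nesting is an assumption based on the jq paths.
response = {"data": {"search": {"edges": [
    {"node": {"name": "leumi-leumicard-bank-data-scraper",
              "repositoryTopics": {"nodes": [
                  {"topic": {"topics": "banking"}},
                  {"topic": {"topics": "leumi"}},
              ]}}},
    {"node": {"name": "echarts-scrappeteer",
              "repositoryTopics": {"nodes": []}}},
]}}}


def descend(value):
    """Yield the value and everything nested inside it, like jq's `..`."""
    yield value
    if isinstance(value, dict):
        for child in value.values():
            yield from descend(child)
    elif isinstance(value, list):
        for child in value:
            yield from descend(child)


def collect_topics(node):
    """Gather every `topics` value under the node into one list."""
    found = []
    for value in descend(node):
        if isinstance(value, dict):
            topic = value.get("topics")
            # Like select(.): drop only null and false.
            if topic is not None and topic is not False:
                found.append(topic)
    return found


result = [
    {"name": edge["node"]["name"], "topics": collect_topics(edge["node"])}
    for edge in response["data"]["search"]["edges"]
]
print(result)
```

Collecting inside a list per node is what keeps each name from being repeated once per topic.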
Not sure what you're expecting to get in your results... but it seems like you're trying to get all the repositories and their topics in a flat array. I don't see any reason why you should use recurse here, you're only selecting from one class of objects. Just reference them directly.
[.data.search.edges[].node | {name,topic:(.repositoryTopics.nodes[].topic.topics)}]
For your particular input, this produces:
[
{
"name": "leumi-leumicard-bank-data-scraper",
"topic": "banking"
},
{
"name": "leumi-leumicard-bank-data-scraper",
"topic": "leumi"
},
{
"name": "leumi-leumicard-bank-data-scraper",
"topic": "api"
},
{
"name": "leumi-leumicard-bank-data-scraper",
"topic": "puppeteer"
},
{
"name": "leumi-leumicard-bank-data-scraper",
"topic": "scraper"
},
{
"name": "leumi-leumicard-bank-data-scraper",
"topic": "open-api"
}
]
https://jqplay.org/s/G2inYAJNLS
If you wanted to have an array of topics within the nodes instead, just collect them in an array by putting the filter that selects the topics within [].
[.data.search.edges[].node | {name,topic:[.repositoryTopics.nodes[].topic.topics]}]
[
{
"name": "leumi-leumicard-bank-data-scraper",
"topic": [
"banking",
"leumi",
"api",
"puppeteer",
"scraper",
"open-api"
]
},
{
"name": "echarts-scrappeteer",
"topic": []
}
]
https://jqplay.org/s/0AFneNK89i
Is there a way to deserialize JSON that includes references to objects that already exist inside it, using TypeScript?
For example, we have a grandparent "Papa" associated with two parents, "Dad" and "Mom", who together have two children. The JSON looks like:
{
"id_": 1,
"name": "Papa",
"parents": [
{
"#class": "com.doubleip.spot.mgmt.test.domain.model.Parent",
"id_": 1,
"name": "Dad",
"children": [
{
"#class": "com.doubleip.spot.mgmt.test.domain.model.Child",
"id_": 1,
"name": "Bob"
},
{
"#class": "com.doubleip.spot.mgmt.test.domain.model.Child",
"id_": 2,
"name": "Trudy"
}
]
},
{
"#class": "com.doubleip.spot.mgmt.test.domain.model.Parent",
"id_": 2,
"name": "Mom",
"children": [
1,
2
]
}
]
}
You can see that the children of Mom are inserted only as the value of their "id_" field. This happens because of the JsonIdentityInfo annotation used in Java with the fasterxml library.
So we face a problem in front-end deserialisation, where we use TypeScript, Angular and PrimeNG to visualise our data.
So we face a problem in front-end deserialisation
You need to write most of the code yourself (or generate it from your Java code using more code).
That said, there are a few hydration helpers. I recommend https://github.com/mobxjs/serializr
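If you do write it yourself, one common shape is a two-pass walk: first index every fully serialized object by its id, then replace each bare id with the indexed object. Below is a minimal sketch of that idea in Python rather than TypeScript, to show the algorithm only; it handles just the `children` field of the sample above, and the function name `resolve_children` is an assumption. Note that ids are not unique across classes in the sample (the grandparent, Dad and Bob all have `id_` 1), so the index is kept per type.

```python
family = {
    "id_": 1, "name": "Papa",
    "parents": [
        {"id_": 1, "name": "Dad",
         "children": [{"id_": 1, "name": "Bob"},
                      {"id_": 2, "name": "Trudy"}]},
        {"id_": 2, "name": "Mom", "children": [1, 2]},
    ],
}


def resolve_children(grandparent):
    """Replace bare child ids with the child objects seen elsewhere."""
    # Pass 1: index every fully serialized child by its id_.
    children_by_id = {}
    for parent in grandparent["parents"]:
        for child in parent["children"]:
            if isinstance(child, dict):
                children_by_id[child["id_"]] = child
    # Pass 2: swap each bare id for the indexed object (shared, not copied).
    for parent in grandparent["parents"]:
        parent["children"] = [
            child if isinstance(child, dict) else children_by_id[child]
            for child in parent["children"]
        ]
    return grandparent


resolved = resolve_children(family)
dad, mom = resolved["parents"]
print([child["name"] for child in mom["children"]])
```

Both parents end up pointing at the same child objects, which matches what the identity references express on the Java side.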