Convert a JSON representation of CSV data to actual CSV data

This self-answered question is about transforming a JSON representation of CSV data into actual CSV data.[1]
The following JSON contains separate properties that describe the headers (column names), in columns, and the arrays of corresponding row values, in rows:
{
  "columns": [
    {
      "name": "ColumnName1",
      "type": "Number"
    },
    {
      "name": "ColumnName2",
      "type": "String"
    },
    {
      "name": "ColumnName3",
      "type": "String"
    }
  ],
  "rows": [
    [
      11111,
      "ResourceType1",
      "String1"
    ],
    [
      22222,
      "ResourceType2",
      "String2"
    ],
    [
      33333,
      "ResourceType3",
      "String3"
    ]
  ]
}
How can I convert this JSON input to the CSV data it represents?
[1] The question duplicates this closed question, which was closed presumably due to lack of effort, even though what it asks for is reasonably well-defined.

Note that CSV files have no concept of data types - all values are strings,
so the data-type information (from the .columns.type properties) is lost, unless you choose to incorporate it
in some way as a convention that the consumer of the CSV would have to be aware of (the code below does not do that).
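For example, a hypothetical convention (not used in the code below) could encode each column's type in its header as "<name>:<type>"; a minimal sketch, assuming the file.json input described below:
# Build a type-annotated header row; the CSV consumer would have to know to undo this convention.
$fromJson = ConvertFrom-Json (Get-Content -Raw file.json)
($fromJson.columns.ForEach({ '"{0}:{1}"' -f $_.name, $_.type })) -join ','
# -> "ColumnName1:Number","ColumnName2:String","ColumnName3:String"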
Assume that the JSON in the question is saved in file file.json, which can be parsed into a ([pscustomobject]) object graph with ConvertFrom-Json, via reading the file as text with Get-Content:
# Convert the JSON text into a [pscustomobject] object graph.
$fromJson = ConvertFrom-Json (Get-Content -Raw file.json)
# Process the array of column names and the arrays of row values by
# enclosing the array elements in "..." and joining them with ","
(, $fromJson.Columns.Name + $fromJson.Rows).ForEach({
    $_.ForEach({ '"{0}"' -f ($_ -replace '"', '""') }) -join ','
})
Note that the above encloses the column names and values in "..." so as to also support
names and values with embedded , characters; additionally, any embedded " characters are properly escaped by doubling them.
If you know that the input data contains neither values with embedded , nor
embedded " characters, you can simply omit the inner .ForEach() array-method
call above, which will result in unquoted values (a minimal sketch of this variant appears at the end of this answer).
The above outputs:
"ColumnName1","ColumnName2","ColumnName3"
"11111","ResourceType1","String1"
"22222","ResourceType2","String2"
"33333","ResourceType3","String3"
To convert the above in-memory to ([pscustomobject]) objects representing the CSV data, use ConvertFrom-Csv (... represents the command above):
... | ConvertFrom-Csv
To save the above to a CSV file, use Set-Content:
... | Set-Content -Encoding utf8 out.csv
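As an aside, here is a minimal sketch of the unquoted variant mentioned earlier - it assumes $fromJson as defined above and that no column name or value contains embedded , or " characters:
(, $fromJson.Columns.Name + $fromJson.Rows).ForEach({ $_ -join ',' })
# -> ColumnName1,ColumnName2,ColumnName3
#    11111,ResourceType1,String1
#    ...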


Filtering JSON by timestamp in powershell

I have retrieved JSON log data from a REST API as follows:
[
  {
    "id": "6523276",
    "type": "logs",
    "attributes": {
      "created-at": "2022-02-22T10:50:26Z",
      "action": "delete",
      "resource-name": "DocumentABC.docx",
      "user-name": "Joe Smith"
    }
  },
  {
    "id": "6523275",
    "type": "logs",
    "attributes": {
      "created-at": "2022-02-22T10:03:22Z",
      "action": "create",
      "resource-name": "Document123.docx",
      "user-name": "Joe Smith"
    }
  },
  {
    "id": "6523274",
    "type": "logs",
    "attributes": {
      "created-at": "2022-02-22T06:42:21Z",
      "action": "open",
      "resource-name": "123Document.docx",
      "user-name": "Joe Smith"
    }
  }
]
I need to POST the JSON to another web app, but I only want the last hour of logs.
In the JSON example above, the current time was 2022-02-22T10:55:22Z; therefore I'm only interested in the first two log entries.
For example
[
  {
    "id": "6523276",
    "type": "logs",
    "attributes": {
      "created-at": "2022-02-22T10:50:26Z",
      "action": "delete",
      "resource-name": "DocumentABC.docx",
      "user-name": "Joe Smith"
    }
  },
  {
    "id": "6523275",
    "type": "logs",
    "attributes": {
      "created-at": "2022-02-22T10:03:22Z",
      "action": "create",
      "resource-name": "Document123.docx",
      "user-name": "Joe Smith"
    }
  }
]
Here is my PowerShell v7 script:
$json = $json | ConvertFrom-Json
$filterTime = (Get-date).AddHours(-1)
$RFCfilterTime = [Xml.XmlConvert]::ToString($filterTime,[Xml.XmlDateTimeSerializationMode]::Utc)
$Filteredjson = $json | Where-Object $json.attributes[0] -ge $RFCfilterTimefilterDate
$jsonToPost = ConvertTo-Json -InputObject @($Filteredjson) -Depth 5
The problem is ConvertFrom-Json changes the 'created-at' from RFC3339 format to 'datetime' format.
Therefore the Where-Object filter doesn't work...
id type attributes
-- ---- ----------
6523276 logs @{created-at=22/02/2022 10:50:26 AM; action…
6523275 logs @{created-at=22/02/2022 10:03:22 AM; action…
6523274 logs @{created-at=22/02/2022 6:42:21 AM; action=…
How do I change all of the 'created-at' objects back to RFC3339 format?
Is the
$json | Where-Object $json.attributes[0] -ge $RFCfilterTimefilterDate
statement being used correctly?
Is there any easier way altogether?
Your approach should work in principle, but there was a problem with your Where-Object statement - see the bottom section.
Mathias' answer shows how to work directly with the [datetime] instances that result from ConvertFrom-Json's parsing, but a bit more work is required:
Indeed, in PowerShell (Core) v6+ ConvertFrom-Json (which with JSON web services is used implicitly by Invoke-RestMethod) automatically deserializes ISO 8601-format date-time strings such as "2022-02-22T10:03:22Z" into [datetime] System.DateTime instances, and, conversely, on (re)serialization with ConvertTo-Json, [datetime] instances are (re)converted to ISO 8601 strings.
While this enables convenient chronological comparisons with other [datetime] instances, such as returned by Get-Date, there is a major pitfall: Only [datetime] instances that have the same .Kind property value compare meaningfully (possible values are Local, Utc, and Unspecified, the latter being treated like Local in comparisons).
Unfortunately, as of PowerShell 7.2.1, you don't get to control what .Kind of [datetime] instances ConvertFrom-Json constructs - it is implied by the specific date-time string format of each string recognized as an ISO 8601 date.
Similarly, on (re)serialization with ConvertTo-Json, the .Kind value determines the string format.
See this answer for details.
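To illustrate the pitfall with a small, self-contained sketch (values chosen purely for illustration):
# Two [datetime] values denoting the *same instant*, but with different .Kind values:
$utc   = [datetime]::new(2022, 2, 22, 10, 50, 26, [DateTimeKind]::Utc)
$local = $utc.ToLocalTime()   # same instant, but .Kind is Local
$utc -eq $local               # $false (unless the local time zone is UTC): only the wall-clock (tick) values are compared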
In your case, because your date-time strings have the Z suffix denoting UTC, [datetime] instances with .Kind Utc are constructed.
Therefore, you need to ensure that your comparison timestamp is a Utc [datetime] too, which calling .ToUniversalTime() on the Local instance that Get-Date outputs ensures:
# Note the need for .ToUniversalTime()
$filterTime = (Get-Date).ToUniversalTime().AddHours(-1)
# Note: Only works as intended if all date-time strings are "Z"-suffixed
$filteredData = $data | Where-Object { $_.attributes.'created-at' -ge $filterTime }
However, at least hypothetically a given JSON document may contain differing date-time string formats that result in different .Kind values.
To handle this case - as well as the case where the string format is consistent but not necessarily known ahead of time - you can use the generally preferable [datetimeoffset] (System.DateTimeOffset) type, which automatically recognizes timestamps as equivalent even if their expression (local vs. UTC) differs:
# Note the use of [datetimeoffset]
[datetimeoffset] $filterTime = (Get-Date).AddHours(-1)
# With this approach the specific format of the date-time strings is irrelevant,
# as long as they're recognized as ISO 8601 strings.
$filteredData = $data |
Where-Object { [datetimeoffset] $_.attributes.'created-at' -ge $filterTime }
Note: Strictly speaking, it is sufficient for the LHS of the comparison to be of type [datetimeoffset] - a [datetime] RHS is then also handled correctly.
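A quick sketch of such a mixed comparison (illustrative values only):
# The [datetimeoffset] LHS causes the [datetime] RHS to be converted (honoring its .Kind),
# so the comparison is between instants rather than wall-clock values:
$lhs = [datetimeoffset] '2022-02-22T10:50:26Z'
$rhs = [datetime]::new(2022, 2, 22, 10, 50, 26, [DateTimeKind]::Utc)
$lhs -ge $rhs   # $true - both represent the same instant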
Potential future improvements:
GitHub issue #13598 proposes adding a -DateTimeKind parameter to ConvertFrom-Json, so as to allow explicitly requesting the kind of interest, and to alternatively construct [datetimeoffset] instances.
As for what you tried:
Is the $json | Where-Object $json.attributes[0] -ge $RFCfilterTimefilterDate
statement being used correctly?
No:
You're using simplified syntax in which the LHS of the comparison (the -Property parameter) must be the name of a single (non-nested) property directly available on each input object.
Because nested property access is required in your case, the regular script-block-based syntax ({ ... }) must be used, in which case the input object at hand must be referenced explicitly via the automatic $_ variable.
.attributes[0] suggests you were trying to access the created-at property by index, which, however, isn't supported in PowerShell; you need to:
either: spell out the property's name, if known: $_.attributes.'created-at' - note the need to quote in this case, due to use of the nonstandard - char. in the name.
or: use the intrinsic .psobject member that provides reflection information about any given object: $_.attributes.psobject.Properties.Value[0]
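For instance, a quick sketch of the reflection-based option, assuming $data = $json | ConvertFrom-Json as used later in this answer:
$data = $json | ConvertFrom-Json
# First property value of the nested 'attributes' object, without knowing the property's name:
$data[0].attributes.psobject.Properties.Value[0]   # -> the first entry's created-at value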
Thus, with spelling out the property name - and with making sure that the LHS [datetime] value is represented as an ISO 8601-formatted string too, via .ToString('o') - your statement should have been:
$json | Where-Object {
    $_.attributes.'created-at'.ToString('o') -ge $RFCfilterTimefilterDate
}
The fact that newer versions of ConvertFrom-Json implicitly parse timestamps as [datetime] is actually to your advantage - [datetime] values are comparable, so this simply means you can skip the step where you convert the threshold value to a string:
$data = $json | ConvertFrom-Json
$filterTime = (Get-Date).AddHours(-1)
$filteredData = $data | Where-Object {$_.attributes.'created-at' -ge $filterTime}
$jsonToPost = ConvertTo-Json -InputObject @($filteredData) -Depth 5

Inputting JSON data in Powershell

Currently, I'm attempting to call an API to run a POST, with JSON data as the body. I was wondering if anyone would be able to tell me how I need to format the text below inside the variable $postParams. I'm pretty new at working with JSON, so I'm having some trouble with this.
Currently, I only have the following and don't know what to do from the second line on.
$postParams = @{name='Example'}
Here's is the entire data I was hoping to add to $postParams. So if you could help me with the 2nd, 4th, and 8th that'd be awesome. Thanks!
{
  "name":"Example",
  "template":{"name":"Template"},
  "url":"http://localhost",
  "page":{"name":"Landing Page"},
  "smtp":{"name":"Sending Profile"},
  "launch_date":"2019-10-08T17:20:00+00:00",
  "send_by_date":null,
  "groups":[{"name":"test group"}]
}
You'll need a here-string and ConvertFrom-Json.
here-string:
Quotation marks are also used to create a here-string. A here-string is a single-quoted or double-quoted string in which quotation marks are interpreted literally. A here-string can span multiple lines. All the lines in a here-string are interpreted as strings, even though they are not enclosed in quotation marks.
The resulting code:
# Use a PowerShell here string to take JSON as it is
$jsonString = #"
{
"name":"Example",
"template":{"name":"Template"},
"url":"http://localhost",
"page":{"name":"Landing Page"},
"smtp":{"name":"Sending Profile"},
"launch_date":"2019-10-08T17:20:00+00:00",
"send_by_date":null,
"groups":[{"name":"test group"}]
}
"#
# Pipe the string to create a new JSON object
$jsonObject = $jsonString | ConvertFrom-Json
# The resulting JSON object has properties matching the properties in the orig. JSON
$jsonObject.name
$jsonObject.url
# Nested property
$jsonObject.template.name
# Nested property in array
$jsonObject.groups[0].name
I've posted an online version of the above code at tio.run, so you can play around with it.
If you want to update several properties of the $jsonObject you can do the following:
$jsonObject.name = "NEW NAME"
$jsonObject.url = "NEW URL"
$jsonObject | ConvertTo-Json
ConvertTo-Json will take your object and create an appropriate JSON string:
{
  "name": "NEW NAME",
  "template": {
    "name": "Template"
  },
  "url": "NEW URL",
  "page": {
    "name": "Landing Page"
  },
  "smtp": {
    "name": "Sending Profile"
  },
  "launch_date": "2019-10-08T17:20:00+00:00",
  "send_by_date": null,
  "groups": [
    {
      "name": "test group"
    }
  ]
}
If your $jsonObject has more than two levels of depth, use the -Depth parameter; otherwise, not all object information will be included in the JSON string.
ConvertTo-Json:
-Depth
Specifies how many levels of contained objects are included in the JSON representation. The default value is 2.
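To illustrate, a minimal sketch with a hypothetical nested object:
# Hypothetical hashtable with more than 2 levels of nesting:
$deep = @{ level1 = @{ level2 = @{ level3 = @{ value = 42 } } } }
$deep | ConvertTo-Json            # levels below the default depth of 2 are flattened to strings
$deep | ConvertTo-Json -Depth 5   # the full structure is preserved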
Here is a tio.run link to a ConvertTo-Json example.
Hope that helps.
I can't test it currently, but try this.
$postParams = @'
{
  "name":"Example",
  "template":{"name":"Template"},
  "url":"http://localhost",
  "page":{"name":"Landing Page"},
  "smtp":{"name":"Sending Profile"},
  "launch_date":"2019-10-08T17:20:00+00:00",
  "send_by_date":null,
  "groups":[{"name":"test group"}]
}
'@
Make a hashtable, then convert to JSON:
$Hashtable = @{
    Key1 = "Value1"
    Key2 = "Value2"
}
$Json = $Hashtable | ConvertTo-Json
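Since the ultimate goal is to POST this as a request body, the resulting JSON string could then be sent with Invoke-RestMethod; a minimal sketch (the URL and endpoint are hypothetical):
# Hypothetical endpoint - substitute your API's actual URL:
Invoke-RestMethod -Uri 'http://localhost/api/campaigns' -Method Post -Body $Json -ContentType 'application/json'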

Compare-Object in Powershell for 2 objects based on a field within. Objects populated by JSON and XML

Apologies for my lack of PowerShell knowledge; I have been searching far and wide for a solution, as I am not much of a programmer.
Background:
I am currently trying to standardise some site settings in Incapsula. To do this I want to maintain a local XML file with rules and use some PowerShell to pull down the existing rules and compare them with what is there, to ensure I'm not doubling up. I am taking this approach of trying to only apply the deltas because:
For most settings Incapsula is not smart enough to know a rule already exists
What can be posted to the API differs from what is returned by the API
Examples:
Below is an example of what the API returns on request; it is in JSON format.
JSON FROM WEBSITE
{
  "security": {
    "waf": {
      "rules": [{
        "id": "api.threats.sql_injection",
        "exceptions": [{
          "values": [{
            "urls": [{
              "value": "google.com/thisurl",
              "pattern": "EQUALS"
            }],
            "id": "api.rule_exception_type.url",
            "name": "URL"
          }],
          "id": 256354634
        }]
      }, {
        "id": "api.threats.cross_site_scripting",
        "action": "api.threats.action.block_request",
        "exceptions": [{
          "values": [{
            "urls": [{
              "value": "google.com/anotherurl",
              "pattern": "EQUALS"
            }],
            "id": "api.rule_exception_type.url",
            "name": "URL"
          }],
          "id": 78908790780
        }]
      }]
    }
  }
}
And this is the format of the XML with our specific site settings in it
OUR XML RULES
<waf>
  <ruleset>
    <rule>
      <id>api.threats.sql_injection</id>
      <exceptions>
        <exception>
          <type>api.rule_exception_type.url</type>
          <url>google.com/thisurl</url>
        </exception>
        <exception>
          <type>api.rule_exception_type.url</type>
          <url>google.com/thisanotherurl</url>
        </exception>
      </exceptions>
    </rule>
    <rule>
      <id>api.threats.cross_site_scripting</id>
      <exceptions>
        <exception>
          <type>api.rule_exception_type.url</type>
          <url>google.com/anotherurl</url>
        </exception>
        <exception>
          <type>api.rule_exception_type.url</type>
          <url>google.com/anotherurl2</url>
        </exception>
      </exceptions>
    </rule>
  </ruleset>
</waf>
I have successfully been able to compare other settings from the site against the XML using the Compare-Object command; however, those had simpler nesting and didn't give me as much trouble. I'm stuck on whether this is a logic problem or a limitation of Compare-Object. Example code is below; it requires the supplied JSON and XML saved as stack.json / stack.xml in the same directory and should produce the result mentioned:
$existingWaf = Get-Content -Path stack.json | ConvertFrom-Json
[xml]$xmlFile = Get-Content -Path stack.xml
foreach ($rule in $xmlFile)
{
    $ruleSet = $rule.waf.ruleset
}
foreach ($siteRule in $ExistingWaf.security.waf.rules)
{
    foreach ($xmlRule in $ruleSet)
    {
        if ($xmlRule.rule.id -eq $siteRule.id)
        {
            write-output "yes"
            $delta = Compare-Object -ReferenceObject @($siteRule.exceptions.values.urls.value | Select-Object) -DifferenceObject @($xmlRule.rule.exceptions.exception.url | Select-Object) -IncludeEqual | where {$xmlRule.rule.id -eq $siteRule.id}
            $delta
        }
    }
}
This is kind of working, but not quite what I wanted. I do get a comparison between the objects, but not per specific id; it shows me the results below:
InputObject SideIndicator
----------- -------------
google.com/thisurl ==
google.com/thisanotherurl =>
google.com/anotherurl =>
google.com/anotherurl2 =>
google.com/anotherurl ==
google.com/thisurl =>
google.com/thisanotherurl =>
google.com/anotherurl2 =>
Whereas I am more after:
InputObject SideIndicator
----------- -------------
google.com/thisurl ==
google.com/thisanotherurl =>
google.com/anotherurl ==
google.com/anotherurl2 =>
Hopefully that makes sense.
Is it possible to only do the compares only on the values where the ids match?
Please let me know if you have any further questions.
Thanks.
The problem was your iteration logic, which mistakenly processed multiple rules from the XML document in a single iteration:
foreach ($xmlRule in $ruleSet) didn't enumerate anything - instead it processed the single <ruleset> element; to enumerate the child <rule> elements, you must use $ruleSet.rule.
$xmlRule.rule.exceptions.exception.url then implicitly iterated over all <rule> children and therefore reported the URLs across all of them, which explains the extra lines in your Compare-Object output.
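To see this member-access enumeration at work, here is a small sketch against the stack.xml sample above (variable names are illustrative):
[xml] $xml = Get-Content -Raw stack.xml
$xml.waf.ruleset.rule.exceptions.exception.url      # URLs from *all* <rule> elements, flattened
$xml.waf.ruleset.rule[0].exceptions.exception.url   # URLs from the first <rule> only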
Here's a streamlined, annotated version of your code:
$existingWaf = Get-Content -LiteralPath stack.json | ConvertFrom-Json
$xmlFile = [xml] (Get-Content -Raw -LiteralPath stack.xml)

# No need for a loop; $xmlFile is a single [System.Xml.XmlDocument] instance.
$ruleSet = $xmlFile.waf.ruleset

foreach ($siteRule in $existingWaf.security.waf.rules)
{
    # !! Note the addition of `.rule`, which ensures that the rules
    # !! are enumerated *one by one*.
    foreach ($xmlRule in $ruleSet.rule)
    {
        if ($xmlRule.id -eq $siteRule.id)
        {
            # !! Note: `$xmlRule` is now a single rule, therefore:
            # !!       `$xmlRule.rule.[...]` -> `$xmlRule.[...]`
            # Also note that neither @(...) nor Select-Object are needed, and
            # the `| where ...` (Where-Object) call is not needed either.
            Compare-Object -ReferenceObject $siteRule.exceptions.values.urls.value `
                           -DifferenceObject $xmlRule.exceptions.exception.url -IncludeEqual
        }
    }
}
Additional observations regarding your code:
There is no need to ensure that the operands passed to Compare-Object are arrays, so you can drop the array-subexpression operator @(...); Compare-Object handles scalar operands fine.
... | Select-Object is a virtual no-op - the input object is passed through.[1]
... | Where-Object {$xmlRule.rule.id -eq $siteRule.id} is pointless, because it duplicates the enclosing foreach loop's condition.
Generally speaking, because you're not referencing the pipeline input object at hand via automatic variable $_, your Where-Object filter is static and will either match all input objects (as in your case) or none.
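A tiny demonstration of that behavior:
# A script block that never references $_ evaluates the same way for every input object:
1..3 | Where-Object { 1 -eq 1 }   # statically $true  -> passes all input objects
1..3 | Where-Object { 1 -eq 2 }   # statically $false -> passes none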
[1] There is a subtle, invisible side effect that typically won't make a difference: Select-Object adds an invisible [psobject] wrapper around the input object, which on rare occasions does cause different behavior later -
see this GitHub issue.

jq construct with value strings spanning multiple lines

I am trying to form a JSON construct using jq that should ideally look like below:-
{
"api_key": "XXXXXXXXXX-7AC9-D655F83B4825",
"app_guid": "XXXXXXXXXXXXXX",
"time_start": 1508677200,
"time_end": 1508763600,
"traffic": [
"event"
],
"traffic_including": [
"unattributed_traffic"
],
"time_zone": "Australia/NSW",
"delivery_format": "csv",
"columns_order": [
"attribution_attribution_action",
"attribution_campaign",
"attribution_campaign_id",
"attribution_creative",
"attribution_date_adjusted",
"attribution_date_utc",
"attribution_matched_by",
"attribution_matched_to",
"attribution_network",
"attribution_network_id",
"attribution_seconds_since",
"attribution_site_id",
"attribution_site_id",
"attribution_tier",
"attribution_timestamp",
"attribution_timestamp_adjusted",
"attribution_tracker",
"attribution_tracker_id",
"attribution_tracker_name",
"count",
"custom_dimensions",
"device_id_adid",
"device_id_android_id",
"device_id_custom",
"device_id_idfa",
"device_id_idfv",
"device_id_kochava",
"device_os",
"device_type",
"device_version",
"dimension_count",
"dimension_data",
"dimension_sum",
"event_name",
"event_time_registered",
"geo_city",
"geo_country",
"geo_lat",
"geo_lon",
"geo_region",
"identity_link",
"install_date_adjusted",
"install_date_utc",
"install_device_version",
"install_devices_adid",
"install_devices_android_id",
"install_devices_custom",
"install_devices_email_0",
"install_devices_email_1",
"install_devices_idfa",
"install_devices_ids",
"install_devices_ip",
"install_devices_waid",
"install_matched_by",
"install_matched_on",
"install_receipt_status",
"install_san_original",
"install_status",
"request_ip",
"request_ua",
"timestamp_adjusted",
"timestamp_utc"
]
}
What I have tried unsuccessfully thus far is below:-
json_construct=$(cat <<EOF
{
"api_key": "6AEC90B5-4169-59AF-7AC9-D655F83B4825",
"app_guid": "komacca-s-rewards-app-au-ios-production-cv8tx71",
"time_start": 1508677200,
"time_end": 1508763600,
"traffic": ["event"],
"traffic_including": ["unattributed_traffic"],
"time_zone": "Australia/NSW",
"delivery_format": "csv"
"columns_order": ["attribution_attribution_action","attribution_campaign","attribution_campaign_id","attribution_creative","attribution_date_adjusted","attribution_date_utc","attribution_matched_by","attribution_matched_to","attributio
network","attribution_network_id","attribution_seconds_since","attribution_site_id","attribution_tier","attribution_timestamp","attribution_timestamp_adjusted","attribution_tracker","attribution_tracker_id","attribution_tracker_name","
unt","custom_dimensions","device_id_adid","device_id_android_id","device_id_custom","device_id_idfa","device_id_idfv","device_id_kochava","device_os","device_type","device_version","dimension_count","dimension_data","dimension_sum","ev
t_name","event_time_registered","geo_city","geo_country","geo_lat","geo_lon","geo_region","identity_link","install_date_adjusted","install_date_utc","install_device_version","install_devices_adid","install_devices_android_id","install_
vices_custom","install_devices_email_0","install_devices_email_1","install_devices_idfa","install_devices_ids","install_devices_ip","install_devices_waid","install_matched_by","install_matched_on","install_receipt_status","install_san_
iginal","install_status","request_ip","request_ua","timestamp_adjusted","timestamp_utc"]
}
EOF)
followed by:-
echo "$json_construct" | jq '.'
I get the following error:-
parse error: Expected separator between values at line 10, column 15
I am guessing that jq is unable to parse it because of the string literals that span multiple lines.
Use jq itself:
my_formatted_json=$(jq -n '{
  "api_key": "XXXXXXXXXX-7AC9-D655F83B4825",
  "app_guid": "XXXXXXXXXXXXXX",
  "time_start": 1508677200,
  "time_end": 1508763600,
  "traffic": ["event"],
  "traffic_including": ["unattributed_traffic"],
  "time_zone": "Australia/NSW",
  "delivery_format": "csv",
  "columns_order": [
    "attribution_attribution_action",
    "attribution_campaign",
    ...,
    "timestamp_utc"
  ]
}')
Your input "JSON" is not valid JSON, as indicated by the error message.
The first error is that a comma is missing after the key/value pair: "delivery_format": "csv", but there are others -- notably, JSON strings cannot be split across lines. Once you fix the key/value pair problem and the JSON strings that are split incorrectly, jq . will work with your text. (Note that once your input is corrected, the longest JSON string is quite short -- 50 characters or so -- whereas jq has no problems processing strings of length 10^8 quite speedily ...)
Generally, jq is rather permissive when it comes to JSON-like input, but if you're ever in doubt, it would make sense to use a validator such as the online validator at jsonlint.com
By the way, the jq FAQ does suggest various ways for handling input that isn't strictly JSON -- see https://github.com/stedolan/jq/wiki/FAQ#processing-not-quite-valid-json
Along the lines of chepner's suggestion, since jq can read raw text data, you could just use a jq filter to generate a legal JSON object from your script variables. For example:
#!/bin/bash
# whatever logic you have to obtain bash variables goes here
key=XXXXXXXXXX-7AC9-D655F83B4825
guid=XXXXXXXXXXXXXX
# now use jq filter to read raw text and construct legal json object
json_construct=$(jq -MRn '[inputs]|map(split(" ")|{(.[0]):.[1]})|add' <<EOF
api_key $key
app_guid $guid
EOF)
echo $json_construct
Sample Run (assumes executable script is in script.sh)
$ ./script.sh
{ "api_key": "XXXXXXXXXX-7AC9-D655F83B4825", "app_guid": "XXXXXXXXXXXXXX" }
Try it online!

Need to get all key value pairs from a JSON containing a specific character '/'

I have specific JSON content for which I need to get all keys whose values contain the character /.
JSON
{ "dig": "sha256:d2aae00e4bc6424d8a6ae7639d41cfff8c5aa56fc6f573e64552a62f35b6293e",
"name": "example",
"binding": {
"wf.example.input1": "/path/to/file1",
"wf.example.input2": "hello",
"wf.example.input3":
["/path/to/file3",
"/path/to/file4"],
"wf.example.input4": 44
}
}
I know I can get all the keys containing file path or array of file paths using query jq 'paths(type == "string" and contains("/"))'. This would give me an output like:
[ "binding", "wf.example.input1" ]
[ "binding", "wf.example.input3", 0]
[ "binding", "wf.example.input3", 1 ]
Now that I have all the elements that contain file paths as their values, is there a way to fetch both the key and the value for each match and then store them as another JSON document? For example, for the JSON in this question, I need to get the output as another JSON document containing all the matched paths. My output JSON should look something like below.
{ "binding":
{ "wf.example.input1": "/path/to/file1",
"wf.example.input3": [ "/path/to/file3", "/path/to/file4" ]
}
}
The following jq filter will produce the desired output if given input that is very similar to the example, but it is far from robust and glosses over some details that are unclear from the problem description. However, it should be easy enough to modify the filter in accordance with more precise specifications:
. as $in
| reduce paths(type == "string" and test("/")) as $path ({};
    ($in|getpath($path)) as $x
    | if ($path[-1]|type) == "string"
      then .[$path[-1]] = $x
      else .[$path[-2]|tostring] += [$x]
      end )
| {binding: .}
Output:
{
  "binding": {
    "wf.example.input1": "/path/to/file1",
    "wf.example.input3": [
      "/path/to/file3",
      "/path/to/file4"
    ]
  }
}