altering json value to be substring

I have a value in a JSON result set that I would like to alter to be a substring of the original:
{
"label": "web page check",
"target": "http://www.example.com/random/page"
},
{
"label": "web page check1 ",
"target": "http://www.example1.com/random/page"
},
What I would like to do is return it as:
{
"label": "web page check",
"target": "https://www.example.com"
},
{
"label": "web page check",
"target": "https://www.example1.com"
}
I have tried
jq '.[].target=(match(^https:\/\/[0-9a-zA-z.]*|^http:\/\/[0-9a-zA-z.]*).string)'
jq -c '.[] | {label: .label, target: (.target |=match(^https:\/\/[0-9a-zA-z.]*|^http:\/\/[0-9a-zA-z.]*).string})'

Using capture is often easier than using match. In your case, the following would be sufficient to modify the "target" values assuming your input is an array of objects along the lines suggested by the snippet:
map(.target |= (capture("https?(?<s>://[^/]*)") | "https" + .s))
Equivalently:
map(.target |= sub( "https?(?<s>://[^/]*).*"; "https" + .s) )

The first argument to sub (requires jq 1.5) can be any PCRE.
.[].target |= sub("(?<=com).*$"; "")
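For reference, a minimal complete invocation of the capture-based filter might look like this (checks.json is just a placeholder name for a file holding the array of objects):
jq 'map(.target |= (capture("https?(?<s>://[^/]*)") | "https" + .s))' checks.json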

Related

jq merge json via dynamic sub keys

I think I'm a step off from figuring out how to jq reduce via filter a key to another object's sub-key.
I'm trying to combine files (simplified from Elasticsearch's ILM Explain & ILM Policy API responses):
$ echo '{".siem-signals-default": {"modified_date": "siem", "version": 1 }, "kibana-event-log-policy": {"modified_date": "kibana", "version": 1 } }' > ip1.json
$ echo '{"indices": {".siem-signals-default-000001": {"action": "complete", "index": ".siem-signals-default-000001", "policy" : ".siem-signals-default"} } }' > ie1.json
Such that the resulting JSON is:
{
".siem-signals-default-000001": {
"modified_date": "siem",
"version": 1
"action": "complete",
"index": ".siem-signals-default-000001",
"policy": ".siem-signals-default"
}
}
Where ie1 is the base JSON, and for a child object, its sub-element policy should line up with ip1's key so that ip1's sub-elements are copied into it. I've been trying to build off this, this, and this (from StackOverflow, also this, this, this from external sources). I'll list various rabbit-hole attempts building off these, but they're all insufficient:
$ ((cat ie1.json | jq '.indices') && cat ip1.json) | jq -s 'map(to_entries)|flatten|from_entries' | jq '. as $v| reduce keys[] as $k({}; if true then .[$k] += $v[$k] else . end)'
{
".siem-signals-default": {
"modified_date": "siem",
"version": 1
},
".siem-signals-default-000001": {
"action": "complete",
"index": ".siem-signals-default-000001",
"policy": ".siem-signals-default"
},
"kibana-event-log-policy": {
"modified_date": "kibana",
"version": 1
}
}
$ jq --slurpfile ip1 ip1.json '.indices as $ie1|$ie1+{ilm: $ip1 }' ie1.json
{
".siem-signals-default-000001": {
"action": "complete",
"index": ".siem-signals-default-000001",
"policy": ".siem-signals-default"
},
"ilm": [
{
".siem-signals-default": {
"modified_date": "siem",
"version": 1
},
"kibana-event-log-policy": {
"modified_date": "kibana",
"version": 1
}
}
]
}
I also expected something like this to work, but it gives a compile error
$ jq -s ip1 ip1.json '. as $ie1|$ie1 + {ilm:(keys[] as $k; $ip1 | select(.policy == $ie1[$k]) | $ie1[$k] )}' ie1.json
jq: error: ip1/0 is not defined at <top-level>, line 1:
ip1
jq: 1 compile error
From this you can see I've determined various ways to join the separate files, but though I have code that I thought would handle the filtering, it's not correct / not taking effect. Does anyone have an idea how to get the filter part working? TIA
This assumes you are trying to combine the .indices object stored in ie1.json with an object within the object stored in ip1.json. As the keys to match upon are different, I further assumed that you want to match the field name from the .indices object, reduced by cutting off everything that comes after the last dash -, to the same key in the object from ip1.json.
To this end, ip1.json is read in via input as $ip (alternatively you can use jq --argfile ip ip1.json for that); then the .indices object is taken from the first input ie1.json, and each inner value, accessed via with_entries(.value …), is extended with the result of looking up the correspondingly reduced .key in $ip.
jq '
input as $ip | .indices | with_entries(.value += $ip[.key | sub("-[^-]*$";"")])
' ie1.json ip1.json
{
".siem-signals-default-000001": {
"action": "complete",
"index": ".siem-signals-default-000001",
"policy": ".siem-signals-default",
"modified_date": "siem",
"version": 1
}
}
If instead of the .indices object's inner field name you want to have the content of field .index as reference (which in your sample data has the same value), you can go with map_values instead of with_entries as you don't need the field's name anymore.
jq '
input as $ip | .indices | map_values(. += $ip[.index | sub("-[^-]*$";"")])
' ie1.json ip1.json
Note: I used sub with a regex to manipulate the key name, which you can easily adjust to your liking if in reality it is more complicated. If, however, the pattern is in fact as simple as cutting off after the last dash, then using .[:rindex("-")] instead will also get the job done.
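For completeness, here is a sketch of that simpler variant dropped into the same command (same assumptions as above):
jq '
input as $ip | .indices | with_entries(.value += $ip[.key | .[:rindex("-")]])
' ie1.json ip1.json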
I also received offline feedback of a simple "workable for my use case" but not exact answer:
$ jq '.indices | map(. * input[.policy])' ie1.json ip1.json
[
{
"action": "complete",
"index": ".siem-signals-default-000001",
"policy": ".siem-signals-default",
"modified_date": "siem",
"version": 1
}
]
Posting in case someone runs into something similar, but the other answer's better.

Combine files in jq based on similar ID object and reform data

Preface: If the following is not possible with jq, then I completely accept that as an answer and will try to force this with bash.
I have two files that contain some IDs that, with some massaging, should be able to be combined into a single file. I have some content that I'll add to that as well (as seen in output). Essentially "mitre_test" should get compared to "sys_id". When compared, the "mitreid" from in2.json becomes technique_ID in the output (and is generally the unifying field of each output object).
Caveats:
There are some junk "desc" values placed in the in1.json that are there to make sure this is as programmatic as possible, and there are actually numerous junk inputs on the true input file I am using.
Some of the mitre_test values have pairs and are not in a real array. I can split on those and break them out, but I find myself losing the other information from in1.json.
Notice in the "metadata" for the output that it contains the "number" values from in1.json, stored in a weird way (but the way that the receiving tool requires).
in1.json
[
{
"test": "Execution",
"mitreid": "T1204.001",
"mitre_test": "90b"
},
{
"test": "Defense Evasion",
"mitreid": "T1070.001",
"mitre_test": "afa"
},
{
"test": "Credential Access",
"mitreid": "T1556.004",
"mitre_test": "14b"
},
{
"test": "Initial Access",
"mitreid": "T1200",
"mitre_test": "f22"
},
{
"test": "Impact",
"mitreid": "T1489",
"mitre_test": "fa2"
}
]
in2.json
[
{
"number": "REL0001346",
"desc": "apple",
"mitre_test": "afa"
},
{
"number": "REL0001343",
"desc": "pear",
"mitre_test": "90b"
},
{
"number": "REL0001366",
"desc": "orange",
"mitre_test": "14b,f22"
},
{
"number": "REL0001378",
"desc": "pineapple",
"mitre_test": "90b"
}
]
The output:
[{
"techniqueID": "T1070.001",
"tactic": "defense-evasion",
"score": 1,
"color": "",
"comment": "",
"enabled": true,
"metadata": [{
"name": "DET_ID",
"value": "REL0001346"
}],
"showSubtechniques": true
},
{
"techniqueID": "T1204.001",
"tactic": "execution",
"score": 1,
"color": "",
"comment": "",
"enabled": true,
"metadata": [{
"name": "DET_ID",
"value": "REL0001343"
},
{
"name": "DET_ID",
"value": "REL0001378"
}],
"showSubtechniques": true
},
{
"techniqueID": "T1556.004",
"tactic": "credential-access",
"score": 1,
"color": "",
"comment": "",
"enabled": true,
"metadata": [{
"name": "DET_ID",
"value": "REL0001366"
}],
"showSubtechniques": true
},
{
"techniqueID": "T1200",
"tactic": "initial-access",
"score": 1,
"color": "",
"comment": "",
"enabled": true,
"metadata": [{
"name": "DET_ID",
"value": "REL0001366"
}],
"showSubtechniques": true
}
]
I'm assuming I have some splitting to do on mitre_test with something like .mitre_test |= split(","), and I'm assuming there are some joins, but doing so causes data loss or mixes up the data. You'll notice the static data in the output exists as well, but it is likely easy to place in and as such isn't as much of an issue.
Edit: reduced some of the match IDs so that it is easier to look at while analyzing the in1 and in2 files. Also simplified the two inputs to have a similar structure so that the answer is easier to understand later.
The requirements are somewhat opaque but it's fairly clear that if the task can be done by computer, it can be done using jq.
From the description, it would appear that one of the unusual aspects of the problem is that the "dictionary" defined by in1.json must be derived by splitting the key names that are CSV (comma-separated values). Here therefore is a jq def that will do that:
# Input: a JSON dictionary for which some keys are CSV,
# Output: a JSON dictionary with the CSV keys split on the commas
def refine:
. as $in
| reduce keys_unsorted[] as $k ({};
if ($k|index(","))
then ($k/",") as $keys
| . + ($keys | map( {(.): $in[$k]}) | add)
else .[$k] = $in[$k]
end );
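To illustrate what refine does with a CSV key, here is a made-up micro-example (not the question's data):
{"a,b": 1, "c": 2} | refine     # => {"a": 1, "b": 1, "c": 2}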
You can see how this works by running:
INDEX($mitre.records[]; .mitre_test) | refine
using an invocation of jq such as:
jq --argfile mitre in1.json -f program.jq in2.json
For the joining part of the problem, there are many relevant Q&As on SO, e.g.
How to join JSON objects on particular fields using jq?
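For what it's worth, here is one rough end-to-end sketch of the joining step against the simplified in1.json/in2.json shown above. It is an illustration only, not the refine-based approach; the names rough-sketch.jq and $det are made up, the tactic lower-casing is an assumption based on the sample output, and the resulting array may not be in the exact order shown in the question:
jq --argfile det in2.json -f rough-sketch.jq in1.json
# rough-sketch.jq: for each in1 entry, collect the in2 numbers whose (possibly CSV) mitre_test includes it
map(
  .mitre_test as $id
  | { techniqueID: .mitreid,
      tactic: (.test | ascii_downcase | gsub(" "; "-")),
      score: 1, color: "", comment: "", enabled: true,
      metadata: [ $det[]
                  | select(.mitre_test | split(",") | index($id))
                  | {name: "DET_ID", value: .number} ],
      showSubtechniques: true }
  | select((.metadata | length) > 0)
)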
There is probably a much more elegant way to do this, but I ended up manually walking around things and piping to new output.
Explanation:
Read in both files, pull the fields I need.
Break out the mitre_test values that were previously just a comma separated set of values with map and try.
Store the non-changing fields as a variable and then manipulate mitre_test to become an appropriately split array, removing nulls.
Group by mitre_test values, since they are the common thing that the output is based on.
Cleanup more nulls.
Sort output to look like I want it.
jq . in1.json in2.json | \
jq '.[] | {number: .number, test: .test, mitreid: .mitreid, mitre_test: .mitre_test}' | \
jq -s '[. | map(try(.mitre_test |= split(",")) // .)
  | .[] | [.number,.test,.mitreid] as $h | .mitre_test[] | $h + [.]
  | {DET_ID: .[0], tactic: .[1], techniqueID: .[2], mitre_test: .[3]}]
  | del(.[][] | nulls)' | \
jq '[group_by(.mitre_test)[] | {mitre_test: .[0].mitre_test, techniqueID: [.[].techniqueID], tactic: [.[].tactic], DET_ID: [.[].DET_ID]}]
  | del(.[].techniqueID[] | nulls) | del(.[].tactic[] | nulls) | del(.[].DET_ID[] | nulls)' | \
jq '.[] | [{techniqueID: .techniqueID[0], tactic: .tactic[0], metadata: [{name: "DET_ID", value: .DET_ID[]}]}] | .[]
  | select((.metadata | length) > 0)'
It was a long line, so I split it among some of the basic ideas.

Omitting null values for sub() in JQ

I'm trying to change # to %23 in every context value, but I'm having a problem with null values.
The shortened JSON is:
{
"stats": {
"suites": 1
},
"results": [
{
"uuid": "676-a46b-47a1-a49f-4da4e46c1120",
"title": "",
"suites": [
{
"uuid": "gghjh-56a9-4713-b139-0d5b36bc7fbc",
"title": "Login process",
"tests": [
{
"pass": false,
"fail": true,
"pending": false,
"context": "\"screenshots/login.spec.js/Login process -- should login #11 (failed).png\""
},
{
"pass": false,
"fail": false,
"pending": true,
"context": null
}
]
}
]
}
]
}
And the jq command that I think is closest to correct is:
jq '.results[].suites[].tests[].context | strings | sub("#";"%23")'
But the problem is that I need to get the full edited file back in return. How could I achieve that?
You were close. To retain the original structure, you need to use the update operator (|=) instead of the pipe. Enclosing the entire expression to the left of it in parentheses is also necessary; otherwise the original input will be invisible to |=.
(.results[].suites[].tests[].context | strings) |= sub("#"; "%23")
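Note that sub only replaces the first # in each string; if a context value can contain more than one #, gsub (as used in the next answer) is the drop-in replacement:
(.results[].suites[].tests[].context | strings) |= gsub("#"; "%23")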
change # to %23 in every context value
You might wish to consider:
walk( if type=="object" and (.context|type)=="string"
then .context |= gsub("#"; "%23")
else . end )
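A complete invocation might look like this (report.json is just a placeholder for your actual file):
jq 'walk(if type=="object" and (.context|type)=="string" then .context |= gsub("#"; "%23") else . end)' report.json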

Replace subkey without exact path in jq

Example JSON file:
{
"u": "stuff",
"x": [1,2,3],
"y": {
"field": "value"
},
"z": {
"zz": {
"name": "change me",
"more": "stuff"
},
"randomKey": {
"name": "change me",
"random": "more stuff"
}
}
}
How can I update all the name fields to "something", maintaining the rest of the JSON file the same?
{
"u": "stuff",
"x": [1,2,3],
"y": {
"field": "value"
},
"z": {
"zz": {
"name": "something",
"more": "stuff"
},
"randomKey": {
"name": "something",
"random": "more stuff"
}
}
}
With a direct path, this would be easy, but the parent keys (z and randomKey in this case) vary.
I tried something like:
jq '.z | .. | .name? |= "something"' file.json
And it's updating the names, but it's also outputting all the recursive stuff.
If it is acceptable to change the "name" field wherever it occurs, you could use walk/1:
walk(if type == "object" and has("name") then .name = "something" else . end)
Please note that walk/1 was only included with jq after jq 1.5 was released. If your jq does not have it, then you can find its definition on the jq FAQ, for example.
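For reference, here is essentially that definition, roughly as it has circulated on the jq FAQ/wiki; paste it above your filter if your jq lacks walk:
def walk(f):
  . as $in
  | if type == "object" then
      reduce keys_unsorted[] as $key
        ( {}; . + { ($key): ($in[$key] | walk(f)) } ) | f
    elif type == "array" then map( walk(f) ) | f
    else f
    end;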
If you only want to modify the "name" field in the "z" context, then consider:
.z |= with_entries(if .value.name?
then .value.name = "something"
else . end)
Assuming every value within z has a name property, you could do this:
$ jq --arg newname 'something' '.z[].name = $newname' input.json
Using [] on an object will yield all the values contained in that object. And for each of those values, we were simply setting the name to the new name.
If you needed to be more selective with what gets updated, you'll have to add more conditions to what objects to update. In general, I'd use peak's approach, but here's another way it could be achieved using a structure similar to the first approach, assuming we only want to update objects that already have a name property:
$ jq --arg newname 'something' '(.z[] | select(has("name")).name) = $newname' input.json
It's important to wrap the LHS of the assignment in parentheses: we don't want to change the context prior to the assignment, otherwise we won't see the rest of the results.
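To see what goes wrong without the parentheses, here is a sketch against the same input.json: the pipe changes the context first, so jq emits only the modified inner objects of .z rather than the whole document, roughly:
$ jq --arg newname 'something' '.z[] | select(has("name")).name = $newname' input.json
{
  "name": "something",
  "more": "stuff"
}
{
  "name": "something",
  "random": "more stuff"
}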

Update one value in array of dicts, using jq

I want to update a value in a dict, which I can only identify by another value in the dict. That is, given this input:
[
{
"format": "geojson",
"id": "foo"
},
{
"format": "geojson",
"id": "bar"
},
{
"format": "zip",
"id": "baz"
}
]
I want to change baz's accompanying format to 'csv':
[
{
"format": "geojson",
"id": "foo"
},
{
"format": "geojson",
"id": "bar"
},
{
"format": "csv",
"id": "baz"
}
]
I have found that this works:
jq 'map(if .id=="baz" then .format="csv" else . end)' my.json
But this seems rather verbose, so I wonder if there is a more elegant way to express this. jq seems to be missing some kind of expression selector; the equivalent might be [#id='baz'] in XPath.
(When I started this question, I had [.[] |...], then I discovered map, so it's not quite as bad as I thought.)
A complex assignment is what you're looking for:
jq '(.[] | select(.id == "baz") | .format) |= "csv"' my.json
Perhaps not shorter but it is more elegant, as requested. See the last section of the docs at: http://stedolan.github.io/jq/manual/#Assignment
Edit: using map:
jq 'map((select(.id == "baz") | .format) |= "csv")' my.json
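A small variation (just a sketch, not from the original answer) that parameterizes the id and the new format via --arg, handy if they come from shell variables:
jq --arg id baz --arg fmt csv '(.[] | select(.id == $id) | .format) = $fmt' my.json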