Get value if object or string if string in jq array

I have a JSON object that looks like this:
[{"name":"NAME_1"},"NAME_2"]
I would like an output of
["NAME_1", "NAME_2"]
Some of the entries in the array are an object with a key "name" and some are just a string of the name. I am trying to extract an array of the names. Using
jq -cr '.[].name // []'
throws an error as it is trying to index .name of the string object. Is there a way to check if it is a string, and if so just use its value instead of .name?

echo '[{"name":"NAME_1"},"NAME_2"]' \
| jq '[ .[] | if (.|type) == "object" then .name else . end ]'
[
  "NAME_1",
  "NAME_2"
]
Ref:
https://stedolan.github.io/jq/manual/#ConditionalsandComparisons
https://stedolan.github.io/jq/manual/#type
As @LéaGris comments, a simpler version is:
jq '[ .[] | .name? // . ]' file
https://stedolan.github.io/jq/manual/#ErrorSuppression/OptionalOperator:%3f
https://stedolan.github.io/jq/manual/#Alternativeoperator://

You can use the type function which returns "object" for objects.
jq '.[] | if type == "object" then .name else . end' file.json
To get the output as array, just wrap the whole expression into [ ... ].
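That is, the wrapped command would be:
jq '[ .[] | if type == "object" then .name else . end ]' file.json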

Just use the error suppression operator with ?, map and scalars
jq 'map( .name?, scalars )'
Note that using scalars assumes that, apart from the objects with a name key, every other entry is itself a name (of the form NAME_*). If the array may also contain other strings that you need to exclude, you will have to add some additional logic, e.g. using startswith(..) with a prefix of your choice:
map( .name?, select( scalars | startswith("NAME") ) )
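For example, with an extra non-name string added to the sample input (a usage sketch):
echo '[{"name":"NAME_1"},"NAME_2","other"]' \
| jq -c 'map( .name?, select( scalars | startswith("NAME") ) )'
["NAME_1","NAME_2"]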

With the samples shown, please try the following jq code, which uses the tostream function to get the required values.
jq -c '[.[] | tostream | if .[1] != null then .[1] else empty end]' Input_file
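This works because tostream turns each element into [path, leaf-value] events, and the filter keeps only the events that carry a value. On the sample input, the intermediate stream looks like this (a sketch):
echo '[{"name":"NAME_1"},"NAME_2"]' | jq -c '.[] | tostream'
[["name"],"NAME_1"]
[["name"]]
[[],"NAME_2"]
Keeping only the events whose second element is non-null yields "NAME_1" and "NAME_2".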

Convert value of json from int to string using jq

Given a json that looks something like:
[{"id":1,"firstName":"firstName1","lastName":"lastName1"},
{"id":2,"firstName":"firstName2","lastName":"lastName2"},
{"id":3,"firstName":"firstName3","lastName":"lastName3"}]
What would be the best way to convert the id value from an int to a string and then save the file?
I have tried:
echo "$(jq -r '[.[] | .id = .id|tostring]' test.json)" > test.json
But that seems to put each entry into a string and add backslashes:
[
  "{\"id\":1,\"firstName\":\"firstName1\",\"lastName\":\"lastName1\"}",
  "{\"id\":2,\"firstName\":\"firstName2\",\"lastName\":\"lastName2\"}",
  "{\"id\":3,\"firstName\":\"firstName3\",\"lastName\":\"lastName3\"}"
]
| has a lower priority than the assignment (=). The expression .id = .id | tostring is interpreted as (.id = .id) | tostring.
The assignment does not change anything and can be removed. The script then becomes [ .[] | tostring ], which explains the output (each object is serialized as a JSON string).
The solution is to use parentheses to enforce the desired order of execution.
The command is:
jq '[ .[] | .id = (.id | tostring) ]' test.json
Do not use command substitution ($(...)) to compose an echo command line. It is inefficient and not needed.
Redirect the output of jq directly to a file. Use a different file than the input file (or it ends up destroying your data).
jq '[ .[] | .id = (.id | tostring) ]' test.json > output.json
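If the goal is to update test.json in place, a common pattern (a sketch, not part of the original answer) is to write to a temporary file and then replace the original; tools such as sponge from moreutils achieve the same effect:
jq '[ .[] | .id = (.id | tostring) ]' test.json > test.json.tmp && mv test.json.tmp test.json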

How to pack a multiline key=value string into an array of objects?

I have a multiline string like
a=aValue
b=bValue
c=cValue
a=dValue
b=eValue
c=fValue
How can I use jq to get JSON like this?
[
{"a": "aValue", "b": "bValue", "c": "cValue"},
{"a": "dValue", "b": "eValue", "c": "fValue"}
]
Here's an answer that is not tied to the number of distinct keys, and avoids slurping the lines (i.e., has minimal memory requirements):
jq -Rn 'foreach (inputs, null) as $in ({};
  if $in == null then .emit = .object
  else ($in | capture("(?<key>[^=]*)=(?<value>.*)") // null) as $kv
  | if $kv == null
    then .
    elif (.object | . and has($kv.key))
    then .emit = .object | .object = ([$kv]|from_entries)
    else .emit = null | .object += ([$kv]|from_entries)
    end
  end ;
  select(.emit).emit )'
The trick here is to use inputs,null so that the "end of file" condition is handled properly.
Note that the above produces a stream, so if you want all the objects in an array, simply enclose the entire jq program in square brackets:
jq -Rn '[ .... ]'
If you don't have any problem with slurping the input, this should do the trick:
jq -R -s '[ split("\n") | map(split("=") | {(.[0]): .[1]}) | _nwise(3) | add ]' file
_nwise(n) is an undocumented internal function that, given an array, emits a stream of subarrays of length n; its definition can be found in the jq source. For everything else, see the manual.
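If your jq build does not provide _nwise, a helper with the same behavior can be defined along these lines (a sketch; the names nwise and go are arbitrary):
def nwise($n):
  def go: if length <= $n then . else .[0:$n], (.[$n:] | go) end;
  go;
With that definition in place, nwise(3) can be used in the filter above instead of _nwise(3).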
If values may contain equals signs, use .[1:] | join("=") instead of .[1].
For a generally-applicable, memory-friendly approach, see peak's answer.

Create JSON from string with format "key1=value1,key2=value2" using jq

I'm trying to create a json file from a string with the following format:
string="key1=value1,key2=value2"
Is there a way to create a json using jq by specifying the = and , symbols as separators for the keys and values?
The output I'm looking for would be:
{"key1": "value1", "key2” :”value2"}
I've tried to use this post as a reference:
Create JSON using jq from pipe-separated keys and values in bash -- however, it expects input that contains a line with only keys, before later lines with only values; here, the keys and values are all interspersed.
Here's a reduce-free solution that assumes string is the shell variable (not part of the string to be parsed), and that parsing of the string can be accomplished by first splitting on ",":
jq -R 'split(",")
| map( index("=") as $i | {(.[0:$i]) : .[$i+1:]})
| add' <<< "$string"
Notice that this allows "=" to appear within the values.
The only trickiness here is that when a key name is specified programmatically, it must be enclosed within parentheses.
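As a small illustration of that rule (a sketch, not tied to the question's data), a computed key works only when parenthesized:
jq -n '{("key" + "1"): "value1"}'
{
  "key1": "value1"
}
Without the parentheses, jq rejects the expression with a syntax error.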
Supplemental question
string="key1=value1|key2=value2,value3|key3=value4"
In this case, you would first split on "|", and then find the first occurrence of "=":
split("|")
| map( index("=") as $i | {(.[0:$i]) : .[$i+1:]})
| add
| map_values(if index(",") then split(",") else . end)
Output:
{
  "key1": "value1",
  "key2": [
    "value2",
    "value3"
  ],
  "key3": "value4"
}
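Putting the supplemental pieces together, the full command would look something like this (a sketch, assuming the second string shown above is in the shell variable string):
jq -R 'split("|")
  | map( index("=") as $i | {(.[0:$i]) : .[$i+1:]})
  | add
  | map_values(if index(",") then split(",") else . end)' <<< "$string"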
string="key1=value1,key2=value2"
jq -Rc '
split(",")
| [.[] | match( "([^=]*)=(.*)" )]
| reduce .[].captures as $item ({}; .[$item[0].string]=$item[1].string)
' <<<"$string"
echo -n "key1=value1,key2=value2" | \
jq -csR '[split(",")[]|split("=") | {(.[0]): .[1]}]|add'
this gives
{"key1":"value1","key2":"value2"}

How to get root keys and key types using jq

Let's take this simple data file: http://data.cdc.gov/data.json
I know how to get the root key names:
jq keys_unsorted[] -r data.json
which produces:
@context
@id
@type
conformsTo
describedBy
dataset
And I know how to get the key types:
jq 'map(type)' data.json
Which produces:
[
  "string",
  "string",
  "string",
  "string",
  "string",
  "array"
]
Is there some way of combining these to return pairs? (What I am really trying to do is find the key name of the first root-level array, if any.) I can write a routine to figure it out, but this seems inelegant.
Bonus side question: How do you determine the type of a key (e.g., I would send "dataset" to jq in some form and get "array" in return)?
The simplest approach to writing queries that depend on both key names and values is to use one of the "*_entries" family of filters. In your case:
$ jq -c 'to_entries[] | [.key, (.value|type)]' data.json
["#context","string"]
["#id","string"]
["#type","string"]
["conformsTo","string"]
["describedBy","string"]
["dataset","array"]
If you wanted this presented in a more human-readable fashion, consider using @csv or @tsv, e.g.
$ jq -r 'to_entries[] | [.key, (.value|type)] | @csv' data.json
"#context","string"
"#id","string"
"#type","string"
"conformsTo","string"
"describedBy","string"
"dataset","array"
Or with less noise:
$ jq -r 'to_entries[] | "\(.key) \(.value|type)"' data.json
@context string
@id string
@type string
conformsTo string
describedBy string
dataset array
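Since the parenthetical goal was the key name of the first root-level array, the same to_entries approach can be combined with select and first (a sketch):
$ jq -r 'first(to_entries[] | select(.value | type == "array") | .key)' data.json
dataset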
Bonus question
Here's a parametric approach to the second question. Let the file query.jq contain:
.[$key]|type
Then:
$ jq -r --arg key dataset -f query.jq data.json
array
jq 'first(path(.[] | select(type == "array"))[0])' < data.json
The top-level values .[] are filtered with select(type == "array"), which keeps only values of array type; path() returns the path to each such value as an array of keys, and [0] picks the key name; first() extracts the first result.
So the result of the command is the key name of the first top-level array value.
Sample Output
"dataset"
How do you determine the type of a key (e.g., I would send "dataset" to jq in some form and get "array" in return)?
You probably mean "the type of a value", because the keys must be strings in JSON. If the path is known (e.g. .dataset), then the type of the object can be fetched with the type function:
jq '.dataset | type' < data.json
Sample Output
"array"

jq: selecting a subset of keys from an object

Given an input JSON object and a string containing an array of keys, return an object with only the entries whose keys are present both in the original object and in the key array.
I have a solution but I think that it isn't elegant ({($k):$input[$k]} feels especially clunky...) and that this is a chance for me to learn.
jq -n '{"1":"a","2":"b","3":"c"}' \
| jq --arg keys '["1","3","4"]' \
'. as $input
| ( $keys | fromjson )
| map( . as $k
| $input
| select(has($k))
| {($k):$input[$k]}
)
| add'
Any ideas how to clean this up?
I feel like Extracting selected properties from a nested JSON object with jq is a good starting place but i cannot get it to work.
A solution using an inside check:
jq 'with_entries(select([.key] | inside(["key1", "key2"])))'
The inside operator works most of the time; however, it has a pitfall: because inside/contains does substring matching on strings, it can select keys you did not intend. Suppose the input is { "key1": "val1", "key2": "val2", "key12": "val12" }; selecting with inside(["key12"]) will select both "key1" and "key12".
Use the in operator if you need an exact match. The following will select .key2 and .key12 only:
jq 'with_entries(select(.key | in({"key2":1, "key12":1})))'
Because the in operator checks whether a key exists in an object (or whether an index exists in an array), the lookup has to be written in object syntax, with the desired keys as keys; the values do not matter. The in operator is not a perfect fit for this purpose; it would be nice to see a jq builtin that works like the reverse of JavaScript's ES6 Array.prototype.includes:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/includes
jq 'with_entries(select(.key | included(["key2", "key12"])))'
which would check whether an item .key is included in an array.
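To make the difference between inside and in concrete, here is a small demonstration on the example above (the values are quoted to make the input valid JSON):
echo '{"key1":"val1","key2":"val2","key12":"val12"}' \
| jq -c 'with_entries(select([.key] | inside(["key12"])))'
{"key1":"val1","key12":"val12"}
echo '{"key1":"val1","key2":"val2","key12":"val12"}' \
| jq -c 'with_entries(select(.key | in({"key2":1, "key12":1})))'
{"key2":"val2","key12":"val12"}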
You can use this filter:
with_entries(
  select(
    .key as $k | any($keys | fromjson[]; . == $k)
  )
)
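For example, with the OP's sample data and the keys passed via --arg (a usage sketch):
jq --arg keys '["1","3","4"]' \
  'with_entries(select(.key as $k | any($keys | fromjson[]; . == $k)))' \
  <<< '{"1":"a","2":"b","3":"c"}'
{
  "1": "a",
  "3": "c"
}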
Here is some additional clarification
For the input object {"key1":1, "key2":2, "key3":3} I would like to drop all keys that are not in the set of desired keys ["key1","key3","key4"]
jq -n --argjson desired_keys '["key1","key3","key4"]' \
      --argjson input '{"key1":1, "key2":2, "key3":3}' \
  '$input
   | with_entries(
       select(
         .key == ($desired_keys[])
       )
     )'
with_entries converts {"key1":1, "key2":2, "key3":3} into the following array of key/value pairs, maps the select expression over that array, and then turns the resulting array back into an object.
Here is the intermediate array inside the with_entries call:
[
  {
    "key": "key1",
    "value": 1
  },
  {
    "key": "key2",
    "value": 2
  },
  {
    "key": "key3",
    "value": 3
  }
]
We can then select the entries from this array that meet our criteria.
This is where the magic happens... here is a look at what's going on in the middle of this command. The following command takes the expanded entries as a stream of objects and selects from them:
jq -cn '{"key":"key1","value":1}, {"key":"key2","value":2}, {"key":"key3","value":3}
| select(.key == ("key1", "key3", "key4"))'
This will yield the following result
{"key":"key1","value":1}
{"key":"key3","value":3}
The with_entries filter can be a little tricky, but it's easy to remember that it takes a filter and is defined as follows:
def with_entries(f): to_entries|map(f)|from_entries;
This is the same as
def with_entries(f): [to_entries[] | f] | from_entries;
The other part of the question that confuses people is the multiple matches on the right hand side of the ==
Consider the following command. The output is the Cartesian product of the left-hand stream and the right-hand stream:
jq -cn '1,2,3| . == (1,1,3)'
true
true
false
false
false
false
false
false
true
If that predicate is used inside select, the input is kept whenever the predicate yields true. Note that the output can contain duplicates, since each true result emits the input once:
jq -cn '1,2,3| select(. == (1,1,3))'
1
1
3
Jeff's answer has a couple of unnecessary inefficiencies, both of which are addressed by the following, on the assumption that --argjson keys is used instead of --arg keys:
with_entries( select( .key as $k | $keys | index($k) ) )
Even better, if your jq has IN:
with_entries(select(.key | IN($keys[])))
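For instance, passing the keys with --argjson (a usage sketch; if your jq lacks IN, it can be defined as def IN(s): any(s == .; .);):
jq --argjson keys '["1","3","4"]' \
  'with_entries(select(.key | IN($keys[])))' \
  <<< '{"1":"a","2":"b","3":"c"}'
{
  "1": "a",
  "3": "c"
}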
If you are sure that all keys in the input array are present in the original object, you can use the object construction shortcut.
$ echo '{"1":"a","2":"b","3":"c"}' | jq '{"1", "3"}'
{
  "1": "a",
  "3": "c"
}
Numbers should be quoted to force jq to interpret them as keys instead of literals. In the case of keys not resembling a number, quotes are not needed:
$ echo '{"key1":"a","key2":"b","key3":"c"}' | jq '{key1, key3}'
{
  "key1": "a",
  "key3": "c"
}
Adding a non-existent key will yield a null value, which is likely not what the OP wanted:
$ echo '{"1":"a","2":"b","3":"c"}' | jq '{"1", "3", "4"}'
{
  "1": "a",
  "3": "c",
  "4": null
}
but those can be filtered out:
$ echo '{"1":"a","2":"b","3":"c"}' | jq '{"1", "3", "4"} | with_entries(select(.value != null))'
{
  "1": "a",
  "3": "c"
}
Although this approach does not take a JSON array of keys as input, as the OP asked, I find it useful for quickly filtering some keys you know are present.
An example use case: get aud and iss from a JWT payload. The following is very succinct:
echo "jwt-as-json" | jq '{aud, iss}'