I want to create a simpler JSON with the same structure as the original, but containing only a small sample.
For example, if I have this JSON:
{
  "field1": [
    {
      "a": "F1A1",
      "b": "F1B1"
    },
    {
      "a": "F1A2",
      "b": "F1B2"
    },
    {
      "a": "F1A3",
      "b": "F1B3"
    },
    {
      "a": "F1A4",
      "b": "F1B4"
    }
  ],
  "field2": [
    {
      "a": "F2A1",
      "b": "F2B1"
    },
    {
      "a": "F2A2",
      "b": "F2B2"
    }
  ],
  "field3": [
    {
      "a": "F3A1",
      "b": "F3B1"
    },
    {
      "a": "F3A2",
      "b": "F3B2"
    }
  ]
}
I want to get the first array element from the first field. So I was expecting this:
{
  "field1": [
    {
      "a": "F1A1",
      "b": "F1B1"
    }
  ]
}
I executed jq "select(.field1[0])" tmp.json, but it returns the original JSON.
Bonus:
How would I do the same, but extracting, say, field1 with only the array elements where a == "F1A1" or a == "F1A4"? I would expect:
{
  "field1": [
    {
      "a": "F1A1",
      "b": "F1B1"
    },
    {
      "a": "F1A4",
      "b": "F1B4"
    }
  ]
}
Reduce the outer object to the field you want using {field1}, then map that field to an array containing only its first item:
jq '{field1} | map_values([first])'
{
  "field1": [
    {
      "a": "F1A1",
      "b": "F1B1"
    }
  ]
}
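Since only a single field and a single element are involved here, an equivalent direct construction would also work (just an alternative sketch):
jq '{field1: [.field1[0]]}'
The map_values variant has the advantage of extending naturally to several fields, e.g. starting from {field1, field2}.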
To filter for certain items, use select:
jq '{field1} | map_values(map(select(.a == "F1A1" or .a == "F1A4")))'
{
  "field1": [
    {
      "a": "F1A1",
      "b": "F1B1"
    },
    {
      "a": "F1A4",
      "b": "F1B4"
    }
  ]
}
As you can see, select does something different: it passes on its input unchanged if its argument evaluates to a truthy value (anything other than false or null). Therefore its output is either all or nothing, never just a filtered part. (Of course, you can use select to achieve specific filtering, as shown above.)
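This is also why your original command returned the whole document: .field1[0] evaluates to the first object, which is truthy, so select passes the entire input through untouched:
jq 'select(.field1[0])' tmp.json
# emits the original JSON, unmodified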
I have 2 files (which are quite long):
file1.json (540 objects - I'll write just 2 mockups for ease of use)
[
  {
    "a": "apple",
    "b": "banana",
    "c": ["car1", "car2", "car3"],
    "d": ["doodle1", "doodle2", "doodle3"],
    "e": "elephant"
  },
  {
    "a": "aqua",
    "b": "bay",
    "c": ["carrot", "chile", "cucumber"],
    "d": ["dice", "drop", "dang"],
    "e": "elastic"
  }
]
file2.json (540 objects - I'll write just 2 mockups for ease of use)
[
  {
    "l": ["link1", "link2", "link3"]
  },
  {
    "l": ["link4", "link5", "link6"]
  }
]
Expected result:
[
  {
    "a": "apple",
    "b": "banana",
    "c": ["car1", "car2", "car3"],
    "d": ["doodle1", "doodle2", "doodle3"],
    "e": "elephant",
    "l": ["link1", "link2", "link3"]
  },
  {
    "a": "aqua",
    "b": "bay",
    "c": ["carrot", "chile", "cucumber"],
    "d": ["dice", "drop", "dang"],
    "e": "elastic",
    "l": ["link4", "link5", "link6"]
  }
]
Is it possible to achieve this with jq, or should I process it with another programming language like Python or JavaScript?
jq is a perfect choice for all kinds of JSON processing. In this case, you could read the contents of both files into one array, transpose it, then merge the aligned items using map with add:
jq -n '[inputs] | transpose | map(add)' file1.json file2.json
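To see the intermediate steps: with -n, [inputs] collects the two top-level arrays into an array of two arrays; transpose pairs their elements up by index; and add merges each pair of objects (if both sides shared a key, the right-hand value would win, though here the key sets are disjoint). For the mockups above, the transposed value looks roughly like this (fields elided with ... for brevity):
[
  [ { "a": "apple", ... }, { "l": ["link1", "link2", "link3"] } ],
  [ { "a": "aqua", ... },  { "l": ["link4", "link5", "link6"] } ]
]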
I have stream output stored in a CSV file and need help converting the CSV to JSON.
My CSV looks like this:
cat output.csv
"k","a1",1,"b1","c1","d1",1
"l","a2",2,"b2","c2","d2",2
"m","a3",3,"b3","c3","d3",3
"n","a4",4,"b4","c4","d4",4
"o","a5",5,"b5","c5","d5",5
Required output:
Note: I need the key configuration to be added to the JSON.
{
  "configuration": {
    "k": {
      "a": "a1",
      "number1": "1",
      "c": "b1",
      "d": "c1",
      "e": "d1",
      "number2": "1"
    },
    "l": {
      "a": "a2",
      "number1": "2",
      "c": "b2",
      "d": "c2",
      "e": "d2",
      "number2": "2"
    },
    .
    .
    .
  }
}
So far I have tried this with jq. My filter is:
cat api.jq
[
  inputs |
  split(",") |
  map(ltrimstr("\"")) |
  map(rtrimstr("\"")) |
  {
    a: .[1],
    number1: .[2],
    c: .[3],
    d: .[4],
    e: .[5],
    number2: .[6]
  }
] | {configuration: .}
Output:
jq -nRf api.jq output.csv
{
  "configuration": [
    {
      "a": "a1",
      "number1": "1",
      "c": "b1",
      "d": "c1",
      "e": "d1",
      "number2": "1"
    },
    {
      "a": "a2",
      "number1": "2",
      "c": "b2",
      "d": "c2",
      "e": "d2",
      "number2": "2"
    },
    {
      "a": "a3",
      "number1": "3",
      "c": "b3",
      "d": "c3",
      "e": "d3",
      "number2": "3"
    },
    {
      "a": "a4",
      "number1": "4",
      "c": "b4",
      "d": "c4",
      "e": "d4",
      "number2": "4"
    },
    {
      "a": "a5",
      "number1": "5",
      "c": "b5",
      "d": "c5",
      "e": "d5",
      "number2": "5"
    }
  ]
}
Here's a possible solution with Miller (available for several OSs), an interesting tool that supports multiple input/output formats:
mlr --icsv -N put -q '
  @map[$1] = {"a": $2, "number1": $3, "c": $4, "d": $5, "e": $6, "number2": $7};
  end { dump { "configuration": @map } }
' output.csv
{
  "configuration": {
    "k": {
      "a": "a1",
      "number1": 1,
      "c": "b1",
      "d": "c1",
      "e": "d1",
      "number2": 1
    },
    "l": {
      ...
Note: to force the numbers to be treated as strings, you can use the --infer-none option.
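Applying that to the command above would look like this (same DSL; the only change is the extra flag, which disables type inference so the numeric columns stay strings like "1"):
mlr --icsv --infer-none -N put -q '
  @map[$1] = {"a": $2, "number1": $3, "c": $4, "d": $5, "e": $6, "number2": $7};
  end { dump { "configuration": @map } }
' output.csv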
Insofar as your goal is to make the a field the key, from_entries is suitable for that:
[
  inputs |
  split(",") |
  map(ltrimstr("\"")) |
  map(rtrimstr("\"")) |
  {
    "key": .[1],
    "value": {
      number1: .[2],
      c: .[3],
      d: .[4],
      e: .[5],
      number2: .[6]
    }
  }
] |
from_entries |
{ configuration: . }
When run with
jq -nR -f api.jq <output.csv
(note the -n: since the program uses inputs, omitting -n would consume the first CSV line as the initial input, and that row would be missing from the result)
...the output is:
{
  "configuration": {
    "a1": {
      "number1": "1",
      "c": "b1",
      "d": "c1",
      "e": "d1",
      "number2": "1"
    },
    "a2": {
      "number1": "2",
      "c": "b2",
      "d": "c2",
      "e": "d2",
      "number2": "2"
    },
    "a3": {
      "number1": "3",
      "c": "b3",
      "d": "c3",
      "e": "d3",
      "number2": "3"
    },
    "a4": {
      "number1": "4",
      "c": "b4",
      "d": "c4",
      "e": "d4",
      "number2": "4"
    },
    "a5": {
      "number1": "5",
      "c": "b5",
      "d": "c5",
      "e": "d5",
      "number2": "5"
    }
  }
}
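As a minimal illustration of the mechanism: from_entries consumes an array of {key, value} objects (variants such as name/k/K for the key and v/Value for the value are also accepted) and assembles them into a single object:
jq -n '[{key: "x", value: 1}, {key: "y", value: 2}] | from_entries'
{
  "x": 1,
  "y": 2
}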
If robustness of CSV parsing is a concern, you could easily adapt
the parser at rosettacode.org. The following converts the CSV rows to JSON arrays; since the "main" program below uses inputs, you'd use the -R and -n command-line options.
## The PEG * operator:
def star(E): (E | star(E)) // . ;

## Helper functions:

# Consume a regular expression rooted at the start of .remainder, or emit empty;
# on success, update .remainder and set .match but do NOT update .result
def consume($re):
  # on failure, match yields empty
  (.remainder | match("^" + $re)) as $match
  | .remainder |= .[$match.length :]
  | .match = $match.string;

def parse($re):
  consume($re)
  | .result = .result + [.match];

def ws: consume(" *");

### Parse a string into comma-separated values

def quoted_field_content:
  parse("((\"\")|([^\"]))*")
  | .result[-1] |= gsub("\"\""; "\"");

def unquoted_field: parse("[^,\"]*");

def quoted_field: consume("\"") | quoted_field_content | consume("\"");

def field: (ws | quoted_field | ws) // unquoted_field;

def record: field | star(consume(",") | field);

def csv2array:
  {remainder: .} | record | .result;

inputs | csv2array
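To apply this to the question at hand, you could replace the final inputs | csv2array line with something along these lines (a sketch, assuming the file is saved as csv.jq and run with jq -nR -f csv.jq output.csv); since csv2array yields each row as an array of strings, the indices mirror the split-based filter above, with .[0] supplying the k/l/m keys:
[
  inputs | csv2array |
  {
    key: .[0],
    value: {a: .[1], number1: .[2], c: .[3], d: .[4], e: .[5], number2: .[6]}
  }
] |
from_entries |
{ configuration: . }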
I know you raised this as a bash+jq question, but if it were a bash+Python question, the solution would be trivial:
# csv2json.py
import sys, csv, json

data = {"configuration": {}}
for [k, a, n1, c, d, e, n2] in csv.reader(sys.stdin.readlines()):
    data["configuration"][k] = {"a": a, "number1": n1, "c": c, "d": d, "e": e, "number2": n2}
print(json.dumps(data, indent=2))
Then, in bash (I'm assuming Ubuntu here), we could go:
python3 csv2json.py < output.csv
jq's from_entries can be used to generate objects with chosen keys.
In the example below, we first use Miller to convert the CSV to JSON more robustly (in a manner that supports values containing commas or quotes) before proceeding with jq.
Add a header line at the top so Miller knows what key name to associate with each value:
cat <(echo k,a,number1,c,d,e,number2) output.csv > output_with_header.csv
Convert the CSV to JSON with Miller:
mlr --icsv --ojson cat output_with_header.csv > output.json
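For reference, the intermediate output.json should now look roughly like this (Miller infers the bare numbers as integers, which is why they end up unquoted; records after the first are elided here):
[
  {
    "k": "k",
    "a": "a1",
    "number1": 1,
    "c": "b1",
    "d": "c1",
    "e": "d1",
    "number2": 1
  },
  ...
]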
Transform with jq, generating a list of objects with key and value members and then combining them with from_entries:
jq '{configuration: ([.[] | {key: .k, value: (. | del(.k))}] | from_entries)}' output.json
This results in:
{
  "configuration": {
    "k": {
      "a": "a1",
      "number1": 1,
      "c": "b1",
      "d": "c1",
      "e": "d1",
      "number2": 1
    },
    "l": {
      "a": "a2",
      "number1": 2,
      "c": "b2",
      "d": "c2",
      "e": "d2",
      "number2": 2
    },
    "m": {
      "a": "a3",
      "number1": 3,
      "c": "b3",
      "d": "c3",
      "e": "d3",
      "number2": 3
    },
    "n": {
      "a": "a4",
      "number1": 4,
      "c": "b4",
      "d": "c4",
      "e": "d4",
      "number2": 4
    },
    "o": {
      "a": "a5",
      "number1": 5,
      "c": "b5",
      "d": "c5",
      "e": "d5",
      "number2": 5
    }
  }
}
All together as a one-liner:
cat <(echo k,a,number1,c,d,e,number2) output.csv | mlr --icsv --ojson cat | jq '{configuration: ([.[] | {key: .k, value: (. | del(.k))}] | from_entries)}'