I have stream output stored in a CSV file, and I need help converting the CSV to JSON.
My CSV looks like this:
cat output.csv
"k","a1",1,"b1","c1","d1",1
"l","a2",2,"b2","c2","d2",2
"m","a3",3,"b3","c3","d3",3
"n","a4",4,"b4","c4","d4",4
"o","a5",5,"b5","c5","d5",5
Required output:
Note: I need the key configuration to be added to the JSON.
{
"configuration": {
"k": {
"a": "a1",
"number1": "1",
"c": "b1",
"d": "c1",
"e": "d1",
"number2": "1"
},
"l": {
"a": "a2",
"number1": "2",
"c": "b2",
"d": "c2",
"e": "d2",
"number2": "2"
},
.
.
.
}
}
So far I have tried with jq. My filter is:
cat api.jq
[
inputs |
split(",") |
map(ltrimstr("\"")) |
map(rtrimstr("\"")) |
{
a: .[1],
number1: .[2],
c: .[3],
d: .[4],
e: .[5],
number2: .[6]
}
] | {configuration: .}
Output:
jq -nRf api.jq output.csv
{
"cluster_configuration": [
{
"a": "a1",
"number1": "1",
"c": "b1",
"d": "c1",
"e": "d1",
"number2": "1"
},
{
"a": "a2",
"number1": "2",
"c": "b2",
"d": "c2",
"e": "d2",
"number2": "2"
},
{
"a": "a3",
"number1": "3",
"c": "b3",
"d": "c3",
"e": "d3",
"number2": "3"
},
{
"a": "a4",
"number1": "4",
"c": "b4",
"d": "c4",
"e": "d4",
"number2": "4"
},
{
"a": "a5",
"number1": "5",
"c": "b5",
"d": "c5",
"e": "d5",
"number2": "5"
}
]
}
Here's a possible solution with Miller, an interesting tool (available for several OSs) that supports multiple input/output formats:
mlr --icsv -N put -q '
@map[$1] = {"a": $2, "number1": $3, "c": $4, "d": $5, "e": $6, "number2": $7};
end { dump { "configuration": @map } }
' file.csv
{
"configuration": {
"k": {
"a": "a1",
"number1": 1,
"c": "b1",
"d": "c1",
"e": "d1",
"number2": 1
},
"l": {
...
Note: to force the numbers to be treated as strings, you can use the --infer-none option.
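For example, the same command with type inference disabled (a sketch; every value, including 1, then stays the string "1"):
mlr --icsv -N --infer-none put -q '
@map[$1] = {"a": $2, "number1": $3, "c": $4, "d": $5, "e": $6, "number2": $7};
end { dump { "configuration": @map } }
' file.csv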
Insofar as your goal is to use the a field as the key, from_entries is suitable for that:
[
inputs |
split(",") |
map(ltrimstr("\"")) |
map(rtrimstr("\"")) |
{
"key": .[1],
"value": {
number1: .[2],
c: .[3],
d: .[4],
e: .[5],
number2: .[6]
}
}
] |
from_entries |
{ configuration: . }
When run with
jq -nR -f api.jq <output.csv
...the output is as follows (note the -n option: without it, jq consumes the first line as its initial input, so inputs would skip the first row):
{
"configuration": {
"a1": {
"number1": "1",
"c": "b1",
"d": "c1",
"e": "d1",
"number2": "1"
},
"a2": {
"number1": "2",
"c": "b2",
"d": "c2",
"e": "d2",
"number2": "2"
},
"a3": {
"number1": "3",
"c": "b3",
"d": "c3",
"e": "d3",
"number2": "3"
},
"a4": {
"number1": "4",
"c": "b4",
"d": "c4",
"e": "d4",
"number2": "4"
},
"a5": {
"number1": "5",
"c": "b5",
"d": "c5",
"e": "d5",
"number2": "5"
}
}
}
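This keys the objects by the a column. If, as in the required output, you want the first column (k, l, m, ...) as the key instead, only the key expression changes; a sketch:
[
inputs |
split(",") |
map(ltrimstr("\"")) |
map(rtrimstr("\"")) |
{
"key": .[0],
"value": {
a: .[1],
number1: .[2],
c: .[3],
d: .[4],
e: .[5],
number2: .[6]
}
}
] |
from_entries |
{ configuration: . }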
If robustness of CSV parsing is a concern, you could easily adapt
the parser at rosettacode.org. The following converts the CSV rows to JSON arrays; since the "main" program below uses inputs, you'd use the -R and -n command-line options.
## The PEG * operator:
def star(E): (E | star(E)) // . ;
## Helper functions:
# Consume a regular expression rooted at the start of .remainder, or emit empty;
# on success, update .remainder and set .match but do NOT update .result
def consume($re):
# on failure, match yields empty
(.remainder | match("^" + $re)) as $match
| .remainder |= .[$match.length :]
| .match = $match.string;
def parse($re):
consume($re)
| .result = .result + [.match] ;
def ws: consume(" *");
### Parse a string into comma-separated values
def quoted_field_content:
parse("((\"\")|([^\"]))*")
| .result[-1] |= gsub("\"\""; "\"");
def unquoted_field: parse("[^,\"]*");
def quoted_field: consume("\"") | quoted_field_content | consume("\"");
def field: (ws | quoted_field | ws) // unquoted_field;
def record: field | star(consume(",") | field);
def csv2array:
{remainder: .} | record | .result;
inputs | csv2array
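Invoked against the sample file (assuming the definitions above are saved as csv.jq; -c just compacts the output), each row becomes a JSON array of strings:
jq -cnR -f csv.jq output.csv
["k","a1","1","b1","c1","d1","1"]
["l","a2","2","b2","c2","d2","2"]
["m","a3","3","b3","c3","d3","3"]
["n","a4","4","b4","c4","d4","4"]
["o","a5","5","b5","c5","d5","5"]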
I know you raised this as a bash+jq question, but if it were a bash+Python question, the solution would be trivial:
# csv2json.py
import sys, csv, json

data = {"configuration": {}}
# Each row: key, a, number1, c, d, e, number2
for [k, a, n1, c, d, e, n2] in csv.reader(sys.stdin):
    data["configuration"][k] = {"a": a, "number1": n1, "c": c, "d": d, "e": e, "number2": n2}
print(json.dumps(data, indent=2))
Then, in bash (I'm assuming Ubuntu here), we can run:
python3 csv2json.py < output.csv
jq's from_entries can be used to generate objects with chosen keys.
In the example below, we first use Miller to convert the CSV to JSON robustly (handling values that contain commas or quotes) before proceeding with jq.
Add a header line at the top so Miller knows what key name to associate with each value:
cat <(echo k,a,number1,c,d,e,number2) output.csv > output_with_header.csv
Convert the CSV to JSON with Miller:
mlr --icsv --ojson cat output_with_header.csv > output.json
Transform with jq, generating a list of {key, value} objects and then combining them with from_entries:
jq '{configuration: ([.[]|{key: .k,value: (.|del(.k))}]|from_entries)}' output.json
This results in:
{
"configuration": {
"k": {
"a": "a1",
"number1": 1,
"c": "b1",
"d": "c1",
"e": "d1",
"number2": 1
},
"l": {
"a": "a2",
"number1": 2,
"c": "b2",
"d": "c2",
"e": "d2",
"number2": 2
},
"m": {
"a": "a3",
"number1": 3,
"c": "b3",
"d": "c3",
"e": "d3",
"number2": 3
},
"n": {
"a": "a4",
"number1": 4,
"c": "b4",
"d": "c4",
"e": "d4",
"number2": 4
},
"o": {
"a": "a5",
"number1": 5,
"c": "b5",
"d": "c5",
"e": "d5",
"number2": 5
}
}
}
All together as a one-liner:
cat <(echo k,a,number1,c,d,e,number2) output.csv | mlr --icsv --ojson cat | jq '{configuration: ([.[]|{key: .k,value: (.|del(.k))}]|from_entries)}'
I'm currently working through an issue and can't seem to figure it out. Here's some data so you know what I'm talking about:
foo.json
{
"Schedule": [
{
"deviceId": 123,
"reservationId": 123456,
"username": "jdoe"
},
{
"deviceId": 456,
"reservationId": 589114,
"username": "jsmith"
}
],
"serverTime": 1522863125.019958
}
bar.json
[
{
"a": {
"b": "10.0.0.1",
"c": "hostname1"
},
"deviceId": 123
},
{
"a": {
"b": "10.0.0.2",
"c": "hostname2"
},
"deviceId": 456
}
]
foobar.json
{
"Schedule": [
{
"deviceId": 123,
"reservationId": 123456,
"username": "jdoe",
"a": {
"b": "10.0.0.1",
"c": "hostname1"
}
},
{
"deviceId": 456,
"reservationId": 789101,
"username": "jsmith",
"a": {
"b": "10.0.0.2",
"c": "hostname2"
}
}
],
"serverTime": 1522863125.019958
}
I'm trying to use jq to do this, and had some help from this post: https://github.com/stedolan/jq/issues/1090
The goal is to combine JSON documents, using some key as a common point between them. The data may be nested to any number of levels. In this case foo.json has data nested only two levels deep, but it needs to be combined with data nested one level deep.
Any and all suggestions would be super helpful. I'm also happy to clarify and answer questions if needed. Thank you!
With foobar.jq as follows:
def dict(f):
reduce .[] as $o ({}; .[$o | f | tostring] = $o ) ;
($bar | dict(.deviceId)) as $dict
| .Schedule |= map(. + ($dict[.deviceId|tostring] ))
the invocation:
jq -f foobar.jq --argfile bar bar.json foo.json
yields the output shown below. (Note that --argfile is deprecated in recent jq releases; --slurpfile bar bar.json, referencing $bar[0] instead of $bar, is the modern equivalent.)
Notice that the referents in the dictionary contain the full object (including the "deviceId" key/value pair), but it's not necessary to del(.deviceId) because of the way + is defined in jq; see the sketch after the output below.
Output
{
"Schedule": [
{
"deviceId": 123,
"reservationId": 123456,
"username": "jdoe",
"a": {
"b": "10.0.0.1",
"c": "hostname1"
}
},
{
"deviceId": 456,
"reservationId": 589114,
"username": "jsmith",
"a": {
"b": "10.0.0.2",
"c": "hostname2"
}
}
],
"serverTime": 1522863125.019958
}
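To illustrate jq's right-biased object addition: keys from the right-hand operand overwrite those on the left, so the shared deviceId appears just once and del(.deviceId) is unnecessary. A minimal sketch:
jq -n '{"deviceId": 123, "username": "jdoe"} + {"deviceId": 123, "a": {"b": "10.0.0.1"}}'
{
"deviceId": 123,
"username": "jdoe",
"a": {
"b": "10.0.0.1"
}
}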
I have JSON data containing float values that I need to process conditionally over an array. Here is an example of one JSON instance:
[
{
"a": "0",
"b": "66.67",
"c": "0",
"d": "0"
},
{
"a": "12.33",
"b": "0",
"c": "60.2",
"d": "19.3"
},
{
"a": "70.0",
"b": "92.67",
"c": "0",
"d": "0"
}
]
and I wish to select conditionally, like this:
cat mydata.json | jq '.[] | select((.a > 50) and (.b > 50))'
and the output should look like:
{
"a": "70.0",
"b": "92.67",
"c": "0",
"d": "0"
}
The problem is that my original data holds the numbers as string values, and I have no idea how to parse them for a conditional selection.
Your comparison selects everything because jq orders values by type when comparing, and strings always sort after numbers, so .a > 50 is true for every string. Convert the fields with jq's tonumber function first:
jq '.[] | select((.a|tonumber) > 50 and (.b|tonumber) > 50)' mydata.json
The output:
{
"a": "70.0",
"b": "92.67",
"c": "0",
"d": "0"
}
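Note that tonumber raises an error on strings that aren't valid numbers. If the data might contain such values, a defensive variant could use jq's error-suppressing ? operator (a sketch; non-numeric fields are treated as 0 here):
jq '.[] | select(((.a | tonumber?) // 0) > 50 and ((.b | tonumber?) // 0) > 50)' mydata.json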