How do I pretty print JSON with multiple levels of minimization? - json

We have standard pretty printed JSON:
{
"results": {
"groups": {
"alpha": {
"items": {
"apple": {
"attributes": {
"class": "fruit"
}
},
"pear": {
"attributes": {
"class": "fruit"
}
},
"dog": {
"attributes": {
"class": null
}
}
}
},
"beta": {
"items": {
"banana": {
"attributes": {
"class": "fruit"
}
}
}
}
}
}
}
And we have JMin:
{"results":{"groups":{"alpha":{"items":{"apple":{"attributes":{"class":"fruit"}},"pear":{"attributes":{"class":"fruit"}},"dog":{"attributes":{"class":null}}}},"beta":{"items":{"banana":{"attributes":{"class":"fruit"}}}}}}}
But I want to be able to print JSON like this on the fly:
{
"results" : {
"groups" : {
"alpha" : {
"items" : {
"apple":{"attributes":{"class":"fruit"}},
"pear":{"attributes":{"class":"fruit"}},
"dog":{"attributes":{"class":null}}
}
},
"beta" : {
"items" : {
"banana":{"attributes":{"class":"fruit"}}
}
}
}
}
}
The above I would describe as "pretty-print JSON, minimized at level 5". Are there any tools that do that?

I wrote my own JSON formatter, based on this script:
#!/usr/bin/env python3
# Ported to Python 3: str instead of basestring, dict.items(), json.dumps for escaping.
VERSION = "1.0.1"
import sys
import json
from optparse import OptionParser
def to_json(o, level=0):
    # Above the fold level, emit newlines and indentation; at or below it, emit compact JSON.
    if level < FOLD_LEVEL:
        newline = "\n"
        space = " "
    else:
        newline = ""
        space = ""
    ret = ""
    if isinstance(o, str):
        # json.dumps quotes the string and escapes quotes, backslashes and non-ASCII
        ret += json.dumps(o)
    elif isinstance(o, bool):
        # bool must be tested before int, because bool is a subclass of int
        ret += "true" if o else "false"
    elif isinstance(o, float):
        ret += '%.7g' % o
    elif isinstance(o, int):
        ret += str(o)
    elif isinstance(o, list):
        #ret += "[" + ",".join([to_json(e, level+1) for e in o]) + "]"
        ret += "[" + newline
        comma = ""
        for e in o:
            ret += comma
            comma = "," + newline
            ret += space * INDENT * (level+1)
            ret += to_json(e, level+1)
        ret += newline + space * INDENT * level + "]"
    elif isinstance(o, dict):
        ret += "{" + newline
        comma = ""
        for k, v in o.items():
            ret += comma
            comma = "," + newline
            ret += space * INDENT * (level+1)
            #ret += '"' + str(k) + '"' + space + ':' + space
            ret += json.dumps(k) + ':' + space
            ret += to_json(v, level+1)
        ret += newline + space * INDENT * level + "}"
    elif o is None:
        ret += "null"
    else:
        #raise TypeError("Unknown type '%s' for json serialization" % str(type(o)))
        ret += str(o)
    return ret
#main():
FOLD_LEVEL = 10000
INDENT = 4
parser = OptionParser(usage='%prog json_file [options]', version=VERSION)
parser.add_option("-f", "--fold-level", action="store", type="int",
                  dest="fold_level", help="int (all json is minimized to this level)")
parser.add_option("-i", "--indent", action="store", type="int",
                  dest="indent", help="int (spaces of indentation, default is 4)")
parser.add_option("-o", "--outfile", action="store", type="string",
                  dest="outfile", metavar="filename", help="write output to a file")
(options, args) = parser.parse_args()
if len(args) == 0:
    infile = sys.stdin
elif len(args) == 1:
    infile = open(args[0], 'r')
else:
    raise SystemExit(sys.argv[0] + " json_file [options]")
if options.outfile is None:
    outfile = sys.stdout
else:
    outfile = open(options.outfile, 'w')
if options.fold_level is not None:
    FOLD_LEVEL = options.fold_level
if options.indent is not None:
    INDENT = options.indent
with infile:
    try:
        obj = json.load(infile)
    except ValueError as e:
        raise SystemExit(e)
with outfile:
    outfile.write(to_json(obj))
    outfile.write('\n')
The script accepts fold level, indent and output file from the command line:
$ jsonfold.py -h
Usage: jsonfold.py json_file [options]
Options:
--version show program's version number and exit
-h, --help show this help message and exit
-f FOLD_LEVEL, --fold-level=FOLD_LEVEL
int (all json is minimized to this level)
-i INDENT, --indent=INDENT
int (spaces of indentation, default is 4)
-o filename, --outfile=filename
write output to a file
To get my example from above, fold at the 5th level:
$ jsonfold.py test2 -f 5
{
"results": {
"groups": {
"alpha": {
"items": {
"pear": {"attributes":{"class":"fruit"}},
"apple": {"attributes":{"class":"fruit"}},
"dog": {"attributes":{"class":null}}
}
},
"beta": {
"items": {
"banana": {"attributes":{"class":"fruit"}}
}
}
}
}
}
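For readers who only need the folding behaviour, the same idea can be sketched on top of the standard json module. This is a minimal sketch, not a port of the script above: dumps_folded and its parameters are names chosen here for illustration, and it assumes the data came from json.load, so every dict key is already a string.

```python
import json

def dumps_folded(obj, fold_level, indent=4, level=0):
    """Pretty-print obj, but emit everything at depth >= fold_level compactly."""
    # Scalars, empty containers and anything past the fold go out in compact form.
    if level >= fold_level or not isinstance(obj, (dict, list)) or not obj:
        return json.dumps(obj, separators=(",", ":"))
    pad = " " * indent * (level + 1)
    if isinstance(obj, dict):
        # Assumption: keys are strings, as they always are after json.load.
        items = [
            pad + json.dumps(k) + ": " + dumps_folded(v, fold_level, indent, level + 1)
            for k, v in obj.items()
        ]
        opener, closer = "{", "}"
    else:
        items = [pad + dumps_folded(v, fold_level, indent, level + 1) for v in obj]
        opener, closer = "[", "]"
    return opener + "\n" + ",\n".join(items) + "\n" + " " * indent * level + closer

print(dumps_folded({"a": {"b": {"c": 1}}, "l": [1, 2]}, fold_level=1))
# {
#     "a": {"b":{"c":1}},
#     "l": [1,2]
# }
```

Delegating the compact part to json.dumps means string escaping and null/true/false handling come from the standard library rather than from hand-written branches.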

Related

How to print JSON objects in AWK

I was looking for some built-in functions inside awk to easily generate JSON objects. I came across several answers and decided to create my own.
I'd like to generate JSON from multidimensional arrays in which I store table-style data, and to use a separate, dynamic definition of the JSON schema to be generated from that data.
Desired output:
{
"Name": JanA
"Surname": NowakA
"ID": 1234A
"Role": PrezesA
}
{
"Name": JanD
"Surname": NowakD
"ID": 12341D
"Role": PrezesD
}
{
"Name": JanC
"Surname": NowakC
"ID": 12342C
"Role": PrezesC
}
Input file:
pierwsza linia
druga linia
trzecia linia
dane wspólników
imie JanA
nazwisko NowakA
pesel 11111111111A
funkcja PrezesA
imie Ja"nD
nazwisko NowakD
pesel 11111111111
funkcja PrezesD
imie JanC
nazwisko NowakC
pesel 12342C
funkcja PrezesC
czwarta linia
reprezentanci
imie Tomek
Based on the input file, I created a multidimensional array:
JanA NowakA 1234A PrezesA
JanD NowakD 12341D PrezesD
JanC NowakC 12342C PrezesC
I'll take a stab at a gawk solution. The indenting isn't perfect and the results aren't ordered (see "Sorting" note below), but it's at least able to walk a true multidimensional array recursively and should produce valid, parsable JSON from any array. Bonus: the data array is the schema. Array keys become JSON keys. There's no need to create a separate schema array in addition to the data array.
Just be sure to use the true multidimensional array[d1][d2][d3]... convention of constructing your data array, rather than the concatenated index array[d1,d2,d3...] convention.
Update:
I've got an updated JSON gawk script posted as a GitHub Gist. Although the script below is tested as working with OP's data, I might've made improvements since this post was last edited. Please see the Gist for the most thoroughly tested, bug-squashed version.
#!/usr/bin/gawk -f
BEGIN { IGNORECASE = 1 }
$1 ~ "imie" { record[++idx]["name"] = $2 }
$1 ~ "nazwisko" { record[idx]["surname"] = $2 }
$1 ~ "pesel" { record[idx]["ID"] = $2 }
$1 ~ "funkcja" { record[idx]["role"] = $2 }
END { print serialize(record, "\t") }
# ==== FUNCTIONS ====
function join(arr, sep, _p, i) {
# syntax: join(array, string separator)
# returns a string
for (i in arr) {
_p["result"] = _p["result"] ~ "[[:print:]]" ? _p["result"] sep arr[i] : arr[i]
}
return _p["result"]
}
function quote(str) {
gsub(/\\/, "\\\\", str)
gsub(/\r/, "\\r", str)
gsub(/\n/, "\\n", str)
gsub(/\t/, "\\t", str)
return "\"" str "\""
}
function serialize(arr, indent_with, depth, _p, i, idx) {
# syntax: serialize(array of arrays, indent string)
# returns a JSON formatted string
# sort arrays on key, ensures [...] values remain properly ordered
if (!PROCINFO["sorted_in"]) PROCINFO["sorted_in"] = "#ind_num_asc"
# determine whether array is indexed or associative
for (i in arr) {
_p["assoc"] = or(_p["assoc"], !(++_p["idx"] in arr))
}
# if associative, indent
if (_p["assoc"]) {
for (i = ++depth; i--;) {
_p["end"] = _p["indent"]; _p["indent"] = _p["indent"] indent_with
}
}
for (i in arr) {
# If key length is 0, assume its an empty object
if (!length(i)) return "{}"
# quote key if not already quoted
_p["key"] = i !~ /^".*"$/ ? quote(i) : i
if (isarray(arr[i])) {
if (_p["assoc"]) {
_p["json"][++idx] = _p["indent"] _p["key"] ": " \
serialize(arr[i], indent_with, depth)
} else {
# if indexed array, dont print keys
_p["json"][++idx] = serialize(arr[i], indent_with, depth)
}
} else {
# quote if not numeric, boolean, null, already quoted, or too big for match()
if (!((arr[i] ~ /^[0-9]+([\.e][0-9]+)?$/ && arr[i] !~ /^0[0-9]/) ||
arr[i] ~ /^true|false|null|".*"$/) || length(arr[i]) > 1000)
arr[i] = quote(arr[i])
_p["json"][++idx] = _p["assoc"] ? _p["indent"] _p["key"] ": " arr[i] : arr[i]
}
}
# I trial and errored the hell out of this. Problem is, gawk cant distinguish between
# a value of null and no value. I think this hack is as close as I can get, although
# [""] will become [].
if (!_p["assoc"] && join(_p["json"]) == "\"\"") return "[]"
# surround with curly braces if object, square brackets if array
return _p["assoc"] ? "{\n" join(_p["json"], ",\n") "\n" _p["end"] "}" \
: "[" join(_p["json"], ", ") "]"
}
Output resulting from OP's example data:
[{
"ID": "1234A",
"name": "JanA",
"role": "PrezesA",
"surname": "NowakA"
}, {
"ID": "12341D",
"name": "JanD",
"role": "PrezesD",
"surname": "NowakD"
}, {
"ID": "12342C",
"name": "JanC",
"role": "PrezesC",
"surname": "NowakC"
}, {
"name": "Tomek"
}]
Sorting
Although the results by default are ordered in a manner only gawk understands, it is possible for gawk to sort the results on a field. If you'd like to sort on the ID field for example, add this function:
function cmp_ID(i1, v1, i2, v2) {
if (!isarray(v1) && v1 ~ /"ID"/ ) {
return v1 < v2 ? -1 : (v1 != v2)
}
}
Then insert this line within your END section above print serialize(record):
PROCINFO["sorted_in"] = "cmp_ID"
See Controlling Array Traversal for more information.
My updated awk implementation of a simple array printer with regex-based validation for each column (run with gawk):
function ltrim(s) { sub(/^[ \t]+/, "", s); return s }
function rtrim(s) { sub(/[ \t]+$/, "", s); return s }
function sTrim(s){
return rtrim(ltrim(s));
}
function jsonEscape(jsValue) {
gsub(/\\/, "\\\\", jsValue)
gsub(/"/, "\\\"", jsValue)
gsub(/\b/, "\\b", jsValue)
gsub(/\f/, "\\f", jsValue)
gsub(/\n/, "\\n", jsValue)
gsub(/\r/, "\\r", jsValue)
gsub(/\t/, "\\t", jsValue)
return jsValue
}
function jsonStringEscapeAndWrap(jsValue) {
return "\42" jsonEscape(jsValue) "\42"
}
function jsonPrint(contentArray, contentRowsCount, schemaArray){
result = ""
schemaLength = length(schemaArray)
for (x = 1; x <= contentRowsCount; x++) {
result = result "{"
for(y = 1; y <= schemaLength; y++){
result = result "\42" sTrim(schemaArray[y]) "\42:" sTrim(contentArray[x, y])
if(y < schemaLength){
result = result ","
}
}
result = result "}"
if(x < contentRowsCount){
result = result ",\n"
}
}
return result
}
function jsonValidateAndPrint(contentArray, contentRowsCount, schemaArray, schemaColumnsCount, errorArray){
result = ""
errorsCount = 1
for (x = 1; x <= contentRowsCount; x++) {
jsonRow = "{"
for(y = 1; y <= schemaColumnsCount; y++){
regexValue = schemaArray[y, 2]
jsonValue = sTrim(contentArray[x, y])
isValid = jsonValue ~ regexValue
if(isValid == 0){
errorArray[errorsCount, 1] = "\42" sTrim(schemaArray[y, 1]) "\42"
errorArray[errorsCount, 2] = "\42Value " jsonValue " not match format: " regexValue " \42"
errorArray[errorsCount, 3] = x
errorsCount++
jsonValue = "null"
}
jsonRow = jsonRow "\42" sTrim(schemaArray[y, 1]) "\42:" jsonValue
if(y < schemaColumnsCount){
jsonRow = jsonRow ","
}
}
jsonRow = jsonRow "}"
result = result jsonRow
if(x < contentRowsCount){
result = result ",\n"
}
}
return result
}
BEGIN{
rowsCount =1
matchCount = 0
errorsCount = 0
shareholdersJsonSchema[1, 1] = "Imie"
shareholdersJsonSchema[2, 1] = "Nazwisko"
shareholdersJsonSchema[3, 1] = "PESEL"
shareholdersJsonSchema[4, 1] = "Funkcja"
shareholdersJsonSchema[1, 2] = "\\.*"
shareholdersJsonSchema[2, 2] = "\\.*"
shareholdersJsonSchema[3, 2] = "^[0-9]{11}$"
shareholdersJsonSchema[4, 2] = "\\.*"
errorsSchema[1] = "PropertyName"
errorsSchema[2] = "Message"
errorsSchema[3] = "PositionIndex"
resultSchema[1]= "ShareHolders"
resultSchema[2]= "Errors"
}
/dane wspólników/,/czwarta linia/{
if(/imie/ || /nazwisko/ || /pesel/ || /funkcja/){
if(/imie/){
shareholdersArray[rowsCount, 1] = jsonStringEscapeAndWrap($2)
matchCount++
}
if(/nazwisko/){
shareholdersArray[rowsCount, 2] = jsonStringEscapeAndWrap($2)
matchCount ++
}
if(/pesel/){
shareholdersArray[rowsCount, 3] = $2
matchCount ++
}
if(/funkcja/){
shareholdersArray[rowsCount, 4] = jsonStringEscapeAndWrap($2)
matchCount ++
}
if(matchCount==4){
rowsCount++
matchCount = 0;
}
}
}
END{
shareHolders = jsonValidateAndPrint(shareholdersArray, rowsCount - 1, shareholdersJsonSchema, 4, errorArray)
shareHoldersErrors = jsonPrint(errorArray, length(errorArray) / length(errorsSchema), errorsSchema)
resultArray[1,1] = "\n[\n" shareHolders "\n]\n"
resultArray[1,2] = "\n[\n" shareHoldersErrors "\n]\n"
resultJson = jsonPrint(resultArray, 1, resultSchema)
print resultJson
}
Produces output:
{"ShareHolders":
[
{"Imie":"JanA","Nazwisko":"NowakA","PESEL":null,"Funkcja":"PrezesA"},
{"Imie":"Ja\"nD","Nazwisko":"NowakD","PESEL":11111111111,"Funkcja":"PrezesD"},
{"Imie":"JanC","Nazwisko":"NowakC","PESEL":null,"Funkcja":"PrezesC"}
]
,"Errors":
[
{"PropertyName":"PESEL","Message":"Value 11111111111A not match format: ^[0-9]{11}$ ","PositionIndex":1},
{"PropertyName":"PESEL","Message":"Value 12342C not match format: ^[0-9]{11}$ ","PositionIndex":3}
]
}
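For comparison, the validate-then-serialize flow of the awk script can be sketched in a few lines of Python. This is an illustrative sketch under assumptions, not a translation: SCHEMA and validate_rows are names invented here, the rows are assumed to be already extracted from the input file, and PESEL is kept as a string (a bare number with a leading zero would not be valid JSON).

```python
import json
import re

# (property name, validation regex) per column, mirroring shareholdersJsonSchema
SCHEMA = [
    ("Imie", r".*"),
    ("Nazwisko", r".*"),
    ("PESEL", r"^[0-9]{11}$"),
    ("Funkcja", r".*"),
]

def validate_rows(rows, schema=SCHEMA):
    """Build records from rows; invalid values become null and are reported."""
    shareholders, errors = [], []
    for position, row in enumerate(rows, start=1):
        record = {}
        for (name, pattern), value in zip(schema, row):
            if re.search(pattern, value):
                record[name] = value
            else:
                record[name] = None
                errors.append({
                    "PropertyName": name,
                    "Message": "Value %s does not match format: %s" % (value, pattern),
                    "PositionIndex": position,
                })
        shareholders.append(record)
    return {"ShareHolders": shareholders, "Errors": errors}

rows = [
    ["JanA", "NowakA", "11111111111A", "PrezesA"],
    ['Ja"nD', "NowakD", "11111111111", "PrezesD"],
]
print(json.dumps(validate_rows(rows), indent=2))
```

Because json.dumps does the escaping, the embedded double quote in Ja"nD needs no special handling.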

Convert spark decision tree model debug string to nested JSON in scala

Similar to the tree JSON parsing quoted here, I am trying to implement a simple visualization of decision trees in Scala. It should work exactly like the display method available in Databricks notebooks.
I am new to Scala and struggling to get the logic right. I understand that we have to make recursive calls to build the children and stop when the final prediction values are reached. I have attempted this in the code below, using the model debug string shown in treeJson1 as input:
def getStatmentType(x: String): (String, String) = {
val ifPattern = "If+".r
val ifelsePattern = "Else+".r
var t = ifPattern.findFirstIn(x.toString)
if(t != None){
("If", (x.toString).replace("If",""))
}else {
var ts = ifelsePattern.findFirstIn(x.toString)
if(ts != None) ("Else", (x.toString).replace("Else", ""))
else ("None", (x.toString).replace("(", "").replace(")",""))
}
}
def delete[A](test:List[A])(i: Int) = test.take(i) ++ test.drop((i+1))
def BuildJson(tree:List[String]):List[Map[String, Any]] = {
var block:List[Map[String, Any]] = List()
var lines:List[String] = tree
loop.breakable {
while (lines.length > 0) {
println("here")
var (cond, name) = getStatmentType(lines(0))
println("initial" + cond)
if (cond == "If") {
println("if" + cond)
// lines = lines.tail
lines = delete(lines)(0)
block = block :+ Map("if-name" -> name, "children" -> BuildJson(lines))
println("After pop Else State"+lines(0))
val (p_cond, p_name) = getStatmentType(lines(0))
// println(p_cond + " = "+ p_name+ "\n")
cond = p_cond
name = p_name
println(cond + " after="+ name+ "\n")
if (cond == "Else") {
println("else" + cond)
lines = lines.tail
block = block :+ Map("else-name" -> name, "children" -> BuildJson(lines))
}
}else if( cond == "None") {
println(cond + "NONE")
lines = delete(lines)(0)
block = block :+ Map("predict" -> name)
}else {
println("Finaly Break")
println("While loop--" +lines)
loop.break()
}
}
}
block
}
def treeJson1(str: String):JsValue = {
val str = "If (feature 0 in {1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,10.0,11.0,12.0,13.0})\n If (feature 0 in {6.0})\n Predict: 17.0\n Else (feature 0 not in {6.0})\n Predict: 6.0\n Else (feature 0 not in {1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,10.0,11.0,12.0,13.0})\n Predict: 20.0"
val x = str.replace(" ","")
val xs = x.split("\n").toList
var js = BuildJson(xs)
println(MapReader.mapToJson(js))
Json.toJson("")
}
Expected output:
[
{
'name': 'Root',
'children': [
{
'name': 'feature 0 in {1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,10.0,11.0,12.0,13.0}',
'children': [
{
'name': 'feature 0 in {6.0}',
'children': [
{
'name': 'Predict: 17.0'
}
]
},
{
'name': 'feature 0 not in {6.0}',
'children': [
{
'name': 'Predict: 6.0'
}
]
}
]
},
{
'name': 'feature 0 not in {1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,10.0,11.0,12.0,13.0}',
'children': [
{
'name': 'Predict: 20.0'
}
]
}
]
}
]
You don't need to parse the debug string; instead, you can walk the tree starting from the root node of the model.
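If parsing the debug string is still preferred, the nesting can be recovered from the indentation with a stack instead of recursion. The following is a minimal Python sketch under one assumption: each nested If/Else/Predict line is indented more deeply than its parent, as in the model's debug output. The function name parse_debug_string is chosen here for illustration.

```python
def parse_debug_string(text):
    """Turn an indentation-nested If/Else/Predict listing into a name/children tree."""
    root = {"name": "Root", "children": []}
    stack = [(-1, root)]  # (indent width, node) for every currently open ancestor
    for raw in text.splitlines():
        if not raw.strip():
            continue
        indent = len(raw) - len(raw.lstrip(" "))
        line = raw.strip()
        if line.startswith("If ") or line.startswith("Else "):
            # keep only the condition between the outermost parentheses
            name = line[line.index("(") + 1 : line.rindex(")")]
        else:
            name = line  # e.g. "Predict: 17.0"
        node = {"name": name}
        # close every block that is indented at least as deeply as this line
        while stack[-1][0] >= indent:
            stack.pop()
        stack[-1][1].setdefault("children", []).append(node)
        stack.append((indent, node))
    return [root]

debug = (
    "If (feature 0 in {1.0,2.0})\n"
    " If (feature 0 in {6.0})\n"
    "  Predict: 17.0\n"
    " Else (feature 0 not in {6.0})\n"
    "  Predict: 6.0\n"
    "Else (feature 0 not in {1.0,2.0})\n"
    " Predict: 20.0"
)
tree = parse_debug_string(debug)
```

Note that stripping all spaces from the string before splitting, as treeJson1 does, throws away exactly the indentation this approach relies on.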

How to convert jCard into first-class JSON?

I need a stable and secure conversion algorithm (in any language) that can produce the final output as "first-class" JSON objects. Example:
jCard format: [["version", {}, "text", "4.0"],["fn", {}, "text", "Forrest Gump"]]
First-class JSON format: {"version":"4.0","fn":"Forrest Gump"}.
In Python:
First, I create a function named jcard_to_json that takes a jCard as input and converts it to a "first-class" JSON object. The function iterates over the items in the jCard and, for each item, adds a key-value pair to the json_obj dictionary, where the key is the first element of the item and the value is the fourth.
Example:
def jcard_to_json(jcard):
    json_obj = {}
    for item in jcard:
        # item is [name, parameters, type, value]
        json_obj[item[0]] = item[3]
    return json_obj
jcard = [["version", {}, "text", "4.0"], ["fn", {}, "text", "Forrest Gump"]]
json_obj = jcard_to_json(jcard)
In Java.
import org.json.JSONArray;
import org.json.JSONException;
import org.json.JSONObject;
public class JCardTest {
public static void main(String[] args) throws JSONException {
String jcardStr = "[\"vcard\","
+ " ["
+ " [\"version\", {}, \"float\", \"4.0\"],"
+ " [\"fn\", {}, \"text\", \"John Doe\"],"
+ " [\"gender\", {}, \"text\", \"M\"],"
+ " [\"categories\", {}, \"text\", \"computers\", \"cameras\"],"
+ " [\"number\", {}, \"integer\", 12345],"
+ " [\"adr\","
+ " { \"type\": \"work\" },"
+ " \"text\","
+ " ["
+ " \"\","
+ " \"Suite D2-630\","
+ " \"2875 Laurier\","
+ " \"Quebec\","
+ " \"QC\","
+ " \"G1V 2M2\","
+ " \"Canada\""
+ " ]"
+ " ]"
+ " ]"
+ " ]";
JSONArray jcard = new JSONArray(jcardStr);
jcard = jcard.getJSONArray(1);
JSONObject result = new JSONObject();
for (int i = 0; i < jcard.length(); i++) {
JSONArray arr = jcard.getJSONArray(i);
String name = arr.getString(0);
String dataType = arr.getString(2);
if (arr.length() == 4) {
switch (dataType) {
case "integer": {
long val = arr.getLong(3);
result.put(name, val);
}
break;
case "float": {
double val = arr.getDouble(3);
result.put(name, val);
}
break;
default:
Object val = arr.get(3);
if (val instanceof JSONArray) {
result.put(name, (JSONArray) val);
} else {
result.put(name, val.toString());
}
break;
}
} else {
JSONArray resArr = new JSONArray();
for (int j = 3; j < arr.length(); j++) {
resArr.put(arr.get(j).toString());
}
result.put(name, resArr);
}
}
System.out.println(result);
}
}
This ignores the "parameter" part (arr.get(1)) of the jCard entries.
The code is deliberately verbose so as not to hide important details; it could be written more compactly.
The example is based on the examples in the jCard specification, RFC 7095.
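The multi-value handling from the Java example can also be sketched compactly in Python. This is an illustrative sketch, not a complete jCard converter: jcard_to_flat is a name chosen here, parameters and value types are ignored, and a property name that occurs more than once simply overwrites the earlier entry.

```python
def jcard_to_flat(jcard):
    """Flatten jCard properties into a plain dict of name -> value(s)."""
    # Accept either ["vcard", [properties...]] or a bare list of properties.
    props = jcard[1] if jcard and jcard[0] == "vcard" else jcard
    result = {}
    for name, _params, _value_type, *values in props:
        # A single value is stored as-is; several values are kept as a list.
        result[name] = values[0] if len(values) == 1 else values
    return result

card = ["vcard", [
    ["version", {}, "text", "4.0"],
    ["fn", {}, "text", "Forrest Gump"],
    ["categories", {}, "text", "computers", "cameras"],
]]
flat = jcard_to_flat(card)
# flat == {"version": "4.0", "fn": "Forrest Gump",
#          "categories": ["computers", "cameras"]}
```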

How to yield a JSON object from a for loop in scala?

for (character <- content) {
if (character == '\n') {
val current_line = line.mkString
line.clear()
current_line match {
case docStartRegex(_*) => {
startDoc = true
endText = false
endDoc = false
}
case docnoRegex(group) => {
docID = group.trim
}
case docTextStartRegex(_*) => {
startText = true
}
case docTextEndRegex(_*) => {
endText = true
startText = false
}
case docEndRegex(_*) => {
endDoc = true
startDoc = false
es_json = Json.obj(
"_index" -> "ES_SPARK_AP",
"_type" -> "document",
"_id" -> docID,
"_source" -> Json.obj(
"text" -> textChunk.mkString(" ")
)
)
// yield es_json
textChunk.clear()
}
case _ => {
if (startDoc && !endDoc && startText) {
textChunk += current_line.trim
}
}
}
} else {
line += character
}
}
The above for-loop parses a text file and creates a JSON object from each chunk it parses. This JSON will then be sent to Elasticsearch for further processing. In Python, we can yield the JSON and consume it through a generator easily, like this:
def func():
    for i in range(num):
        ... some computations ...
        yield {
            JSON ## JSON is yielded
        }
for json in func(): ## we parse through the generator here.
    process(json)
I cannot figure out how to use yield in a similar fashion in Scala.
If you want lazy returns, Scala does this using Iterator types. Specifically, if you want to handle values line by line, I'd split the content into lines first with .lines:
val content: String = ???
val results: Iterator[Json] =
  for {
    // a single generator: each element of content.lines is already one line
    line <- content.lines
  } yield {
    line match {
      case docEndRegex(_*) => ...
    }
  }
You can also use a function directly
def toJson(line: String): Json =
  line match {
    case "hi" => Json.obj("line" -> "hi")
    case "bye" => Json.obj("what" -> "a jerk")
  }
val results: Iterator[Json] =
  for {
    // a single generator: each element of content.lines is already one line
    line <- content.lines
  } yield toJson(line)
This is equivalent to doing
content.lines.map(line => toJson(line))
Or, somewhat equivalently, in Python:
lines = (line.strip() for line in content.split("\n"))
jsons = (toJson(line) for line in lines)
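For reference, the stateful loop from the question maps onto a Python generator as sketched below. The markers are an assumption: the question does not show its regexes, so TREC-style tags such as <DOC> and <DOCNO> are used here as hypothetical stand-ins, and iter_documents is a name chosen for illustration.

```python
import re

# Hypothetical marker; the question's docnoRegex is not shown.
DOCNO = re.compile(r"<DOCNO>(.*)</DOCNO>")

def iter_documents(lines):
    """Lazily yield one Elasticsearch-style dict per <DOC>...</DOC> block."""
    doc_id, in_text, chunks = None, False, []
    for raw in lines:
        line = raw.strip()
        docno = DOCNO.fullmatch(line)
        if line == "<DOC>":
            # a new document starts: reset the per-document state
            doc_id, in_text, chunks = None, False, []
        elif docno:
            doc_id = docno.group(1).strip()
        elif line == "<TEXT>":
            in_text = True
        elif line == "</TEXT>":
            in_text = False
        elif line == "</DOC>":
            yield {
                "_index": "ES_SPARK_AP",
                "_type": "document",
                "_id": doc_id,
                "_source": {"text": " ".join(chunks)},
            }
        elif in_text:
            chunks.append(line)

sample = "<DOC>\n<DOCNO> AP-1 </DOCNO>\n<TEXT>\nhello\nworld\n</TEXT>\n</DOC>\n"
docs = list(iter_documents(sample.splitlines()))
```

The state variables live inside the generator, so nothing is built until the caller asks for the next document; an Iterator in Scala plays the same role.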

awk to translate config file to json

I have a config file like this one:
[sectionOne]
key1_1=value1_1
key1_n=value1_n
#this is a comment
[sectionTwo]
key2_1=value2_1
key2_n=value2_n
;this is a comment also
[SectionThree]
key3_1=value3_1
key3_n=value3_n
[SectionFor]
...
I need to translate this into json, using minimal shell tools (no perl,python,php, just sed,awk available)
The desired output is :
[
{"sectionOne": { "key1_1": "value1_1","key1_n": "value1_n"} },
{"sectionTwo": { "key2_1": "value2_1","key2_n": "value2_n"} },
{"sectionThree": { "key3_1": "value3_1","key3_n": "value3_n"}}
....
]
I tried several ways/hours, no success
Thank you in advance
There are some inconsistencies between your sample input and desired output, so it's hard to be sure, but this should be close and easy to tweak if it's not 100% what you want:
$ cat file
[sectionOne]
key1_1=value1_1
key1_n=value1_n
#this is a comment
[sectionTwo]
key2_1=value2_1
key2_n=value2_n
;this is a comment also
[SectionThree]
key3_1=value3_1
key3_n=value3_n
$
$ cat tst.awk
BEGIN{
FS="="
print "["
}
/^([#;]|[[:space:]]*$)/ {
next
}
gsub(/[][]/,"") {
printf "%s{\"%s\": { ", rs, $0
rs="} },\n"
fs=""
next
}
{
printf "%s\"%s\": \"%s\"", fs, $1, $2
fs=","
}
END{
# close the last section without a trailing comma, so the output is valid JSON
if (rs != "") print "} }"
print "]"
}
$
$ awk -f tst.awk file
[
{"sectionOne": { "key1_1": "value1_1","key1_n": "value1_n"} },
{"sectionTwo": { "key2_1": "value2_1","key2_n": "value2_n"} },
{"SectionThree": { "key3_1": "value3_1","key3_n": "value3_n"} }
]
awk 'BEGIN{ print "[" }
/^[#;]/{ next } # Ignore comments
/^\[/{ gsub( "[][]", "" ); printf "%s{\"%s\": { ", s ? "}},\n" : "", $0; n=0; s=1 }
/=/ { gsub( "=", "\":\"" ); printf "%c\"%s\" ", n ? "," : "", $0; n=1 }
END{ print "}}\n]" }
'
Here's a solution in bash using awk:
#!/bin/bash
awk -F"=" 'BEGIN{in_section=0; first_field=0; printf "["}
{
last=length($1);
if ( (substr($1,1,1) == "[") && (substr($1,last, last) == "]")) {
if (in_section==1) {
printf "} },";
}
section=substr($1, 2, last-2);
printf "\n{\"%s\":", section;
printf " {";
first_field=1;
in_section=1;
} else if ( substr($1, 1, 1) == "#" || substr($1, 1, 1) == ";"){
} else if ( ($1 != "") && ($2 != "") ) {
if (first_field==0) {
printf ", ";
}
printf "\"%s\": \"%s\"", $1, $2;
first_field=0;
}
}
END{printf "} }\n]\n"}'