How can I match fields with wildcards using jq?

I have a JSON object of the following form:
{
  "Task11c-0-20181209-12:59:30-65611" : {
    "attributes" : {
      "configname" : "Task11c",
      "datetime" : "20181209-12:59:30",
      "experiment" : "Task11c",
      "inifile" : "lab1.ini",
      "iterationvars" : "",
      "iterationvarsf" : "",
      "measurement" : "",
      "network" : "Manhattan1_1C",
      "processid" : "65611",
      "repetition" : "0",
      "replication" : "#0",
      "resultdir" : "results",
      "runnumber" : "0",
      "seedset" : "0"
    },
    ......
  },
  ......
  "Task11b-12-20181209-13:03:17-65612" : {
    ....
    ....
  },
  .......
}
I reported only the first part, but in general I have many other sub-objects whose keys match a string like Task11c-0-20181209-12:59:30-65611. They all share the initial word Task. I want to extract the processid from each sub-object. I'm trying to use a wildcard as in bash, but that does not seem to be possible.
I also read about the match() function, but it works on strings, not JSON objects.
Thanks for the support.

Filter keys that start with Task and get only the attribute of your choice using the select() expression:
jq 'to_entries[] | select(.key|startswith("Task")).value.attributes.processid' json
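On the sample above this prints "65611". If you would rather do the same filtering in a general-purpose language, here is a minimal Python sketch of the same idea (the file name data.json is an assumption):
import json

# Sketch only: select top-level keys that start with "Task" and print
# each sub-object's processid; assumes the input lives in data.json.
with open("data.json") as f:
    data = json.load(f)

for key, value in data.items():
    if key.startswith("Task"):
        print(value["attributes"]["processid"])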

Related

Convert MongoDB Document to Extended JSON in Shell

I am looking for a shell tool that can convert a MongoDB document into Extended JSON.
So if the original JSON file looks like this:
{
"_id" : ObjectId("5a8c60b8c83eaf000fb39547"),
"name" : "myName",
"created" : ISODate("2018-02-20T17:54:00.091Z"),
"components" : [
...
The result would be something like this:
{
"$oid" : "5a8c60b8c83eaf000fb39547",
"name" : "myName",
"created" : { "$date" : "2018-02-20T17:54:00.091Z"},
"components" : [
...
The MongoDB shell speaks JavaScript, so the answer is simple: use JSON.stringify(). If your command is db.serverStatus(), then you can simply do this:
JSON.stringify(db.serverStatus())
This won't output the proper "strict mode" representation of each of the fields ({ "floatApprox": <number> } instead of { "$numberLong": "<number>" }), but if what you care about is getting standards-compliant JSON out, this'll do the trick.
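If a script outside the mongo shell is an option, PyMongo's bson.json_util can also emit MongoDB Extended JSON; a minimal sketch, where the connection string, database and collection names are assumptions:
from pymongo import MongoClient
from bson.json_util import dumps

# Sketch only: fetch one document and print it as Extended JSON,
# so ObjectId becomes {"$oid": ...} and dates become {"$date": ...}.
client = MongoClient("mongodb://localhost:27017")
doc = client["mydb"]["mycollection"].find_one()
print(dumps(doc, indent=2))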

read.json only reading the first object in Spark

I have a multi-line JSON file, and I am using Spark's read.json to read it. The problem is that it only reads the first object from that JSON file.
val dataFrame = spark.read.option("multiLine", true).option("mode", "PERMISSIVE").json(path)
dataFrame.rdd.saveAsTextFile("DataFrame")
Sample json:
{
  "_id" : "589895e123c572923e69f5e7",
  "thing" : "54eb45beb5f1e061454c5bf4",
  "timeline" : [
    {
      "reason" : "TRIP_START",
      "timestamp" : "2017-02-06T17:20:18.007+02:00",
      "type" : "TRIP_EVENT",
      "location" : [
        11.1174091,
        69.1174091
      ],
      "endLocation" : [],
      "startLocation" : []
    },
    {
      "reason" : "TRIP_END",
      "timestamp" : "2017-02-06T17:25:26.026+02:00",
      "type" : "TRIP_EVENT",
      "location" : [
        11.5691428,
        48.1122443
      ],
      "endLocation" : [],
      "startLocation" : []
    }
  ],
  "__v" : 0
}
{
  "_id" : "589895e123c572923e69f5e8",
  "thing" : "54eb45beb5f1e032241c5bf4",
  "timeline" : [
    {
      "reason" : "TRIP_START",
      "timestamp" : "2017-02-06T17:20:18.007+02:00",
      "type" : "TRIP_EVENT",
      "location" : [
        11.1174091,
        50.1174091
      ],
      "endLocation" : [],
      "startLocation" : []
    },
    {
      "reason" : "TRIP_END",
      "timestamp" : "2017-02-06T17:25:26.026+02:00",
      "type" : "TRIP_EVENT",
      "location" : [
        51.1174091,
        69.1174091
      ],
      "endLocation" : [],
      "startLocation" : []
    }
  ],
  "__v" : 0
}
I get only the first entry with id = 589895e123c572923e69f5e7.
Is there something that I am doing wrong?
Are you sure multiple multi-line JSON objects are supported?
Each line must contain a separate, self-contained valid JSON object... For a regular multi-line JSON file, set the multiLine option to true
http://spark.apache.org/docs/latest/sql-programming-guide.html#json-datasets
Where a "regular JSON file" means the entire file is a singular JSON object / array, however, simply putting {} around your data won't work because you need a key for every object, and so you'd need a top level key, maybe say "objects". Similarly, you can try an array, but wrapping with []. Either way, these will only work if every object in that array or object is separated by commas.
tl;dr - the whole file needs to be one valid JSON object when multiline=true
You're only getting one object because it parses the first set of brackets, and that's it.
If you have full control over the JSON file, the indented layout is purely for human consumption. Just flatten the objects and let Spark parse them as the API is intended to be used.
Keep one JSON value per line in the file and remove .option("multiLine", true), like this:
{"name":"Michael"}
{"name":"Andy", "age":30}
{"name":"Justin", "age":19}

Is there any way to preserve the order while generating a patch for JSON files?

I am new to JSON tooling, specifically JSON Patch.
I have a scenario where I need to work out the difference between two versions of a JSON file for the same object, and for that I am using json-patch-master.
Unfortunately the generated patch interprets the array in a different order, so I am getting unexpected/invalid results.
Could anyone tell me how to preserve the order while generating a JSON Patch?
Here is an actual example.
Original JSON file:
[ {
"name" : "name1",
"roolNo" : "1"
}, {
"name" : "name2",
"roolNo" : "2"
}, {
"name" : "name3",
"roolNo" : "3"
}, {
"name" : "name4",
"roolNo" : "4"
} ]
Modified/new JSON file, i.e. with the 2nd node of the original file removed:
[ {
"name" : "name1",
"roolNo" : "1"
}, {
"name" : "name3",
"roolNo" : "3"
}, {
"name" : "name4",
"roolNo" : "4"
} ]
Patch/diff generated:
[ {"op":"remove","path":"/3"},
{"op":"replace","path":"/1/name","value":"name3"},
{"op":"replace","path":"/1/roolNo","value":"3"},
{"op":"replace","path":"/2/name","value":"name4"},
{"op":"replace","path":"/2/roolNo","value":"4"}]
Every time I generate the diff/patch it gives different path/diff results.
Moreover, the interpretation is different, i.e. the order is not preserved.
Is there any way to get the expected result, i.e. [ {"op":"remove","path":"/1"} ]? In other words, can the patch/diff be generated in an order-aware way so that I get what is expected?
How should I handle this kind of scenario?
Please help me.
Thank you so much.
~Shyam
We are currently working on this issue in Starcounter-Jack/JSON-Patch.
It seems to work nicely with native Array.observe: http://jsfiddle.net/tomalec/p4s7aw96/.
Try the Starcounter-Jack/JSON-Patch issues/65_ArrayObserve branch;
we will release it as a new version once the shim and performance have been checked.
Feel free to add your comments on the JSON-Patch issue board.
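As a sanity check of the expected output, the third-party Python package jsonpatch (not the Java library used in the question) can apply the minimal patch to the original array and reproduce the modified one; a small sketch:
import jsonpatch  # third-party package: python-json-patch

original = [
    {"name": "name1", "roolNo": "1"},
    {"name": "name2", "roolNo": "2"},
    {"name": "name3", "roolNo": "3"},
    {"name": "name4", "roolNo": "4"},
]

# Applying the expected minimal patch removes only the 2nd element.
patched = jsonpatch.apply_patch(original, [{"op": "remove", "path": "/1"}])
print(patched)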

MongoDB AND Comparison Fails

I have a collection named studentCollection with the two documents given below:
> db.studentCollection.find().pretty()
{
  "_id" : ObjectId("52d7c0c744b4dd77efe93df7"),
  "regno" : 101,
  "name" : "Ajeesh",
  "gender" : "Male",
  "docs" : [
    "voterid",
    "passport",
    "drivinglic"
  ]
}
{
  "_id" : ObjectId("52d7c6a144b4dd77efe93df8"),
  "regno" : 102,
  "name" : "Sathish",
  "gender" : "Male",
  "dob" : ISODate("2013-12-09T21:05:00Z")
}
Why does the query below return a document when it doesn't fulfil the criteria I gave in the find command? I know it's a bad query for an AND comparison. I tried the equivalent in MySQL and it returns nothing, as expected, so why does NoSQL behave differently? I suspect it only considers the last field for the comparison.
> db.studentCollection.find({regno:101,regno:102}).pretty()
{
  "_id" : ObjectId("52d7c6a144b4dd77efe93df8"),
  "regno" : 102,
  "name" : "Sathish",
  "gender" : "Male",
  "dob" : ISODate("2013-12-09T21:05:00Z")
}
Can anyone briefly explain why MongoDB works this way?
MongoDB uses JSON/BSON, and names within an object should be unique (http://www.ietf.org/rfc/rfc4627.txt, section 2.2). I found this in another post, "How to generate a JSON object dynamically with duplicate keys?". I am guessing the value for 'regno' gets overridden to 102 in your case.
If what you want is an OR query, try the following (note that regno is stored as a number, not a string):
db.studentCollection.find( { $or : [ { "regno" : 101 }, { "regno" : 102 } ] } );
Or even better, use $in:
db.studentCollection.find( { "regno" : { $in: [101, 102] } } );
Hope this helps!
Edit : Typo!
MongoDB converts your query to a JavaScript object. Since you have not used an explicit $and condition in your document, your query clause gets overwritten by the last value, which is regno: 102. Hence you get the last document as the result.
If you want to use an explicit $and, you may use either of the following:
db.studentCollection.find({$and:[{regno:102}, {regno:101}]});
db.studentCollection.find({regno:{$gte:101, $lte:102}});
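The overwriting is easy to see outside the shell as well, because a literal with a duplicate key keeps only the last value; a minimal Python/PyMongo sketch (connection details and database name are assumptions):
from pymongo import MongoClient

# A duplicate key in a dict literal silently collapses to the last value,
# just like {regno: 101, regno: 102} does in the mongo shell.
query = {"regno": 101, "regno": 102}
print(query)  # {'regno': 102}

coll = MongoClient("mongodb://localhost:27017")["test"]["studentCollection"]
print(list(coll.find({"$or": [{"regno": 101}, {"regno": 102}]})))
print(list(coll.find({"regno": {"$in": [101, 102]}})))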

Is it possible to extract specific data from JSON data without reading all the values?

I have this JSON data.
My question is: is it possible to extract specific data from the JSON without reading all the values?
I mean, is it possible to query the data as we do in SQL?
{ "_id" : ObjectId("4e61501e6a73bc73f82f91f3"), "created_at" : "2011-09-02 17:52:30.285", "cust_id" : "sdtest", "moduleName" : "balances", "responses" : [
{
"questionNum" : "1",
"answer" : "Hard",
"comments" : "is that you john wayne?"
},
{
"questionNum" : "2",
"answer" : "Somewhat",
"comments" : "ARg!"
},
{
"questionNum" : "3",
"answer" : "",
"comments" : "Yes"
}
] }
Yes, but you will need to write extra code to do it, or use a third party library. There are a few available: http://www.google.co.uk/search?q=json+linq+sql
Well, unless you use an incremental JSON parser, you'll have to parse the whole JSON first. After that, it depends on your programming language's filtering capabilities. For example, in Python:
import json

# Parse the whole document first, then keep only the responses with a non-empty answer.
obj = json.loads(jsonData)
answeredQuestions = [response for response in obj["responses"] if response["answer"]]
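For the "without reading all the values" part, an incremental parser such as the third-party ijson package can stream just the responses array instead of loading the whole document; a minimal sketch, assuming the data is in responses.json and the Mongo-specific ObjectId(...) wrapper has been stripped so the file is plain JSON:
import ijson  # third-party incremental JSON parser

with open("responses.json", "rb") as f:
    for response in ijson.items(f, "responses.item"):
        if response["answer"]:
            print(response["questionNum"], response["answer"])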