grouped count with a column and index position mongodb - json

This is the structure of a document I have in one MongoDB collection.
I want to understand how to write a MongoDB aggregation that gives a grouped count over the key "code" and its index position in the nested "scheduleds" array (not the priority, which can be any number, whereas the nested array can hold just 5 values):
{
"_id" : ObjectId("5749e9fde4b0064e7362b560"),
"_class" : "com.weirdcompanyname.core.collectionname",
"rfId" : 1,
"scheduleds" : [
{
"code" : "556e4835f1eae40bdfa2f2001f2afc76",
"type" : "HT",
"priority" : 0
},
{
"code" : "8b2ab67af4f60e42f7ea64813b5795cf",
"type" : "HT",
"priority" : 1
},
{
"code" : "ed17101eb918b4d8c7c598e4884523ea",
"type" : "HT",
"priority" : 2
},
{
"code" : "7e0ffb4db",
"type" : "QZ",
"priority" : 3
},
{
"code" : "1453dfa1794f39b05f0259ad04699073",
"type" : "HT",
"priority" : 4
}
],
"created" : ISODate("2016-05-28T18:57:00.878Z")
}
The result I'm trying to get is:
code                              index_position  count
556e4835f1eae40bdfa2f2001f2afc76  0               100
8b2ab67af4f60e42f7ea64813b5795cf  1               100
ed17101eb918b4d8c7c598e4884523ea  2               100
7e0ffb4db                         3               100
1453dfa1794f39b05f0259ad04699073  4               100
I can get my head around unwinding the nested array into single documents and then grouping by code and maybe another column, say priority, to get the count, but the problem is getting the index position.
Is this even doable in MongoDB? I've read a lot about it, and it seems doable when I have a known value whose position I need, but I don't have a value to look for. What I'm looking for is each code, its index position in "scheduleds", and the count.
This is what I could do with my limited MongoDB querying skills:
db.collectionname.aggregate([
  {'$match': {'date_key': {'$gte': yesterday_beginning, '$lte': yesterday_end}}},
  {'$unwind': '$scheduleds'},
  {'$group': {'_id': {'code': '$scheduleds.code', 'priority': '$scheduleds.priority'}, 'rfid': {'$addToSet': '$rfId'}}},
  {'$project': {'_id': 0, 'code': '$_id.code', 'priority': '$_id.priority', 'totalRfid': {'$size': '$rfid'}}},
  {'$limit': 1000}
], {allowDiskUse: true})

Alain1405 says here that MongoDB 3.2 supports including the array index when unwinding.
Instead of passing just a path to the $unwind operator, you can pass an
object with the field path and an includeArrayIndex option, which names
the new field that will hold the array index.
From the official MongoDB documentation:
{
$unwind:
{
path: <field path>,
includeArrayIndex: <string>,
preserveNullAndEmptyArrays: <boolean>
}
}
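As a rough sketch of how this could be applied to the question, the pipeline below unwinds "scheduleds" with its index and groups on code plus index. The output field name index_position and the sample documents are assumptions made for illustration, and counting with $sum: 1 counts documents rather than distinct rfId values as the question's own query does. The helper function is a plain-JavaScript simulation of the same stages so the logic can be checked without a server; it is not MongoDB itself.

```javascript
// Pipeline that could be passed to db.collectionname.aggregate(...).
// "index_position" is a field name chosen here for illustration.
const pipeline = [
  { $unwind: { path: "$scheduleds", includeArrayIndex: "index_position" } },
  { $group: {
      _id: { code: "$scheduleds.code", index_position: "$index_position" },
      count: { $sum: 1 }
  } },
  { $project: { _id: 0, code: "$_id.code", index_position: "$_id.index_position", count: 1 } }
];

// Plain-JavaScript simulation of the unwind + group stages above.
function countByCodeAndIndex(docs) {
  const counts = new Map();
  for (const doc of docs) {
    (doc.scheduleds || []).forEach((item, index) => {
      // Group key is the pair (code, array index); priority is ignored.
      const key = JSON.stringify([item.code, index]);
      counts.set(key, (counts.get(key) || 0) + 1);
    });
  }
  return [...counts].map(([key, count]) => {
    const [code, index_position] = JSON.parse(key);
    return { code, index_position, count };
  });
}

// Hypothetical sample documents (codes shortened for readability).
const docs = [
  { rfId: 1, scheduleds: [{ code: "aaa", priority: 7 }, { code: "bbb", priority: 3 }] },
  { rfId: 2, scheduleds: [{ code: "aaa", priority: 0 }, { code: "ccc", priority: 1 }] },
  { rfId: 3, scheduleds: [{ code: "bbb", priority: 2 }, { code: "aaa", priority: 9 }] }
];

const result = countByCodeAndIndex(docs);
console.log(result);
```

Here "aaa" at index 0 is counted twice even though its priority differs between documents, which is the distinction the question asks for.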

Related

How do I display certain parts of JSON array response from Alamofire request

How do I access the JSON array to display the output of "AdjustedScheduleTime" from the Trip section?
I got it working for StopLabel as shown below, but I'm struggling to access AdjustedScheduleTime.
I tried the following:
["GetNextTripsForStopResponse"]["GetNextTripsForStopResult"]["Route"]["RouteDirection"]["Trips"]["Trip"]["AdjustedScheduleTime"]
but it doesn't work.
override func viewDidLoad() {
super.viewDidLoad()
// Do any additional setup after loading the view, typically from a nib.
let parameters = [
"appID": "5rt5rydg", //incorrect appID
"apiKey": "3b5fb15rdgy5454hdrfhr", //incorrect apiKey
"routeNo": "14",
"stopNo": "8600",
"format": "JSON"
]
AF.request("https://api.octranspo1.com/v1.2/GetNextTripsForStop?", method: .post, parameters: parameters, encoding: URLEncoding.httpBody, headers: nil).responseJSON { response in
let swiftyJsonVar = JSON(response.result.value!)
print(swiftyJsonVar)
if let busInfo = swiftyJsonVar["GetNextTripsForStopResult"]["StopLabel"].string {
print(": ",busInfo)
print("Label1: ", self.label1.text = busInfo)
}
}
}
These are the results:
{
"GetNextTripsForStopResult" : {
"Error" : "",
"Route" : {
"RouteDirection" : {
"RouteLabel" : "St-Laurent",
"Error" : "",
"RequestProcessingTime" : "20190112151425",
"Trips" : {
"Trip" : [
{
"AdjustmentAge" : "0.38",
"GPSSpeed" : "0.5",
"Latitude" : "45.429457",
"Longitude" : "-75.684117",
"TripDestination" : "St-Laurent",
"LastTripOfSchedule" : false,
"TripStartTime" : "14:31",
"BusType" : "4LB - IN",
"AdjustedScheduleTime" : "11"
},
{
"AdjustmentAge" : "4.32",
"GPSSpeed" : "0.5",
"Latitude" : "45.413749",
"Longitude" : "-75.689748",
"TripDestination" : "St-Laurent",
"LastTripOfSchedule" : false,
"TripStartTime" : "14:46",
"BusType" : "4LB - IN",
"AdjustedScheduleTime" : "22"
},
{
"AdjustmentAge" : "0.55",
"GPSSpeed" : "31.3",
"Latitude" : "45.399587",
"Longitude" : "-75.727631",
"TripDestination" : "St-Laurent",
"LastTripOfSchedule" : false,
"TripStartTime" : "15:01",
"BusType" : "4L - IN",
"AdjustedScheduleTime" : "37"
}
]
},
"RouteNo" : 14,
"Direction" : "Eastbound"
}
},
"StopLabel" : "MCARTHUR \/ IRWIN MILLER",
"StopNo" : "8600"
}
}
: MCARTHUR / IRWIN MILLER //This is the desired output for StopLabel
OK, so let me explain how to read JSON. Here's a shot.
First some rules:
When you see an opening { it means a dictionary; you have to pick a key next.
When you see an opening [ it means an array; you have to pick an index.
When you see "SomeString": it's a key in a dictionary.
Dictionaries have keys, arrays have indexes. Pick accordingly.
So when we walk through this response:
We see that we start with {. We have a dictionary! We're expecting to see some keys next.
So let's pick a key: we only have one and it's "GetNextTripsForStopResult". So far we have: swiftyJsonVar["GetNextTripsForStopResult"]
We now look at the content of "GetNextTripsForStopResult". We see it's also a dictionary. Again we should have some keys. We do. We have Error, Route, StopLabel and more. Let's pick a key. Since we're trying to get to "AdjustedScheduleTime", let's pick Route. So far we have ["GetNextTripsForStopResult"]["Route"]
Now let's look at the contents of Route. It's a dictionary again.
Again we pick a key and keep repeating until we hit Trip. You should have ["GetNextTripsForStopResult"]["Route"]["RouteDirection"]["Trips"]["Trip"]
Let's look at what we have in Trip. What's this? It's an array!
We have to pick an index now. We need to choose somehow. That's the tricky part. In order to do that we need some more information. So let's just ARBITRARILY choose one. Let's take the last one. So we have: ["GetNextTripsForStopResult"]["Route"]["RouteDirection"]["Trips"]["Trip"][2]
Now we can get our final key, AdjustedScheduleTime. So let's pick it!
["GetNextTripsForStopResult"]["Route"]["RouteDirection"]["Trips"]["Trip"][2]["AdjustedScheduleTime"]
Keep in mind:
These hard-coded indexes are almost NEVER what you want. Maybe you need to show all the AdjustedScheduleTime values to the user, or let the user choose one, or add all of them up. That really depends on your application and what you're trying to accomplish. I chose the last index (2) arbitrarily, without any knowledge of your application, the API you're calling, or what you're trying to achieve. It's VERY possible that you don't want the last index.
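To make the key-versus-index rules concrete, here is a small sketch in plain JavaScript rather than Swift/SwiftyJSON, so the syntax differs from the question's code but the path is the same. The response object is a trimmed copy of the one in the question. Instead of hard-coding an index, it visits every element of the Trip array; whether that is what the app needs is an assumption.

```javascript
// Trimmed copy of the response shown in the question.
const response = {
  GetNextTripsForStopResult: {
    Route: {
      RouteDirection: {
        Trips: {
          Trip: [
            { TripStartTime: "14:31", AdjustedScheduleTime: "11" },
            { TripStartTime: "14:46", AdjustedScheduleTime: "22" },
            { TripStartTime: "15:01", AdjustedScheduleTime: "37" }
          ]
        }
      }
    },
    StopLabel: "MCARTHUR / IRWIN MILLER"
  }
};

// Dictionaries: pick a key at each level. Optional chaining guards against
// a missing level, and the fallback keeps the code safe on an empty response.
const trips =
  response.GetNextTripsForStopResult?.Route?.RouteDirection?.Trips?.Trip ?? [];

// Array: instead of hard-coding [2], collect the value from every element.
const adjustedTimes = trips.map((trip) => trip.AdjustedScheduleTime);
console.log(adjustedTimes);

// The hard-coded variant from the walk-through, for comparison.
const lastTime = trips.length > 2 ? trips[2].AdjustedScheduleTime : undefined;
console.log(lastTime);
```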

how to stream a json using flink?

I'm working on a stream that receives a bunch of strings, and I need to count all the strings. The sum is cumulative, meaning that for the second record the sum is added to that of the day before.
The output must be a JSON file looking like:
{
"aggregationType" : "day",
"days before" : 2,
"aggregates" : [
{"date" : "2018-03-03",
"sum" : 120},
{"date" :"2018-03-04",
"sum" : 203}
]
}
I created a stream looking like:
val eventStream : DataStream [String] =
eventStream
.addSource(source)
.keyBy("")
.TimeWindow(Time.days(1), Time.days(1))
.trigger(new MyTriggerFunc)
.aggregation(new MyAggregationFunc)
.addSink(sink)
Thank you in advance for the help :)
Note on working with JSON in Flink:
Use JSONDeserializationSchema to deserialize the events, which will produce ObjectNodes. You can map the ObjectNode to YourObject for convenience or continue working with the ObjectNode.
Tutorial on working with ObjectNode: http://www.baeldung.com/jackson-json-node-tree-model
Back to your case, you can do it like the following:
// env is assumed to be your StreamExecutionEnvironment;
// timeWindowAll is the shorthand for windowAll with a time window
val oneMinuteAgg: DataStream[ObjectNode] =
  env
    .addSource(source)
    .timeWindowAll(Time.minutes(1))
    .trigger(new MyTriggerFunc)
    .aggregate(new MyAggregationFunc)
This will output a stream of one-minute aggregates:
[
{
"date" : "2018-03-03",
"sum" : 120
},
{
"date" : "2018-03-03",
"sum" : 120
}
]
Then chain another operator to oneMinuteAgg that rolls the one-minute aggregates up into one-day aggregates:
[...]
oneMinuteAgg
  .timeWindowAll(Time.days(1))
  .trigger(new Whatever)
  .aggregate(new YourDayAggF)
That will output what you need:
{
"aggregationType" : "day",
"days before" : 4,
"aggregates" : [{
"date" : "2018-03-03",
"sum" : 120
},
{
"date" : "2018-03-03",
"sum" : 120
}]
}
I used windowAll() assuming you don't need to key the stream.
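The two-stage roll-up can be sketched outside Flink to check the arithmetic. The plain-JavaScript code below is only a model of the idea, not Flink code: the event timestamps are made up, the cumulative sum follows the question's description that each day's sum is added to the day before, and setting "days before" to the number of aggregated days is a guess at what that field means.

```javascript
// Hypothetical events: one ISO timestamp per received string.
const events = [
  "2018-03-03T09:15:10Z", "2018-03-03T09:15:40Z", "2018-03-03T17:02:05Z",
  "2018-03-04T08:00:00Z", "2018-03-04T08:00:30Z"
];

// Stage 1: count events per one-minute window.
function countPerMinute(timestamps) {
  const perMinute = new Map();
  for (const ts of timestamps) {
    const minute = ts.slice(0, 16); // "YYYY-MM-DDTHH:MM"
    perMinute.set(minute, (perMinute.get(minute) || 0) + 1);
  }
  return perMinute;
}

// Stage 2: roll the minute counts up into days, then carry a running total
// forward so each day's sum includes the days before it.
function rollUpToDays(perMinute) {
  const perDay = new Map();
  for (const [minute, count] of perMinute) {
    const day = minute.slice(0, 10); // "YYYY-MM-DD"
    perDay.set(day, (perDay.get(day) || 0) + count);
  }
  let running = 0;
  const aggregates = [...perDay.keys()].sort().map((date) => {
    running += perDay.get(date);
    return { date, sum: running };
  });
  // "days before" is assumed here to be the number of days aggregated.
  return { aggregationType: "day", "days before": aggregates.length, aggregates };
}

const dayReport = rollUpToDays(countPerMinute(events));
console.log(JSON.stringify(dayReport, null, 2));
```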

MongoDB queries return no results

I'm having a problem querying a MongoDB dataset ("On Street Crime in Camden" from data.gov.uk).
The database name is Crime_Data_in_Camden and the collection name is Street_Crime_Camden. The query to find all records, db.Street_Crime_Camden.find(), works fine, but anything else returns nothing at all. Here is a portion of the metadata:
{
"id" : 509935,
"name" : "Ward Name",
"dataTypeName" : "text",
"fieldName" : "ward_name",
"position" : 13,
"renderTypeName" : "text",
"tableColumnId" : 258836,
"width" : 100,
"cachedContents" : {
"largest" : "West Hampstead",
"non_null" : 79813,
"null" : 0,
"top" : [ {
"item" : "Regent's Park",
"count" : 20
}, {
"item" : "Swiss Cottage",
"count" : 19
}, {
"item" : "Holborn and Covent Garden",
"count" : 18
} ]
}
}
I've tried 3 attempts at a basic query:
db.Street_Crime_Camden.find({"ward_name":"West Hampstead"});
db.Street_Crime_Camden.find({'meta.ward_name':'West Hampstead'});
db.Street_Crime_Camden.find({meta:{ward_name:"West Hampstead"} });
According to the documentation and tutorials I've seen, any of these approaches should be valid. I also know that there are hundreds of rows (or documents) that match those terms, so why are these queries returning nothing? Advice would be appreciated.
The common theme in the three approaches you tried is some form of ward_name = West Hampstead, but there is no attribute named ward_name in the document you shared with us. In that document, "ward_name" appears only as the value of the fieldName attribute.
Based on the document you show in your question, the only way of addressing an attribute with the value West Hampstead is:
db.Street_Crime_Camden.find({"cachedContents.largest": "West Hampstead"});
For background: you address attributes in your documents by using dot notation, so the document you included in your question could be found by any of the following find commands:
db.Street_Crime_Camden.find({"name": "Ward Name"});
db.Street_Crime_Camden.find({"position": 13});
db.Street_Crime_Camden.find({"cachedContents.top.item": "Swiss Cottage"});
db.Street_Crime_Camden.find({"cachedContents.top.0.count": 20});
... etc
These examples might help you to understand how to form find criteria. The MongoDB docs are also useful.
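To illustrate why the original queries matched nothing, here is a simplified plain-JavaScript model of dot-notation equality matching, run against the document from the question. The resolvePath and matches helpers are hypothetical names written for this sketch; they cover only the cases used here (embedded documents, arrays of documents, numeric array positions) and are not MongoDB's actual implementation.

```javascript
// The document from the question, trimmed to the fields used below.
const doc = {
  id: 509935,
  name: "Ward Name",
  fieldName: "ward_name",
  position: 13,
  cachedContents: {
    largest: "West Hampstead",
    top: [
      { item: "Regent's Park", count: 20 },
      { item: "Swiss Cottage", count: 19 },
      { item: "Holborn and Covent Garden", count: 18 }
    ]
  }
};

// Collect every value that a dotted path can reach (simplified model).
function resolvePath(value, segments) {
  if (segments.length === 0) return [value];
  const [head, ...rest] = segments;
  if (Array.isArray(value)) {
    if (/^\d+$/.test(head)) {
      // Numeric segment: treat it as an array position.
      const index = Number(head);
      return index < value.length ? resolvePath(value[index], rest) : [];
    }
    // Field name on an array: fan out over the embedded documents.
    return value
      .filter((el) => el !== null && typeof el === "object" && !Array.isArray(el))
      .flatMap((el) => resolvePath(el, segments));
  }
  if (value !== null && typeof value === "object" && head in value) {
    return resolvePath(value[head], rest);
  }
  return []; // The path does not exist in this document.
}

// Equality match: true if any reachable value equals the expected one.
function matches(document, path, expected) {
  return resolvePath(document, path.split(".")).some(
    (found) => found === expected || (Array.isArray(found) && found.includes(expected))
  );
}

console.log(matches(doc, "ward_name", "West Hampstead"));              // no such field
console.log(matches(doc, "meta.ward_name", "West Hampstead"));         // no such field
console.log(matches(doc, "cachedContents.largest", "West Hampstead")); // field exists
console.log(matches(doc, "cachedContents.top.item", "Swiss Cottage")); // fans out over the array
console.log(matches(doc, "cachedContents.top.0.count", 20));           // array position 0
```

The first two calls return false because no path named ward_name or meta.ward_name exists in the document; the remaining three return true.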

How to display 'c' array values alone from the given JSON document below using MongoDB?

I am a newbie to MongoDB. I am experimenting with various ways of extracting fields from a document inside a collection.
In the JSON document below, I am finding it difficult to extract just the part I need:
{
"_id":1,
"dependencies":{
"a":[
"hello",
"hi"
],
"b":[
"Hmmm"
],
"c":[
"Vanilla",
"Strawberry",
"Pista"
],
"d":[
"Carrot",
"Cauliflower",
"Potato",
"Cabbage"
]
},
"productid":"25",
"date":"Thu Jul 30 11:36:49 PDT 2015"
}
I need to display the following output:
c:[
"Vanilla",
"Strawberry",
"Pista"
]
Can anyone please help me in solving it?
MongoDB aggregation comes to the rescue to get the result you are looking for:
$project: Passes along the documents with only the specified fields to the next stage in the pipeline. The specified fields can be existing fields from the input documents or newly computed fields.
db.collection.aggregate( [
{ $project :
{ c: "$dependencies.c", _id : 0 }
}
]).pretty();
For the output you require, we just need to project (display) the field "dependencies.c", so we create a new field "c" and assign the value of "dependencies.c" to it.
By default, the "_id" field is displayed along with the result. Since you don't need it, we suppress it by assigning "_id" : <0 or false>, so that it does not appear in the output.
The above query will give you the result below:
{
"c" : [
"Vanilla",
"Strawberry",
"Pista"
]
}
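For readers who think more easily in application code, the effect of that $project stage can be modeled in plain JavaScript. This is only a sketch of what the stage computes for this document shape; projectC is a hypothetical helper name, and the sample array holds a trimmed copy of the document from the question.

```javascript
// Trimmed copy of the document from the question.
const documents = [
  {
    _id: 1,
    dependencies: {
      a: ["hello", "hi"],
      b: ["Hmmm"],
      c: ["Vanilla", "Strawberry", "Pista"],
      d: ["Carrot", "Cauliflower", "Potato", "Cabbage"]
    },
    productid: "25"
  }
];

// Model of { $project: { c: "$dependencies.c", _id: 0 } }:
// keep only a new field "c" copied from dependencies.c, and drop _id.
function projectC(docs) {
  return docs.map((d) => {
    const c = d.dependencies?.c;
    // When the source field is missing, the output document has no "c".
    return c === undefined ? {} : { c };
  });
}

const projected = projectC(documents);
console.log(JSON.stringify(projected, null, 2));
```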

Query json data on MongoDB

I have a rather complex structure on my json and I cannot find how to query it to get the rows I am interested in. Here is a sample of my data:
{
"_id" : ObjectId("5282bf9ce4b05216ca1b68f8"),
"authorID" : ObjectId("5282a8c3e4b0d7f4f4d07b9a"),
"blogID" : "7180831558698033600",
"blogs" : {
"$" : {
"posts" : [
[
{
"author" : {
"displayName" : "mms",
...
...
...
}}}
So, I am interested in finding all json entries that have the author displayName equal to "mms".
My collection name is dz, so a find-all query would be: db.dz.find()
What criteria do I have to put inside the find() to only get json document with author displayName equal to mms?
Any ideas?
Thank you in advance!
Suppose you have renamed the field "$" to "dollarSign".
Then db.dz.find({"blogs.dollarSign.posts.author.displayName": "mms"}) will fetch the whole documents that meet your requirement.