Right structure for a series of dates: values - json

I'm having a hard time trying to figure out what is the right JSON structure for the following set of data. I've got a sensor that logs humidity of a given room on a daily basis. Logs look like:
...
2015-01-19 8%
2015-01-20 13%
...
I'd like to convert it to JSON. My first bet was:
{
  "2015-01-19": 8,
  "2015-01-20": 13
}
But, is it correct? Shouldn't it be:
[
{ '2015-01-19', 8 },
{ '2015-01-20', 13}
]
Or:
[
  {
    "date": "2015-01-19",
    "value": 8
  },
  {
    "date": "2015-01-20",
    "value": 13
  }
]
And, at the end of the day, is there a series of best practices I could refer to in order to help me determine what's the best structure on my own?

Your first example is simple and easy to consume, though it is not extensible if you later want to record more attributes per date. If that's unlikely, use that method.
Your second example is not valid JSON: an object must consist of key: value pairs, so { '2015-01-19', 8 } will not parse. (Strict JSON also requires double quotes around strings and keys.)
Your third example makes sense and is easy to extend, though it is not a very compact encoding, since the keys are repeated for every entry.
A fourth method you should consider is to use separate arrays. This is not necessarily intuitive at first, but it works well, is compact yet extensible, and is directly compatible with some tools such as Highcharts. That is:
{
  "dates": ["2015-01-19", "2015-01-20"],
  "humidity": [8, 13]
}
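To compare the candidate shapes side by side, here is a minimal JavaScript sketch that builds each valid one from the raw log lines. The `logLines` array, the `parseLine` helper and the assumed "YYYY-MM-DD N%" layout are illustrative assumptions, not something the sensor is known to provide.

```javascript
// Hypothetical raw log lines in the "YYYY-MM-DD N%" layout from the question.
const logLines = ["2015-01-19 8%", "2015-01-20 13%"];

// Parse one line into a { date, value } record; throws on unexpected input.
function parseLine(line) {
  const match = /^(\d{4}-\d{2}-\d{2})\s+(\d+(?:\.\d+)?)%$/.exec(line.trim());
  if (!match) {
    throw new Error("Unrecognised log line: " + line);
  }
  return { date: match[1], value: Number(match[2]) };
}

const records = logLines.map(parseLine);

// First shape: one object keyed by date (compact, one value per date).
const byDate = Object.fromEntries(records.map((r) => [r.date, r.value]));

// Third shape: array of records (verbose, easy to extend with new fields).
const asRecords = records;

// Fourth shape: parallel arrays (index i in each array is the same reading).
const asColumns = {
  dates: records.map((r) => r.date),
  humidity: records.map((r) => r.value),
};

console.log(JSON.stringify(byDate));
console.log(JSON.stringify(asRecords));
console.log(JSON.stringify(asColumns));
```

Which shape is "best" mostly depends on the consumer: keyed lookup by date favours the first, per-reading metadata favours the third, and charting or bulk transfer favours the fourth.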

Is it possible to get a sorted list of attribute values from a JSON array using JSONPath

Given JSON like:
[
  {
    "Serial no": 994
  },
  {
    "Serial no": 456
  }
]
I know this query will give me an array of all Serial no values, in the order they are in the JSON: $..['Serial no']
I'm not sure exactly what sorting capabilities JSONPath has, but I think you can use / and \ to sort. How are they used to modify my query string in this case? I am only interested in doing this in pure JSONPath, not JS or post-query sorting. That's easy; I just want to know if I can avoid it.
This is a source I found suggesting sorting is supported but it might be product-specific?
I'm using http://www.jsonquerytool.com/ to test this

How can I use RegEx to extract data within a JSON document

I am no RegEx expert. I am trying to understand if I can use RegEx to find a block of data in a JSON file.
My Scenario:
I am using an AWS RDS instance with enhanced monitoring. The monitoring data is being sent to a CloudWatch log stream. I am trying to make the data posted in CloudWatch visible in the log management solution Loggly.
The ingestion is no problem, I can see the data in Loggly. However, the whole message is contained in one big blob field. The field content is a JSON document. I am trying to figure out if I can use RegEx to extract only certain parts of the JSON document.
Here is a sample extract from the JSON payload I am using:
{
"engine": "MySQL",
"instanceID": "rds-mysql-test",
"instanceResourceID": "db-XXXXXXXXXXXXXXXXXXXXXXXXX",
"timestamp": "2017-02-13T09:49:50Z",
"version": 1,
"uptime": "0:05:36",
"numVCPUs": 1,
"cpuUtilization": {
"guest": 0,
"irq": 0.02,
"system": 1.02,
"wait": 7.52,
"idle": 87.04,
"user": 1.91,
"total": 12.96,
"steal": 2.42,
"nice": 0.07
},
"loadAverageMinute": {
"fifteen": 0.12,
"five": 0.26,
"one": 0.27
},
"memory": {
"writeback": 0,
"hugePagesFree": 0,
"hugePagesRsvd": 0,
"hugePagesSurp": 0,
"cached": 505160,
"hugePagesSize": 2048,
"free": 2830972,
"hugePagesTotal": 0,
"inactive": 363904,
"pageTables": 3652,
"dirty": 64,
"mapped": 26572,
"active": 539432,
"total": 3842628,
"slab": 34020,
"buffers": 16512
},
My Question
My question is, can I use RegEx to extract, say a subset of the document? For example, CPU Utilization, or Memory etc.? If that is possible, how do I write the RegEx? If possible, I can use it to drill down into the extracted document to get individual data elements as well.
Many thanks for your help.
First, I agree with Sebastian: a proper JSON parser is better.
That said, sometimes the dirty approach must be used. If your text layout will not change, then a regexp is simple:
E.g. "total": (\d+\.\d+) gets the CPU usage and "total": (\d\d\d+) the total memory usage (it matches at least 3 digits so that it does not match the first total, the CPU one; memory will probably never be less than 100 :-).
If changes are to be expected, make it a bit more robust: ["']total["']\s*:\s*(\d+\.\d+).
It may also be possible to match against newline characters like this: "cpuUtilization"\s*:\s*\{\s*\n.*\n\s*"irq"\s*:\s*(\d+\.\d+), which ties the match to the cpuUtilization block (this time for the irq value).
And so on and so on.
As you can see, you quickly end up with very complex expressions. This approach is very fragile!
P.S. Depending on the exact regex flavour Loggly uses, details may change. The examples above are based on Perl.
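To make the trade-off concrete, here is a small JavaScript sketch that runs both routes against a shortened copy of the payload. It only illustrates the idea outside Loggly: the `blob` string is a hypothetical, trimmed-down version of the document in the question, and whether Loggly can run a parser or accepts these exact patterns is not something this sketch establishes.

```javascript
// Hypothetical, shortened version of the payload from the question.
const blob = `{
  "engine": "MySQL",
  "cpuUtilization": {
    "guest": 0,
    "irq": 0.02,
    "total": 12.96
  },
  "memory": {
    "free": 2830972,
    "total": 3842628
  }
}`;

// Parser route: pull out whole sub-documents, then drill down by key.
const doc = JSON.parse(blob);
const cpu = doc.cpuUtilization;
const memoryTotal = doc.memory.total;

// Regex route: the two simple patterns discussed above, as JavaScript literals.
// Captured groups are strings and still need converting to numbers.
const cpuTotalMatch = /"total": (\d+\.\d+)/.exec(blob);
const memTotalMatch = /"total": (\d\d\d+)/.exec(blob);

console.log(cpu.irq, cpu.total, memoryTotal);
console.log(cpuTotalMatch[1], memTotalMatch[1]);
```

The parser route keeps working if keys are reordered or whitespace changes; the regex route depends on the exact spacing after the colon and on the CPU total always being printed with a decimal point.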

Using JSON-based Database for unordered data

I am working on a simple app for Android. I am having some trouble using the Firebase database since it uses JSON objects and I am used to relational databases.
My data will consist of two users that share a value. In a relational database this would be represented in a table like this:
**uname1** **uname2** shared_value
in which the usernames are the keys. If I wanted all the values user Bob shares with other users, I could do a simple union statement that would return the rows where:
uname1 == Bob or uname2 == Bob
However, in JSON databases there seems to be a tree-like hierarchy in the data, which complicates things since I would not be able to search for users at the top level. I am looking for help with how to do this, or how to structure my database for best efficiency if my most common search will be similar to the one above.
In case this is not enough information, I will elaborate: My database would be structured like this:
{
  "Bob": {
    "Alice": {
      "shared_value": 2
    }
  },
  "Cece": {
    "Bob": {
      "shared_value": 4
    }
  }
}
As you can see from the example, Bob is included in two relationships, but looking into Bob's node doesn't show that information. (The relationship is commutative, so who is "first" cannot be predicted.)
The most intuitive way to fix this would be to duplicate all data. For example, when we add Bob->Alice->2, also add Alice->Bob->2. In my experience with relational databases, duplication could be a big problem, which is why I haven't done this already. Also, duplication seems like an inefficient fix.
Is there a reason why you don't invert this? How about a collection like:
{ "_id": 2, "usernames":[ "Bob", "Alice"]}
{ "_id": 4, "usernames":[ "Bob", "Cece"]}
If you need all the values for "Bob", then index on "usernames".
EDIT:
If you need the two usernames to be a unique key, then do something like this:
{ "_id": {"uname1":"Bob", "uname2":"Alice"}, "value": 2 }
But this would still permit the creation of:
{ "_id": {"uname1":"Alice", "uname2":"Bob"}, "value": 78 }
(This issue is also present in your as-is relational model, btw. How do you handle it there?)
In general, I think implementing an array by creating multiple columns with names like "attr1", "attr2", "attr3", etc. and then having to search them all for a possible value is an artifact of relational table modeling, which does not support array values. If you are converting to a document-oriented storage, these really should be an embedded list of values, and you should use the document paradigm and model them as such, instead of just reimplementing your table rows as documents.
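One way to sidestep the Bob/Alice versus Alice/Bob uniqueness issue mentioned above is to derive the key from the two names in sorted order. The sketch below shows the idea in plain JavaScript with an in-memory `Map`; `pairKey`, `setShared`, `sharedWith` and the "|" separator are hypothetical names and choices for illustration, not part of Firebase or any database API.

```javascript
// Build an order-independent key so Bob/Alice and Alice/Bob map to the same entry.
// Assumption: the "|" separator never occurs inside a username.
function pairKey(userA, userB) {
  return [userA, userB].sort().join("|");
}

// In-memory stand-in for the collection, keyed by the canonical pair key.
const shared = new Map();

// Insert or overwrite the value shared by two users.
function setShared(userA, userB, value) {
  shared.set(pairKey(userA, userB), {
    usernames: [userA, userB].sort(),
    value,
  });
}

// All entries that involve the given user, whichever position they are in.
function sharedWith(user) {
  return [...shared.values()].filter((entry) => entry.usernames.includes(user));
}

setShared("Bob", "Alice", 2);
setShared("Cece", "Bob", 4);
// Same pair in the other order: overwrites the existing entry instead of adding one.
setShared("Alice", "Bob", 78);

console.log(shared.size);
console.log(sharedWith("Bob").map((entry) => entry.value));
```

In a real store the membership test would be served by an index on the usernames array rather than a scan over all entries.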
You can still keep the old structure:
[
{ username: 'Bob', username2: 'Alice', value: 2 },
{ username: 'Cece', username2: 'Bob', value: 4 },
]
You may want to create indexes on 'username' and 'username2' for performance. And then just do the same union.
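As a rough illustration of "the same union" over that flat structure, here is a plain JavaScript sketch that scans an in-memory array. The `sharedValuesFor` helper is a hypothetical name, and the indexes themselves are database-specific and not shown.

```javascript
// The flat structure from above, held in memory for illustration.
const rows = [
  { username: "Bob", username2: "Alice", value: 2 },
  { username: "Cece", username2: "Bob", value: 4 },
];

// Equivalent of "uname1 == Bob or uname2 == Bob": keep rows where the user
// appears in either column, and report who the other party is.
function sharedValuesFor(user) {
  return rows
    .filter((row) => row.username === user || row.username2 === user)
    .map((row) => ({
      other: row.username === user ? row.username2 : row.username,
      value: row.value,
    }));
}

console.log(JSON.stringify(sharedValuesFor("Bob")));
```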
To create a tree-like structure, the best way is to create an "ancestors" array that stores all the ancestors of a particular entry. That way you can query for either ancestors or descendants and all documents that are related to a particular value in the tree. Using your example, you would be able to search for all descendants of Bob's, or any of his ancestors (and related documents).
The answer above suggests:
{ "_id": {"uname1":"Bob", "uname2":"Alice"}, "value": 2 }
That is correct. But you don't get to see the relationship between Bob and Cece with this design. My suggestion, which is from Mongo, is to store ancestor keys in an ancestor array.
{ "_id": {"uname1":"Bob", "uname2":"Alice"}, "value": 2 , "ancestors": [{uname: "Cece"}]}
With this design you still get duplicates, which is something that you do not want. I would design it like this:
{"username": "Bob", "ancestors": [{"username": "Cece", "shared_value": 4}]}
{"username": "Alice", "ancestors": [{"username": "Bob", "shared_value": 2}, {"username": "Cece"}]}

CSV Parser through angularJS

I am building a CSV file parser with Node and Angular. A user uploads a CSV file, and on the server side (Node) the file is traversed and parsed using node-csv. This works fine and returns an array of objects based on the CSV file given as input. Now, on the Angular end, I need to display two tables: one is the CSV file data itself and the other is a cross-tabulation analysis. I am facing a problem while rendering the data, so for a table like
I am getting the parse response as
For cross tabulation we need the data in tabular form as
I have an object array which I need to manipulate in the best possible way so that it is easy to render on an HTML page. I can't see how to do the calculation on the data I get so as to store the cross-tabulation result. Any idea how I should approach this?
data json is :
[{"Sample #":"1","Gender":"Female","Handedness;":"Right-handed;"},
{"Sample #":"2","Gender":"Male","Handedness;":"Left-handed;"},
{"Sample #":"3","Gender":"Female","Handedness;":"Right-handed;"},
{"Sample #":"4","Gender":"Male","Handedness;":"Right-handed;"},
{"Sample #":"5","Gender":"Male","Handedness;":"Left-handed;"},
{"Sample #":"6","Gender":"Male","Handedness;":"Right-handed;"},
{"Sample #":"7","Gender":"Female","Handedness;":"Right-handed;"},
{"Sample #":"8","Gender":"Female","Handedness;":"Left-handed;"},
{"Sample #":"9","Gender":"Male","Handedness;":"Right-handed;"},
{"Sample #":";"}]
There are many ways you can do this and since you have not been very specific on the usage, I will go with the simplest one.
Assuming you have an object structure such as this:
[
  {gender: 'female', handedness: 'lefthanded', id: 1},
  {gender: 'male', handedness: 'lefthanded', id: 2},
  {gender: 'female', handedness: 'righthanded', id: 3},
  {gender: 'female', handedness: 'lefthanded', id: 4},
  {gender: 'female', handedness: 'righthanded', id: 5}
]
and in your controller you have exposed this with something like:
$scope.members = [the above array of objects];
and you want to display the total of female members of this object, you could filter this in your html
{{(members | filter:{gender:'female'}).length}}
Now, if you are going to make this a table, it will obviously result in some ugly and unreadable HTML, so, especially if you are going to reuse it, this is a good case for making a directive that you can repeat anywhere, with the prerequisite of providing a scope object named tabData (or whatever you wish) in your parent scope:
.directive('tabbed', function () {
  return {
    restrict: 'E',
    // No isolated scope is declared, so tabData is read from the parent scope
    template: '<table>' +
      '<tr><td>{{(tabData | filter:{gender:"female"}).length}}</td></tr>' +
      '<tr><td>{{(tabData | filter:{handedness:"lefthanded"}).length}}</td></tr>' +
      '</table>'
  };
});
You would use this in your html like so:
<tabbed></tabbed>
And there are of course many ways to improve this as you wish.
This is more of a general data structure/JS question than Angular related.
Functional helpers from Lo-dash come in very handy here:
_(data) // Create a chainable object from the data to execute functions with
  .groupBy('Gender') // Group the data by its `Gender` attribute
  // map these groups, using `mapValues` so the named `Gender` keys persist
  .mapValues(function(gender) {
    // Create named count objects for all handednesses
    var counts = _.countBy(gender, 'Handedness');
    // Calculate the total of all handednesses by summing
    // all the values of this named object
    counts.Total = _(counts)
      .values()
      .reduce(function(sum, num) { return sum + num; });
    // Return this named count object -- this is what each gender will map to
    return counts;
  }).value(); // get the value of the chain
No need to worry about for-loops or anything of the sort, and this code also works without any changes for more than two genders (even for more than two handednesses - think of the aliens and the ambidextrous). If you aren't sure exactly what's happening, it should be easy enough to pick apart the single steps and their result values of this code example.
Calculating the total row for all genders will work in a similar manner.
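For comparison, here is a dependency-free JavaScript sketch that produces the same kind of cross tabulation, including the total row, in a single pass. It swaps the Lodash chain for a plain loop. The `crossTab` helper and the sample `data` are hypothetical, and the sketch assumes the stray ";" characters from the CSV have already been cleaned out of keys and values and that no category is literally named "Total".

```javascript
// Hypothetical sample rows with cleaned keys and values.
const data = [
  { Gender: "Female", Handedness: "Right-handed" },
  { Gender: "Male", Handedness: "Left-handed" },
  { Gender: "Female", Handedness: "Right-handed" },
  { Gender: "Male", Handedness: "Right-handed" },
  { Gender: "Female", Handedness: "Left-handed" },
];

// Count rows per (rowKey, colKey) pair, keeping running totals per row and per column.
function crossTab(rows, rowKey, colKey) {
  const table = {};
  const totals = { Total: 0 };
  for (const row of rows) {
    const r = row[rowKey];
    const c = row[colKey];
    table[r] = table[r] || { Total: 0 };
    table[r][c] = (table[r][c] || 0) + 1;
    table[r].Total += 1;
    totals[c] = (totals[c] || 0) + 1;
    totals.Total += 1;
  }
  // The extra "Total" row holds the column sums and the grand total.
  table.Total = totals;
  return table;
}

const result = crossTab(data, "Gender", "Handedness");
console.log(JSON.stringify(result, null, 2));
```

The resulting object can be rendered with nested ng-repeat loops: the outer one over the rows (genders plus Total), the inner one over the handedness columns.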

reactivemongo - merging two BSONDocuments

I am looking for the most efficient and easiest way to merge two BSON documents. I already have handlers for collisions: for example, if both documents contain an Integer under the same key I sum them, likewise for strings, and if both contain an array I add the elements of the other one, etc.
However, due to BSONDocument's immutable nature it is almost impossible to do anything with it. What would be the easiest and fastest way to do the merging?
I need to merge the following for example:
{
"2013": {
"09": {
value: 23
}
}
}
{
"2013": {
"09": {
value: 13
},
"08": {
value: 1
}
}
}
And the final document would be:
{
"2013": {
"09": {
value: 36
},
"08": {
value: 1
}
}
}
There is a BSONDocument.add method; however, it doesn't check uniqueness, which means I would end up with two "2013" root keys, etc.
Thank you!
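The merge rules described above can be sketched independently of the driver. The following is a plain JavaScript illustration on nested objects, not reactivemongo code: `mergeDocs` and `isPlainObject` are hypothetical helpers, and string collisions are assumed to mean concatenation. It never mutates its inputs and always builds a new object, which is the same pattern an immutable document type would call for.

```javascript
// True for nested documents, false for arrays, null and primitives.
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Merge two nested documents into a new one. On a key collision:
// recurse into sub-documents, concatenate arrays, add numbers,
// concatenate strings (an assumption). Anything else is an error.
function mergeDocs(a, b) {
  const out = { ...a };
  for (const [key, bValue] of Object.entries(b)) {
    const hasKey = Object.prototype.hasOwnProperty.call(out, key);
    const aValue = out[key];
    if (!hasKey) {
      out[key] = bValue;
    } else if (isPlainObject(aValue) && isPlainObject(bValue)) {
      out[key] = mergeDocs(aValue, bValue);
    } else if (Array.isArray(aValue) && Array.isArray(bValue)) {
      out[key] = aValue.concat(bValue);
    } else if (
      typeof aValue === typeof bValue &&
      (typeof bValue === "number" || typeof bValue === "string")
    ) {
      out[key] = aValue + bValue;
    } else {
      throw new Error("No merge rule for key: " + key);
    }
  }
  return out;
}

const first = { "2013": { "09": { value: 23 } } };
const second = { "2013": { "09": { value: 13 }, "08": { value: 1 } } };
const merged = mergeDocs(first, second);

console.log(JSON.stringify(merged));
```

The same recursion should carry over to an immutable document API by folding over the second document's elements and producing a fresh document at each level, though the exact calls depend on the driver version.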
If I understand your inquiry, you are looking to aggregate field data via a composite id. MongoDB has a fairly slick aggregation framework. Part of that framework is the $group pipeline stage. This will allow you to specify an _id to group by, which can be defined as a field or a document as in your example, as well as perform aggregation using accumulators such as $sum.
Here is a link to the manual for the operators you will probably need to use.
http://docs.mongodb.org/manual/reference/operator/aggregation/group/
Also, please remove the "merge" tag from your original inquiry to reduce confusion. Many MongoDB drivers include a Merge function as part of the BsonDocument representation as a way to consolidate two BsonDocuments into a single BsonDocument linearly or via element overwrites, and it has no relation to aggregation.
Hope this helps.
ndh