JSON Data Optimization by removing repeated column names - json

I have a basic Json question - I have a JSON file. Every object in this file has columns repeated.
[
{
id: 1,
name: "ABCD"
},
{
id: 2,
name: "ABCDE"
},
{
id: 3,
name: "ABCDEF"
}
]
For optimization I was thinking to remove repeated column names.
{
"cols": [
"id",
"name"
],
"rows": [
[
"1",
"ABCD"
],
[
"2",
"ABCDE"
]
]
}
What I am trying to understand is - is this a better approach? Are there any disadvantages of this format? Say for writing unit tests?

EDIT
The second case (after your editing) is valid json. You can derive it to the following class using json2csharp
public class RootObject
{
public List<string> cols { get; set; }
public List<List<string>> rows { get; set; }
}
The very important point to note about a valid json is that it has no other way but to repeat the column names (or, keys in general) to represent values in json. You can test the validity of your json putting it # jsonlint.com
But if you want to optimize json by compressing it using some compression library like gzip (likewise), then I would recommend Json.HPack.
According to this format, it has many compression levels ranging from 0 to 4 (4 is the best).
At compression level 0:
you have to remove keys (property names) from the structure creating a header on index 0 with each property name. Then your compressed json would look like:
[
[
"id",
"name"
],
[
1,
"ABCD"
],
[
2,
"ABCDE"
],
[
3,
"ABCDEF"
]
]
In this way, you can compress your json at any levels as you want. But in order to work with any json library, you must have to decompress it to valid json first like the one you provided earlier with repeated property names.
For your kind information, you can have a look at the comparison between different compression techniques:

{
"cols": [
"id",
"name"
],
"rows": [
"1",
"ABCD"
], [
"2",
"ABCDE"
], [
"3",
"ABCDEF"
]
}
In this approach it will be hard to determine which value stand for which item (id,name). Your first approach was good if you use this JSON for communication.

A solution for it, is use any type (by your preference) of Object-Relational-Mapper,
By that, you can compress your JSON data and still using legible structure/code.
Please, see this article: What is "compressed JSON"?

Related

How to use jq to reconstruct complete contents of json file, operating only on part of interest?

All the examples I've seen so far "reduce" the output (filter out) some part. I understand how to operate on the part of the input I want to, but I haven't figured out how to output the rest of the content "untouched".
The particular example would be an input file with several high level entries "array1", "field1", "array2", "array3" say. Each array contents is different. The specific processing I want to do is to sort "array1" entries by a "name" field which is doable by:
jq '.array1 | sort_by(.name)' test.json
but I also want this output as "array1" as well as all the other data to be preserved.
Example input:
{
"field1": "value1",
"array1":
[
{ "name": "B", "otherdata": "Bstuff" },
{ "name": "A", "otherdata": "Astuff" }
],
"array2" :
[
array2 stuff
],
"array3" :
[
array3 stuff
]
}
Expected output:
{
"field1": "value1",
"array1":
[
{ "name": "A", "otherdata": "Astuff" },
{ "name": "B", "otherdata": "Bstuff" }
],
"array2" :
[
array2 stuff
],
"array3" :
[
array3 stuff
]
}
I've tried using map but I can't seem to get the syntax correct to be able to handle any type of input other than the array I want to be sorted by name.
Whenever you use the assignment operators (=, |=, +=, etc.), the context of the expression is kept unchanged. So as long as your top-level filter(s) are assignments, in the end, you'll get the rest of the data (with your changes applied).
In this case, you're just sorting the array1 array so you could just update the array.
.array1 |= sort_by(.name)

Json array in mongoDB

I want to get objects according to an ID they have in an array in a json file in mongodb.
I tried a lot of ways to get them with no success:
db.collection.find({"Id":"2"})
db.collection.find({"Messages.Id":"2"})
db.collection.find({"Messages":{$elemMatch:{"Id":"2"}}})
db.collection.find({"Messages.Id":{$elemMatch:{"Id":"2"}}})
{
"Messages" : [
{
"text":"aaa",
"Id" : [ "1", "2" ]
},
{
"texts" : "bbb",
"Id" : [ "1", "3" ]
}
]
}
Even though that's how it's supposed to be done according to the mongodb documentation.
So I thought something was wrong with my json design (I tried changing it but that didn't help either).
Can anyone suggest to me a good design or query to get the objects with a certain id will work?
UPDATE:
I want for example that if in the query i request the id 2
only the first message and all of it will be displayed (I don't mind if the Id field wont be displayed)
{
"text":"aaa",
"Id":["1","2"]
}
To find single elements that match you will need to utilize the positional operator ($).
db.collection.find({"Messages.Id": "2"}, {"Messages.$": 1, _id: 0})
For finding multiple matches, you would use the aggregation pipeline:
db.collection.aggregate([
{ $unwind: "$Messages" },
{ $match: {"Messages.Id": "1"}},
{ $group: { _id: null, messages: { $push: "$Messages"}}}
])

Obtain a different JSON object structure in AngularJS

I'm Working on AngularJS.
In this part of the project my goal is to obtain a JSON structure after filling a form with some particulars values.
Here's the fiddle of my simple form: Fiddle
With the form I will do a query to KairosDB, that is my NoSql Database, I will query data from it by a JSON object. The form is structured in this way:
a Name
a certain Number of Tags, with Tag Id ("ch" for example) and tag value ("932" for example)
a certain Number of Aggregators to manipulate data coming from DB
Start Timestamp and End Timestamp (now they are static and only included in the final JSON Object)
After filling this form, with my code I'll obtain for example this JSON object:
{
"metrics": [
{
"tags": [
{
"id": "ch",
"value": "932"
},
{
"id": "ch",
"value": "931"
}
],
"aggregators": {
"name": "sum",
"sampling": [
{
"value": "1",
"unit": "milliseconds",
"type": "SUM"
}
]
}
}
],
"cache_time": 0,
"start_absolute": 123,
"end_absolute": 1234
}
Unfortunately, KairosDB accepts a different structure, and as you could see, Tag id "ch" doesn't hase an "id" string before, or for example, Tag values coming from the same tag id are grouped together
{
"metrics": [
{
"tags": {
"ch": [
"932",
"931"
]
},
"name": "AIENR",
"aggregators": [
{
"name": "sum",
"sampling": {
"value": "1",
"unit": "milliseconds"
}
}
]
}
],
"cache_time": 0,
"start_absolute": 1367359200000,
"end_absolute": 1386025200000
}
My question is: Is there a way to obtain the JSON structure like the one accepted by Kairos DB with an Angular JS form?. Thanks to everyone.
I've seen this topic as the one more similar to mine but it isn't in AngularJS.
Personally, I'd do the refactoring work in the backend - Have what ever server interfaces sends and receives data do the manipulation - Otherwise you'll end up needing to refactor your data inside Angular anywhere you want to use that dataset.
Where as doing it in the backend would put it in a single access point.
Of course, you could do it in Angular, just replace userString in the submitData method with a copy of the array and replace the tags section with data in the new format, and likewise refactor the returned result to the correct format when you get a reply.

How should a JSON response be formatted?

I have a REST service that returns a list of objects. Each object contains objectcode and objectname.
This is my first time building a REST service, so I'm not sure how to format the response.
Should it be:
{
"objects": {
"count": 2,
"object": [
{
"objectcode": "1",
"objectname": "foo"
},
{
"objectcode": "2",
"objectname": "bar"
},
...more objects
]
}
}
OR
[
{
"objectcode": "1",
"objectname": "foo"
},
{
"objectcode": "2",
"objectname": "bar"
},
...more objects
]
I realize this might be a little subjective, but which would be easier to consume? I would also need to support XML formatted response later.
They are the same to consume, as a library handles both just fine. The first one has an advantage over the second though: You will be able to expand the response to include other information additional to the objects (for example, categories) without breaking existing code.
Something like
{
"objects": {
"count": 2,
"object": [
{
"objectcode": "1",
"objectname": "foo"
},
{
"objectcode": "2",
"objectname": "bar"
},
...more objects
]
}
"categories": {
"count": 2,
"category" : [
{ "name": "some category"}
]
}
}
Additionally, the json shouldn't be formatted in any way, so remove whitespace, linebreaks etc. Also, the count isn't really necessary, as it will be saved while parsing the objects themselves.
I often see the first one. Sometimes it's easier to manipulate data to have meta-data. For exemple google API use first one : http://maps.googleapis.com/maps/api/geocode/json?address=1600+Amphitheatre+Parkway,+Mountain+View,+CA&sensor=true
It's not only the question of personal preference; it's also the question fo your requirements. For example, if I was in the same situation and I did need object count on client side then I'd go with first approach otherwise I will choose the second one.
Also please note that "classic" REST server mostly will work a bit different way. If some REST function is to return a list of objects then it should return only a list of URLs to those objects. The URLs should be pointing to details endpoints - so by querying each endpoint you may get details on specific single object.
As a client I would prefer the second format. If the first format only includes the number of "objects", this is redundant information.

JSON format problem when using Factual geo data

I'm using Factual api to fetch location data. Their restful service return data in JSON format as follow, but they are not using "usual" JSON format. There's no attribute key, instead, there's a “fields” that explains all the field keys.
So the question is how to retrieve the attribute I need? Please give an example if possible. Thanks in advance.
{
"response": {
"total_rows": 2,
"data": [
[
"ZPQAB5GAPEHQHDy5vrJKXZZYQ-A",
"046b39ea-0951-4add-be40-5d32b7037214",
"Hanko Sushi Iso Omena",
60.16216,
24.73907
],
[
"2TptHCm_406h45y0-8_pJJXaEYA",
"27dcc2b5-81d1-4a72-b67e-2f28b07b9285",
"Masabi Sushi Oy",
60.21707,
24.81192
]
],
"fields": [
"subject_key",
"factual_id",
"name",
"latitude",
"longitude"
],
"rows": 2,
"cache-state": "CACHED",
"big-data": true,
"subject_columns": [
1
]
},
"version": "2",
"status": "ok"
}
If you know the field name, and the data isn't guaranteed to stay in the same order, I would do a transform on the data so I can reference the fields by name:
var fieldIndex = {}
for (key in x.response.fields)
{
fieldIndex[x.response.fields[key]] = key;
}
for (key in x.response.data)
{
alert(x.response.data[key][fieldIndex.name]);
}
// Field map
var _subject_key = 0,
_factual_id = 1,
_name = 2,
_latitude = 3,
_longitude = 4;
// Example:
alert(_json.response.data[0][_factual_id]);
Demo: http://jsfiddle.net/AlienWebguy/9TEJJ/
I work at Factual. Just wanted to mention that we've launched the beta of version 3 of our API. Version 3 solves this problem directly, by including the attribute keys inline with the results, as you would hope. (Your question applies to version 2 of our API. If you're able to upgrade to version 3 you'll find some other nice improvements as well. ;-)