JSON appears to be a nice way to represent a complex data structure in plain text. If we think of this complex data structure as analogous to an OOP object - an instance of a class - then is there a commonly used JSON-like format that represents the class itself (just the data part - forget methods)? Can JSON itself be used for this?
To put it another way, if JSON encodes name-value pairs, what should I use if I want to encode only the names?
The reason I want this is that I am designing a protocol to use with jQuery (to which I am a complete novice by the way). The client will communicate to the server the structure of the JSON object it wants back, and the server will return a JSON object of that structure with the values added.
The key point is that it is the client that is in full control of what data fields (name-value pairs) the server returns. It's a bit different from all the examples of jQuery that I've found so far on the web where the client makes a request (which usually includes a very limited set of parameters, if any) and the server makes the decision as to what fields to return in the JSON reply.
(Obviously, what the client asks for must be congruent with the server's data model; if the server has an array of widgets each with its own price, the client can't ask for an array of prices each with its own widget.)
This must be a common problem, and I don't want to reinvent the wheel. I want to adopt a solution that is already in common use across the web.
Edit
I just found JSON Schema. This is not what I am looking for. It contains way more than I need.
Edit
I'm looking more for a 'this is how it is usually done' answer, rather than a 'you could try…' answer. (I can invent dozens of possible answers myself.)
To encode only names within JSON, you could use a key/value pair where the key is either the class name or just a key named 'values' - with the value being an array of strings that are the names to be returned by the server. For example:
{ 'class_name' : [ "name1", "name2", "name3" ] }
The server can then either detect the class name from the key used and return the supplied values for the names in the array if the class supports it or ignore if it does not.
I'm looking more for a 'this is how it is usually done' answer
There is no single "correct" way to do what you want. Many people have their implementation. It depends on various factors -- what you want to do, where you want to do, how efficiently you want it to do?
For simple structures I would prefer and suggest the answer given by #dbr9979.
For nested structures, you can have nested arrays. Something like:
{
"nestedfield1": {
"nestedfield11":["nestedfield111", "nestedfield112"],
"nestedfield12":["nestedfield121", "nestedfield122"],
"__SIMPLE_FIELDS__": ["simplefield13", "simplefield14"]
}
}
The point is, if the key is __SIMPLE_FIELDS__, the value is an array of simple fields (string, numbers etc..), else the key stands for the key in the object.
For something more complex, what I would suggest is you have predefined structures, that both the server and the client know of. This is particularly useful when you have to make multiple identical requests. Assign some unique number for each of them. Something like:
1 => <the structure above>
2 => ["simplefield1", "simplefield2" ..]
3 => etc .. etc
The server stores the above structure and the relevant number in the database or something. And now, as it may be obvious by now, client sends across the id of the required structure, and the server responds in the appropriate fashion.
I think what you meant by this:
the client that is in full control of what data fields (name-value pairs) the server returns.
is like the difference between SELECT * FROM Bags and SELECT color, price FROM Bag in SQL. Am I interpreting you correctly?
You could query with:
{
'resource': 'Bag',
'field_names': ['color', 'price']
}
which will return the response:
{
'status': 'success',
'result': [
{'color': 'red', 'price': 50},
{'color': 'blue', 'price': 45},
]
}
most likely though, you may not actually need your request to be a JSON object; I've seen implementations where the field names is taken from the query string, like http://foo.com/bag?fields=color,price
I was looking for Partial Response.
RESTful API Design: can your API give developers just the information they need? explains it all and gives examples from LinkedIn, Facebook, and Google. Google and Facebook both have similar approaches. Here's how Lie Ryan's example would look using Google's approach:
url?fields=status,result(color,price)
Since Google and Facebook are behind this, I would not be surprised to see this become a de facto standard.
In my case I am likely to run into a length limitation on the URL and so have to use POST instead, but this is an excellent starting point for me.
Related
I'm trying to get all the documents in the "businesses" collection from Firebase together with their sub-collections.
The problem is when I do the query to Firebase like this :
Stream<List<Business>> getBusinesses() {
return _db.collection('businesses').snapshots().map((snapshot) => snapshot
.docs
.map((document) => Business.fromJson(document.data()))
.toList());
}
, the sub-collections aren't passed with the JSON object document.data(), so in my code, the Business object isn't fully completed, which means there are empty fields (Appointments, ServiceProviders,
Services), instead of getting the data from the sub-collections.
So hopefully I've explained the problem well, my question is how can I fetch all the document data including its sub-collections, and parse it to a Business Object?
Thanks.
What seems to be "the problem" is actually the point of Firestore: Keeping documents shallow so you can only get the data you need. It's then up to you to structure your data the way it will likely be used in the future.
Mind you, subcollections are not fields.
What you can do here, is add a query that fetches the documents in the subcollections (Appointments, ServiceProviders, Services), for each business. You would get the business document Id to use for the query.
It would typically look something like:
_db.collection('businesses').document(documentId).collection('Appointments')
Mind you, this is potentially too much data. It might be better to fetch the docs in those subcollections only when needed/requested by the user.
Should I treat all api response as "resource" and return a JSON object or simple array would be appropriate as well ?
for instance are all of the below responses valid?
GET /rest/someresource should return collection of ids
[{id:1},{id:2}]
{{id:1},{id:2}}
[1,2]
GET /rest/someresource?id>0 search for ids bigger than zero and return collection of ids
[{id:1},{id:2}]
{{id:1},{id:2}}
[1,2]
Collection Resources
It is acceptable to return an array of resources - either a list of ids, or object structures - such a thing is commonly known as a 'collection' resource.
See http://51elliot.blogspot.com.au/2014/06/rest-api-best-practices-4-collections.html for an examination of resources and collections.
While not required by REST, it's common to use a plural noun to refer to a collection resource - e.g.
/rest/someresources
REST also requires the use of defined media types, and there are a couple available to assist with collections, e.g.:
Collection+json
Provides a structure with meta data around a list of items wherein you define the structure of each item as your resource
HAL
provides a structure with embedded collections and embedded resources
And many more
All provide a defined structure for including hypermedia links for your resource, or each resource in your collection - and if you are doing REST this is one of the things that the spec says you MUST do (even though many people don't).
Your Proposed Json Structures
Some more specific comments on your proposed json structures:
Option 2 is not valid json. Consider:
{{id:1},{id:2}}
A json object must have a name:value pair, e.g.
{somename:{id:1},someothername:{id:2}}
would be valid - but not very useful!
Also - strictly for json, the name should be enclosed in quotes. the value may be enclosed in quotes if it is a string.
So if you don't want to use a commonly used media type as referenced above, your options are 1 or 3. which should be:
[{"id":1},{"id":2}]
[1, 2]
Both are valid, however option 1 will give you more flexibility to add more properties to each element of the array if you decide in the future you would like to return more than an id. e.g. at some point in the future you might decide to return:
[{"id":1,"name":"fred"},{"id":2,"name":"wilma"}]
Option 3 will only ever be able to return a list of ids.
So personally I would go with option 1.
Depends on how RESTful you're aiming to be.
In addition to what #Chris Simon said, I'll add that if the server would only return IDs at GET /rest/someresource, the client would have to repeatedly call something like GET /rest/someresource/{id} in order to obtain data (it can display on the UI), right? This in turn would just increase the load on the server. If the id would be enough, you can probably get away with the proposed solution.
Also, once you decide you'd better be consistent.
Given that the 2nd option is not even valid, and the last is pretty limiting, I'd also go for the first option, JSON.
Just to make it clear we are talking about different representations of the same resource here:
By GET /rest/someresource both [{id:1},{id:2}] and [1,2] are valid responses, but you should make clear which one you want to see, e.g. with the prefer header. So by Prefer: return=minimal you would return [1,2] and if the header is not present, then [{id:1},{id:2}]. Just make sure that the prefer header is registered by the vary header, or you will have caching troubles.
By GET /rest/someresource?id>0 you filter your collection. So either the /rest/someresource?id>0 URI identifies a different filtered collection resource or it identifies the same collection resource, but with the filter query string your client indicates that it is waiting for a filtered representation of the resource and not the full representation. You can use the same by the minimal representation if you don't want to use the prefer header: GET /rest/someresource?return=minimal.
Note that if you want your client to query again, then you should send them hyperlinks in your response. The REST client must get the URIs (or URI templates) from these hyperlinks and it should not start to build URIs on its own.
This page states, that:
When using complex types it is important to validate that the data you are receiving from the end user is the correct type. Failing to correctly handle complex data could result in malicious users being able to store data they would not normally be able to.
What bad could actually happen (knowing, that CakePHP performs its standard security checks in the background) when accepting JSON data from the frontend?
Which additional security should be added by a CakePHP developer when processing JSON input for single columns and relying on the above introduced support for JSON column type?
Mostly it is a concern of having the right structure and coherent data.
For example if you stored serialized data coming from the user and you expect it to be a list of integers like this one:
[1, 4, 5, 6]
So you can do array_sum($values) in any part of your application. It might be possible for someone to submit an array looking like this
[{a: 2}, {s: 15}, {}, 'hello']
In which case calling array_sum() will give you warnings.
It is important to validate the information you are receiving according to the expectations you have about it in terms of structure and data types.
Let's say I have a json object:
Object = {
param1: '',
param2: '',
param3: '',
param4: {
paramA: '',
paramB: '',
paramC: '',
paramD: [AnotherJsonObject1,AnotherJsonObject2]
}
}
Will my MongoDB structure not be similar? Would this type of structuring make the data (or some of it) less searchable?
Edit 1:
By less searchable I mean: if the top level entities have sub entities which themselves have sub-entities and so on. Will I be able to reach the lowest level entities with the same efficiency of those in the top level?
I currently depend heavily on JSON files in my website. Those files need not be indexed to searchable, BUT they would fit in the DB logically.
For example: I have a director, the director has the list of movies he created, every movie in this list has itself a list of actors who play in it, and every actor has a bio.
The bio in this example doesn't need to be indexed. I can just include a link to the file that contains the actor's bio, but I am wondering whether I can just add this to the DB because this way it will all fit in logically, or will 'unnecessary' data will harm the db's ability to perform efficiently.
Mongodb stores the document in a BSON format. It will appear similar to JSON structure.
The structure you explained seems to be a proper use case of nested documents.
You can query nested fields using the . operator
Would this type of structuring make the data (or some of it) less
searchable?
It depends on your nested data structure and the kind of queries on those fields. There may be some limitations or queries may be a bit more complicated in nested structure cases in case on nested docs. However, as far the searchability of your nested docs is concerned, it entirely depends on your use case.
For eg.
director:[movies:[{movieName:"movie1", actors:[{firstName:"will", lastName:"smith"}, {firstName:"bruce", lastName:"willis"}]}]]
In the above scenario, if you have search for a director where any of the directed movies has actor with firstName as will and lastName as smith may turn out to be a bit more complex.
a simple query like
{director.movies.actors.firstName:"will", director.movies.actors.lastName:"smith"}
may return a false response
The doc : director:[movies:[actors:actors:[{firstName:"will", lastName:"willis"}, {firstName:"bruce", lastName:"smith"}]]]
will also turn out to be a positive match.
Also, negation queries like where firstName!="bruce" will also return both the documents.
You may like to go through the mongodb docs for the same
For the first case, you can refer to elemMatch
I am a newbie to "couchbase server". What i am looking for is to store 10 author names to couchbase document one after another. Someone please help me whether the structure is like a single document "author" and multiple values
{ id : 1, name : Auther 1}, { id : 2, name : Author 2}
OR store Author 1 to a document and Author 2 to another document.
If so, how can i increment the id automatically before "insert" command.
you can store all authors in a single document
{ doctype : "Authors",
AuthorNames:[
{
id: 1,
Name : "author1"
}
{
id: 2,
Name : "author2"
}
so on
]
IF you want to increase the ID, one is to enter one author name at a time in new document, but ID will be randomly generated and it would not in incremental order.
In Couchbase think more about how your application will be using the data more than how you are want to store it. For example, will your application need to get all of the 10 authors all of the time? If so, then one document might be worthwhile. Perhaps your application needs to only ever read/write one of the authors at a time. Then you might want to put each in their own, but have an object key pattern that makes it so you can get the object really fast. Objects that are used often are kept in the managed cache, other objects that are not used often may fall out of the managed cache...and that is ok.
The other factor is what your reads to writes ratio is on this data.
So like I said, it depends on how your application will be reading and writing your data. Use this as the guidance for how your data should be stored.
The single JSON document is pretty straight forward. The more advanced schema design where each author is in its own document and you access them via object key, might be a bit more complicated, but ultimately faster and more scalable depending on what I already pointed out. I will lay out an example schema and some possibilities.
For the authors, I might create each author JSON document with an object key like this:
authors::ID
Where ID is a value I keep in a special incrementer object that I will called authors::incrementer. Think of that object as a key value pair only holding an integer that happens to be the upper bound of an array. Couchbase SDKs include a special function to increment just such an integer object. With this, my application can put together that object key very quickly. If I want to go after the 5th author, I do a read by object key for "authors::5". If I need to get 10, I do a parallelized BulkGet function and get authors::1 through authors::10. If I want to get all the authors, I get the incrementer object, and get that integer and then to a parallelized bulk get. This way i can get them in order or in whatever order I feel like and I am accessing them by object key which is VERY fast in Couchbase.
All this being said, I could use a view to query this data or the upcoming "SQL for Documents" in Couchbase 4.0 or I can mix and match when I query and when I get objects by their key. Key access will ALWAYS be faster. It is the difference between asking a question then going and getting the object and simply knowing the answer and getting it immediately.