I've been looking at JSON API and keep getting hung up on scalability scenarios. Let's say you have a large collection (1,000) of models, each with 3 or 4 relationships.
From my understanding, JSON API requires you to specify at least the relationships with their associated id(s) (and optionally sideload the related resources with include). If that collection of 1,000 models has to do a JOIN for every single relationship to populate a valid JSON API payload like the one below:
...
{
  "some_relationship_name": {
    "data": [
      { "id": "1", "type": "derp" },
      ...
    ]
  }
}
I don't see how this can possibly scale in any reasonable way.
You don't have to specify the ids of the relationships. You can instead specify links that provide a way to fetch the related data. Check out the specification.
So you can do something like this:
{
  "id": "1",
  "type": "base",
  "relationships": {
    "relA": {
      "links": {
        "self": "/base/1/relationships/relA",
        "related": "/base/1/relationships/relA/related"
      }
    },
    ...
  },
  "attributes": {...}
}
So you don't have to JOIN anything you don't directly need. For example, in a list view you don't join information that you only need in the detail view.
I'm not sure where you see a problem. At roughly 20 bytes per relationship × 4 relationships × 1,000 records, that's about 80 kB. Existing adapters should have no trouble processing that much data quickly.
If you need to transfer less data, there are several options. You can add compression, but be aware that for such small payloads it is usually faster to transfer the data uncompressed than to compress it.
Another option is to send only the data you really need. Usually you don't need 1,000 records in the web app immediately, so paging and lazy loading should help you send only the data that is actually required.
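For paging, JSON API reserves the `page` query-parameter family on the top-level `links` object but does not mandate exact parameter names, so `page[number]`/`page[size]` below is a common convention rather than a requirement (the `base` resource type is reused from the example above):

```json
{
  "data": [
    { "id": "26", "type": "base" },
    { "id": "27", "type": "base" }
  ],
  "links": {
    "self": "/base?page[number]=2&page[size]=25",
    "prev": "/base?page[number]=1&page[size]=25",
    "next": "/base?page[number]=3&page[size]=25"
  }
}
```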
I am using Firebase and Xamarin.Forms to deploy an app. I am trying to figure out how to get an object (or several) matching one criterion. Let's say I have a collection of characters, each with different attributes like name, age, and city, plus a last attribute that is an array of strings saying what kind of tools they have.
For example, with these three characters in the collection:
{
  "characters": {
    "char001": {
      "name": "John",
      "tools": [ "knife", "MagicX", "laser", "fire" ]
    },
    "char002": {
      "name": "Albert",
      "tools": [ "MagicX" ]
    },
    "char003": {
      "name": "Chris",
      "tools": [ "pistol", "knife", "magicX" ]
    }
  }
}
I want to retrieve the character(s) who has a knife and magicX, so the query will give me as a result: char001, and char003.
That said, I have a large data set: over 10,000 characters in the collection, and each character can have up to 10 items in tools.
I can retrieve the objects when the tools attribute is just one string. But with tools as an array, I have to iterate through all the items of each character to see which of them have a knife, then repeat the procedure looking for the ones with magicX, and finally intersect the two result sets to get the answer. In terms of speed, this is very slow.
I would like to do it on the back-end side directly, and just receive the correct data.
How could I perform the query in firebase?
Thank you so much in advance,
Cheers.
In Firebase, this is easy, assuming that characters is a collection...
If that's the case, one way to do it is to structure your character documents like so:
'char001': {
  name: "John",
  tools: {
    knife: true,
    MagicX: true,
    laser: true
  }
}
This way, you'll be able to perform compound EQUALITY queries and get back all the characters with the tools you're searching for. Something like:
db.collection('characters').where('tools.knife', '==', true).where('tools.MagicX', '==', true)
Mind you, you can combine up to 10 equality clauses in a query.
I hope this helps, search for "firestore compound queries" for more info.
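As an illustration of the restructuring step, converting the original tools array into that boolean-map shape could look like this (the helper name `toolsArrayToMap` is made up):

```javascript
// Hypothetical helper: convert the original tools array into the
// boolean-map shape that Firestore compound equality queries can target.
function toolsArrayToMap(tools) {
  const map = {};
  for (const tool of tools) {
    map[tool] = true;
  }
  return map;
}

const char001 = {
  name: "John",
  tools: toolsArrayToMap(["knife", "MagicX", "laser", "fire"]),
};

console.log(char001.tools.knife); // true
```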
Currently, I have an API endpoint that needs to return a list of items in JSON format. However, each of the items returned has a different structure.
E.g. Think of a feed API. However, the structure of each item within the API can be very different.
Is it standard to return API response with multiple items - each with a different structure?
A made-up sample below to show different structures.
Store, Candy, and Personnel in the example are logically the same kind of thing in my case (3 different items). However, the structure underneath can be very different, with different key-value pairs, different levels of nesting, etc.
[
  {
    "store": {
      "book": [
        {
          "category": "reference",
          "author": "Nigel Rees",
          "title": "Sayings of the Century",
          "price": 8.95
        },
        {
          "category": "fiction",
          "author": "Evelyn Waugh",
          "title": "Sword of Honour",
          "price": 12.99
        }
      ],
      "bicycle": {
        "color": "red",
        "price": 19.95
      }
    }
  },
  {
    "candy": {
      "type": "chocolate",
      "manufacturer": "Hershey's",
      "cost": 10.00,
      "reduced_cost": 9.00
    }
  },
  {
    "Personnel": {
      "name": "chocolate",
      "profile": {
        "key1": "value1",
        "key2": "value2",
        "something": {
          "key1": "value1",
          "key2": "value2"
        }
      }
    }
  }
]
There are no strict rules in REST about how you design your payloads. However, there are still things to consider when doing so. Without knowing the specifics of your needs it's hard to give targeted advice, but in general, when designing a JSON REST API, here is what I think about:
1. On average, how large will my payload be? We don't want to pass large amounts of data on each request; that will make your application extremely slow and perhaps even unusable on mobile devices. For me, the limit in the absolute worst case is 1 MB, and maybe even that is too high. If you find your payload is too large, break it down into separate resources. For example, rather than including the books in the response to your stores resource, just reference the unique ids of the books, which can be accessed through /stores/books/{id}.
2. Is it simple enough that a person who stumbles across the resource can understand its general use? The simpler an API is, the more useful it is. If the structure is really complex, breaking it into several resources may be a better option.
3. This point balances the first one: try to reduce the number of requests needed to get a certain piece of data as much as possible (while still considering the two points above). Excessively breaking payloads down into separate resources also hurts performance.
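The first consideration (reference ids instead of embedding full records) can be sketched in plain JavaScript; the store shape and the `toListPayload` helper below are illustrative, not part of any standard:

```javascript
// Illustrative store record with the books embedded in full.
const store = {
  name: "Main St. Books",
  books: [
    { id: 17, category: "reference", author: "Nigel Rees", title: "Sayings of the Century", price: 8.95 },
    { id: 42, category: "fiction", author: "Evelyn Waugh", title: "Sword of Honour", price: 12.99 },
  ],
};

// Trim the payload: keep only the book ids, which a client can resolve
// later through a URL such as /stores/books/{id} when it needs details.
function toListPayload(store) {
  return { name: store.name, bookIds: store.books.map((b) => b.id) };
}

const trimmed = toListPayload(store);
// trimmed.bookIds is [17, 42]; the heavy book objects stay behind.
```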
I am just learning Firebase and I would like to know why one would need custom reference keys instead of just using childByAutoId. The examples from docs showed mostly similar to the following:
{
  "users": {
    "alovelace": {
      "name": "Ada Lovelace",
      "contacts": { "ghopper": true }
    },
    "ghopper": { ... },
    "eclarke": { ... }
  }
}
but why not use something like
{
  "users": {
    "gFlmT9skBHfxf7vCBCbhmxg6dll1": {
      "name": "Ada Lovelace",
      "contacts": { "ghopper": true }
    },
    "gFlmT9skBHfxf7vCBCbhmxg6dll2": { ... },
    "gFlmT9skBHfxf7vCBCbhmxg6dll3": { ... }
  }
}
Though I would prefer the first example for readability purposes. Aside from that, would there be any impact on Firebase features and other development concerns like querying, updating, etc.? Thanks!
Firebase's childByAutoId method is great for generating the keys in a collection:
Where the items need to be ordered by their insertion time
Where the items don't have a natural key
Where it is not a problem if the same item occurs multiple times
In a collection of users, none of these conditions (usually) apply: the order doesn't matter, users can only appear in the collection once, and the items do have a natural key.
That last one may not be clear from the sample in the documentation. Users stored in the Firebase Database usually come from a different system, often from Firebase Authentication. That system gives users a unique ID, in the case of Firebase Authentication called the UID. This UID is a unique identifier for the user. So if you have a collection of users, using their UID as the key makes it easy to find the users based on their ID. In the documentation samples, just read the keys as if they are (friendly readable versions of) the UID of that user.
In your example, imagine that you've read the node for Ada Lovelace and want to look up her contacts. You'd need to run a query on /users, which gets more and more expensive as you add users. But in the model from the documentation, you know precisely which node to read: /users/ghopper.
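The difference can be sketched with a plain in-memory object (no Firebase SDK involved; names are illustrative):

```javascript
// Toy in-memory stand-in for the /users node.
const users = {
  ghopper: { name: "Grace Hopper" },
  alovelace: { name: "Ada Lovelace", contacts: { ghopper: true } },
};

// UID-as-key model: resolving a contact is a single direct lookup.
function getContactByKey(uid) {
  return users[uid] || null;
}

// Auto-generated-key model: you would have to scan every user and
// compare a field, which grows linearly with the size of the collection.
function getContactByScan(name) {
  for (const uid of Object.keys(users)) {
    if (users[uid].name === name) return users[uid];
  }
  return null;
}

console.log(getContactByKey("ghopper").name); // "Grace Hopper"
```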
I'm designing my back end. I have a JSON array/queue/something, continuously appended to, from which I only need data that is at most two weeks old. I only want to delete from this "queue", not the containing document. Can I use TTL for this, or does TTL only work on whole documents?
Is there a better way to do this? Should I store them in per-day or per-hour arrays as separate documents instead?
Running Couchbase 2.2.
TTL in Couchbase only applies to whole documents; it's not possible to expire a subset of a document. As you said, you can always have separate documents with different expiry times, each containing a type, a date, and then the array of data as an element.
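A minimal sketch of such a per-day document scheme, assuming a made-up key prefix (this is plain JavaScript, not Couchbase SDK calls):

```javascript
// Hypothetical key scheme: one document per day of appended data,
// e.g. "ordered_data::2014-03-07". The prefix is an assumption.
function dayKey(date) {
  return "ordered_data::" + date.toISOString().slice(0, 10);
}

// Couchbase treats expiry values under 30 days as "seconds from now",
// so a two-week TTL can be passed directly when storing each day-bucket.
const TWO_WEEKS_SECONDS = 14 * 24 * 60 * 60; // 1209600

console.log(dayKey(new Date("2014-03-07T12:00:00Z"))); // "ordered_data::2014-03-07"
```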
Then using a view like so:
function (doc, meta) {
  if (meta.type == "json") {
    if (doc.type == "ordered_data") {
      if (doc.date) {
        emit(dateToArray(doc.date));
      }
    }
  }
}
You could emit all the related data ordered by date (with the descending flag set to true). This also lets your app select specific dates by passing in one or more keys, i.e. selecting a date range of 2 days, 1 week, etc. When a document expires, it is removed from the view the next time the view updates (timing varies based on your stale parameters plus ops per second/time).
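To make the range selection concrete, here is a plain-JavaScript simulation (not the Couchbase SDK) of how the emitted `dateToArray` keys compare:

```javascript
// dateToArray in a Couchbase view yields [year, month, day, hour, minute, second];
// this stand-in mimics it for ISO date strings.
function dateToArray(iso) {
  const d = new Date(iso);
  return [d.getUTCFullYear(), d.getUTCMonth() + 1, d.getUTCDate(),
          d.getUTCHours(), d.getUTCMinutes(), d.getUTCSeconds()];
}

// View keys compare element by element, so a date range is just a
// startkey/endkey pair -- simulated here with a lexicographic compare.
function cmp(a, b) {
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1;
  }
  return a.length - b.length;
}

const keys = ["2014-03-01", "2014-03-05", "2014-03-09"].map(dateToArray);
const start = dateToArray("2014-03-04");
const end = dateToArray("2014-03-10");
const inRange = keys.filter((k) => cmp(start, k) <= 0 && cmp(k, end) <= 0);
console.log(inRange.length); // 2
```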
Then you can do whatever joining or extra processing you need at the application layer. There are other options available, but to me this is the most sensible way to approach the problem. Any problems, just comment and we'll try again.
P.S. How big are your arrays going to become? If they are going to be very large, then perhaps you'd need to look at a different technology or a different way to solve the problem.
Which is the better data format for JSON? The requirement is to be able to store and retrieve information about how many projects are deployed on a server.
Object-Based Design
{
  "Server1": {
    "project1": {
      "buildNo": "290",
      "deployed": "12/12/2012"
    },
    "project2": {
      "buildNo": "291",
      "deployed": "11/12/2012"
    },
    "project3": {
      "buildNo": "209",
      "deployed": "11/12/2012"
    }
  }
}
Array-Based Design
{
  "Server1": [
    { "project1": {
      "buildNo": "290",
      "deployed": "12/12/2012"
    }},
    { "project2": {
      "buildNo": "291",
      "deployed": "11/12/2012"
    }},
    { "project3": {
      "buildNo": "209",
      "deployed": "11/12/2012"
    }}
  ]
}
Please do let me know your thoughts for or against either of these approaches.
Is the order of projects significant?
If it is, then an array is the simplest way to represent that.
If it is not, then an array requires an unnecessary preprocessing step to map array indexes to project names before you can access projects by name.
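A quick JavaScript sketch of that preprocessing difference, using the data from the question:

```javascript
// Object-based design: projects keyed by name.
const objectDesign = {
  project1: { buildNo: "290", deployed: "12/12/2012" },
  project2: { buildNo: "291", deployed: "11/12/2012" },
};

// Array-based design: one single-key object per project.
const arrayDesign = [
  { project1: { buildNo: "290", deployed: "12/12/2012" } },
  { project2: { buildNo: "291", deployed: "11/12/2012" } },
];

// Object design: access by name is direct.
const a = objectDesign.project2.buildNo;

// Array design: scan (or pre-build an index) before access by name.
const entry = arrayDesign.find((e) => "project2" in e);
const b = entry.project2.buildNo;

console.log(a === b); // true
```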
My thoughts:
Parsing:
Both are of similar complexity.
Adding/Deleting:
Both are of similar complexity.
Readability/Representation of Information:
The first indicates a fixed structure, whereas the second suggests that projects may be added or removed later.