JSONAPI resource id recommendation

This page says that the recommended type for "id" in JSONAPI is string.
identification
Every resource object MUST contain an id member and a type member. The values of the id and type members MUST be strings.
What do we gain from keeping it as a string? In other words, if in some case the ids are only numbers, is it still recommended to keep them as strings?

They're not trying to cater for the 'some' cases. They're trying to put forward an API framework which allows clients to flexibly consume APIs. In particular, the idea is that if a client conforms to this specification, then it can start dealing with any API implementation which is also conformant. The key idea is portability. They're all about describing common characteristics between APIs.
Now, there aren't too many things you can definitely say about a resource, given that a resource is the thing which gives an API a purpose, but it's (perhaps) important to know which resource you've seen and what its type is. They're both pieces of information which any API consumer is going to need (the argument goes). Once you decide that they need to be common you need a common JSON datatype for them (because integers look different from strings in JSON).
The JSONAPI format is designed so that a single well-written client could actually process multiple different APIs with little or no alteration. It can only do that if it's able to distinguish between what's important for the client itself to process versus what's important for the client to pass on to a third party (e.g. stuff to display or manipulate in a UI).
Further on in the specification it talks about how type and id, taken together, form a namespace for any fields within the resource itself. So, for a client to use that information in any meaningful way it needs to know what the type of the id and type fields are.
Ultimately, the writers of the JSONAPI spec have said that the id and type fields are universally useful regardless of their value in the resource / API itself. As such, they both need to have reasonably well-defined semantics regardless of what makes sense for any individual API. Strings and integers behave differently so you need to make a choice. And Strings are a bit more flexible across APIs so they've made that call. They could probably have gotten away with choosing integers but they probably decided that it would be a bit restrictive in the cases where people really did want a String as an id.
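For an API whose ids happen to be integers internally, the practical consequence can be sketched as follows: convert to a string when serializing and back when parsing. This is only a minimal sketch; the function names and the "articles" type are hypothetical and not taken from the specification.

```python
import json


def to_resource_object(resource_type: str, db_id: int, attributes: dict) -> dict:
    """Build a JSONAPI-style resource object; the id is always emitted as a string."""
    return {"type": resource_type, "id": str(db_id), "attributes": attributes}


def parse_numeric_id(resource: dict) -> int:
    """Convert the string id back to an integer for a store that uses integer keys."""
    raw_id = resource["id"]
    if not isinstance(raw_id, str):
        raise TypeError("resource ids must be strings")
    return int(raw_id)


# Hypothetical example resource: database key 42 travels over the wire as "42".
article = to_resource_object("articles", 42, {"title": "Hello"})
print(json.dumps(article))
```

A generic client never has to know that the server's ids are numeric; it treats "42" as an opaque string and hands it back unchanged.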

Related

Is it okay to put Object ids into the dataset attribute?

I just wanted to know whether putting data in the dataset of an element is considered a security flaw, even though that data is meant to be seen.
For example, if Instagram put the id of each post from its database into the dataset attribute of each post element.
Another example would be:
Putting the id of the post in a dataset
OWASP calls this Insecure Direct Object Reference (IDOR) -- when you expose a "direct reference" (database ID, etc.) to an internal object to the client -- and it absolutely can be a security issue.
To quote OWASP here, which is a much better expert on this than any one person:
IDOR do not bring a direct security issue because, by itself, it reveals only the format/pattern used for the object identifier. IDOR brings, depending on the format/pattern in place, a capacity for the attacker to mount an enumeration attack in order to try to probe access to the associated objects.
So in essence, if you surface database IDs and thus the pattern they change in, and you have some sort of access control issue (you're relying on security by obscurity, or you have some sort of bug in your access control), an attacker can find their way to any object on your system because they know the ID scheme all objects follow and thus can enumerate to any object in your database.
This is a major flaw, but it's by no means the only one. Check out the OWASP cheat sheet on this for more details!
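As a rough illustration of why access control matters more than hiding the id, here is a minimal sketch of a lookup that checks authorization on every request. The `Post` class, the in-memory `POSTS` store and the user names are all hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    post_id: int
    owner: str
    private: bool


# Hypothetical in-memory store standing in for a database table.
POSTS = {
    1: Post(1, "alice", private=False),
    2: Post(2, "bob", private=True),
}


def get_post(post_id: int, current_user: str) -> Post:
    """Return a post only if the caller is allowed to see it."""
    post = POSTS.get(post_id)
    # Use the same error for "missing" and "forbidden" so that probing
    # sequential ids does not reveal which ones exist.
    if post is None or (post.private and post.owner != current_user):
        raise LookupError("post not found")
    return post
```

With a check like this in place, an attacker who enumerates ids learns nothing about posts they are not allowed to read, even though the ids themselves are visible in the page.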

How do you model complex operations in REST?

I am developing an online game where characters can perform complex actions against other objects and characters. I am building a REST API, and having a lot of trouble trying to follow even some of the most basic standards. I know that REST isn't always the answer, but for a variety of reasons it makes sense for me to use REST since the rest of the API uses it appropriately.
Here are some tricky examples:
GET /characters/bob/items
This returns an array of items that Bob is carrying.
I need to perform a variety of 'operations' against these items, and I'm having a very difficult time modeling them as 'resources'.
Here are some potential operations, depending on the nature of the item:
throw, eat, drop, hold
This is complicated because these 'operations' are only suitable for certain items. For example, you can't eat a sword. Moreover, 'eat' essentially has a side-effect of 'deleting' the resource. Using 'throw' may also 'delete' the resource. Using 'drop' may 'transform' the resource into another resource type. 'Throw' requires that I provide a 'location'. 'Hold' requires that I supply which hand to hold the item in. So how do you model these operations as resources? None of them are 'alike' because they each require different parameters and result in completely different behaviors.
Currently, I have an 'actions' resource that I POST these arbitrary actions to. But this feels way too RPC and non-standardized/discoverable:
POST /actions/throw
{
  "characterId": 5,
  "itemId": 10,
  "x": 100,
  "y": 150
}
I try to stick to resources and GET/POST/PUT/PATCH/DELETE where possible, but the base verbs tend to map directly to CRUD calls. Other, more complex operations generally can't be mapped without additional information.
Focusing on the resources, I'd probably do something like this (posting messages to the resources):
POST /characters/bob/items/{bombId}?action=throw
POST /characters/bob/items/{foodId}?action=eat
POST /characters/bob/items/{potionId}?action=add&addedItem={ingredientId}
Return an error when the action is not appropriate for the item.
Where I want a resource to “do a complex action” while remaining RESTful, I'd POST a complex document to the resource that describes what I want to happen. (The complex document could be in XML, JSON, or any number of other formats.) This is somewhat distinct from the more common pattern of mapping POST to “create a child resource”, but the meaning of POST is “do non-idempotent action defined by body content”. That's a reasonable fit for what you're after.
As part of the HATEOAS principle of discovery, when you GET the resource which you will later POST to, part of the document returned should say what these complex action documents are and where they should be sent to. Logically, think of filling in a form and submitting it (even if the “form” is actually slots in a JSON document or something like that).
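The "form" idea can be sketched as a pair of functions: one describes which action documents an item accepts (what a GET could return), the other handles a POSTed document. This is only a sketch under assumptions: the item catalogue, the parameter names and the status codes chosen here are illustrative, not prescribed by REST.

```python
from typing import Any

# Hypothetical item catalogue: which actions each kind of item supports.
ALLOWED_ACTIONS = {
    "bomb": {"throw", "drop", "hold"},
    "food": {"eat", "throw", "drop", "hold"},
    "sword": {"throw", "drop", "hold"},
}

# Parameters each action document must carry.
REQUIRED_PARAMS = {
    "throw": {"x", "y"},
    "hold": {"hand"},
    "eat": set(),
    "drop": set(),
}


def describe_item(item_kind: str) -> dict[str, Any]:
    """What a GET on the item could return: the data plus the action 'forms'."""
    return {
        "kind": item_kind,
        "actions": [
            {"action": name, "params": sorted(REQUIRED_PARAMS[name])}
            for name in sorted(ALLOWED_ACTIONS[item_kind])
        ],
    }


def post_action(item_kind: str, document: dict[str, Any]) -> tuple[int, str]:
    """Handle a POSTed action document; returns (HTTP status, message)."""
    action = document.get("action")
    if action not in ALLOWED_ACTIONS[item_kind]:
        return 422, f"cannot {action} a {item_kind}"
    missing = REQUIRED_PARAMS[action] - set(document)
    if missing:
        return 400, f"missing parameters: {sorted(missing)}"
    return 200, f"{action} performed"
```

For example, `post_action("sword", {"action": "eat"})` is rejected, while `post_action("bomb", {"action": "throw", "x": 100, "y": 150})` is accepted. The client discovers both facts from `describe_item` rather than from out-of-band documentation.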

Common terminology for types of data validation?

Are there any common terms for the difference between data validation you can do on, say, an object in and of itself, and validation that requires access to some sort of external resources?
For example, if I have a user record, I can check things like "Is username present?" or "Is username at least n characters long?" without requiring any additional context. But as soon as I want to do something like "Is username available?", it requires checking against other records in my system.
I'm just wondering if there are any good terms for describing the difference in these types of scenarios? "Static analysis" vs. "run-time checking" sort of fits, but it's clearly not correct.
I don't really know of any widely accepted terms for these different kinds of validation. Wikipedia provides you some guidance.
What's important is that you define/use a set of terms that everyone in your team agrees with and uses. I believe that the terms you're proposing (static vs runtime) are not good because all these rules are exercised at runtime anyway. I would propose something like intrinsic vs extrinsic or internal vs external validation.
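Whatever names the team settles on, the split can be made visible in code. The sketch below uses the intrinsic/extrinsic wording proposed above; the minimum length and the set of taken usernames are hypothetical.

```python
MIN_USERNAME_LENGTH = 3  # illustrative threshold, not from the question


def validate_intrinsic(user: dict) -> list[str]:
    """Checks that need nothing but the record itself."""
    errors = []
    username = user.get("username", "")
    if not username:
        errors.append("username is required")
    elif len(username) < MIN_USERNAME_LENGTH:
        errors.append("username is too short")
    return errors


def validate_extrinsic(user: dict, taken_usernames: set[str]) -> list[str]:
    """Checks that need context from outside the record (here, other users)."""
    if user.get("username") in taken_usernames:
        return ["username is not available"]
    return []
```

The intrinsic function has no dependencies and can run anywhere, while the extrinsic one makes its outside dependency explicit as a parameter.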

How to avoid Anemic Domain Models and maintain Separation of Concerns?

How do you make your objects fully cognizant of their roles within the system and still avoid having too many dependencies within the domain model on the database and service layers?
For example, say that I've got an entity with a revision history and several "lookup tables" that the data references. The entity object should have methods to get the details from some of the lookup tables, whether by providing access to the lookup table rows or by delegating methods down to them, but in order to do so it depends on the database layer to read the data from those rows. Also, when the entity is saved, it needs to know not only how to save itself but also how to save entries into the revision history. Is it necessary to pass references to dozens of different data layer objects and service objects to the model object? This seems to make the logic far more complex to understand than just passing thin models back and forth to service layer objects, but I've heard many "wise men" recommending this sort of structure.
Really really good question. I have spent quite a bit of time thinking about such topics.
You demonstrate great insight by noting the tension between an expressive domain model and separation of concerns. This is much like the tension in the question I asked about Tell Don't Ask and Single Responsibility Principle.
Here is my view on the topic.
A domain model is anemic because it contains no domain logic. Other objects get and set data using an anemic domain object. What you describe doesn't sound like domain logic to me. It might be, but generally, look-up tables and other technical language are most likely terms that mean something to us but not necessarily anything to the customers. If this is incorrect, please clarify.
Anyway, the construction and persistence of domain objects shouldn't be contained in the domain objects themselves because that isn't domain logic.
So to answer the question, no, you shouldn't inject a whole bunch of non-domain objects/concepts like lookup tables and other infrastructure details. This is a leak of one concern into another. The Factory and Repository patterns from Domain-Driven Design are best suited to keep these concerns apart from the domain model itself.
But note that if you don't have any domain logic, then you will end up with anemic domain objects, i.e. bags of brainless getters and setters, which is how some shops claim to do SOA / service layers.
So how do you get the best of both worlds? How do you focus your domain objects only on domain logic, while keeping UI, construction, persistence, etc. out of the way? I recommend you use a technique like Double Dispatch, or some form of restricted method access.
Here's an example of Double Dispatch. Say you have this line of code:
entity.saveIn(repository);
In your question, saveIn() would have all sorts of knowledge about the data layer. Using Double Dispatch, saveIn() does this:
repository.saveEntity(this.foo, this.bar, this.baz);
And the saveEntity() method of the repository has all of the knowledge of how to save in the data layer, as it should.
In addition to this setup, you could have:
repository.save(entity);
which just calls
entity.saveIn(this);
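Put together, the round trip might look like the following minimal sketch. The method names mirror saveIn() and saveEntity() from the snippets above; the in-memory repository and the foo/bar/baz fields are placeholders.

```python
class InMemoryRepository:
    """Hypothetical repository; only this class knows how data is stored."""

    def __init__(self) -> None:
        self.rows = []

    def save(self, entity: "Entity") -> None:
        # Convenience entry point: hand control to the entity...
        entity.save_in(self)

    def save_entity(self, foo, bar, baz) -> None:
        # ...which calls back here with its state.
        self.rows.append({"foo": foo, "bar": bar, "baz": baz})


class Entity:
    def __init__(self, foo, bar, baz) -> None:
        self._foo, self._bar, self._baz = foo, bar, baz

    def save_in(self, repository: InMemoryRepository) -> None:
        # The entity passes its state along without exposing public accessors
        # and without knowing anything about the storage format.
        repository.save_entity(self._foo, self._bar, self._baz)


repo = InMemoryRepository()
repo.save(Entity(1, 2, 3))
```

After this runs, `repo.rows` holds a single row built from the entity's private state, and no getter was needed to obtain it.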
I re-read this and I notice that the entity is still thin because it is simply dispatching its persistence to the repository. But in this case, the entity is supposed to be thin because you didn't describe any other domain logic. In this situation, you could say "screw Double Dispatch, give me accessors."
And yeah, you could, but IMO it exposes too much of how your entity is implemented, and those accessors are distractions from domain logic. I think the only class that should have gets and sets is a class whose name ends in "Accessor".
I'll wrap this up soon. Personally, I don't write my entities with saveIn() methods, because I think even just having a saveIn() method tends to litter the domain object with distractions. I use either the friend class pattern, package-private access, or possibly the Builder pattern.
OK, I'm done. As I said, I've obsessed on this topic quite a bit.
"thin models to service layer objects" is what you do when you really want to write the service layer.
ORM is what you do when you don't want to write the service layer.
When you work with an ORM, you are still aware of the fact that navigation may involve a query, but you don't dwell on it.
Lookup tables can be a relational crutch that gets used when there isn't a very complete object model. Instead of things referencing things, you have codes, which must be looked up. In many cases, the codes devolve to little more than a static pool of strings with database keys. And the relevant methods wind up in odd places in the software.
However, if there is a more complete object model, we have first-class things instead of these degenerate lookup values.
For example, I've got some business transactions which have one of n different "rate plans" -- a kind of pricing model. Right now, the legacy relational database has the rate plan as a lookup table with a code, some pricing numbers, and (sometimes) a description.
[Everyone knows the codes -- the codes are sacred. No one is sure what the proper descriptions should be. But they know the codes.]
But really, a "rate plan" is an object that is associated with a contract; the rate plan has the method that computes the final price. When an app asks the contract for a price, the contract delegates some of the pricing work to the associated rate plan object.
There may have been some database query going on to lookup the rate plan when producing a contract price, but that's incidental to the delegation of responsibility between the two classes.
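A minimal sketch of that delegation, assuming a made-up pricing rule (unit price times quantity, less a discount); the real rate plans surely differ.

```python
from dataclasses import dataclass


@dataclass
class RatePlan:
    """A first-class object instead of a lookup code; it owns the pricing rule."""
    code: str
    unit_price: float
    discount: float = 0.0  # fraction of the price, e.g. 0.5 for 50 percent

    def price_for(self, quantity: int) -> float:
        return quantity * self.unit_price * (1 - self.discount)


@dataclass
class Contract:
    quantity: int
    rate_plan: RatePlan  # may have been loaded by a query; the contract does not care

    def price(self) -> float:
        # Delegate the pricing work to the associated rate plan.
        return self.rate_plan.price_for(self.quantity)
```

The code survives as an attribute, but the behavior now lives with the rate plan instead of being scattered across whatever happens to look the code up.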
I agree with DeadBeef - therein lies the tension. I don't really see, though, how a domain model is 'anemic' simply because it doesn't save itself.
There has to be much more to it, i.e. it's anemic because the service is doing all the business rules and not the domain entity.
// Service has an IRepository injected
Save() {
    DomainEntity.DoSomething();
    Repository.Save(DomainEntity);
}
'Do Something' is the business logic of the domain entity.
**This would be anemic**:
// Service has an IRepository injected
Save() {
    if (DomainEntity.IsSomething) {
        DomainEntity.SetItProperty();
    }
    Repository.Save(DomainEntity);
}
See the inherent difference? I do :)
Try the "repository pattern" and "Domain-Driven Design". DDD suggests defining certain entities as Aggregate roots of other objects. Each Aggregate is encapsulated. The entities are "persistence ignorant". All the persistence-related code is put in a repository object which manages data access for the entity. This way you don't have to mix persistence-related code with your business logic. If you are interested in DDD, check out Eric Evans' book.

How do you handle exceptional cases

This situation comes up often, but here is the latest example:
Companies have various contact data (addresses, phone numbers, e-mails...). When they post a job ad, they tick checkboxes to choose how they want to be contacted. It is basically descriptive data. A user reading an ad sees something like "You can apply by mail, in person...", except when the option is "through web portal" or "by e-mail", because then the appropriate buttons should appear. These options are stored in the database, and the client (the owner of the site, not the company posting an ad) can change them (e.g. they can add "by telepathy" or whatever), yet if they tamper with the "e-mail" and "web-portal" options, they screw up their web site.
So how should I handle data where everything behaves the same way except "this thing", which behaves one way, and "that thing", which behaves some other way, while the data itself is live and should be editable by the client?
You've tagged your question as "language-agnostic", and not all languages cleanly support polymorphism, but that's the way I would approach this.
Each option has some type, and different types require different properties to be set. However, every type supports some sort of "render" method that can display the contact method as needed. Since the properties (phone number, or web address, etc.) are type-specific, you can validate the administrator's input when creating these "objects", to make sure that the necessary data is provided and valid. Since you implement the render method, rather than spitting out HTML provided by a user, you can ensure that the rendered page is correct. It's less flexible, but safer and more user friendly.
In the database, you can have one sparsely populated table that holds data for all types of contacts, or a "parent" table with common properties and sub-tables with type-specific properties. It depends on how many types you have and how different they are. In either case, you would have some sort of type indicator, so that you know the type of object to which the data should be bound.
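The polymorphic approach described in this answer could be sketched roughly as follows. The class names, the row layout (a "type" indicator plus a single "value") and the HTML are all hypothetical; the e-mail check is deliberately crude.

```python
from abc import ABC, abstractmethod
from html import escape


class ContactOption(ABC):
    @abstractmethod
    def render(self) -> str:
        """Return the HTML fragment shown to the person reading the ad."""


class DescriptiveOption(ContactOption):
    def __init__(self, label: str) -> None:
        if not label.strip():
            raise ValueError("label is required")
        self.label = label

    def render(self) -> str:
        return f"<span>{escape(self.label)}</span>"


class EmailOption(ContactOption):
    def __init__(self, address: str) -> None:
        # Type-specific validation of the administrator's input.
        if "@" not in address:
            raise ValueError("a valid e-mail address is required")
        self.address = address

    def render(self) -> str:
        return f'<a class="button" href="mailto:{escape(self.address)}">Apply by e-mail</a>'


# The type indicator stored in the database selects the class to bind to.
OPTION_TYPES = {"descriptive": DescriptiveOption, "email": EmailOption}


def load_option(row: dict) -> ContactOption:
    return OPTION_TYPES[row["type"]](row["value"])
```

The site owner can add as many descriptive options as they like, while the special types keep their own validation and rendering and cannot be broken by editing a label.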
First of all, think twice about whether you really need it. The reason is simple. You are supposed to serve a specific need, and input data is a means of providing that service. If the data does not fit the existing service, then what is its value, and who is the consumer of that specific information?
There are two possible answers: you are expanding your client base, or you need to change the existing service because of a change in demand. In both cases you need to start from the development of the business model. If you describe what service you need and what information it should provide, you will avoid much of the special-case data and arrive at clear requirements that are easy to implement in software.
I'd recommend the resolution pattern for this, based on the mention of a database. The link above describes it, but it's actually a lot simpler than it sounds. You write a database query that returns all the possible options (for example, you read the standard options and the customized options together using perhaps a UNION or a JOIN depending on your schema) - the SQL COALESCE function is then useful to find the first 'resolution' of the option value that isn't NULL.
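As a small sketch of the COALESCE part only, using SQLite from Python: the two tables and their columns are an assumed schema, and the query covers overrides of standard options, not options that exist only in the custom table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE standard_options (option_key TEXT PRIMARY KEY, label TEXT NOT NULL);
    CREATE TABLE custom_options (option_key TEXT PRIMARY KEY, label TEXT);
    INSERT INTO standard_options VALUES ('email', 'By e-mail');
    INSERT INTO standard_options VALUES ('mail', 'By mail');
    INSERT INTO custom_options VALUES ('mail', 'By post');
""")

# COALESCE picks the custom label when one exists, else the standard one.
rows = conn.execute("""
    SELECT s.option_key, COALESCE(c.label, s.label) AS label
    FROM standard_options AS s
    LEFT JOIN custom_options AS c ON c.option_key = s.option_key
    ORDER BY s.option_key
""").fetchall()

for option_key, label in rows:
    print(option_key, label)
```

Here the site owner has relabelled "mail" but left "email" alone, so the query resolves to the custom label for the former and the standard label for the latter.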
Well, if all it is is that you have two options that are special, and then anything else is dealt with in the same way, then store your options as strings, and if either of the two special ones appears in that list, then show the appropriate stuff for that special item.
Just check your list of items for the two special ones. Nothing fancy.
By writing a very simple Rules Engine. You can use an out-of-the-box implementation, or you can roll your own. Since your case seems so simple, I would tend to roll my own, because it means fewer dependencies (YMMV).
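A home-grown version can be very small. The sketch below is one possible shape, not a reference design: each rule is a predicate paired with a renderer, rules are tried in order, and the last one is a catch-all. The option strings and the bracketed placeholders for buttons are hypothetical.

```python
from typing import Callable, NamedTuple


class Rule(NamedTuple):
    applies: Callable[[str], bool]
    render: Callable[[str], str]


# Rules are evaluated in order; the first one that applies wins.
RULES = [
    Rule(lambda option: option == "by e-mail", lambda option: "[E-mail button]"),
    Rule(lambda option: option == "through web portal", lambda option: "[Apply online button]"),
    # Fallback rule: every other option is plain descriptive text.
    Rule(lambda option: True, lambda option: f"You can apply {option}."),
]


def render_option(option: str) -> str:
    for rule in RULES:
        if rule.applies(option):
            return rule.render(option)
    raise ValueError(f"no rule matched {option!r}")
```

With these rules, `render_option("by telepathy")` falls through to the descriptive text, while the two special options get their buttons.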