REST API best practice: returning a single string vs. wrapping it in a JSON object?

I have an endpoint that can return one of several states, e.g. 'Active', 'Cancelled' etc.
Is it bad practice to just return this as a string in the response, like:
"Active"
or should I wrap it in a JSON object, like:
{
  "status": "Active"
}

The currently registered reference for JSON is RFC 8259
A JSON value MUST be an object, array, number, or string, or one of
the following three literal names:
false
null
true
So returning a quoted string is fine when that's the natural representation of your resource.
Where things have the potential to get complicated: discovering later that the natural representation of your resource is a message, rather than a string.
The justification for using an object is that it gives us the flexibility to introduce backwards compatible changes to the schema so that we can include more information, without breaking clients that only know about the original schema. That's a lot harder to do when your representation is "just" a string.
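To make that concrete, here is a minimal Python sketch (the extra field name is hypothetical) of why the object form tolerates additions:

import json

# Hypothetical responses: the original schema, and a backwards-compatible extension.
v1_body = '{"status": "Active"}'
v2_body = '{"status": "Active", "statusChangedAt": "2024-01-01T00:00:00Z"}'

def read_status(body):
    # A client written against the original schema keeps working,
    # because it simply ignores fields it does not know about.
    return json.loads(body)["status"]

assert read_status(v1_body) == read_status(v2_body) == "Active"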
But it is a tradeoff: what you pay for the future flexibility is slightly less convenient handling when the one string is all you need.
If you control all of the API consumers, and can change the consumers in lock step with the schema, then you can start with the simple answer, and put in the work to fix everything if it turns out that you need the more complicated representation.
This plan isn't nearly as appealing when you don't control the clients, or when making a lockstep change is expensive.
If you are creating an externally exposed API, then it is unlikely that you control all of the clients.
That said, with careful design you will also have the option of just introducing new resources to cover the cases where you actually need messages with multiple values.
My strongest recommendation here: whatever path you take, leave a paper trail. Document everything you know when you make your choice, and what you expect the risks and complications to be going forward. That way, future you will be able to recover the context of your decision.

Related

Flatbuffers vs CBOR

Please suggest some merits and demerits of the FlatBuffers and CBOR protocols. Both of these binary formats claim to be good on their websites, but I am not able to draw a clear distinction between the two.
FlatBuffers:
Advantages:
Strict typing in FlatBuffers, Cap'n Proto, and other similar solutions is seen as a major selling point for performance, since no additional encoding/decoding is necessary.
The data model allows simple offsetting of typed objects within a compact data structure, giving fast access.
FlatBuffers does not need a parsing/unpacking step to a secondary representation before you can access the data, a step often coupled with per-object memory allocation.
Disadvantages:
New, and not standardized like CBOR.
CBOR:
Advantages:
Can be created and processed entirely in a stream, with no extra memory.
No schema has to be predefined, which suits dynamic and variable data.
It's an open international standard from the IETF, which makes it an even better choice than a proprietary format.
It's designed for low-memory, non-conversion, stream-based processing, while also providing extensions for other data types.
Disadvantages:
CBOR says that it follows the JSON model (so no strictly typed objects).
It starts with the same types of objects (strings, integers, maps, etc.).
PS:
It feels like managing types in CBOR will be costly in performance compared to FlatBuffers, but as CBOR is a standardized protocol I am inclined to prefer it if the difference is not huge. Please let me know which of the two you would recommend, and why.
I think you've already spelled it out quite clearly yourself. FlatBuffers' strength is being able to access the data without parsing/unpacking/allocation, which can give serious performance benefits in some scenarios. But if this doesn't matter to you, then e.g. Protocol Buffers may work just as well.
Strong typing vs dynamic typing in data matters a lot too. I'd only use the latter if I wanted generic data storage with no constraints ahead of time.
Btw, if for some reason you prefer dynamic typing, but would also like to have the performance benefits of in-place access, there is actually a format that combines the two: https://google.github.io/flatbuffers/flexbuffers.html
FlatBuffers is not "proprietary". It may have been designed at Google, but it is open source and relied upon by many other companies.
I chose CBOR for my site https://kwippe.com - we use it to store all of the artwork and keyword data as compressed strings within a very small JSON structure, with only the few attributes necessary to categorize the file. So the files are very small, and they load very fast. I used this for over 30,000 SVG files, which I converted to JSON beforehand. All of the JSON is converted to a string and compressed via a string compression library, then saved as part of the smaller JSON object that I encode to CBOR.
I've had very few problems with this CBOR system, and it was far easier to set up than FlatBuffers and some of the other binary solutions that I looked at.
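Roughly the shape of that pipeline, as a Python sketch (assuming the third-party cbor2 package and zlib for the compression step; the field names are illustrative):

import json
import zlib

import cbor2  # third-party; pip install cbor2

artwork = {"keywords": ["cat", "sketch"], "svg": "<svg>...</svg>"}

# Compress the bulky JSON payload into a byte string, then wrap it in a
# small envelope holding just the attributes needed to categorize the file.
envelope = {
    "category": "animals",
    "payload": zlib.compress(json.dumps(artwork).encode("utf-8")),
}
blob = cbor2.dumps(envelope)  # CBOR stores the binary payload natively

# Reading it back:
restored = cbor2.loads(blob)
data = json.loads(zlib.decompress(restored["payload"]))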
I had this same question and went with CBOR for a couple reasons.
You list as a CON that CBOR, like JSON, doesn't have strict types. True; you'll need to do a little validation to make sure the type you got is the one you expected. You're right, this is what a schema serializer gets you: you lose the flexibility of changing types, but you know what you're going to get. I work on embedded systems in C, and static typing is important.
What you didn't list as a PRO is that CBOR can retain JSON compatibility: any valid JSON is valid CBOR, but not the other way around. For example, a CBOR map item (object, key/value pair) can be 1 : 2, i.e. the key integer 1 has the value integer 2. That isn't great practice, but there could be some uses for it. If you avoid the intentionally incompatible things, CBOR-to-JSON conversion can be very handy. When would you use that? Well, I use it for logs. When my CBOR packets hit our server, they are converted to JSON and stored away, already human-readable for analytics. You can do this with any serializer, but we felt there was far less chance for "interpretation" differences in such a close conversion.
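A minimal sketch of that server-side conversion (again assuming Python's cbor2 package; the keys are illustrative):

import json

import cbor2  # third-party; pip install cbor2

packet = cbor2.dumps({"apples": 100, "bananas": 3})

# On arrival, decode the CBOR packet and store it as human-readable JSON.
# This round-trip only works if the packet avoids the intentionally
# JSON-incompatible features, such as integer map keys.
decoded = cbor2.loads(packet)
print(json.dumps(decoded))  # {"apples": 100, "bananas": 3}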
The main factor for us was that the schema was too difficult to share and synchronize. If you own both sides of an A-to-B system, a schema is great! You get size efficiency, because the map "Apples" : 100 is just stored as [1,100], but you had to get your schema file onto both sides and compiled in (if using code generation) before you could get any work done. Now, what if you have 10 sides in a star pattern A B C D E F G H I J, where A and J can send messages to each other, B and H chat almost exclusively except for one message that goes to E and never comes back, etc.? In this scenario a schema can be very difficult! Maybe it's all working, and then you add a whole slew of messages; your options are to keep old schemas around, live with optional or missing definitions, or synchronize everyone. For us this was the case, and it would have spanned 4 languages and systems we didn't own.
Instead, we chose schemaless CBOR and named each map item appropriately. "apples" is for A, B, C, and J. "bananas" is an item that will go to C, H, and E but never F, etc. Each side needs to know what it should expect, and that's all.
As I understand it, FlatBuffers does have a schema-less mode, but I know little about it. I don't think there is a right answer, but for what it's worth, our web developers took to and understood CBOR right away because it's so similar in look and feel to JSON.
UPDATE: If you are interested in CBOR but could really use some schema support and/or a clear way to document what the expected data is, CDDL (RFC 8610) looks to do exactly this. It also supports data definition for JSON, because of how similar CBOR and JSON can be. There are also CDDL code generation tools for various languages that will accept a CDDL file and help generate code for deserializing, parsing, and validating the CBOR/JSON data. For me, this was the largest pain point of not having a schema: I was left to do this work, and make mistakes, on my own.

What makes JSON or YAML syntax able to be sent "through the wire"?

require 'yaml'

class Person
  attr_accessor :name, :age
end

fred = Person.new
fred.name = "Fred Bloggs"
fred.age = 45

laura = Person.new
laura.name = "Laura Smith"
laura.age = 23

test_data = [ fred, laura ]
puts test_data.to_yaml

# YAML output:
- !ruby/object:Person
  age: 45
  name: Fred Bloggs
- !ruby/object:Person
  name: Laura Smith
  age: 23
This is an example of YAML serialization from a book that I am reading. I'm having trouble understanding what makes YAML syntax any different from normal Ruby code such that it can be saved/sent. If it were converted to binary, as in "binary serialization", that would make sense to me, since it could be sent faster. If the point of serialization is to capture the state of an object and turn it into a stream, why not just make it a stream in its original order and syntax?
Concerning the question whether binary serialization would be faster: Yes, it would. If you are concerned about speed, YAML is not the tool you want – you should turn to other tools like Cap'n Proto. YAML has been designed to be human readable.
So why send YAML instead of Ruby code? Well, for starters: Security. If one end sends Ruby code to the other end and the code gets evaluated there, this may easily turn into a vulnerability if an unauthorized third party finds a way to inject a message into this stream; it can lead to arbitrary code execution.
So let's assume we don't actually want to send arbitrary Ruby code. Instead, we want to send a subset which is a single expression which evaluates to the data we want to send. Incidentally, this is how JSON came into existence: As a subset of JavaScript evaluating to an object value.
Since JSON already exists, there is no point in reinventing the wheel by basing some serialization language on Ruby, unless you want to add some feature missing from JSON¹. You would need to write a complete parser and emitter (note that you cannot simply use your Ruby implementation because, as described above, that would let an attacker execute arbitrary code). And JSON is already supported in a wide range of programming languages and ecosystems, making it an ideal data interchange format if you value cross-platform compatibility.
So now the question remains what YAML offers in addition to JSON. Some argue that YAML syntax is far more readable than JSON; YMMV. But there are a number of features in YAML that make it superior to JSON:
YAML has an extensible tagging system for annotating content with a type. Example from your code: !ruby/object:Person. This ensures that if you have a field in your data structure where differently typed values can occur, the receiving side immediately knows which type to use for deserialization. In JSON, you would need type inference (deducing the type from the value of the expression) to make that decision, and that is not always possible².
Data structures may contain cycles (e.g. ring lists, strongly connected graphs). These are difficult to serialize. YAML has built-in anchors and aliases, making it possible to reference a previously started node to denote a cyclic structure. JSON has no such thing. I assume it would be difficult to include this feature in a Ruby-based serialization language without adding features alien to Ruby itself.
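For example, with PyYAML a self-referencing structure serializes via an anchor and alias (a sketch; the exact anchor name may differ):

import yaml  # assumes PyYAML

ring = []
ring.append(ring)  # a list that contains itself: JSON cannot represent this

print(yaml.dump(ring))
# &id001
# - *id001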
Lastly, and that's the answer to the question in the title, YAML has been designed for streaming (JSON to a far lesser extent). A YAML stream can contain any number of documents. This makes it possible to keep a stream open and wait for new data on the receiving side. In contrast, JSON expects the input to end after one object.
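A small Python sketch of that streaming model (again assuming PyYAML):

import yaml  # assumes PyYAML

stream = """---
event: connected
---
event: data
value: 42
"""

# safe_load_all yields one document at a time, so a receiver can keep
# consuming as new documents arrive on a still-open stream.
for document in yaml.safe_load_all(stream):
    print(document)
# {'event': 'connected'}
# {'event': 'data', 'value': 42}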
All of this does not mean that YAML (or JSON) is the one and only way to go. Don't have any cycles or heterogeneous fields in your data? You won't need anchors/aliases or tags! Don't need human-readable serialization? You can go with a binary format! JSON and YAML have been successful because their feature set pretty well mirrors the requirements in a lot of applications. Whether it is the right tool for your application is up to you to decide.
¹ There are surely projects that do exactly that for any number of reasons. The point I want to make is that in general, implementing proper (de-)serialization is an involved task and you usually want to use what's already there.
² You can, of course, extend your JSON schema so that every node has a structure like this:
{
  "type": "myType",
  "value": ...
}
But that would make the serialization pretty verbose.

What are the practical disadvantages of using a strongly typed data interchange format (e.g. Thrift / Cap'n Proto) in a microservices context?

I'm thinking of introducing a strongly typed (read: with a predefined schema) data interchange format for communication between our internal services. For example, something like Thrift or Cap'n Proto.
At least two obvious advantages (to me) of using this over something like JSON is that
you would KNOW the exact format of the data the service can expect (leaving less room for ambiguity and errors while communicating), and
the implementation generally deserializes the raw message for you and it provides methods for accessing the objects.
What are the practical disadvantages for going this route, versus something like JSON?
For context: our system consists of services written in Python and Java, possibly other languages in the future, and it communicates via HTTP endpoints between services and via message brokers like RabbitMQ.
As with every strongly typed system, one of the major advantages is without a doubt that if you make mistakes, it fails early in the process, typically at the compilation stage, which is a good thing.
The second biggest advantage, IMHO, is what you already said: because the fields and types are well known, the compiler, libraries, and related code know what data to expect and can thus be written/organized in a more efficient manner; in short: performance.
In contrast, a loosely typed system (like Avro), while allowing for much greater flexibility without the need to recompile, comes with the other side of the same coin: it is prone to errors regarding the contents of the message at runtime.
This is because a loosely defined system defines only the syntax of a valid document (as XML does, for example) and leaves the message-level semantics of what's in the document up to the upper layers. A strongly typed system has the knowledge about those message-level semantics already built in at compile time. Therefore, it is easy to detect whether a particular document or message is not only well-formed but also valid with regard to the message contents. If you need to do the same with a loosely defined system, you have to provide additional information at runtime (like an XML schema) and validate your document against it.
Bottom line
Which system you prefer is more or less a matter of taste in most cases. I'd base the decision on how variable the data I have to deal with are. If it makes sense to use a strongly typed system, I'd go that way, because I very much like being informed about errors and mistakes early.
However, if there is a need for very flexible data structures, it may make more sense to go the other road. Although designing a loosely typed schema on top of a strongly typed system is surely possible, it is somewhat contradictory, and you'll end up with an overly complicated, yet overly generic, thing.
Typed
Having incoming messages type-tagged is very liberating, so long as it's possible to tell what an incoming message is without reading all of it. If so, then you no longer care so much about message order, because it's easy for the recipient to handle whatever it is sent. So you can have an application which just sits there taking whatever it gets, and does whatever is appropriate for each message.
Format
A schema language that allows you to define value and size constraints is very useful. It means that the sender of a message cannot accidentally send an invalid one. Moreover the receiver can automatically tell if an incoming message meets the schema. This is a real bonus in implementing a network service; the vast bulk of the message validation is done for you!
By size constraint, I mean that you can specify how long an array is in the schema, and the generated code will refuse to handle arrays that are longer or shorter. By value constraint, imagine a message field called "bearing"; you might want to constrain it to be between 0 and 359.
These both allow you to make a clear, unambiguous statement about what the interface is and have it enforced automatically. How many security bugs have there been recently where some network interface data validation has been badly implemented...
Options
One serialisation standard that does all this is ASN.1. The tools I've used take an ASN.1 schema and produce code to serialise and deserialise, automatically checking that the value and size constraints have been met, and also telling you what an incoming message's type is. The tools for ASN.1 can be quite elderly and are in need of updating; if updated, it would be ideal for every purpose, with both binary and text wire formats available.
There are JSON schemas now too, and they seem to offer type, value, and size constraints. This might be what you're looking for.
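For instance, with JSON Schema, here via Python's jsonschema package (a sketch; the fields echo the hypothetical "bearing" example above):

from jsonschema import validate, ValidationError  # third-party; pip install jsonschema

# Type, value, and size constraints in one schema: "bearing" must be an
# integer in [0, 359], and "path" may hold at most 10 waypoints.
schema = {
    "type": "object",
    "required": ["bearing"],
    "properties": {
        "bearing": {"type": "integer", "minimum": 0, "maximum": 359},
        "path": {"type": "array", "maxItems": 10},
    },
}

try:
    validate(instance={"bearing": 400}, schema=schema)
except ValidationError as err:
    print("rejected:", err.message)  # rejected: 400 is greater than the maximum of 359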
I'm fairly sure that Google Protocol Buffers doesn't do type tagging very well, and doesn't do value and size constraints. I've seen comments in GPB schema along the lines of:
// mustn't be greater than 10.
If that's what is being written into a schema, the schema language is arguably inadequate...
I'm not sure about Thrift; I don't know whether it does value constraints (someone correct me if I'm wrong, please!).
Disadvantages
Can't think of any! It can irritate developers; code they thought was good can be readily revealed to be producing junk messages, which annoys them intensely...

Should persistent objects validate data upon set?

If one has an object which can persist itself across executions (whether to a DB using an ORM, using something like Python's shelve module, etc.), should validation of that object's attributes be placed within the class representing it, or outside?
Or, rather: should the persistent object be dumb and expect whatever is setting its values to be benevolent, or should it be smart and validate the data being assigned to it?
I'm not talking about type validation or user input validation, but rather about things that affect the persistent object: ensuring that links/references to other objects exist, that numbers are unsigned, that dates aren't out of range, etc.
Validation is part of encapsulation: an object is responsible for its internal state, and validation is part of maintaining that state.
It's like asking "should I let an object do a computation and set its own variables, or should I use getters to get them all, do the work in an external function, and then use setters to set them back?"
Of course you should use a library to do most of the validation: you don't want to implement the "check unsigned values" function in every model, so you implement it in one place and let each model use it in its own code as it sees fit.
The object should validate the data input. Otherwise every part of the application which assigns data has to apply the same set of tests, and every part of the application which retrieves the persisted data will need to handle the possibility that some other module hasn't done their checks properly.
Incidentally I don't think this is an object-oriented thang. It applies to any data persistence construct which takes input. Basically, you're talking Design By Contract preconditions.
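A minimal Python sketch of such a precondition (the class and field are hypothetical):

class Booking:
    """A persistent object that refuses invalid state at assignment time."""

    def __init__(self, seats):
        self.seats = seats  # routed through the property setter below

    @property
    def seats(self):
        return self._seats

    @seats.setter
    def seats(self, value):
        # Design-by-Contract precondition: reject bad input here, so code
        # that later reads the persisted data never has to re-check it.
        if not isinstance(value, int) or value < 0:
            raise ValueError("seats must be a non-negative integer")
        self._seats = value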
My policy is that, for a global code to be robust, each object A should check as much as possible, as early as possible. But the "as much as possible" needs explanation:
The internal coherence of each field B in A (type, range within type, etc.) should be checked by the field type B itself. If B is a primitive field, or a reused class, that is not possible, so the A object has to check it.
The coherence of related fields (if that B field is null, then C must also be) is the typical responsibility of object A.
The coherence of a field B with other codes that are external to A is another matter. This is where the "pojo" approach (in Java, but applicable to any language) comes into play.
The POJO approach says that, with all the responsibilities/concerns that we have in modern software (persistence and validation are only two of them), domain models end up being messy and hard to understand. The problem is that these domain objects are central to understanding the whole application, to communicating with domain experts, and so on. Every time you have to read a domain object's code, you have to handle the complexity of all these concerns, while you might care about none, or only one, of them...
So, in the POJO approach, your domain objects must not carry code related to these concerns (each of which usually brings an interface to implement, or a superclass to extend).
All concerns except the domain one are kept out of the object (though some simple information can still be provided, in Java usually via annotations, to parameterize the generic external code that handles each concern).
Also, the domain objects relate only to other domain objects, not to framework classes related to one concern (such as validation, or persistence). So the domain model, with all its classes, can be put in a separate "package" (project or whatever), without dependencies on technical or concern-related code. This makes it much easier to understand the heart of a complex application, without all the complexity of these secondary aspects.

How to design a class that has only one heavy duty work method and data returning other methods?

I want to design a class that will parse a string into tokens that are meaningful to my application.
How do I design it?
Provide a ctor that accepts a string, provide a Parse method and provide methods (let's call them "minor") that return individual tokens, count of tokens etc. OR
Provide a ctor that accepts nothing, provide a Parse method that accepts a string and minor methods as above. OR
Provide a ctor that accepts a string and provide only minor methods but no parse method. The parsing is done by the ctor.
1 and 2 have the disadvantage that the user may call minor methods without calling the Parse method. I'll have to check in every minor method that the Parse method was called.
The problem I see in 3 is that the parse method may potentially do a lot of things. It just doesn't seem right to put it in the ctor.
2 is convenient in that the user may parse any number of strings without instantiating the class again and again.
What's a good approach? What are some of the considerations?
(the language is c#, if someone cares).
Thanks
I would have a separate class with a Parse method that takes a string and converts it into a separate new object with a property for each value from the string.
ValueObject values = parsingClass.Parse(theString);
I think this is a really good question...
In general, I'd go with something that resembles option 3 above. Basically, think about your class and what it does: does it hold any meaningful data other than the string to parse and the parsed tokens? If not, then until you have those things you don't really have an instance of your class; you have an incomplete instance, which is something you'd like to avoid.
One of the considerations that you point out is that the parsing of the tokens may be a relatively computationally complicated process; it may take a while. I agree that you may not want to take the hit for doing that in the constructor; in that case, it may make sense to use a Parse() method. The question then is whether there are any sensible operations that can be done on your class before the Parse() method completes. If not, then you're back to the original point: before Parse() completes, you're effectively in an "incomplete instance" state of your class; that is, it's effectively useless.
Of course, this all changes if you're willing and able to use some multithreading in your application. If you're willing to offload the computationally complicated operations onto another thread, and maintain some sort of synchronization on your class methods/accessors until you're done, then the whole Parse() approach makes more sense, as you can choose to spawn it in a new thread entirely. You still run into issues of attempting to use your class before it has completely parsed everything, though.
I think an even broader question in this design, though, is: what is the larger scope in which this code will be used? What is this code going to be used for, not just now, with the intended use, but is there a possibility that it may need to grow or change as your application does? In terms of stability of implementation, can you expect this to be completely stable, or is it likely that something about the set of data you'll want to parse, the size of the data, or the tokens into which you parse it will change in the future? If the implementation has a possibility of changing, consider all the ways in which it may change; in my experience, those considerations can strongly point toward one implementation or another. And considering those things is not trivial; not by a long shot.
Lest you think this is just nitpicking, I would say that, at a conservative estimate, about 10-15 percent of the classes that I've written have needed some level of refactoring even before the project was complete; rarely has a design that I've worked on survived implementation and come out the other side looking the way it did before. So considering the possible permutations of the implementation becomes very useful for determining what your implementation should be. If, say, your implementation will never need to vary the size of the string to tokenize, you can make an assumption about the computational complexity, and that may lead you one way or another on the overall design.
If the sole purpose of the class is to parse the input string into a group of properties, then I don't see any real downside in option 3. The parse operation may be expensive, but you have to do it at some point if you're going to use it.
You mention that option 2 is convenient because you can parse new values without reinstantiating the object, but if the parse operation is that expensive, I don't think that makes much difference. Compare the following code:
// Using option 3
ParsingClass myClass = new ParsingClass(inputString);
// Parse a new string.
myClass = new ParsingClass(anotherInputString);
// Using option 2
ParsingClass myClass = new ParsingClass();
myClass.Parse(inputString);
// Parse a new string.
myClass.Parse(anotherInputString);
There's not much difference in use, but with option 2, you have to have all your minor methods and properties check whether parsing has occurred before they can proceed. (Option 1 requires you to do everything that option 2 does internally, but also allows you to write option 3-style code when using it.)
Alternatively, you could make the constructor private and the Parse method static, having the Parse method return an instance of the object.
// Option 4
ParsingClass myClass = ParsingClass.Parse(inputString);
// Parse a new string.
myClass = ParsingClass.Parse(anotherInputString);
Options 1 and 2 provide more flexibility, but require more code to implement. Options 3 and 4 are less flexible, but there's also less code to write. Basically, there is no one right answer to the question. It's really a matter of what fits with your existing code best.
Two important considerations:
1) Can the parsing fail?
If so, and if you put it in the constructor, then it has to throw an exception. A Parse method could instead return a value indicating success. So check how your colleagues feel about throwing exceptions in situations which aren't show-stopping: the default assumption is that they won't like it.
2) The constructor must get your object into a valid state.
If you don't mind "hasn't parsed anything yet" being a valid state of your objects, then the parse method is probably the way to go, and call the class SomethingParser.
If you don't want that, then parse in the constructor (or factory, as Garry suggests), and call the class ParsedSomething.
The difference is probably whether you are planning to pass these things as parameters into other methods. If so, then having a "not ready yet" state is a pain, because you either have to check for it in every callee and handle it gracefully, or else you have to write documentation like "the parameter must already have parsed a string". And then most likely check in every callee with an assert anyway.
You might be able to work it so that the initial state is the same as the state after parsing an empty string (or some other base value), thus avoiding the "not ready yet" problem.
Anyway, if these things are likely to be parameters, personally I'd say that they have to be "ready to go" as soon as they're constructed. If they're just going to be used locally, then you might give users a bit more flexibility if they can create them without doing the heavy lifting. The cost is requiring two lines of code instead of one, which makes your class slightly harder to use.
You could consider giving the thing two constructors and a Parse method: the string constructor is equivalent to calling the no-arg constructor, then calling Parse.