Composition in REST and consistency of the inserted data - json

How do I properly design a REST API when I have a composition? I have a TestResult entity, which has TestCaseResults entities. Both support the full set of REST methods. The important fact about this (which I believe differs from many examples I found on the web) is that a TestResult is not consistent if it doesn't have all of its TestCaseResults. How do I properly design this in REST?
Let's say I create them as separate but dependent resources: api\testresults\ and api\testresults\1\testcaseresults. When the client wants to create a test result, it needs to POST to api\testresults, then retrieve the URL api\testresults\1\testcaseresults from a link in the response, and POST all of the test case results to it. This means that the test result is inconsistent until the client finishes the whole operation. Basically, there is no concept of a transaction here.
Let's say I create only the api\testresults resource, and embed an array of test case results inside, like this:
{
  "Name": "Test A",
  "Results": [
    {
      "Measured": "BB",
      ...
    },
    ...
  ],
  ...
}
Then it is easier to insert, but it is still hard to work with. A simple GET to api\testresults\1\ will retrieve the test result with a large number of test case results. GET to api\testresults\ will retrieve many more! The structure becomes complex. Furthermore, in the real world I have several entities like TestCaseResults that belong to TestResults, so there will be several arrays, and each could have 100-200 elements.
I could try to combine the approaches: embed the array, but also provide links to api\testresults\1\testcaseresults and support operations there as well. Maybe on GET api\testresults\1\ I could return the TestResult without its TestCaseResults but with a link pointing to that resource, while on POST I could accept an embedded array of TestCaseResults (I'm not sure, though, whether REST allows different representations for POST and GET). But now there are two ways of inserting information, which is confusing, and I'm still not sure it solves anything.

Your approach with api\testresults\1 and api\testresults\1\testcaseresults seems promising.
As JSON does not have a fixed structure, you can add query parameters to your URL to control whether the test case results are included in the response.
api\testresults\1?with_results=true would mean that your caller wants to see the test case results in addition to the test result.
api\testresults\1\testcaseresults would still return the test case results for your test 1.
If you fear that the number of test case results is too large, you can add pagination parameters, which would be reused in the testcaseresults call.
api\testresults\1?with_results=true&per_page=10 would include only the first 10 results. To get more, use api\testresults\1\testcaseresults?per_page=10&page=2 and so on, as it is the dedicated endpoint.
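A minimal Python sketch of how the two endpoints could share one pagination helper. The in-memory dictionaries, the function names and the `M1`..`M25` sample values are all made up for illustration; only the `with_results`, `per_page` and `page` parameters come from the URLs above.

```python
# Hypothetical in-memory data standing in for a database.
TEST_RESULTS = {1: {"Name": "Test A"}}
TEST_CASE_RESULTS = {1: [{"Measured": f"M{i}"} for i in range(1, 26)]}  # 25 rows


def paginate(items, per_page, page=1):
    # Return the slice of `items` for a 1-based page number.
    start = (page - 1) * per_page
    return items[start:start + per_page]


def get_test_case_results(result_id, per_page=10, page=1):
    # GET api\testresults\{id}\testcaseresults?per_page=..&page=..
    return paginate(TEST_CASE_RESULTS[result_id], per_page, page)


def get_test_result(result_id, with_results=False, per_page=10):
    # GET api\testresults\{id}?with_results=..&per_page=..
    body = dict(TEST_RESULTS[result_id])
    if with_results:
        # Only the first page is embedded; later pages come from the
        # dedicated testcaseresults endpoint.
        body["Results"] = get_test_case_results(result_id, per_page, page=1)
    return body
```

Without `with_results` the response stays small, and the embedded array never grows beyond one page.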
Cheers
Note: if you want a flexible API that still returns JSON data, you can take a look at GraphQL, the trendy approach.

Related

REST API Design query

I have a problem deciding what to do in this case in REST API design.
Here is my problem:
I have a resource domain model, which has a nested object that is also a domain model.
You can imagine something like this:
{
  "name": "abc",
  "type": {
    "name": "typeName",
    "description": "description"
  }
}
Now, I want to be able to fetch the outer model resources based on the inner model and a few more params.
For example, I want to fetch all outer model resources which have a given type, along with some params like page number, size, etc.
So my questions are:
1. The API should accept the inner model in a POST and return the outer model. Is that a good REST design?
2. How do I pass the extra params? It's a POST, so I can't put them in the URL, and I can't put them in the inner model.
3. Should I create a new model which contains these extra params and also the type resource?
Like:
{
"page":"10",
"type":{
"name":"typeName",
"description":"description"
}
}
If you are making a generic HTTP service, it's acceptable to use POST to send a complex query, and to get a response.
If you are trying to be RESTful, then this is a bad practice. You have two options. Option 1 is to find a way to encode your query in the URL, so you can use a GET request.
Option 2 is more involved. I wouldn't necessarily recommend it, but it's a way to get around the constraints of REST while still having complex queries.
The idea is that you use POST to create a 'query' resource, almost as if you were doing a server-side prepared statement, and then later use GET to get the result of the query.
Example of the client->server conversation:
POST /queries
Content-Type: application/json
...
A response to this might be:
HTTP/1.1 201 Created
Location: /queries/1234
Link: </queryresults/1234>; rel="some-relationship-identifier"
Then after that you could do a GET on /queries/1234 to see the query you 'prepared' and a GET on /queryresults/1234 to see the actual report.
The benefit of this is that it stays within the constraints of REST, that you could potentially re-use queries, and that the server can take longer to generate the results.
The obvious drawback is that it's harder to explain to consumers of your API, as not everyone is familiar with this pattern, and it costs an extra HTTP request.
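The conversation above could be sketched in Python roughly like this. It is an in-memory simulation rather than a real HTTP server; the function names, the `RESOURCES` sample data, the integer `page`/`size` fields and the starting id 1234 are assumptions for illustration.

```python
import itertools

# Hypothetical data set the stored queries run against.
RESOURCES = [
    {"name": "abc", "type": {"name": "typeName", "description": "description"}},
    {"name": "def", "type": {"name": "otherType", "description": "other"}},
    {"name": "ghi", "type": {"name": "typeName", "description": "description"}},
]

_queries = {}                      # query id -> stored query body
_next_id = itertools.count(1234)   # ids start at 1234 to mirror the example


def post_query(body):
    # POST /queries: store the query ("prepare" it) and point the client at it.
    query_id = next(_next_id)
    _queries[query_id] = body
    headers = {
        "Location": f"/queries/{query_id}",
        "Link": f'</queryresults/{query_id}>; rel="some-relationship-identifier"',
    }
    return 201, headers


def get_query(query_id):
    # GET /queries/{id}: show the query that was prepared.
    if query_id not in _queries:
        return 404, None
    return 200, _queries[query_id]


def get_query_results(query_id):
    # GET /queryresults/{id}: run the stored query and return one page.
    if query_id not in _queries:
        return 404, None
    query = _queries[query_id]
    wanted = query["type"]["name"]
    matches = [r for r in RESOURCES if r["type"]["name"] == wanted]
    size = query.get("size", 10)
    start = (query.get("page", 1) - 1) * size
    return 200, matches[start:start + size]
```

Because the results are fetched with a plain GET on their own URL, they can be cached or re-requested without re-sending the query body.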
So you have to decide:
Is it worth doing this?
Can you encode the query in the URI instead to avoid this altogether?
Maybe you don't care enough about being RESTful and you might just want to break the rules and use POST for some complex queries.

Rest API design with multiple unique ids

Currently, we are developing an API for our system and there are some resources that may have different kinds of identifiers.
For example, there is a resource called orders, which may have a unique order number and also a unique id. At the moment, we only have URLs for the id, which are these:
GET /api/orders/{id}
PUT /api/orders/{id}
DELETE /api/orders/{id}
But now we also need the possibility to use order numbers, which would normally result in:
GET /api/orders/{orderNumber}
PUT /api/orders/{orderNumber}
DELETE /api/orders/{orderNumber}
Obviously that won't work, since id and orderNumber are both numbers.
I know that there are some similar questions, but they don't help me, because the answers don't really fit or their approaches are not really RESTful or comprehensible (for us and for possible developers using the API). Additionally, some of the questions and answers are more than 7 years old.
To name a few:
1. Using a query param
One suggests to use a query param, e.g.
GET /api/orders/?orderNumber={orderNumber}
I think there are a lot of problems with this. First, this is a filter on the orders collection, so the result should be a list as well. However, there is only one order for a unique order number, which is a little bit confusing. Secondly, we already use such filters to search for a subset of orders. Additionally, a query param is a kind of second-class parameter, but the order number should be first-class in this case. This is even a problem if the object does not exist: normally a GET would return a 404 (not found), but GET /api/orders/?orderNumber=1234 would return an empty array if the order 1234 does not exist.
2. Using a prefix
Some public APIs use some kind of a discriminator to distinguish between different types, e.g. like:
GET /api/orders/id_1234
GET /api/orders/ordernumber_367652
This works for their approach, because id_1234 and ordernumber_367652 are their real unique identifiers that are also returned by other resources. However, that would result in a response object like this:
{
"id": "id_1234",
"ordernumber": "ordernumber_367652"
//...
}
This is not very clean, because the type (id or order number) is modelled twice. And apart from the problem of changing all identifiers and response objects, this would be confusing if you, for example, want to search for all order numbers greater than 67363 (so there is also a string/number clash). If the response does not add the type as a prefix, a user has to add it for some requests, which would also be very confusing (sometimes you have to add it and sometimes not...).
3. Using a verb
This is what e.g. Twitter does: their URL ends with show.json, so you can use it like:
GET /api/orders/show.json?id=1234
GET /api/orders/show.json?number=367652
I think, this is the most awful solution, since it is not restful. Furthermore, it has some of the problems that I mentioned in the query param approach.
4. Using a subresource
Some people suggest to model this like a subresource, e.g.:
GET /api/orders/1234
GET /api/orders/id/1234 //optional
GET /api/orders/ordernumber/367652
I like the readability of this approach, but I think the meaning of /api/orders/ordernumber/367652 would be "get (just) the order number 367652" and not the order. Finally, this breaks some best practices like using plurals and only real resources.
So finally, my questions are: Did we miss something? And are there other approaches? I think that this is not an unusual problem.
To me, the most RESTful way of solving your problem is approach number 2 with a slight modification.
From a theoretical point of view, you just have a valid identification code to identify your order. At this point of the design process, it isn't important whether your identification code is an id or an order number. It's something that uniquely identifies your order, and that's enough.
The fact that the formats of ids and order numbers are ambiguous is an issue belonging to the implementation phase, not the design phase.
So for now, what we have is:
GET /api/orders/{some_identification_code}
and this is very RESTful.
Of course you still have to resolve the ambiguity, so we can proceed with the implementation phase. Unfortunately your set of order identification codes is made up of two distinct kinds of identifier that share the same format. Trivially, that can't work. But now the problem lies in the definition of these formats.
My suggestion is very simple: ids will be integers, while order numbers will be codes such as N1234567. This approach makes your resource representation acceptable:
{
"id": "1234",
"ordernumber": "N367652"
//...
}
Additionally, prefixed codes like this are common in many scenarios, such as courier shipments.
Here is an alternate option that I came up with that I found slightly more palatable.
GET /api/orders/1234
GET /api/orders/1234?idType=id //optional
GET /api/orders/367652?idType=ordernumber
The reason is that it keeps the paths consistent with REST conventions, and the service can then check whether idType=ordernumber was passed (idType=id being the default).
I'm struggling with the same issue and haven't found a perfect solution. I ended up using this format:
GET /api/orders/{orderid}
GET /api/orders/bynumber/{orderNumber}
Not perfect, but it is readable.
I'm also struggling with this! In my case, I only really need to be able to GET using the secondary ID, which makes this a little easier.
I am leaning towards using an optional prefix to the ID:
GET /api/orders/{id}
GET /api/orders/id:{id}
GET /api/orders/number:{orderNumber}
or this could be a chance to use an obscure feature of the URI specification, path parameters, which let you attach parameters to particular path elements:
GET /api/orders/{id}
GET /api/orders/{id};id_type=id
GET /api/orders/{orderNumber};id_type=number
The URL using an unqualified ID is the canonical one. There are two options for the behaviour of non-canonical URLs: either return the entity, or redirect to the canonical URL. The latter is more theoretically pure, but it may be inconvenient for users. Or it may be more useful for users, who knows!
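Here is a rough Python sketch of the optional-prefix variant with the redirect behaviour. The lookup tables, the `handle_get` name and the choice of 302 as the redirect status are assumptions; the answer above does not prescribe a particular status code.

```python
# Hypothetical lookup tables standing in for the database.
ORDERS_BY_ID = {1234: {"id": 1234, "number": 367652}}
ID_BY_NUMBER = {367652: 1234}

PREFIX = "/api/orders/"


def handle_get(path):
    # Returns (status, headers, body) for GET /api/orders/{segment}.
    if not path.startswith(PREFIX):
        return 404, {}, None
    segment = path[len(PREFIX):]
    kind, sep, value = segment.partition(":")
    canonical = not sep  # no "id:" / "number:" prefix was given
    if canonical:
        kind, value = "id", segment
    if not (value.isascii() and value.isdigit()):
        return 404, {}, None
    if kind == "id":
        order_id = int(value)
    elif kind == "number":
        order_id = ID_BY_NUMBER.get(int(value))
    else:
        return 404, {}, None
    if order_id not in ORDERS_BY_ID:
        return 404, {}, None
    if canonical:
        return 200, {}, ORDERS_BY_ID[order_id]
    # Non-canonical URL: redirect to the canonical one (assumed status 302).
    return 302, {"Location": f"{PREFIX}{order_id}"}, None
```

Returning the entity directly for non-canonical URLs would only need the last two branches merged into one 200 response.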
Another way to approach this is to model an order number as its own thing:
GET /api/ordernumbers/{orderNumber}
This could return a small object with just the ID, which users could then use to retrieve the entity. Or even just redirect to the order.
If you also want a general search resource, then that can also be used here:
GET /api/orders?number={orderNumber}
In my case, I don't want such a resource (yet), and I would be uncomfortable adding what appears to be a general search resource that only supports one field.
So basically, you want to treat all ids and order numbers as unique identifiers for the order records. The thing about unique identifiers is, of course, they have to be unique! But your ids and order numbers are all numeric; do their ranges overlap? If, say, "1234" could be either an id or an order number, then obviously /api/orders/1234 is not going to reference a unique order.
If the ranges don't overlap, then you just need discriminator logic in the handler code for /api/orders/{id} that can tell an id from an order number. This could actually work if, say, your order numbers have more digits than your ids ever will. But I expect you would have done this already if you could.
If the ranges might overlap, then you must at least force the references to them to have unique ranges. The simplest way would be to add a prefix when referring to an order number, e.g. the prefix "N". So that if the order with id 1234 has order number 367652, it could be retrieved with either of these calls:
/api/orders/1234
/api/orders/N367652
But then, either the database must change to include the "N" prefix in all order numbers (you say this is not possible) or else the handler code would have to strip off the "N" prefix before converting to int. In that case, the "N" prefix should only be used in the API calls - user facing data-entry forms should not expose it! You can't have a "lookup by any identifier" field where users can enter either id or order number (this would have a non-uniqueness problem anyway.) Instead, you must have separate "lookup by id" and "lookup by order number" options. Then, you should be able to have the order number input handler automatically add the "N" prefix before submitting to the API.
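As a sketch of the handler-side variant (the database keeps plain integers and only the API uses the "N" prefix), something like the following Python could work. The `ORDERS` rows and both function names are hypothetical.

```python
# Hypothetical rows: the database keeps both fields as plain integers.
ORDERS = [
    {"id": 1234, "ordernumber": 367652},
    {"id": 367652, "ordernumber": 1234},  # overlapping ranges on purpose
]


def find_order(identifier):
    # Handler logic for GET /api/orders/{identifier}.
    if identifier[:1] == "N":
        # Strip the API-only prefix before converting to int.
        field, digits = "ordernumber", identifier[1:]
    else:
        field, digits = "id", identifier
    if not (digits.isascii() and digits.isdigit()):
        return None
    value = int(digits)
    return next((order for order in ORDERS if order[field] == value), None)


def order_number_form_to_path(user_input):
    # The "lookup by order number" form adds the prefix itself,
    # so users never have to type it.
    return f"/api/orders/N{user_input.strip()}"
```

With the prefix in place, `1234` and `N1234` resolve to different orders even though the underlying integers collide.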
Fundamentally, this is a problem with the database design - if this (using values from both fields as "unique identifiers") was a requirement, then the database fields should have been designed with this in mind (i.e. with non-overlapping ranges) - if you can't change the order number format, then the id format should have been different.

Apigility GET collection returns only 10 results when content negotiation is set to JSON

This issue has been bugging me for some time now. To test it I just installed a fresh Apigility, set up the db (PDO:mysql) and added a DB-Connected service. In the table I have 40+ records. When I make a GET collection request the response looks OK (with the default HAL content negotiation). Then I change the content negotiation to JSON. Now when I make a GET collection request my response contains only 10 elements.
So my question is: where do I set/change this limit?
You can set the page size manually, like so:
$paginator = $this->getAlbumTable()->fetchAll(true);
// set the current page to what has been passed in query string, or to 1 if none set
$paginator->setCurrentPageNumber((int) $this->params()->fromQuery('page', 1));
// set the number of items per page to 10
$paginator->setItemCountPerPage(10);
http://framework.zend.com/manual/current/en/tutorials/tutorial.pagination.html
Could you please post the page_size and total_items part at the end of the JSON output?
It looks like:
"page_count": 140002,
"page_size": 25,
"total_items": 3500035,
"page": 1
This is not an ideal fix, because it requires you to go into the source code rather than using the page size given in the UI.
The collection class that is auto-generated for you by the DB-Connected style derives from Zend\Paginator\Paginator. This class defines the $defaultItemCountPerPage static protected member, which defaults to 10. That's why you're only getting 10 results. If you open up the auto-generated collection class for your entity and add: protected static $defaultItemCountPerPage = 100; in the otherwise empty class, you will see that you now get up to 100 results in the response. You can look at other Paginator class variables and methods that you could override in your derived class to get your desired behavior.
This is not an ideal solution. I'd prefer that the generated code automatically used the same configured page size that the HalJson strategy uses. Maybe I'll contribute a PR to change that. Or maybe I'll just use the HalJson approach, which does seem like the better way to go. You should have some limit on how much data you load from the DB at a time, so that you don't end up with an overly long-running query or an overly large collection of data to deal with. And, whatever limit you set, what do you do when you hit that limit? With the simple Json method, you can't ever get "page 2" of the data. So, if you are going to work with a sizeable amount of data, it might be better to use HalJson and then have some logic on the client side to grab pages of data as needed. The returned JSON structure is a little more complicated, but not terribly so.
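The client-side paging logic could look roughly like this in Python. It is only a sketch: it assumes the HAL response carries the items under `_embedded` and a `_links.next.href` entry while more pages exist, and it takes the HTTP call as a `fetch` function so no particular HTTP library is implied. The `albums` collection name, the `/albums?page=N` URLs and the `make_fake_fetch` helper are made up for the demonstration.

```python
def iter_collection(fetch, first_url, name):
    # Yield every item of a paged HAL collection, one page at a time.
    url = first_url
    while url:
        page = fetch(url)
        yield from page["_embedded"][name]
        # Follow the "next" link if the server provided one.
        url = page.get("_links", {}).get("next", {}).get("href")


def make_fake_fetch(items, page_size):
    # Hypothetical stand-in for the server, for trying the loop offline.
    def fetch(url):
        page = int(url.rsplit("page=", 1)[1])
        start = (page - 1) * page_size
        body = {
            "_links": {"self": {"href": url}},
            "_embedded": {"albums": items[start:start + page_size]},
            "page": page,
            "page_size": page_size,
            "total_items": len(items),
        }
        if start + page_size < len(items):
            body["_links"]["next"] = {"href": f"/albums?page={page + 1}"}
        return body
    return fetch
```

Because `iter_collection` is a generator, a caller that stops early never requests the remaining pages.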
I'm probably in the same spot you are -- I'm trying to do a simple little api to play with while keeping everything simple and so I didn't want the client to have to deal with the other stuff in HalJson, but probably better to deal with that complexity and have a smooth way to page through data if you're going to use this with some real set of data. At least, that's the pep talk I'm giving myself right now. :-)

Converting a mock to JSON in Spock

One of the objects I've mocked must be converted into JSON, but Spock does not seem to support the mocking of conversions. How can I choose which JSON will be returned?
Example of what I would like to achieve:
def "convert as JSON"()
{
when:
def product = Mock(Product)
println(product as JSON)
then:
1* (product as JSON) << (["message": "message"] as JSON)
}
This does not work however.
EDIT: Mocking the way the object is converted into JSON is useful because what I want to achieve is to test a method of another class that takes a product as an argument and uses it, calling "as JSON" on the product during its execution. Since the products can be complex and have lots of dependencies and fields, I prefer to mock them. Spock then gives control over the output of the mocked product's methods, but it gets trickier when conversion is needed...
In your test, you're trying to reduce the complexity of an object (Product) to make your tests more simple. This is dangerous for two reasons:
Complicated tests are a code smell. They tell you "something is wrong". Trying to apply lots of deodorant on the smell will make things worse.
You're testing scenarios which can't happen in production.
The clean/better solution would be to refactor Product until it can be created easily and you don't need to mock it anymore. From what I know about your specific case, Product is a data object (like Integer, Long, BigDecimal). It just encodes state without much functionality of its own.
If that's true, it should be simple to create test cases without mocking. If you need mocking for data objects, then something is wrong with your code. Mocking is only needed for things like services - code which acts upon data objects and which has external dependencies which you need to cut for a test.
The second argument is that you're writing tests that pass but which don't tell a story. It's a complex form of having 10'000 tests that only contain assertTrue(true);. While it's a nice thing to have in terms of test count, it doesn't give you a single advantage over not having them at all.
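To illustrate the "make the data object easy to create" idea, here is a small analogue in Python rather than Spock/Groovy. The `Product` fields, their defaults and the `render` function are invented for the example; the point is only that a plain data object with sensible defaults can be built in one line and serialized for real, with no mock involved.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Product:
    # Hypothetical data object: every field has a default, so a test
    # only spells out the values it actually cares about.
    name: str = "sample product"
    price_cents: int = 100
    tags: list = field(default_factory=list)


def render(product):
    # Hypothetical method under test: converts the product to JSON
    # as part of its work.
    return json.dumps(asdict(product), sort_keys=True)
```

A test then reads `render(Product(name="x"))` and checks the real JSON that production code would see.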

Is it OK to have multiple assertions in a unit test when testing complex behavior?

Here is my specific scenario.
I have a class QueryQueue that wraps the QueryTask class within the ArcGIS API for Flex. This enables me to easily queue up multiple query tasks for execution. Calling QueryQueue.execute() iterates through all the tasks in my queue and calls their execute method.
When all the results have been received and processed QueryQueue will dispatch the completed event. The interface to my class is very simple.
public interface IQueryQueue
{
function get inProgress():Boolean;
function get count():int;
function get completed():ISignal;
function get canceled():ISignal;
function add(query:Query, url:String, token:Object = null):void;
function cancel():void;
function execute():void;
}
For the QueryQueue.execute method to be considered successful several things must occur.
1. task.execute must be called on each query task once and only once
2. inProgress = true while the results are pending
3. inProgress = false when the results have been processed
4. completed is dispatched when the results have been processed
5. canceled is never called
6. The processing done within the queue correctly processes and packages the query results
What I am struggling with is breaking these tests into readable, logical, and maintainable tests.
Logically I am testing one state, that is the successful execution state. This would suggest one unit test that would assert #1 through #6 above are true.
[Test] public function mustReturnQueryQueueEventArgsWithResultsAndNoErrorsWhenAllQueriesAreSuccessful():void
However, the name of the test is not informative as it does not describe all the things that must be true in order to be considered a passing test.
Reading up online (including here and at programmers.stackexchange.com), there is a sizable camp that asserts that unit tests should only have one assertion (as a guideline). As a result, when a test fails you know exactly what failed (i.e. inProgress not set to true, completed dispatched multiple times, etc.). You wind up with potentially a lot more (but in theory simpler and clearer) tests, like so:
[Test] public function mustInvokeExecuteForEachQueryTaskWhenQueueIsNotEmpty():void
[Test] public function mustBeInProgressWhenResultsArePending():void
[Test] public function mustNotInProgressWhenResultsAreProcessedAndSent():void
[Test] public function mustDispatchTheCompletedEventWhenAllResultsProcessed():void
[Test] public function mustNeverDispatchTheCanceledEventWhenNotCanceled():void
[Test] public function mustReturnQueryQueueEventArgsWithResultsAndNoErrorsWhenAllQueriesAreSuccessful():void
// ... and so on
This could wind up with a lot of repeated code in the tests, but that could be minimized with appropriate setup and teardown methods.
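To show what the shared setup might look like, here is a sketch using Python's unittest instead of FlexUnit. The `QueryQueue` here is a simplified synchronous stand-in written for the example, not the real class, and `FakeTask` is a hand-rolled test double.

```python
import unittest


class FakeTask:
    # Test double that records how often execute() was called.
    def __init__(self, result):
        self.result = result
        self.execute_calls = 0

    def execute(self):
        self.execute_calls += 1
        return self.result


class QueryQueue:
    # Simplified, synchronous stand-in for the real queue.
    def __init__(self):
        self._tasks = []
        self.in_progress = False
        self.completed_events = []
        self.canceled_events = []

    def add(self, task):
        self._tasks.append(task)

    def execute(self):
        self.in_progress = True
        results = [task.execute() for task in self._tasks]
        self.in_progress = False
        self.completed_events.append(results)


class SuccessfulExecutionTests(unittest.TestCase):
    def setUp(self):
        # The arrange/act steps are written once and shared by every test.
        self.tasks = [FakeTask("a"), FakeTask("b")]
        self.queue = QueryQueue()
        for task in self.tasks:
            self.queue.add(task)
        self.queue.execute()

    def test_must_invoke_execute_once_per_task(self):
        self.assertEqual([t.execute_calls for t in self.tasks], [1, 1])

    def test_must_not_be_in_progress_when_results_are_processed(self):
        self.assertFalse(self.queue.in_progress)

    def test_must_dispatch_completed_once(self):
        self.assertEqual(len(self.queue.completed_events), 1)

    def test_must_never_dispatch_canceled(self):
        self.assertEqual(self.queue.canceled_events, [])

    def test_must_return_results_of_all_queries(self):
        self.assertEqual(self.queue.completed_events[0], ["a", "b"])
```

Each test body is a single assertion, and the repeated code lives only in `setUp`. The "in progress while pending" case is left out because this stand-in runs synchronously.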
While this question is similar to other questions I am looking for an answer for this specific scenario as I think it is a good representation of a complex unit testing scenario exhibiting multiple states and behaviors that need to be verified. Many of the other questions have, unfortunately, no examples or the examples do not demonstrate complex state and behavior.
In my opinion, and there will probably be many, there are a couple of things here:
If you must test so many things for one method, then it could mean your code might be doing too much in one single method (Single Responsibility Principle)
If you disagree with the above, then the next thing I would say is that what you are describing is more of an integration/acceptance test, which allows for multiple asserts, so you have no problem there. But keep in mind that such tests might need to be relegated to a separate section of your test suite if you are running automated tests (safe versus unsafe tests).
And/or, yes, the preferred method is to test each piece separately, as that is what a unit test is. The closest thing I can suggest (and this depends on your tolerance for writing code just to have perfect tests) is to check an object against an object, so you would do one assert that essentially tests all of this at once. However, the argument against this is that, while it satisfies the one-assert-per-test rule, you still lose expressiveness.
Ultimately, your goal should be to strive towards the ideal (one assert per unit test) by focusing on the SOLID principles, but you also need to get things done, or else there is no real point in writing software (my opinion at least :)).
Let's focus on the tests you have identified first. All except the last one (mustReturnQueryQueueEventArgs...) are good ones, and I could immediately tell what's being tested there (which is a very good sign, indicating they're descriptive and most likely simple).
The only problem is your last test. Note that extensive use of the words "and", "with", "or" in a test name usually rings alarm bells. It's not very clear what the test is supposed to do. "Return correct results" comes to mind first, but one might argue that's a vague term. It is vague. However, you'll often find that this is indeed a pretty common requirement, described in detail by the method/operation contract.
In your particular case, I'd simplify the last test to verify only that correct results are returned. You have already tested the states, events and other things that lead to building the results, so there is no need to do that again.
Now, the advice in the links you provided is actually quite good, and generally I suggest sticking to it (a single assertion per test). The question is, what does "single assertion" really stand for? One line of code at the end of the test? Let's consider this simple example:
// a method which updates two fields of our custom entity, MyEntity
public void Update(MyEntity entity)
{
entity.Name = "some name";
entity.Value = "some value";
}
This method's contract is to perform those 2 operations. By success, we understand that the entity is correctly updated. If one of them fails for some reason, the method as a unit is considered to fail. You can see where this is going; you'll either have two assertions or write a custom comparer purely for testing purposes.
Don't be tricked by "single assertion"; it's not about lines of code or the number of asserts (although in the majority of tests you'll write, this will indeed map 1:1), but about asserting a single unit (in the example above, the update is considered to be a unit). And a unit might in reality be multiple things that don't make any sense without each other.
And this is exactly what one of the questions you linked quotes (by Roy Osherove):
My guideline is usually that you test one logical CONCEPT per test. you can have multiple asserts on the same object. they will usually be the same concept being tested.
It's all about concept/responsibility; not the number of asserts.
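The Update example can be written both ways. The sketch below uses Python's unittest; the dataclass-generated equality plays the role of the "custom comparer", and the class and test names are made up.

```python
import unittest
from dataclasses import dataclass


@dataclass
class MyEntity:
    name: str = ""
    value: str = ""


def update(entity):
    # Python counterpart of the Update method above.
    entity.name = "some name"
    entity.value = "some value"


class UpdateTests(unittest.TestCase):
    def test_update_sets_both_fields_with_two_asserts(self):
        entity = MyEntity()
        update(entity)
        # Two asserts, one concept: "the entity was updated".
        self.assertEqual(entity.name, "some name")
        self.assertEqual(entity.value, "some value")

    def test_update_sets_both_fields_with_one_comparison(self):
        entity = MyEntity()
        update(entity)
        # One assert; @dataclass supplies the field-by-field comparison.
        self.assertEqual(entity, MyEntity("some name", "some value"))
```

Both tests verify the same single concept; the first simply reports which field was wrong more directly when it fails.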
I am not familiar with Flex, but I have good experience with unit testing. Unit testing is a philosophy. To answer your question first: yes, you can use multiple asserts, as long as they test the same behavior. The main point in unit testing is always to keep the test code simple and maintainable; otherwise the unit test will need a unit test to test it! So my advice is: if you are new to unit testing, don't use multiple asserts; once you have good experience with unit testing, you will know when you need them.