Rest API design with multiple unique ids - json

Currently, we are developing an API for our system and there are some resources that may have different kinds of identifiers.
For example, there is a resource called orders, which may have an unique order number and also have an unique id. At the moment, we only have URLs for the id, which are these URLs:
GET /api/orders/{id}
PUT /api/orders/{id}
DELETE /api/orders/{id}
But now we need also the possibility to use order numbers, which normally would result into:
GET /api/orders/{orderNumber}
PUT /api/orders/{orderNumber}
DELETE /api/orders/{orderNumber}
Obviously that won't work, since id and orderNumber are both numbers.
I know that there are some similar questions, but they don't help me out, because the answers don't really fit or their approaches are not really restful or comprehensible (for us and for possible developers using the API). Additionally, the questions and answers are partially older than 7 years.
To name a few:
1. Using a query param
One suggests to use a query param, e.g.
GET /api/orders/?orderNumber={orderNumber}
I think, there are a lot of problems. First, this is a filter on the orders collections, so that the result should be a list as well. However, there is only one order for the unique order number which is a little bit confusing. Secondly, we use such a filter to search/filter for a subset of orders. Additionally, a query params is some kind of a second-class parameter, but should be first-class in this case. This is even a problem, if I the object does not exist. Normally a get would return a 404 (not found), but a GET /api/orders/?orderNumber=1234 would be an empty array, if the order 1234 does not exist.
2. Using a prefix
Some public APIs use some kind of a discriminator to distinguish between different types, e.g. like:
GET /api/orders/id_1234
GET /api/orders/ordernumber_367652
This works for their approach, because id_1234 and ordernumber_367652 are their real unique identifiers that are also returned by other resources. However, that would result in a response object like this:
{
"id": "id_1234",
"ordernumber": "ordernumber_367652"
//...
}
This is not very clean, because the type (id or order number) is modelled twice. And apart from the problem of changing all identifiers and response objects, this would be confusing, if you e.g. want to search for all order numbers greater than 67363 (thus, there is also a string/number clash). If the response does not add the type as a prefix, a user have to add this for some request, which would also be very confusing (sometime you have to add this and sometimes not...)
3. Using a verb
This is what e.g. Twitter does: their URL ends with show.json, so you can use it like:
GET /api/orders/show.json?id=1234
GET /api/orders/show.json?number=367652
I think, this is the most awful solution, since it is not restful. Furthermore, it has some of the problems that I mentioned in the query param approach.
4. Using a subresource
Some people suggest to model this like a subresource, e.g.:
GET /api/orders/1234
GET /api/orders/id/1234 //optional
GET /api/orders/ordernumber/367652
I like the readability of this approach, but I think the meaning of /api/orders/ordernumber/367652 would be "get (just) the order number 367652" and not the order. Finally, this breaks some best practices like using plurals and only real resources.
So finally, my questions are: Did we missed something? And are there are other approaches, because I think that this is not an unusual problem?

to me, the most RESTful way of solving your problem is using the approach number 2 with a slight modification.
From a theoretical point of view, you just have valid identification code to identify your order. At this point of the design process, it isn't important whether your identification code is an id or an order number. It's something that uniquely identify your order and that's enough.
The fact that you have an ambiguity between ids and numbers format is an issue belonging to the implementation phase, not the design phase.
So for now, what we have is:
GET /api/orders/{some_identification_code}
and this is very RESTful.
Of course you still have the problem of solving your ambiguity, so we can proceed with the implementation phase. Unfortunately your order identification_code set is made of two distinct entities that share the format. It's trivial it can't work. But now the problem is in the definition of these entity formats.
My suggestion is very simple: ids will be integers, while numbers will be codes such as N1234567. This approach will make your resource representation acceptable:
{
"id": "1234",
"ordernumber": "N367652"
//...
}
Additionally, it is common in many scenarios such as courier shipments.

Here is an alternate option that I came up with that I found slightly more palatable.
GET /api/orders/1234
GET /api/orders/1234?idType=id //optional
GET /api/orders/367652?idType=ordernumber
The reason being it keeps the pathing consistent with REST standards, and then in the service if they did pass idType=orderNumber (idType of id is the default) you can pick up on that.

I'm struggling with the same issue and haven't found a perfect solution. I ended up using this format:
GET /api/orders/{orderid}
GET /api/orders/bynumber/{orderNumber}
Not perfect, but it is readable.

I'm also struggling with this! In my case, i only really need to be able to GET using the secondary ID, which makes this a little easier.
I am leaning towards using an optional prefix to the ID:
GET /api/orders/{id}
GET /api/orders/id:{id}
GET /api/orders/number:{orderNumber}
or this could be a chance to use an obscure feature of the URI specification, path parameters, which let you attach parameters to particular path elements:
GET /api/orders/{id}
GET /api/orders/{id};id_type=id
GET /api/orders/{orderNumber};id_type=number
The URL using an unqualified ID is the canonical one. There are two options for the behaviour of non-canonical URLs: either return the entity, or redirect to the canonical URL. The latter is more theoretically pure, but it may be inconvenient for users. Or it may be more useful for users, who knows!
Another way to approach this is to model an order number as its own thing:
GET /api/ordernumbers/{orderNumber}
This could return a small object with just the ID, which users could then use to retrieve the entity. Or even just redirect to the order.
If you also want a general search resource, then that can also be used here:
GET /api/orders?number={orderNumber}
In my case, i don't want such a resource (yet), and i could be uncomfortable adding what appears to be a general search resource that only supports one field.

So basically, you want to treat all ids and order numbers as unique identifiers for the order records. The thing about unique identifiers is, of course, they have to be unique! But your ids and order numbers are all numeric; do their ranges overlap? If, say, "1234" could be either an id or an order number, then obviously /api/orders/1234 is not going to reference a unique order.
If the ranges are unique, then you just need discriminator logic in the handler code for /api/orders/{id}, that can tell an id from an order number. This could actually work, say if your order numbers have more digits than your ids ever will. But I expect you would have done this already if you could.
If the ranges might overlap, then you must at least force the references to them to have unique ranges. The simplest way would be to add a prefix when referring to an order number, e.g. the prefix "N". So that if the order with id 1234 has order number 367652, it could be retrieved with either of these calls:
/api/orders/1234
/api/orders/N367652
But then, either the database must change to include the "N" prefix in all order numbers (you say this is not possible) or else the handler code would have to strip off the "N" prefix before converting to int. In that case, the "N" prefix should only be used in the API calls - user facing data-entry forms should not expose it! You can't have a "lookup by any identifier" field where users can enter either id or order number (this would have a non-uniqueness problem anyway.) Instead, you must have separate "lookup by id" and "lookup by order number" options. Then, you should be able to have the order number input handler automatically add the "N" prefix before submitting to the API.
Fundamentally, this is a problem with the database design - if this (using values from both fields as "unique identifiers") was a requirement, then the database fields should have been designed with this in mind (i.e. with non-overlapping ranges) - if you can't change the order number format, then the id format should have been different.

Related

Terminology How to describe a query function without using cache underlying?

I have two versions of query function, one will cache the last result, another will not. I prefer make the cache-version to be default one, and let its API name without any suffix. (example: readVal())
What terms should I use to describe the non-cache-version function name?
readValFresh()?
readValVolatile()?
loadAgain()?
readValTruly()?
Is there a precise english word or terms to describe this case?

How do I fill a list with all the world's phone prefixes in Dart on Flutter?

I'd like to implement an app with Dart on Flutter. I'm on my first approach with this new language and for the first time I meet this problem.
My app must necessarily work with a mobile phone number. I would like to see a ban on the insertion of unse prefixed telephone numbers or, alternatively, the typing of a number with more digits than expected. For example, in Italy the figures after +39 (0039) are at most 10. I probably thought I'd separate the two parts to make it easier to distinguish between lengths (one field where you select the country and another that allows you to enter the number).
Is there, as you know, a JSON that contains exactly: - the prefix of each state, - the length of the telephone number (excluding prefix), - name, *flag and *sigla (Italy, green-white-red, IT)?
Sifting through the web a little bit, I saw that flutter should actually provide already in itself with .demoTextFieldEnterITPhoneNumber, through GalleryLocalizations to do such a job, but I didn't quite understand if it bothers to control a particular regular expression for each nation or not. Could I copy and paste a number for example? Will nationality be automatically recognized?
In the end I think that such a control, so deep, is not possible so I would just need this, so make two fields, one with a list, which at the choice automatically fills in depending on the selected prefix, and a field on which the user types his number: in case of copied and pasted number check if that string also contains a +prefix.
Thank you very much, I need a lot, since my app will mainly revolve around a correct value for this field. :)
Try using the international_phone_input or country_code_picker flutter package. They are quite easy to implement

REST service semantics; include properties not being updated?

Suppose I have a resource called Person. I can update Person entities by doing a POST to /data/Person/{ID}. Suppose for simplicity that a person has three properties, first name, last name, and age.
GET /data/Person/1 yields something like:
{ id: 1, firstName: "John", lastName: "Smith", age: 30 }.
My question is about updates to this person and the semantics of the services that do this. Suppose I wanted to update John, he's now 31. In terms of design approach, I've seen APIs work two ways:
Option 1:
POST /data/Person/1 with { id: 1, age: 31 } does the right thing. Implicitly, any property that isn't mentioned isn't updated.
Option 2:
POST /data/Person/1 with the full object that would have been received by GET -- all properties must be specified, even if many don't change, because the API (in the presence of a missing property) would assume that its proper value is null.
Which option is correct from a recommended design perspective? Option 1 is attractive because it's short and simple, but has the downside of being ambiguous in some cases. Option 2 has you sending a lot of data back and forth even if it's not changing, and doesn't tell the server what's really important about this payload (only the age changed).
Option 1 - updating a subset of the resource - is now formalised in HTTP as the PATCH method. Option 2 - updating the whole resource - is the PUT method.
In real-world scenarios, it's common to want to upload only a subset of the resource. This is better for performance of the request and modularity/diversity of clients.
For that reason, PATCH is now more useful than PUT in a typical API (imo), though you can support both if you want to. There are a few corner cases where a platform may not support PATCH, but I believe they are rare now.
If you do support both, don't just make them interchangeable. The difference with PUT is, if it receives a subset, it should assume the whole thing was uploaded, so should then apply default properties to those that were omitted, or return an error if they are required. Whereas PATCH would just ignore those omitted properties.

Question about grouping or separating functions / methods that are alike

I'll take a real example I have to implement in a program I'm coding:
I have a database that has the score of every game bowled in the past three years in a bowling center. With a GUI, you can choose to either search for the best score on each lane, search for the best score between two dates, for the best score for each week, etc.
I'm wondering what the best way to implement this is. Should I code something like this:
public Vector<Scores> grabMaxScores(sortType, param1, param2)
{
if(sortType.equals("By lane"))
...
else if(sortType.equals("Between given dates")
...
}
Or is it more appropriate to code different methods for each type and call the correct one in the listener?
public Vector<Scores> grabMaxScoresBetweenDates(startDate, endDate)
{
...
}
public Vector<Scores> grabMaxScoresByLane(minLane, maxLane)
{
...
}
I'm not necessarily asking for this particular problem, it's just a question I find asking myself often when I'm coding multiple methods that are alike where the principle is the same, but the parameters are different.
I can see there are good reasons to use each of them, but I want to know if there is a "more correct" or standard way of coding this.
In my personal opinion, I would prefer your second option over the first. This is because you have the opportunity to be precise about things like the types of the parameters. For example, minLane and maxLane may just be integers, but startDate and endDate could very well be Date objects. It's often nicer if you can actually specify what you expect, as it reduces the need for such things as casting and range checks, etc. Also, I would find it more readable, as the function names just say what you are trying to do.
However, I may have an alternative idea, which is kind of a variation on your first example (I actually got this inspiration from Java's Comparator, in case you're familiar with that). Rather than pass a string as the first argument, pass some sort of Selector object. Selector would be the name of a class or a interface, which would look something like so (in Java):
interface Selector {
public void select(Score next);
public Score getBest( );
}
If the select method "likes" the value of next which is given to it, it can store the value for later. If it doesn't like it, it can simply discard it, and keep whatever value it already has. After all the data is processed, the best value will be left over, and can be requested by calling getBest. Of course, you can alter the interface to suit your particular needs (e.g. it seems like you might be expecting more than one value to be retrieved. Also, generics might help a lot as well).
The reason I like this idea is that now your function is very general purpose. In order to add new functionality, you don't need to add functions, and you don't need to modify any functions you already have. Instead, the user of your code can simply define their own implementation of Selector as they see fit. This allows your code to be far more compositional, which makes it easier to use. The only inconvenience is the need to define implementations of Selector, though, you could also provide several default ones.
The approach you have used would also work. But if you want to add some new functionality like "get lowest scores on Friday evening", you will need to add one more function, which kinda not so good thing to do.
As you have already have the data in a database you can generate database queries which would fetch the required results and display. So you need not modify your code every time.

should I write more descriptive function names or add comments?

This is a language agnostic question, but I'm wandering what people prefer in terms of readability and maintainability... My hypothetical situation is that I'm writing a function which given a sequence will return a copy with all duplicate element removed and the order reversed.
/*
*This is an extremely well written function to return a sequence containing
*all the unique elements of OriginalSequence with their order reversed
*/
ReturnSequence SequenceFunction(OriginalSequence)
{...}
OR
UniqueAndReversedSequence MakeSequenceUniqueAndReversed(OriginalSequence)
{....}
The above is supposed to be a lucid example of using comments in the first instance or using very verbose function names in the second to describe the actions of the function.
Cheers,
Richard
I prefer the verbose function name as it make the call-site more readable. Of course, some function names (like your example) can get really long.
Perhaps a better name for your example function would be ReverseAndDedupe. Uh oh, now it is a little more clear that we have a function with two responsibilities*. Perhaps it would be even better to split this out into two functions: Reverse and Dedupe.
Now the call-site becomes even more readable:
Reverse(Dedupe(someSequence))
*Note: My rule of thumb is that any function that contains "and" in the name has too many responsibilities and needs to be split up in to separate functions.
Personally I prefer the second way - it's easy to see from the function name what it does - and because the code inside the function is well written anyway it'll be easy to work out exactly what happens inside it.
The problem I find with comments is they very quickly go out of date - there's no compile time check to ensure your comment is correct!
Also, you don't get access to the comment in the places where the function is actually called.
Very much a subjective question though!
Ideally you would do a combination of the two. Try to keep your method names concise but descriptive enough to get a good idea of what it's going to do. If there is any possibility of lack of clarity in the method name, you should have comments to assist the reader in the logic.
Even with descriptive names you should still be concise. I think what you have in the example is overkill. I would have written
UniqueSequence Reverse(Sequence)
I comment where there's an explanation in order that a descriptive name cannot adequately convey. If there's a peculiarity with a library that forced me to do something that appears non-standard or value in dropping a comment inline, I'll do that but otherwise I rely upon well-named methods and don't comment things a lot - except while I'm writing the code, and those are for myself. They get removed when it is done, typically.
Generally speaking, function header comments are just more lines to maintain and require the reader to look at both the comment and the code and then decide which is correct if they aren't in correspondence. Obviously the truth is always in the code. The comment may say X but comments don't compile to machine code (typically) so...
Comment when necessary and make a habit of naming things well. That's what I do.
I'd probably do one of these:
Call it ReverseAndDedupe (or DedupeAndReverse, depending which one it is -- I'd expect Dedupe alone to keep the first occurrence and discard later ones, so the two operations do not commute). All functions make some postcondition true, so Make can certainly go in order to shorten a too-long name. Functions don't generally need to be named for the types they operate on, and if they are then it should be in a consistent format. So Sequence can probably be removed from your proposed name too, or if it can't then I'd probably call it Sequence_ReverseAndDedupe.
Not create this function at all, make sure that callers can either do Reverse(Dedupe(x)) or Dedupe(Reverse(x)), depending which they actually want. It's no more code for them to write, so only an issue of whether there's some cunning optimization that only applies when you do both at once. Avoiding an intermediate copy might qualify there, but the general point is that if you can't name your function concisely, make sure there's a good reason why it's doing so many different things.
Call it ReversedAndDeduped if it returns a copy of the original sequence - this is a trick I picked up from Python, where l.sort() sorts the list l in place, and sorted(l) doesn't modify a list l at all.
Give it a name specific to the domain it's used in, rather than trying to make it so generic. Why am I deduping and reversing this list? There might be some term of art that means a list in that state, or some function which can only be performed on such a list. So I could call it 'Renuberate' (because a reversed, deduped list is known as a list "in Renuberated form", or 'MakeFrobbable' (because Frobbing requires this format).
I'd also comment it (or much better, document it), to explain what type of deduping it guarantees (if any - perhaps the implementation is left free to remove whichever dupes it likes so long as it gets them all).
I wouldn't comment it "extremely well written", although I might comment "highly optimized" to mean "this code is really hard to work with, but goes like the clappers, please don't touch it without running all the performance tests".
I don't think I'd want to go as far as 5-word function names, although I expect I have in the past.