Iterating/Paging REST API with Biztalk Send Port- Best Practice? - json

There is a REST API I need to pull from and each response has a 100 record limit (300k+ total records). I'm trying to think about what the best practice might be with paging/iterating via BizTalk adapter(s). FYI, this is complicated by the fact that I have 10 or so endpoints I need to page through and have to use a custom pipeline (for each to convert to XML and specify a namespace since a JSON response is the only type available).
The main question for me is how to manage the paging and multiple endpoints efficiently. The page number and offset are in the JSON response so my first guess is I'd have to build an orchestration to analyze the response and create the new request from it. I know there are a lot of ways to do this so I'm curious what best practice dictates and what the most efficient way would be.
Can I somehow get away with NOT using an orchestration?
Can/should I make use of a dynamic send port?

Related

Is it OK to just use POST method and JSON format for a REST-like API in Scala/Play

We decided to use POST method and JSON format for all of our internal APIs which makes everything simpler. But then we realized that this is not truly RESTful. More over it seems that GET requests are more lightweight than POSTs under high load.
We have a problem regarding GET methods. We have to bind our criteria object to the HTTP request (query string) which forces us to build Form object for each criteria model. As you know building the Form object will be done manually and there is no automation available like what we have for JSON formatters (Macro Inception).
Another issue is that we have to decide on whether to use route parameters or querystring.
I think it's simpler to use a single HTTP method and make all API calls uniform. Does it make sense?
POST is the method to be used for any operation that isn't standardized by the HTTP protocol, and simple retrieval is standardized in the GET method. So, using POST for simple retrieval isn't RESTful. More than that, it seems like you want to use POST so you can treat querystring parameters in the same way as the POST payload, but REST URIs are atomic identifiers, including the querystring. Your application shouldn't rely on URI semantics, and extracting bits of information that serve any purpose other than identification also doesn't make much sense in REST.
Frankly, from what you describe your API is so far from being considered truly RESTful that this shouldn't be a concern at all. Do whatever is more consistent with your tools and works better for your application. REST isn't for everyone, and worrying about designing an API that's truly RESTful when that isn't a requirement for your application is more likely to lead to bad design choices.
There's absolutely nothing wrong with using POST like you're describing. In fact, GET requests should not alter the state of the server but instead should only be used for retrieval. In other words, if you're sending data to the server to, for instance, create an entity, using GET would be technically incorrect.
There's nothing you're describing that sounds "not RESTful." POST can definitely be part of a RESTful architecture.
That said, the HTTP method you use should correspond to the action it will perform. For example, if you're retrieving an entity by ID, you should use GET whereas if you're updating an entity by ID, you should use POST or PUT. This gives developers using the API a hint as to the side effects and intended usage of the various API methods.

How to define and store data structure properties in a database

I'm working on a PHP web app with a Postgres backend. The app uses a variety of APIs and want to be able to add/edit the API endpoints used by the system dynamically.
I'm planning to handle variations in the API request URLs with replacement codes, for example: http://api.com/?key=%%api_key%%&user_id=%%user_id%%
The part I don't have a plan for is how to define and store the "shape" of the returned API data. For example, let's say I want to get a user's comments from different APIs. The structure of the data will likely differ from one to another. Even if they are all json data (vs. XML), the property(s) I care about will be located in different places. Is there an established way to do this?
I'm considering a text field with a json "map" to the location of the properties:
{
"user": {
"comments" : %%HERE%%
}
}
Presumably my app would parse this, and loop through it to find the indicated location and then use it to find the data in the corresponding location in the response data. But I'm not exactly how to do it or if this is even the best way. Any suggestions are welcome.
Thinking this through a bit more, I realize that an alternative approach would be to store some kind of algorithm to finding the data. Is there a precedent for this? I briefly considered the idea of storing raw PHP code that could be executed to parse the data, but this feels very wrong and potentially dangerous/insecure.
JOLT may be helpful. It's for transforming JSON to JSON, much like XSLT for XML. You could write a spec for each new api, which would transform the data into a uniform format for your app to read.

twisted - transfer data using json

I need to transfer data (objects) between client and server, and Twisted seems a good way to accomplish this. I've been doing a lot searching but still haven't found any example to understand the basic principle. So any simple code would help.
Thanks!
EDIT
Both client and server are written in python
The data may be large, so I need a fast, reliable transmission ( I've taken a look at producers, is that good?)
Flask is great, but I am using another framework, so the whole networking thing relies on Twisted.
It's hard to tell if your question is more about json, python or twisted, but here's an overview, more can follow once the specifics are known. Perhaps you could add some more info to your question so we can offer more assistance :-)
re Json: Json is just a string with a defined structure. If you are working in python and have an object to send as json, then you need to convert the object to a json string by use of
import json
json.dumps(objectName)
If your client is javascript then instead of json.dumps you might use JSON.stringify(objectname).
If you intend to use javascript for clients then some of the frameworks like jQuery make it very easy.
Pythons json.dumps has a lot of optional arguments, most of which you won't need. You can see the options at https://docs.python.org/2/library/json.html
Python is python, I assume you know how to create and populate objects. Will your client be python or javascript or something else? From a javascript client to a python server you would most likely use Ajax to send requests and get responses.
Twisted allows you to easily create a server that will listen on a given port and, when data arrives, an event will occur that supplies the data received. You can then do whatever you need to with the data. Just be careful about doing blocking things like database inserts since the server may miss some data or otherwise misbehave if you interrupt it's event loop. Twisted can be difficult to learn initially, but it is a very powerful and reliable system that is well proven. One alternative to consider, particularly if your clients are not python, is node.js. In my opinion, node is a little bit easier to grasp initially and there are thousands of add-on modules that let you do almost anything you'd want. I use both twisted and node for different things.
Neither node.js nor twisted are software that you can use to just quickly spin up a server or client without some study and experimentation. To use Twisted or Node.js properly confidently, using all their features and goodness, requires a bit of research and work on your part.
There are excellent frameworks like Flask that can be used to build a server that can react to a number of different Ajax calls from a client - you can have a single server be able to respond to several different kinds of requests instead of having a server for each Ajax type.
This is a small library that serializes an object with all its children to JSON and also parses it back to a fully working object:
https://github.com/Toubs/PyJSONSerialization/

Why use XML(SOAP) when JSON so simple and easy to handle?

Receiving and sending data with JSON is done with simple HTTP requests. Whereas in SOAP, we need to take care of a lot of things. Parsing XML is also, sometimes, hard. Even Facebook uses JSON in Graph API. I still wonder why one should still use SOAP? Is there any reason or area where SOAP is still a better option? (Despite the data format)
Also, in simple client-server apps (like Mobile apps connected with a server), can SOAP give any advantage over JSON?
I will be very thankful if someone can enlist the major/prominent differences between JSON and SOAP considering the information I have provided(If there are any).
I found the following on advantages of SOAP:
There is one big reason everyone sticks with SOAP instead of using JSON. With every JSON setup, you're always coming up with your own data structure for each project. I don't mean how the data is encoded and passed, but how the data formatted format is defined, the data model.
SOAP has an industry-mature way of specifying that data will be in a certain format: e.g. "Cart is a collection of Products and each Product can have these attributes, etc." A well put together WSDL document really has this nailed. See W3C specification: Web Services Description Language
JSON has similar ways of specifying this data structure — a JavaScript class comes to mind as the most common way of doing this — but a JavaScript class isn't really a data structure used for this purpose in any kind of agnostic, well established, widely used way.
In short, SOAP has a way of specifying the data structure in a maturely formatted document (WSDL). JSON doesn't have a standard way of doing this.
If you are creating a client application and your server implementation is done with SOAP then you have to use SOAP in client side.
Also, see: Why use SOAP over JSON and custom data format in an “ENTERPRISE” application? [closed]
Nowadays SOAP is a complete overkill, IMHO. It was nice to use it, nice to learn it, and it is beautiful we can use JSON now.
The only difference between SOAP and REST services (no matter whether using JSON) is that SOAP WS always has it's own WSDL document that could be easily transformed into a self-descriptive documentation while within REST you have to write the documentation for yourself (at least to document the data structures). Here are my cons'&'pros for both:
REST
Pros
lightweight (in all means: no server- nor client-side extensions needed, no big chunks of XML are needed to be transfered here and there)
free choice of the data format - it's up on you to decide whether you can use plain TXT, JSON, XML, or even create you own format of data
most of the current data formats (and even if used XML) ensures that only the really required amount of data is transfered over HTTP while with SOAP for 5 bytes of data you need 1 kB of XML junk (exaggerated, ofc, but you got the point)
Cons
even there are tools that could generate the documentation from docblock comments there is need to write such comments in very descriptive way if one wants to achieve a good documentation as well
SOAP
Pros
has a WSDL that could be generated from even basic docblock comments (in many languages even without them) that works well as a documentation
even there are tools that could work with WSDL to give an enhanced try this request interface (while I do not know about any such tool for REST)
strict data structure
Cons
strict data structure
uses an XML (only!) for data transfers while each request contains a lot of junk and the response contains five times more junk of information
the need for external libraries (for client and/or server, though nowadays there are such libraries already a native part of many languages yet people always tend to use some third-party ones)
To conclude, I do not see a big reason to prefer SOAP over REST (and JSON). Both can do the same, there is a native support for JSON encoding and decoding in almost every popular web programming language and with JSON you have more freedom and the HTTP transfers are cleansed from lot of useless information junk. If I were to build any API now I would use REST with JSON.
I disagree a bit on the trend of JSON I see here. Although JSON is an order maginitude easier, I'd venture to say it's quite limited. For example, SOAP WS is not the last thing. Indeed, between soap client/server you now have enterprise services bus, authentification scheme based on crypto, user management, timestamping requests/replies, etc. For all of this, there're some huge software platforms that provide services around SOAP (well, "web services") and will inject stuff in your XML. So although JSON is probably enough for small projects and an order of magnitude easier there, I think it becomes quite limited if you have decoupled transmission control and content (ie. you develop the content stuff, the actual server, but all the transmission is managed by another team, the authentification by one more team, deployment by yet another team). I don't know if my experience at a big corp is relevant, but I'd say that JSON won't survive there. There are too many constraints on top of the basic need of data representation. So the problem is not JSON RPC itself, the problem is it misses the additional tools to manage the complexity that arises in complex applications (not to say that what you do is not complex, it's just that the software reflects the complexity of the company that produces it)
I think there is a lot of basic misinformation on this thread. SOAP, REST, XML, and JSON concepts seem to be mixed up in the responses.
Here is some clarification -
XML and JSON (an others) are encodings of information.
SOAP is a communications protocol
REST is an (Architecture) style
each is used for something different although you might use more than one of these things together.
Lets start with encoding data structures as XML vs JSON:
Everything JSON currently supports can be done in XML, but not the other way around. JSON will eventually adopt all the features that XML has, but its proponents haven't encountered all of the problems yet, once they get more experience things will be added on to close the gap. for example JSON didn't start out with Schemas and binary formats.
SOAP is a communication protocol for calling an operation. It runs on top of things like, HTTP, SMTP, etc. Aside from many other features, SOAP messages can span multiple "application" layer protocols. i.e. i can sent a SOAP message by HTTP to a service endpoint which then puts it on a message queue for another system. SOAP solves the problem of maintaining authentication, message authenticity, etc. as the requested moved between different parts of a distributed system.
JSON and other data formats canbe sent via SOAP. I work with some systems that sent binary fixed-width encoded objects via SOAP, its not a problem.
The analogy is that - if only the postman is allowed to send you a letter, then it is just HTTP, but if anyone can send you a letter, then you want SOAP. (i.e. message transport security vs message content security)
the 6 REST constraints are architectural style. Interestingly the first several years of REST the examples were in SOAP. (there is no such thing as REST or SOAP they are not opposites)
A "heavyweight bloated, etc.etc." SOA SOAP system might have monoliths with operations like GET, PUT, POST instances of a single entity. SOAP doesn't have those operations predefined, but that is typically how it is used.
Consider that if you built a "REST" service on HTTP alone with an SSL/TLS terminating proxy, then you may have violated the 4th constraint of REST.
So for your software development today, you wouldn't normally interact with any of these directly. Just as if you were written a graphics program you wouldn't directly work with HDMI vs. DisplayPort typically.
The question is do you understand architecturally what your system needs to do and configure it to use the mechanism that does that job. (for example, all the challenges of applying today's microservices to general systems are old problems previously solved by SOAP, CORBA and the old protocols)
I have spent several years writing SOAP web services (with JAX WS). They are not hard to write. And I love the idea of a single endpoint and single HTTP method (POST). For me, REST is too verbose.
But as a data container, JSON is simpler, smaller, more readable, more flexible, looks closer to programming languages.
So, I reinvented the wheel and created my own approach to writing backends for AJAX requests. In comparison:
REST:
get user: method GET https://example.com/users/{id}
update user: method POST https://example.com/users/ (JSON with User object in request body)
RPC:
get user: method GET https://example.com/getUser?id=1
update user: method POST https://example.com/updateUser (JSON with User object in the request body)
My way (the proposed name is JOH - JSON over HTTP):
get user: method POST https://example.com/ (JSON specifies both user ID and class/method responsible for handling request)
update user: method POST https://example.com/ (JSON specifies both user object and class/method responsible for handling request)

ajax html vs xml/json responses - performance or other reasons

I've got a fairly ajax heavy site and some 3k html formatted pages are inserted into the DOM from ajax requests.
What I have been doing is taking the html responses and just inserting the whole thing using jQuery.
My other option is to output in xml (or possibly json) and then parse the document and insert it into the page.
I've noticed it seems that most larger site do things the json/xml way. Google Mail returns xml rather than formatted html.
Is this due to performance? or is there another reason to use xml/json vs just retrieving html?
From a javascript standpoint, it would seem injecting direct html is simplest. In jQuery I just do this
jQuery.ajax({
type: "POST",
url: "getpage.php",
data: requestData,
success: function(response) {
jQuery('div#putItHear').html(response);
}
with an xml/json response I would have to do
jQuery.ajax({
type: "POST",
url: "getpage.php",
data: requestData,
success: function(xml) {
$("message",xml).each(function(id) {
message = $("message",xml).get(id);
$("#messagewindow").prepend("<b>" + $("author",message).text() +
"</b>: " + $("text",message).text() +
"<br />");
});
}
});
clearly not as efficient from a code standpoint, and I can't expect that it is better browser performance, so why do things the second way?
Returning JSON/XML gives the application more freedom compared to returning HTML, and requires less specific knowledge in different fields (data vs markup).
Since the data is still just data, you leave the choice of how to display it to the client side of things. This allows a lot of the code to be executed on the client side instead of on the server - the server side needs to know only about data structures and nothing about markup. All the programmer needs to know is how to deliver data structures.
The client implementation only needs to know about how to display the data structures returned by the server, and doesn't need to worry about how these structures actually get build. All the programmer needs to know is how to display data structures.
If another client is to be build (that doesn't use HTML as a markup language), all the server components can be reused. The same goes for building another server implementation.
It will normally reduce the amount of data transferred and therefore improve transfer speed. As anything over-the-wire is normally the bottleneck in a process reducing the transfer time will reduce the total time taken to perform the process, improving user experience.
Here are a few pros for sending JSON/XML instead of HTML:
If the data is going to ever be used outside of your application HTML might be harder to parse and fit into other structure
JSON can be directly embedded in script tags which allows cross domain AJAX scenarios
JSON/XML preserves the separation of concerns between the server side scripts and views
Reduces bandwidth
You should check out Pure, a templating tool
to generate HTML from JSON data.
Generally JSON is a more efficient way to retrieve data via ajax as the same data in XML is a lot larger. JSON is also more easily consumed by your client side Javascript. However, if you're retrieving pure HTML content I would likely do as you suggest. Although, If you really needed to, you could embed your HTML content within a JSON string and get the best of both worlds
I'm currently wrestling with this decision too and it didn't quite click until I saw how Darin boiled it down:
"If the data is going to ever be used outside of your application HTML might be harder to parse and fit into other structure"
I think a lot of it is where/how the data is going. If it's a one-off application that doesn't need to share/send data anywhere else, then spitting back pure HTML is fine, even if it does weigh more.
Personally, if there is complex HTML to be wrapped around the data, I just spit back the HTML and drop it in. jQuery is sweet and all, but building HTML with Javascript is often a pain. But it's a balance game.
In some cases, AJAX responses need to return more information than just the HTML to be displayed. For example, let's say you are returning a list of the first twenty items from a search. You may need to return the total number of search results to be displayed somewhere else in the DOM. You could try piggybacking the total count in a hidden div, but that can get messy. With JSON, the total count can simply be a field value a structured JSON response.
To me it boils down to this:
It's for many of us, much less work to use a server side, mature, template engine that we're accustomed to, to generate html and send it down the pipe, than using a bunch of javascript code to generate HTML client side. Yes, there are some templating engines for javascript now which may mitigate it somewhat.
Since I already separate model, logic and views server side, there is no argument in having yet another separation. JSON is a view, HTML is another view.
And lets face it; both HTML/AJAX and JSON/AJAX are many times better than full page over the pipe.
The final thing you perhaps need to think about is; if you're going to be search engine friendly - you might have to generate the HTML server side any way (the old degrade gracefully mantra).
I usually do a combination. If there is client side logic, I use JSON - else I use HTML. Notifications and autocomplete special fields are sent via JSON.