Is there a way around a JSON block? - json

I accessed the following url last night for the first time and saw the JSON data there without a problem: http://mlb.mlb.com/ws/search/MediaSearchService?team_id=111&start=0&site=mlb&hitsPerPage=12&hitsPerSite=10&type=json&c_id=&src=vpp&sort=desc&sort_type=custom
This morning, I get the following message: "Please contact Search administrator as request is coming from invalid host/domain "
I am also unable to get a response in Yahoo Pipes.
Is there a way around what appears to be a block? Not sure what else could have happened.

Are you sure that isn't an internal API which got exposed accidentally? If it is, they have the full right to block everyone out (just tested, I get the very same message) or if it's meant for 3rd party developers but through managed API key storage, you have to obtain an API key somehow.

If there's a block, there's a reason.
I'd be very leery of circumventing this particular block - Major League Baseball has proven very litigious in the past. You don't want to get sued for a DMCA violation - they can afford much, much more expensive lawyers than you can.

Your JSON is being served over http. Since your URL worked before and is not working now, the server is most likely checking for an authentication cookie. which has probably expired. You should check to see if there is an API/way to authenticate your call with their service, otherwise even if you find a workaround there is not guarantee that it will continue to work.

Related

Why isn't it possible to download a file for status code 4XX and 5XX

I have noticed that many http clients including Firefox and Chrome don't allow file downloads for http response codes with 4XX and 5XX. However, some clients allow these downloads, like curl and wget (with --content-on-error option).
Both Chrome and Firefox don't provide nice exception messages.
Chrome fails with ERR_INVALID_RESPONSE. Firefox fails with File not found. As stated above for the curly and wget work for the same URL.
I was wondering if there is a specification that defines the correct behavior in this case? Are there good reasons why the request can't be processed by Chrome and Firefox? Also, it seems strange that they don't provide proper feedback.
I think for most cases a download for failing requests makes no sense, but for some cases it would be helpful. One good example where downloading a file even in the error case would be if there is a client that only communicates with the server using some 3rd party format. The client would have to download a generated file for the request. In the case of an error, the client should download a file containing the error description.
For example the RFC7231 states
Response messages with an error status code
usually contain a payload that represents the error condition, such
that it describes the error state and what next steps are suggested
for resolving it.
The 4xx (Client Error) class of status code indicates that the client
seems to have erred. Except when responding to a HEAD request, the
server SHOULD send a representation containing an explanation of the
error situation, and whether it is a temporary or permanent
condition. These status codes are applicable to any request method.
User agents SHOULD display any included representation to the user.
This doesn't forbid downloading in the case of an error.
Edit because of the first answer:
I don't think that this behavior is user friendly and I don't think that user friendliness is really the reason behind this. For example it would make way more sense to show the error code and error message (provided in the header) to the user. Or least indicate the error with an error message like "cannot download the file, because the server responded with an error". There might be servers that can only respond with XML or any other random file format.
What bugs me the most is that both browsers respond with different but arbitrary errors that don't hint any information about the underlying issue.
It might be that this is an undocumented edge case and both Chrome and Firefox just fall back to a default error, but this seems unlikely, especially because this is an edge case that has a special flag in wget.
4XX: Why would you assume a file download if your client did something wrong?
If we assume that an API has an endpoint that replies with a certain file format, it is fair to assume that also the error message including a hint what the client did wrong is provided in that format. So the file can help to fix the client error.
I'm not aware of any specification for that topic.
The behavior should be as user friendly as possible.
4XX:
Why would you assume a file download if your client did something wrong? Furthermore, the client software could not differ between the case of wrong usage(e.g. invalid url) and handling a file download.
5xx:
As you stated most api provide error information, but you could also not differ the case of downloading and for example an internal error providing the file.
You can use such behavior with wget and curl as you mentioned, but its not user friendly nor practical for using such an API programmatically.
The above info in mind, Chrome and firefox just try to be user friendly.
I hope I could somehow answer your question or challenge the idea behind it. :)
Looking at chromium handle download and not 2xx we see:
// The response code indicates that this is an error page, but we don't
// know how to display the content. We follow Firefox here and show our
// own error page instead of intercepting the request as a stream or a
// download.
So Chrome followed Firefox, and both are entirely consistent with the RFCs, the browser knows this payload is unidentified data relating to an error condition, so saving it as the file in question is not an option. Since it is being downloaded, presumably the browser can't display the payload, but in either case has been instructed not to, so displaying it in the error context is not a safe option. Since it is an error there is also a high likelihood that the sender has combined a partial response with an error code meaning that the payload contents may be an incomplete or corrupt representation of data from a 2xx response/etc.
If you look back at wget, --content-on-error is a specific option because it is the wrong thing to do as a general browser. A client side that works with the payload type could examine the errors when it is directly interacting with a server and wget is only providing options to help you debug such an interaction. A normal browser has less features to help emulate other clients for debugging than a text CLI, since a text CLI exists primarily to emulate some other client while debugging.
I was wondering if there is a specification that defines the correct
behavior in this case? Are there good reasons why the request can't be
processed by Chrome and Firefox? Also, it seems strange that they
don't provide proper feedback.
There is no such specification for this, but the chromium project member finds this as a trivial issue and unlikely to be fixed in near future. Instead of they fixing in the chromium they suggest that it should be fixed on the server by sending proper HTTP status.
Response from Chromium Project Member: "This issue has been Available for over a year. If it's no longer
important or seems unlikely to be fixed, please consider closing it
out. If it is important, please re-triage the issue."
Sorry for the inconvenience if the bug really should have been left as
Available.
You can check more details here Issue 479265
What's happening beneath the surface?
I further checked the source code of the chromium to find what actually happening and found that for any non 200 status for downloads, they are simply throwing ERR_INVALID_RESPONSE (Invalid Server Response) error.
To cut a long story short, you have to live with this behaviour of the browser, it is not going to be improved.
Building on #lossleader's answer, it looks like Chromium decided to follow Firefox's decision to not download files if the response was not successful.
It seems like this issue has a history. In 2005 an AOL website had an issue that returned a status code 500 and resulted in users downloading an .exe file. There was a "fix" that simply returns a 404 for responses that trigger a download and with erroneous responses. The corresponding issue can be found here.
There is an open issue from 2008, that complains about this error and states that it would is misleading. The corresponding issue can be found here.
I found a more detailed answer about this on Super User.
I still think that it would be correct to at least offer a choice to the user to download the file nevertheless or at least show a more meaningful error page. On the other hand, in most cases a download for a response code != 2XX is unintended and hints a server error. Therefore it seems that this issue has a low priority for browser vendors and seems "not worth the trouble".
These answers all seem to bypass the fundamental here: You're trying to give a browser-specific interpretation to an error in your code. From my point of view, in all associated cases, your code is failing in some manner without error handling.
4xx error? You've sent a bad request to the server, according to rules you have determined. It's not, technically, the browser's fault.
5xx error? Your server crashed and didn't throw a pretty error. On some types of server, (Django) a 500 error will be a bunch of debug information you probably shouldn't show the user.
Thus what you're asking for is strange from an architectural standpoint; you want to cover up the fact that you've screwed up by modifying the browser's response rather than fixing your code to respond appropriately.

How to find how many json endpoints an api has

I’m in the middle of making an Express app. It’s just a learning project.
I’m getting some info from an Anime api called jikan.me, it provides info about different Anime series like a picture url and synopsis.
For example one is at https://api.jikan.me/anime/16 .
Now, the jikan api might have a json endpoint at anime/1 but there's nothing at anime/2.
I want to find a list of all the numbers (https://api.jikan.me/anime/[numbers]) that actually contain endpoints.
I've tried simply going to https://api.jikan.me/anime but it returns error: No ID/Path Given.
I'm expecting there is likely no absolute answer to this problem but that I might learn something about server-side code along the way.
Where would I begin to look to find this info?
This is a bit late but, Jikan is an unofficial REST API for MyAnimeList. The IDs are respective to the IDs on MAL. For example; https://myanimelist.net/anime/1 can be parsed through https://api.jikan.moe/anime/1 but the ID 2 does not exist on MAL. It's a 404, hence that error.
To initially get some IDs, you can try the search endpoint.
Furthermore, I'll be releasing REST 2.2 quite soon (this month) which will give you the ability to parse from pages like these and thus you'll get another endpoint that provides a handful of IDs to get their data from.
Source: I'm the developer of Jikan
If it's not in the documentation it's probably information not available to you... a REST api needs to be specifically configured to offer certain endpoints, that number at the end might just be an ID that's searched for in an internal database and there's no way for the application to know if there's gonna be something there; all they can do is return an error message for you to handle as is the case here.

Non-standard JSON and Azure Logic Apps

I have an API that produces JSON like this:
)]}',
{
//JSON DATA
}
The //JSON DATA is valid JSON, but the )]}', up top is not.
When I try to GET this data via a Logic App, I get:
BadRequest. Http request failed: the content was not a valid JSON.
So, a few related questions:
1) Can I tell the logic app to return the invalid JSON anyway?
2) How can debug the issue better? I happen to know that the response is invalid, but what if I didn't? Can I see the raw data somewhere?
3) This is all done via the Azure web portal. Are there better tools? Visual Studio?
I should also mention that if I call a route on the same API that returns XML instead of JSON, then the Logic App works fine. So it definitely doesn't like the JSON response in particular.
Thanks!
First of all, please do not post three questions as a single question.
Question 1). The best thing you can do is make the API return a valid JSON object. This is good for million reasons. Here're a few:
it's pretty much a standard (either valid JSON or XML -- yeah, old school way);
therefore, no users of this API (including you) will need to struggle and guess what's going on and why;
your Logic App's step will just work without adding extra complexity;
you will make this world and your karma better.
If API-side changes are not within your reach, I don't think you can do much. If you're lucky and the HTTP action is successful (Status Code 2xx), you can try to use a Query Action with a function that truncates the first characters. It will look something like this (I don't know the exact syntax): #Substring(body('myHttpGet'), 4, length(body('myHttpGet')) - 4) where myHttpGet is the id of the Http Get action.
However, once again, if possible, I strongly recommend fixing up the API which is the root cause of the problem, instead of dealing with garbage response after that.
UPDATE Another thing you can do is wrap the dirty API. For example, you could create a trivial Azure Function that invokes the API you don't directly control, and sanitizes the response for you consumption requirements. This Azure Function function should be easy to call from the Logic App. It costs almost nothing (unless we're talking millions of requests/month). The only drawback here is the increasing latency, which may be not an issue at all -- test it and see whether it adds less than 100ms or so... Oh, and don't forget to file a ticket with the API owner, they make our world a bad place!
Question 2) In Azure Logic App web UI you can Look into the execution details and the error will definitely be there.
Question 3) You're asking for a tool recommendation which is by definition a highly subjective thing and is off-topic on StackOverflow.
TL/DR: The other app is not producing valid JSON.
Meaning, this is not a problem for you to solve. The other app has to return valid JSON if the owner claims it should.
If they cannot or will not produce valid JSON, then the first thing you need to do is inform your management that you will have to spend a lot of extra time accommodating their non-standard format.

Is there some kind of http validator?

I am writing a web service in node, and testing it with Postman. I spent a long timing looking for an error. When I finally found it, it turned out to be a simple error formatting the response body, which is json.
If I leave off the final brace in the response body, Postman waits for two minutes, and then reports that it received everything, just fine.
If I leave off the closing quote in the last value in the json, Postman says the server didn't respond, perhaps I should check my security certificates.
I would much rather Postman said "Hey, Buddy, you left off a quote!"
If there some validation service I can talk to? Or a plugin in Postman?
Here there are some validation javascript libraries, you can use:
Validator provides a declarative way of validating javascript objects.
Express-validator acts as an express.js middleware for node-validator.
Meanwhile, Postman got API testing and Collection Runner that can help you through this; which you can write some pre-request script as well as test script for each request.
Also, they got Newman which is a command-line collection runner. It allows you to effortlessly run and test a Postman collection directly from the command-line. It is built with extensibility in mind so that you can easily integrate it with your continuous integration servers and build systems.
I found that Paw worked (https://paw.cloud/). And so far I haven't paid for it.
Where Postman said "check your security certificates," Paw said "we were expecting 376 bytes but you only sent us 312."
Cuts down my time solving the problem a lot!
I use Fiddler for this. It is very good at identifying (with an error message that pops up) problems and bad implementations of the HTTP protocol. Browse the web with it running, and within a few minutes you'll undoubtedly hit a poorly implemented server.
Postman won't be able to handle these cases since it's insulated from poor behavior by the browser's framework.
That's not your problem though.
When I finally found it, it turned out to be a simple error formatting the response body, which is json.
That has absolutely nothing to do with HTTP. HTTP doesn't know or care what your request/response bodies are.
The problem you face is that your API endpoint could be returning whatever it wants. You need a custom solution to your problem, as there is no standard API server in this case.
Most folks will run unit tests that hit common endpoints of your service to ensure they're alive and well.
I should also point out that it should be all but impossible for you to break the JSON response if you're doing it correctly. Sounds like you're serializing JSON manually... never do that, we have JSON serializers for this purpose. Send in an object and let it worry about building the JSON output for you. Otherwise, you'll waste a lot of time on problems like these.

HTML interface to RESTful web service *without* javascript

Even if I offer alternatives to PUT and DELETE (c.f. "Low REST"), how can I provide user-friendly form validation for users who access my web service from the browser, while still exposing RESTful URIs? The form validation problem (described below) is my current quandry, but the broader question I want to ask is: if I go down the path of trying to provide both a RESTful public interface and a non-javascript HTML interface, is it going to make life easier or harder? Do they play together at all?
In theory, it should be merely a matter of varying the output format. A machine can query the URL "/people", and get a list of people in XML. A human user can point their browser at the same URL, and get a pretty HTML response instead. (I'm using the URL examples from the microformats wiki, which seem fairly reasonable).
Creating a new person resource is done with a POST request to the "/people" URL. To achieve this, the human user can first visit "/people/new", which returns a static HTML form for creating the resource. The form has method=POST and action="/people". That will work fine if the user's input is valid, but what if we do validation on the server side and discover an error? The friendly thing would be to return the form, populated with the data the user just entered, plus an error message so that they can fix the problem and resubmit. But we can't return that output directly from a POST to "/people" or it breaks our URL system, and if we redirect the user back to the "/people/new" form then there is no way to report the error and repopulate the form (unless we store the data to session state, which would be even less RESTful).
With javascript, things would be much easier. Just do the POST in the background, and if it fails then display the error at the top of the form. But I want the app to degrade gracefully when javascript support isn't available. At the moment, I'm led to conclude that a non-trivial web app cannot implement an HTML interface without javascript, and use a conventional RESTful URL scheme (such as that described on the microformats wiki). If I'm wrong, please tell me so!
Related questions on Stack Overflow (neither of which deal with form validation):
How to send HTML form RESTfully?
How do you implement resource "edit" forms in a RESTful way?
you could have the html form post directly to /people/new. If the validation fails, rerender the edit form with the appropriate information. If it succeeds, forward the user to the new URL. This would be consistent with the REST architecture as I understand it.
I saw you comment to Monis Iqbal, and I have to admit I don't know what you mean by "non-RESTful URLS". The only thing the REST architecture asks from a URL is that it be opaque, and that it be uniquely paired to a resource. REST doesn't care what it looks like, what's in it, how slashes or used, how many are used, or anything like that. The visible design of the URL is up to you and REST has no bearing.
Thanks for the responses. They have freed my mind a bit, and so in response to my own question I would like to propose an alternative set of RESTful URL conventions which actually embrace the two methods (GET and POST) of the non-AJAX world, instead of trying to work around them.
Edit: As commenters have pointed out, these "conventions" should not be part of the RESTful API itself. On the other hand, internal conventions are useful because they make the server-side implementation more consistent and hence easier for developers to understand and maintain. RESTful clients, however, should treat the URLs as opaque, and always obtain them as hyperlinks, never by constructing URLs themselves.
GET /people
return a list of all records
GET /people/new
return a form for adding a new record
POST /people/new
create a new record
(for an HTML client, return the form again if the input is invalid, otherwise redirect to the new resource)
GET /people/1
return the first record
GET /people/1/edit
return a form for editing the first record
POST /people/1/edit
update the first record
GET /people/1/delete
return a form for deleting the record
(may be simply a confirmation - are you sure you want to delete?)
POST /people/1/delete
delete the record
There is a pattern here: GET on a resource, e.g. "/people/1", returns the record itself. GET on resource+operation returns an HTML form, e.g. "/people/1/edit". POST on resource+operation actually executes the operation.
Perhaps this is not quite so elegant as using additional HTTP verbs (PUT and DELETE), but these URLs should work well with vanilla HTML forms. They should also be pretty self-explanatory to a human user...I'm a believer in the idea that "the URL is part of the UI" for users accessing the web server via a browser.
P.S. Let me explain how I would do the deletes. The "/people/1" view will have a link to "/people/1/delete", with an onclick javascript handler. With javascript enabled, the click is intercepted and a confirmation box presented to the user. If they confirm the delete, a POST is sent, deleting the record immediately. But if javascript is disabled, clicking the link will instead send a GET request, which returns a delete confirmation form from the server, and that form sends the POST to perform the delete. Thus, javascript improves the user experience (faster response), but without it the website degrades gracefully.
Why do you want to create a second "API" using XML?
Your HTML contains the data your user needs to see. HTML is relatively easy to parse. The class attribute can be used to add semantics as microformats do. Your HTML contains forms and links to be able to access all of the functionality of your application.
Why would you create another interface that delivers completely semantic free application/xml that will likely contain no hypermedia links so that you now have to hard code urls into your client, creating nasty coupling?
If you can get your application working using HTML in a web browser without needing to store session state, then you already have a RESTful API. Don't kill yourself trying to design a bunch of URLs that corresponds to someone's idea of a standard.
Here is a quote from Roy Fielding,
A REST API must not define fixed
resource names or hierarchies
I know this flies in the face of probably almost every example of REST that you have seen but that is because they are all wrong. I know I am starting to sound like a religious zealot, but it kills me to see people struggling to design RESTful API's when they are starting off on completely the wrong foot.
Listen to Breton when he says "REST doesn't care what [the url] looks like" and #Wahnfrieden will be along soon to tell you the same thing. That microformats page is horrible advice for someone trying to do REST. I'm not saying it is horrible advice for someone creating some other kind of HTTP API, just not a RESTful one.
Why not use AJAX to do the work on the client side and if javascript is disabled then design the html so that the conventional POST would work.