How is following HTTP url string parameter encoded and decoded? &=& vs && - reverse-engineering

I was browsing a website and stumbled upon the following bug while playing with different combinations of URL parameters.
When I append ?&=& to any valid URL on this website, I get the following error (/p is part of the URL): java.lang.ArrayIndexOutOfBoundsException.
Chrome parses the string as below:
But this exception is not raised when I append ?&& instead of ?&=&. Chrome parses both strings into the same thing.
How is "?&=&" actually parsed, and how is it different from "?&&"? Since Chrome parses them into the same thing, why is an exception generated only in the former case?
What kind of bug might this website have?
Can such a bug be used to mount some kind of attack on this website?
Note:
I do not own this website, so I am just curious to know what might have caused this bug.
The issue is seen consistently in both Chrome and Firefox.
builtwith.com says this website uses an nginx server.
Let me know if this is off-topic here. I didn't find any information about that.
Edit:
I understand what this exception means. I just want to know whether these two kinds of parameters are parsed differently, and what the possible causes of such a bug are.

A java.lang.ArrayIndexOutOfBoundsException is thrown by the Java backend of the page.
This exception occurs when a Java application tries to access an element of a Java array at an index that does not exist.
How and why exactly this occurs or how the url parameters are processed is impossible to say without having access to the source code of the backend.
It is not caused by the frontend code or by your browser.
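One plausible cause, offered only as a guess since the backend source is unknown, is a hand-written parser that splits the query string on & and = and then indexes into the result. Java's String.split drops trailing empty strings, so "=".split("=") yields an empty array while "&&".split("&") yields no pairs at all. The Python sketch below emulates that behaviour; java_split and naive_parse are hypothetical names, and IndexError stands in for ArrayIndexOutOfBoundsException.

```python
def java_split(text, sep):
    """Emulate Java's String.split(sep) for a literal separator.

    Java returns the whole string when the separator does not occur,
    and otherwise drops trailing empty strings from the result.
    """
    if sep not in text:
        return [text]
    parts = text.split(sep)
    while parts and parts[-1] == "":
        parts.pop()
    return parts


def naive_parse(query):
    """Hypothetical fragile parser: assumes every pair has a key at index 0."""
    params = {}
    for pair in java_split(query, "&"):
        key_value = java_split(pair, "=")
        key = key_value[0]  # fails when the pair is just "="
        value = key_value[1] if len(key_value) > 1 else ""
        params[key] = value
    return params


# naive_parse("&&") finds no pairs and returns an empty dict.
# naive_parse("&=&") reaches the pair "=", which splits into an empty
# list, so the key lookup raises IndexError.
```

A browser or a lenient parser can treat both strings as "no parameters", which would explain why they look identical on the client while only one of them trips the server.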

Why isn't it possible to download a file for status code 4XX and 5XX

I have noticed that many http clients including Firefox and Chrome don't allow file downloads for http response codes with 4XX and 5XX. However, some clients allow these downloads, like curl and wget (with --content-on-error option).
Both Chrome and Firefox don't provide nice exception messages.
Chrome fails with ERR_INVALID_RESPONSE. Firefox fails with File not found. As stated above, curl and wget work for the same URL.
I was wondering if there is a specification that defines the correct behavior in this case? Are there good reasons why the request can't be processed by Chrome and Firefox? Also, it seems strange that they don't provide proper feedback.
I think in most cases a download for a failing request makes no sense, but in some cases it would be helpful. One good example is a client that only communicates with the server using some third-party format. The client would have to download a generated file for the request. In the case of an error, the client should download a file containing the error description.
For example, RFC 7231 states:
Response messages with an error status code
usually contain a payload that represents the error condition, such
that it describes the error state and what next steps are suggested
for resolving it.
The 4xx (Client Error) class of status code indicates that the client
seems to have erred. Except when responding to a HEAD request, the
server SHOULD send a representation containing an explanation of the
error situation, and whether it is a temporary or permanent
condition. These status codes are applicable to any request method.
User agents SHOULD display any included representation to the user.
This doesn't forbid downloading in the case of an error.
Edit because of the first answer:
I don't think that this behavior is user friendly, and I don't think that user friendliness is really the reason behind it. For example, it would make far more sense to show the error code and error message (provided in the header) to the user, or at least to indicate the error with a message like "cannot download the file, because the server responded with an error". There might be servers that can only respond with XML or any other arbitrary file format.
What bugs me the most is that the two browsers respond with different but arbitrary errors that give no hint about the underlying issue.
It might be that this is an undocumented edge case and both Chrome and Firefox just fall back to a default error, but this seems unlikely, especially because this is an edge case that has a special flag in wget.
4XX: Why would you assume a file download if your client did something wrong?
If we assume that an API has an endpoint that replies with a certain file format, it is fair to assume that also the error message including a hint what the client did wrong is provided in that format. So the file can help to fix the client error.
I'm not aware of any specification for that topic.
The behavior should be as user friendly as possible.
4XX:
Why would you assume a file download if your client did something wrong? Furthermore, the client software could not distinguish between wrong usage (e.g. an invalid URL) and a legitimate file download.
5xx:
As you stated, most APIs provide error information, but here too you could not distinguish a real download from, for example, an internal error that occurred while providing the file.
You can get such behavior with wget and curl, as you mentioned, but it is neither user friendly nor practical for using such an API programmatically.
With the above in mind, Chrome and Firefox are just trying to be user friendly.
I hope I could somehow answer your question or challenge the idea behind it. :)
Looking at how Chromium handles a download with a non-2xx response, we see this comment:
// The response code indicates that this is an error page, but we don't
// know how to display the content. We follow Firefox here and show our
// own error page instead of intercepting the request as a stream or a
// download.
So Chrome followed Firefox, and both are entirely consistent with the RFCs. The browser knows this payload is unidentified data relating to an error condition, so saving it as the file in question is not an option. Since it is being downloaded, the browser presumably can't display the payload, and in either case it has been instructed not to, so displaying it in the error context is not a safe option. Since it is an error, there is also a high likelihood that the sender has combined a partial response with an error code, meaning that the payload may be an incomplete or corrupt representation of data from a 2xx response.
If you look back at wget, --content-on-error is a specific option because it is the wrong thing to do as a general browser. A client that works with the payload type could examine the errors when it is interacting directly with a server, and wget only provides options to help you debug such an interaction. A normal browser has fewer features for emulating other clients than a text CLI, since a text CLI exists primarily to emulate some other client while debugging.
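As a rough illustration of a client that examines the error payload itself, here is a minimal Python sketch using only the standard library. The function name fetch_even_on_error is hypothetical. urllib raises HTTPError for 4xx and 5xx responses, but that exception object still exposes the response body.

```python
import urllib.error
import urllib.request


def fetch_even_on_error(url, timeout=10):
    """Return (status, body) for a URL, keeping the body of 4xx/5xx responses."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as error:
        # HTTPError doubles as a response object: the server's error
        # payload can still be read from it.
        try:
            return error.code, error.read()
        finally:
            error.close()
```

The caller can then decide what to do with the payload based on the status code, which is the decision a general-purpose browser declines to make on the user's behalf.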
I was wondering if there is a specification that defines the correct
behavior in this case? Are there good reasons why the request can't be
processed by Chrome and Firefox? Also, it seems strange that they
don't provide proper feedback.
There is no such specification for this, but a Chromium project member considers it a trivial issue that is unlikely to be fixed in the near future. Instead of fixing it in Chromium, they suggest that it should be fixed on the server by sending a proper HTTP status.
Response from Chromium Project Member: "This issue has been Available for over a year. If it's no longer
important or seems unlikely to be fixed, please consider closing it
out. If it is important, please re-triage the issue."
Sorry for the inconvenience if the bug really should have been left as
Available.
You can find more details in Issue 479265.
What's happening beneath the surface?
I checked the Chromium source code further to find out what is actually happening, and found that for any non-200 status on a download it simply raises the ERR_INVALID_RESPONSE (Invalid Server Response) error.
To cut a long story short, you have to live with this behaviour of the browser, it is not going to be improved.
Building on @lossleader's answer, it looks like Chromium decided to follow Firefox's decision not to download files if the response was not successful.
It seems like this issue has a history. In 2005 an AOL website had an issue that returned a status code 500 and resulted in users downloading an .exe file. There was a "fix" that simply returns a 404 for responses that trigger a download and with erroneous responses. The corresponding issue can be found here.
There is an open issue from 2008 that complains about this error and states that it is misleading. The corresponding issue can be found here.
I found a more detailed answer about this on Super User.
I still think that it would be correct to at least offer the user a choice to download the file anyway, or at least to show a more meaningful error page. On the other hand, in most cases a download for a response code != 2XX is unintended and hints at a server error. Therefore it seems that this issue has a low priority for browser vendors and is considered "not worth the trouble".
These answers all seem to miss the fundamental point here: you're trying to give a browser-specific interpretation to an error in your code. From my point of view, in all associated cases, your code is failing in some manner without error handling.
4xx error? You've sent a bad request to the server, according to rules you have determined. It's not, technically, the browser's fault.
5xx error? Your server crashed and didn't throw a pretty error. On some types of server (Django, for example), a 500 error will be a bunch of debug information you probably shouldn't show the user.
Thus what you're asking for is strange from an architectural standpoint; you want to cover up the fact that you've screwed up by modifying the browser's response rather than fixing your code to respond appropriately.

Django cannot parse POST parameters of WSGIRequest on Internal Server Errors

I'm using Django REST Framework and all the API calls come from Android and iOS apps. The system works perfectly most of the time, however, when an internal server error happens and I get an email from Django, the POST of the WSGIRequest contains <could not parse> instead of the actual posted JSON data (even though 'CONTENT_TYPE': 'application/json' is also in the header, and the data is sent as JSON).
This is really annoying, as it would be great to see the request body that actually causes the error, not just the stacktrace.
The <could not parse> part is very similar to this question (in the ModPythonRequest part): django request.POST contains <could not parse>, except the actual problem is slightly different. Also the reference link in that question (https://stackoverflow.com/questions/12471661/mod-python-could-not-parse-the-django-post-request-for-blackberry-and-some-andro) seems to have gone down, even though the name looked very promising.
I'm on Django 1.6.2 and DRF 2.3.13.
The POST dictionary of the WSGIRequest is always going to be invalid, because it is intended to hold the parsed form data when the Content-Type is application/x-www-form-urlencoded or multipart/form-data.
The data you want is in the body attribute of the WSGIRequest object, which isn't output when that object is converted to a string to be written to the log.
When using Django REST framework, you will typically want to access request.DATA (which will handle whatever formats you have parsers configured for - defaulting to form content and JSON) instead of Django's standard request.POST, which will only handle form encoded data.
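A minimal sketch of reading the raw payload for logging, assuming a Django-style request object with the body and META attributes. The helper name describe_json_body is hypothetical, and where it gets called (for example in custom error reporting) is left open.

```python
import json


def describe_json_body(request):
    """Return the parsed JSON payload, or the raw text if it is not valid JSON."""
    content_type = request.META.get("CONTENT_TYPE", "")
    raw = request.body  # raw request bytes, independent of request.POST
    if content_type.startswith("application/json"):
        try:
            return json.loads(raw.decode("utf-8"))
        except ValueError:
            # Covers both malformed JSON and undecodable bytes.
            pass
    return raw.decode("utf-8", errors="replace")
```

Falling back to the raw text keeps the payload visible even when it is the malformed body itself that triggered the server error.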

Debugging Web Service that fails as JSON but not as XML

I have a web service method that returns XML without issue if I call it directly via a URL GET.
However, a POST to that same URL with a JSON Content-Type fails.
I think I can figure out the issue (I'm guessing it's an encoding or bad character somewhere in there) but I don't know how to debug the problem.
If I set a breakpoint in the webservice, it runs to completion. The failure appears to be happening AFTER the method returns, but BEFORE the json is returned to the caller.
How can I get in between to trace the error?
Please let me know if I can provide more context to help, but I really just need to know how to get in there.
EDIT:
The web service is configured to receive POST and return JSON and in fact DOES correctly return JSON in some cases. However, there are certain calls that are failing, so I need a way to trace this or debug it somehow and figure out why some calls are not working.
The web service is likely not configured to receive POST requests, especially if you are receiving a 405 Method Not Allowed response status.
Although I didn't find a way to debug or intercept the request to find the exact answer, it turns out the problem was the size of the content being returned by the webservice. Following this answer: ASP.NET WebMethod with jQuery json, is there a size limit?
and increasing the json limit fixed the issue!
Is there a way I could have trapped this to find the error without just guessing it was a size limit?

Contact API directly from URL in browser

I am trying to understand how POST and/or GET methods work in terms of the actual browser.
I am attempting to contact an API which requires, at a minimum, an API key, the method you wish to use on their side, and an IP address.
My original idea was to do something like this:
I feel like I'm on the right track, it does something and gets an error as opposed to telling me the page does not exist. I'm expecting either JSON or XML in response as the API supports both but instead I get this error:
This page contains the following errors:
error on line 1 at column 1: Document is empty
Below is a rendering of the page up to the first error.
Upon studying the API documentation further, I found a statement that methods are called using HTML form application/x-www-form-urlencoded and that the resource models are given as form elements.
I tried researching what that means to see what the problem was and found this site http://www.w3.org/TR/xforms11/ but I'm still unclear.
Ideas?
It seems to mean that the application is expecting a POST request, but you're making a GET request (which is what happens when you put the parameters in the query string).
Since you can't make a POST request from the browser's address bar, you may need to do one of the following:
Write a simple JS function that makes an XMLHttpRequest using that method, and run it from the console;
Create a simple HTML page that automates the above process, allowing you to make POST calls;
Use curl instead, which is a great tool for testing those kinds of requests.
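For comparison, a scripted equivalent of those options can be sketched in Python with the standard library. The function name build_form_post, the field names, and the api.example.com URL are placeholders rather than the real API's values.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_form_post(url, fields):
    """Build a POST request whose body is application/x-www-form-urlencoded."""
    body = urlencode(fields).encode("ascii")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder values; the real parameter names come from the API docs.
request = build_form_post(
    "http://api.example.com/api/",
    {"apikey": "KEY", "method": "example", "ip": "208.74.76.5"},
)
# Passing `request` to urllib.request.urlopen would send it.
```

The parameters travel in the request body rather than in the URL, which is the difference between this and typing a query string into the address bar.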

Direct POST into URL not working?

I am trying to contact an API by posting the parameters in the URL. I am unsure whether it will respond in XML or JSON (it is one of the two), but the browser reports an error.
This is an example of what I'm submitting. I am receiving this in response:
This page contains the following errors:
error on line 1 at column 1: Document is empty
Below is a rendering of the page up to the first error.
I do not know what is going on. I believe I followed the syntax of a POST; my only remaining question about the syntax is whether the ? is in the right spot. The API does work when I POST using PHP.
Or maybe it is working and the browser just isn't capable of understanding an XML or JSON response? (I'm using Chrome, so I do not think this is the issue.)
Otherwise, if anyone has any insight on this, I'd be grateful.
A different browser yields this error:
XML Parsing Error: syntax error
Location:
Line Number 1, Column 1:Array
^
While the syntax of the URL does seem to be fine, you imply that the API expects the parameters in POST. Adding them to the actual URL means the parameters are passed in GET, rather than POST.
You could try to test this by making a little HTML form containing all the relevant parameters and passing them to this API via POST, and see if that gives you the expected result.
Your issue is how the parameters are being sent to the API; they should be URL-encoded.
http://api.example.com/api/?apikey=asdfa23462=example&ip=208.74.76.5
should be
http://api.example.com/api/?apikey=asdfa23462&=example&ip=208.74.76.5
Another issue I see is that you have ?apikey=asdfasfsdafsd&=example
The =example could well be the whole issue.
Just some thoughts from what I see.
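To see how a typical parser reads the two query strings, here is a small sketch using Python's urllib.parse.parse_qsl. The API's own parser may behave differently, so this is only indicative, and show_pairs is a hypothetical helper name.

```python
from urllib.parse import parse_qsl

original = "apikey=asdfa23462=example&ip=208.74.76.5"
suggested = "apikey=asdfa23462&=example&ip=208.74.76.5"


def show_pairs(query):
    """Return the (name, value) pairs found in a query string."""
    return parse_qsl(query, keep_blank_values=True)


# original:  the second "=" is absorbed into the apikey value.
# suggested: "=example" becomes a parameter with an empty name.
for query in (original, suggested):
    print(show_pairs(query))
```

In both forms the value "example" has no parameter name of its own, which supports the suspicion that the missing name is the real problem.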