XSS vulnerability for JSON API - json

I have a REST API that accepts and returns JSON data.
A sample request response is a follows
Request
{
"repos": [
"some-repo",
"test-repo<script>alert(1)</script>"
]
}
Response
{
"error": "Error Message",
"repos": [
"test-repo<script>alert(1)</script>"
]
}
Is my API vulnerable for XSS?. From what I understand, since the Content-Type is set to application/json, the API as such is safe from XSS. The client needs to ensure that the output is encoded to prevent any XSS attacks.
To add an additional layer of security, I can add some input encoding/validation in the API layer.
Please let me know if my assessment is right and any other gotchas that I need to be aware of

I think it's right that any XSS issue here is a vulnerability of the client. If the client inserts HTML into a document, then it is its responsibility to apply any neccessary encoding.
The client knows what encoding is required not the server. Different encoding, or no encoding may be needed in different places for the same data. For example:
If a client did something like:
$(div).html("<b>" + repos + "</b>");
then it would be vulnerable to XSS, so repos would need to be HTML encoded here.
But if it did something like:
$(div).append($("<b>").text(repos));
then HTML encoding would have resulted in HTML entity codes being wrongly displayed to the user.
Or if the client wanted to do some processing of the data, it may want the plaintext data first to do the processing, and then encode it later to output it.
Input validation can help too, but the rules for what is valid input may not align with what is safe to use without encoding. Things like ampersands, quotes and brackets can appear in valid text data too. But if your data can't contain these characters, you can reject the input as invalid.

Your API will not be vulnerable to XSS unless it also provides a UI that consumes it. It will be the clients of your API which could be vulnerable - they will need to make sure they correctly encode any data from your API before they use it in any UI.

I think your api is vulnerable, though the script may not be execute. Talking to XSS prevention, we always suggest decode/validation dangerous characters when the api deals the input. There is a common requirement "clean the data before it store in the DB".
As for your situation, the api just response a json, but we don't know where the data in json will be used to. usually the frontend accept the data without any decode/validation, if that, there will be a xss.
you talk about that the client use decode the data they get from your api. Yes I agree, but frontend always "trust" the backend so that they won't decode the data. but the api server should not trust any input from frontend due to this can be controlled/changed by (malicious)users.

Related

Does using multipart/form-data Content Type for a RESTful POST api a good practice?

I have a situation where I have to write a api to create a resource and amongst datafields that I need to accept is a string that is basically contents of a html file. As I see it I have a choice between structuring the entire thing as a json object where this field is a string field with urlencoded html string , and having the Content Type as multipart/form-data where each of the fields and the html string (UTF-8 encoded) is a part in the message.
Not using json is something I am not comfortable with as I feel violating the REST standards in not structuring the content of the entity I am about to create thus there is a loss of information for the consumers as they can't tell immediately looking at my api definition about what data to feed to it. But practically multipart/form-data handles stuff like html file content better and more efficient as I will not have to urlencode it and can control the char-encoding also.
What will be a better approach in current context and upholding RESTful principles ? Also are there other trade-offs i should be aware of ? what about parsing a json with a huge string field (~ 200 Kb)embedded?
EDIT :- I was reading some similar questions on SO and one approach that stood out was the 2-step approach of making a first call with metadata to create the entity and then upload the file as an UPDATE process to the created entity wherein we use multipart/form-data. In that context, I guess , what I am asking is how sound is an approach where I send both metadata and the file in a single api call as multipart data , where each metadata field is actually a part in the multipart message as is the file.
The canonical way to upload files to REST API is using the multipart/form-data. As W3 recommendation guide says:
The content type "multipart/form-data" should be used for submitting
forms that contain files, non-ASCII data, and binary data.
Multipart/form-data has advantages over base64 to represent binary data. Is sticked to REST/Http philosophy, and simplify the develop of API clients.
Returning values from Forms: multipart/form-data
W3 Recommendation guide
The good practice is to use multipart/form-data whenever files are uploaded to the server along with database fields. Do not send a base64 JSON string as the request to your Rest API as it might corrupt the file or degrade the performance of your application.
As far as documenting multipart/form-data Rest API for your consumers is concerned you have to force your API consumers to use the same form fields which you have predefined in your web service.
Returning Values from Forms: multipart/form-data
I started using FormData objects everywhere on the client-side, in lieu of regular form input fields, for dynamic REST posts. FormData is presented in a positive light in various tutorials, so I went with it.
However, down the line, this caused me problems when decoding the form data into my Go structs. FormData objects are sent as "multipart/form-data" (regardless of files being sent) and I believe my decoder in Go didn't convert the raw data back to string form. Eventually my SQL queries were throwing panics, as hex data was being sent in where strings should have been.
So with some adjustment, I could use FormData however I've decided to revert to the simple universal recommendation: Use "multipart/form-data" only for special cases like when sending files. Otherwise, just use regular "application/x-www-form-urlencoded".

JSON data, HTML encoded from server or client?

So there's a small debate by my team, and I'm sure this is answered in many places but I couldn't find any definitive answers.
Right now we have a server that tosses up JSON data (think REST, sorta). The client is a complete JavaScript client that uses $.ajax to grab the data and render it appropriately.
The client is using UnderscoreJS templates to render data within the HTML:
<%- something %>
So if the server sends down a JSON block (non-html encoded):
{
"username": "Joe's Crab & Cookies"
}
Should the server be HTML or JavaScript encoding this value? Or should that still be left up to the client?
What if a bit of data from the server needs to be an attribute of an element:
<li data-item-id="<%= userId %>">something</li>
I realize that I shouldn't need to encode anything that's generated by the server, it's all data that is entered by the user. So imagine the "userId" above being set by a user, not generated.
So if we encode on the server and on the client we see on the rendered page:
Joe's Crab & Cookies
If you're sending json data somewhere, the only encoding that should be done to it is json encoding. You don't necessarily know if the values are going to end up in sql, javascript, xml, html attributes, a winforms app, etc.
Now, on the other hand, if some of your json values were to contain html, that html value should be encoded, ready-to-display html. It depends on context.
First of all, you should be escaping the value, if it can be set by a user.
You should use as much escaping and validation as possible -- both on the input fields, when capturing the data on the server, when inputting into DB and, finally, when rendering it back.
Among the mild consequences of not escaping data would be that it can crack your HTML when you're outputting to data-item-id.

Using Json string in the Http Header

Recently I run into some weird issue with http header usage ( Adding multiple custom http request headers mystery) To avoid the problem at that time, I have put the fields into json string and add that json string into header instead of adding those fields into separate http headers.
For example, instead of
request.addHeader("UserName", mUserName);
request.addHeader("AuthToken", mAuthorizationToken);
request.addHeader("clientId","android_client");
I have created a json string and add it to the single header
String jsonStr="{\"UserName\":\"myname\",\"AuthToken\":\"123456\",\"clientId\":\"android_client\"}";
request.addHeader("JSonStr",jsonStr);
Since I am new to writing Rest and dealing with the Http stuff, I don't know if my usage is proper or not. I would appreciate some insight into this.
Some links
http://lists.w3.org/Archives/Public/ietf-http-wg/2011OctDec/0133.html
Yes, you may use JSON in HTTP headers, given some limitations.
According to the HTTP spec, your header field-body may only contain visible ASCII characters, tab, and space.
Since many JSON encoders (e.g. json_encode in PHP) will encode invisible or non-ASCII characters (e.g. "é" becomes "\u00e9"), you often don't need to worry about this.
Check the docs for your particular encoder or test it, though, because JSON strings technically allow most any Unicode character. For example, in JavaScript JSON.stringify() does not escape multibyte Unicode, by default. However, you can easily modify it to do so, e.g.
var charsToEncode = /[\u007f-\uffff]/g;
function http_header_safe_json(v) {
return JSON.stringify(v).replace(charsToEncode,
function(c) {
return '\\u'+('000'+c.charCodeAt(0).toString(16)).slice(-4);
}
);
}
Source
Alternatively, you can do as #rocketspacer suggested and base64-encode the JSON before inserting it into the header field (e.g. how JWT does it). This makes the JSON unreadable (by humans) in the header, but ensures that it will conform to the spec.
Worth noting, the original ARPA spec (RFC 822) has a special description of this exact use case, and the spirit of this echoes in later specs such as RFC 7230:
Certain field-bodies of headers may be interpreted according to an
internal syntax that some systems may wish to parse.
Also, RFC 822 and RFC 7230 explicitly give no length constraints:
HTTP does not place a predefined limit on the length of each header field or on the length of the header section as a whole, as described in Section 2.5.
Base64encode it before sending. Just like how JSON Web Token do it.
Here's a NodeJs Example:
const myJsonStr = JSON.stringify(myData);
const headerFriendlyStr = Buffer.from(myJsonStr, 'utf8').toString('base64');
res.addHeader('foo', headerFriendlyStr);
Decode it when you need reading:
const myBase64Str = req.headers['foo'];
const myJsonStr = Buffer.from(myBase64Str, 'base64').toString('utf8');
const myData = JSON.parse(myJsonStr);
Generally speaking you do not send data in the header for a REST API. If you need to send a lot of data it best to use an HTTP POST and send the data in the body of the request. But it looks like you are trying to pass credentials in the header, which some REST API's do use. Here is an example for passing the credentials in a REST API for a service called SMSIfied, which allows you to send SMS text message via the Internet. This example is using basic authentication, which is a a common technique for REST API's. But you will need to use SSL with this technique to make it secure. Here is an example on how to implement basic authentication with WCF and REST.
From what I understand using a json string in the header option is not as much of an abuse of usage as using http DELETE for http GET, thus there has even been proposal to use json in http header. Of course more thorough insights are still welcome and the accepted answer is still to be given.
In the design-first age the OpenAPI specification gives another perspective:
an HTTP header parameter may be an object,
e.g. object X-MyHeader is
{"role": "admin", "firstName": "Alex"}
however, within the HTTP request/response the value is serialized, e.g.
X-MyHeader: role=admin,firstName=Alex

JSON vs Form POST

We're having a bit of a discussion on the subject of posting data to a REST endpoint. Since the objects are quite complex, the easiest solution is to simply serialize them as JSON and send this in the request body.
Now the question is this: Is this kosher? Or should the JSON be set as a form parameter like data=[JSON]? Or is sending of JSON in the request body just frowned upon for forcing the clients using the application, to send their data via JavaScript instead of letting the browser package it up as application/x-www-form-urlencoded?
I know all three options work. But which are OK? Or at least recommended?
I'd say that both methods will work well
it's important that you stay consistent across your APIs. The option I would personally choose is simply sending the content as application/json.
POST doesn't force you to use application/x-www-form-urlencoded - it's simply something that's used a lot because it's what webbrowsers use.
There is nothing wrong about sending it directly as serialized JSON, for example google does this by default in it's volley library (which obviously is their recommended REST library for android).
If fact, there are plenty of questions on SO about how not to use JSON, but rather perform "normal" POST requests with volley. Which is a bit counter intuitive for beginners, having to overwrite it's base class' getParams() method.
But google having it's own REST library doing this by default, would be my indicator that it is OK.
You can use JSON as part of the request data as the OP had stated all three options work.
The OP needs to support JSON input as it had to support contain complex structural content. However, think of it this way... are you making a request to do something or are you just sending what is basically document data and you just happen to use the POST operation as the equivalent of create new entry.
That being the case, what you have is basically a resource endpoint with CRUDL semantics. Following up on that you're actually not limited to application/json but any type that the resource endpoint is supposed to handle.
For non-resource endpoints
I find that (specifically for JAX-RS) the application/x-www-urlencoded one is better.
Consistency with OAuth 2.0 and OpenID Connect, they use application/x-www-urlencoded.
Easier to annotate the individual fields using Swagger Annotations
Swagger provides more defaults.
Postman generates a nice form for you to fill out and makes things easier to test.
Examples of non-resource endpoints:
Authentication
Authorization
Simple Search (though I would use GET on this one)
Non-simple search where there are many criteria
Sending a message/document (though I would also consider multipart/form-data so I can pass meta data along with the content, but JAX-RS does not have a standard for this one Jersey and RestEasy have their own implementations)

Is this a valid JSON response?

G'day gurus,
I'm calling the REST APIs of an enterprise application that shall remain nameless, and they return JSON such as the following:
throw 'allowIllegalResourceCall is false.';
{
"data": ... loads of valid JSON stuff here ...
}
Is this actually valid JSON? If (as I suspect) it isn't, is there any compelling reason for these kinds of shenanigans?
The response I received from the application vendor is that this is done for security purposes, but I'm struggling to understand how this improves security much, if at all.
Thanks in advance!
Peter
According to
http://jsonlint.com/
It is not.
Something like the below is.
{
"data": "test"
}
Are they expecting you to pull the JSon load out of the message above?
Its not a JSON format at all. From your question it seems you are working with enterprise systems like JIVE :). I am also facing same issue with JIVE api. This is the problem with their V3 API. Not standard , but following thing worked for me. (I am not sure if you are talking about JIVE or not)
//invalid jason response... https://developers.jivesoftware.com/community/thread/2153
jiveResponse = jiveResponse.Replace
("throw 'allowIllegalResourceCall is false.';",String.Empty);
There is a valid reason for this: it protects against CSRF attacks. If you include a JSON url as the target of a <script> tag, then the same-origin policy doesn't apply. This means that a malicious site can include the URL of a JSON API, and any authenticated users will successfully request that data.
By appropriately overriding Object.prototype and/or Array.prototype, the malicious site can get any data parsed as an object literal or array literal (and all valid JSON is also valid javascript). The throw statement protects against this by making it impossible to parse javascript included on a page via <script> tags.
Definitely NOT valid JSON. Maybe there's an error in the implementation that is mixing some kind of debug output with the correct output?
And, by no means this is for security reasons. Seems to me this is a plain bug.
throw 'allowIllegalResourceCall is false.'; is certainly not valid JSON.
What MIME type is reported?
It seems they have added that line to prevent JSON Hijacking. Something like that line is required to prevent JSON Hijacking only if you return a JSON array. But they may have added that line above all of their JSON responses for easier implementation.
Before using it, you have to strip out the first line, and then parse the remaining as JSON.