Say I'm in a Spring environment and I have a URL http://www.example.com/?name=alice
In the controller, I have code like,
mav.addObject("name", request.getParameter("name"));
And in the JSP file, it is rendered like
<div><c:out value="${name}" /></div>
My question is,
If a malicious user appends a bad string, for example a short script, to the URL, like http://www.example.com/?name={some bad script}, <c:out> will protect me. Is my understanding correct?
What if I cannot use <c:out>? Say, the parameter is "alice&bob", <c:out> will turn it to "alice%26bob", which is not what I want. How can I protect myself in this case?
You will always have to escape and sanitize untrusted input before you send it to the client.
Coding it by hand is painful and error-prone, but you could use a library class such as GWT's SafeHtmlBuilder, for example. :-)
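To show what HTML escaping does to the two inputs from the question, here is a minimal sketch in Python rather than JSP. The standard library's html.escape stands in for <c:out> (which escapes XML special characters by default), and render_name is a hypothetical helper, not part of any framework.

```python
import html

def render_name(name: str) -> str:
    """Wrap an untrusted value in a <div>, HTML-escaping it first (hypothetical helper)."""
    return "<div>" + html.escape(name, quote=True) + "</div>"

# The ampersand becomes an HTML entity, which the browser displays as a plain "&".
print(render_name("alice&bob"))                  # <div>alice&amp;bob</div>
# Markup characters are neutralised, so the script is shown as text, not executed.
print(render_name("<script>alert(1)</script>"))  # <div>&lt;script&gt;alert(1)&lt;/script&gt;</div>
```

Note that the escaped form is alice&amp;bob, which renders on the page as alice&bob. A result like alice%26bob would come from URL encoding, which is a different operation meant for building query strings.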
As with any user supplied data, the URLs will need to be escaped and filtered appropriately to avoid all sorts of exploits. I want to be able to
Put user supplied URLs in href attributes. (Bonus points if I don't get screwed if I forget to write the quotes)
...
Forbid malicious URLs such as javascript: stuff or links to evil domain names.
Allow some leeway for the users. I don't want to raise an error just because they forgot to add an http:// or something like that.
Unfortunately, I can't find any "canonical" solution to this sort of problem. The only thing I could find as inspiration is the encodeURI function from JavaScript, but that doesn't help with my second point, since it only does simple URL encoding and leaves special characters such as : and / alone.
OWASP provides a list of regular expressions for validating user input, one of which is used for validating URLs. This is as close as you're going to get to a language-neutral, canonical solution.
More likely you'll rely on the URL parsing library of the programming language in use. Or, use a URL parsing regex.
The workflow would be something like:
Verify the supplied string is a well-formed URL.
Provide a default protocol such as http: when no protocol is specified.
Maintain a whitelist of acceptable protocols (http:, https:, ftp:, mailto:, etc.)
The whitelist will be application-specific. For an address-book app the mailto: protocol would be indispensable. It's hard to imagine a use case for the javascript: and data: protocols.
Enforce a maximum URL length - ensures cross-browser URLs and prevents attackers from polluting the page with megabyte-length strings. With any luck your URL-parsing library will do this for you.
Encode a URL string for the usage context (escaped for HTML output, escaped for use in an SQL query, etc.).
Forbid malicious URLs such as javascript: stuff or links to evil domain names.
You can utilize the Google Safe Browsing API to check a domain for spyware, spam or other "evilness".
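The workflow above could be sketched roughly as follows in Python, using only the standard library. The names clean_user_url, ALLOWED_SCHEMES and MAX_URL_LENGTH are made up for illustration, the 2000-character limit is an assumption, and the whitelist is the example one from the list, so treat this as a starting point rather than a complete validator.

```python
import html
from typing import Optional
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https", "ftp", "mailto"}  # application-specific whitelist
MAX_URL_LENGTH = 2000                                 # assumed limit, adjust as needed

def clean_user_url(raw: str) -> Optional[str]:
    """Return the URL escaped for an HTML attribute, or None if it is rejected."""
    url = raw.strip()
    if not url or len(url) > MAX_URL_LENGTH:
        return None
    try:
        parts = urlsplit(url)
        if not parts.scheme:
            # Leeway for users who forgot the protocol: default to http.
            url = "http://" + url
            parts = urlsplit(url)
    except ValueError:
        return None  # not a well-formed URL
    if parts.scheme not in ALLOWED_SCHEMES:
        return None  # javascript:, data:, etc. are refused
    if parts.scheme != "mailto" and not parts.netloc:
        return None  # http/https/ftp URLs need a host
    # Encode for the usage context, here an HTML attribute value.
    return html.escape(url, quote=True)

print(clean_user_url("www.example.com/a?b=1&c=2"))  # http://www.example.com/a?b=1&amp;c=2
print(clean_user_url("javascript:alert(1)"))        # None
```

Checking the domain against a reputation service would be a separate step after this one.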
For the first point, regular attribute encoding works just fine (escape characters into HTML entities). Escaping quotes, the ampersand and angle brackets is OK if attributes are guaranteed to be quoted. Escaping all other non-alphanumeric characters as well will make the attribute safe even if it's accidentally left unquoted.
The second point is vague and depends on what you want to do. Just remember to use a whitelist approach instead of a blacklist one, since it's possible to use HTML entity encoding and other tricks to get around most simple blacklists.
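As an illustration of the "escape every non-alphanumeric character" idea for the first point, here is a small Python sketch. encode_for_html_attribute is a hypothetical name, and the numeric-entity format is just one reasonable choice.

```python
def encode_for_html_attribute(value: str) -> str:
    """Encode everything except ASCII letters and digits as a numeric HTML entity."""
    out = []
    for ch in value:
        if ch.isascii() and ch.isalnum():
            out.append(ch)
        else:
            # e.g. a space becomes &#x20; and a double quote becomes &#x22;
            out.append("&#x{:X};".format(ord(ch)))
    return "".join(out)

# No raw space, quote or "=" survives, so the value cannot break out of an
# attribute even when the surrounding quotes were forgotten.
print(encode_for_html_attribute('x onmouseover=alert(1)'))
# x&#x20;onmouseover&#x3D;alert&#x28;1&#x29;
```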
I'm no expert on web development, and need to find a way to let the browser call a PHP routine on the server with the current document ID as parameter, eg.
http://www.acme.com/index.php?id=1
I then need to call eg. /change.php with id=1 to do something about that document.
Unless I'm mistaken, there are three ways for the client to return this information:
if passed as an argument in the URL (as above), it will be available as the HTTP referrer
by including it as a hidden field in a form
by sending it as a cookie
I suppose using a hidden field is the most obvious choice. Are there other ways? Which solution would you recommend? Are there any security issues to be aware of?
Thank you.
You can also POST the data, so it won't be seen in the URL, with <form method="post">.
All of these methods are, to a point, insecure, as they can be manipulated by a savvy user/hacker. You could serve your site over HTTPS, limiting any man-in-the-middle attacks. Be sure to check and validate incoming data.
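Whichever way the id reaches the server, the check on the incoming value might look something like this. It is sketched in Python instead of PHP purely for illustration; parse_document_id and the upper bound are assumptions, and the same idea carries over to any language.

```python
from typing import Optional

def parse_document_id(raw: Optional[str], max_id: int = 2**31 - 1) -> Optional[int]:
    """Return the id as a positive integer, or None if the input is not acceptable."""
    if raw is None:
        return None
    raw = raw.strip()
    # Only plain ASCII digits, and not absurdly many of them.
    if not raw or len(raw) > 10 or not raw.isascii() or not raw.isdigit():
        return None
    value = int(raw)
    if value < 1 or value > max_id:
        return None
    return value

print(parse_document_id("1"))                   # 1
print(parse_document_id("1; DROP TABLE docs"))  # None
```

Validating the format is only half of it: the server should also check that the current user is actually allowed to change the document with that id.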
Ajax is another option as well, and it allows you to send that information without refreshing the page.
http://www.acme.com/index.php?id=1
The above url would be more "browser friendly" if you transform it into something similar to this:
http://www.acme.com/index/page/1
I am sure you can achieve this in Apache. Or Java Servlets.
I am trying to pass a link within another link, like this:
http://www.1st_site.com/?u=http://www.2nd_site.com/?parameter1=xyz
I think the problem is that parameter1=xyz is passed as a parameter to 1st_site.
Is there any way to avoid that?
You need to URL-encode the entire URL that is used as the query parameter value, i.e. this part: http://www.2nd_site.com/?parameter1=xyz. Otherwise it will be interpreted as part of the request URL.
It's unclear what programming language you're using, but most decent web-oriented languages provide functions/methods/classes to achieve this, e.g. URLEncoder in Java, c:url and c:param in JSP/JSTL, urlencode() in PHP and encodeURIComponent() in JavaScript (not escape(), which is deprecated and does not produce proper UTF-8 percent-encoding).
Here's at least an online URL encoder: http://meyerweb.com/eric/tools/dencoder/. If you input http://www.2nd_site.com/?parameter1=xyz, you should get http%3A%2F%2Fwww.2nd_site.com%2F%3Fparameter1%3Dxyz back so the request URL should effectively end up in:
http://www.1st_site.com/?u=http%3A%2F%2Fwww.2nd_site.com%2F%3Fparameter1%3Dxyz
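For reference, the same encoding can be done programmatically. Here is a minimal Python sketch using urllib.parse from the standard library, with the example URLs from the question; it also shows that the receiving side gets the inner URL back intact.

```python
from urllib.parse import parse_qs, quote, urlsplit

inner = "http://www.2nd_site.com/?parameter1=xyz"

# safe="" makes quote() encode "/" as well, so the whole inner URL is one opaque value.
outer = "http://www.1st_site.com/?u=" + quote(inner, safe="")
print(outer)
# http://www.1st_site.com/?u=http%3A%2F%2Fwww.2nd_site.com%2F%3Fparameter1%3Dxyz

# On the receiving side, parsing the query string decodes the value again.
recovered = parse_qs(urlsplit(outer).query)["u"][0]
print(recovered == inner)  # True
```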
How can I do this without getting "forbidden"? Other sites do it; for example, http://twitter.com?status=http://somesite.com works just fine. I've been looking everywhere for an answer. Please can somebody help! Please note my example is automatically encoded (imagine it without the %3A).
You will need to encode the URL. A query string with an unencoded URL is going to be a problem.
If you don't encode URLs inside URLs, then whoever is interpreting the result may not see it as a valid URL. This is because in your example
http://twitter.com?status=http%3A//somesite.com
The %3A is a colon. According to the URI specification, the colon is the scheme delimiter (http, ftp, irc, whatever). Strictly speaking, RFC 3986 permits a literal colon inside the query, but not every server or filter is that lenient, and a receiver that considers the URL badly formed may return an error message or try to interpret it without guaranteeing a positive response.
Technically the // should also be escaped, since they are path delimiters, but only a server serving static content would react to that.
For the URI specification, see http://labs.apache.org/webarch/uri/rfc/rfc3986.html
If you are asking how to do this in JavaScript, you should use encodeURIComponent/decodeURIComponent rather than escape/unescape; unlike escape, encodeURIComponent also encodes the / character, so no special case is needed.
Take a look at this reference.
I have a page in which users submit URLs, some of which contain &, = etc. Now if I want the page to validate with W3C, I need to write them as entities such as &amp;. How can I do this automatically? Also, should I even bother?
You should encode the URLs on the server side, then. Not knowing what backend language you use, here's a list:
* htmlentities() - PHP
* HttpUtility.UrlEncode() - ASP.NET
* CGI.escape() - Ruby (URI.escape() was removed in Ruby 3.0)
* URLEncodedFormat() - ColdFusion
* urllib.parse.quote() - Python 3 (urllib.urlencode() in Python 2)
* java.net.URLEncoder.encode() - Java
Yes, you should bother, and it's quite simple. Saying, "Oh, look how many invalid pages there are" does not excuse your contributions to the problem. Every major language either has this functionality built-in (as Can noted for PHP) and/or can implement it trivially.
If users are submitting URLs and you want to help them avoid errors, then I'd validate each URL by calling it. Use the HTTP HEAD method to check that the URL responds.
This will take more programming than statically looking at the url string. You'll want to think about using a helper process, returning the result asynchronously to the original submit, etc. But that's the sort of stuff which separates the students from the professionals.
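A bare-bones version of such a HEAD check could look like this in Python. url_responds is a hypothetical helper, the 5-second timeout is an arbitrary assumption, and a real deployment would run it in a background worker as described above.

```python
import http.client
import urllib.error
import urllib.request
from urllib.parse import urlsplit

def url_responds(url: str, timeout: float = 5.0) -> bool:
    """Send a HEAD request and report whether the server answered without an error."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    # Only probe plain web URLs; anything else is rejected without a network call.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return False
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Covers DNS failures, refused connections, timeouts and 4xx/5xx responses.
        return False
```

Some servers refuse HEAD requests even though a GET would succeed, so a False result is better treated as a hint to show the user than as a hard rejection.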
You need to use %26 instead of &.
In the general case though, find a URL encoder function in whatever language you're using.
I'd say don't even bother. See Jeff's post on the subject: HTML Validation: Does It Matter?
On the other hand, if you're a perfectionist, properly escaping query strings should be pretty trivial in any language. For example, you can use htmlspecialchars, htmlentities, urlencode or rawurlencode in PHP.
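Since the answers here mix two different operations, a short Python sketch may help tell them apart: HTML escaping is what makes a URL valid inside markup, while percent-encoding is what protects a literal & inside a parameter value. The URLs and variable names are made up for the example.

```python
import html
from urllib.parse import quote

# 1. Putting a user-submitted URL into an href: HTML-escape it (& becomes &amp;).
user_url = "http://example.com/search?q=cats&page=2"
href = html.escape(user_url, quote=True)
print('<a href="{}">link</a>'.format(href))
# <a href="http://example.com/search?q=cats&amp;page=2">link</a>

# 2. Putting a value that contains & into a query string: percent-encode it (& becomes %26).
value = "alice&bob"
query_url = "http://example.com/?name=" + quote(value, safe="")
print(query_url)
# http://example.com/?name=alice%26bob
```

In the first case the browser decodes &amp; back to & before following the link, so the URL the user submitted is unchanged.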