Web security, are there issues with hidden fields (no sensitive data)? - html

I was having a discussion with coworkers. We have to implement some security standards. We know not to store sensitive information (addresses, dates of birth and so on) in hidden fields, but is it OK to use hidden fields in your application in general?
For example:
action=goback
It seems like it would be safer to use hidden fields for that kind of information as opposed to adding it in the query string. It is one less piece of information that a hacker could use against your application.

A hacker can access hidden fields just as easily as querystring values by using an intercepting proxy (or any number of tools).
I don't think there is anything wrong with using hidden fields as long as they aren't used for anything sensitive and you validate them like you would any other value from the client.

Making a field "hidden" has pretty much nothing to do with security, and should be considered a UI decision. Any "hacker" will read your HTML source anyway.
Better to either not show sensitive information at all, or, if you must, to use SSL (to prevent data interception by network intermediaries) and some combination of login challenges (to prevent unauthorized access).

It's only a security hole if you're exposing information that wouldn't be otherwise available to the end user and/or aren't validating it on return.
I'd look at storing said information in a server-side session variable instead...

Storing your data in a hidden field is, from a security standpoint, exactly the same as storing it in the query string. In fact, if your form uses the GET method, it ends up in the query string anyway.
Hidden fields are completely unrelated to security in any way; they are simply a method by which data can be stored in a form without forcing the user to see it. They do not provide a way of preventing the user from seeing it.

Hidden fields are not always an issue, but they should always ring alarm bells as they have two potential problems:
1) If the data is sensitive, it exposes it to the client (e.g. using a proxy, or simply view source - and it is pointless to try and prevent this programmatically)
2) If the data is interpreted by the server, a knowledgeable user can change it. To take a silly example, if the hidden field contains the user's bank balance, they could use a proxy or some non standard client to make the server think their bank balance is anything they choose.
The second one is a big source of vulnerabilities in webapps. Data associated with the session should be held server side, unless you have a means of validating it on the server (for example if the field is signed or encrypted by the server).
Provided you are sure you're not falling into either of these traps, they can be OK to use. As a rule of thumb, I would not use hidden fields except for data you would be happy to see in the query string, or if JavaScript needs them for processing. In the latter case, you still need to make sure the server validates them; don't assume the client will run your JavaScript.

Consider encrypting the name and value of your hidden field for the purpose of tamper checking, since hackers can still get hold of your hidden fields and manipulate them however they want.

As other people have mentioned both the query string and hidden fields are essentially public data, viewable by the user.
One thing to keep in mind if you place data in the query string is that people pass URLs around, so a URL should never contain any information specific to the current user.
It is also probably a good idea not to include state information in the url, if that state can not be entered directly. Or at least you would need to handle invalid state information in the querystring.

I would say that this is no more or less safe than placing the item in the query string. After all, one could always view source on the site (and there isn't any way to prevent that, since one could always programmatically download the source).
A better solution here would be to encrypt the names of the fields and the values with a key that is generated on the server, and only the server. Unless the server was hacked, the client wouldn't have any clue what the name of the value is, or its value.
Of course, since this is coming from the client, you still have to check the validity of the data coming back, don't just take for granted that it hasn't been altered in a manner that you didn't dictate.
To that end, you will want to use hashing (a keyed hash such as an HMAC, so that the client cannot simply recompute it) to make sure that the value hasn't been tampered with.
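A minimal sketch of that tamper check, assuming a keyed hash (HMAC-SHA256) from Python's standard library. The names `SECRET_KEY`, `sign_field` and `verify_field` are made up for illustration; the idea is that the server renders the tag next to the hidden value and recomputes it when the form comes back.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret. In a real app it would be loaded from
# configuration and must never be sent to the client.
SECRET_KEY = secrets.token_bytes(32)


def sign_field(name: str, value: str) -> str:
    """Return a hex HMAC-SHA256 tag covering both the field name and its value."""
    # The length prefix keeps ("a=b", "c") and ("a", "b=c") from colliding.
    message = f"{len(name)}:{name}={value}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def verify_field(name: str, value: str, tag: str) -> bool:
    """Recompute the tag and compare in constant time; False means altered."""
    expected = sign_field(name, value)
    return hmac.compare_digest(expected.encode("utf-8"), tag.encode("utf-8"))
```

This only detects changes; the value is still readable by the user. It also does not stop someone replaying an old, validly signed value, so anything that must stay current probably still belongs in the server-side session.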

In general, don't use hidden form fields for sensitive data. Use them only for static, non-sensitive POST data that you realise is not safe to handle as it is received. The only time I use them is to store session tokens, which are rendered into the form and checked upon receiving the POST, to prevent CSRF attacks or at least make them a great deal harder.
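Roughly what that token check could look like, as a sketch rather than a drop-in implementation. It assumes the framework gives you a per-user, server-side `session` that behaves like a dict; `issue_csrf_token` and `is_valid_csrf_token` are hypothetical names.

```python
import hmac
import secrets


def issue_csrf_token(session: dict) -> str:
    """Create a random token, remember it server side, return it for the form."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token


def is_valid_csrf_token(session: dict, submitted: str) -> bool:
    """Accept the POST only if the hidden field matches the session's token."""
    expected = session.get("csrf_token")
    if not expected or not submitted:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), submitted.encode("utf-8"))
```

The token goes into the page as a hidden field when the form is rendered, and the POST handler calls `is_valid_csrf_token` before doing anything else.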

In addition to all the other useful advice by other posters, I'd also add that hidden fields make your app no less vulnerable to SQL injection attacks than URL query string values do. As always, sanitise your input.

Related

Storing malicious code in a database - is escape-on-output always the correct approach?

Just want to understand the thinking here and arrive at a correct and accepted approach to this issue. For context this is in a web environment and we are talking about escaping on input to the database.
I understand many of the reasons behind not escaping on input when taking user input and storing it into a database. You might want to use that input in a variety of different ways (as JSON, as SMS etc) and you also might want to show that input to the user in its original form.
Before putting anything into the database we make sure there is no SQL injection attacks to protect the database.
However, the principles set out here and here suggest the approach of saving user input as is. This user input might not be an SQL injection attack, but it could be other malicious code. In these cases is it OK to store Javascript based XSS attacks into the database?
I just want to know if my assumptions here are correct, are we all fine with storing malicious code in the database so long as that malicious code doesn't directly affect the database? Is it a case of it not being the database's problem, it can hold this malicious code and its up to the output device to avoid the pitfalls of the malicious code?
Or should we be doing more escaping on input than suggested by these principles - do the security concerns come before the idea of escaping on output? Should we take the approach that no malicious code enters the database? Why would we want to store malicious code anyway?
What is the correct approach for saving malicious code into a database in the context of a web client/server environment?
[For the purposes of this I am ignoring any sites that specifically allow code to be shared on them, I am thinking of "normal" inputs such as Name, Comment and Description fields.]
Definition: I use the term "sanitize" instead of filter or escape, because there's a third option: rejecting invalid input. For example, returning an error to the user saying "character ‽ may not be used in a title" prevents ever having to store it at all.
saving user input as is
The security principle of "defense in depth" suggests that you should sanitize any potential malicious input as early and often as possible. Whitelist only the values and strings useful to your application. But even if you do, you'll have to encode/escape these values too.
Why would we want to store malicious code anyway?
There are times when accuracy is more important than paranoia. For example, user feedback may need to include potentially disruptive code. I could imagine writing user feedback that says, "Every time I type %00 as part of a wiki title the application crashes." Even if wiki titles don't need the %00 characters, the comment should still transmit them accurately. Failing to allow this in comments prevents operators from learning about a serious issue. See: Null Byte Injection
up to the output device to avoid the pitfalls of the malicious code
If you need to store arbitrary data, the correct approach is to escape as you switch to any other encoding type. Note that you must decode (unescape) and then encode (escape); there is no such thing as non-encoded data - even binary is at least big-endian or little-endian. Most folks use the language's built-in strings as the 'most decoded' format, but even that can get wonky when considering Unicode vs ASCII. User input in web applications will be URL-encoded, HTTP-encoded, or encoded according to the "Content-Type" header. See: http://www.ietf.org/rfc/rfc2616.txt
Most systems now do this for you as part of templating or parameterized queries. For example, a parameterized query function like Query("INSERT INTO table VALUES (?)", name) would prevent the need to escape single quotes or anything else in the name. If you don't have a convenience like this, it helps to create objects that track data per encoding type, such as HTMLString with a constructor like NewHTMLString(string) and Decode() function.
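As a concrete version of the Query("INSERT INTO table VALUES (?)", name) idea, here is a small sketch using Python's built-in sqlite3 module. The `people` table, the `save_name` helper and the hostile string are all made up for the example.

```python
import sqlite3

# A name that would break a query built by string concatenation.
HOSTILE_NAME = "Robert'); DROP TABLE people;--"


def save_name(conn: sqlite3.Connection, name: str) -> None:
    """Insert the name with a placeholder, so the driver handles the quoting."""
    conn.execute("INSERT INTO people (name) VALUES (?)", (name,))
    conn.commit()


def demo() -> list:
    """Store the hostile name as-is and read it back unchanged."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (name TEXT)")
    save_name(conn, HOSTILE_NAME)
    rows = conn.execute("SELECT name FROM people").fetchall()
    conn.close()
    return rows
```

The value is stored byte for byte, quotes included, and the table is still there afterwards. That covers the SQL context only; the same string still has to be escaped again when it is written into HTML or anything else.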
Should we take the approach that no malicious code enters the database?
Because the database cannot determine all future possible encodings, it is impossible to sanitize against all potential injections. For example, SQL and HTML may not care about backticks, but JavaScript and bash do.
This user input might not be an SQL injection attack, but it could be
other malicious code. In these cases is it OK to store Javascript
based XSS attacks into the database?
It could be OK depending on your use case. Theoretically, the database should be agnostic of the usage of the data it stores. As a result, it would be reasonable to store raw data in the database and escape it during output, depending on the output medium used.
I just want to know if my assumptions here are correct, are we all
fine with storing malicious code in the database so long as that
malicious code doesn't directly affect the database? Is it a case of
it not being the database's problem, it can hold this malicious code
and its up to the output device to avoid the pitfalls of the malicious
code?
As explained above, whether a piece of data is "malicious" is highly dependent on the context and the way it's used. To give an example, <script>...</script> as a piece of data can cause serious issues if rendered in an HTML webpage. However, it could be considered a perfectly legitimate payload to be shown in a printed document/report. That's the rationale behind the generic suggestion of storing data in its raw form and escaping it accordingly, depending on the output medium. To answer your question directly: yes, one could argue that storing this data in the database is absolutely OK, as long as the escaping mechanisms are in place for all the possible media.
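To make "escape depending on the output medium" concrete, here is a small sketch using only the Python standard library. The stored value and the two render functions are hypothetical; the point is that the same raw string gets a different treatment for each medium.

```python
import html
import json

# Raw value exactly as it was stored in the database.
STORED_COMMENT = "<script>alert('hi')</script>"


def render_html_comment(raw: str) -> str:
    """HTML context: escape <, >, & and quotes so the browser shows text."""
    return f"<p>{html.escape(raw, quote=True)}</p>"


def render_json_comment(raw: str) -> str:
    """JSON API response: the JSON encoder applies its own escaping rules."""
    return json.dumps({"comment": raw})
```

The HTML version shows the markup as literal text instead of running it, while the JSON version round-trips the original string untouched for a client that wants the raw data.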
Or should we be doing more escaping on input than suggested by these
principles - do the security concerns come before the idea of
escaping on output? Should we take the approach that no malicious code
enters the database? Why would we want to store malicious code anyway?
There is a small difference between sanitization and escaping. The former refers to the process of filtering out invalid data before storing it, while the latter refers to transforming data into the proper format before displaying it on the selected medium. Following the principle of defense in depth, you could (and you should, if possible) perform an additional step of sanitization when receiving the data. However, in order to achieve this, a prerequisite is that you must know the nature of the data you expect. For instance, if you expect a telephone number, then it would make sense to flag data containing <script> as invalid to the user. That would not necessarily be true if you expected the report for a programming assignment. So, everything is dependent on the context.
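For the telephone number case, sanitization on input could be as simple as the sketch below. The pattern is a deliberately loose assumption (an optional +, then digits, spaces, brackets and dashes), not a real phone number standard, and `validate_phone` is a hypothetical helper.

```python
import re

# Loose, illustrative pattern: optional "+", a digit, then 5 to 19 more
# digits, spaces, brackets or dashes. Not a real telephony standard.
PHONE_PATTERN = re.compile(r"\+?[0-9][0-9 ()\-]{5,19}")


def validate_phone(raw: str) -> str:
    """Return the trimmed number, or raise ValueError so the user can be told."""
    candidate = raw.strip()
    if not PHONE_PATTERN.fullmatch(candidate):
        raise ValueError("not a valid phone number")
    return candidate
```

Anything containing <script> is rejected here simply because it is not a phone number, not because the code tries to recognise attacks. A free-text field such as a programming report would need no such filter, only escaping on output.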

Input form max length prevent editing?

I'm wondering how the heck you make it impossible for others to change the values and the maxlength by using the source viewer. It causes me a lot of trouble because there are people changing these fields and typing a whole book into them. That screws up my database; how do I prevent that? I heard something about HTML5, and that it should be done server-side, but I don't really get it, nor did I find anything about maxlength.
The text is being inserted into a database, and when it is retrieved it gets messed up because of the lengths.
You really can't prevent users from messing with the HTML through the browser dev tools.
Which is why you should also validate on the server side as well as the client side. On your code that processes the form post, do validation/sanity checks for all the data, including string length, min/max values etc. Only when the data passes your checks do you allow it into the database.
Depending on your framework/platform, there may be validation libraries that can assist with this.
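A bare-bones sketch of such a server-side length check, independent of any framework. The field names and limits in `MAX_LENGTHS` are invented for the example and should mirror whatever your database columns actually allow.

```python
# Hypothetical limits; keep them in line with the database column sizes.
MAX_LENGTHS = {"title": 100, "comment": 2000}


def validate_lengths(form: dict) -> dict:
    """Return a dict of field -> error message; empty means the form is fine."""
    errors = {}
    for field, limit in MAX_LENGTHS.items():
        value = form.get(field, "")
        if len(value) > limit:
            errors[field] = f"must be at most {limit} characters"
    return errors
```

The handler that processes the POST would call this first and only touch the database when the result is empty. The maxlength attribute in the HTML can stay as a convenience for honest users.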

At what level should I check for the correctness of a form field?

I am writing a web application using JSP, with a mysql database that keeps track of all the users. In a web page I use a form to allow users to register.
In my database, for example, the username of a user has a maximum of 20 characters, so I want to prevent a user from registering with a username longer than 20. In my application I am strictly separating all levels, so there's a strong separation between services, business logic, business flows and the presentation level done with JSP pages.
My concern is about where I should check that any given field is correct. In the business logic I implement a class that abstracts the concept of a user, which allows creating a new user and inserting it into my database. In the business flows (that is, beans) I can process all the HTTP parameters received, so I know all the field values. I could do the check in my JSP page (even with JavaScript analyzing every inserted field and conditionally submitting the form), in the beans, or in my "user" class. Which one would be the most correct?
Assuming you're using a pattern close to MVC:
Input validation belongs in the controller. It's up to your controller to process the data, then display user-friendly error messages by passing them to your view.
Any processing has to be done in the controller, and validating data is processing.
An extra layer of checking in the model isn't a bad thing, but in this case it's of little use because your database engine will truncate the value (or throw an error) if you insert more than 20 characters, so that safeguard is already in place.
Models are only meant to access and store data, not validate it! (Except in some rare cases when data storage needs validation and the database structure doesn't check integrity by itself.)
But again, these are just concepts, and you're free to adopt them in the way you like, as long as you're consistent across your application (don't do some validations in models, some in controllers, and, while we're at it, some in the view!).
I would do it in the model class.
What you must not do is rely on validation done with JavaScript in the client, because the user can disable JS.

Preemptively getting pages with HTML5 offline manifest or just their data

Background
I have a (glorified) CRUD application that I'd like to enable HTML5 offline support with. The cache-manifest system looks simple yet powerful, but I'm curious about how I can allow users to access data while offline.
For example, suppose I have these pages for the entity "Case" (i.e. this is CRM case-management software):
http://myapplication.com/Case
http://myapplication.com/Case/{id}
http://myapplication.com/Case/Create
The first URI contains a paged listing of all cases, using the querystring parameters pageIndex and pageSize, e.g. /Case?pageIndex=2&pageSize=20.
The second URI is the template for editing individual cases, e.g. /Case/1 or /Case/56.
Finally, /Case/Create is the form used to create cases.
The Problem
I would like all three to be available offline.
/Case
The simple way would be to add /Case to the cache-manifest, however that would break paging (as the links wouldn't work).
I think I could instead add something like /Case/AllData which is an XML resource, which is cached and if offline then a script on /Case would use this XML data to populate the list and provide for pagination.
If I go for the latter, how can I have this XML data stored in the in-browser SQL database instead of as a cached resource? I think using the SQL database would be more resilient.
/Case/{id}
This is more complicated. There is the simple solution of manually adding /Case/1, /Case/2, /Case/3 etc... to /Case/1234, but there can be hundreds or even thousands of cases so this isn't very practical.
I think the system should provide access to the 30 most recent cases, for example. As above, how can I store this data in the database?
Also, how would this work? If I don't explicitly add /Case/34 to the manifest and the user clicks on to /Case/34 how can I get the browser to load a page that my JavaScript will populate based on the browser's SQL database data and not display the offline message?
/Case/Create
This one is more simple - as it's just an empty page and on the <form>'s submit action my script would detect if it's offline, and if it is offline then it would add it to the browser's SQL database. Does this sound okay?
Thanks!
I think you need to be looking at a LocalStorage database (though it does have some downsides), but there are other alternatives such as WebSQL and IndexedDB.
Also, I don't think you should be using numeric IDs if you are allowing people to create cases offline, as you will get primary key conflicts; it is probably best to use something like a GUID.
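The GUID idea is the same in any language; here is a short sketch in Python, assuming random (version 4) UUIDs. The `new_case` helper and the `synced` flag are made up for illustration, and in the browser you would use whatever UUID generator your JavaScript environment provides.

```python
import uuid


def new_case(title: str) -> dict:
    """Create a case record whose ID can be generated offline on any device."""
    return {
        "id": str(uuid.uuid4()),  # random, so no coordination with the server
        "title": title,
        "synced": False,          # hypothetical flag for the later upload step
    }
```

Because the ID does not depend on a server-side counter, two devices creating cases while offline will, for all practical purposes, never pick the same key.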
Another thing you need is the ability to push those new cases onto the server. there could be multiple...
Can they be edited? If they can, I think you really need to think very hard about synchronization and conflict resolution.
Shameless self promotion, I have a project that is designed to handle these very issues, though it's not done, it's close. You can see it (with an ugly but very functional) demo at https://github.com/forbesmyester/SyncIt

Rest Service Post parameters

I have REST services that I will be posting data to. Is it better to post the data as HTTP form elements in the POST body, or is it better to post all the data in one JSON string and then parse the string on the server side? Any reason to go one way vs the other?
Thanks in advance. I am trying to make sure architecturally we code this the best way.
Thanks
I think you have to use the first solution because it is closer to the RESTful architecture. In addition, this solution is a standard, so you won't need to do extra things to encode / decode the POST parameters.
I think it depends on your data.
If your data is quite flat with a one to one correspondence between keys and simple values then the form style submission is probably most appropriate. If you have more complex nested data or an array of some kind I would roll with the json approach. I don't think either option is more or less RESTful.
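To illustrate the flat vs nested distinction, here is a sketch of how a server might parse each style of body with the Python standard library. The field names and payloads are made up, and `parse_form` keeps only the first value per key, which is an assumption that only suits flat data.

```python
import json
from urllib.parse import parse_qs


def parse_form(body: str) -> dict:
    """Parse an application/x-www-form-urlencoded body into flat key/value pairs."""
    # parse_qs returns a list per key; keep the first value for flat data.
    return {key: values[0] for key, values in parse_qs(body).items()}


def parse_json(body: str) -> dict:
    """Parse an application/json body, which may contain nested objects and lists."""
    return json.loads(body)
```

A flat login-style payload maps cleanly onto the form parser, while something like a user with a list of tags is awkward to express as form fields and falls out naturally from the JSON parser.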
Form elements are the way to go. If you use json in your post, then you need to communicate the structure to the clients. This is usually done out-of-band (I've never seen it done in-band, but I might be wrong), which creates a coupling between the client and the server.
When you use a form, the in-band form communicates to the client what the post data should be. When the data requirements change, the form is changed and the client can (possibly) adjust accordingly.
For instance, say you've defined the following nouns in your media-type: email, password, first-name, last-name, date-of-birth etc., and you have a user creation form that requires email and password, with the other user data optionally populated later on (via another form). Later it's decided that you want users to provide their name when the account is created, so you update the form so that it requires email, password, first-name and last-name. Since the clients are already familiar with these nouns (and know what data belongs in each), well-written clients will be compatible with the updated form. If it was just JSON data being posted, the clients would not work, as they would have no idea the required JSON data had changed (unless you change the media-type, in which case you'll break them anyway).
Now this approach only works for nouns that have been defined in your media-type. If you are adding a new noun, then you can either only ever make it optional (existing clients will still work, new clients can take advantage of the new noun) or if you need to make it required, then you need to create a new media-type, which only new (or updated) clients will be able to use.