relying on hidden inputs and query strings - language-agnostic

Suppose that I have a request handler that accepts an argument: key
And let the request be:
http://example.com/2323
When the handler receives a GET, relevant data is fetched from db based on this key, fed to a form and displayed. In the process, the value of key is put in a hidden input.
When it receives a POST, it has the key argument from the query string, as well as the key from the hidden input, which are the same, provided that the user has not tampered with them.
I'd like to know if it's the hidden input or the query string argument I should rely on when the data on the form will be saved to db. The problem is that query string may be modified by the user prior to post, just like the hidden input may also be modified since the source is open to the user.

Well, any data that you send to the client has the potential to be modified (it may not be trivial, but the possibility exists, nevertheless).
There are many options; Querystrings, hidden fields, cookies to name a few. Each of them suffers from this very drawback - the possibility that a malicious user may modify the data in those entities.
Your best bet is to use strong encryption. Whether the data is in a hidden field or a cookie, it can easily be encrypted. When the request comes back, the value can be compared with the one the server issued in the earlier response and decrypted with reasonable confidence that it has not been tampered with. For a good example, research how ASP.NET ViewState works.
So, to answer your question, you should rely on neither state persistence method without additional security implementation. As they say, Security through obscurity is not security at all.
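One way to act on this, as a minimal sketch only: the server attaches a keyed tag to the key when it renders the form, and on POST it accepts the request only if the URL key and the hidden key agree and the tag verifies. Note that this uses a message authentication code (HMAC) rather than encryption, since the key itself does not need to be hidden, only protected from modification. `SECRET_KEY`, `sign_key` and `check_post` are hypothetical names invented for this example.

```python
import hashlib
import hmac

# Assumption: a server-side secret that is never sent to the client.
SECRET_KEY = b"replace-with-a-long-random-secret"


def sign_key(key: str) -> str:
    """Return a hex HMAC-SHA256 tag for the record key."""
    return hmac.new(SECRET_KEY, key.encode("utf-8"), hashlib.sha256).hexdigest()


def check_post(url_key: str, hidden_key: str, hidden_tag: str) -> bool:
    """Accept the POST only if both keys agree and the tag verifies."""
    if url_key != hidden_key:
        return False
    expected = sign_key(hidden_key)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected.encode(), hidden_tag.encode())


tag = sign_key("2323")                    # rendered into the form on GET
print(check_post("2323", "2323", tag))    # untampered request
print(check_post("9999", "9999", tag))    # both keys changed, tag no longer fits
```

Even with a valid tag, the handler still has to check that the current user is allowed to edit that record; the tag only shows that the key is one the server handed out.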

What is the use of all GET, PUT, DELETE when anything can be done by POST in the most secured way of communication in REST API calls

I have read a lot about each of the methods mentioned in the title,
but I am still unsure what the primary use of each one is. Can someone explain it to me in detail? Thank you.
What is the use of all GET, PUT, DELETE when anything can be done by POST
This is a pretty important question.
Note that, historically, SOAP essentially did everything by POST; effectively reducing HTTP from an application protocol to a transport protocol.
The big advantage of GET/PUT/DELETE is that the additional semantics that they promise (meaning, the semantics that are part of the uniform interface agreed to by all resources) allow us to build general purpose components that can do interesting things with the meta data alone, without needing to understand anything specific about the body of the message.
The most important of these is GET, which promises that the action of the request is safe. What this means is that, for any resource in the world, we can just try a GET request to "see what happens".
In other words, because GET is safe, we can have web crawlers, and automated document indexing, and eventually Google.
(Another example - today, I can send you a bare URI, like https://www.google.com and it "just works" because GET is understood uniformly, and does not require that we share any further details about a payload or metadata.)
Similarly, PUT and DELETE have additional semantic constraints that allow general purpose components to do interesting things like automatically retry lost requests when the network is unreliable.
POST, however, is effectively unconstrained, and this greatly restricts the actions of general purpose components.
That doesn't mean that POST is the wrong choice; if the semantics of a request aren't worth standardizing, then POST is fine.
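As a rough illustration of what a general-purpose component can decide from the method alone, here is a minimal sketch of a retry and prefetch policy. The function names `may_retry` and `may_prefetch` are made up for this example; the two sets follow the safe and idempotent method properties defined in the HTTP specification.

```python
# Method properties as defined by the HTTP specification.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}
IDEMPOTENT_METHODS = SAFE_METHODS | {"PUT", "DELETE"}


def may_retry(method: str) -> bool:
    """A lost request can be resent blindly only if the method is idempotent."""
    return method.upper() in IDEMPOTENT_METHODS


def may_prefetch(method: str) -> bool:
    """A crawler or cache can issue the request on its own only if it is safe."""
    return method.upper() in SAFE_METHODS


for m in ("GET", "PUT", "DELETE", "POST"):
    print(m, may_retry(m), may_prefetch(m))
```

Nothing here looks at the URI or the body, which is the point: POST makes no such promises, so the component has to treat it conservatively.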
From an API perspective:
GET - retrieves records/data from a source. The API needs no data from the client/UI to retrieve all records, or it needs a query param / path param to filter records, for example by a particular ID or by other properties.
POST - stores a new record at a source. The API receives the record from the client/UI in the request body and stores it.
PUT - updates an existing record at a source. The API receives the updated record along with its ID and replaces the existing record whose ID matches the one passed from the UI.
DELETE - deletes a record present in the source. The UI sends nothing to delete all records at the source, or sends an ID to remove a particular record.
Source refers to any database.

GET request in Django

I have a doubt. When is a GET request sent? I mean, I have seen a lot of people using if request.method == 'GET' when they render the form for the first time, but when the form is submitted, they do a 'POST' request.
While they explicitly mention when defining the form in html that the method will be 'POST', they don't do the same for 'GET' request which is made when an empty form is requested.
How does django know it's a GET request?
And, why is it done so?
Thanks,
I'm not an expert, but I think Django "knows" this because, like everything on the Web, it uses the HTTP protocol. There are several HTTP methods. If none is specified (for example, when you type a URL or follow a link), the method used is GET.
GET
A GET is usually used to retrieve information. Typically a GET function has no side-effects (this means that data in the database is not changed, that no files in the filesystem are modified, etc.).
Strictly speaking this is not always true, since some webservers log requests (themselves) and thus add an entry to the database that a specific user visited a specific page at a specific timestamp, etc.
A typical GET request is idempotent. This means that there is no difference between doing a query one time, or multiple times (two times, three times, five times, thousand times).
GET queries are therefore typically used to provide static content, as well as pages that contain data about one or more entries, search queries, etc.
POST
POST on the other hand typically ships with data (in the POST parameters), and usually the idea is that something is done with this data that creates a change in the persistent structures of the webserver. For example creating a new entry in some table, or updating the table with the values that are provided. Since these operations are not always idempotent, it can be dangerous if the user refreshes the page in the browser (since that could for example create two orders, instead of the single order the user actually wanted to create).
Therefore in Django a POST request will typically result in some changes to the database, and a redirect as result. This means that the user will typically obtain a new address, and perform a GET request on that page (and that GET is idempotent, hence it will not construct a new order).
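The render-on-GET, save-and-redirect-on-POST pattern can be sketched without any framework. This is not Django's API: `Request`, `ORDERS` and `order_view` are hypothetical stand-ins, and the redirect is represented simply as a status code plus a target path.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """Hypothetical stand-in for a framework request object."""
    method: str = "GET"                      # a plain page load is a GET
    data: dict = field(default_factory=dict)


ORDERS = []  # stands in for a database table


def order_view(request: Request):
    """Return (status, payload): render on GET, save and redirect on POST."""
    if request.method == "POST":
        ORDERS.append(request.data["item"])
        # A 303 redirect makes the browser follow up with a GET, so a
        # refresh re-requests the result page instead of re-posting the form.
        return 303, "/orders/done/"
    return 200, "<form method='post'>(form fields)</form>"
```

Calling `order_view(Request())` renders the empty form, while `order_view(Request("POST", {"item": "book"}))` stores one order and answers with the redirect.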
PUT, PATCH and DELETE
Besides the popular GET and POST, there are other typical requests a client can make to a webserver. For example PUT, PATCH and DELETE.
PUT
PUT is the twin of a POST request. The main difference is that the URI it hits, specifies what entry to construct or update. PUT is usually an idempotent operation.
This means that if we would, for example, perform a POST server.com/blog/create to create a blog, a PUT will typically look like PUT server.com/blog/123/. So we specify the id in advance. In case the object does not yet exist, a webserver will typically construct one. In case the entity already exists, a new entity will typically be constructed for that URI, replacing the old one. So performing the same PUT operation twice should have no additional effect.
Note that in case of a PUT request, typically one should specify all fields. The fields that are not specified will typically be filled in with default values (in case such values exist). We thus do not really "update" the entity: we destroy the old entity and create a new one in case that entity already exists.
PATCH
PATCH is a variant of PUT that updates the entity, instead of creating a new one. The fields that are thus missing in a PATCH request, typically remain the same as the values in the "old" entity so to speak.
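A small sketch of the difference, using a plain dict as the store. The `DEFAULTS` table and the `put`/`patch` helpers are assumptions made up for illustration, not any framework's API.

```python
# Hypothetical default values for a blog entity.
DEFAULTS = {"title": "", "body": "", "published": False}


def put(store: dict, key: int, fields: dict) -> None:
    """Replace the whole entity; unspecified fields fall back to defaults."""
    store[key] = {**DEFAULTS, **fields}


def patch(store: dict, key: int, fields: dict) -> None:
    """Update only the supplied fields of an existing entity."""
    store[key].update(fields)


blogs = {}
put(blogs, 123, {"title": "Hello", "body": "First post"})
patch(blogs, 123, {"published": True})       # title and body are kept
put(blogs, 123, {"title": "Hello again"})    # body and published are reset
print(blogs[123])
# {'title': 'Hello again', 'body': '', 'published': False}
```

Repeating the last `put` leaves the store unchanged, which is the idempotence described above.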
DELETE
Like the name already suggests, if we perform a DELETE server.com/blog/123/ request, then we will typically remove the corresponding element.
Some servers do not immediately remove the corresponding element. You can see it as scheduling the object for deletion, so sometimes the object is later removed. The DELETE request, thus typically means that you signal the server to eventually remove the entity.
Actually, Django is based on HTTP requests and responses. HTTP is fully textual, so Django parses each request and finds the request method at the start of it. I may be mistaken in the details, but as I understand it, when the server receives a request, Django creates its request object, which contains all the data from the HTTP request. You then decide whether you need a specific action on GET or POST by checking the type of the request with request.method.
P.S. And yes, by default each request is GET.

How do you detect that a visitor changed a value in the query string?

For our last week in school (finals next week) our teacher decided to give us a crash course in Perl. We talked about all the differences we would encounter if we used Perl and then we started talking about "spoofing".
We were given an HTML example where a user could input their first and last names. Of course our example already had Mickey as the first name and Mouse as the last name.
<form action="action_page.php">
First name:<br>
<input type="text" name="firstname" value="Mickey">
<br>
Last name:<br>
<input type="text" name="lastname" value="Mouse">
<br><br>
<input type="submit" value="Submit">
</form>
At the end when you hit submit you were redirected to a new screen that said your first name is Mickey and your last name is Mouse.
Our teacher said "spoofing" is when you change the values in the URL of a form submitted with method=get, so instead of having
firstname=Mickey&lastname=Mouse
you would enter something like
firstname=baseball&lastname=bat
That would instantly alter the intended command and you would end up getting first name as baseball and lastname as bat.
This all sounds pretty straightforward, until he said he wanted us to write a program to prevent spoofing without using a post method.
Instead when a user attempts to spoof the system we're supposed to print out some anti-spoofing comment.
Unfortunately, we never really talked about spoofing aside from the examples. I've attempted to Google spoofing to see some example code, or at least understand this concept, but I haven't had much luck, or I haven't looked in the right places.
So I thought I would ask here. Can someone who is decent at Perl direct me towards basic anti-spoofing programs and content, or at least explain and show how spoofing is supposed to work.
What you need to do is to authenticate the data in the query string, and validate it when you receive it. There is a standard tool(set) for this: a cryptographic Message Authentication Code (MAC).
Basically, a MAC is a function that takes in a message (any arbitrary string) and a secret key, and outputs a random-looking token that depends, in a complicated way, on both the message and the key. Importantly, it is effectively impossible to compute a valid MAC token for a modified message without knowing the key.
To validate a query string (or some other data) with a MAC, you'd basically follow these steps:
Encode the data into a "canonical" form as a string. For an HTTP URL, you could just use the query string (and/or the entire URL) as it is, although you may wish to normalize it e.g. by %-decoding any characters that don't have to be encoded, and normalizing the case of any %-encoded values (e.g. %3f → %3F).
Alternatively, you could decode the query string into, say, an associative array, and serialize this array in a format of your choice. This can make it easier to combine parameters from multiple sources (e.g. hidden form fields), to add extra data fields (see below) and to choose which fields you want to validate.
Optionally, combine the data with any additional information you wish to associate it with, such as a user ID and/or a timestamp. (You can either transmit the timestamp explicitly, or just round it down to, say, the last hour, and check both the current and the previous timestamp when validating it.) Changing any of these values will change the MAC output, thus preventing attackers from e.g. trying to submit one user's data under another user's account.
Store a secret key (preferably, a securely generated random value of, say, 128 bits) on the server. Obviously, this secret key must be stored so that users cannot access it (e.g. by guessing the path to the config file).
Feed the canonically encoded data and the secret key into the MAC algorithm. Take the result and (if your MAC library doesn't do this for you) encode it in some convenient manner (e.g. using the URL-safe Base64 variant).
Append the encoded MAC token as an extra parameter in the URL.
When you receive the data back, remove the MAC token, feed the rest of the data back into the MAC generation code as described above, and check that the resulting MAC matches the one you received.
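The steps above can be sketched roughly as follows, using HMAC-SHA256 from the standard library. The parameter name `mac`, the helper names and the choice to canonicalize by sorting the parameters are assumptions of this sketch, not the only way to do it.

```python
import base64
import hashlib
import hmac
from urllib.parse import parse_qsl, urlencode

# Assumption: a randomly generated secret stored only on the server.
SECRET_KEY = b"replace-with-16-or-more-random-bytes"


def canonical(params: dict) -> bytes:
    """Serialize parameters in sorted order so equal data yields equal bytes."""
    return urlencode(sorted(params.items())).encode("utf-8")


def make_token(params: dict) -> str:
    """Compute the MAC and encode it as URL-safe Base64 without padding."""
    digest = hmac.new(SECRET_KEY, canonical(params), hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


def sign_query(params: dict) -> str:
    """Return a query string with the MAC appended as an extra parameter."""
    return urlencode({**params, "mac": make_token(params)})


def verify_query(query: str) -> bool:
    """Strip the MAC, recompute it over the remaining data, and compare."""
    params = dict(parse_qsl(query, keep_blank_values=True))
    received = params.pop("mac", "")
    return hmac.compare_digest(make_token(params).encode(), received.encode())


signed = sign_query({"firstname": "Mickey", "lastname": "Mouse"})
print(verify_query(signed))                                # True
print(verify_query(signed.replace("Mickey", "baseball")))  # False
```

To bind the token to a user or a time window, the extra values would be added to `params` before calling `make_token` on both the signing and the verifying side.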
MAC algorithms can be constructed from cryptographic hash functions like MD5 or SHA-1/2/3. In fact, a basic MAC can be obtained simply by concatenating the secret and the message, hashing them, and using the result as the token.
For some hash functions, like SHA-3, the simple MAC construction described above is actually believed to be secure; for older hash functions, which were not explicitly designed with this use in mind, however, it's safer to use the (slightly) more complicated HMAC construction, which hashes the input twice.
Alternatively, there are also MAC algorithms, such as CMAC, which are based on block ciphers (like AES) instead of hash functions. In some cases (e.g. on embedded platforms, where a fast hash function may not be available) these may be more efficient than HMAC; for a web application, however, the choice is essentially a matter of taste.
One difference between GET and POST is that the information for the former is passed in the URL itself. That means you can type what you like in the browser's address bar -- it doesn't have to have come from an HTML form. I think that's what is meant by spoofing here.
The most obvious protection is to calculate a CRC of all the protected fields -- in this case MickeyMouse -- and put that value in a hidden field of the HTML form sent out by the server. Then, when the request comes back, calculate the CRC of the same fields and check that it matches the value of the returned hidden field.
Of course that can be circumvented if the user works out how the protection functions and adds his own calculation of the CRC of his spoofed data as well. But this should be sufficient for a proof of concept.
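For a proof of concept along these lines, a sketch using CRC-32 from the standard library could look like this. The helper names are made up, and the last line shows exactly the weakness mentioned above: a CRC has no secret, so anyone can recompute it for spoofed data.

```python
import zlib


def crc_of(firstname: str, lastname: str) -> int:
    """CRC-32 over the concatenated protected fields."""
    return zlib.crc32((firstname + lastname).encode("utf-8"))


def looks_untampered(firstname: str, lastname: str, submitted_crc: int) -> bool:
    """Recompute the CRC from the received fields and compare."""
    return crc_of(firstname, lastname) == submitted_crc


hidden_crc = crc_of("Mickey", "Mouse")   # placed in a hidden field by the server

print(looks_untampered("Mickey", "Mouse", hidden_crc))     # original values
print(looks_untampered("baseball", "bat", hidden_crc))     # naive spoof is caught
# A user who knows the scheme simply sends a matching CRC of their own:
print(looks_untampered("baseball", "bat", crc_of("baseball", "bat")))
```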
If you want to detect if a user has changed a parameter in the querystring of a url after a form has performed a GET action, then generate a client side hash before the form is submitted. The hash would be based on the values of the form fields, and then compared to a recalculated hash based on the current parameter values on the response page. If the hashes don't match the querystring has been tampered with.
Here's a client side Crypto library to calculate the hashes https://code.google.com/p/crypto-js/
Note this is only for educational use, and wouldn't provide enough security in the real world, as a person could also discover the hashing key by inspecting the page source and use that to generate their own hashes.
A POST method wouldn't prevent spoofing anyway. POST and GET do almost exactly the same thing - they send plain text encoded variables to a web server.
They're insanely easy to "spoof" - the point isn't the spoofing, it's that you shouldn't trust "user input" like that, ever.
I would suggest in the case of the names, it doesn't matter. So what if I fudge your web page to "pretend" I am called "baseball bat" instead?
If it's important, like for example, ensuring I can only see my test results - then you need to handle the data processing server side. One method of doing this is via session tracking - so rather than including field in a web form, I instead use a "session token".
You would 'send' me a username and password - ideally using a hash to make it impossible to 'see' as you're sending it, or in your browser history. And then I would check it against my server, to check if that hash is 'valid' by performing the same operation on the server, and comparing the two.
So perlishly:
#!/usr/bin/perl
use strict;
use warnings;
use Digest::SHA qw ( sha1_base64 );
my ( $firstname, $lastname ) = qw ( Mickey Mouse );
my $timewindow = int ( time / 300 );
my $token = sha1_base64 ( $timewindow.$firstname.$lastname );
print $token;
This produces a token that doesn't last long - it changes every 5 minutes - but it's extremely difficult to tamper with.
The reason for including the time is to avoid replay attacks, whereby if I look in your browser history, I can find "your" token and reuse it. (That's probably the next question after the "spoofing" one though :))
If you sent the parameters with the token, bear in mind that it's actually quite easy for a malicious actor to perform the same calculation themselves, and send some completely faked credentials and tokens.
This is something of a simplistic example though - because really, faked parameters shouldn't matter, because you shouldn't trust them in the first place. If 'Mickey Mouse' is valid, and 'baseball bat' isn't, then your server should detect that when processing the form, and discard the latter, which makes the whole 'form spoofing' thing irrelevant.
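If the token itself does need to resist forgery, one possible variation (a sketch in Python rather than Perl, and not what the script above does) is to mix in a secret that only the server knows and to accept the current or the previous five-minute window. `SECRET_KEY`, `make_token` and `token_is_valid` are hypothetical names, and the `|` separator assumes the names themselves never contain that character.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"replace-with-a-long-random-secret"  # assumption: server-side only
WINDOW_SECONDS = 300                               # five-minute windows


def make_token(firstname: str, lastname: str, now: float = None) -> str:
    """Keyed token over the time window and both names."""
    now = time.time() if now is None else now
    window = int(now // WINDOW_SECONDS)
    message = f"{window}|{firstname}|{lastname}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def token_is_valid(token: str, firstname: str, lastname: str,
                   now: float = None) -> bool:
    """Accept tokens from the current or the immediately preceding window."""
    now = time.time() if now is None else now
    candidates = (
        make_token(firstname, lastname, now),
        make_token(firstname, lastname, now - WINDOW_SECONDS),
    )
    return any(hmac.compare_digest(c.encode(), token.encode())
               for c in candidates)


token = make_token("Mickey", "Mouse")
print(token_is_valid(token, "Mickey", "Mouse"))    # fresh token, same names
print(token_is_valid(token, "baseball", "bat"))    # names changed
```

Without the secret, a user cannot recompute a matching token for changed names, and an old token stops verifying once two windows have passed.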
The question is rather narrowly phrased, so this answer might not quite address what you're asking. But as a matter of policy, if you don't want your users to tamper with your data you should not give them custody of it. Why are you relying on the query string for the user name if the server already knows it? Rely on the client for authentication and for new information, and rely on your records for any information that should stay beyond the user's control.
POST requests can be crafted almost as easily as GET requests, and cryptographic protection, even when it is secure, is only useful to the extent that the client cannot access the encrypted data; so why transmit it back and forth?

Need a design pattern for AJAX requests that don't complete before HTML form is submitted

I've done several forms that follow a similar pattern:
two interdependent form fields, let's say "street address" and "location" (lon/lat).
when user fills in one field, the other is updated via an ajax call.
(eg. if the user fills in street address, do a request to a geocode API and put the result in the location field; if the user fills in the location (eg. via a map UI), do a request to a reverse-geocode API and put the result in the address field.
No problem so far, these are easy to hook up to blur and/or focus change events.)
The problem occurs if the form is submitted before an ajax call completes. In this case one field will have a correct value and the other will be stale. The handler on the server needs to detect that this has happened and update the stale value. We can't just check for the default value because the user might have changed both fields any number of times.
There are a few possible solutions I've thought of, and I'm not fully happy with any of them. I'd love other suggestions.
Solution 1. Use hidden fields as flags to indicate freshness: set the value to 0 by default, reset it to 0 before the ajax request is sent, and set it to 1 when the response comes back. On the server side, check these fields and recompute any field whose freshness flag is set to 0. There is still a potential race condition here but the window is greatly narrowed. I've used this technique and it works (eg. http://fixcity.org/racks/new/). It is annoying though, as it requires more code on both client and server and is another possible source of bugs.
Solution 2. Use synchronous AJAX calls instead ("SJAX"?). Not appealing since AJAX here is just a UI convenience, it's not strictly necessary for the application to work, so I'd rather not make things feel slow - then it becomes UI *in*convenience.
Solution 3. Always do server-side postprocessing. If it's expensive, use caching to make it cheaper - eg. if the value is not stale, that means the client just made the same request via AJAX so we should have populated the cache if needed during the AJAX handler.
This one currently seems the most appealing to me, although it has two limitations:
it can't be used for things that are not safe and idempotent - eg. if the AJAX request was doing a POST;
and it can't even be used for this example, because we have two interdependent fields and no way to know which is correct and which is stale.
When the user presses submit, have it run a validation function that decides what state the form is in by examining the form fields and the state of the ajax call (set a flag, such as ajaxBusy).
You could enhance your AJAX call to both disable the form submit button and set a global var to true that is checked on form submit. That way the user can't submit the form before AJAX completes. I would add a loading graphic for UI sake.
You should validate what is submitted on server-side anyway. If both fields are related 1-1, then you can designate one of them as "master", and submit it alone, while the other one is calculated server-side.
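A minimal sketch of the "master field" idea, assuming the location is the master and the address is always derived from it on the server. `reverse_geocode` is a hypothetical stand-in for a real reverse-geocoding call, and its lookup table contains made-up example data.

```python
def reverse_geocode(location: str) -> str:
    """Hypothetical stand-in for a real reverse-geocoding API call."""
    known = {"40.7128,-74.0060": "New York, NY"}  # illustrative data only
    return known.get(location, "unknown address")


def handle_submit(form: dict) -> dict:
    """Trust only the master field; ignore whatever address the client sent."""
    location = form["location"]
    return {"location": location, "address": reverse_geocode(location)}


# The submitted address may be stale if the AJAX call had not finished yet.
print(handle_submit({"location": "40.7128,-74.0060", "address": "stale value"}))
```

This only works when one field can always be derived from the other; which field was edited last would have to be sent along if either of them may be the master.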

How to detect hidden field tampering?

On a form of my web app, I've got a hidden field that I need to protect from tampering for security reasons. I'm trying to come up with a solution whereby I can detect if the value of the hidden field has been changed, and react appropriately (i.e. with a generic "Something went wrong, please try again" error message). The solution should be secure enough that brute force attacks are infeasible. I've got a basic solution that I think will work, but I'm not a security expert and I may be totally missing something here.
My idea is to render two hidden inputs: one named "important_value", containing the value I need to protect, and one named "important_value_hash" containing the SHA hash of the important value concatenated with a constant long random string (i.e. the same string will be used every time). When the form is submitted, the server will re-compute the SHA hash, and compare against the submitted value of important_value_hash. If they are not the same, the important_value has been tampered with.
I could also concatenate additional values with the SHA's input string (maybe the user's IP address?), but I don't know if that really gains me anything.
Will this be secure? Anyone have any insight into how it might be broken, and what could/should be done to improve it?
Thanks!
It would be better to store the hash on the server-side. It is conceivable that the attacker can change the value and generate his/her own SHA-1 hash and add the random string (they can easily figure this out from accessing the page repeatedly). If the hash is on the server-side (maybe in some sort of cache), you can recalculate the hash and check it to make sure that the value wasn't tampered with in any way.
EDIT
I read the question wrong regarding the random string (constant salt). But I guess the original point still stands. The attacker can build up a list of hash values that correspond to the hidden value.
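A sketch of the server-side variant, with one simplification stated plainly: it stores the protected value itself rather than its hash, and hands the client only an opaque random token. The in-memory `SESSIONS` dict is an assumption standing in for a real session store or cache.

```python
import secrets

# Assumption: an in-memory dict stands in for a real session store or cache.
SESSIONS = {}


def render_form(important_value: str) -> str:
    """Remember the value server-side; the client only gets an opaque token."""
    token = secrets.token_urlsafe(32)
    SESSIONS[token] = important_value
    return token


def handle_submit(token: str):
    """Return the protected value, or None if the token is unknown or reused."""
    return SESSIONS.pop(token, None)


form_token = render_form("42")
print(handle_submit(form_token))   # the value the server stored
print(handle_submit(form_token))   # None, the token is single-use
print(handle_submit("forged"))     # None
```

Because the value never leaves the server, there is nothing on the client to tamper with; the token can only be guessed, not recomputed.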
Digital Signature
It's probably overkill, but this sounds no different than when you digitally sign an outgoing email so the recipient can verify that its origin and contents are authentic. The tamper-sensitive field's signature can be released into the wild with your tamper-sensitive field with little fear of undetectable tampering, as long as you protect the private key and verify the data and the signature with the public key on return.
This scheme even has the nifty property that you can limit "signing" to very protected set of servers/processes with access to the private key, but use a larger set of servers/processes provided with the public key to process form submissions.
If you have a really sensitive "do-not-tamper" field and can't maintain the hash signature of it on the server, then this is the method I would consider.
Although I suspect most are familiar with digital signing, here's some Wikipedia for any of the uninitiated:
Public Key Cryptography - Security
... Another type of application in public-key cryptography is that of digital signature schemes. Digital signature schemes can be used for sender authentication and non-repudiation. In such a scheme a user who wants to send a message computes a digital signature of this message and then sends this digital signature together with the message to the intended receiver. Digital signature schemes have the property that signatures can only be computed with the knowledge of a private key. To verify that a message has been signed by a user and has not been modified the receiver only needs to know the corresponding public key. In some cases (e.g. RSA) there exist digital signature schemes with many similarities to encryption schemes. In other cases (e.g. DSA) the algorithm does not resemble any encryption scheme. ...
If you can't handle the session on the server, consider encrypting the data with a secret key and generating an HMAC for it, then send the results as the hidden field(s). You can then verify that what is returned matches what was sent because, since no-one else knows your secret key, no-one else can generate the valid information. But it would be much better to handle the 'must not be changed' data on the server side.
You have to recognize that anyone sufficiently determined can send an HTTP request to you (your form) that contains the information they want, which may or may not bear any relation to what you last sent them.
If you can't/won't store the hash server side, you need to be able to re-generate it server-side in order to verify it.
For what it's worth, you should also salt your hashes. This might be what you meant when you said:
concatenated with a constant long random string (i.e. the same string will be used every time)
Know that if that value is not different per user/login/session, it's not actually a salt value.
As long as you guard the "constant long random string" with your life then your method is relatively strong.
To go further, you can generate one-time / unique "constant long random strings".
What you are describing is similar to part of the implementation of what are termed canaries, which are used to mitigate Cross-Site Request Forgery attacks.
Generally speaking, a hidden input inside an HTML form contains an encrypted value that is posted back with an HTTP request. A browser cookie or a string held in session contains the same encrypted value, such that when the hidden input value and the cookie/session value are both decrypted, the unencrypted values are compared to each other - if they are not identical, the HTTP request cannot be trusted.
The encrypted value might be an object containing properties. For example, in ASP.NET MVC, the canary implementation uses a class that contains properties for the authenticated username, a cryptographically pseudo random value generated using the RNGCryptoServiceProvider class, the DateTime (UTC format) at which the object was created and an optional salt string. The object is then encrypted using the AES Encryption algorithm with a 256 bit key and decrypted with the same key when the request comes in.
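A stripped-down sketch of the comparison step: it uses a plain random token instead of the encrypted object described for ASP.NET MVC, and the names `issue_canary` and `request_is_trusted` are invented for this example.

```python
import hmac
import secrets


def issue_canary() -> str:
    """Random value stored in the session (or cookie) and in the hidden input."""
    return secrets.token_urlsafe(32)


def request_is_trusted(session_canary: str, form_canary: str) -> bool:
    """Trust the request only if the posted value matches the stored one."""
    return hmac.compare_digest(session_canary.encode(), form_canary.encode())


canary = issue_canary()
print(request_is_trusted(canary, canary))           # value round-tripped intact
print(request_is_trusted(canary, issue_canary()))   # value from elsewhere
```

Note that this only shows the form came from a page your server rendered for this session; it does not by itself protect any other hidden field from being changed.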