Search Parameters - language-agnostic

I have a detailed search form on the start page, where the user has many search options available.
What would be the best practice for keeping the search parameters for the user's session?
What are the pros and cons of putting them in:
URL
Session
Cookie
What should be used as best practice?

I'm going to plump for Cookie, on the basis that URL persistence will make all your URLs ugly and poor for link sharing; not only that, but some devices might balk at very long URLs (you say there are a lot of options). Session persistence requires cookies anyway, or falls back to query-string persistence to maintain the state (back to the link-sharing and ugly-URL problems).
With a cookie you can store a lot of data (well, within reason) and it doesn't affect your URLs.
However - if search parameter persistence is crucial to your application, then you should have a fallback that detects whether cookies are available, and resorts to URL persistence if not.
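A rough sketch of that cookie-first approach (the question is language-agnostic, so Python/Flask here is purely illustrative; the route, cookie name, and render_results helper are all made up):

```python
import json
from flask import Flask, request, make_response

app = Flask(__name__)

def render_results(params):
    # Stand-in for whatever actually builds the results page.
    return f"Searching with: {params}"

@app.route("/search")
def search():
    # Prefer parameters in the URL; otherwise fall back to the saved cookie.
    saved = request.cookies.get("search_params")
    params = dict(request.args) or (json.loads(saved) if saved else {})

    resp = make_response(render_results(params))
    # Re-save the current parameters. If the client refuses cookies, nothing
    # breaks: the query string already carries the state as the fallback.
    resp.set_cookie("search_params", json.dumps(params), max_age=60 * 60 * 24 * 30)
    return resp
```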

Best practice really depends on the scenario (including business case, programming language, etc.). However, here are some high-level pros and cons; a short code sketch follows the list.
URL Pros: easy to read/write
URL Cons: user can easily manipulate them causing unintended results, nasty URLs
Session pros: should be pretty easy to read/write programmatically (depending on the language), don't have to worry about parameters in a URL
Session cons: takes up more memory (may be negligible depending on the data)
Cookie pros: doesn't take up server memory
Cookie cons: must read/write to a file, user could delete cookies at any time (mid-session), cookies shared within the browser (1 cookie for any number of sessions)
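To make the three options concrete, here is a hedged Flask sketch (my own illustration, not from the answer above) reading the same parameter from the URL, the session, and a cookie. Note that Flask's default session is itself a signed cookie; a server-side session store would match the memory trade-off described above more closely.

```python
from flask import Flask, request, session

app = Flask(__name__)
app.secret_key = "change-me"  # required for Flask's signed session cookie

@app.route("/results")
def results():
    # 1. URL: explicit and shareable, but visible and easy to tamper with.
    from_url = request.args.get("q")

    # 2. Session: state tied to the user's session (cookie-backed by default).
    from_session = session.get("q")

    # 3. Cookie: stored in the browser, shared by every tab of that browser.
    from_cookie = request.cookies.get("q")

    q = from_url or from_session or from_cookie or ""
    session["q"] = q  # keep it around for subsequent pages
    return f"Searching for: {q}"
```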

I'd say a session is the best option. If you have several pages, you most likely will need to keep some global state; the alternative is the user resubmitting all the previous data when they move to the next page.
That said, you cannot just use a session that relies on a cookie to store the session identifier, at least not without some extra data that is in fact passed around between the several pages as a hidden field or a URL parameter.
The problem is that with just a cookie you won't have web conversations; you have a global cookie that's shared between all the tabs/windows in the browser. If the user opens a new tab and starts a new search, the session state will be overwritten and the search in the other tab will be lost.
So either you:
Pass the session id in the URL instead of using a cookie (beware of session fixation, though).
Include an extra GET parameter or hidden field that identifies the conversation.
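A minimal sketch of the second option, assuming Flask and a made-up cid parameter: the session cookie still identifies the browser, while the conversation id in the URL (or a hidden field) identifies the individual tab's search.

```python
import secrets
from flask import Flask, request, session, redirect, url_for

app = Flask(__name__)
app.secret_key = "change-me"

@app.route("/search", methods=["POST"])
def start_search():
    # One session cookie per browser, but one conversation id per tab/search.
    cid = secrets.token_urlsafe(8)
    session.setdefault("conversations", {})[cid] = dict(request.form)
    session.modified = True  # nested mutation isn't detected automatically
    return redirect(url_for("results", cid=cid))

@app.route("/results")
def results():
    # The conversation id travels with every page of this search, so two tabs
    # can run two different searches against the same session cookie.
    cid = request.args.get("cid", "")
    params = session.get("conversations", {}).get(cid, {})
    return f"Conversation {cid}: search parameters {params}"
```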


Best practice for email links that will set a DB flag?

Our business wants to email our customers a survey after they work with support. For internal reasons, we want to ask them the first question in the body of the email. We'd like to have a link for each answer. The link will go to a web service, which will store the answer, then present the rest of the survey.
So far so good.
The challenge I'm running into: making a server-side change based on an HTTP GET is bad practice, but you can't do a POST from a link. The options seem to be:
Use an HTTP GET instead, even though that's not correct and could cause problems (https://twitter.com/rombulow/status/990684453734203392)
Embed an HTML form in the email and style some buttons to look like links (likely not compatible with a number of email platforms)
Don't include the first question in the email (not possible for business reasons)
Use HTTP GET, but have some sort of mechanism which prevents a link from altering the server state more than once
Does anybody have any better recommendations? Googling hasn't turned up much about this specific situation.
One thing to keep in mind is that HTTP specifies semantics, not implementation. If you want to change the state of your server on receipt of a GET request, you can. See RFC 7231:
This definition of safe methods does not prevent an implementation from including behavior that is potentially harmful, that is not entirely read-only, or that causes side effects while invoking a safe method. What is important, however, is that the client did not request that additional behavior and cannot be held accountable for it. For example, most servers append request information to access log files at the completion of every response, regardless of the method, and that is considered safe even though the log storage might become full and crash the server. Likewise, a safe request initiated by selecting an advertisement on the Web will often have the side effect of charging an advertising account.
Domain-agnostic clients are going to assume that GET is safe, which means your survey results could get distorted by web spiders crawling the links, browsers pre-loading resources to reduce perceived latency, and so on.
Another possibility that works in some cases is to treat the path through the graph as the resource. Each answer link acts like a breadcrumb trail, encoding into itself the history of the client's answers. So a client that answered A and B to the first two questions is looking at /survey/questions/questionThree?AB, whereas the user that answered C to both is looking at /survey/questions/questionThree?CC. In other words, you aren't changing the state of the server; you are just guiding the client through a pre-generated survey graph.
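A hedged sketch of that idea, assuming Flask; the question table and route are invented, and the query string is used exactly as the breadcrumb trail described above (the last question would link to a separate submission endpoint rather than question N+1):

```python
from flask import Flask, request

app = Flask(__name__)

# Hypothetical pre-generated survey graph: question text plus answer codes.
QUESTIONS = {
    1: ("How satisfied were you with support?", ["A", "B", "C"]),
    2: ("Was your issue resolved?", ["A", "B", "C"]),
    3: ("Would you contact us again?", ["A", "B", "C"]),
}

@app.route("/survey/questions/<int:number>")
def question(number):
    # The raw query string is the breadcrumb trail of earlier answers, e.g. "AB".
    trail = request.query_string.decode()
    text, answers = QUESTIONS[number]
    # Each answer link merely extends the trail; nothing is written server-side
    # until a final submission step, so these GETs stay safe and repeatable.
    links = " ".join(
        f'<a href="/survey/questions/{number + 1}?{trail + a}">{a}</a>'
        for a in answers
    )
    return f"<p>{text}</p>{links}"
```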

What is the benefit of blocking cookie for clicked link? (SameSite=strict)

So, for Google Chrome and Opera, cookies have a SameSite attribute, which can have one of two values: strict or lax.
One of a few differences between those is that SameSite=strict will prevent a cookie from being sent when we click a link to another domain.
I know that SameSite is not a W3C recommendation yet, but what is the potential benefit of this behavior? I find it rather annoying, because the cookie is sent anyway when we refresh or click another link on the current domain. That leads to a rather weird user experience; for example: we are logged out, then we click some same-site link or refresh, and we are suddenly authenticated.
I'm aware that it's not designed for the greatest user experience, but rather for security. But what are we actually winning here in terms of security?
The benefits of using strict instead of lax are limited. I can see two:
Protection against CSRF attacks via GET requests. Such attacks are not normally possible since they rely on the server implementing GET endpoints with side effects (incorrectly and in violation of the semantics specified by RFC 7231). An example was given by you in the comments:
Imagine we have a very bad design and all our actions are performed via the GET method. The attacker places a link saying "Save puppies" which links to http://oursite.com/users/2981/delete. That's the only use case I can think of: when we have some action done via the GET method, while it shouldn't be.
Protection against timing attacks. There's a class of attacks - which were already discovered back in 2000, but which Mathias Bynens has recently explored and popularised - that involve a malicious webpage using JavaScript to initiate a request to a page on another domain and then measuring how long it takes, and inferring things about the user from the time taken. An example that Mathias invented is to initiate a request to a Facebook page with a restricted audience, such that it is only accessible by, say, people in Examplestan. Then the evil webpage times how long it takes for the response to come back. Facebook serves the error page when you try to access an inaccessible post faster than it serves the actual post, so if the evil webpage gets a quick response, it can infer that the user is not in Examplestan; if it gets a slow response, then the user is probably an Examplestani.
Since browsers don't stop executing JavaScript on a page when you do top-level navigation until they've received a response from the URL being navigated to, these timing attacks are unfortunately perfectly possible with top-level navigation; your evil page can just navigate the user away via location=whatever, then time how long it takes for the other page to load by repeatedly recording the current timestamp to localStorage in a loop. Then on a subsequent visit the evil page can check how long the page took to start unloading, and infer the response time of the page being timed.
The domain hosting the target page - like facebook.com, in the case of Mathias's example - could protect its users from this kind of attack by using samesite=strict cookies.
Obviously, these limited benefits come at a serious UX tradeoff, and so are typically not worth it compared to the already-pretty-good protections offered by samesite=lax!
There should be an answer here so I'm just going to repeat what's already been said in the comments.
You should always use samesite=lax, unless you're OK with giving your users a terrible user experience. lax is secure enough, as cookies will only be sent for safe methods (e.g. GET) when the navigation comes from a different domain. If you do dangerous things with GET requests, well, you have bigger problems.
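For illustration, this is roughly what that advice looks like server-side in Flask (assuming a recent Flask/Werkzeug that exposes the SameSite attribute; the cookie names are made up): lax for the ordinary session cookie, strict only for something you are willing to pay the UX cost for.

```python
from flask import Flask, make_response

app = Flask(__name__)
app.secret_key = "change-me"

# Send the session cookie on top-level navigations from other sites (lax), so
# "follow a link from elsewhere and you're still logged in" keeps working.
app.config.update(
    SESSION_COOKIE_SAMESITE="Lax",
    SESSION_COOKIE_SECURE=True,
    SESSION_COOKIE_HTTPONLY=True,
)

@app.route("/account")
def account():
    resp = make_response("account page")
    # A cookie that should never accompany any cross-site navigation at all
    # (accepting the logged-out-on-arrival behaviour) can opt into Strict.
    resp.set_cookie("admin_token", "opaque-value", samesite="Strict",
                    secure=True, httponly=True)
    return resp
```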

Why can't the browser search POST HTML request's body for the parameters and then bookmark the pages like for GET

POST HTML requests can't be bookmarked but GET ones can be. The reason given is that the parameters are appended to the URL in the case of GET, whereas they are not in POST. Why can't the browser search the POST request's body for the parameters and then bookmark the pages like it does for GET?
A POST is meant to update the state of something on the server.
In particular, the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be considered "safe". This allows user agents to represent other methods, such as POST, PUT and DELETE, in a special way, so that the user is made aware of the fact that a possibly unsafe action is being requested.
Naturally, it is not possible to ensure that the server does not generate side-effects as a result of performing a GET request; in fact, some dynamic resources consider that a feature. The important distinction here is that the user did not request the side-effects, so therefore cannot be held accountable for them.
Source.
Do you want to bookmark a delete method of a website, etc?
Theoretically speaking — they could.
They shouldn't though, as POST requests are supposed to "request that the origin server accept the entity enclosed in the request as a new subordinate of the resource".
Examples given by the spec are:
Annotation of existing resources;
Posting a message to a bulletin board, newsgroup, mailing list, or similar group of articles;
Providing a block of data, such as the result of submitting a form, to a data-handling process;
Extending a database through an append operation.
None of these are repeatable operations, so it doesn't make sense for the browser to store the request in a repeatable fashion.
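A small sketch of the "append operation" case (the message-board endpoint and in-memory store are invented for illustration): replaying the POST would create another record every time, which is exactly why a browser shouldn't quietly re-issue it from a bookmark.

```python
from flask import Flask, request, redirect, url_for

app = Flask(__name__)
messages = []  # stand-in for a real data store

@app.route("/board", methods=["GET"])
def board():
    # Safe retrieval: bookmarking and re-requesting this changes nothing.
    return "<br>".join(messages)

@app.route("/board", methods=["POST"])
def post_message():
    # Unsafe append: every submission adds a new subordinate resource.
    messages.append(request.form["text"])
    # Redirect after POST so refreshing the result page doesn't re-post.
    return redirect(url_for("board"))
```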

What is the need for the GET method in PHP, Java, or .NET, when POST has so many advantages over GET?

In all languages there are GET and POST methods for transferring data. POST is more secure than GET, and GET also has data transfer size limits. So why does every language have a GET method? What are the advantages of the GET method?
GET data is stored in the URL, so a page requested with GET can be bookmarked or linked. You just can't do that with POST. Almost every web page uses GET to specify the requested page, even stackoverflow.com.
Note that GET, POST (and PUT, DELETE, etc.) are not methods of the language you program in, but are HTTP protocol methods.
What do you mean by "transfer data"?
If, by this, you mean to collect data from the user in the browser (or other client application) and then send it to the server to update a database, or to process it in some other way that creates/updates a resource on the server, consider the POST or PUT method instead (depending on whether the action is idempotent or not).
If, however, you mean to collect data from the user and send it to the server to retrieve information, without updating or creating a resource on the server, then the GET method would be appropriate.
It's useful for direct linking. The user can put the thread number straight into the address bar on a forum, or a video ID on YouTube, instead of having to browse the entire site.
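For instance (a Flask sketch with made-up parameter names), a GET search keeps all of its state in the URL, so something like /search?q=cats&page=2 can be bookmarked, shared, or typed directly into the address bar:

```python
from flask import Flask, request

app = Flask(__name__)

@app.route("/search", methods=["GET"])
def search():
    # Everything needed to reproduce this page lives in the query string.
    q = request.args.get("q", "")
    page = request.args.get("page", 1, type=int)
    return f"Results for {q!r}, page {page}"
```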

What's the proper place for input data validation?

(Note: these two questions are similar, but more specific to ASP.Net)
Consider a typical web app with a rich client (it's Flex in my case), where you have a form, an underlying client logic that maps the form's input to a data model, some way of remoting these objects to a server logic, which usually puts it in a database.
Where should I, generally speaking, put the validation logic, i.e. ensuring the correct format of email addresses, numbers, etc.?
As early as possible. Rich client frameworks like Flex provide built-in validator logic that lets you validate right upon form submission, even before it reaches your data model. This is nice and responsive, but if you develop something extensible and you want the validation to protect from programming mistakes of later contributors, this doesn't catch it.
At the data model on the client side. Since this is the 'official' representation of your data and you have data types and getters / setters already there, this validation captures user errors and programming errors from people extending your system.
Upon receiving the data on the server. This adds protection from broken or malicious clients that may join the system later. Also, in a multi-client scenario, this gives you one authoritative source of validation.
Just before you store the data in the backend. This includes protection from all mistakes made anywhere in the chain (except the storing logic itself), but may require bubbling up the error all the way back.
I'm sort of leaning towards using both 2 and 4, as I'm building an application that has various points of potential extension by third parties. Using 2 in addition to 4 might seem superfluous, but I think it makes the client app behave more user friendly because it doesn't require a roundtrip to the server to see if the data is OK. What's your approach?
Without getting too specific, I think there should be validations for the following reasons:
Let the user know that the input is incorrect in some way.
Protect the system from attacks.
Letting the user know early that some data is incorrect would be friendly; for example, an e-mail entry field may have a red background until the @ sign and a domain name are entered. Only when the e-mail address follows the format in RFC 5321/5322 should the field turn green, perhaps with a little check mark to let the user know that the e-mail address looks good.
Also, letting the user know that the information provided is probably incorrect in some way would be helpful as well. For example, ask the user whether or not he or she really means to have the same recipient twice for the same e-mail message.
Then, next should be checks on the server side -- and never assume that the data that is coming through is well-formed. Perform checks to be sure that the data is sound, and beware of any attacks.
Assuming that the client will thwart SQL injection and blindly accepting data from connections to the server is a serious vulnerability. As mentioned, a malicious client whose sole purpose is to attack the system could easily compromise the system if the server were too trusting.
And finally, perform whatever checks to see if the data is correct, and the logic can deal with the data correctly. If there are any problems, notify the user of any problems.
I guess that being friendly and defensive is what it comes down to, from my perspective.
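As a hedged server-side illustration in Python/Flask (the endpoint, field name, and the deliberately loose e-mail pattern are my own; a full RFC 5321/5322 check is considerably more involved):

```python
import re
from flask import Flask, request, jsonify

app = Flask(__name__)

# Intentionally simple sanity check, not a full RFC 5321/5322 validator.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

@app.route("/recipients", methods=["POST"])
def add_recipient():
    # Never assume the client validated anything; re-check everything here.
    email = (request.form.get("email") or "").strip()
    if not EMAIL_RE.fullmatch(email):
        return jsonify(error="That doesn't look like an e-mail address."), 400
    # Store via parameterised queries / an ORM rather than string-building SQL,
    # so a malicious client can't smuggle in an injection.
    return jsonify(ok=True), 201
```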
There's only one rule: always use at least some kind of server-side validation (numbers 3/4 in your list).
Client validation (numbers 1/2) makes the user experience snappier and reduces load (because you don't post data to the server that won't pass validation anyway).
An important thing to point out is that if you go with client validation only, you're at great risk (just imagine your client validation relying on JavaScript and users disabling JavaScript in their browser).
There should definitely be validation on the server end. I am thinking that the validation should be done as early as possible on the server end, so there's less chance of malicious (or incorrect) data entering the system.
Input validation on the client end is helpful, since it makes the interface snappier, but there's no guarantee that data coming in to the server has been through the client-side validation, so there MUST be validation on the server end.
Because of security and convenience: server-side and as early as possible.
But what is also important is to have some global model/business-logic validation, so that when you have, for example, multiple forms with common data (such as the name of a product), the validation rule remains consistent unless the requirements say otherwise.
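One way to keep such a rule consistent is to define it exactly once and have every form handler call it; a minimal sketch (the rule and the product-name example are assumptions carried over from the paragraph above):

```python
# validators.py - the single source of truth for the shared business rule.
def validate_product_name(name: str) -> list[str]:
    """Return a list of error messages; an empty list means the name is valid."""
    errors = []
    name = name.strip()
    if not (3 <= len(name) <= 80):
        errors.append("Product name must be between 3 and 80 characters.")
    return errors

# Both the "create product" form and the "quick edit" form import and call
# validate_product_name(), so the rule can only ever change in one place.
```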