What's the proper place for input data validation? - language-agnostic

(Note: these two questions are similar, but more specific to ASP.Net)
Consider a typical web app with a rich client (it's Flex in my case), where you have a form, an underlying client logic that maps the form's input to a data model, some way of remoting these objects to a server logic, which usually puts it in a database.
Where should I - generally speaking - put the validation logic, i. e. ensuring correct format of email adresses, numbers etc.?
As early as possible. Rich client frameworks like Flex provide built-in validator logic that lets you validate right upon form submission, even before it reaches your data model. This is nice and responsive, but if you develop something extensible and you want the validation to protect from programming mistakes of later contributors, this doesn't catch it.
At the data model on the client side. Since this is the 'official' representation of your data and you have data types and getters / setters already there, this validation captures user errors and programming errors from people extending your system.
Upon receiving the data on the server. This adds protection from broken or malicious clients that may join the system later. Also in a multi-client scenario, this gives you one authorative source of validation.
Just before you store the data in the backend. This includes protection from all mistakes made anywhere in the chain (except the storing logic itself), but may require bubbling up the error all the way back.
I'm sort of leaning towards using both 2 and 4, as I'm building an application that has various points of potential extension by third parties. Using 2 in addition to 4 might seem superfluous, but I think it makes the client app behave more user friendly because it doesn't require a roundtrip to the server to see if the data is OK. What's your approach?

Without getting too specific, I think there should validations for the following reasons:
Let the user know that the input is incorrect in some way.
Protect the system from attacks.
Letting the user know that some data is incorrect early would be friendly -- for example, an e-mail entry field may have a red background until the # sign and a domain name is entered. Only when an e-mail address follows the format in RFC 5321/5322, the e-mail field should turn green, and perhaps put a little nice check mark to let the user know that the e-mail address looks good.
Also, letting the user know that the information provided is probably incorrect in some way would be helpful as well. For example, ask the user whether or not he or she really means to have the same recipient twice for the same e-mail message.
Then, next should be checks on the server side -- and never assume that the data that is coming through is well-formed. Perform checks to be sure that the data is sound, and beware of any attacks.
Assuming that the client will thwart SQL injections, and blindly accepting data from connections to the server can be a serious vulnerability. As mentioned, a malicious client whose sole purpose is to attack the system could easily compromise the system if the server was too trusting.
And finally, perform whatever checks to see if the data is correct, and the logic can deal with the data correctly. If there are any problems, notify the user of any problems.
I guess that being friendly and defensive is what it comes down to, from my perspective.

There's only a rule which is using at least some kind of server validation always (number 3/4 in your list).
Client validation (Number 2/1) makes the user experience snappier and reduces load (because you don't post to the server stuff that doesn't pass client validation).
An important thing to point out is that if you go with client validation only you're at great risk (just imagine if your client validation relies on javascript and users disable javascript on their browser).

There shoudl definitely be validation on the server end. I am thinking taht the validation should be done as early as possible on the server end, so there's less chance of malicious (or incorrect) data entering the system.
Input validation on the client end is helpful, since it makes the interface snappier, but there's no guarantee that data coming in to the server has been through the client-side validation, so there MUST be validation on the server end.

Because of security an convenience: server side and as early as possible
But what is also important is to have some global model/business logic validation so when you have for example multiple forms with common data (for example name of the product) the validation rule should remain consistent unless the requirements says otherwise.

Related

Best practice for email links that will set a DB flag?

Our business wants to email our customers a survey after they work with support. For internal reasons, we want to ask them the first question in the body of the email. We'd like to have a link for each answer. The link will go to a web service, which will store the answer, then present the rest of the survey.
So far so good.
The challenge I'm running into: making a server-side changed based on an HTTP GET is bad practice, but you can't do a POST from a link. Options seem to be:
Use an HTTP GET instead, even though that's not correct and could cause problems (https://twitter.com/rombulow/status/990684453734203392)
Embed an HTML form in the email and style some buttons to look like links (likely not compatible with a number of email platforms)
Don't include the first question in the email (not possible for business reasons)
Use HTTP GET, but have some sort of mechanism which prevents a link from altering the server state more than once
Does anybody have any better recommendations? Googling hasn't turned up much about this specific situation.
One thing to keep in mind is that HTTP is specifying semantics, not implementation. If you want to change the state of your server on receipt of a GET request, you can. See RFC 7231
This definition of safe methods does not prevent an implementation from including behavior that is potentially harmful, that is not entirely read-only, or that causes side effects while invoking a safe method. What is important, however, is that the client did not request that additional behavior and cannot be held accountable for it. For example, most servers append request information to access log files at the completion of every response, regardless of the method, and that is considered safe even though the log storage might become full and crash the server. Likewise, a safe request initiated by selecting an advertisement on the Web will often have the side effect of charging an advertising account.
Domain agnostic clients are going to assume that GET is safe, which means your survey results could get distorted by web spiders crawling the links, browsers pre-loading resource to reduce the perceived latency, and so on.
Another possibility that works in some cases is to treat the path through the graph as the resource. Each answer link acts like a breadcrumb trail, encoding into itself the history of the clients answers. So a client that answered A and B to the first two questions is looking at /survey/questions/questionThree?AB where the user that answered C to both is looking at /survey/questions/questionThree?CC. In other words, you aren't changing the state of the server, you are just guiding the client through a pre-generated survey graph.

HTML- Required "bug" [duplicate]

Using a simple tool like FireBug, anyone can change javascript parameters on the client side. If anyone take time and study your application for a while, they can learn how to change JS parameters resulting in hacking your site.
For example, a simple user can delete entities which they see but are not allowed to change. I know a good developer must check everything on server side, but this means more overhead, you must do checks with data from a DB first, in order to validate the request. This takes a lot of time, for every action someone must validate it, and can only do this by fetching the needed data from DB.
What would you do to minimize hacking in that case?
A more simple way to validate is to add another parameter for every javascript function, this parameter must be a signature between previous parameters and a secret key.
How good sounds the solution above to you?
Our team use teamworkpm.net to organize our work. I just discovered that I can edit someone else tasks by changing a javascript function (which initially edit my own tasks).
when every function call to server, in server side before you do the action , you need to check if this user is allowed to do this action.
It is necessary to build server-side permissions mechanism to prevent unwanted actions, you may want to define groups of users, not individual user level, it makes it easier.
Anything on the client side could be spoofed. If you use some type of secret key + parameter signature, your signature algorithm must be sufficiently random/secure that it cannot be reverse engineered.
The overhead created with adding client side complexity is better spent crafting proper server side validations.
What would you do to minimize hacking in that case ?
You can't work around using validation methods on the server side.
A more simple way to validate is to add another parameter for every javascript function, this parameter must be a signature between previous parameters and a secret key.
How good sounds the solution above to you ?
And how do you use the secret key without the client seeing it? As you self mentioned before, the user easily can manipulate your javascript, and also he can read everything in javascript, the secret key, too!
You can't hide anything in JavaScript, the only thing you can do is to obscure things in JavaScript, and hope nobody tries to find out what you try to hide.
This is why you must validate everything on the server. You can never guarantee that the user won't mess about with things on the client.
Everything, even your javascript source code is visible to the client and can be changed by them, theres no way around this.
There's really no way to do this completely client-side. If the person has a valid auth cookie, they can craft any sort of request they want regardless of the code on the page and send it to your server. You can do things with other, encrypted cookies that must sent back with the request and also must match the inputs on the page, but you still need to check this server-side. Server-side security is essential in protecting your application from unauthorized access and you must ensure, server-side, that every action being performed is one that the user is authorized to perform.
You certainly cannot hide anything client side, so there is little point in trying to do so.
If what you are saying is that you are sending something like a user ID and you want to ensure that the returned value has not been illicitly changed then the simplest way of doing so it probably to generate and send a UUID alongside it, and check on return that the value of the uuid matched that stored on the server for the userID before doing any further processing. The space for uuid's is so large that you can discount any false hits ever occurring.
As to actual server side processing vulnerabilities:- you should simply always build in your security/permissions as close to the database as you can, and defiantly not in the client. There's nothing different in the scenario you outline from any normal client-server design.
Peter from Teamworkpm.net here - I'm one of the main developers and was concerned to come across this report about a security problem. I checked into this and I am happy that is not possible to delete a task that you shouldn't have access to.
You get a message saying "You do not have permission to delete this task".
I think it is just the confusion between being a Project Administrator and being an overall Administrator that is the problem here :- You may not be a member of a project but as an overall administrator, you still have permission to delete any task within your Teamwork site. This is by design.
We take security very seriously and it's all implemented server side because as Jens F says, we can't reply on client side security.
If you do come across any issues in TeamworkPM that you would like to discuss, we'd encourage any of you to just hit the feedback link and you'll typically get an answer within a few hours.

prevent SQL injection using html only

Am trying to validate the inputs for a comment box in order to accept only text, and to alert a message if the user entered number (1-0) or symbol (# # $ % ^ & * + _ = ). to prevent SQL injection
is there is a way to do that in html
You can never trust what comes from the client. You must always have a server side check to block something such as an SQL injection.
You can of course add the client side validation you mentioned but it's only to help users not enter junk data. Still can't trust it once it's sent to the server.
On using Javascript/HTML to improve security
is there is a way to do that in html
No. As others have pointed out, you cannot increase security by doing anything in your HTML or Javascript.
The reason is that the communication between your browser and your server is totally transparent to an attacker. Any developer is probably familiar with the "developer tools" in Firefox, Chrome etc. . Even those tools, which are right there in most modern browsers, are enough to create arbitrary HTML requests (even over HTTPS).
So your server must never rely on the validity of any part of the request. Not the URL, not the GET/POST parameters, not the cookies etc.; you always have to verify it yourself, serverside.
On SQL injection
SQL injection is best avoided by making sure never to have code like this:
sql = "select xyz from abc where aaa='" + search_argument + "'" # UNSAFE
result = db.execute_statement(sql)
That is, you never want to just join strings together to for a SQL statement.
Instead, what you want to do is use bind variables, similar to this pseudo code:
request = db.prepare_statement("select xyz from abc where aaa=?")
result = request.execute_statement_with_bind(sql, search_argument)
This way, user input is never going to be parsed as SQL itself, rendering SQL injection impossible.
Of course, it is still wise to check the arguments on the client-side to improve user experience (avoid the latency of a server roundtrip); and maybe also on server-side (to avoid cryptic error messages). But these checks should not be confused with security.
Short answer: no, you can't do that in HTML. Even a form with a single check box and a submit button can be abused.
Longer answer...
While I strongly disagree that there's nothing you can do in HTML and JavaScript to enhance security, a full discussion of that goes way beyond the bounds of a post here.
But ultimately you cannot assume that any data coming from a computer system you do not control is in any way safe (indeed, in a lot of applications you should not assume that data from a machine you do control is safe).
Your primary defence against any attack is to convert the data to a known safe format for both the sending and receiving components before passing it between components of your system. Here, we are specifically talking about passing data from your server-side application logic to the database. Neither HTML nor JavaScript are involved in this exchange.
Moving out towards the client, you have a choice to make. You can validate and accept/reject content for further processing based on patterns in the data, or you can process all the data and put your trust in the lower layers handling the content correctly. Commonly, people take the first option, but this gives rise to new security problem; it becomes easy to map out the defences and find any gaps. In an ideal world that would not matter too much - the deeper defences will handle the problem, however in the real world, developers are limited by time and ability. If it comes down to a choice of where you spend your skills/time budget, then the answer should always be on making the output safer over validating input.
The question seems to be more straight forward, but i keep my answer the same as before:
Never validate information on the clientside. It makes no sense, because you need to validate the same information with the same (or even better) methods on the server! Validating on Clientside only generates unnecessary overhead as the information from a client can not be trusted. Its a waste of energy.
If you have problems with users sending many different Symbols but no real messages, you should shut down your server immideately! Because this could mean that your users try to find a way to hack into the server to gain control!
Some strange looking special character combinations could allow this if the server doesn't escape user input properly!
In short:
HTML is made for content display, CSS for design of the content, Javascript for interactivity and other Languages like Perl, PHP or Python are made for processing, delivering and validation of information. These last called Languages normally run on a server. Even if you use them on a server you need to be very carefull, as there are possible ways to render them useless too. (For instance if you use global variables the wrong way or you dont escape user input properly.)
I hope this helps to get the right direction.

Failures in eventual consistent system and user experience [duplicate]

When using distributed and scalable architecture, eventual consistency is often a requirement.
Graphically, how to deal with this eventual consistency?
Users are used to click save, and see the result instantaneously... with eventual consistency it's not possible.
How to deal with the GUI for such scenarios?
Please note the question applies both for desktop applications and web applications.
PS: I'm working with the Microsoft platform, but I imagine the question applies to any technology...
A Task Based UI fits this model great. You create and execute tasks from the UI. You can also have something like a task status monitor to show the user when a task has executed.
Another option is to use some kind of pooling from the client. You send the command, and pool from the client until the command completed and the new data is available. You will have a delay in some cases from when the user presses save to when he will see the new record, but in most cases it should be almost synchronous.
Another (good?) option is to assume/design commands that don't fail. This is not trivial but you can have a cache on the client and add the data from the command to that cache and display it to the user even before the command has been executed. If the command fails for some unexpected situation, well then just design a good "we are sorry" message for misleading the user for a few seconds.
You can also combine the methods above.
Usually eventual consistency is more of a business/domain problem, and you should have your domain experts handle it.
I think that other answers mix together CQRS in general and eventual consistency in particular. Task-based UI is very suitable for CQRS but it does not resolve the issue with eventually consistent read model.
First, I would like to challenge your statement:
Users are used to click save, and see the result instantaneously... with eventual consistency it's not possible.
What do you by this? Why is it not possible to see the result immediately? I think the issue here is your definition of result.
The result of any action is that that action has been performed. There are numerous of ways to show this! It depends on what kind of action do you want to complete. Examples:
Send an email: if user has entered a correct email address, it is almost guaranteed that the action will complete successfully. To prevent unexpected failures one might use durable queues since this kind of actions do not need to be done synchronously. So you just say "email sent". Typically you see this kind of response when you ask to reset your password.
Update some information in a user profile: after you have validated the new data on the client, most probably the command will succeed too since the only thing that could happen is the database error (if you use database). Again, even this can be mitigated by using durable queues. In this case you just show the updated field in the same form. The good practice for SPA is to have a comprehensive data store on the client side, like Redux does. In this case you can safely update the server by sending a command and also updating the client-side store, which will result in UI to shows the latest data. Disclaimer: some answers refer to this technique as "tricking the user", but I disagree with this definition.
If you have commands that are prone to error, you can use techniques that are already described in other answers like Websockets or Server-side events to communicate errors back. This requires quite a lot of additional work. You can also send a command and wait for reply or execute commands synchronously. Some would say "this is not CQRS" but this would be just another dogma to be challenged. Ensuring the command has completed the execution in combination with the previous point (client-side data store) will be a good solution.
I am not sure if there is any 100% bullet proof technique that allows you to always show non-stale data from the read model. I think it goes against the principles of CQRS. Even with real-time events you will only get events that indicate that you write model has been updated. Still, your projections could have failed and reacting on this is a whole other story.
However, I would not concentrate that much on this issue. The fact is that well-tested projections and almost-guaranteed commands will work very well. For error handling in 90% of situations it is enough to have some manual or half-manual process to recover from those errors. For the last 10% you can combine generic "error" messages pushed from the server saying "sorry, your action XXX has failed to execute" and the top priority actions could have some creative process behind them but in reality those situations would be very very rare.
There are 2 ways:
To trick a user (just to show that things has happened then they
really hasn't happened yet)
Show that system is processing request
and use polling in background (not good) or just timer with value of
your SLA.
I prefer the 1st option.
As someone has already mentioned, task based UI's fit well for this, and what I would do is employ a technique that 'buys you time' for the command to propagate.
For example, imagine we are on a list screen, where the user can perform various actions, one of which being to add a new item to the list. After choosing to add an item you could display a "What would you like to do next?" which could have 'Add another item', 'Do this task', 'Do some other task', 'Go back to list'.
By the time they have clicked on an option, the data would have hopefully been refreshed.
Also, if you're using a task based UI, you can analyse the patterns of task execution and use these "what would you like to do next" screens to streamline the UI. Similar to amazon's "other people also bought these items".
As previously stated, it is fine to tell the user that the request (command) has been acknowledged (successfully issued). In case of some failure, the system should communicate this to the requester, by means of:
email;
SMS;
custom inbox (e.g. like the SO inbox);
whatever.
E.g., mail client / service:
I am sending a mail to a wrong address;
the mail service says: "email sent successfully :)";
after few minutes, I receive a mail from the service: "email could not be delivered".
I believe a great way to inform the user about a recent failure is to present him an error panel while he's navigating through the application. A user gesture might be required in order to dismiss that alert etc.
For example:
I wouldn't go with tricking the user or blocking him from committing some other actions. I would rather go for streaming data toward UI after they are being acknowledged by a read side. Let's consider these two cases:
Users saves data and expects result. Connection is established toward server. After they are being acknowledged by a read side, they are streamed toward UI and UI is being updated.
User saves data and refreshes web page. Upon reload, data are being fetched from data store and connection for streaming is established. If read side didn't update the data store in the meantime, there's still an opened stream and UI should be updated after data reaches the read side.
Why streaming from read side and not directly from write side? Simply, that would be a confirmation that read side has been reached.
From technical aspect, Server-Sent Events could be used.
Disadvantage:
Results will still not be reflected immediately by a read side. But at least, in most cases, user will be able to continue with his work without being blocked by a UI.
There are several ways to handle eventual consistency. All of them are really to occupy the time from the User's action until the backend refresh.
User Reads A given user can only read from the same database node that they write to. Other users read from the replicated nodes. PROS: UI is quick enough, and application stays in sync. CONS: Your service architecture has to track and route Users to specific database nodes.
Disable the UI until the action has completed, and refresh it. Java Server Faces has a classic example of this. One could create a modal with a loading spinner to cover the UI until the refresh was completed. PROS: UI stays in sync with application state. CONS: Most every action creates a blocked UI. Users get very frustrated by the restricted UI, and will complain of application slowness.
Confirmation Immediately thank the user for their submission. Then let them know later (email, SMS, in-app notification) whether or not the action was completed. PROS: It's fast up front. CONS: UI lags behind system until refresh. Even with a notice, the User may get confused that they don't see the updates. It also requires integration of various communication channels. Users won't see their changes right away. If the action fails, they may not know until it's too late.
Fake it Optimistically assume that the action will complete. Show the User the resulting UI (upvote, comment, credit card confirmation, etc) and allow them to continue as if it succeeded. If there were failures, immediately show them as contextual errors: alerts next to the undone upvotes, in-app alert on the post with the failed comment, email for the declined credit card. PROS: UI feels much faster. CONS: UI is temporarily out of sync with application state, and you must resolve that. One case: you might fake creation of content with temp IDs. But after content is created, then the temp IDs will be wrong until the refresh. Second case, you might need to store all state changes on the UI after the action until the refresh. Then you need some Resolver to apply all the local state changes since the action was issued. This is resolution is non-trivial.
Web Sockets Subscribe the UI to an event stream so that when the action is completed on the backend, it is pushed to the front end. Is it one-way or two-way streaming? PROS: UI feels fast, and it's in sync with the application state. CONS: Consistent browser support, need a backend source of streaming events, and socket server scalability.

Design: Exposing user interface behaviour to external systems

I'm working on a web application (Java EE backend) which contains a fairly complex input modal. This input modal allows the user to capture data, but it has a bunch of (JavaScript) restrictions, such as mandatory fields, fields only being available if a specific value is entered, etc.
I have to expose this functionality to external systems and allow them to submit this data to my server. These external systems can be both web or client based (but I can assume that the clients will have internet access). My first thought is to provide some kind of definition of the fields and stuff like mandatory to these systems through services, and have them render the input modal however they want. This has been met with resistance though, because the types of fields and restrictions will likely change quite a bit during the next few months of development. These external systems have different deployment timelines, and for this to work we'll have to firstly duplicate all the logic handling these restrictions across all systems, and secondly synchronize our deployments.
An alternative which has been proposed is to have the external systems call my modal through standard HTTP and render it either in an iframe or in an embedded rendered. This solves all of the previous complaints, but it leaves me feeling a little uneasy.
Are there any alternatives we are not thinking of? Maybe some kind of UI schema with existing render libraries for the different platforms? What are your thoughts on the second proposal, any major concerns or is this the "best" solution?
Edit: To clarify, I'll of course still perform backend validation regardless of the frontend decision, as I can't just trust the incoming data.
The constraints that you mention (mandatory fields etc.) really have nothing to do with the user interface. You are also right that it is not a good idea to have your backend render web content.
Your first proposal sounds like a good idea, here's how I would solve the issues you mentioned:
Do all the validation on the backend and send a model object to the client, representing the current state of the UI (field name, type, enabled/disabled, error message etc.).
Keep the client as dumb as possible. It should only be responsible for rendering the model on a window / webpage. Whenever a field is changed and it requires validation, submit the model to the backend for validation and get back a new model to be displayed. (You could optimize this by only returning the fields that changed.)
Doing it this way will keep your validation logic in one place (the backend) and the clients rarely need to be modified.
I have been faced with same issues in several previous projects. Based on this experience I can honestly say that server-side validation is the thing you will likely have to implement to avoid rubbish being committed from client side regardless if it comes from GUI or other third party system via API. You can choose one of available validation frameworks, I used Apache Commons Validator and think it is well, or you can implement your own one. On the other hand client side pre-validation, auto-completion and data look up are the solutions you should have to make human users happy. Do not consider about code duplication, just make your system right way from the business point of view.