Protecting client-side database IDs in a web app

I've heard that exposing database IDs (in URLs, for example) is a security risk, but I'm having trouble understanding why.
Any opinions or links on why it's a risk, or why it isn't?
EDIT: of course the access is scoped, e.g. if you can't see resource foo?id=123 you'll get an error page. Otherwise the URL itself should be secret.
EDIT: if the URL is secret, it will probably contain a generated token that has a limited lifetime, e.g. valid for 1 hour and can only be used once.
EDIT (months later): my current preferred practice for this is to use UUIDs as IDs and expose them. If I'm using sequential numbers as IDs (usually for performance on some DBs), I like to generate a UUID token for each entry as an alternate key, and expose that instead.

There are risks associated with exposing database identifiers. On the other hand, it would be extremely burdensome to design a web application without exposing them at all. Thus, it's important to understand the risks and take care to address them.
The first danger is what OWASP calls "insecure direct object references." If someone discovers the id of an entity, and your application lacks sufficient authorization controls to prevent it, they can do things that you didn't intend.
Here are some good rules to follow:
Use role-based security to control access to an operation. How this is done depends on the platform and framework you've chosen, but many support a declarative security model that will automatically redirect browsers to an authentication step when an action requires some authority.
Use programmatic security to control access to an object. This is harder to do at a framework level. More often, it is something you have to write into your code and is therefore more error prone. This check goes beyond role-based checking by ensuring not only that the user has authority for the operation, but also has necessary rights on the specific object being modified. In a role-based system, it's easy to check that only managers can give raises, but beyond that, you need to make sure that the employee belongs to the particular manager's department.
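A minimal TypeScript sketch of that object-level check (the Employee and Manager shapes are assumptions for illustration, not a particular framework's API):
interface Employee { id: string; departmentId: string; salary: number; }
interface Manager { id: string; departmentId: string; roles: string[]; }
function giveRaise(manager: Manager, employee: Employee, amount: number): void {
  // Role-based check: only managers may give raises at all.
  if (!manager.roles.includes("manager")) {
    throw new Error("Forbidden: role 'manager' required");
  }
  // Object-level check: the employee must belong to this manager's department.
  if (employee.departmentId !== manager.departmentId) {
    throw new Error("Forbidden: employee is outside your department");
  }
  employee.salary += amount;
}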
There are schemes to hide the real identifier from an end user (e.g., map between the real identifier and a temporary, user-specific identifier on the server), but I would argue that this is a form of security by obscurity. I want to focus on keeping real cryptographic secrets, not trying to conceal application data. In a web context, it also runs counter to widely used REST design, where identifiers commonly show up in URLs to address a resource, which is subject to access control.
Another challenge is prediction or discovery of the identifiers. The easiest way for an attacker to discover an unauthorized object is to guess it from a numbering sequence. The following guidelines can help mitigate that:
Expose only unpredictable identifiers. For the sake of performance, you might use sequence numbers in foreign key relationships inside the database, but any entity you want to reference from the web application should also have an unpredictable surrogate identifier, and that is the only one that should ever be exposed to the client. Random UUIDs are a practical choice for these surrogate keys, even though standard UUID generation isn't guaranteed to be cryptographically secure.
One place where cryptographically unpredictable identifiers are a necessity, however, is in session IDs or other authentication tokens, where the ID itself authenticates a request. These should be generated by a cryptographic RNG.
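To make the distinction concrete, here is a small Node.js/TypeScript sketch (the function names are illustrative): a random UUID serves as the exposed surrogate key, while session tokens are drawn directly from the cryptographic RNG.
import { randomUUID, randomBytes } from "node:crypto";
// Surrogate key exposed in URLs; the sequential primary key stays internal.
function newPublicId(): string {
  return randomUUID(); // e.g. "7f9c24e8-3b12-4fde-9c1a-0b8f6f2d9a11"
}
// Session/authentication token: unguessable, generated by a cryptographic RNG.
function newSessionToken(): string {
  return randomBytes(32).toString("base64url"); // 256 bits of entropy
}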

While not a data security risk per se, this is absolutely a business intelligence risk, since it exposes both data size and velocity. I've seen businesses get harmed by this and have written about the anti-pattern in depth. Unless you're just building an experiment and not a business, I'd highly suggest keeping your private IDs out of the public eye. https://medium.com/lightrail/prevent-business-intelligence-leaks-by-using-uuids-instead-of-database-ids-on-urls-and-in-apis-17f15669fd2e

It depends on what the IDs stand for.
Consider a site that, for competitive reasons, doesn't want to make public how many members it has, but reveals it anyway through sequential IDs in the URL: http://some.domain.name/user?id=3933
On the other hand, if it used the login name of the user instead, http://some.domain.name/user?id=some, it wouldn't have disclosed anything the user didn't already know.

The general thought goes along these lines: "Disclose as little information about the inner workings of your app to anyone."
Exposing the database ID counts as disclosing some information.
The reason is that attackers can use any information about your app's inner workings to attack you, and a user can change the URL to get at data they aren't supposed to see.

We use GUIDs for database ids. Leaking them is a lot less dangerous.

If you are using integer IDs in your DB, you may make it easy for users to see data they shouldn't by changing query-string variables.
E.g. a user could easily change the id parameter in this query string and see/modify data they shouldn't: http://someurl?id=1

When you send database IDs to your client, you are forced to check authorization for every ID that comes back. If you keep the IDs in your web session instead, you can choose if and when you want/need to do that check, meaning potentially less processing.
You are constantly trying to delegate things to your access control ;) This may be the case in your application, but I have never seen such a consistent back-end system in my entire career. Most of them have security models that were designed for non-web usage, some have had additional roles added after the fact, and some of those have been bolted on outside of the core security model (because the role was added in a different operational context, say before the web).
So we use synthetic, session-local IDs, because that hides as much as we can get away with.
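A rough sketch of that session-local mapping (assumed names, not a particular framework's API): each session keeps its own map between random tokens handed to the browser and the real database keys.
import { randomUUID } from "node:crypto";
class SessionIdMap {
  private toReal = new Map<string, number>();   // public token -> real DB id
  private toPublic = new Map<number, string>(); // real DB id -> public token
  publicIdFor(realId: number): string {
    let token = this.toPublic.get(realId);
    if (!token) {
      token = randomUUID();
      this.toPublic.set(realId, token);
      this.toReal.set(token, realId);
    }
    return token;
  }
  realIdFor(token: string): number {
    const realId = this.toReal.get(token);
    if (realId === undefined) throw new Error("Unknown or expired identifier");
    return realId;
  }
}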
There is also the issue of non-integer key fields, which may be the case for enumerated values and the like. You can try to sanitize that data, but chances are you'll end up like Little Bobby Tables.

My suggestion is to implement two stages of security.
"Security through obscurity": You can have integer Id as primary key and Gid as GUID as surrogate key in tables. Whereas integer Id column is used for relations and other database back-end and internal purposes (and even for select list keys in web apps to avoid unnecessary mapping between Gid and Id while loading and saving) and Gid is used for REST Urls i.e for GET,POST, PUT, DELETE etc. So that one cannot guess the other record id. This gives first level of protection against guess-based attacks. (i.e. number series guessing)
Access based control at Server side : This is most important, and you have various way to validate the request based on roles and rights defined in application. Its up to you to decide.
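A minimal sketch of that dual-key scheme, assuming a table with an internal integer id and an exposed gid UUID column (the names are illustrative, and db.query stands in for whatever client you use):
import { randomUUID } from "node:crypto";
interface OrderRow { id: number; gid: string; total: number; }
interface Db { query(sql: string, params: unknown[]): Promise<OrderRow[]>; }
// On insert, the application assigns the public GUID alongside the DB-assigned integer id.
function newOrderValues(total: number): { gid: string; total: number } {
  return { gid: randomUUID(), total };
}
// REST handlers look records up by the GUID only; the integer id never leaves the server.
async function findOrderByGid(db: Db, gid: string): Promise<OrderRow | undefined> {
  const rows = await db.query("SELECT id, gid, total FROM orders WHERE gid = ?", [gid]);
  return rows[0];
}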

From the perspective of code design, a database ID should be considered a private implementation detail of the persistence technology to keep track of a row. If possible, you should be designing your application with absolutely no reference to this ID in any way. Instead, you should be thinking about how entities are identified in general. Is a person identified with their social security number? Is a person identified with their email? If so, your account model should only ever have a reference to those attributes. If there is no real way to identify a user with such a field, then you should be generating a UUID before hitting the DB.
Doing so has a lot of advantages, as it allows you to divorce your domain models from the persistence technology. That means you can substitute database technologies without worrying about primary key compatibility. Leaking your primary key into your data model is not necessarily a security issue if you write the appropriate authorization code, but it is indicative of less than optimal code design.
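A small sketch of that idea (the Account shape is an assumption): the domain model assigns its own UUID before anything touches the database, so the DB's auto-increment key stays a private persistence detail.
import { randomUUID } from "node:crypto";
class Account {
  readonly id: string; // identity owned by the domain, not by the database
  constructor(public email: string, id?: string) {
    this.id = id ?? randomUUID();
  }
}
const account = new Account("alice@example.com");
// The persistence layer can map account.id to whatever column it likes.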

Dealing with many to many relationships in DDD

I've been reading about this, and I think it is just a design decision, but unfortunately I couldn't figure out which is the best approach.
I have many entities, among them Application, User, Role and Permissions. There are some rules, as follows:
An Application must have at least one User.
A User must be in at least one Application.
Each User has different Roles, a password, and other attributes in each Application it belongs to.
Each Role has different Permissions, and so on.
My problem is how I should build each Aggregate. My approaches have been the following:
My first approach was to create an Aggregate each for Application, User, Role, etc. But should I create a separate aggregate for the many-to-many relationship between Application and User because of the additional attributes it will have? Or should I convert the many-to-many relationship into a one-to-many relationship, and if so, how could I achieve that?
The second one was to create just one Aggregate for Application and add User as a child entity, but I'm not sure whether that is appropriate for the given context. If so, should I have the Role and Permission entities as child entities in my Application Aggregate too?
Please let me know your thoughts about this, and if there is another point of view that could help me, that would be great. Thank you in advance.
Honestly these rules seem rather artificial. If you absolutely need strong consistency on all of them then you need a giant ApplicationAccess aggregate, which will certainly be very busy, because any access-rights change for a given application would conflict with any other change for the same application.
That giant AR is not even enough on its own to cover the "A User must be in at least one Application" rule, which means you'd probably have to update the User AR along with the ApplicationAccess AR on every role membership addition/removal.
e.g.
// Assume the loads, changes, and saves all happen in one transaction.
function removeUserFromRole(userId, applicationId, roleId) {
  const applicationAccess = applicationAccessRepo.existingOfId(applicationId);
  const user = userRepo.existingOfId(userId);
  applicationAccess.removeUserRole(user, roleId);
  user.trackRoleRemoved(); // decrements the membership count and throws if it would reach 0
                           // (trackRoleAdded would increment it)
  applicationAccessRepo.save(applicationAccess);
  userRepo.save(user);
}
As you can guess, this design doesn't seem very scalable. It might work for a small number of users without too many concurrent access modifications, but it's probably the wrong design otherwise. If you go for it, you would probably want to use pessimistic locking rather than optimistic locking with retries.
If you want a more effective model, I think you will have no choice but to explore the possibility of loosening up the rules and allowing them to be eventually consistent rather than strongly consistent.
For instance, why does it matter that much that a User has no access? Could you just run exception reports to list these users? Could you just flag the Users so that their access needs to be updated manually?
The same applies to all the other rules, and there are endless possibilities for dealing with eventual consistency. You could have automated compensating actions that revert changes found to have violated some rule, or just flag them and have manual resolutions as described above, etc.
Anyway, a good way to question the rules is to analyze the "cost" of a rule being violated through concurrent modifications, and how often that might happen under the expected concurrent usage, should you put things in distinct ARs with possibly stale rule checks.

Segregating sandbox environment

For a site that is using a Sandbox mode, such as a Payment site, would a separate database be used, or the same one?
I am examining two schemas for the production and sandbox environment. Here are the two options.
OPTION 1:
Clone database, route requests to the correct database based upon sandbox mode.
OPTION 2:
Single database, 'main tables' have an is_sandbox boolean.
What would be the pros and cons of each method?
In most situations, you'd want to keep two separate databases. There's no good reason to have the two intermingled in the same database, and a lot of very good reasons to keep them separated:
Keeping track of which entities are in which "realm" (production vs. sandbox) is extra work for your code, and you'll likely have to include it in a lot of places.
You'll need that logic in your database schema as well. UNIQUE indexes all have to include the realm, for instance.
If you forget any of that code, you've got yourself a potential security vulnerability. A malicious user could cause data from one realm to influence the other. Depending on what your application is, this could range anywhere from annoying to terrifying. (If it's a payment application, for instance, the potential consequences are incredibly dire: "pretend" money from the sandbox could be converted into real money!)
Even if your code is all perfect, there'll still be some information unavoidably leaked between the realms. For instance, if your application uses any sequential identifiers (AUTO_INCREMENT in MySQL, for instance), gaps in values seen in the sandbox will correspond with values used in production. Whether this matters is debatable, though.
Using two separate databases neatly solves all these problems. It also means you can easily clean out the sandbox when needed.
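A minimal sketch of option 1 (two databases, routed by realm); the Db interface and makePool helper here are placeholders for whatever driver you use, not a specific API.
type Realm = "production" | "sandbox";
interface Db { query(sql: string, params?: unknown[]): Promise<unknown[]>; }
declare function makePool(connectionUrl: string): Db; // stand-in for your driver's pool factory
const pools: Record<Realm, Db> = {
  production: makePool(process.env.PROD_DB_URL ?? ""),
  sandbox: makePool(process.env.SANDBOX_DB_URL ?? ""),
};
// Each request carries its realm (derived from the API key, for example), and all
// of its queries go through the matching pool; no table needs an is_sandbox column.
function dbFor(realm: Realm): Db {
  return pools[realm];
}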
Exception: If your application is an almost entirely public web site (e.g., Stack Overflow or Wikipedia), or involves social aspects that are difficult to replicate in a sandbox (like Facebook), more integrated sandboxes may make more sense.

How to store data in db so that nobody with access to it can understand it?

We are soon releasing a private beta of a domestic economy website.
The website of course gathers information about a user's (identified by email only) private financial situation: salary, rent, bills, mortgages, etc. All of this is really sensitive information and should not be accessible by anyone - not even us, the tech people.
What are best practices for storing data in a non-readable fashion? Of course, member passwords are already hashed in the db.
What I'm thinking about is encrypting all data using some kind of key. But then again, the application needs access to that key, and I don't want to store it in the db. If it's supplied by the user, I guess I could keep it in the session in order to decrypt every retrieved db result. But what about the overhead?
Please, anyone with guidelines?
First, separate the personally-identifiable information from the statistics. This allows you to perform computations without putting sensitive data at risk. Next, strongly encrypt the personally-identifiable information, and store the keys in a hardened system with limited access. Don't use the same key for all data, but the number of keys you do use is a design decision that is up to you. More keys will be more secure, but harder to handle.
There may be existing standards that apply to your data, depending on where in the world you are and what industries you are working with. Seek these out and follow them.
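As a hedged sketch of the "encrypt the personally-identifiable fields, keep the key elsewhere" idea, here is field-level AES-256-GCM with Node's crypto; fetching the key from a hardened key store is only hinted at, and the function names are illustrative.
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";
function encryptField(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // unique per value
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return [iv, tag, ciphertext].map(b => b.toString("base64")).join(".");
}
function decryptField(stored: string, key: Buffer): string {
  const [iv, tag, ciphertext] = stored.split(".").map(part => Buffer.from(part, "base64"));
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
// The 32-byte key would come from the hardened key store mentioned above, never
// from the same database that holds the ciphertext.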
Anyone with admin level access (your tech people for example) can get access to any decryption keys stored on the machine. Regarding session, any admin level person can do memory dumps to pull keys out of session.
Point is, the only "solution" of sorts is great off-machine access logging combined with a strong legal document acknowledging that they are being watched and that you will prosecute. Also, you should perform annual background checks on your tech people. Next, the number of people with that type of access should be extremely limited. As a CEO who takes an active part in our development process, even I don't have access to our production systems.
Regardless, you should still encrypt the database, especially the PII data. Depending on your industry you could be sued if you don't, and never mind the bad press if someone does pull a data dump.
What about a second key, secured by the user's passphrase? When the user logs in, that second key is decrypted and kept for the session.

Best practices to store CreditCard information into DataBase

In my country online payments are not an old thing; the first time I saw a web application taking payments directly to a local bank account was last year.
So I'm a newbie at coding web payment systems.
My question is: what are the best practices for storing credit card information in the database?
I have many ideas: encrypting the credit card number, database security restrictions, etc.
What have you done?
DON'T DO IT
There is simply far too much risk involved, and you will typically need to be externally audited to ensure that you're complying with all the relevant local laws and security practices.
There are many third-party companies that do it for you and have already gone through all the trouble of making sure their systems are secure and that they comply with local laws. An example in the US that I have used in the past is authorize.net. Some banks also have systems that you can hook into to store credit card data and process payments.
I realise the country you're in may not have laws as strict as the U.S., but in my opinion that's no excuse for rolling your own. When you're dealing with other people's money, the risk is just too great.
In 2020, use Stripe, and avoid storing payment information yourself.
HISTORICAL ANSWER:
For this, I recommend a comprehensive, layered approach.
First, storing credit card info at all should be an option the user chooses, not something you do by default.
Secondly, the data should be stored securely, using a strong form of encryption. I recommend AES with a 256-bit key size. Make sure that when choosing your key, you use the entire keyspace (it's a rookie mistake to just use a randomly generated alphanumeric-and-symbol string as a key).
Third, the AES key needs to be properly secured. Do not embed the value in your code. If you are using Windows, consider using DPAPI.
Fourth, you will want to set up database permissions so that applications and computers have access on a need-to-know basis.
Fifth, secure the connection string to your database.
Sixth, ensure that any application that will have access to the credit card data, will properly secure it.
At minimum, follow the PA-DSS (Payment Application Data Security Standard). More info can be found here:
https://www.pcisecuritystandards.org/security_standards/pa_dss.shtml
Also it would be wise to look at PCI DSS, which could be found here:
https://www.pcisecuritystandards.org/security_standards/pci_dss.shtml
You should avoid storing any credit card information due to the risks to you and to customers of doing so.
Encrypt encrypt encrypt. Don't decrypt if you don't absolutely have to - don't decrypt to show the last 4 digits. Don't decrypt to tell the user what their card was.
In fact, if you can, don't even keep the encrypted card numbers in the same physical server as the rest of the user information.
Authorize.net has a Customer Information Manager API that allows you to store customer information in their system. It costs $20/mo. as an add-on to your account.
I suggest you encrypt card numbers with a strong algorithm (such as AES) and a long secret key.
Then keep your secret key in a secure place, such as an external hard drive or optical disc.
When you need the secret key, use the external drive.
If you are using a shared host, you have to store your secret key on an external device.
Restrict your database:
Define strict users for your database.
Remove the root user of your database if it is not needed.

Secure encrypted database design

I have a web based (perl/MySQL) CRM system, and I need a section for HR to add details about disciplinary actions and salary.
All this information that we store in the database needs to be encrypted so that we developers can't see it.
I was thinking about using AES encryption, but what do I use as the key? If I use the HR Manager's password then if she forgets her password, we lose all HR information. If she changes her password, then we have to decrypt all information and re-encrypt with the new password, which seems inefficient, and dangerous, and could go horrifically wrong if there's an error half way through the process.
I had the idea that I could have an encryption key that encrypts all the information, and use the HR manager's password to encrypt the key. Then she can change her password all she likes and we'll only need to re-encrypt the key. (And without the HR Manager's password, the data is secure)
But then there's still the problem of multi-user access to the encrypted data.
I could keep a 'plaintext' copy of the key off site, and encrypt it with each new HR person's password. But then I know the master key, which doesn't seem ideal.
Has anyone tried this before, and succeeded?
GnuPG allows documents to be encrypted using multiple public keys, and decrypted using any one of the corresponding private keys. In this way, you could allow data to be encrypted using the public keys of the everyone in the HR department. Decryption could be performed by any one having one of the private keys. Decryption would require both the private key and the passphrase protecting the key to be known to the system. The private keys could be held within the system, and the passphrase solicited from the user.
The data would probably get quite bloated by GnuPG using lots of keys: it has to create a session key for the payload and then encrypt that key using each of the public keys. The encrypted keys are stored alongside the data.
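As a rough illustration of that multi-recipient encryption (assuming the HR public keys are already imported into the system's GnuPG keyring, shelling out to the gpg binary, and with made-up recipient addresses):
import { execFile } from "node:child_process";
function encryptForHr(inputPath: string, outputPath: string, recipients: string[]): Promise<void> {
  const args = ["--batch", "--yes", "--output", outputPath, "--encrypt"];
  for (const r of recipients) args.push("--recipient", r);
  args.push(inputPath);
  return new Promise((resolve, reject) =>
    execFile("gpg", args, err => (err ? reject(err) : resolve()))
  );
}
// Any one of the matching private keys (plus its passphrase) can later decrypt:
//   gpg --output record.txt --decrypt record.txt.gpg
encryptForHr("record.txt", "record.txt.gpg", ["hr-manager@example.com", "hr-officer@example.com"]);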
The weak parts of the system are that the private keys need to be available to the system (ie. not under the control of the user), and the passphrase will have to pass through the system, and so could be compromised (ie. logged, stolen) by dodgy code. Ultimately, the raw data passes through the system too, so dodgy code could compromise that without worrying about the keys. Good code review and release control will be essential to maintain security.
You are best avoiding using MySQL's built in encryption functions: these get logged in the replication, slow, or query logs, and can be visible in the processlist - and so anyone having access to the logs and processlist have access to the data.
Why not just limit access to the database or table in general? That seems much easier. If the developer has access to query production, there is no way to prevent them from seeing the data, because at the end of the day the UI has to decrypt and display the data anyway.
In my experience, the amount of work it takes to achieve "developers cannot see production data at all" is immense and nearly impossible. At the end of the day, if the developers have to support the system, it will be difficult to achieve. If you have to debug a production problem, then it's impossible not to give some developers access to production data. The alternative is to create a large number of levels and groups of support, backups, test data, etc.
It can work, but it's not as easy as business owners may think.
Another approach is to use a single system-wide key stored in the database - perhaps with a unique id so that new keys can be added periodically. Using Counter Mode, the standard MySQL AES encryption can be used without directly exposing the cleartext to the database, and the size of the encrypted data will be exactly the same as the size of the cleartext. A sketch of the algorithm:
The application generates a unique initial counter value for the record. This might be based on some unique attribute of the record, or you could generate and store a unique value for this purpose.
The application generates a stream of counter blocks for the record based on the initial counter value. The counter stream must be the same size or up to 1 block larger than the cleartext.
The application determines which key to use. If keys are being periodically rotated, then the most recent one should be used.
The counter stream is sent to the database to be encrypted: something like
select aes_encrypt( 'counter', key ) from hrkeys where key_id = 'id';
The resulting encrypted counter value is trimmed to the length of the cleartext, and XORed with the cleartext to produce the encrypted text.
The encrypted text is stored.
Decryption is exactly the same process applied to the encrypted text.
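A local sketch of that keystream construction follows; here AES-128-ECB on the counter blocks (from Node's crypto, with padding disabled) stands in for the database-side aes_encrypt call described above, and the XOR with the cleartext happens in the application either way.
import { createCipheriv } from "node:crypto";
function ctrKeystream(key: Buffer, initialCounter: Buffer, length: number): Buffer {
  const ecb = createCipheriv("aes-128-ecb", key, null);
  ecb.setAutoPadding(false);
  const counter = Buffer.from(initialCounter); // copy of the unique 16-byte counter block
  const blocks: Buffer[] = [];
  let produced = 0;
  while (produced < length) {
    const block = ecb.update(counter); // stands in for: select aes_encrypt(counter, key) ...
    blocks.push(block);
    produced += block.length;
    for (let i = counter.length - 1; i >= 0; i--) { // increment the counter (big-endian)
      counter[i] = (counter[i] + 1) & 0xff;
      if (counter[i] !== 0) break;
    }
  }
  return Buffer.concat(blocks).subarray(0, length); // trim the keystream to the cleartext length
}
function xorWithKeystream(data: Buffer, keystream: Buffer): Buffer {
  const out = Buffer.alloc(data.length);
  for (let i = 0; i < data.length; i++) out[i] = data[i] ^ keystream[i];
  return out;
}
// Encryption and decryption are the same operation:
//   ciphertext = xorWithKeystream(cleartext, ctrKeystream(key, counter0, cleartext.length))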
The advantages are that the cleartext never goes anywhere near the database, and so the administrators cannot see the sensitive data. However, you are then left with the problem of preventing your administrators from accessing the encrypted counter values or the keys. The first can be achieved by using SSL connections between your application and database for the encryption operations. The second can be mitigated with access control, by ensuring that the keys never appear in the database dumps, and by storing the keys in in-memory tables so that access control cannot be subverted by restarting the database with "skip-grants". Ultimately, the only way to eliminate this threat is to use a tamper-proof device (HSM) for performing encryption. The higher the security you require, the less likely it is that you will be able to store the keys in the database.
See Wikipedia - Counter Mode
I am just thinking out loud.
This seems to call for a public/private key mechanism. The information would be stored encrypted with the HR public key and would only be viewable by someone in possession of the associated private key.
This, to me, seems to rule out a web based interface to view these confidential data (entering them via the web interface is certainly feasible).
Given that individuals come and go, tying the keys to a specific person's account seems infeasible. Instead, one must handle key distribution separately and have a mechanism for someone to change the keypair used (and re-encrypt the database - again without the use of a web interface) in case the current HR manager is replaced with someone else. Of course, nothing would prevent the HR manager from dumping all the data before leaving, before the keys are replaced.
I'm not sure how feasible this is currently, or what current stable DB systems have support for this, but alternate authentication mechanisms at the database level may help. For example Drizzle, a refactoring of the MySQL code base, supports (or aims to?) completely pluggable authentication, allowing no auth, server housed auth, or auth through PAM or some other mechanism, meaning you can use LDAP.
If you had different levels of access based on the database connection, and the application login also specified what you could actually access in the database, you could theoretically build a system where it wasn't possible to access the confidential database info unless using an account with specific access rights, regardless of the privilege escalation attempts in the application itself.
As long as the people setting user account access rights can be trusted or themselves are OK to see the confidential information, this should be fairly secure.
P.S. It might be useful to use a generic DB connection for "regular" application information, but when an attempt to access confidential information is made, then the specific DB connection is attempted. This allows for a few DB connections to handle most requests, assuming the majority of users aren't viewing confidential info. Otherwise, a separate DB connection per user may become burdensome to the DB.
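A rough sketch of that P.S. (all names invented, and openConnection stands in for whatever driver you use): a shared low-privilege pool handles ordinary queries, and a per-user, higher-privilege connection is opened only when confidential data is touched, so the database's own grants do the final gatekeeping.
type AccessLevel = "regular" | "confidential";
interface Db { query(sql: string, params?: unknown[]): Promise<unknown[]>; }
declare function openConnection(dbUser: string, dbPassword: string): Db; // driver placeholder
const sharedPool = openConnection("app_regular", process.env.APP_DB_PASSWORD ?? "");
function connectionFor(level: AccessLevel, user: { dbUser: string; dbPassword: string }): Db {
  if (level === "regular") return sharedPool;
  // Confidential access uses the caller's own database credentials, so the DB's
  // grants decide what they may see, regardless of application-level bugs.
  return openConnection(user.dbUser, user.dbPassword);
}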