Having a hard time selecting appropriate variable names (language-agnostic)

I mostly develop in JS, but since React and its friends entered my life I feel like I'm writing many more variable names than I used to, and I'm having trouble naming them.
I usually try to pick something that both I and someone else can remember, that is logical, and that is not annoyingly long.
For instance, in a messaging app, user might represent the logged-in user, the recipient user, or the user API.
What I usually do is let user refer to any other user in the system,
while me refers to the logged-in user. The API depends...
What do you do? Should it be camelCase with long variable names,
such as userLoggedIn, userRecipient, userAPI, etc.? Is there a commonly used pattern or a source (book, community, etc.) that I can refer to?

My strategy is to first think about what the variable stands for in the most general sense. In your example it's a user. Then I prefix the noun with attributes that make it sufficiently precise. In your example that would be loggedInUser or currentUser, and recipientUser. I would not recommend the name userLoggedIn in this context because it sounds like a truth value (a boolean variable). For boolean variables I start by thinking of an adjective and add nouns in front of it.


Email Activation in rails

I have been looking through examples of email activation in Rails, and most of them just add columns for an activation token and a confirmed flag to their users table. I am not sure, but I don't think this is a good idea: once a user is activated, both of those columns seem like a waste. The way I was thinking of doing activation was to have a separate model called Activation which would declare has_one :user (a one-way association), and I would set the role of the user in my site to "PENDING" or something similar. The activation table would hold one or more activation tokens per user. Then a link would be generated with the activation token, and the user would be sent an email containing something like www.mysite.com/activate?token='some_really_long_hash'. Upon clicking the link, the role of my user would be set to "MEMBER" or something similar. Does this seem like a good idea? I can't conceive of any pitfalls of doing activation this way. Suggestions? Comments?
It sounds like you're at the intro stages of implementing a state machine design pattern on your user model, and no, it isn't a bad approach to design. It's just more complicated than what most people need.
I think the State Machine Plugin might be the type of approach you're looking to perform. Obviously this might be more than you're looking for but the approach would be the same.
Also check out these posts:
Why developers should be force-fed state machines
why-developers-never-use-state-machines
Good Luck!
The most straightforward approach is to generate a random token and save it into a column of the user or member record. It doesn't have to be "really long"; 20 random characters will suffice, as the probability of guessing that is so slim it will never happen.
Typically the token is used once and once only to validate the user, but if the user clicks on the email a subsequent time it's nice if it still redirects back to their profile.
Usually the user is switched to "validated" or something of the sort, a status flag stored in a separate column. This preserves their initial membership type which might be one of many values. This is why you often see validated_at fields or banned_until fields.
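The single-column approach described above could be sketched roughly as follows. This is only an illustration in Python rather than Rails: the User class, its field names, and the helper functions are hypothetical, and a real application would persist these values in the users table.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class User:
    # Hypothetical in-memory stand-in for a row in the users table.
    email: str
    membership_type: str = "basic"           # untouched by validation
    activation_token: Optional[str] = None
    validated_at: Optional[datetime] = None  # None means "not yet validated"


def issue_activation_token(user: User) -> str:
    # 15 random bytes encode to exactly 20 URL-safe characters.
    user.activation_token = secrets.token_urlsafe(15)
    return user.activation_token


def activate(user: User, token: str) -> bool:
    if user.activation_token is None:
        return False
    # Constant-time comparison of the stored and submitted tokens.
    if not secrets.compare_digest(user.activation_token.encode(), token.encode()):
        return False
    # Keep the first timestamp so a repeat click on the email stays harmless.
    if user.validated_at is None:
        user.validated_at = datetime.now(timezone.utc)
    return True
```

Because validation is recorded in validated_at rather than in the role or membership column, the original membership type survives activation, which is the point made above about separate status columns.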

How do I prevent my HTML from being exploitable while avoiding GUIDs?

I've recently inherited an ASP.NET MVC 4 code base. One problem I noted was the use of some database ids (ints) in the URLs as well as in HTML form submissions. The code in its present state is exploitable through both URL tinkering and creating custom HTML posts with different numbers.
Now while I can easily fix the URL problems by using session state or additional auth checks, I'm less sure about the database ids that get embedded into the HTML that the site spits out (i.e. I give them a drop down to fill). When the ids come back in a post, how can I be sure I put them there as valid options?
What is considered "best practice" in terms of addressing this problem?
While I appreciate I could just "GUID it up" I'm hesitant to do so because I find them a pain in the ass to work with when debugging databases.
Do I have a choice here? Must I GUID to prevent easy guessing of ids or is there some kind of DRY mechanism I can use to validate the usage of ids as they come back into the site?
UPDATE: A commenter asked about the exploits I'm expecting. Let's say I spit out an HTML form with a drop down list of all the locations one can import "treasure" from. The ids of the locations that the user owns are 1, 2 and 3; these are provided in the HTML. But the user examines the HTML, fiddles with it and decides to put together a POST with the id of 4 selected. 4 is not his location, it's someone else's.
Validate the ID passed against the IDs the user can modify.
It may seem tedious, but this is really the only way to make sure the user has access to what they're trying to modify. Using GUIDs without validation is security by obscurity: sure guessing them is hard, but you can potentially guess them given enough resources.
You can do this at the top of the controller before you do anything else with the posted data. If there's a violation, just throw an exception and have your global exception handler deal with it; you don't need to handle it in a pretty way since you can safely assume that the user is tampering with data in an unsupported way.
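As a minimal sketch of that check, written in Python rather than ASP.NET MVC: the OWNED_LOCATIONS table, TamperingError, and both function names are made up for illustration, and the dictionary stands in for whatever query returns the ids a user may modify.

```python
class TamperingError(Exception):
    """Raised when a request references an id the user does not own."""


# Hypothetical stand-in for a database lookup: user -> location ids they own.
OWNED_LOCATIONS = {
    "alice": {1, 2, 3},
    "bob": {4},
}


def require_owned_location(user_id: str, posted_location_id: int) -> int:
    allowed = OWNED_LOCATIONS.get(user_id, set())
    if posted_location_id not in allowed:
        # No friendly handling needed; a global handler can log and reject.
        raise TamperingError(
            f"user {user_id!r} may not use location {posted_location_id}"
        )
    return posted_location_id


def import_treasure(user_id: str, posted_location_id: int) -> str:
    # Validate first, before doing anything else with the posted data.
    location_id = require_owned_location(user_id, posted_location_id)
    return f"imported from location {location_id}"
```

With the data from the question, import_treasure("alice", 2) succeeds, while import_treasure("alice", 4) raises TamperingError regardless of how the id reached the server.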
The issue you describe is known as "insecure direct object references," and the OWASP group recommends two policies for dealing with this issue:
using session-based indirect object references, and
validating all accesses to object references.
An example of Suggestion #1 would be that instead of having dropdown options 1, 2, and 3, you assign each option a GUID that is associated with the original ID in a map in the user's session. When you get a POST from that user, you check to see what object the given ID was supposed to be tied to. OWASP's ESAPI has some libraries to help with this in various languages.
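A session-based indirect reference map of the kind described above might look roughly like this. It is a hand-rolled Python sketch, not the ESAPI API; the class name and its methods are assumptions made for the example, and one instance is assumed to live in each user's session.

```python
import secrets


class IndirectReferenceMap:
    """Per-session map between opaque tokens and real database ids."""

    def __init__(self) -> None:
        self._token_to_id: dict[str, int] = {}

    def add(self, real_id: int) -> str:
        # The token is random, so it reveals nothing about the real id.
        token = secrets.token_urlsafe(16)
        self._token_to_id[token] = real_id
        return token

    def resolve(self, token: str) -> int:
        # A token that was never handed out in this session is rejected.
        if token not in self._token_to_id:
            raise KeyError("unknown reference")
        return self._token_to_id[token]
```

When rendering the drop down, each option value would be the result of add(real_id); when the POST comes back, resolve() either returns an id that was genuinely offered or fails.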
But Suggestion #1 is often counterproductive. For example, in many cases you want to have URLs that can be copied and pasted from one user to another. Suggestion #2 is generally seen as the most foolproof way to address this issue.
You are describing Broken Access Control with Insecure Ids. Once you've identified the threat and decided which Ids are owned by certain users, ensure checks are in place for this server side.

Should I instantiate a collection or inherit from collection?

I've asked myself this question a number of times when creating classes, particularly those involving collections, but I've never come up with a satisfactory answer. It's an OOP design question.
For example, in a checkbook register program, say I have a BankAccount class. A BankAccount contains data such as the Name of the account, the Type of account (an enum of Checking, Saving, ...), and other data, but most importantly a collection of the Adjustments (deposits or withdrawals) in the account.
Here, I have two options for keeping a collection of the Adjustments:
Instantiate a collection and keep it as a member within the BankAccount class. This is like saying "BankAccount has a collection of Adjustments."
Inherit from collection. This is like saying "BankAccount is a collection of Adjustments."
I think both solutions are intuitive, and, of course, both offer some advantages and disadvantages. For example, instantiating allows the class (in languages that allow only a single base class) to inherit from another class, while inheriting from collection makes it easy to control the Add, Remove, and other methods without having to write surrogate methods to 'wrap' those.
So, in such situations, which is a better approach?
To me, a bank account has a collection of adjustments. A bank account is not a collection of adjustments, because it "is" much more than that: it also "is" a name and a type, etc.
So, in your case (and similar cases), I suggest you aggregate a collection inside your class.
In case of doubt, avoid inheritance. :-)
I can argue this further. In order to use inheritance properly, the subclass must satisfy Liskov's substitution principle; this means that, in your case, BankAccount should be a valid type anywhere a Collection is expected. I don't think that's the case, because a Collection probably exposes methods such as Add() and Remove(), whereas you will want to exert some control over adding and removing adjustments from your bank account rather than letting people add and remove them freely.
Personally, I would say BankAccount has a collection of Adjustment. It will probably have other properties that aren't exclusively about what has been deposited or withdrawn (customer, bank account type, etc.).
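To make the "has a" option concrete, here is a small Python sketch. The Adjustment fields and the zero-amount rule are invented for illustration; the point is only that the account, not the collection, decides how adjustments are added and what callers may see.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Tuple


@dataclass(frozen=True)
class Adjustment:
    amount: Decimal  # positive for a deposit, negative for a withdrawal


class BankAccount:
    def __init__(self, name: str, account_type: str) -> None:
        self.name = name
        self.account_type = account_type
        self._adjustments: List[Adjustment] = []  # "has a", kept private

    def add_adjustment(self, adjustment: Adjustment) -> None:
        # The account enforces its own rule (hypothetical: no zero amounts).
        if adjustment.amount == 0:
            raise ValueError("an adjustment must change the balance")
        self._adjustments.append(adjustment)

    @property
    def adjustments(self) -> Tuple[Adjustment, ...]:
        # Immutable snapshot: callers can read but not add or remove.
        return tuple(self._adjustments)

    @property
    def balance(self) -> Decimal:
        return sum((a.amount for a in self._adjustments), Decimal("0"))
```

Had BankAccount inherited from list instead, append(), remove() and clear() would all be public, and the zero-amount rule could be bypassed by any caller.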
In terms of design, my BankAccount object would expose a late-loading property of type Adjustments.
In terms of use within the code, I would instantiate the bank account, and if I needed to know what had gone in and out of the account, I would use the exposed property. The BankAccount would be the primary object, responsible for providing the Adjustments related only to the instantiated account.
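One way such a late-loading property could look in Python is sketched below. The loader callable is a hypothetical stand-in for a database query, and adjustments are shown as plain strings to keep the example short.

```python
from functools import cached_property
from typing import Callable, List


class LazyBankAccount:
    def __init__(
        self, account_id: int, load_adjustments: Callable[[int], List[str]]
    ) -> None:
        self.account_id = account_id
        self._load_adjustments = load_adjustments  # hypothetical data source

    @cached_property
    def adjustments(self) -> List[str]:
        # Runs on first access only; the result is then cached on the instance.
        return self._load_adjustments(self.account_id)
```

Code that only needs the account's name or type never triggers the load; the adjustments are fetched the first time the property is read and reused afterwards.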
Instantiate, definitely.
I agree with the other posters about Bank Account being "more" than just a collection of other items.
Or maybe you just picked an example which really screams out for "instantiate".
Examples:
What happens if your Bank Account needs a second collection of completely different items? (Example: collection of people who can operate on it, like husband and wife, for example, collection of credit cards, paypal accounts or anything else that can "operate" on your bank account).
Depending on the language, a collection exposes too much of its internals: if another object needs to access Adjustments, say for displaying your transaction history on a web page, you automatically expose your "collection" to insertion, deletion and so on.
I think getting overly caught up in semantics like "is this more is-a or has-a" is a little bit dangerous - at the end of the day, what matters is how well your design solved the problem, how maintainable it is, etc. In fact, personally, a turning point in the way I understand object oriented programming was letting go of "objects as nouns". Objects are, when you get down to it, an abstraction one level up from a function, nothing more or less.
This was a long way to say "has a". :-) Subclassing is complicated, using is easy.

Generally a Good Idea to Always Hash Unique Identifiers in URL?

Most sites which use an auto-increment primary-key display it openly in the url.
i.e.
example.org/?id=5
This makes it very easy for anyone to spider a site and collect all the information by simply incrementing the value of id. I can understand that in some cases this is a bad thing, if permissions/authentication are not set up correctly and anyone could view anything by simply guessing the id, but is it ever a good thing?
example.org/?id=e4da3b7fbbce2345d7772b0674a318d5
Is there ever a situation where hashing the id to prevent crawling is bad practice (besides losing the time it takes to set up this functionality)? Or is this all a moot topic because by putting something on the web you accept the risk of it being stolen/mined?
Generally with websites you're trying to make them easy to crawl and get access to all the information so that you can get good search rankings and drive traffic to your site. Good web developers design their HTML with search engines in mind, and often also provide things like RSS feeds and site maps to make it easier to crawl content. So if you're trying to make crawling more difficult by not using sequential identifiers then (a) you aren't making it more difficult, because crawlers work by following links, not by guessing URLs, and (b) you're trying to make something more difficult that you also spend time trying to make easier, which makes no sense.
If you need security then use actual security. Use checks of the principal to authorize or deny access to resources. Obfuscating URLs is no security at all.
So I don't see any problem with using numeric identifiers, or any value in trying to obfuscate them.
Using a hash like MD5 or SHA on the ID is not a good idea:
there is always the possibility of collisions. That is, two different IDs hash to the same value.
How are you going to unhash it back to the actual ID?
A better approach if you're set on avoiding incrementing IDs would be to use a GUID, or just a random value when you create the ID.
That said, if your application security relies on people not guessing an ID, that shows some flaws elsewhere in the system. My advice: stick to the plain and easy auto-incrementing ID and apply some proper access control.
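To illustrate why a plain hash of a sequential ID hides very little, note that the example value in the question is simply the MD5 of the string "5". The Python sketch below (function names are made up for the example) shows that such a hash can be undone by enumeration, and contrasts it with a random identifier generated when the record is created.

```python
import hashlib
import uuid


def md5_of_id(record_id: int) -> str:
    return hashlib.md5(str(record_id).encode()).hexdigest()


def recover_id(hashed: str, max_id: int = 1_000_000):
    # Sequential ids come from a tiny space, so trying them all undoes the hash.
    for candidate in range(1, max_id + 1):
        if md5_of_id(candidate) == hashed:
            return candidate
    return None


def new_public_id() -> str:
    # Random at creation time; not derived from the row's position.
    return uuid.uuid4().hex


print(recover_id("e4da3b7fbbce2345d7772b0674a318d5"))  # prints 5
```

Neither variant replaces access control; the random identifier only avoids revealing how many records exist or which one comes next.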
I think hashing publicly accessible IDs is not a bad thing, but showing sequential IDs will in some cases be a bad thing. Even better, use GUIDs/UUIDs for all your IDs. You can even use sequential GUIDs in a lot of technologies, which makes inserts faster (though they are not as good in a distributed environment).
Hashing or randomizing identifiers or other URL components can be a good practice when you don't want your URLs to be traversable. This is not security, but it will discourage the use (or abuse) of your server resources by crawlers, and can help you to identify when it does happen.
In general, you don't want to expose application state, such as which IDs will be allocated in the future, since it may allow an attacker to use a prediction in ways that you didn't foresee. For example, BIND's sequential transaction IDs were a security flaw.
If you do want to encourage crawling or other traversal, a more rigorous way would be to provide links, rather than by providing an implementation detail which may change in the future.
Using sequential integers as IDs can make many things cheaper on your end, and might be a reasonable tradeoff to make.
My opinion is that if something is on the web, and is served without requiring authorization, it was put with the intention that it should be publicly accessible. Actively trying to make it more difficult to access seems counter-intuitive.
Often, spidering a site is a Good Thing. If you want your information available as much as possible, you want sites like Google to gather data on your site, so that others can find it.
If you don't want people to read through your site, use authentication, and deny access to people who don't have access.
Random-looking URLs only give the impression of security, without giving the reality. If you put (hidden) account information in a URL and a web spider picks it up, everyone will have access to that account.
My general rule is to use a GUID if I'm showing something that has to be displayed in a URL and also requires credentials to access or is unique to a particular user (like an order id). http://site.com/orders?id=e4da3b7fbbce2345d7772b0674a318d5
That way another user won't be able to "peek" at the next order by hacking the url. They may be denied access to someone else's order, but throwing a zillion letters and numbers at them is a pretty clear way to say "don't mess with this".
If I'm showing something that's public and not tied to a particular user, then I may use the integer key. For example, for displaying pictures, you might wish to allow your users to hack the url to see the next picture.
http://example.org/pictures?id=4, http://example.org/pictures?id=5, etc.
(I actually wouldn't do either as a simple GET parameter, I'd use mod_rewrite (or something) to make readable urls. Something like http://example.org/pictures/4 -> /pictures.php?picture_id=4, etc.)
Hashing an integer is a poor implementation of security by obscurity, so if that's the goal, a true GUID or even a "sequential" GUID (whether via NEWSEQUENTIALID() or COMB algorithm) is much better.
Either way, no one types URLs anymore, so I don't see much sense in worrying about the difference in length.

How do you handle exceptional cases?

This situation comes up often; here is the latest example:
Companies have various contact data (addresses, phone numbers, e-mails...). When they create a job ad, they tick checkboxes to choose how they want to be contacted. It is basically descriptive data. A user reading an ad sees something like "You can apply by mail, in person...", except when the option is "through web portal" or "by e-mail", because then the appropriate buttons should appear. These options are stored in the database, and the client (the owner of the site, not the company placing an ad) can change them (e.g. they can add "by telepathy" or whatever), yet if they tamper with the "e-mail" and "web-portal" options, they break their website.
So how should I handle data where everything behaves the same way, except "this thing" behaves one way and "that thing" behaves some other way, and the data itself is live and should be editable by the client?
You've tagged your question as "language-agnostic", and not all languages cleanly support polymorphism, but that's the way I would approach this.
Each option has some type, and different types require different properties to be set. However, every type supports some sort of "render" method that can display the contact method as needed. Since the properties (phone number, or web address, etc.) are type-specific, you can validate the administrator's input when creating these "objects", to make sure that the necessary data is provided and valid. Since you implement the render method, rather than spitting out HTML provided by a user, you can ensure that the rendered page is correct. It's less flexible, but safer and more user friendly.
In the database, you can have one sparsely populated table that holds data for all types of contacts, or a "parent" table with common properties and sub-tables with type-specific properties. It depends on how many types you have and how different they are. In either case, you would have some sort of type indicator, so that you know the type of object to which the data should be bound.
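The polymorphic approach could be sketched like this in Python. The class names, the validation rules, and the HTML produced are all assumptions for illustration; only the idea of a shared render method with type-specific, validated properties comes from the description above.

```python
from abc import ABC, abstractmethod
from html import escape


class ContactOption(ABC):
    @abstractmethod
    def render(self) -> str:
        """Return the HTML fragment for this contact method."""


class DescriptiveOption(ContactOption):
    def __init__(self, label: str) -> None:
        if not label.strip():
            raise ValueError("label is required")
        self.label = label

    def render(self) -> str:
        # Text entered by the administrator is escaped, never emitted raw.
        return f"<li>{escape(self.label)}</li>"


class EmailOption(ContactOption):
    def __init__(self, address: str) -> None:
        # Deliberately crude check; real validation would be stricter.
        if "@" not in address:
            raise ValueError("a valid e-mail address is required")
        self.address = address

    def render(self) -> str:
        return f'<li><a href="mailto:{escape(self.address)}">Apply by e-mail</a></li>'
```

The type indicator stored in the database would decide which class to instantiate for each row, so the client can add new descriptive options freely without being able to break the e-mail or web-portal behaviour.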
First of all, think twice about whether you really need it. The reason is simple: you are supposed to serve a specific need, and input data is a means to provide that service. If the data does not fit the existing service, then what is its value, and who are the consumers of that specific information?
There are two possible answers: you are expanding your client base, or you need to change the existing service because of a change in demand. In both cases you need to start from developing the business model. If you describe what service you need and what information it should provide, you will avoid much of the specific data and come up with clear requirements that are easy to implement in software.
I'd recommend the resolution pattern for this, based on the mention of a database. The link above describes it, but it's actually a lot simpler than it sounds. You write a database query that returns all the possible options (for example, you read the standard options and the customized options together using perhaps a UNION or a JOIN, depending on your schema); the SQL COALESCE function is then useful to find the first 'resolution' of the option value that isn't NULL.
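One possible reading of that query, using Python's built-in sqlite3 module so it can be run as-is. The two tables and their columns are a hypothetical schema: standard_options holds the built-in labels, and custom_options holds the client's overrides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE standard_options (code TEXT PRIMARY KEY, label TEXT NOT NULL);
    CREATE TABLE custom_options (code TEXT PRIMARY KEY, label TEXT);
    INSERT INTO standard_options VALUES ('email', 'by e-mail'), ('mail', 'by mail');
    INSERT INTO custom_options VALUES ('mail', 'by post');
""")

# COALESCE returns its first non-NULL argument, so a custom label wins
# when one exists and the standard label is used otherwise.
rows = conn.execute("""
    SELECT s.code, COALESCE(c.label, s.label) AS label
    FROM standard_options AS s
    LEFT JOIN custom_options AS c ON c.code = s.code
    ORDER BY s.code
""").fetchall()

print(rows)  # [('email', 'by e-mail'), ('mail', 'by post')]
```

Because the special options are identified by their code rather than their label, the client can rename "by mail" without affecting how 'email' is recognised.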
Well, if all it is is that you have two options that are special, and then anything else is dealt with in the same way, then store your options as strings, and if either of the two special ones appears in that list, then show the appropriate stuff for that special item.
Just check your list of items for the two special ones. Nothing fancy.
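In Python, that check might be as small as the following; the option strings and the button markup are placeholders chosen for the example.

```python
# The two special options and what to show for them (placeholder markup).
SPECIAL_RENDERERS = {
    "by e-mail": "<button>Apply by e-mail</button>",
    "through web portal": "<button>Apply online</button>",
}


def render_option(option: str) -> str:
    # Anything that is not special is treated as plain descriptive text.
    return SPECIAL_RENDERERS.get(option, f"You can apply {option}.")
```

The trade-off is that the special behaviour is keyed on the stored string, so renaming one of the two special options in the database silently turns it back into plain text.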
By writing a very simple Rules Engine. You can use an out-of-the-box implementation, or you can roll your own. Since your case seems so simple, I would tend to roll my own, because it means fewer dependencies (YMMV).
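A hand-rolled rules engine for this case can be very small. The sketch below is one assumed shape, not a standard design: each rule pairs a condition with an action, rules are tried in order, and the last rule is a catch-all default.

```python
from typing import Callable, List, Tuple

# A rule is (condition, action); both receive the option text.
Rule = Tuple[Callable[[str], bool], Callable[[str], str]]

RULES: List[Rule] = [
    (lambda o: o == "by e-mail", lambda o: "[e-mail button]"),
    (lambda o: o == "through web portal", lambda o: "[apply online button]"),
    (lambda o: True, lambda o: f"You can apply {o}."),  # default rule
]


def apply_rules(option: str, rules: List[Rule] = RULES) -> str:
    # First matching rule wins.
    for matches, action in rules:
        if matches(option):
            return action(option)
    raise LookupError(f"no rule matched {option!r}")
```

Adding a third special case then means adding one entry to the list rather than another branch in the rendering code.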