Store activation key in a separate table using MyISAM - MySQL

I want to make a login system. I want a confirmation by sending an activation code (clickable link) by email. I considered storing the activation key in separate table from the user information, since these are only relevant for non-activated users. When a user registers a row containing user information would be inserted in the users table and the activation key in the activation table. Once the link is clicked, I remove the record from the activation table.
But since I can't use InnoDB on my hosting, I can't use transactions, so this is not fault-proof. I have two options.
Option A:
I keep the key in the activation table.
I store a boolean in the user table to check whether activation is necessary. If activation is needed and there is no record to be found in activation table, there can be a new attempt to add the record and resend an email to the user.
* more checks (php, in case no record was found)
* joins in selects for checking
* more inserts/deletes/updates
Option B:
Or I can store the activation key in the user table, which uses more space that is not always needed.
* Does unused storage always take up space using MyISAM?
* What is the recommended length for an activation key?
* Is the boolean still needed, or can I set the activation key to NULL in order to check whether a user has been activated or not?
What is the best solution and why? Speed, space,....?

The heart of your question seems to relate to the moment of activation. You seem to be concerned that you'll get a transactional race condition around activation, and you won't be able to prevent it because you must use MyISAM instead of InnoDB.
This doesn't seem to be a critical problem. No harm will be done if a new user attempts to activate multiple times with the same correct token, or if she attempts to activate with an incorrect token at the same time as a correct token.
What is a critical success factor? The performance of a normal authentication operation (login) for an active user. If your query must join to a separate activation-token table for every user who logs in, that's not going to give you ideal performance. Neither is a query containing a clause like this:
AND user.activation_token IS NOT NULL
You may want to use a Boolean column value (a short integer) to indicate "activation pending" in your user table. If that column comes up true during normal login, you can invoke the extra, infrequently used, logic for activation. If, for example, you need to be able to accelerate this operation:
SELECT hashed_password, activation_pending
FROM user
WHERE username = ?
you can create a compound index on (username, hashed_password, activation_pending) and make that operation very efficient. Then when you successfully complete the pending activation (a relatively infrequent operation), you can do these two operations.
UPDATE user
SET activation_pending = 0
WHERE username = ?;
DELETE
FROM activation_token
WHERE user = ?;
Once activation_pending is set to zero, that's enough to take care of the race condition: your logic won't look at your activation_token table any more.
varchar columns don't take much space if they contain zero-length strings. char columns do.
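The flag-then-cleanup flow described above can be sketched as follows. This is a minimal illustration, not the answerer's code: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the helper names (make_db, register, activate) and the username column in activation_token are assumptions made for the example.

```python
import sqlite3

# SQLite stands in for MySQL/MyISAM here; schema and helper names are illustrative.
def make_db():
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE user (
            username TEXT PRIMARY KEY,
            hashed_password TEXT NOT NULL,
            activation_pending INTEGER NOT NULL DEFAULT 1
        );
        CREATE TABLE activation_token (
            username TEXT PRIMARY KEY,
            token TEXT NOT NULL
        );
    """)
    return db

def register(db, username, hashed_password, token):
    # Two separate statements, as on a non-transactional engine.
    db.execute("INSERT INTO user (username, hashed_password) VALUES (?, ?)",
               (username, hashed_password))
    db.execute("INSERT INTO activation_token (username, token) VALUES (?, ?)",
               (username, token))

def activate(db, username, token):
    """Return True if the user is active after this call."""
    row = db.execute("SELECT activation_pending FROM user WHERE username = ?",
                     (username,)).fetchone()
    if row is None:
        return False
    if row[0] == 0:
        return True  # already active: the token table is not consulted
    match = db.execute(
        "SELECT 1 FROM activation_token WHERE username = ? AND token = ?",
        (username, token)).fetchone()
    if match is None:
        return False
    # Flip the flag first; deleting the token row afterwards is only cleanup.
    db.execute("UPDATE user SET activation_pending = 0 WHERE username = ?",
               (username,))
    db.execute("DELETE FROM activation_token WHERE username = ?", (username,))
    return True
```

If the DELETE never runs (for example, the script dies between the two statements), the leftover token row is harmless because the flag already says the account is active.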

Related

How do I decrease an Integer in MySQL?

I have a server which uses a MySQL database for storing users. The primary key is an auto-increment integer.
The problem is when registration fails (on the website provided by my server) the integer still increases by 1, which means: if a user succeeds in signing up, the user will get the id of one. Just as it should be. However, if a user then fails to register (the username already being taken for example), and then succeeds to register the user will get the id of 3.
I am looking for syntax like: UserId-- or UserId = Userid - 1.
Auto-increment values are not intended to be consecutive. They must be unique, that's all.
If you try to decrement the auto-increment value on error, you'll create a race condition in your app. For example:
Your session tries to register a user. Suppose this generates id 41.
A second session running at the same time also tries to register a user. This generates id 42.
Your session returns an error, because the username you tried to register already exists. So you try to force MySQL to decrement the auto-increment.
A third session registers another user, using id 41. The auto-increment increments so the next registration will use id 42.
The next session tries to register with id 42, but this id has already been used, resulting in mass hysteria, the stock market crashes, and dogs and cats start living together.
Lesson: Don't get obsessed with ids being consecutive. They're bound to have gaps from time to time. Either an insert fails, or else you roll back a transaction, or you delete a record, etc. There are even some bugs in MySQL that cause auto-increment to skip numbers sometimes. But there's no actual consequence to those bugs.
The way to avoid the race condition described above is that auto-increment must not decrement, even if an insert fails. It only increases. This does result in "lost" values sometimes.
I helped a company in exactly the same situation you are in, where registration of new usernames caused errors if the username was already taken. In their case, we found that someone using their site was creating a new account every day, and they were automating it. They would try "user1" and get an error, then try "user2" and "user3" and so on, until they found one that was available. But it was causing auto-increment values to be discarded, and every day the gap became larger and larger. Eventually, they ran out of integers, because the gaps were losing 1500+ id values for each registration.
(The site was an online gambling site, and we guessed that some user was trying to create daily throwaway accounts, and they had scripted a loop to register user1 through userN, but they started over at 1 each day, not realizing the consequences for the database.)
I recommended to them this fix: Change the registration code in their app to SELECT for the username first. If it exists, then return a friendly warning to the user and ask them to choose another name to register. This is done without attempting the INSERT.
If the SELECT finds there is no username, then try the INSERT, but be ready to handle the error anyway, in case some other session "stole" that username in the moment between the SELECT and the INSERT. Hopefully this will be rare, because it's a slim chance for someone to sneak their registration in between those two steps.
In any case, do not feel obliged to have consecutive id values.
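A minimal sketch of the SELECT-first registration described above, assuming a users table with a UNIQUE username column. It uses Python's sqlite3 as a stand-in for MySQL (SQLite's AUTOINCREMENT does not behave identically to MySQL's auto-increment), and the function names are hypothetical.

```python
import sqlite3

def make_db():
    db = sqlite3.connect(":memory:")
    db.execute("""
        CREATE TABLE users (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            username TEXT NOT NULL UNIQUE
        )
    """)
    return db

def register(db, username):
    """Return the new user's id, or None if the username is taken."""
    # Step 1: check first, so the common "name taken" case never reaches INSERT.
    taken = db.execute("SELECT 1 FROM users WHERE username = ?",
                       (username,)).fetchone()
    if taken:
        return None
    # Step 2: insert, but still handle the unique-key error in case another
    # session claimed the name between the SELECT and the INSERT.
    try:
        cur = db.execute("INSERT INTO users (username) VALUES (?)", (username,))
    except sqlite3.IntegrityError:
        return None
    return cur.lastrowid
```

The UNIQUE constraint remains the real guarantee; the SELECT only keeps the failed-INSERT path rare.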

When to use a relational database structure?

I'm creating a file hosting service, but right now I am creating the account email activation part of registering. So I had to come up with a database structure.
And right now it's:
users
  id
  first_name
  last_name
  email
  password
  since
  active
  hash_activate
But I can do it like a relational database too:
users
  id
  first_name
  last_name
  email
  password
  since
activation
  id
  user_id
  hash
  active
What would be the best way to go about it? And why?
If every person has only one activation hash active at a time, then it's feasible to store it in the same table as the users.
However, one advantage of separating it is that users only have an activation hash for a brief period of time, so to keep the user records smaller, you could store the hashes in a separate table. Keeping the user records small keeps the table more performant. In this case, you wouldn't need an active column. You'd just delete inactive hashes.
If you do store the activation columns in the user table, just be sure to select the columns by name. E.g. in most cases, you'll want to do this:
SELECT id, first_name, last_name, email, password
FROM users
Instead of:
SELECT *
FROM users
You'd only want to select the activation columns when you needed them.
The second would only be sensible if one user could have multiple activations. You don't say whether this is true or false, so I couldn't possibly advise you.
If activations are a temporary thing, or having a hash defines someone as active, then make them different. Otherwise, that really won't matter.
However, neither is necessarily more or less relational than the other, without much more information. If you put a unique constraint on the combination of values in each row, and set each column up with a NOT NULL constraint, your first one would be quite relational.
You use a relational design when correctness of data, over time, is as important, if not more important, than what the application does with that data, and/or when data structure correctness/consistency is critical to the correct operation of an application, but might not necessarily be guaranteed by the application's own operation.

Recommend to track all logins, update login table, or both?

Currently I am having a hard time deciding/weighing the pros/cons of tracking login information for a member website.
Currently
I have two tables, login_i and login_d.
login_i contains the member's id, password, last login datetime, and total count of logins. (member id is primary key and obviously unique so one row per member)
login_d contains a history of all logins, tracking each and every time a login occurs. It contains the member's id, the datetime of the login, and the ip_address of the login. This table's primary key is simply an auto-incremented INT field, which serves no real purpose, but the table needs a primary key and this is the only unique single field (an index, on the other hand, is a different matter, but I'm not concerned with that yet).
In many ways I see these tables as being very similar but the benefit of having the latter is to view exactly when a member logged in, how many times, and which IP it came from. All of the information in login_i (last login and count) truthfully exists in login_d but in a more concise form without ever needing to calculate a COUNT(*) on the latter table.
Does anybody have advice on which method is preferred? Two tables will exist regardless but should I keep record of last_login and count in login_i at all if login_d exists?
added thought/question
good comment made below - what about also tracking login attempts based on a username/email/ip? Should this ALSO be stored in a table (a 3rd table I assume).
This is called denormalization.
Ideally, you would never denormalize.
It is sometimes done anyway to avoid recomputing expensive results, possibly like your total login count value.
The downside is that you may at some point get into a situation where the value in one table does not match the values in the other table(s). Of course you will try your best to keep them properly up to date, but sometimes things happen. In that case, you may get bugs in application logic that receives an incorrect value from one of the sources.
In this specific case, a count of logins is probably not that critical to the successful running of the app, so it is not a big risk, although you will still have the overhead of maintaining the value.
Do you often need last login and count? If yes, then you should store them in login_i as well. If they're rarely used, then you can take the time to run the query against the giant table of all logins instead of storing duplicated data.
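For comparison, here is a rough sketch of deriving the last login and the login count from login_d on demand instead of storing them in login_i. It uses Python's sqlite3 as a stand-in for MySQL; the column names (member_id, login_at, ip_address), the index, and the helper names are assumptions for illustration.

```python
import sqlite3

def make_db():
    db = sqlite3.connect(":memory:")
    db.execute("""
        CREATE TABLE login_d (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            member_id INTEGER NOT NULL,
            login_at TEXT NOT NULL,      -- ISO-8601 text, so MAX() orders correctly
            ip_address TEXT NOT NULL
        )
    """)
    # An index on (member_id, login_at) keeps the per-member aggregate cheap.
    db.execute("CREATE INDEX idx_login_d_member ON login_d (member_id, login_at)")
    return db

def record_login(db, member_id, login_at, ip_address):
    db.execute(
        "INSERT INTO login_d (member_id, login_at, ip_address) VALUES (?, ?, ?)",
        (member_id, login_at, ip_address))

def login_summary(db, member_id):
    """Return (login_count, last_login) computed from the history table."""
    row = db.execute(
        "SELECT COUNT(*), MAX(login_at) FROM login_d WHERE member_id = ?",
        (member_id,)).fetchone()
    return row[0], row[1]
```

Because the summary is computed from the only copy of the data, it cannot drift out of sync; the cost is an aggregate query each time it is needed.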

Question for Conflict in insertion of data in DB by user and admin, see below for description

What happens when an admin is editing the details of user "A" in the "users" table and, at the same time, user "A" edits their own details in the same table? Whose change will take effect? And what can be done to give priority to one of them?
Thanks and Regards...
As Michael J.V. says, the last one wins - unless you have a locking mechanism, or build application logic to deal with this case.
Locking mechanisms tend to dramatically reduce the performance of your database.
http://dev.mysql.com/doc/refman/5.5/en/internal-locking.html gives an overview of the options in MySQL. However, the scenario you describe - Admin accesses record, has a lock on that record until they modify the record - will cause all kinds of performance issues.
The alternative is to check for a "dirty" record prior to writing the record back. Pseudocode:
User finds record
Application stores (hash of) record in memory
User modifies copy of record
User instructs application to write record to database
Application retrieves current database state, compares to original
If identical
    write change to database
If not identical
    notify user
In this model, the admin's change would trigger the "notify user" flow; your application may decide to stop the write, or force the user to refresh the record from the database prior to modifying it and trying again.
More code, but far less likely to cause performance/scalability issues.
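The pseudocode above could be sketched roughly like this. It is an illustration only, using Python's sqlite3 in place of MySQL; the users columns (name, email) and the helper names (fingerprint, fetch, save) are hypothetical.

```python
import hashlib
import json
import sqlite3

def make_db():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
    db.execute("INSERT INTO users VALUES (1, 'A', 'a@example.com')")
    return db

def fingerprint(record):
    # Hash of the record as it looked when it was read.
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def fetch(db, user_id):
    """Return (record, fingerprint) for the current database state."""
    row = db.execute("SELECT name, email FROM users WHERE id = ?",
                     (user_id,)).fetchone()
    record = {"name": row[0], "email": row[1]}
    return record, fingerprint(record)

def save(db, user_id, changes, original_fingerprint):
    """Write changes only if the record is unchanged since it was read."""
    current, current_fp = fetch(db, user_id)
    if current_fp != original_fingerprint:
        return False  # record is "dirty": caller should notify the user
    current.update(changes)
    db.execute("UPDATE users SET name = ?, email = ? WHERE id = ?",
               (current["name"], current["email"], user_id))
    return True
```

A small window still remains between the comparison and the UPDATE; this sketch does not close it, it only makes silent overwrites much less likely.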

MySQL - Saving and loading

I'm currently working on a game, and a while ago I started on loading and saving.
I've been thinking, but I really can't decide, since I'm not sure which would be more efficient.
My first option:
When a user registers, only one record is inserted (into the 'characters' table). After the user logs in successfully, the server loads all of the user's information, which is spread across multiple tables and combined via MySQL 'LEFT JOIN'. It runs through the information and applies it to the entity instance; if it runs into a NULL (meaning the information isn't in the database yet), it automatically uses a default value.
At saving, it'll insert or update, so that any defaults that have been generated at loading will be saved now.
My second option:
Simply insert all the required rows at registration (rows are inserted from the website when the registration is finished).
Downsides to first option: useless checks once the user has logged in at least once, since all the rows will have been generated after the first login.
Upsides to first option: if any records are deleted from the tables, it would insert default data instead of kicking the player off saying their character information is damaged/lost.
Downsides to second option: it could waste a bit of space, since all rows are inserted at registration, and there could be spamming bots and people who never manage to get online.
Upsides to second option: We don't have to check for anything in the server.
I also noted that the first option may screw up any search systems (via admincp, if we try looking up specific users).
I would go with the second option: add default rows to your user account, and flag the main user table as incomplete. This will maintain data integrity across your database, since every user record is complete in its entirety. If you need to remove the record, you can simply add a cascading delete script to clean house.
Also, I wouldn't develop your data schema based on malicious bots creating accounts. If you are concerned about the integrity of your user accounts, add some sort of data validation to your solution, or an automated clean-house script to clear out incomplete accounts once they meet certain criteria, e.g. the creation date passing a certain threshold.
You mention that there's multiple tables of data for each user, with some that can have a default value if none exist in the table. I'm guessing this is set up something like a main "characters" table, with username, password, and email, and a separate table for something like "favorite shortcuts around the site", and if they haven't specified personal preferences, it defaults to a basic list of "profile, games list, games by category" etc.
Then the question becomes when registering, should an explicit copy of the favorite shortcuts default be added for that user, or have the null value default to a default list?
I'd suggest that it depends on the nature of the auxiliary data tables; specifically the default value for those tables. How often would the defaults change? If the default changes often, a setup like your first option would mean that users with only a 'basic' entry frequently get new auxiliary data, while those who specified their own entries keep their preferences. Using your second option, if the default changed, a search/replace would have to be done to change entries that held the old default to the new default in order to keep users updated.
The other suggestion is to take another look at your database structure. You don't mention that your current table layout is set in stone; is there a way to not have all the LEFT JOIN tables, and have just one 'characters' table?
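As a rough illustration of the first option from the question (LEFT JOIN at load time, with a default applied wherever the join yields NULL), here is a small sketch. It uses Python's sqlite3 in place of MySQL; the shortcuts table, its columns, the DEFAULT_SHORTCUTS value, and the load_character helper are all hypothetical, loosely based on the "favorite shortcuts" guess above.

```python
import sqlite3

# Hypothetical application-level default, applied only when no row exists.
DEFAULT_SHORTCUTS = "profile,games list,games by category"

def make_db():
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE characters (id INTEGER PRIMARY KEY, username TEXT NOT NULL);
        CREATE TABLE shortcuts (character_id INTEGER PRIMARY KEY, items TEXT NOT NULL);
        INSERT INTO characters VALUES (1, 'alice'), (2, 'bob');
        INSERT INTO shortcuts VALUES (1, 'inventory,map');  -- bob has no row
    """)
    return db

def load_character(db, character_id):
    """Load a character, substituting the default where the join finds no row."""
    row = db.execute("""
        SELECT c.username, COALESCE(s.items, ?)
        FROM characters AS c
        LEFT JOIN shortcuts AS s ON s.character_id = c.id
        WHERE c.id = ?
    """, (DEFAULT_SHORTCUTS, character_id)).fetchone()
    if row is None:
        return None  # no such character
    return {"username": row[0], "shortcuts": row[1].split(",")}
```

With this shape, changing DEFAULT_SHORTCUTS immediately affects every character without a shortcuts row, which is the trade-off discussed above.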