How to handle optional fields in MySQL? - mysql

I have a MySQL table that records classified listings. We don't force users to join to post a listing, and therefore the listing will not always have a user_id associated with it.
I therefore need a method of recording the poster's email if they are not signed in.
Is it bad practice to create a column email that will sometimes be blank and sometimes be filled?
Or is there a better way to go about this that I don't realize?

Is it bad practice to create a column email that will sometimes be blank and sometimes be filled?
It is not bad practice, no: just use a NULLable column; that's why they exist ;-)
See 12.1.17. CREATE TABLE Syntax: in the column_definition part of the CREATE TABLE query, you can specify NULL or NOT NULL.
BTW: using NULL, which literally means "no value", is better than using some kind of "impossible value" like an empty string: NULL makes your intent obvious, while an empty string could just as well be the result of a bug in your code.
And I don't really see another "logical" way, actually...
Note, though, that you'll have to handle a NULL value for the email, in your application's code, of course ;-)
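As a rough illustration of handling the NULL in application code, here is a minimal sketch. It assumes Python with the standard sqlite3 module standing in for MySQL, and the `listing` table, its columns and the `contact_label` helper are hypothetical names, not taken from the question.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE listing (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER,        -- nullable: set for signed-in posters
        email   TEXT,           -- nullable: set for anonymous posters
        title   TEXT NOT NULL
    )
    """
)
conn.execute(
    "INSERT INTO listing (user_id, email, title) VALUES (?, ?, ?)",
    (7, None, "Bike"),
)
conn.execute(
    "INSERT INTO listing (user_id, email, title) VALUES (?, ?, ?)",
    (None, "anon@example.com", "Sofa"),
)


def contact_label(row):
    """Turn a (user_id, email) pair into display text; SQL NULL arrives as None."""
    user_id, email = row
    if email is not None:
        return email
    if user_id is not None:
        return f"user #{user_id}"
    return "unknown poster"


# NULL has to be tested with IS NULL / IS NOT NULL; "= NULL" never matches.
anonymous = conn.execute(
    "SELECT title FROM listing WHERE user_id IS NULL"
).fetchall()
equals_null = conn.execute(
    "SELECT COUNT(*) FROM listing WHERE user_id = NULL"
).fetchone()[0]
labels = [
    contact_label(r)
    for r in conn.execute("SELECT user_id, email FROM listing ORDER BY id")
]
```

The point of the sketch is only that the driver hands NULL back as `None`, so the application has one explicit branch for "no email" instead of guessing what an empty string means.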

This is exactly what NULL is for. But you already knew that, because your user_id column will also sometimes be NULL, right?

I think the approach you have laid out is perfectly acceptable. As longneck points out, that's what NULL is for in SQL databases.
However, if you're truly concerned about it, you could save space (possibly a significant amount, depending on the column type and number of rows) if you use the user_id column for the userid and the email address, and then have another boolean column, say is_email to distinguish which type of value is stored in the user_id column. This may simplify things for you because it is likely that your application does not care, in many places, whether the data is actually a user_id or an email address.

I have a MySQL table that records classified listings. We don't force users to join to post a listing, and therefore the listing will not always have a user_id associated with it.
I therefore need a method of recording the poster's email if they are not signed in.
What is the business key of your user entity? Or, more directly: what is your user entity? Is every distinct email address a key for a User entity with some users having registered and their email set in some profile, and others not registered and giving an email address every time they post? Or do you have two distinct entities, RegisteredUser and UnknownPosterWithEmailAddress, with their attributes stored in separate places?
In the latter case, you would use a NULLable user_id and a NULLable email field, like you suggested, but then queries like "for a given post, find the email address the reply should be sent to" are going to be awkward. For example, a list of all posts with their respective reply addresses will look like this:
select post.id,
       case when post.user_id is not null then user.email
            else post.email end as email
from post
left join user on user.id = post.user_id;
This can get real messy after a while.
I'd rather use the former approach: each row in User is a distinct poster, with a non-NULLable unique email address, and a surrogate key used as the foreign key in posts:
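create table user(
  id integer primary key,
  -- varchar rather than text: MySQL cannot put a UNIQUE index on a TEXT column without a prefix length
  email varchar(255) not null unique,
  is_registered boolean default false
);
create table post(
  id integer primary key,
  user_id integer not null,
  content text,
  -- explicit FOREIGN KEY clause: older MySQL versions parse but ignore inline REFERENCES
  foreign key (user_id) references user(id)
);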
If a non-registered user enters an email address, you look it up in the user table, and retrieve the user.id, adding a new entry in user if necessary. As a result, you can answer questions like: for a given email address, how many posts has this user made in the past week? via the foreign key field, without having to compare strings in some NULLable attribute field.
When a user chooses to register, you can add the registration data either in user itself or in some separate table (again with user.id as a foreign key, some might argue that a boolean field is_registered is actually redundant then). Added benefits:
If he has posted before under the same email address, now all of his old posts become associated with his new registered identity automatically.
If the user changes his email address in his profile, all replies to older posts of his "see" the new updated email address.
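The "look it up, add a new entry if necessary" step described above could be sketched as follows. This is only an illustration under assumptions: Python's sqlite3 module stands in for MySQL, and the helper names `get_or_create_user`, `add_post` and `count_posts` are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE user (
        id            INTEGER PRIMARY KEY,
        email         TEXT NOT NULL UNIQUE,
        is_registered BOOLEAN DEFAULT 0
    );
    CREATE TABLE post (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES user(id),
        content TEXT
    );
    """
)


def get_or_create_user(conn, email):
    """Return the id of the user row for this email, inserting one if needed."""
    row = conn.execute("SELECT id FROM user WHERE email = ?", (email,)).fetchone()
    if row is not None:
        return row[0]
    cur = conn.execute("INSERT INTO user (email) VALUES (?)", (email,))
    return cur.lastrowid


def add_post(conn, email, content):
    """Store a post under whichever user row owns the given email address."""
    user_id = get_or_create_user(conn, email)
    conn.execute(
        "INSERT INTO post (user_id, content) VALUES (?, ?)", (user_id, content)
    )
    return user_id


def count_posts(conn, email):
    """Count posts for an email address via the foreign key."""
    return conn.execute(
        "SELECT COUNT(*) FROM post JOIN user ON user.id = post.user_id "
        "WHERE user.email = ?",
        (email,),
    ).fetchone()[0]
```

With concurrent writers, two requests could both miss the SELECT and race to insert; the UNIQUE constraint on email would reject the second insert, so a real implementation would probably catch that error and retry the lookup.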

Related

Best practice store email unique MySQL and Validate JS

I have a table where email is a unique key. When a user is deleted in the management form, I run an UPDATE that sets a column called "deleted" to 1, so that I keep the data and the history of that user. But if I then have to add a new user with the same email, MySQL rejects it.
So my question is: what is the best practice?
Delete the row when the user is deleted, losing the history of that user
Remove the unique key on the email column and rely on validation in JS alone to prevent duplicate emails
Something else...
Thanks for your time
You can restrict emails to being unique only if not deleted with a virtual column:
create table user (
email varchar(320),
deleted tinyint,
undeleted_email varchar(320) as (if(deleted,null,email)) unique
);
You could reverse the logic and, instead of storing a nullable deletion mark, store an active mark. Use a check constraint (enforced from MySQL 8.0.16 onward) or a trigger (in earlier versions, which parse but do not enforce check constraints) to restrict the possible values of the active mark to one (non-NULL) value (and NULL). Then put the unique constraint on the e-mail address and the active mark combined. That way you can have multiple rows with the same e-mail address and a NULL active mark, but at most one with that address and a non-NULL active mark.
CREATE TABLE user
(id integer,
email varchar(256),
active tinyint,
PRIMARY KEY (id),
CHECK (active = 1),
UNIQUE (email,
active));
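To see how the active-mark table behaves, here is a small sketch. It assumes Python's sqlite3 module as a stand-in for MySQL (SQLite also lets a CHECK pass when it evaluates to NULL and treats NULLs in a UNIQUE constraint as distinct); the `try_insert` helper is a hypothetical name.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE user (
        id     INTEGER PRIMARY KEY,
        email  VARCHAR(256),
        active TINYINT,          -- 1 = active, NULL = deleted
        CHECK (active = 1),      -- a NULL result does not violate the check
        UNIQUE (email, active)
    )
    """
)


def try_insert(conn, email, active):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO user (email, active) VALUES (?, ?)", (email, active)
        )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert(conn, "a@example.com", 1),     # first active account
    try_insert(conn, "a@example.com", 1),     # second active account, same email
    try_insert(conn, "a@example.com", None),  # deleted account
    try_insert(conn, "a@example.com", None),  # another deleted account
    try_insert(conn, "a@example.com", 0),     # 0 is not an allowed mark
]
```

Deleting a user then means setting active to NULL rather than to 0, which is what frees the address for a new active row.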
As a non-technical side note, check whether you're legally allowed to keep the history of a user when they delete their account under the data protection laws that apply to you. They may have the right to expect that everything is deleted when they delete their account.
You could also consider not creating a new account when a user comes back, but instead offering to reactivate their old account.

Covering index for a column on the right side of equals

Please give me advice on what covering index should be created for the following query:
SELECT id
FROM user
WHERE email_address = '...' AND
hashed_password = SHA2(CONCAT(salt, '...'), 256)
Because salt is also a column in the table, I'm unsure which of these is correct:
INDEX (email_address, hashed_password, id)
INDEX (email_address, hashed_password, salt, id)
Then again, since email address returns one row, putting hashed_password in the index seems redundant.
You are correct, it's a bit redundant.
As long as one of the indexes begins with email that is fine. You'll probably want a unique key on email alone to prevent duplicate email addresses being entered.
An InnoDB secondary index (and probably those of other engines) implicitly ends with the primary key (assumed here to be id), so there's never a need to add it explicitly.
While adding hashed_password, salt to the index will improve performance allowing the entire query to be served by the index, it does increase the index size slightly and I'm not sure you'll gain much from it, particularly as this just looks like a login.
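One way to picture "a unique key on email alone is enough" is to let the lookup use only the email and do the hash comparison afterwards. The sketch below moves that comparison into application code, which is a swap from the single SQL query in the question, not a description of it. It assumes Python's sqlite3 and hashlib modules as stand-ins; `hash_password`, `add_user` and `login` are hypothetical helpers.

```python
import hashlib
import hmac
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE user (
        id              INTEGER PRIMARY KEY,
        email_address   TEXT NOT NULL UNIQUE,  -- the only index the lookup needs
        salt            TEXT NOT NULL,
        hashed_password TEXT NOT NULL
    )
    """
)


def hash_password(salt, password):
    """Hex SHA-256 of salt + password, mirroring SHA2(CONCAT(salt, ...), 256)."""
    return hashlib.sha256((salt + password).encode("utf-8")).hexdigest()


def add_user(conn, email, salt, password):
    """Insert a user row with a salted hash of the password."""
    conn.execute(
        "INSERT INTO user (email_address, salt, hashed_password) VALUES (?, ?, ?)",
        (email, salt, hash_password(salt, password)),
    )


def login(conn, email, password):
    """Return the user id on a correct email/password pair, otherwise None."""
    row = conn.execute(
        "SELECT id, salt, hashed_password FROM user WHERE email_address = ?",
        (email,),
    ).fetchone()
    if row is None:
        return None
    user_id, salt, stored = row
    if hmac.compare_digest(stored, hash_password(salt, password)):
        return user_id
    return None
```

Because the email lookup returns at most one row, the engine reads that single row either way; adding hashed_password and salt to the index would only save that one row read.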
As a general rule for index selection, you should look at all the queries that you will be running against that table and select an index using the smallest number of columns common to all of those queries.
A common choice might be the primary key (PK). In the case of a table of users, the PK is probably a good choice.
For example, if there is another use of the "user" table that, say, just returns the user's details (e.g. department, phone number, full name etc) that is used by a system administrator, would the system administrator have that user's password?
Answer, probably not. So I would be inclined in this case to just use the email_address - assuming that this is equivalent to the user ID.
Having said that, I note that you are selecting "id" in your query. What is the id column? Is that the real user ID, i.e. is that the value they would normally enter when logging in or changing their password? If so, then "id" is probably a better candidate for the index.
FWIW, as written, if the user gets the password (or the email_addr) wrong, then the query will return no records. If they get it right, assuming email_addr is unique in the table, then it will return one record. Is that what you were expecting?

A User with many Addresses, but User has a primary billing, shipping, and profile Address

I'm building a Rails app where a User can have many Addresses. A User also needs a primary billing Address, primary shipping Address, and primary profile Address. These primary fields can point to the same Address. A User cannot have more than one of any primary Address.
I've created a join table called AddressInfo, and I'm bouncing between a few options:
Make 3 columns on the User model corresponding to each of the primary Address ids (this would remove the need for the join model I think).
Add a primary boolean column to AddressInfo, and make sure only one is true when scoped by user_id, address_id and purpose (purpose being billing, shipping or profile).
Add a primary date time column to AddressInfo, and use the most recently updated as the primary address (scoped like option 2).
Maybe these options aren't the best, but it's what I've come up with so far.
Any help on how to resolve this issue would be appreciated.
UPDATE:
To be clear, once an Address is created it should always belong to that User and be undeletable. Ex. a User changes their primary billing address to a new Address: they should still be able to retrieve that old Address (maybe even make it a primary address again). If I go with option 1 and remove the join table, that means I'll need a user_id on Address.
Go with option 1, 3 columns. This will cause fewer headaches (as a programmer), will run faster, and is more flexible for doing things like combining similar addresses into one. Maybe you have 2 people with the same address; they could share the same record (not recommended, though).
If you're using rails I would look to a rails solution to 'stay on the rails'.
I would consider STI (Single Table Inheritance)
More info at Rails: Need help defining association for address table
or a Polymorphic relationship - http://guides.rubyonrails.org/association_basics.html#polymorphic-associations
As you want 1 user to have multiple addresses (rather than there being multiple user types), STI may be best for you.
Note: they can also be combined, e.g. http://www.archonsystems.com/devblog/2011/12/20/rails-single-table-inheritance-with-polymorphic-association/
There's a great example of STI addressing this issue at: http://blog.arkency.com/2013/07/sti/
Another possibility is to create another model to hold user.id, addressinfo.id, and a primary_for_address_type (containing "Shipping", "Profile", "Billing").
Constrain this to be unique on user and primary_for_address_type, and you can tag addresses in AddressInfo as being the primary for particular address types in a completely extensible way that still guarantees uniqueness of the primary address.
You might join directly from this model to address, but there's scope for getting it out of sync with your addressinfo model.
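The uniqueness rule on user and primary_for_address_type can be sketched at the table level. This is an illustration under assumptions, written in Python with sqlite3 rather than as a Rails model: the `primary_address` table name and the `set_primary` and `primaries` helpers are hypothetical, and `INSERT OR REPLACE` is SQLite syntax.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the app's database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE primary_address (
        user_id                  INTEGER NOT NULL,
        address_info_id          INTEGER NOT NULL,
        primary_for_address_type TEXT NOT NULL,  -- "Shipping", "Profile", "Billing"
        UNIQUE (user_id, primary_for_address_type)
    )
    """
)


def set_primary(conn, user_id, address_info_id, address_type):
    """Make the given address the user's primary one for this address type."""
    # INSERT OR REPLACE first removes any row that conflicts on the UNIQUE
    # constraint, so at most one primary exists per (user, type).
    conn.execute(
        "INSERT OR REPLACE INTO primary_address "
        "(user_id, address_info_id, primary_for_address_type) VALUES (?, ?, ?)",
        (user_id, address_info_id, address_type),
    )


def primaries(conn, user_id):
    """Return {address_type: address_info_id} for one user."""
    rows = conn.execute(
        "SELECT primary_for_address_type, address_info_id "
        "FROM primary_address WHERE user_id = ?",
        (user_id,),
    ).fetchall()
    return dict(rows)
```

The same address id can appear under several types, which matches the requirement that the primary fields may point to the same Address, and the old address row itself is never touched when a primary changes.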

Create new user with email as a soft deleted user

I am using ActsAsParanoid for soft-deleting users. After soft-deleting a user, my client wants to create a new user with the same email. This raises a unique-field error because the email column is unique. So my question is: can we enforce uniqueness on the email column only when the deleted_at column is NULL?
Please reply if you don't understand my question.
I suppose you could change the uniqueness constraint of your users table to be:
UNIQUE (email, deletion_date)
This would effectively:
For standard (non-deleted) users, guarantee they have unique email addresses, provided their deletion dates all hold the same non-NULL placeholder value. MySQL allows any number of NULLs in a UNIQUE index, so if active users have a NULL deletion date, the constraint will not catch duplicates among them.
For deleted users, not make any guarantee about email addresses, since they all have unique deletion dates.
For new users, allow them to use an email address that a deleted user has, since the new user's deletion date differs from the deleted user's.
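How a composite UNIQUE (email, deletion_date) treats NULL is easy to check. The sketch below assumes Python's sqlite3 module as a stand-in; SQLite, like MySQL, treats NULLs in a unique index as distinct from each other. The `accepted_duplicates` helper and the far-future placeholder date are hypothetical choices for illustration.

```python
import sqlite3


def accepted_duplicates(active_marker):
    """Insert the same email twice with the given deletion_date value for
    active users and return how many of the two inserts were accepted."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE users (
            email         TEXT NOT NULL,
            deletion_date TEXT,
            UNIQUE (email, deletion_date)
        )
        """
    )
    inserted = 0
    for _ in range(2):
        try:
            conn.execute(
                "INSERT INTO users (email, deletion_date) VALUES (?, ?)",
                ("a@example.com", active_marker),
            )
            inserted += 1
        except sqlite3.IntegrityError:
            pass  # the UNIQUE constraint rejected the duplicate
    conn.close()
    return inserted


# Active users marked with NULL: both rows get in, so duplicates slip through.
with_null = accepted_duplicates(None)
# Active users marked with a fixed placeholder date: the second row is rejected.
with_placeholder = accepted_duplicates("9999-12-31 00:00:00")
```

So the composite key only guards active users if they share one non-NULL value; with a NULL deletion date the constraint is silent.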
Alternatively, just change the old email to something like
Me@yourmail.com_deleted
That way, if you need to view the old email, it's everything before "_deleted".
In other words, let the new user create a new account.
You would probably have a mutator in the background append "_deleted" to the old account's email.
"_deleted" is just an example suffix.

about database design

I need some idea about my database design. I have about 5 fields for basic information of user, such as name, email, gender etc.
Then I want to have about 5 fields for optional information such as messenger id's.
And 1 optional text field for info about user.
Should I create only one table with all the fields together, or should I create a separate table for the 5 optional fields in order to avoid redundancy etc.?
Thanks.
I'll stick with only one table.
Adding another table would only make things more complicated, and you would gain very little disk space.
And I really don't see how this can be redundant in any way ;)
I think that you should definitely stick with one table. Since all the information is relevant to a user and does not reflect any other logical model (like an article, blog post or such), you can safely keep everything in one place, even if some fields are optional.
I would create only one table for the additional fields. Not one with 5 fields, though, but one with a foreign key relation to the base table and key/value pair info. Something like:
create table users (
  user_id integer primary key,  -- must be a key so it can be referenced below
  name varchar(200)
  -- the rest of the fields
);
create table users_additional_info (
  user_id integer not null,
  ai_type varchar(10) not null, -- type of additional info: messenger, extra email
  ai_value varchar(200) not null,
  foreign key (user_id) references users(user_id)
);
Eventually you might want an additional_info table to hold possible valid values for extra info: messenger, extra email, whatever. But that is up to you. I wouldn't bother.
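Reading a user back together with whatever optional rows exist could look like the sketch below. It assumes Python's sqlite3 module as a stand-in for the real database, and the `add_info` and `profile` helpers are hypothetical names.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY,
        name    VARCHAR(200)
    );
    CREATE TABLE users_additional_info (
        user_id  INTEGER NOT NULL REFERENCES users(user_id),
        ai_type  VARCHAR(10) NOT NULL,
        ai_value VARCHAR(200) NOT NULL
    );
    """
)


def add_info(conn, user_id, ai_type, ai_value):
    """Attach one optional key/value pair to a user."""
    conn.execute(
        "INSERT INTO users_additional_info (user_id, ai_type, ai_value) "
        "VALUES (?, ?, ?)",
        (user_id, ai_type, ai_value),
    )


def profile(conn, user_id):
    """Return the user's name plus all optional pairs, sorted by type."""
    name = conn.execute(
        "SELECT name FROM users WHERE user_id = ?", (user_id,)
    ).fetchone()[0]
    extras = conn.execute(
        "SELECT ai_type, ai_value FROM users_additional_info "
        "WHERE user_id = ? ORDER BY ai_type, ai_value",
        (user_id,),
    ).fetchall()
    return {"name": name, "extras": extras}
```

A user with no optional data simply has no rows in users_additional_info, which is the trade-off against the single-table design: no empty columns, but an extra query or join for every profile read.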
It depends on how many people will be having all of that optional information and whether you plan on adding more fields. If you think you're going to add more fields in the future, it might be useful to move that information to a meta table using the EAV pattern : http://en.wikipedia.org/wiki/Entity-attribute-value_model
So, if you're unsure, your table would be like
User : id, name, email, gender, field1, field2
User_Meta : id, user_id, attribute, value
Using the user_id field in your meta table, you can link it to your user table and add as many sparsely used optional fields as you want.
Note: this pays off ONLY if you have many sparsely populated optional fields. Otherwise keep everything in one table.
I would suggest using a single table for this. Databases are very good at optimizing away space for empty columns.
Splitting this table out into two or more tables is an example of vertical partitioning, and here it is likely to be premature optimization. However, the technique can be useful when you have columns that you only need to query some of the time, e.g. large binary blobs.