Necessity of auto-incrementing ID - mysql

Every implementation of a credentials table I've seen has an auto-incrementing id to track users.
However, if I verify that an email address is unique before inserting it into a MySQL table, then I can guarantee the uniqueness of each row by email address. Furthermore, I can access the table as needed through the email address.
Does anyone see a problem with this?
I'm trying to understand why others don't follow this approach?

Email addresses are much larger than 4 bytes and, perhaps even worse for the storage engine, they are variable length.
Also one person might want two accounts, or might have several email addresses over time.
Then there are the problems associated with case folding (is Bob@Example.com the same user as bob@example.com?).

When other tables have data that relates to users, what do you use as a foreign key? Their email address? What if they want to change their email address? What would have been a single one-row update now becomes a giant mess.
A generated key allows you to decouple data that can change from the relationships between records and tables.
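A minimal sketch of that decoupling in MySQL, assuming hypothetical table and column names (users, orders, email) that are not from the question. The email stays unique, but other tables reference the generated id, so a change of address touches a single row.

```sql
-- Hypothetical schema for illustration only.
CREATE TABLE users (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT,  -- 4-byte generated key
    email VARCHAR(255) NOT NULL,
    PRIMARY KEY (id),
    UNIQUE KEY uq_users_email (email)            -- uniqueness is still enforced
) ENGINE=InnoDB;

CREATE TABLE orders (
    id      INT UNSIGNED NOT NULL AUTO_INCREMENT,
    user_id INT UNSIGNED NOT NULL,               -- relationship uses the id, not the email
    PRIMARY KEY (id),
    FOREIGN KEY (user_id) REFERENCES users (id)
) ENGINE=InnoDB;

-- Changing an email address is a one-row update; no child rows are touched.
UPDATE users SET email = 'new@example.com' WHERE id = 42;
```

Whether the unique index compares addresses case-insensitively depends on the column's collation, which is one place the case-folding concern shows up.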

Related

what is the best practice - a new column or a new table?

I have a users table, that contains many attributes like email, username, password, phone, etc.
I would like to save a new type of data (integer), let's call it "superpower", but only very few users will have it. The users table contains 10K+ records, while fewer than 10 users will have a superpower (for all others it will be null).
So my question is which of the following options is more correct and better in terms of performance:
1. add another column in the users table called "superpower", which will be null for almost all users
2. have a new table called users_superpower, which will contain at most 10 records and will map users to superpowers.
Some things I have thought about:
a. the first option seems wasteful of space, but it is really just an integer...
b. the second option will require a left join every time I query the users...
c. will the answer change if the "superpower" data was 5 columns, for example?
Note: I'm using Hibernate and MySQL, if it changes the answer.
This might be a matter of opinion. My viewpoint on this follows:
If superpower is an attribute of users and you are not in the habit of adding attributes, then you should add it as a column. 10,000*4 additional bytes is not very much overhead.
If superpower is just one attribute and you might add others, then I would suggest using a JSON column or an EAV (entity-attribute-value) table to store the value.
If superpower is really a new type of user with other attributes and dates and so on, then create another table. In this table, the primary key can be the user_id, making the joins between the tables even more efficient.
I would go with just adding a new boolean field in your user entity which keeps track of whether or not that user has superpowers.
Appreciate that adding a new table and linking it requires the creation of a foreign key in your current users table, and this key will be another column taking up space. So it doesn't really get around the storage cost. If you just want a really small column to store whether a user has superpowers, you can use a boolean variable, which would map to a MySQL BIT(1) column. Because this is a fixed-width column, NULL values would still take up a single bit of space, but this is most likely not a big storage concern compared to the rest of your table.
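For comparison, the two options from the question can be sketched as follows. This assumes the users table has an INT UNSIGNED primary key named id; the index and column definitions are illustrative, not taken from the question.

```sql
-- Option 1: a nullable column on the existing table.
ALTER TABLE users ADD COLUMN superpower INT NULL;

-- Option 2: a side table whose primary key is the user id,
-- so each user can have at most one row here.
CREATE TABLE users_superpower (
    user_id    INT UNSIGNED NOT NULL,
    superpower INT NOT NULL,
    PRIMARY KEY (user_id),
    FOREIGN KEY (user_id) REFERENCES users (id)
) ENGINE=InnoDB;

-- With option 2, reading the value alongside the user needs a LEFT JOIN;
-- users without a superpower come back with NULL.
SELECT u.id, s.superpower
FROM users u
LEFT JOIN users_superpower s ON s.user_id = u.id;
```

With option 2 the join is on the primary key of both tables, which is the efficient case mentioned in the first answer.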

Database design for addresses

I'm designing a database that holds information on suppliers, clients, users, client sites etc that all have address data. I have elected to use three standard address lines, town/city, county and postcode fields.
My question is, would it be better to have these fields in all the tables that require them, or to have an address table and just link the address id to the relevant table?
Many Thanks
Gavin.
If it's possible for multiple records to have the same address, I'd put the addresses in their own table. This helps prevent insertion/update anomalies, among other things. If every address is unique, it might not be that important.
In general, a rule of thumb is "never repeat data". So, if multiple rows have the same values, there's a chance those values can be moved into their own table.
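A rough sketch of the separate-table layout, using the fields listed in the question. MySQL syntax is assumed because the question does not name a database, and the column names and lengths are guesses.

```sql
-- One row per distinct address.
CREATE TABLE addresses (
    id        INT UNSIGNED NOT NULL AUTO_INCREMENT,
    line1     VARCHAR(100) NOT NULL,
    line2     VARCHAR(100) NULL,
    line3     VARCHAR(100) NULL,
    town_city VARCHAR(100) NOT NULL,
    county    VARCHAR(100) NULL,
    postcode  VARCHAR(10)  NOT NULL,
    PRIMARY KEY (id)
) ENGINE=InnoDB;

-- Suppliers, clients, client sites and so on each carry only the address id.
CREATE TABLE suppliers (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
    name       VARCHAR(100) NOT NULL,
    address_id INT UNSIGNED NOT NULL,
    PRIMARY KEY (id),
    FOREIGN KEY (address_id) REFERENCES addresses (id)
) ENGINE=InnoDB;
```

If a supplier and a client share premises, both rows point at the same address_id, and correcting the postcode is a single update.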

Many-to-One and One-to-One Relationships on Same Two Tables?

I'm designing a database where two tables have a many-to-one relationship, but I also need a one-to-one relationship between them, and I would like some advice on whether there is a better way to do it than what I've got right now.
My tables are accounts and users. An account can have multiple users, but each account can only and must have one owner. A user can be related to only one account.
I have an account field in the users table, which stores the ID of the account the user is related to. In the accounts table, I have an owner field, which stores the ID of the user who owns the account (i.e. the head admin).
I'm using InnoDB so I can make use of foreign keys. The problem is that I can't create an account or a user without the other being created first (due to the foreign key constraints), so I made owner nullable. Now I can create an account with a null owner, then create the user, and finally set the owner on the account to the user.
Is this acceptable, and is there a better way?
Here are some possible other ways I've come up with, and my thoughts on each:
Have a boolean owner field in the users table. Since every account can only have one owner, this way seems less than ideal because I'd have to ensure only one user per account has the attribute set to true.
Have a third table called owners. This seems like more overhead and more work for no good reason since it's effectively the same as having an owner field in the users table.
How I have it now makes the most sense to me, but it's a little awkward having to set a null owner until I create the user, and then coming back to set it after the fact.
I'd appreciate any input you can give me. Thanks!
This question is similar, but there's no mention of foreign keys: Designing Tables: One to many and one to one at same time?
In general it is a bad idea if your schema cannot be sorted topologically, i.e. if you cannot establish an ordering where a table only refers to tables preceding it in the ordering. This sort of "layered" dependency is also a very nice property to have, for example, for software modules (you have a problem if two modules depend on each other).
In your case you have user that refers to account and account that refers to user, so clearly there's no way to find a topological ordering.
One standard solution in this case is to introduce a separate table e.g. "role" where you have three columns: user, account and role. The column role can be either "owner" or "guest".
The facts that (given the current requirements) an account must have one and only one owner, and that a user must belong to one and only one account, are IMO not rules that really belong to the domain of "users" and "accounts".
You can implement those rules easily, but structuring your data so that you have no other possibility is IMO a mistake. You should aim to model the domain, not the specific rules... because people will change their mind about what those rules are.
Can you conceive of a user with two accounts? Can you conceive of an account with multiple owners/admins? I can, and this means that most probably this will be requested quite soon. Structuring the data so that you cannot represent this is asking for trouble.
Also when you have cyclical dependencies in the model your queries will be harder to write.
A very common case is, for example, trying to represent a hierarchical parts-list database using just one table with a "parent" field that points to the table itself. It is much better to have two tables instead, part and component, where component has two references to part and a quantity.
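As a sketch of the role table suggested in this answer, assuming MySQL/InnoDB and illustrative names (account_roles and the column names are not from the question): neither users nor accounts refers to the other, so the tables can be created and populated in a fixed order.

```sql
CREATE TABLE accounts (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
    name VARCHAR(100) NOT NULL,
    PRIMARY KEY (id)
) ENGINE=InnoDB;

CREATE TABLE users (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT,
    email VARCHAR(255) NOT NULL,
    PRIMARY KEY (id)
) ENGINE=InnoDB;

-- The only table with references; it points "backwards" at both others.
CREATE TABLE account_roles (
    user_id    INT UNSIGNED NOT NULL,
    account_id INT UNSIGNED NOT NULL,
    role       ENUM('owner', 'guest') NOT NULL,
    PRIMARY KEY (user_id, account_id),
    FOREIGN KEY (user_id)    REFERENCES users (id),
    FOREIGN KEY (account_id) REFERENCES accounts (id)
) ENGINE=InnoDB;
```

Creating an account, then a user, then the owner row needs no nullable column and no follow-up update. The "one owner per account" rule is not enforced by this layout on its own.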
Your solution is fine.
If you're uncomfortable with the owner column being nullable, you could rely on some magic user record (perhaps with an id of zero) which would be the "system user". So newly created accounts would be owned by user-zero, until their ownership was suitably redefined. That seems smellier than allowing accounts to have a null owner, to me, anyway.
For the current requirement to have only one account per user
alter table UserAccount add constraint un_user_account unique(UserID);
and when the requirement changes to many-to-many, drop the constraint (MySQL drops a unique constraint as an index; older MySQL versions do not accept DROP CONSTRAINT)
alter table UserAccount drop index un_user_account;
For the one owner only, simply enforce that on the application level.

Alternative to using same foreign key in almost every table

I am working with a database where "almost" every table in the database has the same field and same value. For example, almost all tables have a field called GroupId and there is only one group id in the database now.
Benefits
All data is related to that field and can be identified by said field
When a new group is created data will be properly identified for the group
Disadvantages
All tables have this field
All stored procedures need to have this field as a parameter
All queries have to be filtered by this field
Is this a big deal? Is there an alternative to this approach?
Thanks
If you need to be able to identify data by more than one group in the future, having foreign keys is a good practice. However, that doesn't mean all tables need to have this field, only the ones directly related to the group. For instance, a lookup table with state values may not need it, but the customers table might. Adding it to all tables willy-nilly can lead to bad things when you try to delete a record and have to check 579 tables (only 25 of which are pertinent). All this depends greatly on what the meaning of the groups is. Most of our tables have a relationship to the client table, because they contain data related to specific clients and because we don't want various clients to have the ability to see data for other clients. Tables which do not contain that kind of data do not.
Yes, most queries may need the field and many stored procs will want to have it as an input variable, but if you truly need to filter on this information, then that is as it should be.
If however there is only one group and will never be more than one group, it is a waste of time, effort and space.

MySQL Constrain Database Entries in Rails

I am using MySQL 5 and Ruby on Rails 2.3.1. I have some validations that occasionally fail to prevent duplicate items being saved to my database. Is it possible at the database level to prevent a duplicate entry from being created based on certain parameters?
I am saving emails to a database, and don't want to save a duplicate subject line, body, and sender address. Does anyone know how to impose such a limit on a DB through a migration?
You have a number of options to ensure a unique value set is inserted into your table. Let's consider two: 1) push the responsibility to the database engine, or 2) make it your application's responsibility.
Pushing responsibility to the database engine means creating a UNIQUE index on your table. See the MySQL CREATE INDEX syntax. Note that this solution may result in an exception being thrown when a duplicate value is inserted. As you've identified what I infer to be three columns that determine uniqueness (subject line, body, and sender address), you'll create the index to include all three columns. It's been a while since I've worked with Rails, so you may want to check the record count inserted as well.
If you want to push this responsibility to your application software, you'll need to contend with potential data insertion conflicts. That is, assume you have two users creating an email simultaneously (just work with me here) with the same subject line, body, and sender address. Should your code simply query for any records containing that text (identical for both users in this example), both queries will return no records found and both will proceed merrily to insert their emails, which now violates your premise. So you can address this with perhaps a table lock, or some other syncing field in the database, to ensure duplicates don't appear. This latter approach could consist of another table with a single field indicating whether someone is inserting a record or not; once the insert completes, it updates that record to state it has completed, and then others can proceed.
While you can have a separate architectural discussion on the implications of each alternative, I'll leave that to a separate post. Hopefully this suffices to answer your question.
You should be able to add a unique index to any columns you want to be unique throughout the table.
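A sketch of what that could look like, under several assumptions that are not in the question: the table is called emails, sender and subject are VARCHAR columns, and body is TEXT. MySQL cannot index a full TEXT column without a prefix length, so this example adds a hypothetical body_sha1 fingerprint column and indexes that instead.

```sql
-- 1. Add a fingerprint column and backfill it for existing rows.
ALTER TABLE emails ADD COLUMN body_sha1 CHAR(40) NULL;
UPDATE emails SET body_sha1 = SHA1(body);

-- 2. Enforce uniqueness of the combination.
--    This step fails if duplicates already exist, so clean those up first.
CREATE UNIQUE INDEX uq_emails_sender_subject_body
    ON emails (sender, subject, body_sha1);

-- 3. New rows supply the fingerprint; inserting the same combination
--    a second time raises a duplicate-key error (MySQL error 1062).
INSERT INTO emails (sender, subject, body, body_sha1)
VALUES ('a@example.com', 'Hello', 'Same text', SHA1('Same text'));
```

In a Rails migration the same statements can be passed to execute, or the index can be created with add_index and :unique => true. Either way the application still has to handle the error raised on a duplicate insert.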