Reference tables instead of Enum fields for lookup efficiency - mysql

In my application I have a user which has a profile and an address. The relationships between those tables are:
users: id, type, registered, email...
profiles: user_id, address_id, first_name, last_name, gender, status, etc..
addresses: id, city, street, house_number, apartment
Those tables have some Enum fields on them, but I think that might not be efficient at all, since I'm going to do some intensive user lookups based on address and profile. So I thought maybe I should use reference tables instead? (I would also gain indexing on an integer, which is better.)
For example, In profiles I have a status enum field which gets the following values for now:
single
married
widowed
divorced
so I thought about maybe having a statuses table and a foreign key on profiles - status_id.
Another dilemma is whether I should have a reference table for gender as well. Currently I only accept male and female values in my enum field for gender, but maybe in the future we will want to add a transgender option or anything else. I will also do intensive user lookups based on gender, of course. Should I also extract it into a reference table?

Enums are internally stored as numbers. Data such as gender or status in the profiles table is not modified very often, so I personally would prefer enums. This would avoid the overhead of referencing another table.
However, it has its own disadvantages.
Please refer to http://chateau-logic.com/content/why-we-should-not-use-enums-databases for reasons not to use enums. If you are using multiple languages in your application then enums are a definite NO.
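For comparison, the reference-table alternative raised in the question could look roughly like the sketch below. It is only an illustration under stated assumptions: SQLite (through Python's standard sqlite3 module) stands in for MySQL so the example runs anywhere, the sample rows are made up, and the helper users_with_status is a hypothetical name, not something from the original post.

```python
import sqlite3

# Assumption: SQLite is used here as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled
conn.executescript("""
CREATE TABLE statuses (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE profiles (
    user_id    INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    status_id  INTEGER NOT NULL REFERENCES statuses(id)
);
-- Index the foreign key that the lookups filter on.
CREATE INDEX idx_profiles_status ON profiles(status_id);

INSERT INTO statuses (id, name) VALUES
    (1, 'single'), (2, 'married'), (3, 'widowed'), (4, 'divorced');
INSERT INTO profiles (user_id, first_name, status_id) VALUES
    (1, 'Ann', 2), (2, 'Bob', 1), (3, 'Cid', 2);
""")


def users_with_status(name):
    """Return the user ids whose profile has the given status name."""
    rows = conn.execute(
        "SELECT p.user_id FROM profiles p "
        "JOIN statuses s ON s.id = p.status_id "
        "WHERE s.name = ? ORDER BY p.user_id",
        (name,),
    )
    return [r[0] for r in rows]


# Adding a new status is a plain INSERT, not a schema change.
conn.execute("INSERT INTO statuses (id, name) VALUES (5, 'separated')")

print(users_with_status("married"))  # [1, 3]
```

The trade-off against an enum is the extra join when filtering by name; filtering directly on status_id avoids it.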

Related

Would a new table really be needed?

I'm making a SQL database for a small company. The other tables don't relate to the question, so I'll list only the two that do.
There is a table:
NextofKin:
fname
lname
street
no
houseno
city
AND
Patient:
ID[pk]
fname
lname
houseno
city
Would I need a separate table for street, house and city?
Also, any idea what I could use as a primary key for NextOfKin?
Your questions are starting to get into database normalization.
The rule of thumb is never to duplicate data between tables unless that data relates the tables, and the relating data should be indexed. Something like this comes to mind (there are different ways you might construct it, depending on business logic):
PersonalData: id, fname, lname, address1, address2, city, state, zip
Patient: id PK, personal_data_id FK, next_of_kin_id FK
Granted, most of the tables already exist, so this may be impossible. But to answer your question directly: since the database is not normalized already, there's no good place to put further address records (you don't want them under Patient, right?), and so you're stuck duplicating the data. Even so, there has to be some relationship between Patient and NextOfKin, so either Patient holds a reference to NextOfKin, or NextOfKin holds a reference to Patient. Either way, you might consider using a foreign key between them to enforce, and explicitly state, this relationship.
Yes, use a pk for next of kin.
Use a joining table between patient and next of kin. Multiple patients could list the same person as next of kin, and while your app may not today require someone to designate multiple people as next of kin, they may change their mind in the future and your application will support it.
Myself, I always use a separate address table. Since usually more than one person lives in a house, and a person can have more than one home, you would again use a joining table.
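The joining-table suggestion above might be sketched as follows. This is an assumption-laden illustration, not the poster's schema: SQLite via Python's sqlite3 stands in for the real database, the table name patient_next_of_kin, the surrogate id on next_of_kin, and the helper kin_for are all hypothetical, and the address columns are left out to keep it short.

```python
import sqlite3

# Assumption: SQLite is used here as a stand-in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE patients (
    id    INTEGER PRIMARY KEY,
    fname TEXT NOT NULL,
    lname TEXT NOT NULL
);
CREATE TABLE next_of_kin (
    id    INTEGER PRIMARY KEY,   -- surrogate primary key
    fname TEXT NOT NULL,
    lname TEXT NOT NULL
);
-- Joining table: many patients to many next of kin.
CREATE TABLE patient_next_of_kin (
    patient_id     INTEGER NOT NULL REFERENCES patients(id),
    next_of_kin_id INTEGER NOT NULL REFERENCES next_of_kin(id),
    PRIMARY KEY (patient_id, next_of_kin_id)
);

INSERT INTO patients (id, fname, lname) VALUES (1, 'Ann', 'Lee'), (2, 'Tom', 'Lee');
INSERT INTO next_of_kin (id, fname, lname) VALUES (10, 'Sue', 'Lee'), (11, 'Raj', 'Lee');
-- Sue is listed by both patients; Ann lists two people.
INSERT INTO patient_next_of_kin (patient_id, next_of_kin_id) VALUES
    (1, 10), (2, 10), (1, 11);
""")


def kin_for(patient_id):
    """Return the first names of everyone listed as next of kin for a patient."""
    rows = conn.execute(
        "SELECT k.fname FROM next_of_kin k "
        "JOIN patient_next_of_kin pk ON pk.next_of_kin_id = k.id "
        "WHERE pk.patient_id = ? ORDER BY k.id",
        (patient_id,),
    )
    return [r[0] for r in rows]


print(kin_for(1))  # ['Sue', 'Raj']
print(kin_for(2))  # ['Sue']
```

The same pattern would apply to a separate address table joined to people through its own joining table.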

Mysql database design for customer multiple addresses and default address

I am creating the database structure of an ecommerce with Mysql and INNODB engine.
Point 1: To create multiple addresses for the customers I have these tables
Am I doing it in the correct way? And how should I store the default address (in which table)?
Point 2: I have another table called "Suppliers". Should I just connect it to addresses with a "supplier_address" table, or is there a better way?
Point 3: What about the tables cities and countries? Should I add something, or is that OK? Maybe a field "district" in another table between the two?
In my view you're making this far too complex. There's no need to make your address schema so over-normalized. Most systems I've seen that handle multiple customer addresses have a customer table like yours, and then have an address table, as follows:
customer_id
address_ordinal (small number for each customer: 0,1,2,3 etc).
primary (boolean)
address_1
address_2
locality (city, village, etc)
province (state, etc)
postcode (zip, postcode etc)
country
customer_id is a foreign key to the customer table. The primary key is a composite of (customer_id, address_ordinal). The primary column is true if the address is the primary one.
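A rough sketch of that address table, under some assumptions: SQLite via Python's sqlite3 stands in for MySQL/InnoDB, the flag column is named is_primary here because PRIMARY is a reserved word in SQL, and the sample data and the primary_address helper are made up for illustration.

```python
import sqlite3

# Assumption: SQLite is used here as a stand-in for MySQL/InnoDB.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customers (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE addresses (
    customer_id     INTEGER NOT NULL REFERENCES customers(id),
    address_ordinal INTEGER NOT NULL,            -- 0, 1, 2, ... per customer
    is_primary      INTEGER NOT NULL DEFAULT 0,  -- boolean flag, hypothetical name
    address_1       TEXT NOT NULL,
    address_2       TEXT,
    locality        TEXT,
    province        TEXT,
    postcode        TEXT,
    country         TEXT,
    PRIMARY KEY (customer_id, address_ordinal)   -- composite key
);

INSERT INTO customers (id, name) VALUES (1, 'Acme');
INSERT INTO addresses (customer_id, address_ordinal, is_primary, address_1, locality, country)
VALUES (1, 0, 1, '1 Main St', 'Springfield', 'US'),
       (1, 1, 0, '2 Side St', 'Springfield', 'US');
""")


def primary_address(customer_id):
    """Return the first address line of the customer's primary address, or None."""
    row = conn.execute(
        "SELECT address_1 FROM addresses "
        "WHERE customer_id = ? AND is_primary = 1",
        (customer_id,),
    ).fetchone()
    return row[0] if row else None


print(primary_address(1))  # 1 Main St
```

Nothing in this sketch stops two rows for the same customer from both being flagged primary; that would need application logic or an extra constraint.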
Regarding your question about suppliers, you might want to create a common table called "contacts", and give both your customers and suppliers contact_ids.
If your system contains a reference table (perhaps something you purchase from a data supplier) containing (postcode, locality, province) rows, you can use that to help populate your address table. But you should avoid forcing your addresses to only contain hard-coded postcodes: those reference tables get out of date very fast.
I'll start out my answer with the ole cliche: "There's more than one way to skin a cat." That said, I have a few suggestions:
Point 1 - Assuming a customer can have multiple addresses (e.g. billing and mailing), then yes, you have the right idea with the separate mapping table. Adding a field to customer_addresses called default or preferred, or something like that, is not a bad idea, but another option is to add a field called address_type that references a separate table with two records, "Billing" and "Mailing", and/or whatever else you want. Then, in the application that uses the address data, choose the address type by context: on the billing info page, for example, the page's own code would run something like SELECT * FROM customer_addresses WHERE address_type = 2 /* Billing */.
Point 2 - Same as for customers.
Point 3 - Do you want to be able to display shortened country names? For example, abbreviate "United States" to "US," "Canada" to "CAN," or "United Kingdom" to "UK?" I'd consider adding a field for abbreviated country names for that purpose.

What is the best or most accepted way to store a custom value for a field that is generally relational to a table of possible values?

In my application each of my users is required to select a suburb to which to associate their profile. The users table has a field suburb_id and a table called suburbs has both an id and name field.
Our suburbs table contains most of the suburbs that we will need, however occasionally users will need to enter suburbs that we don't have in our table, or have popped up since we populated our table.
What is the best way, in terms of database design, to solve this problem?
I had considered changing the field suburb_id to just suburb and then testing in the application whether it was an integer or a string - if it was an integer the application would assume it is related to an item in the suburbs table, if it was a string it would assume otherwise. However, if a user was to simply enter an integer in the suburb field then the application would obviously mistake it and try to match it up with a value in the table.
Is that an acceptable way to deal with the problem? It seems gimmicky to me, and I am sure there must be a better solution.
EDIT: I would also like to avoid inserting data provided from users into the suburbs table (even if flagged) as I don't want to affect the quality of the suburbs data we have.
There might be several ways to handle that, but I think the cleanest way is to leave the suburbs and users tables as they are, and add the suburb to the suburbs table if it doesn't already exist. Maybe with a flag marking it as a user-generated entry for later cleanup.
"I had considered changing the field suburb_id to just suburb and then testing in the application whether it was an integer or a string - if it was an integer the application would assume it is related to an item in the suburbs table, if it was a string it would assume otherwise."
That can easily lead to performance issues.
There's no magic bullet for this kind of problem. If there's a foreign key reference, you only have a few choices.
Let the user insert rows into the suburbs table.
Don't let the user insert rows into the suburbs table.
Remove the foreign key reference.
Replace the suburbs table with supertype/subtype tables, where the supertype would contain all suburbs, and the subtype tables would distinguish user-submitted suburbs from validated suburbs.
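The last option in that list, supertype/subtype tables, could be sketched along these lines. Treat it as one possible reading rather than a definitive design: SQLite via Python's sqlite3 stands in for the real database, and the subtype table names (validated_suburbs, user_suburbs), the reviewed column, and both helper functions are hypothetical.

```python
import sqlite3

# Assumption: SQLite is used here as a stand-in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE suburbs (               -- supertype: every suburb, whatever its origin
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE validated_suburbs (     -- subtype: the curated data set
    suburb_id INTEGER PRIMARY KEY REFERENCES suburbs(id)
);
CREATE TABLE user_suburbs (          -- subtype: entries typed in by users
    suburb_id INTEGER PRIMARY KEY REFERENCES suburbs(id),
    reviewed  INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE users (
    id        INTEGER PRIMARY KEY,
    suburb_id INTEGER NOT NULL REFERENCES suburbs(id)  -- FK stays intact
);

INSERT INTO suburbs (id, name) VALUES (1, 'Newtown'), (2, 'Glebe'), (3, 'Nowhereville');
INSERT INTO validated_suburbs (suburb_id) VALUES (1), (2);
INSERT INTO user_suburbs (suburb_id) VALUES (3);
INSERT INTO users (id, suburb_id) VALUES (1, 1), (2, 3);
""")


def validated_suburb_names():
    """Names from the curated set only; user submissions never appear here."""
    rows = conn.execute(
        "SELECT s.name FROM suburbs s "
        "JOIN validated_suburbs v ON v.suburb_id = s.id "
        "ORDER BY s.name"
    )
    return [r[0] for r in rows]


def users_with_unvalidated_suburb():
    """Ids of users whose suburb came from user input."""
    rows = conn.execute(
        "SELECT u.id FROM users u "
        "JOIN user_suburbs us ON us.suburb_id = u.suburb_id "
        "ORDER BY u.id"
    )
    return [r[0] for r in rows]


print(validated_suburb_names())          # ['Glebe', 'Newtown']
print(users_with_unvalidated_suburb())   # [2]
```

This sketch does not prevent a suburb from appearing in both subtype tables; enforcing exclusivity would take an extra discriminator column or application checks.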

Enum datatype versus table of data in MySQL?

I have one MySQL table, users, with the following columns:
user_id (PK)
email
name
password
To manage a roles system, would there be a downside to either of the following options?
Option 1:
Create a second table called roles with three columns: role_id (Primary key), name, and description, then associate users.user_id with roles.role_id as foreign keys in a third table called users_roles?
Or...
Option 2:
Create a second table called roles with two columns: user_id (Foreign key from users.user_id) and role (ENUM)? The ENUM datatype column would allow for a short list of allowable roles to be inserted as values.
I've never used the ENUM datatype in MySQL before, so I'm just curious, as option 2 would mean one less table. I hope that makes sense, this is the first time I've attempted to describe MySQL tables in a forum.
In general, ENUM types are not meant to be used in these situations. This is especially the case if you intend to cater for the flexibility of adding or removing roles in the future. The only way to change the values of an ENUM is with an ALTER TABLE, while defining the roles in their own table will simply require a new row in the roles table.
In addition, using the roles table allows you to add additional columns to better define the role, like the description field you suggested in Option 1. This is not possible if you were to use an ENUM type as in Option 2.
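Option 1 from the question could be sketched as below, to show that adding a role is a row insert rather than an ALTER TABLE. Assumptions: SQLite via Python's sqlite3 stands in for MySQL, the password column is omitted for brevity, and the helpers add_role and roles_of plus all sample data are made up.

```python
import sqlite3

# Assumption: SQLite is used here as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    email   TEXT NOT NULL UNIQUE,
    name    TEXT NOT NULL
);
CREATE TABLE roles (
    role_id     INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE,
    description TEXT               -- extra detail an ENUM value cannot carry
);
CREATE TABLE users_roles (
    user_id INTEGER NOT NULL REFERENCES users(user_id),
    role_id INTEGER NOT NULL REFERENCES roles(role_id),
    PRIMARY KEY (user_id, role_id)
);

INSERT INTO users (user_id, email, name) VALUES (1, 'ann@example.com', 'Ann');
INSERT INTO roles (role_id, name, description) VALUES
    (1, 'admin', 'Full access'), (2, 'editor', 'Can edit content');
INSERT INTO users_roles (user_id, role_id) VALUES (1, 1);
""")


def add_role(name, description):
    """Define a new role with a plain INSERT and return its id."""
    cur = conn.execute(
        "INSERT INTO roles (name, description) VALUES (?, ?)",
        (name, description),
    )
    return cur.lastrowid


def roles_of(user_id):
    """Return the role names assigned to a user, sorted by name."""
    rows = conn.execute(
        "SELECT r.name FROM roles r "
        "JOIN users_roles ur ON ur.role_id = r.role_id "
        "WHERE ur.user_id = ? ORDER BY r.name",
        (user_id,),
    )
    return [r[0] for r in rows]


# A new role needs no schema change.
moderator_id = add_role("moderator", "Can moderate comments")
conn.execute("INSERT INTO users_roles (user_id, role_id) VALUES (1, ?)", (moderator_id,))

print(roles_of(1))  # ['admin', 'moderator']
```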
Personally I would not opt for an ENUM in these scenarios. Maybe I can see them being used for columns with an absolutely finite set of values, such as {Spades, Hearts, Diamonds, Clubs} to define the suit of a card, but not in cases such as the one in question, for the disadvantages mentioned earlier.
Using ENUM for the case you suggested only makes sense when you have a strictly defined ORM on the receiving end that, for instance, maps db rows into a list of flat objects automatically.
Example:
CREATE TABLE animal (Category ENUM('reptiles','mammals'), Name VARCHAR(50));
is automatically mapped to
object animal
animal->Category
animal->Name

How to handle customers with multiple addresses in CakePHP

I'm putting together a system to track customer orders. Each order will have three addresses; a Main contact address, a billing address and a shipping address. I do not want to have columns in my orders table for the three addresses, I'd like to reference them from a separate table and have some way to enumerate the entry so I can determine if the addressing is main, shipping or billing. Does it make sense to create a column in the address table for AddressType and enumerate that or create another table - AddressTypes - that defines the address enumeration and link to that table?
I have found other questions that touch on this topic and that is where I've taken my model from. The problem I'm having is fitting that into the CakePHP convention. I've been struggling to internalize the direction in which BelongsTo relationships are formed - the way the documentation states it feels backwards to me.
Any help would be appreciated,
Thanks!
You are spot on. You could do either, depending on how much you want to normalise your database. Personally for me, I'd go with an AddressType model, which gives you the flexibility to add and remove address types at will.
If you want it simpler, then I would just go with an ENUM() field in your Address model.
I prefer to think of hasOne, hasMany and hasAndBelongsToMany as the three real relation types.
The belongsTo relation is there simply to do the reverse of what the former two (hasOne/Many) do.
If you look at this diagram, you will notice the pairing of hasOne/belongsTo and hasMany/belongsTo.
Also, note that the model that "belongsTo" is the one storing the foreign key (e.g. address_type_id).
So in your case, since an AddressType hasMany Address (ie. you can have many Home addresses), then the Address belongsTo AddressType (ie. each address needs an address_type_id).
Table1: address_types (id, name, active)
Table2: customers (id, name, active, etc.)
Table3: addresses (id, address_type_id, customer_id, country, city, street, etc.)
This way customers can have as many addresses as you want. If you need to add a new address type, you do not need to alter the customers or addresses tables.
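The three tables listed above might look like this in practice. It is a minimal sketch under assumptions: SQLite via Python's sqlite3 stands in for the MySQL database behind CakePHP, the sample rows are invented, and the address_of helper is hypothetical (in CakePHP the lookup would go through the model associations instead).

```python
import sqlite3

# Assumption: SQLite is used here as a stand-in for the MySQL database.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE address_types (
    id     INTEGER PRIMARY KEY,
    name   TEXT NOT NULL UNIQUE,
    active INTEGER NOT NULL DEFAULT 1
);
CREATE TABLE customers (
    id     INTEGER PRIMARY KEY,
    name   TEXT NOT NULL,
    active INTEGER NOT NULL DEFAULT 1
);
CREATE TABLE addresses (
    id              INTEGER PRIMARY KEY,
    address_type_id INTEGER NOT NULL REFERENCES address_types(id),
    customer_id     INTEGER NOT NULL REFERENCES customers(id),
    country         TEXT,
    city            TEXT,
    street          TEXT
);

INSERT INTO address_types (id, name) VALUES (1, 'main'), (2, 'billing'), (3, 'shipping');
INSERT INTO customers (id, name) VALUES (1, 'Acme');
INSERT INTO addresses (id, address_type_id, customer_id, country, city, street) VALUES
    (1, 2, 1, 'US', 'Austin', '1 Bill St'),
    (2, 3, 1, 'US', 'Austin', '2 Ship St');
""")


def address_of(customer_id, type_name):
    """Return the street of the customer's address of the given type, or None."""
    row = conn.execute(
        "SELECT a.street FROM addresses a "
        "JOIN address_types t ON t.id = a.address_type_id "
        "WHERE a.customer_id = ? AND t.name = ? ORDER BY a.id",
        (customer_id, type_name),
    ).fetchone()
    return row[0] if row else None


# A new address type is one more row; no table definition changes.
conn.execute("INSERT INTO address_types (id, name) VALUES (4, 'warehouse')")
conn.execute(
    "INSERT INTO addresses (id, address_type_id, customer_id, country, city, street) "
    "VALUES (3, 4, 1, 'US', 'Austin', '3 Dock Rd')"
)

print(address_of(1, "billing"))    # 1 Bill St
print(address_of(1, "warehouse"))  # 3 Dock Rd
```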