I'm designing a database that holds information on suppliers, clients, users, client sites etc that all have address data. I have elected to use three standard address lines, town/city, county and postcode fields.
My question is, would it be better to have these fields in all the tables that require them, or to have an address table and just link the address id to the relevant table?
Many Thanks
Gavin.
If it's possible for multiple records to have the same address, I'd put the addresses in their own table. This helps prevent insertion/update anomalies, among other things. If every address is unique, it might not be that important.
In general, a rule of thumb is "never repeat data". So, if multiple rows have the same values, there's a chance those values can be moved into their own table.
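As a rough sketch of the separate address table, here is a version using Python's built-in sqlite3 module. The table and column names (address, supplier, client, address_id) and the sample rows are made up for illustration; only the three address lines, town/city, county and postcode come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE address (
    address_id INTEGER PRIMARY KEY,
    line1      TEXT NOT NULL,
    line2      TEXT,
    line3      TEXT,
    town_city  TEXT NOT NULL,
    county     TEXT,
    postcode   TEXT NOT NULL
);
-- Hypothetical tables: each one stores only the id of its address.
CREATE TABLE supplier (
    supplier_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    address_id  INTEGER NOT NULL REFERENCES address(address_id)
);
CREATE TABLE client (
    client_id  INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    address_id INTEGER NOT NULL REFERENCES address(address_id)
);
INSERT INTO address (address_id, line1, town_city, postcode)
    VALUES (1, '1 Example Street', 'Exampletown', 'EX1 1AA');
INSERT INTO supplier (name, address_id) VALUES ('Example Supplies', 1);
INSERT INTO client (name, address_id) VALUES ('Example Retail', 1);
""")

# One UPDATE corrects the postcode for every row that points at this address.
conn.execute("UPDATE address SET postcode = 'EX1 1AB' WHERE address_id = 1")

rows = conn.execute("""
    SELECT s.name, a.postcode FROM supplier s JOIN address a ON a.address_id = s.address_id
    UNION ALL
    SELECT c.name, a.postcode FROM client c JOIN address a ON a.address_id = c.address_id
""").fetchall()
```

If the same fields were repeated in supplier and client, that correction would need one UPDATE per table, and forgetting one of them is exactly the kind of update anomaly mentioned above.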
I have a database with contacts in it. There are two different types of contacts, Vendors and Clients.
The Vendor table has a vendor_contacts table attached via foreign key value to allow for a one to many relationship. The client has a similar table.
These contacts can have a one-to-many relationship with a phone numbers table. Should I have a separate phone numbers table for each of these, or one shared phone number table with two foreign keys, allowing one to be null?
OPTION 1
Here I would have to enforce that one of vendor_id or client_id was NULL and the other not NULL in the shared phone table.
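One way to enforce that rule inside the database is a CHECK constraint. The sketch below uses Python's sqlite3 module; the table layout and sample values are assumptions, since the original diagram is not shown, and not every database engine or version enforces CHECK constraints, so verify this against the one you use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE vendor_contacts (vendor_contact_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE client_contacts (client_contact_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE phone_numbers (
    phone_id  INTEGER PRIMARY KEY,
    number    TEXT NOT NULL,
    vendor_id INTEGER REFERENCES vendor_contacts(vendor_contact_id),
    client_id INTEGER REFERENCES client_contacts(client_contact_id),
    -- exactly one of the two foreign keys must be set
    CHECK ((vendor_id IS NULL AND client_id IS NOT NULL)
        OR (vendor_id IS NOT NULL AND client_id IS NULL))
);
INSERT INTO vendor_contacts VALUES (1, 'Example vendor contact');
INSERT INTO client_contacts VALUES (1, 'Example client contact');
""")

def rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

# A number owned by exactly one side is accepted.
vendor_only = rejected("INSERT INTO phone_numbers (number, vendor_id) VALUES ('555-0100', 1)")
# A number owned by both sides, or by neither, is refused.
both_set = rejected("INSERT INTO phone_numbers (number, vendor_id, client_id) VALUES ('555-0101', 1, 1)")
neither_set = rejected("INSERT INTO phone_numbers (number) VALUES ('555-0102')")
```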
OPTION 2
Here each table would have its own phone number table.
TBH I would merge the vendor and client tables and have a 'contact' table. This could have a contact type and would allow for newer contacts to be added.
Suppose you want to add something to your contacts, say an address: you may have to change each table in the same way. Then you want a birthday (OK, maybe not, but just as an example), and again that means changes to multiple tables. With a single table, you reduce the overhead of managing this.
This will also mean you have one contact phone number table!
"wasting space" is not really a meaningful concern in modern database systems - and "null" values are usually optimized by the storage engine to take no space anyway.
Instead, I think you need to look at likely query scenarios, at maintainability, and at intelligibility of your schema.
So, in general, a schema that repeats itself - many tables with similar columns - suggests poor maintainability, and often leads to complicated queries.
In your example, imagine a query to find out who called from a given number, and whom they might have been trying to reach.
In option 1, you query the phone number, and outer join it to the two contact tables - relatively easy. In option 2, you have a union of two similar queries (only the table names would change) - duplication and lots of chance for bugs.
Imagine you want to break the phone number into country, region and phone number - in option 2, you have to do this twice (and modify all the queries twice); in option 1, you have to do this only once.
In general terms, repetition is a sign of a bad software design; this also counts for database schemas.
That's also a reason (as #siggisv and #NigelRen suggested) to flatten the vendor_contact and client_contact tables into a single table with a "contact_type" column.
I would use two different tables, a vendor_contacts table and a client_contacts table.
If you only have one table, you always waste space, because every row will have a null column.
Option 2,
but change vendor_contact and client_contact to 'contact'
and add a 'type' column to 'contact' that identifies 'client' or 'vendor' if you need to separate the records.
I would do as others have suggested and merge vendor_contact and client_contact into one contact table.
But on top of that, I doubt that contact<->phone is a one-to-many relationship. If you consider this example you will see that it's a many-to-many relationship:
"Joe and Mary are both vendors, working in the same office. Therefore they both have the same landline number. They also each have their own mobile number."
So in my opinion you would need to add a contact_number table with two columns of foreign keys, contact_id and phone_id.
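A minimal sketch of that layout, using Python's sqlite3 module. It assumes the merged contact table with a contact_type column suggested in the other answers; the phone table name, the column names and the phone numbers are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE contact (
    contact_id   INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    contact_type TEXT NOT NULL CHECK (contact_type IN ('vendor', 'client'))
);
CREATE TABLE phone (
    phone_id INTEGER PRIMARY KEY,
    number   TEXT NOT NULL UNIQUE
);
-- Junction table: one row per (contact, phone) pair.
CREATE TABLE contact_number (
    contact_id INTEGER NOT NULL REFERENCES contact(contact_id),
    phone_id   INTEGER NOT NULL REFERENCES phone(phone_id),
    PRIMARY KEY (contact_id, phone_id)
);
INSERT INTO contact VALUES (1, 'Joe', 'vendor'), (2, 'Mary', 'vendor');
INSERT INTO phone VALUES (1, '555-0100'), (2, '555-0101'), (3, '555-0102');
-- Joe and Mary share the landline (phone 1); each also has a mobile.
INSERT INTO contact_number VALUES (1, 1), (2, 1), (1, 2), (2, 3);
""")

def contacts_for(number):
    """Return the names of all contacts reachable on the given number."""
    rows = conn.execute("""
        SELECT c.name
        FROM contact c
        JOIN contact_number cn ON cn.contact_id = c.contact_id
        JOIN phone p ON p.phone_id = cn.phone_id
        WHERE p.number = ?
        ORDER BY c.name
    """, (number,)).fetchall()
    return [name for (name,) in rows]

shared_landline = contacts_for("555-0100")
joes_mobile = contacts_for("555-0101")
```

The shared landline is stored once and linked twice, so changing the office number is a single-row update.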
I defined Master tables (data definition tables, static in nature) to generate content in my web page; and Transaction tables to store data entered by users (these tables are dynamic in nature). Consider following example:
Set of Master tables consisting of State having 1:M relationship with City, City having 1:M relationship with Locality. A Transaction table User to store personal details entered by a user. The User table has address attributes like Address, State, City and Locality. These can be defined as 1:M relationships from corresponding Master Tables (a particular record in State, City, Locality tables can be a part of multiple records in User table).
Is the design correct? I think it's sufficient to define 1:M relationship between Locality and User tables since the other two attributes (City and State) can be obtained from relationships between the Master tables. Would it be better to change the ER design to the following?
Are there alternatives to my requirement?
What queries do you have? Do you ever need to search by state or city? Even if you do search by those, it may not impact what I am about to say...
Since locality, city, and state are 'nested' and it is not likely for the names to change, I suggest that both of your options are "over-normalized". One table with all three items in it is the way I would go.
As I see it, there are two main reasons for normalizing:
Locating some string that is likely to change. By putting it in a separate table and pointing to that table, you can change it on only one place. This is not needed in your example.
Saving space (hence providing speed, etc). This does apply in your example, but only at the locality level, not at address. You might argue that city and state can be deduped; I would counter with "The added complexity (extra tables) does not warrant the minimal benefit."
A side note: If locality is zipcode, then your option 1 is in trouble in at least one place I know of: Los Altos and Los Altos Hills (two different cities in California) both have sections of zipcodes 94022 and 94024.
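The single-table suggestion could look roughly like this, sketched with Python's sqlite3 module. The table name location, the column names and the sample user are assumptions, and the example treats locality as a zipcode purely to illustrate the side note.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
-- Locality, city and state live together in one master table.
CREATE TABLE location (
    location_id INTEGER PRIMARY KEY,
    locality    TEXT NOT NULL,
    city        TEXT NOT NULL,
    state       TEXT NOT NULL,
    UNIQUE (locality, city, state)
);
CREATE TABLE user (
    user_id     INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    address     TEXT NOT NULL,
    location_id INTEGER NOT NULL REFERENCES location(location_id)
);
-- The same zipcode can appear under two different cities.
INSERT INTO location VALUES
    (1, '94022', 'Los Altos', 'CA'),
    (2, '94022', 'Los Altos Hills', 'CA');
INSERT INTO user VALUES (1, 'Example user', '1 Example Road', 2);
""")

row = conn.execute("""
    SELECT u.name, l.locality, l.city, l.state
    FROM user u
    JOIN location l ON l.location_id = u.location_id
""").fetchone()
```

Because uniqueness is on the whole (locality, city, state) combination, the shared zipcode does not force a choice between the two cities, and the user still needs only one join to get all three values.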
I am designing a database in MySQL and need some guidance on how it should be structured and the relationships between tables. I have identified the following facts:
I have:
many users. (Fine)
One user has many organisations. (One - Many)
One organisation has many events. (One - Many)
One user has many addresses. (One - Many)
One organisation has many addresses (One - Many), but these addresses could be used by another organisation or event. Thus many organisations have many addresses. (Many - Many)
One event has one address. (One - One)
One organisation has one main address, but many organisations could be working from the same address, so (Many - One)
This is where I am stuck, because although an organisation or event has an address, they do not own it, the user does. So are these relationships necessary? Do I need to define foreign key relationships or can I get away without them? Do I need to maintain a separate table of default addresses because 2 organisations could use the same address but it not be the default address of one, so referencing it in the address table would be problematic (which one is actually the main address)?
Or am I looking at this in too complicated a way? Perhaps have the user maintain the addresses, then when they add an organisation, the organisation references that address but knows nothing of the other addresses (is that a one - one relationship for organisations or a many - one for addresses? Clearly removal of an organisation shouldn't mean the removal of an address and vice versa).
Then, when an event is added, it also just references an address, but can lookup the default address of the organisation to which it belongs. The same questions arise as above. The event is at that address but the address doesn't belong to the event or vice versa.
That almost simplifies it to:
One user, many events.
One user, many organisations.
One user, many addresses.
One organisation, one address.
One event, one address.
Is this the correct way to be looking at this problem? Are there any difficulties that could arise that I don't appear to have considered? Is there a better way to tackle this? The biggest problem I have is how to relate the tables to each other so I can set the relationships accordingly.
-- Edit: Added Info
Thinking about it further, more than one organisation might be having the same event. I would like to be able to link events also. These organisations could be added by separate users, but all need to be related. Is this something that MySQL can handle easily or should I be looking at other types of database logic such as graph databases?
It really depends on your use case. Sure, having a user or organization share an address (so, an addresses table) might seem technically correct, but it is probably adding unnecessary complexity. Think about how you're going to query your database (try to be forward thinking - what stuff might you want to do later?) and design your tables to support those queries.
You don't have to have foreign key constraints in your database. Actually, most databases I've looked at don't have them; the business logic handles checking the integrity of the records. I would suggest, again, that you be pragmatic about it and do what you are comfortable building and supporting.
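For reference, the simplified model from the question might be sketched as below. This uses Python's sqlite3 module rather than MySQL, and it does declare foreign keys, only to show what they would enforce; whether to declare them is the judgment call described above. All table and column names are assumptions, and the event is linked to an organisation because the question says it looks up the organisation's default address.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE user (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- The user owns the addresses.
CREATE TABLE address (
    address_id INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES user(user_id),
    line1      TEXT NOT NULL
);
-- An organisation only references its main address.
CREATE TABLE organisation (
    organisation_id INTEGER PRIMARY KEY,
    user_id         INTEGER NOT NULL REFERENCES user(user_id),
    name            TEXT NOT NULL,
    main_address_id INTEGER NOT NULL REFERENCES address(address_id)
);
CREATE TABLE event (
    event_id        INTEGER PRIMARY KEY,
    organisation_id INTEGER NOT NULL REFERENCES organisation(organisation_id),
    name            TEXT NOT NULL,
    address_id      INTEGER NOT NULL REFERENCES address(address_id)
);
INSERT INTO user VALUES (1, 'Example user');
INSERT INTO address VALUES (1, 1, '1 Example Road');
-- Two organisations working from the same address.
INSERT INTO organisation VALUES (1, 1, 'Org A', 1), (2, 1, 'Org B', 1);
""")

# Removing an organisation leaves the address in place.
conn.execute("DELETE FROM organisation WHERE organisation_id = 2")
addresses_left = conn.execute("SELECT COUNT(*) FROM address").fetchone()[0]

# Removing an address that is still referenced is refused by the constraint.
try:
    conn.execute("DELETE FROM address WHERE address_id = 1")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True
```

Without the constraints, the second delete would succeed and the application code would have to notice that Org A now points at a missing address.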
Every implementation of a credentials table I've seen has an auto-incrementing id to track users.
However,
If I verify unique email addresses before inserting into a MySQL table, then I can guarantee the uniqueness of each row by email address. Furthermore, I can access the table as needed through the email address.
Does anyone see a problem with this?
I'm trying to understand why others don't follow this approach?
Those email addresses are much larger than 4 bytes and, perhaps even worse for the storage engine, they are variable length.
Also one person might want two accounts, or might have several email addresses over time.
Then there are the problems associated with case folding.
When other tables have data that relates to users, what do you use as a foreign key? Their email address? What if they want to change their email address? What would have been a single one-row update now becomes a giant mess.
A generated key allows you to decouple data that can change from the relationships between records and tables.
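A small sketch of that decoupling, using Python's sqlite3 module as a stand-in for MySQL. The orders table is hypothetical, and COLLATE NOCASE is SQLite's way of making the uniqueness check ignore ASCII case; other engines handle case folding differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- generated key
    email   TEXT NOT NULL COLLATE NOCASE UNIQUE -- still unique, but not the key
);
-- Hypothetical related table: it refers to the user by id, never by email.
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    user_id  INTEGER NOT NULL REFERENCES users(user_id)
);
INSERT INTO users (email) VALUES ('alice@example.com');
INSERT INTO orders (user_id) VALUES (1), (1), (1);
""")

# Changing the email touches one row; the three orders are unaffected.
cur = conn.execute("UPDATE users SET email = 'alice@example.org' WHERE user_id = 1")
rows_touched = cur.rowcount

# The UNIQUE constraint still stops a second account with the same address,
# including one that differs only in letter case.
try:
    conn.execute("INSERT INTO users (email) VALUES ('ALICE@example.org')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

With the email as primary key, the same change would have to rewrite every referencing row in orders as well.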
I want to store user details in the database, with columns like first name, last name, username, password, email, cellphone number, activation codes, gender, birthday, occupation, and a few others. Is it good to store all of these in the same table, or should I split them between two tables, users and profile?
If those are attributes of a User (and they are 1-1) then they belong in the user table.
You would only normally split if there were many columns; then you might create another table in a 1-1 mapping.
Another table is obviously required if there are many profile rows per user.
One table should be good enough.
Two tables, or more generally vertical partitioning, comes in when you want to scale out. You split your table into multiple tables, where the partitioning criterion is usually the usage, i.e. the attributes most commonly used together are housed in one table and the others in another table.
One table should be okay. I'd be storing a hash in the password column.
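To illustrate what storing a hash could look like, here is a sketch using PBKDF2 from Python's standard hashlib module. PBKDF2, the iteration count and the function names are my choices for the example, not something the question specifies; the password column would then hold the salt and digest instead of the password itself.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # illustrative work factor (an assumption); tune for your hardware

def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is generated if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def verify_password(password, salt, expected_digest):
    """Re-derive the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)

# What would be stored for a user: the salt and the digest, never the password.
salt, stored_digest = hash_password("correct horse battery staple")
```

At login, verify_password is called with the submitted password and the stored salt and digest.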
I suggest you read this article on Wikipedia about database normalization.
It describes the different possibilities and the pros and cons of each. It really depends on what else you want to store and the relationship between the user and its properties.
Ideally one table should be used. If the number of columns becomes harder to manage only then you should move them to another table. In that case, ideally, the two tables should have a one-one relationship which you can easily establish by setting the foreign key in the related table as the primary key:
User
-------------------------------
UserID INT NOT NULL PRIMARY KEY
UserProfile
-------------------------------------------------------
UserID INT NOT NULL PRIMARY KEY REFERENCES User(UserID)
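The layout above can be tried out with Python's sqlite3 module. The Username and Occupation columns and the sample rows are placeholders added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE User (
    UserID   INTEGER NOT NULL PRIMARY KEY,
    Username TEXT NOT NULL
);
-- UserID is both the primary key and a foreign key, which makes the mapping 1-1.
CREATE TABLE UserProfile (
    UserID     INTEGER NOT NULL PRIMARY KEY REFERENCES User(UserID),
    Occupation TEXT
);
INSERT INTO User VALUES (1, 'example_user');
INSERT INTO UserProfile VALUES (1, 'Engineer');
""")

def rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

# The primary key allows at most one profile row per user.
second_profile_rejected = rejected("INSERT INTO UserProfile VALUES (1, 'Teacher')")
# The foreign key refuses a profile for a user that does not exist.
orphan_profile_rejected = rejected("INSERT INTO UserProfile VALUES (99, 'Nobody')")
```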
Depending on what kind of application it is, the answer might be different.
For an enterprise application where the users are also the employees, I would suggest two tables.
tbl_UserPersonallInformation (contains the personal information, like name, address, email, ...)
tbl_UserSystemInformation (contains other information, like Title, JoinedTheCompanyOn, LeftTheCompanyOn)
In systems such as "Document Management" or "Project Information Management", this might be necessary.
For example, employees might leave a company and rejoin after a few years, possibly with a different job title. The employee will have some activities and records under the old title and more under the new one, so the system should record which title (authority) each action was performed under.
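One way to record this is sketched below with Python's sqlite3 module. Each period of employment gets its own row and each activity points at the period it happened in. The table names (employee, employment, activity) and the sample data are hypothetical and simpler than the tbl_ tables named above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
-- One row per stint at the company, each with its own title.
CREATE TABLE employment (
    employment_id INTEGER PRIMARY KEY,
    employee_id   INTEGER NOT NULL REFERENCES employee(employee_id),
    title         TEXT NOT NULL,
    joined_on     TEXT NOT NULL,
    left_on       TEXT              -- NULL while the stint is ongoing
);
-- Each activity is tied to the stint, and so to the title, it was done under.
CREATE TABLE activity (
    activity_id   INTEGER PRIMARY KEY,
    employment_id INTEGER NOT NULL REFERENCES employment(employment_id),
    description   TEXT NOT NULL
);
INSERT INTO employee VALUES (1, 'Example employee');
INSERT INTO employment VALUES
    (1, 1, 'Engineer', '2010-01-01', '2013-06-30'),
    (2, 1, 'Manager',  '2016-02-01', NULL);
INSERT INTO activity VALUES
    (1, 1, 'Approved document A'),
    (2, 2, 'Approved document B');
""")

history = conn.execute("""
    SELECT a.description, e.title
    FROM activity a
    JOIN employment e ON e.employment_id = a.employment_id
    ORDER BY a.activity_id
""").fetchall()
```

The personal details stay in one row per person, while the title that applied to each action remains recoverable after the person rejoins under a new title.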