Abstract database design - mysql

So I'm creating a web app with different user types that can come from different countries. Examples of the user types would be company, staff, etc., where a company would have a company_name field and staff would not.
In the users database I'm wondering if it's a good idea to implement a one-table-per-column approach, i.e. for each user attribute there would be a table with a foreign key (the user_id) and a value for the attribute.
eg.
users.company_name =
id(PK), | user_id(FK) | 'company_name'
1 | 1 | company 1
users.email =
id(PK), | user_id(FK) | 'email'
1 | 1 | user#email.com
The same could be applied to an address database where different countries' addresses have different values.
Opinions?

The term you're looking for is "The Party Model"
You want to use Table Inheritance†, also known as subtype/supertype relationships to model stuff like this.
An Individual is a concretion of an abstract Legal Party. An Organization (e.g. a Company) is also a concretion of an abstract Legal Party.
"Staff" is not a subtype of Legal Party. It's a relationship between a Company and an Individual. A company hasMany staffRelationships with individuals.
I recommend Single Table Inheritance, as it's fast and simple. If you really don't like nulls, then go for Class Table Inheritance.
create table parties (
  party_id int primary key,
  type smallint not null references party_types(party_type_id), -- elided
  individual_name text null,
  company_name text null
  /* use check constraints for type vs individual/company values */
);
I'd go with PostgreSQL over MySQL (or MariaDB) if you're going to use Single Table Inheritance, as older versions of the latter parse check constraints but do not enforce them (MySQL enforces them only from 8.0.16, MariaDB from 10.2.1).
You can make user belongTo a party, or make party haveOne user.
† Which is different from PostgreSQL's Inheritance feature.
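As a rough sketch of what "check constraints for type vs individual/company values" could look like, here is one possible reading, run through Python's built-in sqlite3 module so it works standalone. The text column party_type stands in for the elided party_types lookup table, and the exact rule (exactly one name column filled, matching the type) is an assumption, not something the table definition above spells out.

```python
import sqlite3

# Hypothetical schema: party_type is plain text here instead of a foreign key
# to the elided party_types table, purely to keep the sketch short.
SCHEMA = """
CREATE TABLE parties (
    party_id        INTEGER PRIMARY KEY,
    party_type      TEXT NOT NULL CHECK (party_type IN ('individual', 'organization')),
    individual_name TEXT,
    company_name    TEXT,
    -- assumed rule: only the name column that matches the type may be filled
    CHECK (
        (party_type = 'individual'   AND individual_name IS NOT NULL AND company_name IS NULL)
     OR (party_type = 'organization' AND company_name IS NOT NULL AND individual_name IS NULL)
    )
);
"""

def make_db():
    """Create an in-memory database holding the parties table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def try_insert(conn, party_type, individual_name, company_name):
    """Return True if the row passes the CHECK constraints, False otherwise."""
    try:
        conn.execute(
            "INSERT INTO parties (party_type, individual_name, company_name) "
            "VALUES (?, ?, ?)",
            (party_type, individual_name, company_name),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this in place, try_insert(conn, "organization", "Ada", None) is rejected by the database itself rather than by application code, which is the point of using check constraints with Single Table Inheritance.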

I'd create a single users table with company_name and email columns.
For addresses table, I'd start with something simple like this: id, address_line_1, address_line_2, city, state, country, zip.

With this strategy you'll have to join a lot of tables to get a meaningful query result. As a result, your performance will suffer and you'll make very inefficient use of storage.
You should at least combine columns that will typically be combined for a logical entity in your application. So if a 'company' differs from 'staff' in that it has extra columns, you would create a table 'users.company_properties'.
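A minimal sketch of that idea, using Python's sqlite3 module so it runs as-is: the shared columns stay in users, and the company-only columns go into one company_properties table keyed by user_id. The column names and sample rows are made up for illustration.

```python
import sqlite3

def make_db():
    """Build an in-memory database with a users table and a company-only side table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
    conn.executescript("""
        CREATE TABLE users (
            user_id INTEGER PRIMARY KEY,
            email   TEXT NOT NULL
        );
        -- one row per company user; staff users simply have no row here
        CREATE TABLE company_properties (
            user_id      INTEGER PRIMARY KEY REFERENCES users (user_id),
            company_name TEXT NOT NULL
        );
        INSERT INTO users VALUES (1, 'company@example.com'), (2, 'staff@example.com');
        INSERT INTO company_properties VALUES (1, 'Company 1');
    """)
    return conn

def load_users(conn):
    """Return every user with company_name, or None where the user is not a company."""
    return conn.execute("""
        SELECT u.user_id, u.email, c.company_name
        FROM users AS u
        LEFT JOIN company_properties AS c ON c.user_id = u.user_id
        ORDER BY u.user_id
    """).fetchall()
```

One join brings back all company attributes at once, instead of one join per attribute as in the table-per-column layout.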

Related

How to store many to many relationship with extra column data

In the above scenario, 'signs and symptoms' is a multi-selection, and if 'others' is selected, the 'specify-others' field must be filled in. How should I store this?
What is the best table structure for performance and querying?
Should I provide 15 columns in a single table and store null where there is no value, or store foreign keys to the symptoms in another table? (With the second strategy, how do I store the 'other symptoms' description, i.e. the specify-others field data?)
There is no universal answer; your choice may depend on multiple factors, including external ones such as the coding framework you use to access the database (if any). The "classic" way to do it:
1. Patient table:
id (PK)
name
2. Symptom table:
id (PK)
symptom
3. Patient to Symptom table:
id (PK)
patient_id (FK)
symptom_id (FK)
other_symptoms (text)
But once again, any approach (including this one) has its own pros and cons and this is not a universal solution.
I would definitely rule out the 15-columns-in-one-table option, because whenever a new symptom needs to be added (and that will happen sooner rather than later), you'll have to change:
the table schema
the code that displays the symptoms
the code that inserts/updates patient records
who knows what else.
I'd go with a classic many to many relationship, with tables similar to:
patients: patient_id, name, etc
symptoms: symptom_id, name, description, etc
patient_symptoms: patient_id, symptom_id
Even better would be an extra table:
visits: doctor_id, patient_id, date, other_symptoms
And then, your patient_symptoms table can be related to an actual visit to a doctor:
patient_symptoms: visit_id, symptom_id
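To make the visit-based layout concrete, here is a small sketch run through Python's sqlite3 module. The visit_id surrogate key, the column types, and the sample rows are assumptions added for illustration, and the doctors table is left out to keep it short.

```python
import sqlite3

def make_db():
    """Create the patients / symptoms / visits / patient_symptoms tables with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE patients (patient_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE symptoms (symptom_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
        CREATE TABLE visits (
            visit_id       INTEGER PRIMARY KEY,  -- assumed surrogate key
            patient_id     INTEGER NOT NULL REFERENCES patients (patient_id),
            visit_date     TEXT NOT NULL,
            other_symptoms TEXT                  -- free text for the 'others' option
        );
        CREATE TABLE patient_symptoms (
            visit_id   INTEGER NOT NULL REFERENCES visits (visit_id),
            symptom_id INTEGER NOT NULL REFERENCES symptoms (symptom_id),
            PRIMARY KEY (visit_id, symptom_id)   -- each symptom at most once per visit
        );
        INSERT INTO patients VALUES (1, 'Patient A');
        INSERT INTO symptoms VALUES (1, 'fever'), (2, 'cough'), (3, 'rash');
        INSERT INTO visits VALUES (10, 1, '2020-01-15', 'tingling in left hand');
        INSERT INTO patient_symptoms VALUES (10, 1), (10, 2);
    """)
    return conn

def symptoms_for_visit(conn, visit_id):
    """Return (sorted list of selected symptom names, free-text other symptoms)."""
    names = [row[0] for row in conn.execute(
        """SELECT s.name
           FROM patient_symptoms AS ps
           JOIN symptoms AS s ON s.symptom_id = ps.symptom_id
           WHERE ps.visit_id = ?
           ORDER BY s.name""",
        (visit_id,),
    )]
    other = conn.execute(
        "SELECT other_symptoms FROM visits WHERE visit_id = ?", (visit_id,)
    ).fetchone()[0]
    return names, other
```

Adding a sixteenth symptom is then a single INSERT into symptoms, with no schema change.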

MySQL relationship within the same table?

I have a web application where you can create a group. A group can be one of 3 options,
Organization
Client
Team
A client group and a team group are relatively simple, however the organisation is a little more complicated.
An organisation can have multiple clients, now my confusion is coming from how do I create this relationship, as the organizations, clients and teams are all saved in the same table. What is the best way to set this up? Should I create a client table that just contains a unique ID and the ID of each client in the groups table, and create a relationship between that and the groups table?
The way I understand it, your application requires hierarchical groups. In other words, Organization is a group but it also contains another group, such as Client. From your comments, it appears that you want to treat all three as groups.
I can suggest the following table:
entity
+ id INT UNSIGNED AUTO_INCREMENT
+ parentId INT UNSIGNED
+ type ENUM ('Client','Team','Organization')
+ name VARCHAR(255)
+ address VARCHAR(255)
For a top-level entity such as an Organization, parentId will be zero. For a client/team group, parentId will refer to the id of an organization group. In fact, any kind of hierarchy is possible with the above definition.
If your columns for different groups need to be different, then you need multiple tables but one table can contain the group hierarchy as noted above.
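A quick sketch of how that self-referencing table could be queried, using Python's sqlite3 module so it runs standalone. SQLite has no ENUM, so a CHECK constraint stands in for it here; the address column is omitted and the sample rows are invented.

```python
import sqlite3

def make_db():
    """Create the entity table and a tiny sample hierarchy."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE entity (
            id       INTEGER PRIMARY KEY,
            parentId INTEGER NOT NULL DEFAULT 0,  -- 0 marks a top-level entity
            type     TEXT NOT NULL
                     CHECK (type IN ('Client', 'Team', 'Organization')),
            name     TEXT NOT NULL
        );
        INSERT INTO entity (id, parentId, type, name) VALUES
            (1, 0, 'Organization', 'Org A'),
            (2, 1, 'Client', 'Client X'),
            (3, 1, 'Client', 'Client Y'),
            (4, 1, 'Team', 'Team Z');
    """)
    return conn

def children_of(conn, parent_id, child_type):
    """Return the names of all entities of child_type directly under parent_id."""
    rows = conn.execute(
        "SELECT name FROM entity WHERE parentId = ? AND type = ? ORDER BY name",
        (parent_id, child_type),
    )
    return [row[0] for row in rows]
```

For example, children_of(conn, 1, "Client") lists the clients of organization 1 without needing a separate client table.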

Table for each region in MySQL

There are four regions with more than one million records total. Should I create a table with a region column or a table for each region and combine them to get the top ranks?
If I combine all four regions, none of my columns will be unique so I will need to also add an id column for my primary key. Otherwise, name, accountId & characterId would be candidate keys or should I just add an id column anyways.
Table:
----------------------------------------------------------------
| name | accountId | iconId | level | characterId | updateDate |
----------------------------------------------------------------
Edit:
Should I look into partitioning the table by region_id?
Because all records are related to a particular region, a single database table in 3NF (e.g. All-Regions) containing a regionId along with the other attributes should work.
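Here is a rough sketch of the single-table option, run through Python's sqlite3 module. It assumes that characterId is unique within a region, so (regionId, characterId) can serve as the key; that is a guess about the data, not something stated in the question. iconId and updateDate are left out for brevity, and the rows are made up.

```python
import sqlite3

def make_db():
    """Create one characters table covering all regions, with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE characters (
            regionId    INTEGER NOT NULL,
            characterId INTEGER NOT NULL,
            accountId   INTEGER NOT NULL,
            name        TEXT NOT NULL,
            level       INTEGER NOT NULL,
            PRIMARY KEY (regionId, characterId)  -- assumed composite key
        );
        CREATE INDEX idx_characters_level ON characters (level);
        INSERT INTO characters VALUES
            (1, 100, 1, 'alpha',   30),
            (2, 100, 7, 'bravo',   45),
            (3, 200, 9, 'charlie', 12),
            (1, 101, 2, 'delta',   40);
    """)
    return conn

def top_ranks(conn, limit, region_id=None):
    """Return the highest-level characters, optionally restricted to one region."""
    sql = "SELECT regionId, name, level FROM characters"
    params = []
    if region_id is not None:
        sql += " WHERE regionId = ?"
        params.append(region_id)
    sql += " ORDER BY level DESC, name LIMIT ?"
    params.append(limit)
    return conn.execute(sql, params).fetchall()
```

The same query serves both the cross-region ranking and the per-region one, which would otherwise need a UNION over four tables.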
The correct answer, as usual with database design, is "It depends".
First of all, (IMHO) a good primary key should belong to the database, not to the users :)
So, if accountId and characterId are user-editable or prominently displayed to the user, they should not be used for the primary key of the table(s) anyway. And using name (or any other user-generated string) for a key is just asking for trouble.
As for the regions, try to divine how the records will be used.
Will most of the queries use only a single region, or will most of them use data across regions?
Is there a possibility that the schemas for different regions might diverge?
Will there be different usage scenarios for similar data? (e.g. different phone number patterns for different regions)
Bottom line, both approaches will work, let your data tell you which approach will be more manageable.

Database many-to-many intermediate tables: extra fields

I have created a 'shops' and a 'customers' table and an intermediate table customers_shops. Every shop has a site_url web address, except that some customers use an alternative url to access the shop's site (this url is unique to a particular customer).
In the intermediate table below, I have added an additional field, shop_site_url. My understanding is that this is in second normal form, as the shop_site_url field is unique to a particular customer and shop (therefore won't be duplicated for different customers/shops). Also, since it depends on customer and shop, I think this is in third normal form. I'm just not used to using the 'mapping' table (customers_shops) to contain additional fields. Does the design below make sense, or should I reserve intermediate tables purely as a way to convert many-to-many relationships into one-to-many ones?
######
customers
######
id INT(11) NOT NULL PRIMARY KEY
name VARCHAR(80) NOT NULL
######
shops
######
id INT(11) NOT NULL PRIMARY KEY
site_url TEXT
######
customers_shops
######
id INT(11) NOT NULL PRIMARY KEY
customer_id INT(11) NOT NULL
shop_id INT(11) NOT NULL
shop_site_url TEXT //added for a specific url for customer
Thanks
What you are calling an "intermediate" table is not a special type of table. There is only one kind of table and the same design principles ought to be applicable to all.
Well, let's create the table, insert some sample data, and look at the results.
id  cust_id  shop_id  shop_site_url
--
1   1000     2000     NULL
2   1000     2000     http://here-an-url.com
3   1000     2000     http://there-an-url.com
4   1000     2000     http://everywhere-an-url-url.com
5   1001     2000     NULL
6   1001     2000     http://here-an-url.com
7   1001     2000     http://there-an-url.com
8   1001     2000     http://everywhere-an-url-url.com
Hmm. That doesn't look good. Let's ignore the alternative URL for a minute. To create a table that resolves a m:n relationship, you need a constraint on the columns that make up the m:n relationship.
create table customers_shops (
customer_id integer not null references customers (customer_id),
shop_id integer not null references shops (shop_id),
primary key (customer_id, shop_id)
);
(I dropped the "id" column, because it tends to obscure what's going on. You can add it later, if you like.)
Insert some sample data . . . then
select customer_id as cust_id, shop_id
from customers_shops;
cust_id shop_id
--
1000 2000
1001 2000
1000 2001
1001 2001
That's closer. You should have only one row for each combination of customer and shop in this kind of table. (This is useful data even without the url.) Now what do we do about the alternative URLs? That depends on a couple of things.
Do customers access the sites through only one URL, or might they use more than one?
If the answer is "only one", then you can add a column to this table for the URL, and make that column unique. It's a candidate key for this table.
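For the "only one URL" case, a small sketch using Python's sqlite3 module shows both constraints at work. The foreign keys to customers and shops are left out so the snippet stands alone, and treating a NULL url as "use the shop's normal site_url" is my assumption rather than part of the answer.

```python
import sqlite3

def make_db():
    """Create customers_shops with a composite key and a unique alternative URL."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers_shops (
            customer_id   INTEGER NOT NULL,
            shop_id       INTEGER NOT NULL,
            shop_site_url TEXT UNIQUE,           -- NULL: no alternative URL (assumed)
            PRIMARY KEY (customer_id, shop_id)   -- one row per customer/shop pair
        );
    """)
    return conn

def add_link(conn, customer_id, shop_id, url=None):
    """Insert a customer/shop pair; return False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO customers_shops VALUES (?, ?, ?)",
            (customer_id, shop_id, url),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

A second row for the same customer and shop is refused by the primary key, and reusing one alternative URL for a different pair is refused by the unique constraint, so the duplicated rows from the first sample table cannot occur.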
If the answer is "more than one--at the very least the site url and the alternative url", then you need to make more decisions about constraints, because altering this table to allow multiple urls for each combination of customer and shop cuts across the grain of this requirement:
the shop_site_url field is unique to a particular customer and shop (therefore won't be duplicated for different customers/shops)
Essentially, I'm asking you to decide what this table means--to define the table's predicate. For example, these two different predicates lead to different table structures.
customer 'n' has visited the web site for shop 'm' using url 's'
customer 'n' is allowed to visit the web site for shop 'm' using alternate url 's'
Your schema does indeed make sense, as shop_site_url is an attribute of the relationship itself. You might want to give it a more meaningful name in order to distinguish it from shops.site_url.
Where else would you put this information? It's not an attribute of a shop, and it's not an attribute of a customer. You could put this in a separate table, if you wanted to avoid having a NULLable column, but you'd end up having to have a reference to your intermediate table from this new table, which probably would look even weirder to you.
Relationships can have attributes, just like entities can have attributes.
Entity attributes go into columns in entity tables. Relationship attributes, at least for many-to-many relationships, go in relationship tables.
It sounds as though, in general, URL is determined by the combination of shop and customer. So I would put it in the shop-customer table. The fact that many shops have only one URL suggests that there is a fifth normal form that is more subtle than this. But I'm too lazy to work it out.

Design of MySQL DB to avoid having a table with mutually exclusive fields

I'm creating a new DB and I have this problem: I have two types of users that can place orders: registered users (that is, they have a login) and guest users (that is, no login). The data for registered users and guest users are different, and that's why I'm thinking of using two different tables, but the orders (that share the same workflow) are all the same, so I'm thinking about using only one table.
I've read here and here (even if I don't understand fully this example) that I can enforce a MySQL rule to have mutually exclusive columns in a table (in my case they'd be "idGuest" and "idUser") but I don't like that approach.
Is there a better way to do it?
There are several approaches, which depend on the number of records and the number of unique fields. For example, if you had said they differ in only two fields, I would have suggested that you just put everything in the same table.
My approach, assuming they differ a lot, would be to think "objects":
You have a main user table, and for each user type you have another table that "elaborates" that user info.
Users
-----
id, email, phone, user_type (guest or registered)
reg_users
---------
user_id, username, password, etc.
unreg_users
-----------
user_id, last_known_address, favorite_color, etc.
Where user_id is a foreign key to the Users table.
Sounds like mostly a relational supertype/subtype issue. I've answered a similar question and included sample code that you should be able to adapt without much trouble. (Make sure you read the comments.)
The mildly complicating factor for you is that one subtype (guest users) could someday become a different subtype (registered users). How you'd handle that would be application-dependent. (Meaning you'd know, but probably nobody else would.)
I think I would have three tables:
A user table, that would contain:
One row for each user, no matter what type of user
The data that's present for both guests and registered
A field that indicates if a row corresponds to a registered or a guest
A guest table, that would contain:
One row per guest user,
The data that's specific to guests
And a registered table, that would contain:
One row per registered user,
The data that's specific to registered users
Then, when referencing a user (in your orders table, for example), you'd always use the id of the user table.
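The three-table layout might look roughly like this, sketched with Python's sqlite3 module so it can be run directly. Table and column names beyond those mentioned above, and the sample rows, are placeholders.

```python
import sqlite3

def make_db():
    """Create users plus one side table per user type, and an orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
    conn.executescript("""
        CREATE TABLE users (
            user_id   INTEGER PRIMARY KEY,
            email     TEXT NOT NULL,
            user_type TEXT NOT NULL CHECK (user_type IN ('guest', 'registered'))
        );
        CREATE TABLE registered_users (
            user_id  INTEGER PRIMARY KEY REFERENCES users (user_id),
            username TEXT NOT NULL UNIQUE
        );
        CREATE TABLE guest_users (
            user_id            INTEGER PRIMARY KEY REFERENCES users (user_id),
            last_known_address TEXT
        );
        -- orders point at the shared users table only, so the column is never NULL
        CREATE TABLE orders (
            order_id INTEGER PRIMARY KEY,
            user_id  INTEGER NOT NULL REFERENCES users (user_id)
        );
        INSERT INTO users VALUES
            (1, 'reg@example.com', 'registered'),
            (2, 'guest@example.com', 'guest');
        INSERT INTO registered_users VALUES (1, 'reg');
        INSERT INTO guest_users VALUES (2, '1 Example Street');
        INSERT INTO orders VALUES (100, 1), (101, 2);
    """)
    return conn

def place_order(conn, order_id, user_id):
    """Insert an order; return False if user_id does not exist in users."""
    try:
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, user_id))
        return True
    except sqlite3.IntegrityError:
        return False

def orders_with_user_type(conn):
    """List every order together with the type and email of the user who placed it."""
    return conn.execute("""
        SELECT o.order_id, u.user_type, u.email
        FROM orders AS o
        JOIN users AS u ON u.user_id = o.user_id
        ORDER BY o.order_id
    """).fetchall()
```

There are no mutually exclusive idGuest/idUser columns here: orders has a single NOT NULL foreign key, and the database rejects orders for users that do not exist.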
What you are describing is a polymorphic table. It sounds scary, but it really isn't so bad.
You can keep your separate User and Guest tables. For your Orders table, you have two columns: foreign_id and foreign_type (you can name them anything). The foreign_id is the id of the User or Guest in your case, and the content of the foreign_type is going to be either user or guest:
id | foreign_id | foreign_type | other_data
-------------------------------------------------
1 | 1 | user | ...
2 | 1 | guest | ...
To select rows for a particular user or guest, just specify the foreign_type along with the ID:
SELECT * FROM orders WHERE foreign_id = 1 AND foreign_type = 'guest';
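Here is a runnable sketch of that polymorphic layout, using Python's sqlite3 module. The users and guests columns and the sample rows are invented. One trade-off worth seeing in practice: foreign_id cannot be declared as a foreign key, because it points at two different tables, so the database will not stop an order that refers to a missing user or guest.

```python
import sqlite3

def make_db():
    """Create users, guests, and a polymorphic orders table with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users  (id INTEGER PRIMARY KEY, username TEXT NOT NULL);
        CREATE TABLE guests (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
        CREATE TABLE orders (
            id           INTEGER PRIMARY KEY,
            foreign_id   INTEGER NOT NULL,   -- id in users OR guests, no FK possible
            foreign_type TEXT NOT NULL CHECK (foreign_type IN ('user', 'guest'))
        );
        CREATE INDEX idx_orders_owner ON orders (foreign_type, foreign_id);
        INSERT INTO users  VALUES (1, 'alice');
        INSERT INTO guests VALUES (1, 'guest@example.com');
        INSERT INTO orders VALUES (1, 1, 'user'), (2, 1, 'guest');
    """)
    return conn

def orders_for(conn, foreign_type, foreign_id):
    """Return the ids of all orders placed by the given user or guest."""
    rows = conn.execute(
        "SELECT id FROM orders WHERE foreign_id = ? AND foreign_type = ? ORDER BY id",
        (foreign_id, foreign_type),
    )
    return [row[0] for row in rows]
```

So orders_for(conn, "guest", 1) and orders_for(conn, "user", 1) return different orders even though the id is the same, and keeping foreign_id valid is left to the application.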
The foreign key in the Orders table pointing back to the Customer entity that placed the order is typically a non-nullable column. If you have two different Customer tables (RegisteredCustomer and GuestCustomer), then you would require two separate nullable columns in the Orders table pointing back to the separate customer tables. What I would suggest is to have only one Customers table, containing only those columns that are common to registered users and guest users, and then a RegisteredUsers table which has a foreign-key relationship with the Customers table.