I have tables called userAccounts, userProfiles and usersearches.
Each userAccount may have multiple profiles. Each user may have many searches.
I have the db set up and working with this. However, each search may include several user profiles.
I.e., each user account may have a profile for each member of their family.
They then want to search and include all or some of their family members in their search. The way I would like it to work is to have a column in usersearches called profiles that holds a list of the profileIDs included in that search. (But as far as I know, you can't do this in SQL.)
The only way I can think of is to have 10 columns called profile1, profile2 ... profile10, place each profileID into a column, and put 0 or NULL in the unused ones. (But this is clearly messy.)
Creating columns of the form name1...nameN is a clear violation of the Zero, One or Infinity rule, and repeating columns like these are exactly what database normalization is meant to eliminate. Arbitrarily having ten of them is not the right approach: that assumption will prove to be either wildly generous or too constrained most of the time. Since you're using a relational database, try to store your data relationally.
Consider the schema:
CREATE TABLE users (
id INT PRIMARY KEY AUTO_INCREMENT NOT NULL,
name VARCHAR(255),
UNIQUE KEY index_on_name (name)
);
CREATE TABLE profiles (
id INT PRIMARY KEY AUTO_INCREMENT NOT NULL,
user_id INT NOT NULL,
name VARCHAR(255),
email VARCHAR(255),
KEY index_on_user_id (user_id)
);
With that you can create zero or more profile records as required. You can also add or remove fields from the profile records without impacting the main user records.
If you ever want to search for all profiles associated with a user:
SELECT ... FROM profiles
JOIN users ON
users.id=profiles.user_id
WHERE users.name=?
Using a simple JOIN or subquery you can easily exercise this relationship.
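The same relational approach also covers the original question of which profiles are included in a search: a joining table holds one row per (search, profile) pair instead of a list column or profile1...profile10. This is a minimal sketch run through Python's sqlite3 module as a stand-in for MySQL; the searches and search_profiles tables, their columns, and the sample rows are assumptions made for illustration, not part of the asker's schema.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE profiles (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    name TEXT
);
-- Hypothetical table: one row per saved search.
CREATE TABLE searches (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    label TEXT
);
-- Hypothetical joining table: one row per profile included in a search.
CREATE TABLE search_profiles (
    search_id INTEGER NOT NULL REFERENCES searches(id),
    profile_id INTEGER NOT NULL REFERENCES profiles(id),
    PRIMARY KEY (search_id, profile_id)
);
INSERT INTO users (id, name) VALUES (1, 'smith');
INSERT INTO profiles (id, user_id, name) VALUES (1, 1, 'Ann'), (2, 1, 'Bob'), (3, 1, 'Cy');
INSERT INTO searches (id, user_id, label) VALUES (1, 1, 'summer trip');
INSERT INTO search_profiles (search_id, profile_id) VALUES (1, 1), (1, 3);
""")


def profiles_in_search(search_id):
    """Return the names of the profiles included in one search."""
    rows = conn.execute(
        """SELECT profiles.name
           FROM search_profiles
           JOIN profiles ON profiles.id = search_profiles.profile_id
           WHERE search_profiles.search_id = ?
           ORDER BY profiles.id""",
        (search_id,),
    ).fetchall()
    return [name for (name,) in rows]


print(profiles_in_search(1))
```

A search can then include any number of family members, and adding or removing one is a single row insert or delete.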
Related
I have a question about this DB schema I'm designing.
I have two similar kinds of user, but one has more information than the other:
Type 1 : Administrator
- Name
- Lastname
- Email
- Password
Type 2: Student
- Name
- Lastname
- Email
- Password
- Status
- Sex
- Among other type of personal information fields
So I'm unsure whether to make these two separate tables and query them both at login (because I have only one login screen), or to unify them into a single User table and add another table, called something like "extra", with a foreign key pointing back to User.
What would be the most efficient way to accomplish this? Thanks for your time.
I would make two tables and do the join after log in. Cache the extra facts about the user after they're logged in.
You should have a User table with these columns:
Id, Name, Lastname, Email, Password, IsAdmin
With a Student table:
UserId, Status, Sex, ...
A Student must also be a User - this will reduce duplication of data.
If you need more permissions than IsAdmin then remove that column and make UserPermissions and Permission tables.
If you're really that concerned about a join, then just make everything nullable and in one User table. I doubt it will matter in your use case (this is a much bigger topic).
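As a rough illustration of the User plus Student layout described above, the following sketch uses Python's sqlite3 module in place of the asker's database. The column types, the sample rows, and the load_user helper are assumptions for the example; only the table and column names come from the answer.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript("""
CREATE TABLE User (
    Id INTEGER PRIMARY KEY,
    Name TEXT NOT NULL,
    Lastname TEXT NOT NULL,
    Email TEXT NOT NULL UNIQUE,
    Password TEXT NOT NULL,            -- a password hash, never plain text
    IsAdmin INTEGER NOT NULL DEFAULT 0
);
-- A Student is also a User: the primary key doubles as the foreign key.
CREATE TABLE Student (
    UserId INTEGER PRIMARY KEY REFERENCES User(Id),
    Status TEXT,
    Sex TEXT
);
INSERT INTO User VALUES (1, 'Ada', 'Admin', 'ada@example.com', 'hash1', 1);
INSERT INTO User VALUES (2, 'Sam', 'Student', 'sam@example.com', 'hash2', 0);
INSERT INTO Student VALUES (2, 'enrolled', 'M');
""")


def load_user(email):
    """Hypothetical login lookup: one query covers both kinds of user."""
    row = conn.execute(
        """SELECT User.Id, User.Name, User.IsAdmin, Student.Status
           FROM User
           LEFT JOIN Student ON Student.UserId = User.Id
           WHERE User.Email = ?""",
        (email,),
    ).fetchone()
    return dict(row) if row else None


print(load_user("sam@example.com"))
print(load_user("ada@example.com"))
```

The LEFT JOIN means an administrator still comes back as a row, with the student-only columns set to NULL, so the single login screen needs only one query.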
An administrator is a role played by a person.
A student is a role played by a person.
A person could play one role at a time, or maybe multiple down the road. This is a business rule and should not factor into your database schema.
Use single table inheritance to allow for different types of roles in the same table.
create table people (
person_id int primary key,
given_name varchar(...),
surname varchar(...),
password varchar(...)--consider a `users` table instead
);
create table roles (
role_id int primary key,
person_id int not null references people(person_id), --use the long-hand foreign key syntax for mysql
type varchar(...), --admin or student
status varchar(...), --a check constraint fits here; PostgreSQL enforces them, MySQL only since 8.0.16
sex varchar(...)
);
I'm developing a classifieds site. And I'm totally stuck at database design level.
An advertisement can only be in one category.
In my database I have a table called "ads", which has the columns common to all advertisements.
CREATE TABLE Ads (
AdID int not null,
AdDate datetime not null,
AdCategory int not null,
AdHeading varchar(255) not null,
AdText varchar(255) not null,
etc...
);
I also have a lot of categories.
Ads that are posted in "cars" category, for example, have additional columns like make, model, color, etc. Ads, posted in "housing" have columns like housing type, sqft. etc...
I did something like:
CREATE TABLE Cars (
AdID int not null,
CarMake varchar (255) not null,
CarModel varchar(255) not null,
...
);
CREATE TABLE Housing (
AdID int not null,
HousingType varchar (255) not null
...
);
AdID in those tables is a foreign key to Ads.
But when I need to retrieve information from Ads, I have to look up all those additional tables and check whether AdID in Ads equals AdID in those tables.
For every category I need a new table. I'm going to end up with 15 tables or so.
I had an idea to have boolean columns in the Ads table like is_Cars, is_Housing, etc., but having 15 columns where 14 would be NULL seems horrible.
Is there any better way to design this database? I need my database to be in third normal form; this is the most important requirement.
Don't worry too much - it's a well known dilemma, there are no 'silver bullets' and all solutions have some trade-offs. Your solution sounds good to me, and is commonly used in the industry. On the down side it has JOINS as you mentioned (which is a well-known trade-off of normalization anyway), and also each new product type requires a new TABLE. On the up side the table structure precisely reflects your business logic, it's readable and efficient in storage.
Your other suggestion, as far as I understand, was a single table where each row has a "type" indication - car, house etc (btw no need for multiple columns such as 'is_car', 'is_house' - it's simpler to have a single column 'type', e.g. type=1 indicates car, type=2 indicates house etc). Then multiple columns where some of them are unused for some product types.
Here the advantage is the ability to add new types dynamically (even user-defined types) without changing the database schema. There are also no JOINs. On the down side, you'll be storing and retrieving lots of NULL cells, and the schema will be less descriptive: e.g. it's harder to add a constraint such as "the carModel column is not nullable", because it is nullable for houses (you can use triggers, but that is less readable).
Personally I prefer the 1st solution (depending on the use case, of course, but the 1st solution is my first instinct). And I can use it with some peace of mind after considering the trade-offs, e.g. understanding that I'm tolerating those JOINs as payment for a readable and compact schema.
One, you are confusing categories and product specifications.
Two, you need to read up on Table Inheritance.
If you don't mind nulls, use Single Table Inheritance. All "categories" (cars, houses, ...) go in one table and have a "type" column.
If you don't like nulls, use Class Table Inheritance. Make a master table with the primary keys that you point your category foreign key at. Make child tables for each type (cars, houses, ...) whose primary key is also a foreign key to the master table. This is easier with an ORM like Hibernate.
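A minimal sketch of Class Table Inheritance for the ads example, run through Python's sqlite3 module rather than the asker's database. The trimmed column lists and sample rows are assumptions; the point is that each child table's primary key is also its foreign key to Ads, and that one query with LEFT JOINs retrieves an ad together with whichever category-specific row exists.

```python
import sqlite3

# In-memory SQLite database as a stand-in (assumption); enable FK enforcement.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Ads (
    AdID INTEGER PRIMARY KEY,
    AdCategory INTEGER NOT NULL,
    AdHeading TEXT NOT NULL
);
-- Child tables: the primary key is also the foreign key to the master table.
CREATE TABLE Cars (
    AdID INTEGER PRIMARY KEY REFERENCES Ads(AdID),
    CarMake TEXT NOT NULL,
    CarModel TEXT NOT NULL
);
CREATE TABLE Housing (
    AdID INTEGER PRIMARY KEY REFERENCES Ads(AdID),
    HousingType TEXT NOT NULL
);
INSERT INTO Ads VALUES (1, 1, 'Used hatchback');
INSERT INTO Ads VALUES (2, 2, 'Two-bedroom flat');
INSERT INTO Cars VALUES (1, 'Honda', 'Civic');
INSERT INTO Housing VALUES (2, 'apartment');
""")

# One query returns every ad with its category-specific columns;
# columns from the child tables that do not apply come back as NULL (None).
rows = conn.execute("""
SELECT Ads.AdID, Ads.AdHeading, Cars.CarMake, Housing.HousingType
FROM Ads
LEFT JOIN Cars ON Cars.AdID = Ads.AdID
LEFT JOIN Housing ON Housing.AdID = Ads.AdID
ORDER BY Ads.AdID
""").fetchall()

for row in rows:
    print(row)
```

The NULLs appear only in the query result, not in storage, and each child table can still declare its own columns NOT NULL.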
I have a website that allows users to be different types. Each of these types can do specific things. I am asking if I should set up 1 table for ALL my users and store the types in an enum, or should I make different tables for each type. Now, if the only thing different was the type it would be easy for me to choose only using one table. However, here's a scenario.
The 4 users are A, B, C, D.
User A has data for:
name
email
User B has data for:
name
email
phone
User C has data for:
name
email
phone
about
User D has data for:
name
email
phone
about
address
If I were to create a single table, should I just leave different fields null for the different users? Or should I create a whole separate table for each user?
It would be much better to create a single table for all of them, even though some fields will be nullable, and add an extra column (an enum) for the type of user. If you keep your current design, you will have to use joins and unions to assemble the records, which adds extra overhead on the server.
CREATE TABLE users
(
ID INT,
name VARCHAR(50),
email VARCHAR(50),
phone VARCHAR(50),
about VARCHAR(50),
address VARCHAR(50),
userType ENUM('A', 'B', 'C', 'D') -- the user types from the question
)
Another suggested design is to create two tables, one for users and one for the types. The main advantage is that whenever you add another type of user, you don't have to alter the table; you only add a record to the user type table, which the users table then references.
CREATE TABLE UserType
(
ID INT PRIMARY KEY,
name VARCHAR(50)
);
CREATE TABLE users
(
ID INT,
name VARCHAR(50),
email VARCHAR(50),
phone VARCHAR(50),
about VARCHAR(50),
address VARCHAR(50),
TypeID INT,
CONSTRAINT rf_fk FOREIGN KEY (TypeID) REFERENCES UserType(ID)
)
Basic database design principles suggest one table for the common elements and additional tables, JOINed back to the base table, for the attributes that are unique to each type of user.
Your example suggests one and only one additional field per user type in a straightforward inheritance hierarchy. Is that really what the data looks like, or did you simplify for the example? If that's a true representation of your requirements, I might be tempted (for expedience) to use a single table. But if the real requirements are more complex, I'd bite the bullet and do it "correctly".
Try creating four tables:
Table 1: Name, email
Table 2: Name, phone
Table 3: Name, about
Table 4: Name, address
Name is your primary key on all four tables. There are no nulls in the database. You're not storing an enumerated type but derive the type from table joins:
To find all User A select all records in table 1 not in table 2
To find all User B select all records in table 2 not in table 3
To find all User C select all records in table 3 not in table 4
To find all User D select all records in table 4
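The "in table N but not in table N+1" lookups above are anti-joins, which can be written with NOT EXISTS. Here is a small sketch using Python's sqlite3 module; the table names user_email, user_phone, user_about and user_address are hypothetical labels for tables 1 to 4, and the sample rows are made up.

```python
import sqlite3

# In-memory SQLite database as a stand-in (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_email   (name TEXT PRIMARY KEY, email TEXT NOT NULL);   -- table 1
CREATE TABLE user_phone   (name TEXT PRIMARY KEY, phone TEXT NOT NULL);   -- table 2
CREATE TABLE user_about   (name TEXT PRIMARY KEY, about TEXT NOT NULL);   -- table 3
CREATE TABLE user_address (name TEXT PRIMARY KEY, address TEXT NOT NULL); -- table 4
INSERT INTO user_email VALUES ('alice', 'alice@example.com'), ('bob', 'bob@example.com');
INSERT INTO user_phone VALUES ('bob', '555-0100');
""")

# User A: has a row in table 1 but none in table 2.
type_a = [name for (name,) in conn.execute("""
SELECT e.name FROM user_email AS e
WHERE NOT EXISTS (SELECT 1 FROM user_phone AS p WHERE p.name = e.name)
ORDER BY e.name
""")]

# User B: has a row in table 2 but none in table 3.
type_b = [name for (name,) in conn.execute("""
SELECT p.name FROM user_phone AS p
WHERE NOT EXISTS (SELECT 1 FROM user_about AS a WHERE a.name = p.name)
ORDER BY p.name
""")]

print(type_a, type_b)
```

The type is never stored, so it cannot drift out of sync with the data, at the cost of an extra subquery whenever the type is needed.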
You should not create tables for different people because this will lead to a bloated database. It's best to create a single table with all the fields you need. If you don't use the field, pass in null values.
I would suggest that you use 1 single table with nullable fields. And a table of something like roles.
I'm building a website where I will have users logging into the site from multiple sources, including Facebook and Google+ and I want to be able to keep some basic info on each user in my data base, so that I can track the creation of things like comments and posts. How do I efficiently do this in a SQL database. Do I create a new table for each type of user?
You just need two tables, a user table and a usertype table. The user table would have a type column that links to the usertype table and tells you the type. It would be something basic like this:
User (
Id Int NOT NULL PRIMARY KEY,
UserName VarChar(50) NOT NULL,
EmailAddress VarChar(100),
{... More fields generally used by all account types ...}
UserTypeId Int NOT NULL
)
UserType (
Id Int NOT NULL PRIMARY KEY,
Type VarChar(50) NOT NULL
)
If you have information that is specific to each log on type like Google+ or Facebook, you could create a table for each specific log on type. However, the reality is that you probably will get the same set of basic information fields for all of the different types possible so there is not much to worry about.
The usual approach is:
A USER table with the common data
A FACEBOOK_USER table (which has its own PK and a USER FK) with the Facebook-specific data
A GOOGLE_USER table...
When loading a user, you can join all those tables, or you can create a view that contains the join. If you have many special types, you can instead load the user and then read the other tables individually (maybe keep an IS_x_USER flag in the USER table to speed this up).
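The view option can be sketched as follows, again with Python's sqlite3 module standing in for the real database. The table is named app_user here to avoid the reserved word USER in some databases, and the facebook_id and google_id columns are hypothetical examples of provider-specific data.

```python
import sqlite3

# In-memory SQLite database as a stand-in (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE app_user (id INTEGER PRIMARY KEY, user_name TEXT NOT NULL);
-- Each provider table has its own PK plus a FK to the common user row.
CREATE TABLE facebook_user (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL UNIQUE REFERENCES app_user(id),
    facebook_id TEXT NOT NULL          -- hypothetical provider-specific column
);
CREATE TABLE google_user (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL UNIQUE REFERENCES app_user(id),
    google_id TEXT NOT NULL            -- hypothetical provider-specific column
);
-- The view hides the joins from the code that loads a user.
CREATE VIEW user_with_logins AS
SELECT u.id, u.user_name, f.facebook_id, g.google_id
FROM app_user AS u
LEFT JOIN facebook_user AS f ON f.user_id = u.id
LEFT JOIN google_user AS g ON g.user_id = u.id;
INSERT INTO app_user VALUES (1, 'jen'), (2, 'john');
INSERT INTO facebook_user (user_id, facebook_id) VALUES (1, 'fb-123');
INSERT INTO google_user (user_id, google_id) VALUES (2, 'g-456');
""")

rows = conn.execute("SELECT * FROM user_with_logins ORDER BY id").fetchall()
for row in rows:
    print(row)
```

Comments and posts can then reference app_user.id alone, regardless of which provider the person logged in with.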
I want to make user group system that imitates group policy in instant messengers.
Each user can create as many groups as they want, but they cannot have groups with duplicate names, and they can put as many friends as they want into any group.
For example, John's friend Jen can be in John's 'school' group and John's 'coworker' group at the same time. And this is totally independent of how Jen puts John into her groups.
I'm thinking of two possible ways to implement this in a user_group database table.
1.
user_group (
id INT PRIMARY KEY AUTO_INCREMENT,
user_id INT,
group_name VARCHAR(30),
UNIQUE KEY (user_id, group_name)
)
In this case, all groups owned by all users will have a unique id. So, id alone can identify which user and the name of the group.
2.
user_group (
user_id INT,
group_id INT AUTO_INCREMENT,
group_name VARCHAR(30),
PRIMARY KEY (user_id, group_id),
UNIQUE KEY (user_id, group_name)
)
In this case, group_id always starts from 0 for each user, so many groups can share the same group_id. But the PK pair (user_id, group_id) is unique in the table.
Which way is the better implementation, and why?
What are the advantages and drawbacks of each?
EDIT:
Added AUTO_INCREMENT to group_id in the second scenario to ensure it is auto-assigned from 0 for each user_id.
EDIT:
'better' means...
- better performance when SELECTing/INSERTing/UPDATEing friends in a group, since those will be the most used operations on user groups.
- robustness of the database, i.e. which one copes better as the number of users grows.
- popularity or general preference of either one over another.
- flexibility
- extensibility
- usability - easier to use.
Personally, I would go with the 1st approach, but it really depends on how your application is going to work. If it would ever be possible for ownership of a group to change, or for user profiles to be merged, this will be much easier in your 1st approach than in the 2nd. In the 2nd approach, if either of those situations ever happens, you would have to update not only your user_group table but also any dependent tables that have a foreign key relation to user_group. This will also be a many-to-many relation (there will be multiple users in a group, and a user will be a member of multiple groups), so it will require a separate joining table. In the 1st approach, this is fairly straightforward:
group_member (
group_id int,
user_id int
)
For your 2nd approach, it would require a 3rd column, which will not only be more confusing, since you're now including a user ID twice, but will also need 50% more storage space per row, three integer columns instead of two (this may or may not be an issue depending on how large you expect your database to be):
group_member (
owner_id int,
group_id int,
user_id int
)
Also, if you ever plan to move from MySQL to another database platform, this behavior of auto_increment may not be supported. I know in MS SQL Server, an auto_increment field (identity in MSSQL) will always be incremented, not made unique according to indexes on the table, so to get the same functionality you would have to implement it yourself.
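To make the 1st approach concrete, here is a minimal sketch of user_group with its joining table, run through Python's sqlite3 module as a stand-in for MySQL. The NOT NULL constraints, the primary key on group_member, the create_group helper, and the numeric IDs for John and Jen are assumptions added for the example.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_group (
    id INTEGER PRIMARY KEY AUTOINCREMENT,  -- one ID per group across all users
    user_id INTEGER NOT NULL,              -- the owner of the group
    group_name TEXT NOT NULL,
    UNIQUE (user_id, group_name)           -- no duplicate names per owner
);
CREATE TABLE group_member (
    group_id INTEGER NOT NULL REFERENCES user_group(id),
    user_id INTEGER NOT NULL,              -- the friend placed in the group
    PRIMARY KEY (group_id, user_id)
);
""")

JOHN, JEN = 1, 2  # hypothetical user IDs


def create_group(owner_id, name):
    """Insert a group and return its auto-assigned ID."""
    cur = conn.execute(
        "INSERT INTO user_group (user_id, group_name) VALUES (?, ?)",
        (owner_id, name),
    )
    return cur.lastrowid


school = create_group(JOHN, "school")
coworker = create_group(JOHN, "coworker")
for group_id in (school, coworker):
    conn.execute(
        "INSERT INTO group_member (group_id, user_id) VALUES (?, ?)",
        (group_id, JEN),
    )

# A second 'school' group for John violates the UNIQUE constraint.
try:
    create_group(JOHN, "school")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Which of John's groups contain Jen?
groups_with_jen = [name for (name,) in conn.execute(
    """SELECT user_group.group_name
       FROM group_member
       JOIN user_group ON user_group.id = group_member.group_id
       WHERE user_group.user_id = ? AND group_member.user_id = ?
       ORDER BY user_group.id""",
    (JOHN, JEN),
)]

print(groups_with_jen, duplicate_rejected)
```

Because group_member refers to a group by a single ID, changing a group's owner is one UPDATE on user_group and leaves the membership rows untouched.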
Please define "better".
From my gut, I would pick the 2nd one.
The searchable pieces are broken down more, but that wouldn't be what I'd pick if insert/update performance is a concern.
I see no possible benefit to number 2 at all. It is more complex and more fragile (it would not work at all in SQL Server), and it gains nothing. Remember that the group ID has no meaning except to identify a record uniquely; the user will likely only see the group name, not the ID. So it doesn't matter whether the IDs all start from 0, or whether there are gaps because a group was rolled back or deleted.