How to better build a database - mysql

We have a DB on SQL, where we have a table (1) for users and a table (2) for user's saved information. Each piece of information is one line in table (2). So my question is the following - If we are intending to grow number of users to more than 1.000.000 and each user can have more than 10 piece of information, which of the following is a better way to build our DB:
a) Having 2 tables - 1 for users and 1 for information from all users, related to users with ID
b) Having a separate table for each user.
Thanks in advance.

Definitely it should be having a single table for the user is much better. Think from the DB prospective. You are thinking about the search time in a 1.000.000 row for a sorted ID. In the second case you have to search 1.000.000 table to get into a right table. So better go for option A.

I'm going to agree that option A is the better of the two options presented.
That being said, I would personally break up the information for the users into more tables as well. This would all be connected using foreign keys and will allow for more specific querying of the information.
SQL is not really horizontally scalable, so if you end up with users with less or more information than others, then you'll have NULL columns and this requires dealing with in various ways.
By using separate tables, you can still have all of the information contained, but not have to worry if one user has a home and cell phone number, while another only has a cell number.
If and when you do need to access a lot of the information at once, SQL is very good at dealing with this through joins and the like.
Option B is not bad, it just does not fit SQL. I would work if the DB in question was document based instead of tables. In that case, creating a single document for each user is a good idea, and likely preferred.

Option C)
table for users with a unique UserID as Clustered Index (Primary Key)
table for Type of saved information with a unique InformationID as Clustered Index (Primary Key)
table for UserInformation with unique UserInformationID as Clustered Index (Primary Key), a column for UserID (nonclustered index, foreign key to user table) and a column for InformationID (nonclustered index, foreign key to Information table). Have a "Value" or similar column to hold the data being save as it relates to the type of information.
Example:
Users Table
UserID UserName
1 | UserName1
2 | UserName2
Information Table
InfoID InfoName
1 | FavoriteColor
2 | FavoriteNumber
3 | Birthday
UserInformation Table
ID UserID InfoID Value
1 | 1 | 1 | Blue
2 | 1 | 2 | 7
3 | 1 | 3 | '11/01/1999'
4 | 2 | 3 | '05/16/1960'
This method allows for you to save any combination of values for any user without recording any of the non-supplied user information. It keeps the information table 'clean' because you won't need to keep adding columns for each new piece of information you wish to track. Just add a new record to the Info table, and then record only the values submitted to the UserInformation table.

Related

How do I make a combination of values unique in a mySQL table no mater what order they are in?

I am creating a application that involves a friend system such as the one in facebook. The way I structured this in my SQL database is by having a friend table which has the columns ID, accountID1, accountID2 so that the each of the two accounts involved in the friendship is noted. The problem is that a friendship can be noted in two different ways for example:
ID | accountID1 | accountID2
1 | 1 | 2
2 | 2 | 1
If I make the combination unique it does not protect against this from occurring. How can I create a constraint in MySQL to prevent a friendship to be present in two different ways to ensure data integrity? or is there a different way of storing this information to prevent such problems in the first place?
The final solution I used is to first of all get rid of the ID for the friends table and make a composite primary key out of the two account ID's PrimaryKey(accountID0, accountID1). This ensures that the combination of them are unique. Then I created a "before Insert trigger" to switch the values so that the smaller accountID is always in accountID0. This method worked perfectly and made no problems so far.

how to deal with multiple values for a single field in table?

i have 1 table in phpmyadmin users which contains below fields.
users:
uid | name | contact.no
There can be more then one contact number for a single user.
One way to solve it is using one more table for contact number and pass its primary key to users table.
Is there any other way other then this one.
Can we implement array structure in contact.no field?
You could put commas over there and save multiple numbers but then it kills the whole concept of an RDMS and Normalization. That will not be a good database design. So it is advisable to normalize your table and not store such multiple information in one field. Database doesn't really stress itself if you have 1 more table.
A very well written explanation can be found Here on Microsoft Website
You wouldn't have to create multiple tables for each type of entry, just a more robust table structure. Make sure that the information that needs to be normalized is in a consistent format.
Users:
uid | name | username
1,Bob,bcratchet
Info:
iid | itype | icontent | uid
1,cell,000.000.0000
2,home,000.000.0000
3,home_addr,1234 Anystreet, anytown USA
4,work_addr,4567 Anystreet, anytown USA
select * from Users u,Info i where u.uid=i.uid and name="Bob";
Pull it into a multidimensional array in any application and you're good to go.
edit*
Ideally it would go further and show a table like itypes where you would further normalize the types like so:
itypes: itype_id | itype
1,cell
2,home
3,home_addr
4,work_addr
Then in the Info table it would say "itype_id" instead of "itype."

Does it make sense to have three primary keys, two of which are foreign keys, in one table?

I've created a database with three tables in it:
Restaurant
restaurant_id (autoincrement, PK)
Owner
owner_id (autoincrement, PK)
restaurant_id (FK to Restaurant)
Deal
deal_id (autoincrement)
owner_id (FK to Owner)
restaurant_id (FK to Restaurant)
(PK: deal_id, owner_id, restaurant_id)
There can be many owners for each restaurant. I chose two foreign keys for Deal so I can reference the deal by either the owner or the restaurant. The deal table would have three primary keys, two being foreign keys. And it would have two one-to-many relationships pointing to it. All of my foreign keys are primary keys and I don't know if I'll regret doing it like this later on down the road. Does this design make sense, and seem good for what I'm trying to achieve?
Edit: What I really need to be able to accomplish here is when a owner is logged in and viewing their account, I want them to be able to see and edit all the deals that are associated with that particular restaurant. And because there can be more that one owner per restaurant, I need to be able to perform a query something like: select *from deals where restaurant_id = restaurant_id. In other words, if I'm an owner and I'm logged in, I need to be able to make query: get all of the deal that are related to not just me, the owner, but to all of the owners associated with this restaurant.
You're having some trouble with terminology.
A table can only ever have a one primary key. It is not possible to create a table with two different primary keys. You can create a table with two different unique indexes (which are much like a primary key) but only one primary key can exist.
What you're asking about is whether you should have a composite or compound primary key; a primary key using more than one column.
Your design is okay, but as written you probably have no need for the column deal_id. It seems to me that restaurant_id and owner_id together are enough to uniquely identify a row in Deal. (This may not be true if one owner can have two different ownership stakes in a single restaurant as the result of recapitalization or buying out another owner, but you don't mention anything like that in your problem statement).
In this case, deal_id is largely wasted storage. There might be an argument to be made for using the deal_id column if you have many tables that have foreign keys pointing to Deal, or if you have instances in which you want to display to the user Deals for multiple restaurants and owners at the same time.
If one of those arguments sways you to adopt the deal_id column, then it, and only it, should be the primary key. There would be nothing added by including the other two columns since the autoincrement value itself would be unique.
If u have a unique field, this should be the PK, that would be the incremented field.
In this specific case it gives u nothing at all to add more fields to this key, it actually somewhat impacts performance (don't ask me how much, u bench it).
if you want to create 2 foreign keys in the deal table which are the restaurant and the owner the logic is something like a table could exist in the deal even without an owner or an owner could exist in the deal even without identifying the table on it but you could still identify the table because it's being used as a foreign key on the owner table, but if your going to put values on each columns that you defined as foreign key then I think it's going to be redundant cause I'm not sure how you would use the deal table later on but by it's name I think it speaks like it would be used to identify if a restaurant table is being reserved or not by a customer and to see how you have designed your database you could already identify the table which they have reserved even without specifying the table as foreign key in the deal table cause by the use of the owner table you would able to identify which table they have reserved already since you use it as foreign key on the owner table you just really have to be wise on defining relationships between your tables and avoid redundancy as much as possible. :)
I think it is not best.
First of all, the Deal table PK should be the deal_id. There is no reason to add additional columns to it--and if you did want to refer to the deal_id in another table, you'd have to include the restaurant_id and owner_id which is not good. Whether deal_id should also be the clustered index (a.k.a. index organized on this column) depends on the data access pattern. Will your database be full of data_id values most often used for lookup, or will you primarily be looking deals up by owner_id or restaurant_id?
Also, using two separate FKs way the you have described it (as far as I can tell!) would allow a deal to have an owner and restaurant combination that are not a valid (combining an owner that does not belong to that restaurant). In the Deal table, instead of one FK to Owner and one FK to Restaurant, if you must have both columns, there should be a composite FK to only the Owner table on (OwnerID, RestaurantID) with a corresponding unique key in the Owner table to allow this link up.
However, with such a simple table structure I don't really see the problem in leaving RestaurantID out of the Deal table, since the OwnerID always fully implies the RestaurantID. Obviously your deals cannot be linked only with the restaurant, because that would imply a 1:M relationship on Deal:Owner. The cost of searching based on Restaurant through the Owner table shouldn't really be that bad.
Its not wrong, it works. But, its not recommended.
Autoincrement Primary Keys works without Foreign Keys (or Master Keys)
In some databases, you cannot use several fields as a single primary key.
Compound Primary Keys or Compose Primary Keys are more difficult to handle in a query.
Compound Primary Key Query Example:
SELECT
D.*
FROM
Restaurant AS R,
Owner AS O,
Deal AS D
WHERE
(1=1) AND
(D.RestaurantKey = D.RestaurantKey) AND
(D.OwnerKey = D.OwnerKey)
Versus
Single Primary Key Query Example:
SELECT
D.*
FROM
Restaurant AS R,
Owner AS O,
Deal AS D
WHERE
(D.OwnerKey = O.OwnerKey)
Sometimes, you have to change the value of foreign key of a record, to another record. For Example, your customers already order, the deal record is registered, and they decide to change from one restaurant table to another. So, the data must be updated, in the "Owner", and "Deal" tables.
+-----------+-------------+
| OwnerKey | OwnerName |
+-----------+-------------+
| 1 | Anne Smith |
+-----------+-------------+
| 2 | John Connor |
+-----------+-------------+
| 3 | Mike Doe |
+-----------+-------------+
+-----------+-------------+-------------+
| OwnerKey | DealKey | Food |
+-----------+-------------+-------------+
| 1 | 1 | Hamburguer |
+-----------+-------------+-------------+
| 2 | 2 | Hot-Dog |
+-----------+-------------+-------------+
| 3 | 3 | Hamburguer |
+-----------+-------------+-------------+
| 1 | 3 | Soda |
+-----------+-------------+-------------+
| 2 | 1 | Apple Pie |
+-----------+-------------+-------------+
| 3 | 3 | Chips |
+-----------+-------------+-------------+
If you use compound primary keys, you have to create a new record for "Owner", and new records for "Deals", copy the other fields, and delete the previous records.
If you use single keys, you just have to change the foreign key of Table, without inserting or deleting new records.
Cheers.

Database table design - are my fields correct?

I need to sell items on my fictitious website and as such have come up with a couple of tables and was wondering if anyone could let me know if this is plausible and if not where i might be able to change things?
I am thinking along the lines of;
Products table : ID, Name, Cost, mediaType(FK)
Media: Id, Name(book, cd, dvd etc)
What is confusing me is that a user might have / own many products, but how would you store an array of product id's in a single column?
Thanks
You could something like store a JSON array in a text or varchar field and let the application handle parsing it.
MySQL doesn't have a native array type, unlike say PostgreSQL, but in general I find if you're trying to store an array you're probably doing something wrong. Of course every rule has its exceptions.
What your probably want is a user table and then a table that correlates products to users. If a product is only going to relate to one user then you can add a user ID column to your Products table. If not, then you'll want another lookup table which handles the many to many relationship. It would look something like this:
------------------------
| user_id | product_id |
------------------------
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 2 | 2 |
| 3 | 1 |
| 3 | 5 |
------------------------
I think one way of storing all the products which user has in one column is to store it as a string where product ids are separated by some delimiters like comma. Though this is not the way you want to solve. The best way to solve this problem would be to have a seperate user table and than have a user product table where you associate userid with product id. You could than simple use a simple query to get list of all the products owned by a particular userid
As a starting point, try to think of the system in terms of the major parts - you would have a 'warehouse', so you need a table to list the products you have, and you are going to possibly have users who register their details with you for regular visits - so an account per user. You would generally hold all details of a single product in the same row of the same table (unless you have a really complex product to detail, but not likely). If you're going to keep track of products bought per user account, there's always the option of keeping the order history as a delimited list in a large text field. For example: date,id,id,id,id;date,id,id. Or you could simply refer to order numbers and have a separate table for orders placed [by any customer].
What is confusing me is that a user might have / own many products, but how would you store an array of product id's in a single column?
This is called a "many-to-many" relationship. In essence you would have a table for users, a table for products, and a table to map them like this:
[table] Users
- id
- name
[table] Products
- id
- name
- price
[table] Users_Products
- user_id
- product_id
Then when you want to know what products a user has, you could perform a query like:
SELECT product_id FROM Users_Products WHERE user_id=23;
Of course, user id 23 is fictituous for examples sake. The resulting recordset would contain the id's of all the products the user owns.
You wouldn't store an array of things into a single column. In fact you usually wouldn't store them in separate columns either.
You need to step away from design for a bit and go investigate third normal form. That should be you starting point and, in the vast majority of cases, your ending point for designing database schemas.
The correct way of handling variable size "arrays" is with two tables with a many to one relationship, something like:
Users
User ID (primary key)
Name
Other user info
Objects:
Object Id (primary key)
User id (foreign key, references Users(User id)
Other object info
That's the simplest form where one object is tied to a specific user, but a specific user may have any number of objects.
Where an object can be "owned" by multiple users (I say an object meaning (for example) the book "Death of a Salesman", but obviously each user has their own copy of an object), you use a joining table:
Users
User ID (primary key)
Name
Other user info
Objects:
Object Id (primary key)
User id (foreign key, references Users(User id))
Other object info
UserObjects:
User id (foreign key, references Users(User id))
Object id (foreign key, references Objects(Object id))
Count
primary key (User id, Object id)
Similarly, you can handle one or more by adding an object id to the Users table.
But, until you've nutted out the simplest form and understand 3NF, they won't generally matter to you.

Composite primary keys or surrogates when dealing with date/time

I've seen a lot of discussion regarding this. I'm just seeking for your suggestions regarding this. Basically, what I'm using is PHP and MySQL. I have a users table which goes:
users
------------------------------
uid(pk) | username | password
------------------------------
12 | user1 | hashedpw
------------------------------
and another table which stores updates by the user
updates
--------------------------------------------
uid | date | content
--------------------------------------------
12 | 2011-11-17 08:21:01 | updated profile
12 | 2011-11-17 11:42:01 | created group
--------------------------------------------
The user's profile page will show the 5 most recent updates of a user. The questions are:
For the updates table, would it be possible to set both uid and date as composite primary keys with uid referencing uid from users
OR would it be better to just create another column in updates which auto-increments and will be used as the primary key (while uid will be FK to uid in users)?
Your idea (under 1.) rests on the assumption that a user can never do two "updates" within one second. That is very poor design. You never know what functions you will implement in the future, but chances are that some day 1 click leads to 2 actions and therefore 2 lines in this table.
I say "updates" quoted because I see this more as a logging table. And who knows what you may want to log somewhere in the future.
As for unusual primary keys: don't do it, it almost always comes right back in your face and you have to do a lot of work to add a proper autoincremented key afterwards.
It depends on the requirement but a third possibility is that you could make the key (uid, date, content). You could still add a surrogate key as well but in that case you would presumably want to implement both keys - a composite and a surrogate - not just one. Don't make the mistake of thinking you have to make an either/or choice.
Whether it is useful to add the surrogate or not depends on how it's being used - don't add a surrogate unless or until you need it. In any case uid I would assume to be a foreign key referencing the users table.