mysql performance & index - mysql

I have 5 relation tables, each structured like
ID | FK_USER | FK_POST | DATE
Is it faster and more efficient to use a separate table for each type of relation, or to create just one table like
ID | FK_USER | FK_POST | TYPE | DATE
where TYPE is an ENUM, and I put an INDEX on TYPE?
Assume that I search for "subscription" (which is one of my relation types). Is it faster to use a separate table and search it, or to use the combined table and add "WHERE TYPE = 1" to the query string?

It is better to have a combined table with a WHERE clause on TYPE. There will not be much difference in performance whichever way you store the data. But with separate tables, every new relation type you add in the future requires creating another table, which adds to your data management headache.
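A minimal sketch of the combined table, using Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere. The table name, the index name, and the LIKE and BOOKMARK types are assumptions for illustration; only "subscription" with TYPE = 1 comes from the question. In MySQL the type column could be an ENUM instead of an integer.

```python
import sqlite3

# Hypothetical relation types; only SUBSCRIPTION = 1 is taken from the question.
SUBSCRIPTION, LIKE, BOOKMARK = 1, 2, 3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_post_relation (
        id      INTEGER PRIMARY KEY,
        fk_user INTEGER NOT NULL,
        fk_post INTEGER NOT NULL,
        type    INTEGER NOT NULL,
        date    TEXT    NOT NULL
    );
    -- type leads the index so that "WHERE type = ?" filters can use it
    CREATE INDEX idx_relation_type_user ON user_post_relation (type, fk_user);
""")

rows = [
    (1, 10, SUBSCRIPTION, "2024-01-01"),
    (1, 11, LIKE,         "2024-01-02"),
    (2, 10, SUBSCRIPTION, "2024-01-03"),
    (2, 12, BOOKMARK,     "2024-01-04"),  # a new type is just a new value, not a new table
]
conn.executemany(
    "INSERT INTO user_post_relation (fk_user, fk_post, type, date) VALUES (?, ?, ?, ?)",
    rows,
)

# Equivalent of querying a dedicated "subscription" table
subscriptions = conn.execute(
    "SELECT fk_user, fk_post FROM user_post_relation WHERE type = ? ORDER BY fk_user",
    (SUBSCRIPTION,),
).fetchall()
print(subscriptions)
```

Adding a sixth relation type here only means inserting rows with a new type value; no schema change is needed.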

Related

How to better build a database

We have an SQL database with a table (1) for users and a table (2) for users' saved information. Each piece of information is one row in table (2). So my question is the following: if we intend to grow the number of users to more than 1.000.000, and each user can have more than 10 pieces of information, which of the following is the better way to build our DB:
a) Having 2 tables - 1 for users and 1 for information from all users, related to users with ID
b) Having a separate table for each user.
Thanks in advance.
Definitely a single table for all users' information is much better. Think about it from the DB's perspective. In the first case you search a table of more than 1.000.000 rows by an indexed ID. In the second case you have to search through 1.000.000 tables just to find the right table. So go for option A.
I'm going to agree that option A is the better of the two options presented.
That being said, I would personally break up the information for the users into more tables as well. This would all be connected using foreign keys and will allow for more specific querying of the information.
SQL is not really horizontally scalable, so if some users end up with less or more information than others, you will have NULL columns, and these have to be dealt with in various ways.
By using separate tables, you can still have all of the information contained, but not have to worry if one user has a home and cell phone number, while another only has a cell number.
If and when you do need to access a lot of the information at once, SQL is very good at dealing with this through joins and the like.
Option B is not bad, it just does not fit SQL. It would work if the DB in question were document-based instead of table-based. In that case, creating a single document for each user is a good idea, and likely preferred.
Option C)
table for users with a unique UserID as Clustered Index (Primary Key)
table for Type of saved information with a unique InformationID as Clustered Index (Primary Key)
table for UserInformation with a unique UserInformationID as Clustered Index (Primary Key), a column for UserID (nonclustered index, foreign key to the Users table) and a column for InformationID (nonclustered index, foreign key to the Information table). Have a "Value" or similar column to hold the data being saved as it relates to the type of information.
Example:
Users Table
UserID | UserName
1 | UserName1
2 | UserName2
Information Table
InfoID | InfoName
1 | FavoriteColor
2 | FavoriteNumber
3 | Birthday
UserInformation Table
ID | UserID | InfoID | Value
1 | 1 | 1 | Blue
2 | 1 | 2 | 7
3 | 1 | 3 | '11/01/1999'
4 | 2 | 3 | '05/16/1960'
This method allows for you to save any combination of values for any user without recording any of the non-supplied user information. It keeps the information table 'clean' because you won't need to keep adding columns for each new piece of information you wish to track. Just add a new record to the Info table, and then record only the values submitted to the UserInformation table.
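The three tables of Option C could be sketched as follows, again with Python's sqlite3 standing in for the real database server. The snake_case table and column names, the index names, and the profile() helper are assumptions for illustration; the sample rows are the ones from the example above. SQLite has no clustered/nonclustered distinction, so plain primary keys and secondary indexes are used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to
conn.executescript("""
    CREATE TABLE users (
        user_id   INTEGER PRIMARY KEY,
        user_name TEXT NOT NULL
    );
    CREATE TABLE information (
        info_id   INTEGER PRIMARY KEY,
        info_name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE user_information (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users (user_id),
        info_id INTEGER NOT NULL REFERENCES information (info_id),
        value   TEXT NOT NULL
    );
    CREATE INDEX idx_ui_user ON user_information (user_id);
    CREATE INDEX idx_ui_info ON user_information (info_id);

    INSERT INTO users VALUES (1, 'UserName1'), (2, 'UserName2');
    INSERT INTO information VALUES
        (1, 'FavoriteColor'), (2, 'FavoriteNumber'), (3, 'Birthday');
    INSERT INTO user_information (user_id, info_id, value) VALUES
        (1, 1, 'Blue'), (1, 2, '7'), (1, 3, '11/01/1999'), (2, 3, '05/16/1960');
""")

def profile(user_id):
    """Hypothetical helper: return {info_name: value} for one user."""
    rows = conn.execute(
        """
        SELECT i.info_name, ui.value
        FROM user_information AS ui
        JOIN information AS i ON i.info_id = ui.info_id
        WHERE ui.user_id = ?
        ORDER BY i.info_id
        """,
        (user_id,),
    ).fetchall()
    return dict(rows)

print(profile(1))
print(profile(2))
```

Note that every value is stored as text in this layout, so typed checks (for example that Birthday is a valid date) would have to live in the application.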

Should I use a single table for many categorized rows?

I want to implement some user event tracking in my website for statistics etc.
I thought about creating a table called tracking_events that will contain the following fields:
| id (int, primary) |
| event_type (int) |
| user_id (int) |
| date_happened (timestamp)|
this table will contain a large amount of rows (let's assume at least every page view is a tracked event and there are 1,000 daily visitors to the site).
Is it a good practice to create this table with the event_type field to differentiate between essentially different, yet identically structured rows?
or will it be a better idea to make a separate table for each type? e.g.:
table pageview_events
| id (int, primary) |
| user_id (int) |
| date_happened (timestamp)|
table share_events
| id (int, primary) |
| user_id (int) |
| date_happened (timestamp)|
and so on for 5-10 tables.
(the main concern is performance when selecting rows WHERE event_type = ...)
Thanks.
It really depends. If you need to have them separated, because you will only ever query them separately, then splitting them into separate tables should be fine. That saves you from having to store an extra discriminator column.
BUT... if you need to query these sets together, as if they were a single table, it would be much easier to have them stored together, with a discriminator column.
As far as WHERE event_type=, if there are only two distinct values, with a pretty even distribution, then an index on just that column isn't going to help much. Including that column as the leading column in a multicolumn index(es) is probably the way to go, if a large number of your queries will include an equality predicate on that column.
Obviously, if these tables are going to be "large", then you'll want them indexed appropriately for your queries.
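As a rough illustration of the multicolumn-index advice, the sketch below puts event_type first in an index and asks the database which index it would use. It runs on Python's built-in sqlite3 rather than MySQL, so the plan output format differs from MySQL's EXPLAIN; the index name, the event type constants, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tracking_events (
        id            INTEGER PRIMARY KEY,
        event_type    INTEGER NOT NULL,
        user_id       INTEGER NOT NULL,
        date_happened TEXT    NOT NULL
    );
    -- event_type is the leading column, date_happened narrows within a type
    CREATE INDEX idx_events_type_date
        ON tracking_events (event_type, date_happened);
""")

PAGEVIEW, SHARE = 1, 2  # hypothetical event type codes
events = [
    (PAGEVIEW, 1, "2024-03-01 10:00:00"),
    (PAGEVIEW, 2, "2024-03-02 11:00:00"),
    (SHARE,    1, "2024-03-02 12:00:00"),
    (PAGEVIEW, 1, "2024-02-01 09:00:00"),
]
conn.executemany(
    "INSERT INTO tracking_events (event_type, user_id, date_happened) VALUES (?, ?, ?)",
    events,
)

query = """
    SELECT user_id, date_happened
    FROM tracking_events
    WHERE event_type = ? AND date_happened >= ?
    ORDER BY date_happened
"""
params = (PAGEVIEW, "2024-03-01 00:00:00")

recent_pageviews = conn.execute(query, params).fetchall()
print(recent_pageviews)

# The last column of each EXPLAIN QUERY PLAN row is the human-readable detail
plan = conn.execute("EXPLAIN QUERY PLAN " + query, params).fetchall()
plan_text = " ".join(row[-1] for row in plan)
print(plan_text)
```

The same idea applies to MySQL: an equality predicate on the leading column plus a range on the second column lets one index serve both the filter and the ordering.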

Foreign-Keys for multiple tables

I'm refactoring a DB structure and have a small problem.
This DB has various tables with the same structure, like:
People -> People_contacts
Activities -> Activities_contacts
Now, I want to create only one Contact table, and use an ENUM() to distinguish the nature of the parent (for search requirements and data reversibility).
The structure will be:
People -> Contacts[People]
Activities -> Contacts[Activities]
But now I need to add a foreign key that points to one of two different tables depending on the ENUM value...
How can I achieve this? Is there a way, or is it better to keep the old tables?
Why not use a view? If People_contacts and Activities_contacts have exactly the same structure, you can try this:
create view `test` as select *, 'People' as Type from `People_contacts` union select *, 'Activities' from `Activities_contacts`;
and then select what you want from the view:
select * from `test` where Type = 'People' and .....
and the query result should look like this:
+----+------+ +--------+
| ID | Data |...| Type |
+----+------+ +--------+
| 1 | foo |...| People |
| 2 | foo |...| People |
+----+------+ +--------+
You cannot declare a foreign key that points to one table or another depending on a field.
You can do a few things, but none of them is really clean:
1. Keep the integer field and the enum, but do not declare the field as a foreign key. You will have to implement all the logic yourself, and it will be harder to maintain and harder to decouple the database from the program code.
2. Have two nullable foreign keys (people_id and activity_id) and drop the enum field. If one FK is null, the other holds the real relation. This is better, since you declare the foreign keys as usual and the model is stronger.
3. If you prefer to keep your contact table clean, put this messy part in a separate relation table. In that table you store the contact_id and the id of the activity or the person, as explained in option 1 or 2.
But you are probably overcomplicating this, and you do not need the foreign key in the contact table at all. I would bet you will always access the people or the activities table first, so you can probably change those tables and add a contact_id foreign key to them. In the contact table you just need to add the id primary key, if you do not have it already, and delete the ENUM field and the foreign keys, since you do not really need them.
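The two-nullable-foreign-keys approach described above could look roughly like this. It is a sketch on Python's sqlite3, not the asker's real schema: the table layouts, the try_insert() helper, and the sample data are assumptions. The CHECK constraint that exactly one of the two keys is set is also an addition that the answer does not mention; MySQL enforces CHECK constraints only from version 8.0.16 on, so on older versions that rule would need a trigger or application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to
conn.executescript("""
    CREATE TABLE people     (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE activities (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE contacts (
        id          INTEGER PRIMARY KEY,
        people_id   INTEGER REFERENCES people (id),
        activity_id INTEGER REFERENCES activities (id),
        data        TEXT NOT NULL,
        -- assumption: a contact belongs to exactly one parent
        CHECK ((people_id IS NULL) <> (activity_id IS NULL))
    );
    INSERT INTO people VALUES (1, 'Alice');
    INSERT INTO activities VALUES (1, 'Chess club');
""")

def try_insert(people_id, activity_id, data):
    """Hypothetical helper: return True if the database accepts the row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO contacts (people_id, activity_id, data) VALUES (?, ?, ?)",
                (people_id, activity_id, data),
            )
        return True
    except sqlite3.IntegrityError:
        return False

print(try_insert(1, None, "alice@example.com"))  # contact of a person
print(try_insert(None, 1, "club@example.com"))   # contact of an activity
print(try_insert(1, 1, "both parents"))          # rejected by the CHECK
print(try_insert(None, None, "no parent"))       # rejected by the CHECK
print(try_insert(99, None, "unknown person"))    # rejected by the foreign key
```

Both foreign keys are ordinary declared constraints here, which is what makes this variant stronger than an undeclared integer plus an enum.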

how to deal with multiple values for a single field in table?

I have one table, users, in phpMyAdmin, which contains the fields below.
users:
uid | name | contact.no
There can be more than one contact number for a single user.
One way to solve this is to use one more table for contact numbers and link it to the users table by key.
Is there any way other than this one?
Can we implement an array structure in the contact.no field?
You could separate the numbers with commas and save several of them in one field, but that defeats the whole concept of an RDBMS and of normalization. It would not be a good database design. So it is advisable to normalize your table and not store multiple values in one field. The database is not really stressed by having one more table.
A very well written explanation can be found here on the Microsoft website.
You wouldn't have to create multiple tables for each type of entry, just a more robust table structure. Make sure that the information that needs to be normalized is in a consistent format.
Users:
uid | name | username
1 | Bob | bcratchet
Info:
iid | itype | icontent | uid
1 | cell | 000.000.0000 | 1
2 | home | 000.000.0000 | 1
3 | home_addr | 1234 Anystreet, anytown USA | 1
4 | work_addr | 4567 Anystreet, anytown USA | 1
select * from Users u join Info i on u.uid = i.uid where u.name = 'Bob';
Pull it into a multidimensional array in any application and you're good to go.
edit*
Ideally it would go further and show a table like itypes where you would further normalize the types like so:
itypes:
itype_id | itype
1 | cell
2 | home
3 | home_addr
4 | work_addr
Then in the Info table it would say "itype_id" instead of "itype."
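To make the "pull it into a multidimensional array" step concrete, here is a small sketch on Python's sqlite3 that joins the three normalized tables and groups the rows per user and per type. The second cell number (111.111.1111) is invented to show a user with more than one number of the same type, and the lowercase table names and the contacts variable are assumptions.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (uid INTEGER PRIMARY KEY, name TEXT, username TEXT);
    CREATE TABLE itypes (itype_id INTEGER PRIMARY KEY, itype TEXT UNIQUE);
    CREATE TABLE info (
        iid      INTEGER PRIMARY KEY,
        itype_id INTEGER NOT NULL REFERENCES itypes (itype_id),
        icontent TEXT    NOT NULL,
        uid      INTEGER NOT NULL REFERENCES users (uid)
    );
    INSERT INTO users VALUES (1, 'Bob', 'bcratchet');
    INSERT INTO itypes VALUES (1, 'cell'), (2, 'home'), (3, 'home_addr'), (4, 'work_addr');
    INSERT INTO info (itype_id, icontent, uid) VALUES
        (1, '000.000.0000', 1),
        (2, '000.000.0000', 1),
        (1, '111.111.1111', 1),                -- hypothetical second cell number
        (3, '1234 Anystreet, anytown USA', 1);
""")

rows = conn.execute(
    """
    SELECT u.name, t.itype, i.icontent
    FROM users AS u
    JOIN info   AS i ON i.uid = u.uid
    JOIN itypes AS t ON t.itype_id = i.itype_id
    WHERE u.name = ?
    ORDER BY i.iid
    """,
    ("Bob",),
).fetchall()

# Nested structure: contacts[user name][info type] -> list of values
contacts = defaultdict(lambda: defaultdict(list))
for name, itype, icontent in rows:
    contacts[name][itype].append(icontent)

print(dict(contacts["Bob"]))
```

Each extra phone number is just another row in info, so the users table never needs an array-like column.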

Database Design: need unique rows + relationships

Say I have the following table:
TABLE: product
============================================================
| product_id | name | invoice_price | msrp |
------------------------------------------------------------
| 1 | Widget 1 | 10.00 | 15.00 |
------------------------------------------------------------
| 2 | Widget 2 | 8.00 | 12.00 |
------------------------------------------------------------
In this model, product_id is the PK and is referenced by a number of other tables.
I have a requirement that each row be unique. In the example above, a row is defined to be the name, invoice_price, and msrp columns. (Different tables may have varying definitions of which columns define a "row".)
QUESTIONS:
In the example above, should I make name, invoice_price, and msrp a composite key to guarantee uniqueness of each row?
If the answer to #1 is "yes", this would mean that the current PK, product_id, would not be defined as a key; rather, it would be just an auto-incrementing column. Would that be enough for other tables to use to create relationships to specific rows in the product table?
Note that in some cases, the table may have 10 or more columns that need to be unique. That'll be a lot of columns defining a composite key! Is that a bad thing?
I'm trying to decide if I should try to enforce such uniqueness in the database tier or the application tier. I feel I should do this in the database level, but I am concerned that there may be unintended side effects of using a non-key as a FK or having so many columns define a composite key.
When you have a lot of columns that you need to create a unique key across, create your own "key" using the data from those columns as the source. This means computing the key in the application layer, while the database "enforces" the uniqueness. A simple method is to use the md5 hash of all the column values of the record as your unique key. Then you have just a single piece of data to use in relations.
md5 is not guaranteed to be unique, but it may be good enough for your needs.
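One way the hash-key idea might be wired up is sketched below with Python's hashlib and sqlite3. The row_md5 column, the row_hash() and add_product() helpers, the separator character, and storing prices as text (so the hashed representation is stable) are all assumptions, not part of the answer above.

```python
import hashlib
import sqlite3

def row_hash(*values):
    """Hypothetical helper: md5 over the columns that define a unique row."""
    # A separator keeps ("ab", "c") and ("a", "bc") from hashing to the same input
    joined = "\x1f".join(str(v) for v in values)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE product (
        product_id    INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        invoice_price TEXT NOT NULL,
        msrp          TEXT NOT NULL,
        row_md5       TEXT NOT NULL UNIQUE  -- the database enforces uniqueness
    )
""")

def add_product(name, invoice_price, msrp):
    """Hypothetical helper: return True if the row was new and got inserted."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO product (name, invoice_price, msrp, row_md5) "
                "VALUES (?, ?, ?, ?)",
                (name, invoice_price, msrp, row_hash(name, invoice_price, msrp)),
            )
        return True
    except sqlite3.IntegrityError:
        return False

print(add_product("Widget 1", "10.00", "15.00"))  # new row
print(add_product("Widget 1", "10.00", "15.00"))  # same values, rejected
print(add_product("Widget 1", "10.00", "16.00"))  # different msrp, accepted
```

The hash has to be recomputed whenever one of the source columns is updated, which is the main maintenance cost of this approach.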
First off, your intuition to do it in the DB layer is correct if you can do it easily. This means even if your application logic changes, your DB constraints are still valid, lowering the chance of bugs.
But, are you sure you want uniqueness on that? I could easily see the same widget having different prices, say for sale items or what not.
I would recommend against enforcing uniqueness unless there's a real reason to.
You might have something like this (obviously, don't use * in production code; the active and sale_price columns are assumed additions to the table):
# get the lowest price for an item that's currently active
select *
from product p
where p.name = 'widget 1' # a secondary index on product.name would be advised
and p.active
order by p.sale_price asc
limit 1;
You can define composite primary keys and also composite unique indexes. As long as your requirement is met, defining a composite unique key is not bad design. Clearly, the more columns you add, the slower it becomes to update and search the key, but if the business requirement calls for it, I don't think it is a negative, since database engines have highly optimized routines for this.
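A short sketch of that combination: product_id stays the primary key that other tables reference, while a composite UNIQUE constraint guards the row definition. It uses Python's sqlite3 in place of MySQL; the order_line table and the third product row are assumptions added to show the foreign key and the uniqueness rule in action.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to
conn.executescript("""
    CREATE TABLE product (
        product_id    INTEGER PRIMARY KEY,          -- still the key used in relations
        name          TEXT    NOT NULL,
        invoice_price NUMERIC NOT NULL,
        msrp          NUMERIC NOT NULL,
        UNIQUE (name, invoice_price, msrp)          -- composite uniqueness rule
    );
    CREATE TABLE order_line (                       -- hypothetical referencing table
        id         INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL REFERENCES product (product_id),
        quantity   INTEGER NOT NULL
    );
    INSERT INTO product (name, invoice_price, msrp) VALUES
        ('Widget 1', 10.00, 15.00),
        ('Widget 2',  8.00, 12.00);
    INSERT INTO order_line (product_id, quantity) VALUES (1, 3);
""")

# An exact duplicate of an existing row is rejected by the database
try:
    conn.execute(
        "INSERT INTO product (name, invoice_price, msrp) VALUES ('Widget 1', 10.00, 15.00)"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The same name with a different price is a different "row" and is accepted
conn.execute(
    "INSERT INTO product (name, invoice_price, msrp) VALUES ('Widget 1', 9.00, 15.00)"
)
product_count = conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]
print(duplicate_rejected, product_count)
```

Keeping the surrogate primary key means the referencing tables carry one small column instead of every column of the composite key.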