Different databases or tables sets for each user - mysql

I'll have to create database(s) to store very large amounts of data but being able to extract data fast enough using MySQL.
I was wondering if it will help if I create a new database or a new tables set for each user instead of using a single large database.
The only worry I have is that the users will be many but I hope that they will not use the project at the same time.
Does anyone have any experience with similar structures or any other advice to solve the problem?

use single database & single "User" table for all users...allocate theme unique id....
or if a single user have many datas.....
Example
if user_a have 10 books....
make a different table like "Book_table" with this structure.....
id , user_id , book_name
you can use this structure for multiple users......
Please explain more what user's details you have......i will try to give more accurate answer.....

Definitely use only one User table. If you need to search that table quickly you can index columns that you need fast searching on. For examplel you have a user table with first_name, last_name and email address. These are text fields so searching would be slow since it needs to scan the entire table and compare all the strings (note string comparison is much slower than integers). To get around this you can index columns like email address which is sort of like an ordered table (hidden away) which only has emails in it. Ordered tables are much easier to search so your searches would be fast.
Here is a good basic example of how to use indexing. Be careful with indexes though since they add overhead on your inserts. Some reading up on your part will really help here.
Anyway that is a really over simplified explanation of indexing.

Related

One large column or many small rows?

In a news feed app, our customer requires to store all the ids of articles that read by a user.
We decided to create a single table for this, but from performance point of view, which of the following is a better approach:
Have one row per user with two fields, a user_id and article_ids, then each time a user read an article, append the id to the article_ids text - using update and concat (we might end up with a huge data in one column).
Have many rows with two columns, user_id and article_id, then each time a user read an article, insert the article_id along with the user_id in as a new record (we might end up with too many rows).
Or if there is a better way, any suggestions are very welcome.
With the second approach, you can keep track of other things which your client might ask going forward.
First, open/read/visit time.
Count of a total number of open/read/visit.
Last open/read/visit time.
In this approach, you can apply the indexing on article_id later on if required.
Note: As #Arjan said in his answer, with proper indexing there is no such a thing as too many rows.
Many records, one for each user_id and article_id combination. That's much easier to update (just insert a row, no need to apply logic) and also allows you to get information about articles when you want to list which ones a user has read. You can use a join and retrieve the correct information from the database at once, instead of having to convert a string to ids and then go back to the database to get the additional data.
With proper indexes there's not really such a thing as too many rows.
Try to split them as much as possible. Your performance will be increased a lot if, because you just have to pick small pieces of your database. If you go for the first option, you have to split it after certain characters to get the information you want. First it is more challenging in programming and if a user has a bad internet connection, the application would be very slow.

Can I have one million tables in my database?

Would there be any advantages/disadvantages to having one million tables in my database.
I am trying to implement comments. So far, I can think of two ways to do this:
1. Have all comments from all posts in 1 table.
2. Have a separate table for each post and store all comments from that post in it's respective table.
Which one would be better?
Thanks
You're better off having one table for comments, with a field that identifies which post id each comment belongs to. It will be a lot easier to write queries to get comments for a given post id if you do this, as you won't first need to dynamically determine the name of the table you're looking in.
I can only speak for MySQL here (not sure how this works in Postgresql) but make sure you add an index on the post id field so the queries run quickly.
You can have a million tables but this might not be ideal for a number of reasons[*]. Classical RDBMS are typically deployed & optimised for storing millions/billions of rows in hundreds/thousands of tables.
As for the problem you're trying to solve, as others state, use foreign keys to relate a pair of tables: posts & comments a la [MySQL syntax]:
create table post(id integer primary key, post text);
create table comment(id integer primary key, postid integer , comment text, key fk (postid));
{you can add constraints to enforce referential integrity between comment and posts to avoid orphaned comments but this requires certain capabilities of the storage engine to be effective}
The generation of primary key IDs is left to the reader, but something as simple as auto increment might give you a quick start [http://dev.mysql.com/doc/refman/5.0/en/example-auto-increment.html].
Which is better?
Unless this is a homework assignment, storing this kind of material in a classic RDBMS might not fit with contemporary idioms. Keep the same spiritual schema and use something like SOLR/Elasticsearch to store your material and benefit from the content indexing since I trust that you'll want to avoid writing your own search engine? You can use something like sphinx [http://sphinxsearch.com] to index MySQL in an equal manner.
[*] Without some unconventional structuring of your schema, the amount of metadata and pressure on the underlying filesystem will be problematic (for example some dated/legacy storage engines, like MyISAM on MySQL will create three files per table).
When working with relational databases, you have to understand (a little bit about) normalization. The third normal form (3NF) is easy to understand and works in almost any case. A short tutorial can be found here. Use Google if need more/other/better examples.
One table per record is a red light, you know you're missing something. It also means you need dynamic DDL, you must create new tables when you have new records. This is also a security issue, the database user needs to many permissions and becomes a security risk.

Store multiple values with a common key but keep the table short in MYSQL?

I have three tables user, meeting, and meetingContent.
The users can select different rows from the meetingContent when they create a meeting.
I would like to create a table that records which rows from meetingContent were used for each meeting.
In a previous but similar version of this system there was a table called meeting_meetingContent that held a huge amount of rows as each case when used for each meeting was stored.
I'm not sure if this perhaps the most efficient way, but it seems it could be streamlined somewhat. Are there any altenatives?
Maybe you can try NoSQL like MongoDB.
Check here to see introduction.
in my opinion, maybe NoSQL would be better than SQL. What you do is just save meetings content directly into db, and shows by read content.

When is it a good idea to move columns off a main table into an auxiliary table?

Say I have a table like this:
create table users (
user_id int not null auto_increment,
username varchar,
joined_at datetime,
bio text,
favorite_color varchar,
favorite_band varchar
....
);
Say that over time, more and more columns -- like favorite_animal, favorite_city, etc. -- get added to this table.
Eventually, there are like 20 or more columns.
At this point, I'm feeling like I want to move columns to a separate
user_profiles table is so I can do select * from users without
returning a large number of usually irrelevant columns (like
favorite_color). And when I do need to query by favorite_color, I can just do
something like this:
select * from users inner join user_profiles using user_id where
user_profiles.favorite_color = 'red';
Is moving columns off the main table into an "auxiliary" table a good
idea?
Or is it better to keep all the columns in the users table, and always
be explicit about the columns I want to return? E.g.
select user_id, username, last_logged_in_at, etc. etc. from users;
What performance considerations are involved here?
Don't use an auxiliary table if it's going to contain a collection of miscellaneous fields with no conceptual cohesion.
Do use a separate table if you can come up with a good conceptual grouping of a number of fields e.g. an Address table.
Of course, your application has its own performance and normalisation needs, and you should only apply this advice with proper respect to your own situation.
I would say that the best option is to have properly normalized tables, and also to only ask for the columns you need.
A user profile table might not be a bad idea, if it is structured well to provide data integrity and simple enhancement/modification later. Only you can truly know your requirements.
One thing that no one else has mentioned is that it is often a good idea to have an auxiliary table if the row size of the main table would get too large. Read about the row size limits of your specific databases in the documentation. There are often performance benefits to having tables that are less wide and moving the fields you don't use as often off to a separate table. If you choose to create an auxiliarary table with a one-to-one relationship make sure to set up the PK/FK relationship to maintain data integrity and set a unique index or constraint on the FK field to mainatin the one-to-one relationship.
And to go along with everyone else, I cannot stress too strongly how bad it is to ever use select * in production queries. You save a few seconds of development time and create a performance problem as well as make the application less maintainable (yes less - as you should not willy nilly return things you may not want to show on the application but you need in the database. You will break insert statements that use selects and show users things you don't want them to see when you use select *.).
Try not to get in the habit of using SELECT * FROM ... If your application becomes large, and you query the users table for different things in different parts of your application, then when you do add favorite_animal you are more likely to break some spot that uses SELECT *. Or at the least, that place is now getting unused fields that slows it down.
Select the data you need specifically. It self-documents to the next person exactly what you're trying to do with that code.
Don't de-normalize unless you have good reason to.
Adding a favorite column ever other day every time a user has a new favorite is a maintenance headache at best. I would highly consider creating a table to hold a favorites value in your case. I'm pretty sure I wouldn't just keep adding a new column all the time.
The general guideline that applies to this (called normalization) is that tables are grouped by distinct entities/objects/concepts and that each column(field) in that table should describe some aspect of that entity
In your example, it seems that favorite_color describes (or belongs to) the user. Some times it is a good idea to moved data to a second table: when it becomes clear that that data actually describes a second entity. For example: You start your database collecting user_id, name, email, and zip_code. Then at some point in time, the ceo decides he would also like to collect the street_address. At this point a new entity has been formed, and you could conceptually view your data as two tables:
user: userid, name, email
address: steetaddress, city, state, zip, userid(as a foreign key)
So, to sum it up: the real challenge is to decide what data describes the main entity of the table, and what, if any, other entity exists.
Here is a great example of normalization that helped me understand it better
When there is no other reason (e.g. there are normal forms for databases) you should not do it. You dont save any space, as the data must still stored, instead you waste more as you need another index to access them.
It is always better (though may require more maintenance if schemas change) to fetch only the columns you need.
This will result in lower memory usage by both MySQL and your client application, and reduced query times as the amount of data transferred is reduced. You'll see a benefit whether this is over a network or not.
Here's a rule of thumb: if adding a column to an existing table would require making it nullable (after data has been migrated etc) then instead create a new table with all NOT NULL columns (with a foreign key reference to the original table, of course).
You should not rely on using SELECT * for a variety of reasons (google it).

Multiple table or one single table?

I already saw a few forums with this question but they do not answer one thing I want to know. I'll explain first my topic:
I have a system where each log of multiple users are entered to the database (ex. User1 logged in, User2 logged in, User1 entered User management, User2 changed password, etc). So I would be expecting 100 to 200 entries per user per day. Right now, I'm doing it in a single table and to view it, I just have to filter out using UserID.
My question is, which is more efficient? Should I use one single table or create a table per user?
I am worried that if I use a single table, the system might have some difficulty filtering thousands of entries. I've read some pros and cons using multiple tables and a single table especially concerning updating the table(s).
I also want to know which one saves more space? multiple table or single table?
As long as you use indexes on the fields you're selecting from, you shouldn't have any speed problems (although indexes slow writes, so too many are a bad thing). A table with a few thousand entries is nothing to mySQL (or any other database engine).
The overhead of creating thousands of tables is much worse -- say you want to make a change to the fields in your user table -- now you'd have to change thousands of tables.
A table we regularly search against for a single record # work has about 150,000 rows, and because the field we search for is indexed, the search time is in very small fractions of a second.
If you're selecting those records without using the primary key, create an index on the field you use to select like this:
CREATE INDEX my_column_name ON my_table(my_column_name);
Thats the most basic form. To learn more about it, check here
I would go with a single table. With an index on userId, you should be able to scale easily to millions of rows with little issue.
A table per user might be more efficient, but it's generally poor design. The problem with a table per user is it makes it difficult to answer other kinds of questions like "who was in user management yesterday?" or "how many people have changed their passwords?"
As for storage space used - I would say a table per user would probably use a little more space, but the difference between the two options should be quite small.
I would go with just 1 table. I certainly wouldn't want to create a new table every time a user is added to the system. The number of entries you mention for each day really is really not that much data.
Also, create an index on the user column of your table to improve query times.
Definitely a single table. Having tables created dynamically for entities that are created by the application does not scale. Also, you would need to create your queries with variable tables names, something which makes things difficult to debug and maintain.
If you have an index on the user id you use for filtering it's not a big deal for a db to work through millions of lines.
Any database worth its salt will handle a single table containing all that user information without breaking a sweat. A single table is definitely the right way to do it.
If you used multiple tables, you'd need to create a new table every time a new user registered. You'd need to create a new statement object for each user you queried. It would be a complete mess.
I would go for the single table as well. You might want to go for multiple tables, when you want to server multiple customers with different set of users (multi tenancy).
Otherwise if you go for multiple tables, take a look at this refactoring tool: http://www.liquibase.org/. You can do schema modifications on the fly.
I guess, if you are using i.e. proper indexing, then the single table solution can perform well enough (and the maintenance will be much more simple).
Single table brings efficiency in $_POST and $_GET prepared statements of PHP. I think, for small to medium platforms, single table will be fine. Summary, few tables to many tables will be ideal.
However, multiple tables will not cause any much havoc as well. But, the best is on a single table.