large databases - mysql

I have an online service (an online vocabulary trainer). Each user has their own vocabulary.
Now I'm not sure how I should structure my MySQL database.
As far as I know, I have these options:
1. Everything in one table (MyISAM): I store all the vocabulary in one large MyISAM table and add a column "userid" to identify each user's vocabulary.
2. Every user has their own table (MyISAM): every time a user is created, the program adds a table named like "vocabulary_{userid}", where {userid} connects the table to a user.
3. Everything in one table (InnoDB): like option 1, but with InnoDB instead of MyISAM.
The problem is that one large vocabulary table can reach up to 100 million rows. With MyISAM, the problem is that every query locks the whole table. So I imagine that if many users are online (and send many queries), the table might be locked a lot. And with InnoDB, I'm simply not sure whether this is a good solution, as I have quite a lot of SELECT, UPDATE, and INSERT commands.
I hope someone can help me. Thank you in advance.

It is almost always better to go with InnoDB. InnoDB can handle 100 million rows; its maximum tablespace size is 64 TB (with the default 16 KB page size).
It doesn't sound like you have a relational dataset, but more of a key/value store. Maybe Riak is a better solution.

It depends.
If you start with one table per user (a kind of manual sharding), you will have some trouble at the beginning.
If you don't need to scale right now, go for one table with good indexes. I would use InnoDB rather than MyISAM; otherwise you can get hit by MyISAM's biggest issue (table-level locks).

The normal relational design for this would, I think, use three tables:
Users — user ID, and other attributes: name, email, etc
Vocabulary — least clear from the question, but presumably words with attributes such as part of speech and maybe meaning, probably including a word ID (because some word spellings have multiple meanings).
User_Vocabulary — a table with a User ID, Word ID, and maybe attributes such as 'date learned'.
If MyISAM locks the table while a query is going on, then you can't afford to use MyISAM if you need concurrent updates to the User_Vocabulary table. So, go with InnoDB for all the tables.
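The three-table design above can be sketched roughly as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL/InnoDB so that it runs anywhere, and all table names, column names, and sample rows are assumptions, not taken from the question.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL/InnoDB here; every table name,
# column name and sample row below is an illustrative assumption.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE vocabulary (
    word_id INTEGER PRIMARY KEY,
    word    TEXT NOT NULL,
    meaning TEXT
);
-- Link table: one row per (user, word) pair.
CREATE TABLE user_vocabulary (
    user_id      INTEGER NOT NULL REFERENCES users(user_id),
    word_id      INTEGER NOT NULL REFERENCES vocabulary(word_id),
    date_learned TEXT,
    PRIMARY KEY (user_id, word_id)
);
""")

conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])
conn.executemany(
    "INSERT INTO vocabulary VALUES (?, ?, ?)",
    [(10, "Haus", "house"), (11, "Baum", "tree"), (12, "Bank", "bench")],
)
conn.executemany(
    "INSERT INTO user_vocabulary VALUES (?, ?, ?)",
    [(1, 10, "2024-01-05"), (1, 11, None), (2, 12, None)],
)


def words_for_user(user_id):
    """Return one user's words, resolved through the link table."""
    rows = conn.execute(
        """SELECT v.word
           FROM user_vocabulary AS uv
           JOIN vocabulary AS v ON v.word_id = uv.word_id
           WHERE uv.user_id = ?
           ORDER BY v.word""",
        (user_id,),
    ).fetchall()
    return [word for (word,) in rows]


print(words_for_user(1))
```

The composite primary key (user_id, word_id) on the link table also serves as the index for "all words of one user", which is the query the vocabulary trainer would presumably run most often.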

Related

one table or several table in mysql

I have a huge amount of data. What is the best way to store it in a database: in one table or in several tables?
If I want to save the data to several tables, I must create a table for every user.
I think a table per user is not a good idea. Please read the link below; it will help you design a better and more efficient database:
MySQL :: An Introduction to Database Normalization
http://ftp.nchu.edu.tw/MySQL/tech-resources/articles/intro-to-normalization.html
In a relational DBMS, table size doesn't dictate the database design; only the relations between entities do.
So even with a table of billions of records holding large multimedia data, you would not split it into several tables just because it gets so big.
Maybe using an RDBMS is a wrong approach for your task. Maybe a NoSQL dbms would be better. Even a simple file system could be the way to go.
Maybe however, an RDBMS is the right approach. We don't know, because you have told us almost nothing about your database. And what seems huge to you may be considered small by your dbms.
Are you saying that you would have a huge table holding data for all users, but every user will only be interested in their own data? Then simply partition the table by user. The database design will remain the same, only the underlying storage and internal data access will be different. As long as your queries always select one user's data, you will stay within one partition and data access will be fast.
Here is how to partition a table:
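-- MySQL requires every unique key (including the primary key) of a
-- partitioned table to contain the partitioning column, hence user_id here.
alter table user_movies
add primary key (user_id, movie_id)
partition by hash(user_id) partitions 100;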
There are many factors you should take into consideration before deciding. But how about this: try a NoSQL-type database, where each user is treated as a key and the information about that user as the value.

How to store the specific (polls eg.) data in a MySQL database?

Let's say I would like to store votes on polls in a MySQL database.
As far as I know I have two options:
1. Create one table (let's say votes) with fields like poll_id, user_id, selected_option_id, vote_date and so on.
2. Create a new database for votes (let's say votes_base) and for each poll add a table to this database, with the id of the poll in its name, let's say poll[id of the poll].
The problem with the first option is that the table will become big very soon. Let's say I have 1000 polls and each poll has 1000 votes: that's already a million records in the table. I don't know how much performance that will cost.
The problem with the second option is that I'm not sure it is the correct solution from a design point of view. But I'm sure that with this option it will be (much?) faster to find all votes for a given poll.
Or maybe there is a better option?
Your first option is the better option. It is structurally more sound. Millions of rows in a table are no problem for MySQL. A new table per poll is an antipattern.
EDIT for first comment:
Even a billion or more votes should be manageable for MySQL. Indexes are the key here. What is the difference between one database with the same table 100 times over and one table with 100 times the rows?
Technically, the second option works as well. Sometimes it might be even better. But we frequently see this:
Instead of one table, users, with 10 columns
Make 100 tables, users_uk, users_us, ... depending on where the users are from.
Great, no? Works, yes? Well it does, until you want to select all the male users, or join the users table onto another table. You'll have a huge UNION coming, and you won't even know the tables beforehand.
One big users table, with the appropriate indexes, is better. If it gets too big for your liking (or your disk), you can start with PARTITIONING: you still have the benefit of one table, but the partitions are stored on different locations.
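To make the UNION problem with per-country tables concrete, here is a rough sketch. It uses Python's sqlite3 as a stand-in for MySQL; the three country tables, their columns, and the sample rows are made up for illustration, and in MySQL the table discovery would go through information_schema rather than sqlite_master.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
countries = ["uk", "us", "de"]  # hypothetical; the answer imagines 100 of these

# Layout A: one table per country.
for country in countries:
    conn.execute(f"CREATE TABLE users_{country} (name TEXT, gender TEXT)")
conn.execute("INSERT INTO users_uk VALUES ('tom', 'm')")
conn.execute("INSERT INTO users_us VALUES ('ann', 'f')")
conn.execute("INSERT INTO users_de VALUES ('max', 'm')")

# Layout B: one table with a country column.
conn.execute("CREATE TABLE users (name TEXT, gender TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [("tom", "m", "uk"), ("ann", "f", "us"), ("max", "m", "de")],
)

# Layout A: the table list is not known in advance, so it has to be
# discovered first and the query glued together as a string.
all_tables = [name for (name,) in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
per_country = sorted(t for t in all_tables if t.startswith("users_"))
union_sql = " UNION ALL ".join(
    f"SELECT name FROM {t} WHERE gender = 'm'" for t in per_country)
males_a = sorted(name for (name,) in conn.execute(union_sql))

# Layout B: one fixed statement, regardless of how many countries exist.
males_b = sorted(name for (name,) in conn.execute(
    "SELECT name FROM users WHERE gender = 'm'"))

print(union_sql)
print(males_a, males_b)
```

Both layouts return the same users, but the first needs a generated statement that grows with every new table.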
Now, with your polls, these kinds of queries might not happen. In that case, one big InnoDB table or thousands of small tables might both work, but the first option is a lot easier to program and has no drawbacks compared with the second. Why choose the second option?
The first option is better, no doubt. Just be sure to define indexes on the fields you will use to search the data (poll_id, for sure) and you will not experience performance issues. MySQL is a DBMS perfectly capable of handling that number of rows. Do not worry.
The first option is better. And you can archive old rows after a while if you are not going to use them often.
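As a small illustration of the "one votes table plus an index on poll_id" advice, the sketch below checks the query plan before and after creating the index. It runs on Python's sqlite3, not MySQL, so the plan text differs from MySQL's EXPLAIN output; the index name idx_votes_poll and the generated sample data are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE votes (
    poll_id            INTEGER NOT NULL,
    user_id            INTEGER NOT NULL,
    selected_option_id INTEGER NOT NULL,
    vote_date          TEXT
)""")

# 100 polls with 100 votes each (made-up data).
conn.executemany(
    "INSERT INTO votes (poll_id, user_id, selected_option_id) VALUES (?, ?, ?)",
    [(poll, user, user % 4) for poll in range(100) for user in range(100)],
)

query = ("SELECT selected_option_id, COUNT(*) FROM votes "
         "WHERE poll_id = ? GROUP BY selected_option_id")


def plan(sql):
    """Return SQLite's query plan for sql as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (42,)).fetchall()
    return " | ".join(row[-1] for row in rows)


plan_before = plan(query)  # full table scan
conn.execute("CREATE INDEX idx_votes_poll ON votes (poll_id)")
plan_after = plan(query)   # lookup through the new index

results = sorted(conn.execute(query, (42,)).fetchall())
print(plan_before)
print(plan_after)
print(results)
```

With the index in place, the lookup for one poll no longer has to touch the rows of the other polls, which is why the total table size matters much less than it first appears.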

DataBase Design(Big Table)

What is Bigtable (i.e. Google's database design)? I have a requirement of that type, but I don't know how to design it.
In Bigtable, how do you maintain relations among tables?
Create all tables with the InnoDB storage engine, which maintains relationships.
Choose table fields according to, and limited to, your requirements.
The Bigtable paper published by Google may be hard to read. I hope my answer helps you start to understand it.
In the old days, an RDBMS stored data by rows, one record per row: 1, 2, 3, 4, 5, ...
If you want to find record 5, that's fine: the database seeks in a B+ tree (or something similar) to get the address of record 5 and loads it for you.
But the nightmare starts when you want all records whose column "user" equals "Michael": unless that column is indexed, the database has no choice but to check every record to see whether the user is "Michael".
Bigtable has a different way of storing data. It stores all the columns in an inverted table. When we want to find all the records that satisfy "user=Michael", it looks this up as a key via a B+ tree or hash table and gets the address of the inverted table entry that holds the list of all matching records.
A good starting point may be Lucene, an open source full-text search engine that fully implements these principles.
Note that an inverted table is not the same thing as column-based storage in an RDBMS. They are different; please remember this.
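The inverted-table lookup described above can be shown with a toy sketch in Python. It only demonstrates the idea of mapping a (column, value) pair to the list of matching record ids; the sample records are made up, and this is not how Bigtable or Lucene are actually implemented.

```python
from collections import defaultdict

# Made-up sample records, keyed by record id.
records = {
    1: {"user": "Michael", "city": "Berlin"},
    2: {"user": "Sarah", "city": "Paris"},
    3: {"user": "Michael", "city": "Paris"},
}


def build_inverted_index(records):
    """Map each (column, value) pair to the sorted list of record ids having it."""
    index = defaultdict(list)
    for record_id, row in sorted(records.items()):
        for column, value in row.items():
            index[(column, value)].append(record_id)
    return index


index = build_inverted_index(records)


def find(column, value):
    """Look up matching record ids with one hash lookup instead of a full scan."""
    return index.get((column, value), [])


print(find("user", "Michael"))
```

The scan over all records happens once, when the index is built; each later "user=Michael" style query is a single dictionary lookup.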

Multiple table or one single table?

I already saw a few forums with this question but they do not answer one thing I want to know. I'll explain first my topic:
I have a system where every action of multiple users is logged to the database (e.g. User1 logged in, User2 logged in, User1 entered User management, User2 changed password, etc.). So I would expect 100 to 200 entries per user per day. Right now, I'm doing it in a single table, and to view the log I just filter by UserID.
My question is, which is more efficient? Should I use one single table or create a table per user?
I am worried that if I use a single table, the system might have some difficulty filtering thousands of entries. I've read some pros and cons using multiple tables and a single table especially concerning updating the table(s).
I also want to know which one saves more space: multiple tables or a single table?
As long as you use indexes on the fields you're selecting on, you shouldn't have any speed problems (although indexes slow down writes, so too many are a bad thing). A table with a few thousand entries is nothing to MySQL (or any other database engine).
The overhead of creating thousands of tables is much worse -- say you want to make a change to the fields in your user table -- now you'd have to change thousands of tables.
A table we regularly search for a single record at work has about 150,000 rows, and because the field we search on is indexed, the search time is a very small fraction of a second.
If you're selecting those records without using the primary key, create an index on the field you use to select like this:
CREATE INDEX my_column_name ON my_table(my_column_name);
That's the most basic form. To learn more about it, check here
I would go with a single table. With an index on userId, you should be able to scale easily to millions of rows with little issue.
A table per user might be more efficient, but it's generally poor design. The problem with a table per user is it makes it difficult to answer other kinds of questions like "who was in user management yesterday?" or "how many people have changed their passwords?"
As for storage space used - I would say a table per user would probably use a little more space, but the difference between the two options should be quite small.
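Here is a small sketch of the single-table log with an index on the user column, including the cross-user questions mentioned above. It uses Python's sqlite3 in place of MySQL; the table name user_log, its columns, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE user_log (
    log_id    INTEGER PRIMARY KEY,
    user_id   INTEGER NOT NULL,
    action    TEXT NOT NULL,
    logged_at TEXT NOT NULL
)""")
conn.execute("CREATE INDEX idx_user_log_user ON user_log (user_id)")

conn.executemany(
    "INSERT INTO user_log (user_id, action, logged_at) VALUES (?, ?, ?)",
    [
        (1, "logged in", "2024-03-01 09:00"),
        (2, "logged in", "2024-03-01 09:05"),
        (1, "entered user management", "2024-03-01 09:10"),
        (2, "changed password", "2024-03-01 09:20"),
        (3, "changed password", "2024-03-02 08:00"),
    ],
)

# Per-user view: the usual filter on user_id.
user1_actions = [action for (action,) in conn.execute(
    "SELECT action FROM user_log WHERE user_id = ? ORDER BY logged_at", (1,))]

# Cross-user question: who was in user management on a given day?
in_user_mgmt = [uid for (uid,) in conn.execute(
    "SELECT DISTINCT user_id FROM user_log "
    "WHERE action = ? AND logged_at LIKE ?",
    ("entered user management", "2024-03-01%"))]

# Cross-user question: how many people have changed their passwords?
password_changers = conn.execute(
    "SELECT COUNT(DISTINCT user_id) FROM user_log WHERE action = ?",
    ("changed password",)).fetchone()[0]

print(user1_actions, in_user_mgmt, password_changers)
```

With one table per user, the last two queries would each have to visit every user's table.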
I would go with just one table. I certainly wouldn't want to create a new table every time a user is added to the system. The number of entries you mention per day is really not that much data.
Also, create an index on the user column of your table to improve query times.
Definitely a single table. Having tables created dynamically for entities that are created by the application does not scale. Also, you would need to build your queries with variable table names, which makes things difficult to debug and maintain.
If you have an index on the user id you use for filtering, it's not a big deal for a database to work through millions of rows.
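The point about variable table names can be seen in a short sketch: placeholders in prepared statements bind data values, not identifiers, so a per-user table name has to be spliced into the SQL text. The example uses Python's sqlite3 as a stand-in for MySQL, and the table names log_user_7 and user_log are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Layout A: one table per user (hypothetical naming scheme).
conn.execute("CREATE TABLE log_user_7 (action TEXT)")
conn.execute("INSERT INTO log_user_7 VALUES ('logged in')")

# Layout B: one shared table with a user_id column.
conn.execute("CREATE TABLE user_log (user_id INTEGER, action TEXT)")
conn.execute("INSERT INTO user_log VALUES (7, 'logged in')")

user_id = 7

# A placeholder cannot stand in for a table name; this fails to parse.
try:
    conn.execute("SELECT action FROM ?", (f"log_user_{user_id}",))
    placeholder_worked = True
except sqlite3.OperationalError:
    placeholder_worked = False

# So the name must be formatted into the statement by hand, and the
# application has to validate it itself (int() here guards against injection).
table_name = f"log_user_{int(user_id)}"
per_user_rows = conn.execute(f"SELECT action FROM {table_name}").fetchall()

# With a single table, one fixed statement serves every user.
single_table_rows = conn.execute(
    "SELECT action FROM user_log WHERE user_id = ?", (user_id,)).fetchall()

print(placeholder_worked, per_user_rows, single_table_rows)
```

The single-table statement is constant text, so it can be prepared once, logged, and debugged without knowing which user it ran for.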
Any database worth its salt will handle a single table containing all that user information without breaking a sweat. A single table is definitely the right way to do it.
If you used multiple tables, you'd need to create a new table every time a new user registered. You'd need to create a new statement object for each user you queried. It would be a complete mess.
I would go for the single table as well. You might want to go for multiple tables when you want to serve multiple customers, each with a different set of users (multi-tenancy).
Otherwise if you go for multiple tables, take a look at this refactoring tool: http://www.liquibase.org/. You can do schema modifications on the fly.
I guess that if you use proper indexing, the single-table solution can perform well enough (and the maintenance will be much simpler).
A single table keeps PHP prepared statements that take $_POST and $_GET input simple and efficient. I think that for small to medium platforms a single table will be fine. In summary, fewer tables are preferable to many.
Multiple tables will not cause much havoc either, but a single table is best.

New table for every user?

I want to create a new table for each new user on the web site, and I assume that there will be many users. I am sure that search performance will be good, but what about maintenance?
It is MySQL, which has no limit on the number of tables.
Thanks a lot.
Actually tables are stored in a table too. So in this case you would move searching in a table of users to searching in the system tables for a table.
Performance AND maintainability will suffer badly.
This is not a good idea:
The maximum number of tables is unlimited, but the table cache is finite in size and opening tables is expensive. In MyISAM, closing a table throws its key cache away. Performance will suck.
When you need to change the schema, you will need to do one ALTER TABLE per user, which will be an unnecessary pain.
Searching for things for no particular user will involve a horrible UNION query between all or many users' tables
It will be difficult to construct foreign key constraints correctly, as you won't have a single table with all the user ids in any more
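The schema-change point from the list above can be sketched as follows: adding one column costs one ALTER TABLE per user table, versus a single statement for a shared table. The example runs on Python's sqlite3 rather than MySQL, and the 50 users, the table names, and the added column are all made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
user_ids = range(1, 51)  # 50 hypothetical users

# Layout A: one table per user.
for uid in user_ids:
    conn.execute(f"CREATE TABLE data_user_{uid} (item TEXT)")

# Layout B: one shared table with a user_id column.
conn.execute("CREATE TABLE user_data (user_id INTEGER, item TEXT)")

# Schema change: add a "note" column.
per_user_statements = [
    f"ALTER TABLE data_user_{uid} ADD COLUMN note TEXT" for uid in user_ids
]
single_table_statements = ["ALTER TABLE user_data ADD COLUMN note TEXT"]

for statement in per_user_statements + single_table_statements:
    conn.execute(statement)


def columns(table):
    """Return the column names of a table, in definition order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


print(len(per_user_statements), len(single_table_statements))
print(columns("data_user_50"), columns("user_data"))
```

The number of statements in the first layout grows with the user count, and a failure halfway through leaves the tables with inconsistent schemas.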
Why are you sure that performance will be good? Have you tested it?
Why would you possibly want to do this? Just have one table for each thing that needs a table, and add a "user" column. Having a bunch of tables vs a bunch of rows isn't going to make your performance better.
To give you a direct answer to your question: maintenance will lower your enthusiasm at the same rate that new users sign up for your site.
Not sure what language or framework you are using for your web site, but at this stage it is best to look up some small examples in it. My guess is that in every example you find, every new user gets one record in a table, not a table in the database.
I would go with option 1 (a table called tasks with a user_id foreign key) in the short run, assuming that a task can't have more than one user. If it can, you'll need a join table. Look into setting up an actual foreign key as well; this enforces referential integrity in the data itself.
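A minimal sketch of the tasks table with a real foreign key, plus the optional join table for the many-users-per-task case. It uses Python's sqlite3 instead of MySQL (SQLite needs foreign keys switched on per connection, whereas InnoDB enforces them by default); the table task_users and all sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.executescript("""
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE tasks (
    task_id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(user_id),
    title   TEXT NOT NULL
);
-- Only needed if a task can belong to several users.
CREATE TABLE task_users (
    task_id INTEGER NOT NULL REFERENCES tasks(task_id),
    user_id INTEGER NOT NULL REFERENCES users(user_id),
    PRIMARY KEY (task_id, user_id)
);
""")

conn.execute("INSERT INTO users VALUES (1, 'alice')")
conn.execute("INSERT INTO tasks VALUES (100, 1, 'write report')")

# The foreign key rejects a task that points at a user who does not exist.
try:
    conn.execute("INSERT INTO tasks VALUES (101, 999, 'orphan task')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

task_count = conn.execute("SELECT COUNT(*) FROM tasks").fetchone()[0]
print(orphan_rejected, task_count)
```

This kind of constraint is only possible because all user ids live in one users table; with a table per user there would be nothing for the foreign key to reference.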