One table per database - MySQL

I have the following MySQL setup:
A database, player_db, that contains one table named player. player has three columns: id, alias & score.
A second database, match_db, that contains one table named match. match has six columns, including: id, player1, player2 & outcome.
I've used two databases to be able to put them on dedicated servers should the need arise. I don't really like the fact that my tables are named the same as their databases. Have I missed something?
To me, it seems that one table per database must be a very common use case. So common that there should be something like a 'default table', but I've not found that concept. So maybe I've designed my system incorrectly?

I don't think using one table per database is as common as you think.
This will make it more difficult to join tables and do more complicated data operations.
I think a better solution would be to use only one database and scale when necessary. There are also other ways to scale: using indexes, better hardware, and load balancing.
I think splitting the data up like this will actually worsen your program's performance: you now have to connect to two servers.
If you are worried about disk space, it is pretty cheap these days. If you are worried about concurrency and availability, you'd probably want the entire database on multiple servers.
Neither MySQL nor any other SQL database has a default table, as far as I know.
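For illustration, a minimal single-database sketch using the columns from the question (column types are assumptions; note that MATCH is a reserved word in MySQL, so the table name needs backticks):

CREATE TABLE player (
  id    INT AUTO_INCREMENT PRIMARY KEY,
  alias VARCHAR(64),
  score INT
);

CREATE TABLE `match` (
  id      INT AUTO_INCREMENT PRIMARY KEY,
  player1 INT,
  player2 INT,
  outcome VARCHAR(16),
  -- both participants reference the player table
  FOREIGN KEY (player1) REFERENCES player(id),
  FOREIGN KEY (player2) REFERENCES player(id)
);

-- A join like this stays trivial in one database, but becomes awkward
-- once the two tables live on separate servers:
SELECT p.alias, m.outcome
FROM `match` m
JOIN player p ON p.id = m.player1;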

This is more of a question for https://softwareengineering.stackexchange.com/ than it is for Stack Overflow, but I will tell you this: one table per database is not common, nor is it good design. I have to imagine that whatever your circumstance, you can avoid having to use a separate database for each table.
More to your question: no, I don't believe MySQL has a default-table concept.

Related

Are SQL views the proper solution

I have a table named 'Customers' which holds all the information about users of different types (users, drivers, admins). I cannot split this table up right now because it's in production and this is not the proper time to do it.
So I would make three views: the first has only users, the second drivers, and the third admins.
My goal is to use three models instead of one in the project I'm working on.
Is this a good solution, and what does it cost in performance?
How big is your table 'Customers'? Judging by the name, it doesn't sound like a heavy one.
How often will these views be queried?
Do you have indexes or PK constraints on the attribute you're going to use in the WHERE clause of the views?
I cannot separate this table right now because it's working on
production and this is not a proper time to do this.
From what you said it sounds like a temporary solution, so it's probably a good one. Later you can replace the views with three tables and it will not affect the interface.
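A minimal sketch of the three views, assuming Customers has a type column holding those values (view names are illustrative):

CREATE VIEW users_view   AS SELECT * FROM Customers WHERE type = 'user';
CREATE VIEW drivers_view AS SELECT * FROM Customers WHERE type = 'driver';
CREATE VIEW admins_view  AS SELECT * FROM Customers WHERE type = 'admin';

Simple single-table views like these are processed with MySQL's MERGE algorithm, so querying one costs essentially the same as writing the WHERE clause by hand.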
I suggest that it is improper to give end-users logins directly into the database. Instead, all requests should go through a database-access layer (API) that users must log into. This layer can provide the filtering you require without (perhaps) any impact to the user. The layer would, while constructing the needed SELECT, tack on, for example, AND type = 'admin' to achieve the goal.
For performance, you might also need to have type at the beginning of some of the INDEXes.
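For example, if queries against the views also filter on a (hypothetical) name column, an index with type at the front serves them:

ALTER TABLE Customers ADD INDEX idx_type_name (type, name);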

Database design to create tables on the fly

I need to create dynamic tables in the database on the fly. For example, in the database I will have tables named:
Table
Column
DataType
TextData
NumberData
DateTimedata
BitData
Here I can add a table in the Table table, then add all of that table's columns in the Column table and associate a datatype with each column.
Basically I want to create tables without actually creating tables in the database. Is this even possible? If so, can you direct me to the right place so I can research? Also, I would prefer SQL Server or any free database software.
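For reference, a sketch of the metadata tables being described (TABLE and COLUMN are reserved words in MySQL, hence the backticks; the column layouts are assumptions):

CREATE TABLE `Table` (
  id   INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(64) NOT NULL             -- name of the logical table
);

CREATE TABLE `Column` (
  id          INT AUTO_INCREMENT PRIMARY KEY,
  table_id    INT NOT NULL,             -- which logical table owns this column
  name        VARCHAR(64) NOT NULL,
  datatype_id INT NOT NULL              -- references the DataType table
);

CREATE TABLE TextData (
  column_id INT NOT NULL,               -- which logical column this value fills
  row_id    INT NOT NULL,               -- which logical row it belongs to
  value     TEXT
);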
Thanks
What you are describing is an entity-attribute-value model (EAV). It is a very poor way to design a data model.
Although the data model is quite flexible, querying it is quite complicated. You frequently end up having to self-join a table n times if you want to select or filter on n different attributes. That gets slow, and becomes hard to optimize, rather quickly.
Plus, you generally end up building a lot of functionality that the database or your ORM would provide.
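To see why, here is what filtering on just two attributes looks like against a TextData table like the one sketched above (the column_id values are hypothetical):

SELECT color.row_id
FROM TextData AS color
JOIN TextData AS size ON size.row_id = color.row_id   -- one self-join per attribute
WHERE color.column_id = 1 AND color.value = 'red'
  AND size.column_id  = 2 AND size.value  = 'large';

Every additional attribute you filter on adds another self-join.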
I'm not sure what the real problem you're having is, but the solution you proposed is the "database within a database" antipattern which makes so many people cringe.
Depending on how you're querying your data, if you were to structure things like you're planning, you'd either need a bunch of piece-wise queries which are joined in the middleware (slow) or one monster monolithic query (either slow or creates massive index bloat), if one is even possible.
If you must create tables on the fly, learn the CREATE TABLE, ALTER TABLE, and DROP TABLE DDL statements for the particular database engine you're using. Better yet, find an ORM that will do this for you. If your real problem is that you need to store unstructured data, check out MongoDB, Redis, or some of the other NoSQL variants.
My final advice is to write up the actual problem you're trying to solve as a separate question, and you'll probably learn a lot more.
Doing this with documents might be easier. Perhaps you should look at a NoSQL solution such as MongoDB.
Or you can still create the temporary tables, but use a cron job to create them every %% hours and rename them to the correct names after the queries are done, so your site stays up.
What you are trying to achieve is not bad, but you must use it in the correct, logical way.
*sorry for my bad english
I did something like this in LedgerSMB. While we use EAV modelling for a few things (where the flexibility is needed and the sort of querying we are doing is straight-forward, for example menu nodes use this in part), in general, you want to stay away from this as much as possible.
A better approach is to do all of what you are doing except for the data columns. Then you can (shock of shocks) just create the tables. This gives you a catalog of what you have added so your app knows this (and you can diff from the system catalogs if you ever have to check!) but at the same time you get actual relational modelling.
What we did in LedgerSMB was to have stored procedures that would check whether a table named 'extends_' || (supplied name) exists. If so, they would add a column with the required datatype and write this to the application catalogs. This gives us relational modelling of extended attributes. At load time, the application loads the application catalogs and writes queries as appropriate, at the appropriate points, to load/save the data. It works pretty well, actually.
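A rough sketch of that idea with hypothetical names (the actual LedgerSMB code differs):

-- application catalog recording which extensions exist
CREATE TABLE app_extension_catalog (
  base_table  VARCHAR(64) NOT NULL,
  column_name VARCHAR(64) NOT NULL,
  datatype    VARCHAR(32) NOT NULL
);

-- instead of generic value tables, extend a real table...
ALTER TABLE extends_customer ADD COLUMN loyalty_tier VARCHAR(16);
-- ...and record the addition so the application can build its queries
INSERT INTO app_extension_catalog
VALUES ('extends_customer', 'loyalty_tier', 'VARCHAR(16)');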

Basic database design and complexity

I am designing a system which has a database for storing users and information related to those users. More specifically, each user in the table has very little information: something like Name, Password, uid.
Then each user has zero or more containers, and the way I've initially done this is to create a second table in the database which holds containers and have a field referencing the user owning it. So something like containerName, content, owner.
So a query on data from a container would look something like:
SELECT content
FROM containers
WHERE (containerName='someContainer' AND owner='someOwner');
My question is whether this is a good way to do it. I am thinking about scalability: say we have thousands of users with, say, 5 containers each (each user could have a different number of containers, but 5 would probably be the typical case). My concern is that searching through the database will become slow when a query only ever wants 5 entries out of 5*1000. (We may typically only want a specific container's content, so we are looking into the database with an overhead of basically 4,995 entries, am I right?) And what happens if I sign up a million users? It would become a huge table, which just intuitively feels like a bad idea.
A second take I had on it would be to have one table per user; however, that doesn't feel like a very good solution either, since it would give me 1000 tables in the database, which (also by intuition) seems like a bad way to do it.
Any help in understanding how to design this would be greatly appreciated, I hope it's all clear and easy to follow.
The accepted way of handling this is by creating an INDEX on the owner field. That way, MySQL can optimize queries with owner = 'some value' conditions.
See also: http://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html
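For this particular query, a composite index covering both columns in the WHERE clause works well (a sketch):

CREATE INDEX idx_owner_container ON containers (owner, containerName);

With that index in place, MySQL jumps straight to the matching rows instead of scanning all 5*1000 entries.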
You're right in saying that 1000 tables is not scalable. Once you start reaching a few million records you might want to consider sharding (splitting up records into several locations based on user attributes)... but by that time you'd already be quite successful, I think ;-)
If it is an RDBMS (like Oracle or MySQL) database, you can create indexes on columns that are frequently queried to optimize the table traversal and the query. Indexes are automatically created for PRIMARY KEYs and (optionally) for FOREIGN KEYs.

Vertical partitioning of tables in MySQL

Another question.
Is it better to vertically partition a wide table (in my instance I am thinking about splitting the user's login details from the address, personal, etc. details) at the design stage, or better to leave it be and partition it later, after having some data and doing profiling?
The answer seems obvious, but I am concerned that splitting the table sometime down the line will mean additional work to rewrite the user model. It also seems reasonable to split often-accessed login details from the more static personal details.
Does anyone have some experience-backed advice on how to proceed? :) Thanks in advance.
Premature optimization is...
Splitting columns off to a different table has drawbacks:
Some operations that would require a single query now require two queries or a join
It's not trivial to enforce that every row in each table has a corresponding row in the other, so you might face integrity problems
On the other hand, it's dubious at best that doing it will improve performance. Unless you can prove it beforehand (and creating a 10-million-record table with random data and running some queries against it is trivial), I wouldn't do it. Doug Kress's suggestions of encapsulation and avoiding SELECT * are the right way.
The only reason to do it is if your single table design is not normalized and normalization implies breaking up the table.
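If you do split later, the usual way to contain the integrity problem mentioned above is a shared primary key, so each login row has at most one profile row (a sketch with assumed columns):

CREATE TABLE user_login (
  user_id       INT PRIMARY KEY,
  email         VARCHAR(255) NOT NULL,
  password_hash CHAR(60) NOT NULL
);

CREATE TABLE user_profile (
  user_id   INT PRIMARY KEY,            -- same key: enforces at most 1:1
  full_name VARCHAR(100),
  address   VARCHAR(255),
  FOREIGN KEY (user_id) REFERENCES user_login(user_id)
);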
I believe it would be better to keep it as a single table, but encapsulate your access to the data as much as possible, so that it would be easy to refactor later.
When you do access the data, be sure to only gather the information you need in the query (avoid 'SELECT *').
Having said that, be sure that the data saved with the table is normalized appropriately. You may find that you want to store multiple addresses for a user, for instance - in which case you should put it in a separate table.
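For instance (a sketch, assuming a users table with an integer primary key):

CREATE TABLE address (
  id      INT AUTO_INCREMENT PRIMARY KEY,
  user_id INT NOT NULL,                 -- the owning user
  line1   VARCHAR(255),
  city    VARCHAR(100),
  FOREIGN KEY (user_id) REFERENCES users(id)
);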

MySQL: Many tables or many databases?

For a project we have a bunch of data that always has the same structure and is not linked together.
There are two approaches to save the data:
Creating a new database for every pool (about 15-25 tables)
Creating all the tables in one database and distinguishing the pools by table name.
Which one is easier and faster for MySQL to handle?
EDIT: I am not interested in issues of database design; I am just interested in which of the two possibilities is faster.
EDIT 2: I will try to make it clearer. As said, we will have data where the data in different pools rarely belongs together. Putting all the data of one type in one table and linking it with a pool id is not a good idea:
It is hard to back up/delete a specific pool (and we expect that we would run out of primary keys after a while, even when using BIGINT)
So the idea is to make a database for every pool or create a lot of tables in one database. 50% of the queries against the database will be simple inserts. 49% will be some simple selects on a primary key.
The question is, what is faster to handle for MySQL? Many tables or many databases?
There should be no significant performance difference between multiple tables in a single database versus multiple tables in separate databases.
In MySQL, databases (standard SQL uses the term "schema" for this) serve chiefly as namespaces for tables. A database has only a few attributes of its own, e.g. the default character set and collation. The use of GRANT makes it convenient to control access privileges per database, but that has nothing to do with performance.
You can access tables in any database from a single connection (provided they are managed by the same instance of MySQL Server). You just have to qualify the table name:
SELECT * FROM database17.accounts_table;
This is purely a syntactical difference. It should have no effect on performance.
Regarding storage, you can't organize tables into a file-per-database as @Chris speculates. With the MyISAM storage engine, you always have a file per table. With the InnoDB storage engine, you either have a single set of storage files that amalgamate all tables, or else you have a file per table (this is configured for the whole MySQL server, not per database). In either case, there's no performance advantage or disadvantage to creating the tables in a single database versus many databases.
There aren't many MySQL configuration parameters that work per database. Most parameters that affect server performance are server-wide in scope.
Regarding backups, you can specify a subset of tables as arguments to the mysqldump command. It may be more convenient to back up logical sets of tables per database, without having to name all the tables on the command-line. But it should make no difference to performance, only convenience for you as you enter the backup command.
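For example (database and table names hypothetical):

# dump only selected tables from one database
mysqldump pool_db table1 table2 > pool_backup.sql
# or dump an entire per-pool database
mysqldump --databases pool_db > pool_backup.sql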
Why not create a single table to keep track of your pools (with PoolID and PoolName as your columns, plus whatever else you want to track), and then add a column to each of your 15-25 tables that acts as a foreign key back to your pool table, so you know which pool a particular record belongs to?
If you don't want to mix the data like that, I would suggest making multiple databases. Creating multiple tables all for the same functionality makes my spider sense tingle.
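A sketch of that layout (some_data_table stands in for each of your 15-25 tables):

CREATE TABLE pool (
  PoolID   INT AUTO_INCREMENT PRIMARY KEY,
  PoolName VARCHAR(64) NOT NULL
);

ALTER TABLE some_data_table
  ADD COLUMN PoolID INT NOT NULL,
  ADD FOREIGN KEY (PoolID) REFERENCES pool(PoolID);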
If you don't want one set of tables with poolID poolname as TheTXI suggested, use separate databases rather than multiple tables that all do the same thing.
That way, you restrict the variation between accessing different pools to the initial "USE database" statement; you won't have to recode your SELECTs each time, or resort to dynamic SQL.
The other advantages of this approach are:
Easy backup/restore
Easy start/stop of a database instance.
Disadvantages are:
A little bit more admin work, but not much.
I don't know what your application is, but really really think carefully before creating all of the tables in one database. That way madness lies.
Edit: If performance is the only thing that concerns you, you need to measure it. Take a representative set of queries and measure their performance.
Edit 2: The difference in performance for a single query between the many-tables and many-databases models will be negligible. If you have one database, you can tune the hell out of it. If you have many databases, you can tune the hell out of all of them.
My (our? - can't speak for anyone else) point is that, for well tuned database(s), there will be practically no difference in performance between the three options (poolid in table, multiple tables, multiple databases), so you can pick the option which is easiest for you, in the short AND long term.
For me, the best option is still one database with poolId, as TheTXI suggested, then multiple databases, depending upon your (mostly administration) needs. If you need to know exactly what the difference in performance is between two options, we can't give you that answer. You need to set it up and test it.
With multiple databases, it becomes easy to throw hardware at it to improve performance.
In the situation you describe, experience has led me to believe that you'll find the separate databases to be faster when you have a large number of pools.
There's a really important general principle to observe here, though: Don't think about how fast it'll be, profile it.
I'm not too sure I completely understand your scenario. Do you want to have all the pools using the same tables, but just differing by a distinguishing key? Or do you want separate pools of tables within the one database, with a suffix on each table to distinguish the pools?
Either way though, you should have multiple databases for two major reasons. The first being if you have to change the schema on one pool, it won't affect the others.
The second, if your load goes up (or for any other reason), you may want to move the pools onto separate physical machines with new database servers.
Also, security access to a database server can be more tightly locked down.
All of these things can still be accomplished without requiring separate databases - but the separation will make all of this easier and reduce the complexity of having to mentally track which tables you want to operate on.
Distinguishing the pools by table name or putting them in separate databases amounts to about the same thing. However, if you have lots of tables in one database, MySQL has to load the table information and do a security check on all those tables when logging in/connecting.
As others mentioned, separate databases will allow you to shift things around and create optimizations specific to a certain pool (i.e. compressed tables). It is extra admin overhead, but there is considerably more flexibility.
Additionally, you can always "pool" the tables that are in separate databases by using FEDERATED or MERGE tables to simplify querying if needed.
As for running out of primary keys, you could always use a compound primary key if you are using MyISAM tables. For example, you could have a field called groupCode (any type) and another called sequenceId (AUTO_INCREMENT), and create your primary key as (groupCode, sequenceId). The sequenceId will then increment based on the next unique ID within that group code set (see the DDL sketch after the example below).
For example:
AAA 1
AAA 2
BBB 1
AAA 3
CCC 1
AAA 4
BBB 2
...
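In MyISAM DDL that looks like this (a sketch; types assumed):

CREATE TABLE pooled_data (
  groupCode  CHAR(3) NOT NULL,
  sequenceId INT NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (groupCode, sequenceId)   -- sequenceId restarts per groupCode
) ENGINE=MyISAM;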
Although with large tables you have to be careful about caching and make sure the file system you are using handles large files.
I don't know MySQL very well, but I think I'll have to give the standard performance answer: "it depends".
Some thoughts (dealing only with performance/maintenance, not database design):
Creating a new database means a separate file (or files) in the file system. These files could then be put on different filesystems if performance of one needs to be separate from the others, etc.
A new database will probably handle caching differently; e.g., all tables in one DB means a shared cache for the DB, whereas splitting the tables into separate databases means each database can have a separate cache (obviously all databases will share the same physical memory for cache, but there may be a per-database limit, etc.).
Related to the separate files, this means that if one of your datasets becomes more important than the others, it can easily be pulled off to a new server.
Separating the databases has an added benefit of allowing you to deploy updates one-at-a-time more easily than with the single database.
However, to contrast, having multiple databases means the server will probably be using more memory (since it has multiple caches). I'm sure there are more "cons" for the multi-database approach, but I am drawing a blank now.
So I suppose I would recommend the multi-database approach. Obviously this is only with the understanding that there may very well be a better "database-designy" way of handling whatever you are actually doing.
Given the restrictions you've placed on it, I'd rather spin up more tables in the existing database than have to connect to multiple databases. Managing connection strings tends to be harder, in addition to managing the different database optimizations you may have.
FTR, in normal circumstances I'd take the approach described by TheTXI.
In answer to your specific question, though, I have found it to be dependent on usage. (Cop-out, I know, but hear me out.)
A single database is probably easier. You'll have to worry about just one connection and would still have to specify tables. Multiple databases could, under certain conditions, be faster though.
If I were you I'd try both. There's no way we'll be able to give you a useful answer.