I plan to create a web system where organisations (customers) can set up a website with particular functions and have their information stored in a (MySQL) database. I have already started on a database design that includes one master database plus a customer database for each new organisation (created after a web form is filled in).
Now I am starting to question this design decision and wonder whether a single database for all organisations would be a better choice. The reason is that some of these organisations will communicate (exchange information) with each other, and the communicating organisations would hold unnecessary copies of some tables (e.g., both have 2-3 tables that are almost copies of each other and could therefore be shared).
Furthermore, implementing the information exchange seems a bit more complex with multiple databases than with one database. On the other hand, I assume(d) that queries by the various customers against a single very large database may take much longer than in a system with multiple databases.
Bottom line: as a non-database expert I'm not sure which of the two options would be best to proceed with, and would therefore appreciate your advice.
Go with one database. If you are successful and have thousands of users on the system, you are going to need an army of database administrators to look after the various databases. For that reason alone I would avoid having multiple databases.
I'm working on school management software in ASP that connects to a MySQL DB. The software works great when I deploy it on a local machine for each user (school), but I want to migrate it to the Azure cloud. The users will have an account to connect to the same app, but one school's data must not mix with other schools' data. My problem is finding the best way to deploy and manage the database.
Must I deploy one DB for each school?
Or keep all schools' data in the same DB?
I'm not sure my solutions are the best ways.
I don't want, for example, a single STUDENT table containing students for school X, school Y, and so on.
Please help me find the best solution.
There are multiple ways to design a schema to support multi-tenancy. How simple the design can be depends on the use case.
Separate the data of every tenant (school) physically, i.e., one schema must contain data related to only a specific tenant.
Pros:
Easy for A/B Testing. You can release updates which require database changes to some tenants and over time make it available for others.
Easy to move the database from one data-center to another. Support different SLA for backup for different customers.
Per tenant database level customization is easy. Adding a new table for customers, or modifying/adding a field becomes easy.
Third party integrations are relatively easy, e.g., connecting your data with Google Data Studio.
Scaling is relatively easy.
Retrieving data from one tenant is easy, without worrying about mixing up foreign key values.
Cons:
When you have to modify any field/table, then your application code needs to handle cases where the alterations are not completed in some databases.
Retrieving analytics across customers becomes difficult. Designing Queries for usage analysis becomes harder.
When integrating with other database systems, especially NoSQL, you will need more resources. E.g., indexing data in Elasticsearch for every tenant will require an index per tenant, and if there are thousands of customers, it will result in creating thousands of shards.
Common data across tenants needs to be copied in every database
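The first con above (alterations not yet completed in some databases) is often handled by recording a schema version inside each tenant database and applying pending changes one database at a time. Here is a minimal sketch of that idea, assuming SQLite in-memory databases as stand-ins for per-tenant MySQL schemas; the schema_version table, the MIGRATIONS list and the function names are made up for illustration.

```python
import sqlite3

# Ordered list of schema changes; the position in the list is the version number.
MIGRATIONS = [
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)",
    "ALTER TABLE student ADD COLUMN email TEXT",
]

def current_version(conn):
    """Return how many migrations this tenant database has already applied."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0

def migrate(conn, target=None):
    """Apply pending migrations up to `target` (default: all of them)."""
    target = len(MIGRATIONS) if target is None else target
    for version in range(current_version(conn) + 1, target + 1):
        conn.execute(MIGRATIONS[version - 1])
        conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
        conn.commit()
    return current_version(conn)

# Two hypothetical tenants, each with its own database.
tenants = {
    "school_x": sqlite3.connect(":memory:"),
    "school_y": sqlite3.connect(":memory:"),
}
migrate(tenants["school_x"])            # fully migrated
migrate(tenants["school_y"], target=1)  # staged rollout: still on version 1

versions = {name: current_version(conn) for name, conn in tenants.items()}
```

Application code can call current_version() for the tenant it is serving and skip features (here, the email column) that the tenant's schema does not have yet.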
Separate data for every tenant (school) logically, i.e., one schema contains data for all the tenants.
Pros:
Software releases are simple.
Easy to query usage analytics across multiple tenants.
Cons:
Scaling is relatively tricky. May need database sharding.
Maintaining the logical isolation of data for every tenant in all the tables requires more attention and may cause data corruption if not handled at the application level carefully.
Designing database systems for the application that support multiple regions is complicated.
Retrieving data from a single tenant is difficult. (Remember: all the records will be associated with some other records using foreign keys.)
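Part of the isolation burden in the shared design can be pushed down to the database itself. One common technique (not something the list above prescribes) is to carry the tenant ID into every child table and make it part of the foreign key, so a row cannot point at a parent that belongs to another tenant. A minimal sketch, assuming SQLite as a stand-in (MySQL's InnoDB also supports composite foreign keys); the table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE school (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE student (
    id        INTEGER PRIMARY KEY,
    tenant_id INTEGER NOT NULL REFERENCES school(id),
    name      TEXT NOT NULL,
    UNIQUE (tenant_id, id)
);
CREATE TABLE grade (
    id         INTEGER PRIMARY KEY,
    tenant_id  INTEGER NOT NULL,
    student_id INTEGER NOT NULL,
    mark       INTEGER NOT NULL,
    -- The tenant is part of the key, so cross-tenant references are impossible.
    FOREIGN KEY (tenant_id, student_id) REFERENCES student (tenant_id, id)
);
INSERT INTO school (id, name) VALUES (1, 'School X'), (2, 'School Y');
INSERT INTO student (id, tenant_id, name) VALUES (10, 1, 'Alice'), (20, 2, 'Bob');
""")

def add_grade(tenant_id, student_id, mark):
    """Insert a grade; return False if the student belongs to another tenant."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO grade (tenant_id, student_id, mark) VALUES (?, ?, ?)",
                (tenant_id, student_id, mark),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = add_grade(1, 10, 85)        # Alice is a School X student: accepted
rejected = add_grade(1, 20, 70)  # Bob belongs to School Y: rejected by the database
```

This does not replace careful filtering in application queries, but it turns one class of cross-tenant mix-ups into an error instead of silent corruption.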
This is not a comprehensive list. These points are based on my experience of working with both types of design. Both designs are common and are used by many organizations, depending on the use case.
I have an application in which we want to let users add, update and delete the columns of different tables. My approach is to create a separate database for each client so that their table-specific changes stay in their own database.
Since each client will have their own database, I wonder how I can manage authentication and authorization. Do I need to create a separate database for that as well? Will it affect the performance of the application?
Edit: The approach I am planning to use for authentication and authorization is to add a field called "Account" to the login page. This account name will tell the program which database to connect to, and each database will have its own users to authenticate.
The answer to your question is of course (and unfortunately) Yes and No. :)
This is known as multi-tenant data architecture.
Having separate databases can definitely be a great design option, but so can using one database shared by all of your clients/customers, and you will need to consider many factors before choosing.
Each design has pluses and minuses.
Here are your 3 essential choices:
1) Each customer shares the same database and database tables.
2) Each customer shares the same database but they get their own schema inside the database so they each get their own set of tables.
3) Each customer gets their own database.
One major benefit (that I really like) of the separate database approach is data security. What I mean is that every customer gets their own database, so they will edit/update/delete only within that database. Because of this, there is no risk of end users overwriting other users' data, either due to a programming error on your part or due to a security breach in your application.
When all users are in the same database you could accidentally pull and expose another customer's data. Or, worse, you could expose a record's primary key on screen and forget to secure it appropriately, and a power user could very easily change this key to one that belongs to another customer, thus exposing that customer's data.
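The primary key problem is easy to show in a few lines. Here is a minimal sketch, assuming a shared invoice table with a customer_id column and SQLite standing in for the real database; the table, columns and function names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoice (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL,
    total       REAL NOT NULL
);
INSERT INTO invoice (id, customer_id, total) VALUES (1, 100, 250.0), (2, 200, 990.0);
""")

def get_invoice_unsafe(invoice_id):
    """Looks up by primary key only: any logged-in user can read any invoice."""
    return conn.execute(
        "SELECT id, total FROM invoice WHERE id = ?", (invoice_id,)
    ).fetchone()

def get_invoice(session_customer_id, invoice_id):
    """Looks up by primary key AND the customer taken from the authenticated session."""
    return conn.execute(
        "SELECT id, total FROM invoice WHERE id = ? AND customer_id = ?",
        (invoice_id, session_customer_id),
    ).fetchone()

# Customer 100 edits the URL or form field to request invoice 2,
# which belongs to customer 200.
leaked = get_invoice_unsafe(2)  # returns customer 200's invoice
blocked = get_invoice(100, 2)   # returns None
own = get_invoice(100, 1)       # customer 100's own invoice still works
```

The important detail is that the customer ID comes from the server-side session, never from the request itself.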
However, let's say that all of your customers are actually subsidiaries of one large company and you need to roll up financials every day/week/month/year etc.
If this is the case, then having a database for every client could be a reporting nightmare, and having everyone in a single database sharing tables would just make life so much easier. When it comes time to report on your daily sales, for instance, it's easier to just sum up a column than to go to 10,000 databases and sum them up. :)
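To make the reporting difference concrete, here is a small sketch comparing choices 1 and 3 on the same made-up sales figures. It assumes SQLite in-memory databases in place of real servers; the sale table and roll_up function are hypothetical.

```python
import sqlite3

sales = [(1, 1200), (1, 300), (2, 500), (3, 750)]  # (customer_id, amount in cents)

# Choice 1: one shared table, so the roll-up is a single query.
shared = sqlite3.connect(":memory:")
shared.execute("CREATE TABLE sale (customer_id INTEGER NOT NULL, amount INTEGER NOT NULL)")
shared.executemany("INSERT INTO sale VALUES (?, ?)", sales)
shared_total = shared.execute("SELECT SUM(amount) FROM sale").fetchone()[0]

# Choice 3: one database per customer, so the roll-up has to visit every database.
per_customer = {}
for customer_id, amount in sales:
    db = per_customer.get(customer_id)
    if db is None:
        db = per_customer[customer_id] = sqlite3.connect(":memory:")
        db.execute("CREATE TABLE sale (amount INTEGER NOT NULL)")
    db.execute("INSERT INTO sale VALUES (?)", (amount,))

def roll_up(databases):
    """Sum sales across all customer databases, one query per database."""
    total = 0
    for db in databases.values():
        total += db.execute("SELECT COALESCE(SUM(amount), 0) FROM sale").fetchone()[0]
    return total

separate_total = roll_up(per_customer)
```

Both designs give the same number, but the second needs one connection and one query per customer, plus handling for databases that are offline or on a different schema version.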
So the answer definitely depends on your application and what it will be doing.
I work on a large enterprise system where we have tens of thousands of clients in the same database and in order to support this we took very great care to secure all of our data very carefully.
I also work on a side project in my spare time which supports a database per customer multi-tenant architecture.
So, consider what your application will do, how you will back up your data, whether you need to roll up data, and so on; this will help you decide.
Here's a great article on MSDN about this:
https://msdn.microsoft.com/en-us/library/aa479086.aspx
Regarding your question about authentication.
Yes, having a separate database for authentication is a great design. When a customer logs in, you authenticate them against your authentication database, and they receive the connection string to their database as part of this authentication. From that point on, all data comes from that client's database.
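Here is a rough sketch of that flow, combining the "Account" field from the question with a central authentication database. It assumes SQLite for the auth store and a made-up connection string; the login table, its columns and the function names are all hypothetical, and the password handling is only a basic illustration.

```python
import hashlib
import hmac
import os
import sqlite3

def hash_password(password, salt):
    """Derive a password hash with PBKDF2 (iteration count chosen for the example)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

# Central authentication database: one row per login, plus where that account's data lives.
auth_db = sqlite3.connect(":memory:")
auth_db.execute("""
    CREATE TABLE login (
        account    TEXT NOT NULL,
        username   TEXT NOT NULL,
        salt       BLOB NOT NULL,
        pw_hash    BLOB NOT NULL,
        tenant_dsn TEXT NOT NULL,
        PRIMARY KEY (account, username)
    )
""")

def register(account, username, password, tenant_dsn):
    salt = os.urandom(16)
    with auth_db:
        auth_db.execute(
            "INSERT INTO login VALUES (?, ?, ?, ?, ?)",
            (account, username, salt, hash_password(password, salt), tenant_dsn),
        )

def authenticate(account, username, password):
    """Return the tenant connection string on success, None on failure."""
    row = auth_db.execute(
        "SELECT salt, pw_hash, tenant_dsn FROM login WHERE account = ? AND username = ?",
        (account, username),
    ).fetchone()
    if row is None:
        return None
    salt, pw_hash, tenant_dsn = row
    if hmac.compare_digest(hash_password(password, salt), pw_hash):
        return tenant_dsn
    return None

# The DSN below is a placeholder, not a real server.
register("acme", "alice", "s3cret", "mysql://db-host/acme_db")
dsn = authenticate("acme", "alice", "s3cret")
denied = authenticate("acme", "alice", "wrong")
```

The application would keep the returned connection string on the server side (for example in the session) and use it to open the client's database for the rest of the request.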
Hope this was helpful.
Good luck!
A quick question: I'm developing SaaS software (an ERP).
I designed it with one database per account, for these reasons:
I do a lot of customisation and need to add specific table columns for each account.
Database backups (and reloading data!) are easier to manage.
It is less risky: sometimes I need to run SQL queries on a table, and if a bad query (update / delete...) goes wrong, only one customer is affected instead of all of them.
Bad point: I'm ending up with hundreds of databases...
I've hired a company to manage my servers, and they said it's better to have only one database with a few tables, and to put all the data in the same tables with an id_account column. I'm very surprised by this, so I'm wondering... what do you think?
Thanks !
Frederic
In my current environment we handle millions of records from numerous clients. Our solution is to use schemas to segregate each individual client. A schema allows you to partition your clients into separate virtual databases while inside a single db. Each schema will have an exact copy of the tables from your application.
The upside:
Segregated client data
data from a single client can be easily backed up, exported or deleted
Programming is still the same, but you have to select the schema before db calls
Moving clients to another db or standalone server is a lot easier
adding specific tables per client is easier (see below)
single instance of the database running
tuning the db affects all tenants
The downside:
Unless you manage your shared schema properly, you may duplicate data
Migrations are repeated for every schema
You have to remember to select the schema before db calls
hard pressed to add many negatives... I guess I may be biased.
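To show what "select the schema before db calls" looks like in code, here is a small sketch. It uses SQLite's ATTACH DATABASE purely as a runnable stand-in for real schemas (in MySQL you would qualify the table name or issue USE; in PostgreSQL you would set search_path); the tenant names, table and helper functions are invented.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")

# Every tenant schema gets an identical copy of the application tables.
TABLES = "CREATE TABLE {schema}.invoice (id INTEGER PRIMARY KEY, total INTEGER NOT NULL)"

def schema_name(tenant):
    """Schema names cannot be bound as SQL parameters, so validate them strictly."""
    if not re.fullmatch(r"[a-z][a-z0-9_]*", tenant):
        raise ValueError(f"invalid tenant name: {tenant!r}")
    return f"tenant_{tenant}"

def create_tenant(tenant):
    schema = schema_name(tenant)
    conn.execute(f"ATTACH DATABASE ':memory:' AS {schema}")
    conn.execute(TABLES.format(schema=schema))

def add_invoice(tenant, total):
    with conn:  # commit so no transaction is left open
        conn.execute(
            f"INSERT INTO {schema_name(tenant)}.invoice (total) VALUES (?)", (total,)
        )

def invoice_totals(tenant):
    rows = conn.execute(f"SELECT total FROM {schema_name(tenant)}.invoice ORDER BY id")
    return [total for (total,) in rows]

create_tenant("acme")
create_tenant("globex")
add_invoice("acme", 100)
add_invoice("acme", 250)
add_invoice("globex", 999)
```

The queries are the same for every tenant; only the schema prefix changes, which is why forgetting to select it is the main thing to guard against.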
Adding Specific Tables: Why would you add client-specific tables if this is SaaS and not custom software? Better to use a Postgres DB with an hstore field and store as much searchable data as you like.
Schemas are ideal for multi-tenant databases.
A lot of what I am telling you depends on your software stack, the capabilities of your developers and the backend db you selected (all of which you neglected to mention)
Your hardware guys should not decide your software architecture. If they do, you are likely shooting yourself in the foot before you even get out of the gate. Get a good senior software architect; the grief they will save you will likely save your business.
I hope this helps...
Good luck
I'll soon be developing a big CMS where users can configure their website and manage news, products, services and much more about their company.
Think of Shopify without the e-commerce part (at least for now).
The RDBMS is MySQL and the user base will be about 150 (maybe bigger).
I'm trying to figure out which one of these two approaches would fit better.
DEDICATED DATABASE FOR EACH USER
PROS:
performance (and possible future sharding?): is querying smaller database with just your data better than querying a giant database with every user data?
easy "export my data" for users: I can simply dump their own db without fetching everything and putting it in some big encoded logical datastruct
SINGLE DATABASE FOR ALL USERS
PROS:
less general overhead
statistics: just one db to query to get and aggregate whatever I need
backup: one dump (not sure about this one because I've no experience in cluster dumping)
Which way would you go? I don't think Shopify created a dedicated database for every registered user... or maybe they did?
I'd like people more experienced than me to help me figure out the best way, along with all the variables I can't foresee right now because of my inexperience.
It sounds like you're developing a software-as-a-service hosted system, rather than a software package to distribute to customers for them to run on their own servers. In that case, in general, you will have an easier time developing and administering your service if you design it for a single database handling multiple users.
You'll be able to add new users to your system with data manipulation language (DML) rather than data definition language (DDL). That is, you'll insert rows for new users rather than create tables. That will make your life a LOT easier when you go live.
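As a rough illustration of the DML-versus-DDL point, here is a sketch where signing up a customer is a single INSERT and the set of tables never changes. It assumes SQLite in place of MySQL; the site_user and news tables and the function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE site_user (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE news (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES site_user(id),
    title   TEXT NOT NULL
);
""")

def sign_up(name):
    """Onboarding a customer is one INSERT (DML); no tables or databases are created."""
    with conn:
        return conn.execute("INSERT INTO site_user (name) VALUES (?)", (name,)).lastrowid

def publish(user_id, title):
    with conn:
        conn.execute("INSERT INTO news (user_id, title) VALUES (?, ?)", (user_id, title))

def tables():
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [name for (name,) in rows]

tables_before = tables()
acme = sign_up("acme")
globex = sign_up("globex")
publish(acme, "Acme opens a new office")
tables_after = tables()  # identical to tables_before: the schema did not change
```

With a database per user, the equivalent of sign_up would have to run CREATE DATABASE and every CREATE TABLE statement, and every later schema change would have to be repeated for each customer.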
You are absolutely right that stuff like backups and aggregate reporting will be far easier if you have a single shared database.
Don't worry too much about the user data export functions. You'll have to develop software for those functions anyway; it won't be that hard to filter by user when you do the export.
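For what it's worth, a per-user export from a shared database can be as small as the sketch below. It assumes every exported table carries a user_id column and uses SQLite as a stand-in; the table names, the EXPORTED_TABLES list and export_user are made up for the example.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can be converted to dicts by column name
conn.executescript("""
CREATE TABLE news (id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, title TEXT NOT NULL);
CREATE TABLE product (id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, name TEXT NOT NULL);
INSERT INTO news (user_id, title) VALUES (1, 'Launch day'), (2, 'Other news');
INSERT INTO product (user_id, name) VALUES (1, 'Widget'), (1, 'Gadget'), (2, 'Gizmo');
""")

# Fixed allow-list: table names are never taken from user input.
EXPORTED_TABLES = ("news", "product")

def export_user(user_id):
    """Return all rows owned by one user as a JSON document."""
    dump = {}
    for table in EXPORTED_TABLES:
        rows = conn.execute(
            f"SELECT * FROM {table} WHERE user_id = ? ORDER BY id", (user_id,)
        )
        dump[table] = [dict(row) for row in rows]
    return json.dumps(dump, indent=2)

exported = json.loads(export_user(1))
```

The same WHERE user_id = ? filter is what the rest of the application needs anyway, so the export mostly reuses logic you already have.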
But there's a downside you should consider to the single-database approach: if part of your requirement is to conceal various users' existence or data from each other, you'll have to be very careful to do this in your development. Will your users be competitors with each other? That could be tricky. You'll need to trust your in-house admin and support teams to refrain from disclosing one user's data to another by mistake (or deliberately). With a separate database per user, you'll have a smaller risk in that area.
150 users aren't many. Don't worry about scalability until you have a workload of paying customers. When that happens you can add MySQL server RAM, partitions, solid-state disks, replication, memcached, sharding, and all that other expensive and high-workload stuff. If you add those things before you go live, you'll just take longer and blow more money before you go live. Not good.
I'm creating a multi-user/company web application in PHP & MySQL. I'm interested to know what the best practice is with regards to structuring my database(s).
There will be hundreds of companies and thousands of users of this web app, so it needs to be robust. Each company won't be able to see other companies' data, just their own. We will be storing mainly text data, and it will probably be only a few MB per company.
Currently the database contains 14 tables (for one sample company).
Is it better to put the data for all companies and their users in a single database and create a unique companyID for each one?
or:
Is it better to put each company's data in its own database and create a new database and table set for each new company that I add?
What are the pluses and minuses to each approach?
Thanks,
Stephen
If a single web app is being used by all the different companies, unless you have a very specific need or reason to use separate databases (it doesn't sound like you do), then you should definitely use a single database.
Your application will be responsible for only showing the correct information to the correct authenticated users.
Multiple databases would be a nightmare to maintain. For each new company you'd have to create and administer each one. If you make a change to one schema, you'll have to do it to your 14+.
Thousands of users and thousands of apps shouldn't pose a problem at all as long as you're using something that is a real database and not Access or something silly like that.
Multi-tenant
Pluses
Relatively easy to develop: only change database code in one place.
Lets you easily create queries which use data for multiple tenants.
Straightforward to add new tenants: no code needs to change.
Transforming a multi-tenant to a single-tenant setup is easy, should you need to change your design.
Minuses
Risk of data leak between tenants if coding is sloppy. Tenant view filters can in some cases be employed to reduce this risk. This method is based on using different database user accounts for different tenants.
If you break the code, all tenants will be affected.
Single-tenant
Pluses
If you have very different requirements for different tenants, several different database models can be beneficial. This is the best case for using a single tenant setup.
If you code sloppily, there's practically no risk of data leak between tenants (tenant A will not be able to access tenant B's data). In addition, if you accidentally destroy the schema of one tenant through a botched update, other tenants will remain unaffected.
Less SQL code when you don't need to take tenant ID values into account in your queries
Minuses
Database schemas tend to diverge over time, often resulting in a nightmare. Using a database compare tool, you can alleviate this problem, but potentially many schemas need to be compared.
Including data from several databases in one query is typically complex, and often requires prepared statements.
Developing is hard, since you need to make the same changes to multiple schemas.
The same database entity can appear in many databases with different ID keys, resulting in confusion.
Transforming a single-tenant to a multi-tenant setup is very hard, should you need to change your design.
A single database is the relational way. One aspect from this perspective is that databases gather statistics about database usage and make heavy use of this. If you split things up you will be shooting yourself in the foot as the statistics will be fragmented.