Need help in designing a database schema for a SaaS application

I am a developer and have never designed a database before. I am designing one for an employee management system, a Node.js + Express application that uses MySQL as its DB.
I already have the required tables and columns sorted out, but there are still a few unknowns I am dealing with. This is my plan so far, and I need your input on it.
The end users of this application will be small to mid-size companies. The companies won't share tables in the database. So if there is a table named EmployeeCases, I plan to create a new EmployeeCases table for each existing company and for each new one that signs up. I am planning to name the table EmployeeCases_989809890, where "989809890" is the company id (or customer id). So if 3-4 companies have signed up, all the tables (at least the ones a company uses) will be recreated and named TableName_CompanyId. My question: is this a good way to go? Is there a better way?
All employee data is held in the Employee table, including login and password. Each Employee table in the DB will be named Employee_CompanyId (as per my plan above). My question is: when an employee logs in, how will I know which Employee table to query? Or should I remove the login from the Employee table and create a universal Users table where all the employees are stored? The Users table would also have CompanyId as one of its columns, and I would read the CompanyId from there and use it to query the other tables.
Any reference, website or blogs on this type of design will be appreciated.
Thanks.

I don't recommend this approach. I think you should either:
A) Put all the information in the same tables and have a companyId column to sort them out
OR
B) Have separate databases for each company and use the appropriate database using the code.
The thing is, with your approach, you'll have a hard time maintaining your application if you have multiple copies of the same table with different names. If you decide to add a column to one of the tables, for instance, you will have to write as many SQL scripts as you have table instances. You'll also have a bad time with all of your unique identifiers.
Here are some advantages/disadvantages of each design:
A) Put all the information in the same tables and have a companyId column to sort them out
Advantages:
Simplest
Allow usage of foreign key / constraints
Great for cross-client data extraction
Disadvantages:
Not portable (a client can't just leave with his/her data)
Can be perceived as less secure (I guess you can make the case both ways)
More likely to have huge tables
Does not scale very well
B) Have separate databases for each company and use the appropriate database using the code.
Advantages:
Portable
Can be perceived as more secure
Disadvantages:
Needs more discipline to keep track of all the databases
Needs a good segregation of what's part of your HUB (the application that tracks which client accesses which database) and what's part of your client's database.
You need a login page by company (or have your clients specify the company in a field)
An example of an application that uses this "two-step login" is Slack: when you sign in, you first enter your team domain, THEN your user credentials.
I think Google Apps for Work has the same approach. Also, I think most CRMs I have worked with have a separate database for each client.
Lastly, I'd like to direct you to this other question on Stack Overflow that links to an interesting example.
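To make option B and the two-step login a bit more concrete, here is a minimal sketch in Python. It uses one in-memory SQLite database per tenant as a stand-in for one MySQL database per company. The `hub` dictionary, the `employees` table and the function names are all made up for illustration; a real hub would more likely be a small table mapping the team domain to connection details.

```python
import sqlite3

# Schema that every tenant database shares (hypothetical columns).
TENANT_DDL = """
    CREATE TABLE employees (
        id    INTEGER PRIMARY KEY,
        login TEXT NOT NULL UNIQUE,
        name  TEXT NOT NULL
    )
"""

# Hub: maps the team domain typed on the first login screen to that
# tenant's own database connection.
hub = {}


def provision_tenant(domain):
    """Create a separate database for a newly signed-up company."""
    db = sqlite3.connect(":memory:")
    db.execute(TENANT_DDL)
    hub[domain] = db
    return db


def find_employee(domain, login):
    """Step 1: pick the database by domain. Step 2: look up the login in it."""
    db = hub.get(domain)
    if db is None:
        return None
    row = db.execute(
        "SELECT name FROM employees WHERE login = ?", (login,)
    ).fetchone()
    return row[0] if row else None


acme = provision_tenant("acme")
globex = provision_tenant("globex")
# The same login can exist in two companies without any conflict.
acme.execute("INSERT INTO employees (login, name) VALUES ('ann', 'Ann Lee')")
globex.execute("INSERT INTO employees (login, name) VALUES ('ann', 'Ann Gray')")
```

The table names stay identical in every tenant database, so a schema change is one script run once per database rather than one script per renamed table.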

You shouldn't split your tables just because companies won't share their information. Instead, you should have a companyId column in each table and filter on it in every query so that each company only sees its own data. This should be implemented in your backend.
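As a rough sketch of the shared-table approach, the snippet below uses Python with an in-memory SQLite database standing in for MySQL. It shows a single users table that carries company_id, a login that returns the company, and a query that is always filtered by that company. The table and column names, the PBKDF2 parameters and the sample data are assumptions for illustration only.

```python
import hashlib
import hmac
import os
import sqlite3


def hash_password(password, salt):
    # PBKDF2 from the standard library; the iteration count is illustrative.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE users (
        id            INTEGER PRIMARY KEY,
        company_id    INTEGER NOT NULL REFERENCES companies(id),
        email         TEXT NOT NULL UNIQUE,
        salt          BLOB NOT NULL,
        password_hash BLOB NOT NULL
    );
    CREATE TABLE employee_cases (
        id         INTEGER PRIMARY KEY,
        company_id INTEGER NOT NULL REFERENCES companies(id),
        title      TEXT NOT NULL
    );
    CREATE INDEX idx_cases_company ON employee_cases (company_id);
""")


def add_user(company_id, email, password):
    salt = os.urandom(16)
    conn.execute(
        "INSERT INTO users (company_id, email, salt, password_hash) "
        "VALUES (?, ?, ?, ?)",
        (company_id, email, salt, hash_password(password, salt)),
    )


def login(email, password):
    """Return the user's company_id on success, otherwise None."""
    row = conn.execute(
        "SELECT company_id, salt, password_hash FROM users WHERE email = ?",
        (email,),
    ).fetchone()
    if row is None:
        return None
    company_id, salt, stored = row
    if hmac.compare_digest(stored, hash_password(password, salt)):
        return company_id
    return None


def cases_for(company_id):
    """Every tenant query carries the company_id filter."""
    rows = conn.execute(
        "SELECT title FROM employee_cases WHERE company_id = ? ORDER BY id",
        (company_id,),
    )
    return [title for (title,) in rows]


conn.executemany(
    "INSERT INTO companies (id, name) VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)
add_user(1, "ann@acme.example", "s3cret")
add_user(2, "bob@globex.example", "hunter2")
conn.executemany(
    "INSERT INTO employee_cases (company_id, title) VALUES (?, ?)",
    [(1, "Leave request"), (2, "Payroll query"), (1, "Laptop order")],
)
```

Because login happens against one table, there is no need to know the company before looking up the user; the company_id comes back with the login and is then passed to every other query.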

Related

Database design: User Types: Many fields in users table or separate tables?

I'm developing an Uber-like app using Laravel. As you may know, it has different user types: there can be drivers and regular users. I'm not sure how to structure the database, since drivers can have other fields and relations that regular users do not, but I need both types to be able to log in. Also, users can take a ride and rate the driver, and only drivers can have a bio, license number, years driving and rating, and only they can have relations such as the car they are driving, and so on...
I want to know your thoughts on the best approach to handle this type of situation.
Keep drivers and users in the same users table with the driver fields nullable and a type field to know if it is a driver or a regular user?
Q: If I go with this option, how can I guarantee that the driver of a ride is actually a driver and not a simple user?
users
  id
  name
  password
  type
  driver_license_number
  driver_years_driving
  driver_rating
Keep credentials of both drivers and regular users in the users table and store drivers specific info in another?
Q: If I go with this option should drivers have their own primary key or use the user's primary key? which table should keep the 1:1 relationship? the users table? the drivers table?
users
  id
  name
  password
drivers
  id
  user_id
  license_number
  years_driving
  rating

You are tying two different things together under the term user: user as in “someone who registered in my application” and user as in “someone who’s using my application to get rides”. Both drivers and non-drivers are users in the first definition, but not in the second.
What's confusing is that the Driver entity is just a User entity with more fields, so it's possible to not represent the entity at all: just add more columns to the User entity and, to answer your first question, add an is_driver column to tell which entity is which.
By doing this, you are crippling your database's ability to guarantee that your data is valid. You can now have a Driver row without a driver_license_number, because your database doesn't know what a Driver is. Oops.
There are a lot of benefits to being explicit in your database schema. Part of the database's job is to guarantee data consistency, so help your database help you.
My suggestion is to go a step further. Credentials are one thing, they get their table. Users are another, they get their table (in your example, users seem to have no data at all, but they will probably have more things than just their name). Drivers are yet another, they get their table too.
credentials
  id
  username
  password_hash (you are hashing your passwords, right?)
users
  cred_id
  ... other user related info
driver
  cred_id (you can get with user_id, but it's an unnecessary join)
  user_id
  ... driver related info
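Here is a minimal sketch of how separate tables let the database itself guarantee that the driver of a ride is really a driver. It is written in Python against in-memory SQLite as a stand-in for MySQL. It simplifies the layout above slightly (the driver row is keyed directly by the user's id), and the rides table, column types and sample data are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when asked; MySQL/InnoDB enforces them
# by default.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE credentials (
        id            INTEGER PRIMARY KEY,
        username      TEXT NOT NULL UNIQUE,
        password_hash TEXT NOT NULL
    );
    CREATE TABLE users (
        cred_id INTEGER PRIMARY KEY REFERENCES credentials(id),
        name    TEXT NOT NULL
    );
    CREATE TABLE drivers (
        user_id        INTEGER PRIMARY KEY REFERENCES users(cred_id),
        license_number TEXT NOT NULL,
        years_driving  INTEGER NOT NULL
    );
    CREATE TABLE rides (
        id        INTEGER PRIMARY KEY,
        rider_id  INTEGER NOT NULL REFERENCES users(cred_id),
        -- Points at drivers, not users: a plain user cannot be a ride's driver.
        driver_id INTEGER NOT NULL REFERENCES drivers(user_id)
    );

    INSERT INTO credentials (id, username, password_hash)
        VALUES (1, 'ann', '<hash>'), (2, 'dave', '<hash>');
    INSERT INTO users (cred_id, name) VALUES (1, 'Ann'), (2, 'Dave');
    INSERT INTO drivers (user_id, license_number, years_driving)
        VALUES (2, 'D-123', 5);
""")


def book_ride(rider_id, driver_id):
    """Return True if the ride was stored, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO rides (rider_id, driver_id) VALUES (?, ?)",
                (rider_id, driver_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this layout, `book_ride(1, 2)` succeeds because user 2 has a drivers row, while `book_ride(2, 1)` is rejected by the foreign key. The NOT NULL on license_number likewise prevents a driver without a license number.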

I believe you should use two separate tables. This would avoid having lots of nullable fields that are not shared between riders and drivers. Furthermore, if these entities are frequently changing, ALTER TABLE will be a bit of a pain at scale.
Joins of course are a little bit more expensive across the two tables, but the query is more natural to write because of our normalization choice.
As a side note, this application will eventually have trouble scaling no matter which way you choose to write the tables, because MySQL cannot easily be horizontally scaled.
But, if you want easy querying and avoidance of nullable fields, two separate tables sounds like the right choice to me.

Consider 3 tables (plus, perhaps, some others):
Persons -- this contains the stuff that is common to both Drivers and Riders, such as login.
Riders -- bio, etc
Drivers
If it would make things handier for you, build two views:
Rider_all -- JOIN between Persons and Riders
Driver_all -- JOIN between Persons and Drivers
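A small sketch of the three tables and two views, using Python with in-memory SQLite in place of MySQL. Only "login" and "bio" come from the description above; the other columns and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE persons (
        id    INTEGER PRIMARY KEY,
        login TEXT NOT NULL UNIQUE,
        name  TEXT NOT NULL
    );
    CREATE TABLE riders (
        person_id INTEGER PRIMARY KEY REFERENCES persons(id),
        bio       TEXT
    );
    CREATE TABLE drivers (
        person_id      INTEGER PRIMARY KEY REFERENCES persons(id),
        license_number TEXT NOT NULL
    );

    -- Each view hides the join, so callers can read one "wide" row.
    CREATE VIEW rider_all AS
        SELECT p.id, p.login, p.name, r.bio
        FROM persons p JOIN riders r ON r.person_id = p.id;
    CREATE VIEW driver_all AS
        SELECT p.id, p.login, p.name, d.license_number
        FROM persons p JOIN drivers d ON d.person_id = p.id;

    INSERT INTO persons (id, login, name) VALUES (1, 'ann', 'Ann'), (2, 'dave', 'Dave');
    INSERT INTO riders (person_id, bio) VALUES (1, 'Likes quiet rides');
    INSERT INTO drivers (person_id, license_number) VALUES (2, 'D-123');
""")

riders = conn.execute("SELECT * FROM rider_all").fetchall()
drivers = conn.execute("SELECT * FROM driver_all").fetchall()
```

Login code only ever touches persons, while code that needs driver details reads driver_all and never sees riders.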

Which database schema should i use?

I am building a REST API using Node, MySQL and MongoDB, but I am confused about which database schema to go for. The business case is B2B, and for each business (customer) there are about 10 tables: general ledger, products, transactions, clients, sales, purchases and so on. To accommodate the 1-to-N relationship in sales and purchase records, I will use MongoDB, to avoid defining a fixed maximum number of product columns in the purchase/sale orders in SQL.
My customers need a separate backup option for their data, and in the near future I am also planning to integrate relationships between the application's customers.
So, which is the best option to go for? I have read this question and its answers quite carefully, and would like to ask whether I should go for option number 2.
Also, I would like to ask whether I should separate my entire backend (DB + server) by BUSINESS TYPE, using hostname mapping to a business-specific Azure Web App.

organizing multi-tenant db/MySQL [SaaS]

A good example is Shopify, where you have N users (in this case, assume each user is a site). Each user has its own records in the DB, but the DB schema is the same (same tables for each user: products, customers, orders, etc.).
So the question is: what is the best way to organize this kind of solution?
Store everything in one DB but in different tables, or run a separate DB for each user (which raises questions of maintenance, scalability and automation)?
Possible solution:
We can use one DB with common tables like products, customers, orders, etc., plus a users table where we store a record for each site.
In the products and customers tables we group all records by user_id.
This is one possible solution. But if we have 1000 users (sites), each with ~2k products and ~100k customers, we end up with tables holding millions of records (roughly 2 million products and 100 million customers), so the questions are:
How will it perform compared to each user (site) having its own DB?
How reliable is this approach? More data is harder to maintain, back up and restore.
Safety: if something goes wrong with the one database, thousands of sites will be affected.
Any links etc. will be much appreciated, thanks!

Create a mysql user for each tenant
Add a tenant_id column to each table
Add a view for each table that filters based on tenant_id = mysql_user
Use a trigger to automatically populate the tenant_id column on INSERT
Restrict the tenant mysql users to only access the views, not the raw tables
I wrote up a blog post on how I was able to convert a large single-tenant mysql application to a multi-tenant application in a weekend using this technique.
https://opensource.io/it/mysql-multi-tenant/
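The five steps above might translate into MySQL statements roughly like the ones this Python helper builds. This is a sketch under assumptions, not the blog post's actual code: it assumes each real table has been renamed to raw_<table> so the view can take the original name, that tenant account names are at most 32 characters, and that accounts connect from any host ('%'). The function names, the trigger name and the '<password>' placeholder are hypothetical.

```python
import re

# Connected account name without the host part. USER() is used instead of
# CURRENT_USER() because inside a view or trigger that runs with definer
# rights, CURRENT_USER() reports the definer rather than the connected tenant.
TENANT_EXPR = "SUBSTRING_INDEX(USER(), '@', 1)"

_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check(*names):
    # Identifiers cannot be bound as query parameters, so validate them.
    for name in names:
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")


def table_statements(db, table, columns):
    """One-time DDL per table: tenant column, insert trigger, filtered view."""
    _check(db, table, *columns)
    raw = f"{db}.raw_{table}"
    return [
        f"ALTER TABLE {raw} ADD COLUMN tenant_id VARCHAR(32) NOT NULL DEFAULT ''",
        f"CREATE TRIGGER {db}.raw_{table}_set_tenant BEFORE INSERT ON {raw} "
        f"FOR EACH ROW SET NEW.tenant_id = {TENANT_EXPR}",
        f"CREATE VIEW {db}.{table} AS SELECT {', '.join(columns)} FROM {raw} "
        f"WHERE tenant_id = {TENANT_EXPR}",
    ]


def tenant_statements(db, tables, tenant):
    """Per-tenant statements: an account that can only touch the views."""
    _check(db, tenant, *tables)
    stmts = [f"CREATE USER '{tenant}'@'%' IDENTIFIED BY '<password>'"]
    stmts += [
        f"GRANT SELECT, INSERT, UPDATE, DELETE ON {db}.{t} TO '{tenant}'@'%'"
        for t in tables
    ]
    return stmts


for stmt in table_statements("shop", "products", ["id", "name", "price"]):
    print(stmt + ";")
for stmt in tenant_statements("shop", ["products"], "tenant_a"):
    print(stmt + ";")
```

The table statements run once per table, the tenant statements once per sign-up. It is worth checking on a test server that inserts through the view populate tenant_id as expected before relying on it.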

I recommend reviewing the database schemas of well-supported open source solutions. With this in mind, here's a pretty simple sample schema I found quickly that illustrates a working design with scalability in mind.
http://www.zentut.com/sql-tutorial/sql-sample-database/

I have a file, Generate_multiTanentMysql.php, that performs all of these steps in a PHP script:
https://github.com/ziedtuihri/SaaS_Application
Solution design pattern:
Creating a database user for each tenant
Renaming every table to a different and unique name (e.g. using a prefix ‘someprefix_’)
Adding a text column called ‘id_tenant’ to every table to store the name of the tenant the row belongs to
Creating a trigger for each table to automatically store the current database username in the id_tenant column before inserting a new row
Creating a view for each table with the original table name and all the columns except id_tenant. The view will only return rows where (id_tenant = current_database_username)
Only granting permission on the views (not the tables) to each tenant’s database user
Then, the only part of the application that needs to change is the database connection logic. When someone connects to the SaaS, the application would need to:
Connect to the database as that tenant-specific username

Which MySQL data type is good for storing autocomplete terms?

I am developing an app in CakePHP with user auth. Users will add their customer names every time they get orders, so I want an autocomplete text field for the customer name when adding orders. Each user will have their own set of customer names.
So should I create one big text field to store the customer names (all terms comma-separated)
/ or /
a varchar for each term (1 term in 1 record)?
I will use a foreign key to separate each user's customers.
I am planning to use jQuery UI autocomplete, sourcing the terms from the customer table values.
My big concern is database capacity: I would like to save space in the database because I have other tables and a lot of users too.
(I do not have a programming background, so please forgive me for my typo)
Thank you.

Use multiple records, one for each term. That is what databases are designed to store.
If you store all the terms in a comma separated list, you will discover that there are lots of things that you cannot easily do.
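For example, with one row per customer name, the autocomplete source becomes a single prefix query scoped to the logged-in user. The sketch below uses Python with in-memory SQLite in place of MySQL; the table layout, the `suggest` function and the sample names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id       INTEGER PRIMARY KEY,
        username TEXT NOT NULL UNIQUE
    );
    CREATE TABLE customers (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        name    TEXT NOT NULL,
        -- One row per term; the same name may exist for different users.
        UNIQUE (user_id, name)
    );

    INSERT INTO users (id, username) VALUES (1, 'shop_a'), (2, 'shop_b');
    INSERT INTO customers (user_id, name) VALUES
        (1, 'Anderson Ltd'),
        (1, 'Andrews & Co'),
        (1, 'Baker Inc'),
        (2, 'Andover Bikes');
""")


def suggest(user_id, prefix, limit=10):
    """Return this user's customer names that start with the typed prefix.

    Note: LIKE wildcards (% and _) in the prefix are not escaped here.
    """
    rows = conn.execute(
        "SELECT name FROM customers "
        "WHERE user_id = ? AND name LIKE ? "
        "ORDER BY name LIMIT ?",
        (user_id, prefix + "%", limit),
    )
    return [name for (name,) in rows]
```

With a comma-separated list, the same lookup would mean loading and splitting the whole string on every keystroke, and renaming or deleting one customer would mean rewriting the entire field.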

Managing cross database reading, views or permissions in MySql

I will have multiple tables used by different projects on the same MySQL server. Much of the data is sensitive and needs to be behind a permissions wall. However, many of the tables of sensitive data rely on tables of non-sensitive data for user and department information. So I see three options ahead of me, and I am unsure which one to pick.
All in one database with table level permissions
The simplest of all solutions, except that I don't control permissions on the database server. Traditionally the server team only does database-level permissions; getting them to allow table-level permissions is a political battle I may not have the influence to win, and keeping track of all the permissions would be a pain.
In multiple databases, with database level permissions
I can split the tables into database zones, so department-wide info, like the staff data, can be in its own database, and tools that edit the staff data can have UPDATE and INSERT access to the department database. Other tools would want to access parts of the staff list: the public staff directory would have SELECT access to the staff tables, but parts of the staff tables would need to remain private, like personal contact info, copy codes, or billing indexes. I would need to split the staff tables into public and private tables, but then I would be stuck with table-level permissions again. So I would need to split the department database into department-shared and department-private databases.
Cross database views
I would create views in one database that pull data from other databases that the account does not have access to. So I can put all the staff information in the department database, then create a view in the web database that only pulls the columns that should be publicly available (name, department, extension). This would allow me to, in effect, have column-level SELECT rights without having to muck about with permissions. My concern would be speed. The original table the data is being pulled from would be fully indexed, but the documentation seems contradictory about whether or not the columns would be indexed when querying the view.
Has anyone else used any of these three options? Do you have better ones that I haven't thought of? What are the pitfalls you can see beyond what I have pointed out for any of the options?

Views and stored procedures are your best bet. Views can be used to provide generic access, but if some queries underperform, rewrite them as stored procedures that bypass the view for performance.
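To illustrate the view side of this, here is a minimal sketch of a view that exposes only the public staff columns. It runs in Python against in-memory SQLite, so everything lives in one database. On MySQL the equivalent would be something like CREATE VIEW web.staff_directory AS SELECT name, department, extension FROM department.staff, with the web account granted SELECT on the view only. The schema names, private columns and sample rows are assumptions. On the speed concern: as far as I know, MySQL can merge a simple view like this into the outer query, in which case indexes on the base table still apply, but it is worth confirming with EXPLAIN on your own server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (
        id            INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        department    TEXT NOT NULL,
        extension     TEXT,
        home_phone    TEXT,   -- private
        billing_index TEXT    -- private
    );
    CREATE INDEX idx_staff_department ON staff (department);

    -- Only the publicly available columns are visible through the view.
    CREATE VIEW staff_directory AS
        SELECT name, department, extension FROM staff;

    INSERT INTO staff (name, department, extension, home_phone, billing_index)
    VALUES ('Ann', 'IT', '101', '555-0100', 'B-17'),
           ('Raj', 'HR', '202', '555-0111', 'B-42');
""")


def directory(department):
    """Read staff through the view; private columns never appear."""
    cur = conn.execute(
        "SELECT * FROM staff_directory WHERE department = ? ORDER BY name",
        (department,),
    )
    columns = [d[0] for d in cur.description]
    return [dict(zip(columns, row)) for row in cur]
```

A tool that is only allowed to query staff_directory gets column-level read access in effect, without any column-level grants.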
