This is probably a more conceptual problem, but I couldn't find an easy solution.
Scenario: Two shops (say 'M' and 'S'). M is the master and determines the articles in the databases. Each maintains an independent stock. I have M's article table replicating to S, and I separated stock into a separate table with a common reference.
Now when new articles are added in M, they arrive at S too, but they won't have an entry in S's stock table. There is a similar problem with deleted articles. Possible solutions:
Do I create an entry in S's stock table each time a request is made for a new (not-yet-existing) article?
Do I have to scan regularly to check for missing stock entries?
Isn't there a more elegant way to solve this?
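The "scan regularly" option listed above can be sketched as a small reconciliation job. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for S's MySQL server, the `stock` columns and the `reconcile_stock` helper are hypothetical, and it assumes a missing stock row should start at quantity 0 and that stock for a deleted article can be discarded.

```python
import sqlite3

# In-memory SQLite database standing in for shop S's MySQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE articles (id INTEGER PRIMARY KEY, name TEXT NOT NULL);  -- replicated from M
CREATE TABLE stock (article_id INTEGER PRIMARY KEY, qty INTEGER NOT NULL);  -- local to S
INSERT INTO articles VALUES (1, 'hammer'), (2, 'nails');
-- Article 2 has no stock row yet; article 9 was deleted on M.
INSERT INTO stock VALUES (1, 5), (9, 3);
""")

def reconcile_stock():
    """Hypothetical periodic job: align the local stock table with articles."""
    with conn:  # run both statements in one transaction
        # Create a zero-quantity row for every article that has none.
        conn.execute("""
            INSERT INTO stock (article_id, qty)
            SELECT a.id, 0
            FROM articles a
            LEFT JOIN stock s ON s.article_id = a.id
            WHERE s.article_id IS NULL
        """)
        # Drop stock rows whose article no longer exists (assumed acceptable).
        conn.execute(
            "DELETE FROM stock WHERE article_id NOT IN (SELECT id FROM articles)"
        )

reconcile_stock()
stock_rows = conn.execute(
    "SELECT article_id, qty FROM stock ORDER BY article_id"
).fetchall()
```

Running the job twice changes nothing the second time, so it could be scheduled as often as needed.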
NOTE: To clarify, let me explain another way:
M already replicates the 'articles' table to S (using MySQL's replication mechanism).
This works fine.
The problem is that M and S each have their own local 'stock' table. What is the normal procedure when, for example, a new product is added (in M) to the 'articles' table and transferred to S? Now there is a new entry which doesn't have a corresponding entry in S's stock table.
I'm guessing this is not an unusual situation - what would be the normal procedure to solve this?
Unless you have two databases located on two different DB servers, why not simply create an articles table and a stock table referencing it? You could add the shop (ideally the shop_id) as an extra column of the latter. Something like this:
CREATE TABLE articles(id int primary key not null,
name char(20) not null);
CREATE TABLE stock(article_id int not null,
shop ENUM('M', 'S') not null,
qty int not null,
FOREIGN KEY(article_id) REFERENCES articles(id),
UNIQUE(article_id, shop));
Please see http://sqlfiddle.com/#!2/e6eca/5 for a live example.
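With a single schema like the one above, a query can also treat a missing stock row as "zero in stock" instead of requiring a row per article and shop. The sketch below is an illustration only, not part of the linked fiddle: it runs on Python's built-in sqlite3 module rather than MySQL (so the ENUM is replaced by a CHECK constraint), and the sample rows and the `stock_for_shop` helper are made up.

```python
import sqlite3

# SQLite stands in for MySQL here (assumption); ENUM becomes a CHECK constraint.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE articles (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE stock (
    article_id INTEGER NOT NULL REFERENCES articles(id),
    shop TEXT NOT NULL CHECK (shop IN ('M', 'S')),
    qty INTEGER NOT NULL,
    UNIQUE (article_id, shop)
);
INSERT INTO articles VALUES (1, 'hammer'), (2, 'nails');
-- Shop S has no stock row for article 2.
INSERT INTO stock VALUES (1, 'M', 5), (1, 'S', 2), (2, 'M', 40);
""")

def stock_for_shop(shop):
    """Hypothetical helper: list every article with the shop's quantity."""
    rows = conn.execute("""
        SELECT a.id, a.name, COALESCE(s.qty, 0)
        FROM articles a
        LEFT JOIN stock s ON s.article_id = a.id AND s.shop = ?
        ORDER BY a.id
    """, (shop,))
    return rows.fetchall()
```

Calling `stock_for_shop('S')` lists both articles, with a quantity of 0 for the one that has no stock row.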
If you really need to restrict creation of items in the articles table to shop M, that could be achieved by creating different users for your DB (*user_m*, *user_s*) and using REVOKE and/or GRANT to set up the various access rights.
EDIT: If the two servers are on distant sites, you would probably be able to use MySQL's replication capabilities to keep one or more tables in sync between the two sites. This requires network access between the two sites. Personally, for obvious reasons, I would consider using a secure tunnel between them. Finally, you will probably still have to set up permissions at the DB level to allow inserts from one site only and not the other.
As a "poor man's" solution, you can also back up the required tables from one server on a regular basis and use the backup to update the tables on the second server. A combination of mysqldump and a cron job would be a good starting point.
Personally, though, I would push for MySQL replication. The setup is probably more complex, but it will perform better, scale better and have lower latency.
I am a developer and have never designed a database before. I am designing a database for an employee management system, which is a Node.js + Express application using MySQL as its DB.
I already have the required tables, columns sorted out but there are still few unknowns I am dealing with. This is my plan so far and I need your input on it.
The end users of this application will be small to mid-size companies. The companies won't be sharing the tables in the database. So if there is a table named EmployeeCases, I plan to create a new EmployeeCases table for each existing company and for each new one that signs up for this application. I am planning to name the table EmployeeCases_989809890, where "989809890" is the company id (or customer id). So if 3-4 companies have signed up with us, then all the tables (at least the ones a company uses) will be recreated and named TableName_CompanyId. My question: is this a good way to go? Is there a better way?
All the employees' data is held in the Employee table, including their login and password. Each Employee table in the DB will be named Employee_CompanyId (as per my plan above). My question is: when an employee logs in, how will I know which Employee table to query? Or should I remove the login from the Employee table and create a universal Users table where all the employees will be stored? The Users table would also have the CompanyId as one of its columns, and I would read the CompanyId from there and use it to query the other tables.
Any reference, website or blogs on this type of design will be appreciated.
Thanks.
I don't recommend this approach. I think you should either:
A) Put all the information in the same tables and have a companyId column to sort them out
OR
B) Have separate databases for each company and select the appropriate database in your code.
The thing is, with your approach, you'll have a hard time maintaining your application if you have multiple copies of the same table with different names. If you decide to add a column to one of the tables, for instance, you will have to write as many SQL scripts as you have table instances. You'll also have a bad time with all of your unique identifiers.
Here are some advantages/disadvantages of each design:
A) Put all the information in the same tables and have a companyId column to sort them out
Advantages:
Simplest
Allows use of foreign keys / constraints
Great for cross-client data extraction
Disadvantages:
Not portable (a client can't just leave with his/her data)
Can be perceived as less secure (I guess you can make the case both ways)
More likely to have huge tables
Does not scale very well
B) Have separate databases for each company and select the appropriate database in your code.
Advantages:
Portable
Can be perceived as more secure
Disadvantages:
Needs more discipline to keep track of all the databases
Needs a good segregation of what's part of your HUB (your application that tracks which client accesses which database) and what's part of your client's database.
You need a login page per company (or have your clients specify the company in a field)
An example of an application that uses this "two-step login" is Slack: when you sign in, you first enter your team domain, THEN your user credentials.
I think Google Apps for Work has the same approach. Also, I think most CRMs I have worked with have a separate database per client.
Lastly, I'd like to direct you to this other question on Stack Overflow that links to an interesting example.
You shouldn't split your tables just because companies won't share their information. Instead, you should have a companyId column in each table and filter on it so that each query only touches the relevant company's data. This should be implemented in your backend.
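A minimal sketch of the shared-table approach, assuming a universal users table like the one the question proposes. It is not a definitive design: the table and column names, the sample rows, and the `cases_for_login` helper are all hypothetical, and Python's built-in sqlite3 module stands in for the Node.js + MySQL stack so the example is self-contained.

```python
import sqlite3

# SQLite stands in for MySQL here (assumption); all names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    company_id INTEGER NOT NULL,
    login TEXT NOT NULL UNIQUE
);
CREATE TABLE employee_cases (
    id INTEGER PRIMARY KEY,
    company_id INTEGER NOT NULL,
    title TEXT NOT NULL
);
INSERT INTO users VALUES (1, 10, 'alice'), (2, 20, 'bob');
INSERT INTO employee_cases VALUES
    (1, 10, 'Onboarding'), (2, 20, 'Leave request'), (3, 10, 'Review');
""")

def cases_for_login(login):
    """Resolve the company from the login, then filter every query by it."""
    row = conn.execute(
        "SELECT company_id FROM users WHERE login = ?", (login,)
    ).fetchone()
    if row is None:
        return []  # unknown login: no company, no data
    cases = conn.execute(
        "SELECT title FROM employee_cases WHERE company_id = ? ORDER BY id",
        (row[0],),
    )
    return [title for (title,) in cases]
```

Because the company id is looked up once at login and then passed to every query, no table name ever has to be built from a customer id.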
I am trying to design my database. For now my schema is as below:
Three fixed tables already exist, and over time the number of tables will increase; the new tables (mm, xx, yy, zz,...) have the same fields but different values. There is no duplication between the tables.
a) user table, which holds the internal users.
b) map_platform_user, which maps the internal users to the external users. Internal users are our users and external users are our partners' users.
c) platform, which holds our partners' platforms.
d) mm or xx, which holds the user information that is different for each platform.
I have created a lookup table called map_platform_user. This table has a field user_platform_id that I thought I would use to create a relation with the mm table (PK-FK).
1) I thought of creating multiple references like mm_user_id(PK), xx_user_id(PK),... to user_platform_id (PK). However, I am pretty sure that this solution is wrong and that I cannot have several references to one table.
2) My next solution is to add a new field to the table map_platform_user each time, such as user_platform_id1, user_platform_id2,... but this solution requires altering the table map_platform_user every time.
3) The third solution I thought about is to not create the extra tables and to store everything in the map_platform_user table, so when new platforms arrive based on our demands, I store them in the field user_platform_id. In this case I have to add name, email,.. to the table map_platform_user.
I would like to mention that the user_platform_id is an external_id which should be received from our clients.
The MySQL version is 5.5.47 and the engine is InnoDB. I would appreciate any solution or discussion that helps improve the performance of my DB. I will have a large amount of data.
This is something that has bothered me for a long time, and I still have been unable to find an answer.
I have a huge system with a lot of different features. What is common across this system is, of course, that my users can create, update, read & delete different parts of it.
For simplicity, let's say I have an application that has the following features:
Document administration
Video administration
User administration
Salary administration
(Please note that I picked these at random, just to make the point that each of them would have its own separate tables and that they are not necessarily connected.)
Now I wish to create some sort of logging system, so that whenever someone creates, updates or deletes an entity, it will be recorded.
As far as I can see, I can do this in two ways.
1.
Create a logging table for each of the 4 features in my system. However, with this method I am required to create a logging table for each new feature I add to the system. I would also have to combine data from X number of tables if I wish to create a single log, which could potentially be a huge task!
2.
I could create something like the following:
However, once again I would have to add a column for each new feature I add.
So my question is: what is the best way to design a logging database architecture?
Or is there an easier way?
Instead of one target_xx for each feature, you could do it this way:
target_id | target_type
1         | video
4         | document
5         | user
2         | user
Or, even better, keep a separate table of target types and store only the id of the respective type in target_type.
Something like this:
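A minimal sketch of such a two-table layout, under stated assumptions: every name here (`target_type`, `activity_log`, `log_action`, `history`) is hypothetical and not taken from the original post, and Python's built-in sqlite3 module stands in for MySQL.

```python
import sqlite3

# SQLite stands in for MySQL here (assumption); all names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE target_type (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE activity_log (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    action TEXT NOT NULL CHECK (action IN ('create', 'update', 'delete')),
    target_id INTEGER NOT NULL,
    target_type_id INTEGER NOT NULL REFERENCES target_type(id),
    logged_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
INSERT INTO target_type (name) VALUES ('video'), ('document'), ('user');
""")

def log_action(user_id, action, target_type, target_id):
    """Record one create/update/delete on any kind of entity."""
    row = conn.execute(
        "SELECT id FROM target_type WHERE name = ?", (target_type,)
    ).fetchone()
    if row is None:
        raise ValueError(f"unknown target type: {target_type}")
    conn.execute(
        "INSERT INTO activity_log (user_id, action, target_id, target_type_id) "
        "VALUES (?, ?, ?, ?)",
        (user_id, action, target_id, row[0]),
    )

def history():
    """Return the whole log with readable type names."""
    return conn.execute("""
        SELECT l.user_id, l.action, t.name, l.target_id
        FROM activity_log l
        JOIN target_type t ON t.id = l.target_type_id
        ORDER BY l.id
    """).fetchall()
```

Supporting a new feature then means inserting one row into `target_type`; neither table needs a new column.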
If you only want to capture the creation and last update date for each table, I would just use MySQL's column defaults and its ON UPDATE clause. You can define the fields like this for a table:
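-- my_table is a placeholder for your table name.
-- DATETIME accepts CURRENT_TIMESTAMP here from MySQL 5.6.5 on; older versions need TIMESTAMP.
ALTER TABLE my_table
ADD COLUMN CreateDate DATETIME DEFAULT CURRENT_TIMESTAMP,
ADD COLUMN LastModifiedDate DATETIME ON UPDATE CURRENT_TIMESTAMP;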
You can add these 2 fields in all tables. If you want to use one central table for logging (which might be more difficult to manage, because you always need to create joins, maybe also worse performance), then I would work with triggers.
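The trigger-based variant can be sketched as follows. This is an illustration under assumptions, not a drop-in MySQL script: it runs on Python's built-in sqlite3 module, whose trigger syntax differs slightly from MySQL's (MySQL requires FOR EACH ROW, for example), and the `video` and `change_log` tables are hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL here (assumption); table names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE video (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE change_log (
    id INTEGER PRIMARY KEY,
    table_name TEXT NOT NULL,
    row_id INTEGER NOT NULL,
    action TEXT NOT NULL
);
-- One trigger per operation writes a row into the central log table.
CREATE TRIGGER video_after_insert AFTER INSERT ON video
BEGIN
    INSERT INTO change_log (table_name, row_id, action)
    VALUES ('video', NEW.id, 'create');
END;
CREATE TRIGGER video_after_update AFTER UPDATE ON video
BEGIN
    INSERT INTO change_log (table_name, row_id, action)
    VALUES ('video', NEW.id, 'update');
END;
CREATE TRIGGER video_after_delete AFTER DELETE ON video
BEGIN
    INSERT INTO change_log (table_name, row_id, action)
    VALUES ('video', OLD.id, 'delete');
END;
""")

# The application only touches the video table; the log fills itself.
conn.execute("INSERT INTO video (title) VALUES ('intro')")
conn.execute("UPDATE video SET title = 'intro v2' WHERE id = 1")
conn.execute("DELETE FROM video WHERE id = 1")
log = conn.execute(
    "SELECT table_name, row_id, action FROM change_log ORDER BY id"
).fetchall()
```

Each logged table needs its own set of triggers, and a trigger does not know which application user made the change, so that would have to be recorded some other way.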
I'm designing a web site for a friend and I'm not sure what the best way to go is with regard to one of my database tables.
To give you an idea, this is roughly what I have:
Table: member_profile
`UserID`
`PlanID`
`Company`
`FirstName`
`LastName`
`DOB`
`Phone`
`AddressID`
`website`
`AllowNonUserComments`
`AllowNonUserBlogComments`
`RequireCaptchaForNonUserComments`
`DisplayMyLocation`
The last four:
AllowNonUserComments
AllowNonUserBlogComments
RequireCaptchaForNonUserComments
DisplayMyLocation
(and possibly more such boolean fields to be added in the future) will control certain website functionality based on user preference.
Basically, I'm not sure if I should move those fields to a
new table: member_profile_settings
`UserID`
`AllowNonUserComments`
`AllowNonUserBlogComments`
`RequireCaptchaForNonUserComments`
`DisplayMyLocation`
or if I should just leave them as part of the member_profile table, since every member is going to have their own settings.
The target is roughly 100,000 members in the long run and 10k to 20k in the short run. My main concern is database performance.
And while I'm at it, question #2: would it make sense to move the member's contact information (street address, city, state, zip, phone, etc.) into the member_profile table, instead of having an address table and an AddressID like I currently have?
Thank you.
I would say "no" and "yes, but" as the answers to 1) and 2). For #1, your queries are going to be a lot easier to manage if you create columns for each preference. The best systems I've worked with were done that way. Moving the preferences into a separate table with "user, preference, value" triples leads to complex queries that join multiple tables just to check a setting.
For #2: there's no reason to put the address in another table, because the single "AddressID" column means there's just one address per member, anyway, and again, it's just going to complicate the queries. If you turn it around backwards and have an address table that embeds userids then that might make sense; it makes even more sense to do phone numbers that way, since people often have multiple phone numbers.
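The "turned around" layout, where the child table embeds the user id, can be sketched for phone numbers as below. It is a sketch under assumptions: the `member_phone` table, its columns, the sample rows, and the `phones_for` helper are hypothetical, and Python's built-in sqlite3 module stands in for the site's actual database.

```python
import sqlite3

# SQLite stands in for the real database (assumption); names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE member_profile (
    UserID INTEGER PRIMARY KEY,
    FirstName TEXT NOT NULL
);
-- The phone table points at the member, so a member can have many phones.
CREATE TABLE member_phone (
    PhoneID INTEGER PRIMARY KEY,
    UserID INTEGER NOT NULL REFERENCES member_profile(UserID),
    Label TEXT NOT NULL,
    Phone TEXT NOT NULL
);
INSERT INTO member_profile VALUES (1, 'Ann'), (2, 'Ben');
INSERT INTO member_phone (UserID, Label, Phone) VALUES
    (1, 'home', '555-0100'),
    (1, 'mobile', '555-0101'),
    (2, 'mobile', '555-0199');
""")

def phones_for(user_id):
    """Return all phone numbers stored for one member."""
    return conn.execute(
        "SELECT Label, Phone FROM member_phone WHERE UserID = ? ORDER BY PhoneID",
        (user_id,),
    ).fetchall()
```

With the foreign key on the child side, adding a second number is an insert rather than a schema change.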
If each member in the database has exactly ONE value for each of the attributes you have listed, then your database is already normalized and thus in a quite convenient form. So, to answer #1, moving these fields to a different table would improve nothing and just make querying more difficult.
As for #2, if you wanted to contemplate the possibility of a member having multiple addresses or phone numbers, you should definitely put those in different tables, allowing many-to-one relationships. This might also make sense if you expect that a number of users will share the same address; this way, you will not be duplicating information by having to store all the same address information for multiple users, you would just reference an addresses table that would have the relevant information one time per address.
However, if you need neither multiple addresses per member nor multiple members per address, then putting the addresses information in another table is just unnecessary complexity. Which solution is more convenient depends on the needs of your specific application.
Since each member has exactly one value for each column in this table, it's already normalized. However, for the sake of query efficiency, denormalization is sometimes worth considering.
Apart from the ID field, the columns fall into 2 groups: a profile group and a settings group. If your website usually uses these two groups of data separately, you should consider having separate tables for the different uses.
For example, if the profile fields are only shown on the profile page while the settings fields are used across the whole site, it's not necessary to look up the profile fields all the time.
I have a requirement to store all versions of an entity in an easily indexed way and was wondering if anyone has input on what system to use.
Without versioning, the system is simply a relational database with a row per, for example, person. If the person's state changes, that row is changed to reflect this. With versioning, the entry should be updated in such a way that we can always go back to a previous version. If I could use a temporal database, this would come for free and I would be able to ask 'what is the state of all people as of yesterday at 2pm living in Dublin and aged 30'. Unfortunately, there don't seem to be any mature open source projects that can do temporal queries.
A really nasty way to do this is just to insert a new row per state change. This leads to duplication, as a person can have many fields but only one changing per update. It is also then quite slow to select the correct version for every person given a timestamp.
In theory it should be possible to use a relational database and a version control system to mimic a temporal database but this sounds pretty horrendous.
So I was wondering if anyone has come across something similar before and how they approached it?
Update
As suggested by Aaron here's the query we currently use (in mysql). It's definitely slow on our table with >200k rows. (id = table key, person_id = id per person, duplicated if the person has many revisions)
select name from person p where p.id = (select max(id) from person where person_id = p.person_id and timestamp <= :timestamp)
Update
It looks like the best way to do this is with a temporal db but given that there aren't any open source ones out there the next best method is to store a new row per update. The only problem is duplication of unchanged columns and a slow query.
There are two ways to tackle this. Both assume that you always insert new rows. In every case, you must insert a timestamp (created) which tells you when a row was "modified".
The first approach uses a number to count how many instances you already have. The primary key is the object key plus the version number. The problem with this approach seems to be that you'll need a select max(version) to make a modification. In practice, this is rarely an issue since for all updates from the app, you must first load the current version of the person, modify it (and increment the version) and then insert the new row. So the real problem is that this design makes it hard to run updates in the database (for example, assign a property to many users).
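A minimal sketch of this first approach, assuming a hypothetical `person_version` table keyed by (person_id, version) and ISO-formatted timestamp strings in `created`, which sort correctly as text. Python's built-in sqlite3 module stands in for MySQL, and `save` and `as_of` are illustrative helpers rather than an established API.

```python
import sqlite3

# SQLite stands in for MySQL here (assumption); names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person_version (
    person_id INTEGER NOT NULL,
    version INTEGER NOT NULL,
    name TEXT NOT NULL,
    city TEXT NOT NULL,
    created TEXT NOT NULL,  -- ISO timestamp, e.g. '2024-01-01 09:00:00'
    PRIMARY KEY (person_id, version)
);
""")

def save(person_id, name, city, created):
    """Insert a new version instead of updating the existing row."""
    (current,) = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM person_version WHERE person_id = ?",
        (person_id,),
    ).fetchone()
    conn.execute(
        "INSERT INTO person_version VALUES (?, ?, ?, ?, ?)",
        (person_id, current + 1, name, city, created),
    )

def as_of(person_id, timestamp):
    """Return the (name, city) that was valid at the given timestamp."""
    return conn.execute("""
        SELECT name, city FROM person_version
        WHERE person_id = ? AND created <= ?
        ORDER BY version DESC
        LIMIT 1
    """, (person_id, timestamp)).fetchone()
```

`as_of` returns `None` when the person had no version yet at that time. The composite primary key covers the lookup by person; whether an extra index on `created` helps would depend on the real data.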
The next approach uses links in the database. Instead of a composite key, you give each object a new key and you have a replacedBy field which contains the key of the next version. This approach makes it simple to find the current version (... where replacedBy is NULL). Updates are a problem, though, since you must insert a new row and update an existing one.
To solve this, you can add a back pointer (previousVersion). This way, you can insert the new rows and then use the back pointer to update the previous version.
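The linked approach with a back pointer might look like the sketch below. It rests on assumptions: the column names `previous_version` and `replaced_by` are snake_case renderings of the fields named above, the `save` and `current` helpers are hypothetical, and Python's built-in sqlite3 module stands in for MySQL.

```python
import sqlite3

# SQLite stands in for MySQL here (assumption); names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (
    id INTEGER PRIMARY KEY,          -- every version gets its own key
    person_id INTEGER NOT NULL,      -- stable id of the person
    name TEXT NOT NULL,
    created TEXT NOT NULL,
    previous_version INTEGER REFERENCES person(id),  -- back pointer
    replaced_by INTEGER REFERENCES person(id)        -- NULL = current version
);
""")

def save(person_id, name, created):
    """Insert a new version, then mark the previous one as replaced."""
    with conn:  # both statements succeed or fail together
        prev = conn.execute(
            "SELECT id FROM person WHERE person_id = ? AND replaced_by IS NULL",
            (person_id,),
        ).fetchone()
        prev_id = prev[0] if prev else None
        cur = conn.execute(
            "INSERT INTO person (person_id, name, created, previous_version) "
            "VALUES (?, ?, ?, ?)",
            (person_id, name, created, prev_id),
        )
        if prev_id is not None:
            # Follow the back pointer to close the old version.
            conn.execute(
                "UPDATE person SET replaced_by = ? WHERE id = ?",
                (cur.lastrowid, prev_id),
            )
        return cur.lastrowid

def current(person_id):
    """The current version is the row nobody has replaced yet."""
    row = conn.execute(
        "SELECT name FROM person WHERE person_id = ? AND replaced_by IS NULL",
        (person_id,),
    ).fetchone()
    return row[0] if row else None
```

Wrapping the insert and the update in one transaction keeps the chain consistent, so there is never more than one row per person with `replaced_by` set to NULL.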
Here is a (somewhat dated) survey of the literature on temporal databases: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.6988&rep=rep1&type=pdf
I would recommend spending a good while sitting down with those references and/or Google Scholar to try to find some good techniques that fit your data model. Good luck!