Tree editing by multiple users simultaneously? - mysql

I have a tree like hierarchical category which I need to save in a DB. I used MPTT (nested sets) to save this data.
The problem is that this category needs to be editable by multiple users, sometimes simultaneously.
How to preserve the integrity of the structure, without putting too much constraints on the users?
Given the nature of MPTT that when changing an element in the structure it affects other elements too (changing of the left / right values).
For ex. User A deletes Node1 and User B adds Leaf1 under Node1. This should give an error to User B that Node1 does not exists anymore, but I believe it would just create confusion for User B...
Are there any practical solutions for this problem?

What you are looking for is optimistic concurrency. It means, you allow the user to begin editing the record but before you apply the changes, you check if the record is in the same state while it was when user started editing.
Other scenario is to lock all the records that might get affected by editing but it will restrict the user from making any changes.

How would this differ from any other multi-user transactional system?
Enclose your user operation in a transaction, so that all participating tables will be locked. Then you validate the inputs against the current data and perform the update.

Related

Database Design: Difference between using boolean fields and duplicate tables

I have to design a database schema for an application I'm building. I will be using MySQL. In this application, users enter data and it gets saved in the database obviously. However, this data is not accessible to the public until the user publishes the data. Currently, I have one column for storing all the data. I was wondering if a boolean field in this table that indicates whether the data has been published is a good idea. Or, is it much better design to create one table for saved data and one table for published data and move the saved data to the published data table when the user presses Publish.
What are the advantages and disadvantages of using each one and is one of them considered better design than the other?
Case: Binary
They are about equal. Use this as a learning exercise -- Implement it one way; watch it for a while, then switch to the other way.
(same) Space: Since a row exists exactly once, neither option is 'better'.
(favor 1 table) When "publishing" it takes a transaction to atomically delete from one table and insert into the other.
(favor 2 tables) Certain SELECTs will spend time filtering out records with the other value for published. (This applies to deleted, embargoed, approved, and a host of other possible boolean flags.)
Case: Revision history
If there are many revisions of a record, then two tables, Current data and History, is better. That is because the 'important' queries involve fetching the only Current data.
(PARTITIONs are unlikely to help in either case.)

How to synchronize table updates

I've single database table containing some financial information. Multiple users may be viewing and updating at the same time from a web form on their computers.
What I want is that anyone who does an update must be doing based on latest table contents. I mean two people may click update at the same time. Say first person's update is successful. Now the second person's update is based on stale information and did not get chance to see the latest update from the first person.
How to avoid such situation?
you have to set the isolation level of your database server to REPEATABLE READ at least. When it's used, the dirty reads and nonrepeatable reads cannot occur. It means that locks will be placed on all data that is used in a query, and another transactions cannot update the data.

What kind of locking/transaction isolation level is appropriate for this situation?

Let's say I have a Student and a School table. One operation that I am performing is this:
Delete all Students that belong to a School
Modify the School itself (maybe change the name or some other field)
Add back a bunch of students
I am not concerned about this situation: Two people edit the School/Students at the same time. One submits their changes. Shortly after, someone else submits their changes. This won't be a problem because, in the second user's case, the application will notice that they are attempting to overwrite a new revision.
I am concerned about this: Someone opens the editor for the Schools/Students (which involves reading from the tables) while at the same time a transaction that is modifying them is running.
So basically, a read should not be able to run while a transaction is modifying the tables. Additionally, a write shouldn't be able to occur at the same time either.
Only in serializable isolation level MySQL won't allow you to read the rows that are being modified by another transaction. In any lower isolation level, you will see the rows in the state they were before the transaction, that modifies them, have been started. Of course, in READ_UNCOMITTED, the rows will be seen as deleted / modified, although transaction hasn't been completed.
If you use select for update,
You can use locking of tables to prevent this. Check this for more info on lock tables
EDIT
Have a look at this how to lock some row as they don't be selected in other transaction . Think a similar method can be applied for tables also

Question for Conflict in insertion of data in DB by user and admin, see below for description

I have a case that what will happen when at one end Admin is editing the Details of user "A" in a table "users" and at the same time user "A" itself edits its details in table users. Whose effect will reflected.. And what can be done to make it specific to some one or to give the priority?
Thanks and Regards...
As Michael J.V. says, the last one wins - unless you have a locking mechanism, or build application logic to deal with this case.
Locking mechanisms tend to dramatically reduce the performance of your database.
http://dev.mysql.com/doc/refman/5.5/en/internal-locking.html gives an overview of the options in MySQL. However, the scenario you describe - Admin accesses record, has a lock on that record until they modify the record - will cause all kinds of performance issues.
The alternative is to check for a "dirty" record prior to writing the record back. Pseudocode:
User finds record
Application stores (hash of) record in memory
User modifies copy of record
User instructs application to write record to database
Application retrieves current database state, compares to original
If identical
write change to database
If not identical
notify user
In this model, the admin's change would trigger the "notify user" flow; your application may decide to stop the write, or force the user to refresh the record from the database prior to modifying it and trying again.
More code, but far less likely to cause performance/scalability issues.

MySQL - Saving and loading

I'm currently working on a game, and just a while ago i started getting start on loading and saving.
I've been thinking, but i really can't decide, since I'm not sure which would be more efficient.
My first option:
When a user registers, only the one record is inserted (into 'characters' table). When the user tries to login, and after he/she has done so successfully, the server will try loading all information from the user (which is separate across multiple tables, and combines via mysql 'LEFT JOIN'), it'll run though all the information it has and apply them to the entity instance, if it runs into a NULL (which means the information isn't in the database yet) it'll automatically use a default value.
At saving, it'll insert or update, so that any defaults that have been generated at loading will be saved now.
My second option:
Simply insert all the required rows at registration (rows are inserted when from website when the registration is finished).
Downsides to first option: useless checks if the user has logged in once already, since all the tables will be generated after first login.
Upsides to first option: if any records from tables are deleted, it would insert default data instead of kicking player off saying it's character information is damaged/lost.
Downsides to second option: it could waste a bit of memory, since all tables are inserted at registration, and there could be spamming bots, and people who don't even manage to get online.
Upsides to first option: We don't have to check for anything in the server.
I also noted that the first option may screw up any search systems (via admincp, if we try looking a specific users).
I would go with the second option, add default rows to your user account, and flag the main user table as incomplete. This will maintain data integrity across your database, whereas every user record is complete in it's entirety. If you need to remove the record, you can simply add a cascading delete script to clean house.
Also, I wouldn't develop your data schema based off of malacious bots creating accounts. If you are concerned about the integrity of your user accounts, add some sort of data validation into your solution or an automated clean-house script to clear out incomplete accounts once the meet a certain criteria, i.e. the date created meeting a certain threshold.
You mention that there's multiple tables of data for each user, with some that can have a default value if none exist in the table. I'm guessing this is set up something like a main "characters" table, with username, password, and email, and a separate table for something like "favorite shortcuts around the site", and if they haven't specified personal preferences, it defaults to a basic list of "profile, games list, games by category" etc.
Then the question becomes when registering, should an explicit copy of the favorite shortcuts default be added for that user, or have the null value default to a default list?
I'd suggest that it depends on the nature of the auxiliary data tables; specifically the default value for those tables. How often would the defaults change? If the default changes often, a setup like your first option would result in users with only a 'basic' entry would frequently get new auxiliary data, while those that did specify their own entries would keep their preferences. Using your second option, if the default changed, in order to keep users updated, a search/replace would have to be done to change entries that were the old default to the new default.
The other suggestion is to take another look at your database structure. You don't mention that your current table layout is set in stone; is there a way to not have all the LEFT JOIN tables, and have just one 'characters' table?