Concurrency control in a database? - MySQL

I am implementing an online judge. A user's submission goes into the submission table of
the database. The table has a status attribute which is initially Queued. My program connects to the database, looks for submissions in the submission table with Queued status, picks one of them, and changes its status to Assessing. The submission is then compiled and run against the test cases, and according to the result the status attribute is changed to Accepted, Wrong Answer, etc.
My question: if I run my program on two different machines against the same database, the two programs can hit a concurrency issue. For example, if I make a submission, it will have status = Queued. Suppose the first program reads it, and before it changes the
status to Assessing, the second program also reads the submission. No error is raised, but the same submission ends up being evaluated twice.
Does MySQL provide concurrency control in such a case, or do I have to add it myself? If so,
what is the best method?

This can be achieved with locking reads. There is good documentation with examples at http://dev.mysql.com/doc/refman/5.0/en/innodb-locking-reads.html
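For the queue-claiming step described in the question, a locking read lets each machine claim a submission atomically. A minimal sketch, assuming an InnoDB submission table with id and status columns (the names come from the question; the exact schema is assumed):

    START TRANSACTION;

    -- Lock one queued row. A second judge running the same statement
    -- blocks here until we commit, then re-evaluates the WHERE clause
    -- and picks a different row.
    SELECT id
    FROM submission
    WHERE status = 'Queued'
    ORDER BY id
    LIMIT 1
    FOR UPDATE;

    -- Suppose the locked row had id = 123:
    UPDATE submission SET status = 'Assessing' WHERE id = 123;

    COMMIT;

After the commit, the claiming machine compiles and runs the submission and writes the final verdict (Accepted, Wrong Answer, etc.) with an ordinary UPDATE.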

Concurrent inserts in MySQL - calling the same insert stored proc before the first set of inserts is completed

I am working on a social networking site which includes the creation of media content and also records users' interactions with that content.
Background of the issue - the approach currently used
There is a page called news-feed, which displays the content, and the activity on that content, by the users one follows on the site.
The display order of the content changes as user interactions accumulate (e.g. if a post has more comments, it is likely to be shown above one with fewer comments; the number of comments, however, is just one of the attributes used to rank the posts).
I am using a MySQL (InnoDB) database to store the data as follows:
activity_master: activities allowed to be part of the news feed (post, comment, etc.)
activity_set: aggregation of activities on the same object
activity_feed: details of the actual activity
A detailed ER diagram is at the end of the question.
Scenario
A user (with 1000 followers) posts something, which triggers an async call to the procedure that inserts the relevant entries into the above tables for all followers (1000 rows for 1000 followers).
Some followers start commenting (an activity allowed to be part of the news feed) before the above call has completed, which triggers further calls to the same procedure to insert entries for this activity for their own sets of followers (e.g. user B commented on this post).
All of these insert requests (which seem far too many) have to be processed in a queue by the InnoDB engine.
Questions
Is there a better and more efficient way to do this? (I definitely think there would be one.)
How many insert requests can InnoDB handle in its default configuration?
How can deadlocks (or resource congestion at the database end) be avoided in this case?
Or is there another type of database better suited to this case?
Thanks for showing your interest by reading the description; any help in this regard is much appreciated. Let me know if any further details are required. Thanks in advance!
ER diagram of the tables (not enough reputation to embed the image directly :( )
A rule of thumb: "Don't queue it, just do it."
Inserting 1000 rows per post is likely to be untenable; tomorrow, it will be 10000.
Can't you do the processing on the select side instead of the insert side?
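To make the select-side idea concrete: store each activity once, and assemble a follower's news feed at read time by joining against the follow relationship, so a post costs one insert instead of one per follower. A sketch only; the follows(follower_id, followee_id) table and the actor/timestamp columns on activity_feed are assumptions, since the real schema is in the ER diagram:

    -- Build the feed for user 42 at read time.
    SELECT a.*
    FROM activity_feed AS a
    JOIN follows AS f ON f.followee_id = a.actor_id
    WHERE f.follower_id = 42
    ORDER BY a.created_at DESC
    LIMIT 50;

Ranking by interaction counts can then be computed in the ORDER BY (or in application code) instead of being baked into per-follower rows.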

What are best practices for creating a "log of changes" table for a database table?

We are updating a table XYZ that has the following fields:
First Name|Middle Name|Last Name|Address|DOB|Country|County|(etc.)
We call a web service which sends updated information for a row in XYZ: it may update the first name, the DOB, both, all of the fields, or none of them.
Now there is a requirement to create a log table in the database which stores a summary of the old records and the changes made to XYZ. Every affected row should be reported.
Is it good to create similar fields in a new table, say ABC:
First Name|Middle Name|Last Name|Address|DOB|Country|County
with an additional field called "Update_datetime"?
Then, each time the service is called, we would select the previous values from XYZ and insert them into ABC along with the update timestamp.
What are loopholes in this practice? What other better practices can be followed?
Is there a requirement for a log table, or a requirement for a proper history?
Oracle has history functionality out of the box.
I doubt MySQL does - you may have to do it a different way.
The pro of Oracle's approach is that it will not fail - it's a core feature. The con of hand-rolled is, well, it's hand-rolled: lots of stored procedures, triggers, or other nastiness that people can deliberately or inadvertently bypass.
I echo the need to know what the requirements are behind this. Is it to be human-readable (auditing, debugging, etc.) or machine-readable (e.g. the event sourcing architectural pattern)? How often will you need to go back and look at previous versions? How often do things change?
If it's event sourcing, then there are a few answers about that on Stack Overflow, e.g. Using an RDBMS as event sourcing storage and best event sourcing db strategy. For more of an introduction, there's e.g. a Martin Fowler video.
There are also SO answers on logging changes in MySQL and Using MySQL triggers to log all table changes to a secondary table, plus an alternative approach (using one table, but adding sort-of version numbers to show each record's validity).
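For the hand-rolled MySQL route, the usual pattern is an AFTER UPDATE trigger that copies the old row into the log table. A sketch only; the column names are guessed from the field list in the question, and it assumes XYZ has an id primary key and ABC has matching columns plus update_datetime:

    DELIMITER //
    CREATE TRIGGER xyz_log_update
    AFTER UPDATE ON XYZ
    FOR EACH ROW
    BEGIN
        -- Copy the pre-update values into the log table.
        INSERT INTO ABC (id, first_name, middle_name, last_name,
                         address, dob, country, county, update_datetime)
        VALUES (OLD.id, OLD.first_name, OLD.middle_name, OLD.last_name,
                OLD.address, OLD.dob, OLD.country, OLD.county, NOW());
    END//
    DELIMITER ;

Unlike logging from the web-service layer, a trigger also catches rows changed by ad-hoc UPDATEs, though anyone with DDL rights can still drop it.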

What kind of locking/transaction isolation level is appropriate for this situation?

Let's say I have a Student and a School table. One operation that I am performing is this:
Delete all Students that belong to a School
Modify the School itself (maybe change the name or some other field)
Add back a bunch of students
I am not concerned about this situation: two people edit the School/Students at the same time; one submits their changes, and shortly after, someone else submits theirs. This won't be a problem because, in the second user's case, the application will notice that they are attempting to overwrite a newer revision.
I am concerned about this: someone opens the editor for the Schools/Students (which involves reading from the tables) while, at the same time, a transaction that modifies them is running.
So basically, a read should not be able to run while a transaction is modifying the tables, and a write shouldn't be able to occur at the same time either.
Only at the SERIALIZABLE isolation level will MySQL prevent you from reading rows that are being modified by another transaction. At any lower isolation level, you will see the rows in the state they were in before the modifying transaction started. Of course, at READ UNCOMMITTED, the rows will be seen as deleted/modified even though the transaction hasn't completed.
If you use SELECT ... FOR UPDATE, the read will block until the modifying transaction has completed.
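A sketch of what that looks like for the School/Students case (table and column names are assumed from the question):

    -- Writer: all three steps in one transaction.
    START TRANSACTION;
    DELETE FROM Student WHERE school_id = 7;
    UPDATE School SET name = 'New Name' WHERE id = 7;
    INSERT INTO Student (school_id, name) VALUES (7, 'Alice'), (7, 'Bob');
    COMMIT;

    -- Reader (the editor): locking reads wait for the writer to commit,
    -- so they never observe the half-finished state.
    START TRANSACTION;
    SELECT * FROM School WHERE id = 7 FOR UPDATE;
    SELECT * FROM Student WHERE school_id = 7 FOR UPDATE;
    COMMIT;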
You can use table locking to prevent this. Check this for more info on LOCK TABLES.
EDIT
Have a look at this: how to lock rows so that they can't be selected in another transaction. I think a similar method can be applied to tables as well.
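Applied to whole tables, the equivalent sketch uses LOCK TABLES (coarse, but it blocks both readers and writers in other sessions; table names assumed as above):

    LOCK TABLES School WRITE, Student WRITE;

    DELETE FROM Student WHERE school_id = 7;
    UPDATE School SET name = 'New Name' WHERE id = 7;
    INSERT INTO Student (school_id, name) VALUES (7, 'Alice');

    UNLOCK TABLES;  -- releases the locks so other sessions can proceed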

Conflict in insertion of data into the DB by user and admin - see below for description

I have a case where, at one end, an admin is editing the details of user "A" in a table "users", while at the same time user "A" edits his own details in the users table. Whose changes will be reflected? And what can be done to give one of them priority?
Thanks and Regards...
As Michael J.V. says, the last one wins - unless you have a locking mechanism, or build application logic to deal with this case.
Locking mechanisms tend to dramatically reduce the performance of your database.
http://dev.mysql.com/doc/refman/5.5/en/internal-locking.html gives an overview of the options in MySQL. However, the scenario you describe - the admin accesses a record and holds a lock on it until they modify it - will cause all kinds of performance issues.
The alternative is to check for a "dirty" record prior to writing the record back. Pseudocode:
User finds record
Application stores (hash of) record in memory
User modifies copy of record
User instructs application to write record to database
Application retrieves current database state, compares to original
If identical:
    write change to database
If not identical:
    notify user
In this model, the admin's change would trigger the "notify user" flow; your application may decide to stop the write, or force the user to refresh the record from the database prior to modifying it and trying again.
More code, but far less likely to cause performance/scalability issues.
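A common way to implement this dirty check is a version (or hash) column that is compared in the WHERE clause of the UPDATE. A sketch, assuming a hypothetical users table with an added version column:

    -- Read the record, remembering the version we saw.
    SELECT id, first_name, version FROM users WHERE id = 42;
    -- suppose this returned version = 7

    -- Write back only if nobody changed the row in the meantime.
    UPDATE users
    SET first_name = 'New Name',
        version = version + 1
    WHERE id = 42
      AND version = 7;

If the UPDATE reports zero affected rows, someone else won the race, and the application runs the "notify user" flow instead of silently overwriting.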

Web inserts at the same time

We have developed an online quiz where users can register a team to take part in the quiz.
There is a check in the ASP to see if that team name has already been submitted, and if it has, an error is generated.
We have noticed a problem: if two teams register at exactly the same time with the same name, both teams are registered. Although this is highly unlikely, we wondered what approach should be used to overcome it.
We are using MySQL as the database, if that makes any difference.
Thanks for any guidance.
Don't worry, this never happens.
Databases are smart enough to handle concurrency issues.
If you run a query against the database to register a team and another team registers at the same time, then at the database level the first query (the one that reaches the database first) succeeds and the second fails with an error, which you should handle. If registration requires more than a simple insert into one table, you should use transactions in your queries/stored procedures.
You can set the name column to be unique, and the database will throw an error on the second insert. If you want to do it in code instead, it will be more complicated.
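A sketch of the unique-constraint approach (table and column names are assumed, since the question doesn't give the schema):

    -- Enforce uniqueness in the database itself.
    ALTER TABLE teams ADD UNIQUE KEY uq_team_name (team_name);

    -- Whichever of two simultaneous registrations reaches the database
    -- second now fails with a duplicate-key error (MySQL error 1062),
    -- which the ASP code should catch and report as "name already taken".
    INSERT INTO teams (team_name) VALUES ('Quiz Wizards');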
Not sure how two teams can register at exactly the same time - even if the requests are submitted simultaneously (down to the nanosecond), your transaction semantics should still guarantee that one of them is "first" from the database's point of view.