I've got a potential race condition in an application I'm developing, which I'd like to account for and avoid in my querying.
To summarise the application flow...
Create a new row in the entries table:
INSERT INTO entries ( name, email ) VALUES ( 'Foo Bar', 'foo@example.com' );
Find out if Mr Bar is a winner by checking a time-sensitive prizes table:
SELECT id FROM prizes WHERE various_time_conditions = 'met' AND id NOT IN ( SELECT prize_id FROM entries );
If he's a winner, update his entry row accordingly:
UPDATE entries SET prize_id = [prize id] WHERE id = [entry id];
As each prize can only be given out once, I need to eliminate any possibility of a race condition where another process can query the prizes table and update the entry table between steps 2 and 3 above.
I've been doing some research and have found a load of information about transactions (all my tables use InnoDB) and using MySQL's SELECT ... FOR UPDATE syntax but I'm confused as to which is the most suitable solution for me.
You're going to want to lock the prize record. So add some availability flag on the prizes table (perhaps with a default value) if you're not going to use something like a winner_id. Something like this:
SELECT id FROM prizes WHERE ... AND available = 1 FOR UPDATE
Then set the availability if you do assign the prize:
UPDATE prizes SET available = 0 WHERE id = ...
You'll need to wrap this inside a transaction of course.
Make sure that every time you check to see if the prize is available, you add AND available = 1 FOR UPDATE to the query because a SELECT without the FOR UPDATE is not going to wait for a lock.
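Put together, the flow described above might look like the following sketch. It assumes InnoDB tables and the suggested available column on prizes; the ids 42 and 1001 are placeholders for the prize id returned by the SELECT and for the entry id from your earlier INSERT, and the time conditions are left for you to fill in.

```sql
START TRANSACTION;

-- Lock one available prize row. Another transaction running the same
-- locking SELECT against this row waits until we COMMIT or ROLLBACK.
SELECT id
FROM prizes
WHERE available = 1
  -- AND <your time conditions>
ORDER BY id
LIMIT 1
FOR UPDATE;

-- If a row came back (placeholder id 42), claim it and record the win
-- on the entry (placeholder id 1001).
UPDATE prizes  SET available = 0 WHERE id = 42;
UPDATE entries SET prize_id = 42 WHERE id = 1001;

COMMIT;
```

If the SELECT returns no row, there is nothing to claim: skip the two UPDATE statements and simply COMMIT (or ROLLBACK) to release any locks.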
I am trying to combine INSERT, UPDATE and WHERE NOT EXISTS in one and the same query.
What I have at the moment is these two queries, which work as expected separately:
INSERT INTO settings (mid) SELECT '123' FROM DUAL WHERE NOT EXISTS (SELECT mid FROM settings WHERE mid='123');
UPDATE settings SET vote = CONCAT_WS(',', vote, '22') WHERE mid = '123'
What I am trying to achieve is to combine them, so that I only have to hit the db once.
What I have is a table with two columns: mid, which stores the unique user id and is also the primary key, and vote, which stores the user's votes as a comma-separated list.
So, my aim here is to first check whether the user already has a row (and create it if not), and then add the new vote (22 in my example) to the list.
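One common way to do this in MySQL in a single statement is INSERT ... ON DUPLICATE KEY UPDATE. The sketch below assumes that mid is the primary key (as described) and that a brand-new row should start with the vote '22', which matches what the two separate queries produce when run one after the other.

```sql
-- Creates the row with the first vote if mid = '123' does not exist yet;
-- otherwise appends the new vote to the existing comma-separated list.
INSERT INTO settings (mid, vote)
VALUES ('123', '22')
ON DUPLICATE KEY UPDATE
    vote = CONCAT_WS(',', vote, '22');
```

In the UPDATE part, vote refers to the value already stored in the row, and CONCAT_WS skips NULL arguments, so an existing row with a NULL vote ends up as '22' rather than ',22'.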
I have a table as such:
id  entity_id  first_year  last_year  sessions_attended  age
1   2020       1996        2008       3                  34.7
2   2024       1993        2005       2                  45.1
3   ...        ...         ...
id is auto-increment primary key, and entity_id is a foreign key that must be unique for the table.
I have a query that calculates first and last year of attendance, and I want to be able to update this table with fresh data each time it is run, only updating the first and last year columns:
This is my insert/update for "first year":
insert into my_table (entity_id, first_year)
( select contact_id, @sd:= year(start_date)
  from
  ( select contact_id, event_id, start_date from participations
    join events on participations.event_id = events.id where events.event_type_id = 7
    group by contact_id order by event_id ASC) as starter)
ON DUPLICATE KEY UPDATE first_year_85 = @sd;
I have one similar that does "last year", identical except for the target column and the order by.
The queries alone return the desired values, but I am having issues with the insert/update queries. When I run them, I end up with the same values for both fields (the correct first_year value).
Does anything stand out as the cause for this?
Anecdotal Note: This seems to work on MySQL 5.5.54, but when run on my local MariaDB, it just exhibits the above behavior...
Update:
Not my table design to dictate. This is a CRM that allows custom fields to be defined by end-users, I am populating the data via external queries.
The participations table holds all event registrations for all entity_ids, but the start dates are held in a separate events table, hence the join.
The variable is there because the ON DUPLICATE UPDATE will not accept a reference to the column without it.
Age is actually slightly more involved: It is age by the start date of the next active event of a certain type.
Fields are being "hard" updated because the values in this table are pulled by in-CRM reports and searches; they need to be present and can't be calculated dynamically.
Since you have a 'natural' PK (entity_id), why have the id?
age? Are you going to have to change that column daily, or at least monthly? Not a good design. It would be better to have the constant birth_date in the table, then compute the ages in SELECT.
"calculates first and last year of attendance" -- This implies you have a table that lists all years of attendance (yoa)? If so, MAX(yoa) and MIN(yoa) would probably be a better way to compute things.
One rarely needs @variables in queries.
Munch on my comments; come back for more thoughts after you provide a new query, SHOW CREATE TABLE, EXPLAIN, and some sample data.
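As a rough sketch of the MIN/MAX idea without a user variable: the statement below fills both columns in one pass. It assumes that start_date lives in events (as the question's update says), that entity_id has a unique key, and that the target columns are really named first_year and last_year; adjust the names if your custom fields are called something else.

```sql
INSERT INTO my_table (entity_id, first_year, last_year)
SELECT p.contact_id,
       MIN(YEAR(e.start_date)),   -- earliest year attended
       MAX(YEAR(e.start_date))    -- latest year attended
FROM participations AS p
JOIN events AS e ON p.event_id = e.id
WHERE e.event_type_id = 7
GROUP BY p.contact_id
ON DUPLICATE KEY UPDATE
    first_year = VALUES(first_year),
    last_year  = VALUES(last_year);
```

VALUES(col) here means "the value this row would have inserted into col", so each existing row receives its own freshly computed years. Newer MySQL 8.0 releases mark this use of VALUES() as deprecated in favour of a row alias, but it is still accepted.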
I have two tables called users and packages.
In users there is a column called "package" and in packages a column called "id".
What I'm trying to accomplish is this: if the package id in the users table is changed to, let's say, "1", then another field in the users table called "storage" should be changed to the corresponding "maxstorage" from the packages table. A little illustration here:
DATABASE:
Let's say Joe would like to upgrade to package number 2. Then his storage amount should change when his package changes: the maxstorage value from the packages table should be pulled into the "storage" column of the users table.
How can I accomplish this?
It's pretty hard to explain for me, if anyone gets it then please edit for easier explanation.
What you want is not possible in a query (or at least, not simply). You have to move this logic to your code: you already have a query that changes the row in the users table, so update storage in that same query as well.
Even better, drop the users.storage column completely. Good databases don't repeat data: you already have it in the packages table, so why copy it to users?
SELECT users.name, packages.maxstorage
FROM users
LEFT JOIN packages ON (users.package_id = packages.id)
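If you do keep the storage column, the first suggestion (changing the package and the storage in the same query) could be sketched with a MySQL multi-table UPDATE. This assumes the column names from the question (users.package, users.storage, packages.maxstorage), a users.name column to pick out Joe, and package 2 as the new package.

```sql
-- Switch Joe to package 2 and copy that package's maxstorage in one statement.
UPDATE users AS u
JOIN packages AS p ON p.id = 2
SET u.package = p.id,
    u.storage = p.maxstorage
WHERE u.name = 'Joe';
```

Because both columns are set from the same packages row, they cannot drift apart within this statement; if package 2 does not exist, the join matches nothing and no row is changed.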
I may not have understood your question correctly, but what about this:
(Please give appropriate consideration to transactions to avoid conflicts.)
CREATE TABLE P (ID INT, MAXSTORAGE INT);
CREATE TABLE U (USR_ID INT, PACKAGE_ID INT, STORAGE INT);
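-- DELIMITER is a mysql client command; it lets the trigger body contain semicolons.
DELIMITER //
CREATE TRIGGER U_STORAGE_UPDATE BEFORE UPDATE ON U
FOR EACH ROW BEGIN
SET NEW.STORAGE = IF(NEW.PACKAGE_ID<>OLD.PACKAGE_ID , (SELECT MAXSTORAGE FROM P WHERE ID = NEW.PACKAGE_ID), NEW.STORAGE);
END//
DELIMITER ;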
INSERT INTO P VALUES (1,12345);
INSERT INTO P VALUES (2,54321);
INSERT INTO U VALUES (1,1,12000);
INSERT INTO U VALUES (2,2,60000);
SELECT * FROM U;
UPDATE U SET PACKAGE_ID=2 WHERE USR_ID=1;
SELECT * FROM U;
UPDATE U SET STORAGE=23
WHERE USR_ID=1;
SELECT * FROM U;
DROP TABLE P;
DROP TABLE U;
Output:
ante
   USR_ID  PACKAGE_ID  STORAGE
1  1       1           12000
2  2       2           60000
post 1st update
   USR_ID  PACKAGE_ID  STORAGE
1  1       2           54321
2  2       2           60000
post 2nd update
   USR_ID  PACKAGE_ID  STORAGE
1  1       2           23
2  2       2           60000
Doesn't answer the question but might be useful to you:
The 'package' column in Users should have a foreign key constraint on 'id' in Packages. This ensures that every value in the 'package' column corresponds to a valid row in the Packages table. Otherwise you could enter a value in the 'package' column that has no matching row in the Packages table.
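A minimal sketch of adding that constraint in MySQL, assuming both tables use InnoDB, that users.package and packages.id have the same data type, and that packages.id is the primary key (the constraint name fk_users_package is made up for the example):

```sql
ALTER TABLE users
    ADD CONSTRAINT fk_users_package
    FOREIGN KEY (package) REFERENCES packages (id);
```

The ALTER fails if users already contains package values with no matching row in packages, so those rows need cleaning up first.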
I have two tables, A and B, where A is the master table and B is a stage table. Many stage tables like B get created (one per state [AP, MH, UP, TN, ...], from the updates those states provide), and each needs to be upserted into table A monthly.
A: Id (Primary Key), Name, Email, ContactNumber, State, TimeStamp, Active (True, False)
A contains all the data from every state, with the last update date as the timestamp and the active flag.
B: Name, Email, ContactNumber
For now, consider B to be the updates from state AP. Being an update, B can be missing some rows that were there earlier, can have some new rows, and can have some rows unchanged from before.
I have to upsert all the updates for AP by joining on (name, email, contact_number), such that:
If a row is present only in A, then set active_flag to False
If a row is present in both, then UPDATE the row in A with timestamp=now() and active_flag=True
If a row is present only in B, insert it into A with the extra values STATE=AP, timestamp=now() and active=True
Is this possible using CASE or IF-ELSE? Would that be faster than using multiple queries, such as an update followed by an insert?
It seems you are looking for the REPLACE INTO command, as described in the MySQL manual, combined with MySQL's INSERT ... SELECT syntax (insert into sometable select * from anothertable), like this:
replace into mastertable
select
field1, field2, field3, now()
from
stagetable s
left outer join
mastertable m2 on m2.id=s.id
where
m2.ActiveFlag=true or m2.ActiveFlag is null
Of course I didn't try this, but this is the general idea.
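For comparison, the three rules from the question can also be written as three plain statements inside one transaction. This is only a sketch under assumptions: MySQL syntax, the column names listed in the question, an auto-generated A.Id, no NULLs in B's Name, Email or ContactNumber, and B holding only the AP updates.

```sql
START TRANSACTION;

-- 1. AP rows in A that no longer appear in B: mark inactive.
UPDATE A
LEFT JOIN B ON B.Name = A.Name
           AND B.Email = A.Email
           AND B.ContactNumber = A.ContactNumber
SET A.Active = FALSE
WHERE A.State = 'AP'
  AND B.Name IS NULL;

-- 2. AP rows present in both: refresh the timestamp and mark active.
UPDATE A
JOIN B ON B.Name = A.Name
      AND B.Email = A.Email
      AND B.ContactNumber = A.ContactNumber
SET A.`TimeStamp` = NOW(),
    A.Active = TRUE
WHERE A.State = 'AP';

-- 3. Rows only in B: insert them as new, active AP rows.
INSERT INTO A (Name, Email, ContactNumber, State, `TimeStamp`, Active)
SELECT B.Name, B.Email, B.ContactNumber, 'AP', NOW(), TRUE
FROM B
LEFT JOIN A ON A.Name = B.Name
           AND A.Email = B.Email
           AND A.ContactNumber = B.ContactNumber
           AND A.State = 'AP'
WHERE A.Id IS NULL;

COMMIT;
```

Unlike REPLACE, which deletes a conflicting row and inserts a new one, these statements leave existing Id values in place and also cover the "only in A" case. An index on (Name, Email, ContactNumber) in both tables would probably matter more for speed than the number of statements.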
I am currently testing the change tracking mechanism in SQL Server 2008 and noticed something:
when I insert a new record into the change-tracked base table and then delete it using the same key, the select of the changes tells me that the record should be deleted in the remote table, even though that record doesn't exist in that table at all.
Why does it work that way?
SAMPLE CODE:
CREATE TABLE TEST (
ID UNIQUEIDENTIFIER primary key,
value int
)
ALTER TABLE [dbo].Test
ENABLE CHANGE_TRACKING
SELECT CHANGE_TRACKING_CURRENT_VERSION()
SELECT CT.SYS_CHANGE_OPERATION, CT.ID, IV.value
FROM CHANGETABLE(CHANGES TEST, 51374) CT
LEFT JOIN TEST IV ON IV.ID = CT.ID
--zero changes now for: 51374
insert into Test VALUES ('54C1D80E-ACB0-433F-94DF-7D06FE809E22', 1)
delete from Test where id = '54C1D80E-ACB0-433F-94DF-7D06FE809E22'
select * from Test -- table is empty (insert and delete)
SELECT CT.SYS_CHANGE_OPERATION, CT.ID, IV.value
FROM CHANGETABLE(CHANGES TEST, 51374) CT
LEFT JOIN TEST IV ON IV.ID = CT.ID
--however for Anchor: 51374 it claims I should delete the record ...
My base and the remote table were in sync at 51374 anchor.
Adding and deleting the record shouldn't give me the info for deleting
of something I don't have in my remote table ...
I think Damien's answer is right. Not sure why he answered in a comment.
It's all there in the documentation, really "Only the fact that a row
has changed is required, not how many times the row has changed or the
values of any intermediate changes.", "If an application requires
information about all the changes that were made and the intermediate
values of the changed data, using change data capture, instead of
change tracking."
Imagine if there's a row with ID 1 that you know about. Then in some
period, someone goes in and deletes that row, and then adds a new row
with the same ID. Change tracking will give you an insert, even though
you already knew about a row with ID 1 and haven't seen a delete. It's
the nature of the beast - you only get the last change, and you have
to reconcile that with your version of reality. If you see a Delete
for a row you don't know about, ignore it.
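One way to "ignore it" in the query itself is to drop delete entries for keys the remote side has never seen. The sketch below assumes a hypothetical table RemoteTest that stands in for the remote copy and is reachable from the same query; if the remote table lives elsewhere, the same check would be done on that side, or the delete would simply be applied and allowed to affect zero rows.

```sql
-- 51374 is the anchor from the question; RemoteTest is a made-up name.
DECLARE @last_sync_version bigint = 51374;

SELECT CT.SYS_CHANGE_OPERATION, CT.ID
FROM CHANGETABLE(CHANGES TEST, @last_sync_version) AS CT
WHERE CT.SYS_CHANGE_OPERATION <> 'D'   -- keep inserts and updates
   OR EXISTS (SELECT 1                 -- keep deletes only for known rows
              FROM RemoteTest AS R
              WHERE R.ID = CT.ID);
```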