I've got some client code that is committing some data across some tables, in simple terms like so:
Client [Id, Balance, Timestamp]
ClientAssets [Id, AssetId, Quantity]
ClientLog [Id, ClientId, BalanceBefore, BalanceAfter]
When the customer buys an asset, I do the following pseudo code:
BEGIN TRANSACTION
Step 1: GetClientRow WHERE ID = 1
Step 2: Has enough balance for new asset cost? Yes...
Step 3: INSERT INTO ClientAssets...
Step 4: UpdateClient -> UPDATE Client SET Balance = f_SumAssetsForClient(1) WHERE ID = 1 AND Timestamp = TS from Step 1;
Step 5: GetClientRow WHERE ID = 1
Step 6: INSERT INTO ClientLog, BalanceBefore = Balance at Step 1, BalanceAfter = Balance at Step 5.
COMMIT
On Step 4, the client row is updated in a single UPDATE statement using a function 'f_SumAssetsForClient' that just sums the assets for the client and returns the balance of those assets. Step 4 also updates the timestamp automatically.
My problem is that when I call GetClientRow again in Step 5, someone could have updated the client's balance in the meantime, so when I go to write the log in Step 6, it's not truly the balance after this set of steps. It would be the balance after a different write outside of this transaction.
If I could get the newly updated timestamp from the client row when I call UPDATE in Step 4, I could pass it to Step 5 and only grab the client row where the TS equals the newly updated TS. Is this possible at all? Or is my design flawed? I can't see a way out of the problem of stale data between Steps 5 and 6. I sense there is a problem in the table design but can't quite see it.
Step 1 needs to be SELECT ... FOR UPDATE. Any other data that needs to change also needs to be "locked" FOR UPDATE.
That way, another thread cannot sneak in and modify those rows. It will either be delayed until after you have COMMITted, or there might be a deadlock. Either way, the thing you are worried about cannot happen. No timestamp games.
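For example, a minimal sketch of Step 1 with row locking, assuming a MySQL/InnoDB-style engine where SELECT ... FOR UPDATE is available, and the tables from the question:

BEGIN;
-- Step 1: lock the client row for the rest of the transaction
SELECT Id, Balance, Timestamp
FROM Client
WHERE Id = 1
FOR UPDATE;
-- Steps 2-6: check the balance, insert into ClientAssets, update Client, insert into ClientLog
COMMIT;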
Copied from comment
Sounds like you need a step 3.5 that is SELECT f_SumAssetsForClient(1). Store that value, then do the update, then write the log with the stored values. You shouldn't have to deal with the timestamp at all. Or do the whole procedure as a stored proc.
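A rough sketch of that flow, reusing the pseudo code from the question (@newBalance is just a stand-in for "store that value"; the second GetClientRow in Step 5 is no longer needed):

BEGIN TRANSACTION
Step 1:   GetClientRow WHERE ID = 1                          -- remember BalanceBefore
Step 2:   Has enough balance for new asset cost? Yes...
Step 3:   INSERT INTO ClientAssets...
Step 3.5: SELECT f_SumAssetsForClient(1) INTO @newBalance    -- capture BalanceAfter inside this transaction
Step 4:   UPDATE Client SET Balance = @newBalance WHERE ID = 1
Step 6:   INSERT INTO ClientLog, BalanceBefore = Balance from Step 1, BalanceAfter = @newBalance
COMMIT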
I want to create a trigger that does the following:
Copy one column of info, jobnumber, from one table (jobs) into an existing record in another table (materials), into the column attachedjobnumber.
I haven't found the correct syntax for this. When I insert a new job, nothing gets updated and no new row is inserted, but there are no error messages in the logs.
I also need to set the bool (hasjobnumber) equal to true; when I test that part of the trigger on its own, it works fine.
That makes me think that setting the value of material.attachedjobnumber = jobs.jobnumber is the problem; my guess is that jobs.jobnumber isn't in scope when updating the material table.
If that's true, what's the proper syntax for this?
I've tested separate triggers, and so far this trigger works fine:
UPDATE material
SET isjobyet = "HAS"
WHERE barcode1 IN (
SELECT primaryRFID
FROM jobs
WHERE jobs.primaryRFID = material.barcode1
)
Since this code does work, I assume that the non-static JobNumber value is the source of the problem, since "HAS" is correctly updated.
UPDATE material
SET material.AttachedJobNumber = jobs.JobNumber
WHERE barcode1 IN (
SELECT primaryRFID
FROM jobs
WHERE jobs.primaryRFID = material.barcode1
)
From this, I expect that after each insert on the table jobs:
jobs.JobNumber is assigned to material.AttachedJobNumber
and this updates only the material rows where material.barcode1 = jobs.primaryRFID.
But no new row is inserted at all.
Before you perform the UPDATE, you can run the same script as a SELECT (skip the UPDATE syntax).
That way you can check what your script will touch without committing anything yet.
I also don't recommend using this inside the subquery:
WHERE jobs.primaryRFID = material.barcode1
That condition ties the two tables together inside the IN (SELECT ...) subquery.
If you are writing subqueries inside the WHERE clause, try to treat each one as an independent query; don't correlate it with the outer table first.
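If the goal is to copy jobs.JobNumber into the matching material rows, one thing worth trying is a join-based multi-table UPDATE instead of a correlated IN subquery. This is only a sketch, assuming MySQL syntax and the column names from the question:

UPDATE material
JOIN jobs ON jobs.primaryRFID = material.barcode1
SET material.AttachedJobNumber = jobs.JobNumber;

Inside a trigger that fires after an insert on jobs, the new row is available as NEW, so the same idea could be written without the join:

UPDATE material
SET AttachedJobNumber = NEW.JobNumber
WHERE barcode1 = NEW.primaryRFID;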
I want to convert the following (admittedly bad) query from H2/MySQL to Postgres/CockroachDB:
SET #UPDATE_TRANSFER =
    (select count(*) from transfer where id='+transfer_id+' and consumed=false) > 0;
update balance_address set balance =
    case when #UPDATE_TRANSFER then balance +
        (select value from transaction where transfer_id='+id+' and t_index=0)
    else balance end
where address =
    (select address from transaction where transfer_id='+id+' and t_index=0)
There are three tables involved in this query: balance_address, bundle, and transaction. The goal of the query is to update the overall balance when a fund transfer happens.
A transfer can have many transactions bundled together. For instance, let's assume Paul has $20 in his account and he wants to send $3 to Jane. This will result in 4 transactions:
One transaction that adds $3 to Jane's account
One transaction that removes the $20 from Paul's account
One transaction that changes Paul's account to 0
One transaction that puts the remainder of Paul's funds into a new address, still belonging to him.
Each of these transactions in the whole transfer bundle has an index and a value, as you can see above. So the goal of this update query is to update Jane's account.
The challenge is that this transfer can be processed by many servers in parallel and there is no distributed lock. So, if we naively process in parallel, each server will increment Jane’s account, leading to erroneous results.
To prevent this, the balance_address table has a column called consumed. The first server that updates the balance sets the transfer to consumed=true. Other servers or threads can only update if consumed is false.
So, my goals are to 1) improve this query and 2) rewrite it to work with Postgres. Right now, the variable construct is not accepted.
PS. I cannot change the data model.
CockroachDB doesn't have variables, but the #UPDATE_TRANSFER variable is only used once, so you can just substitute the subquery inline:
update balance_address set balance =
case
when (select count(*) from transfer where id=$1 and consumed=false)>0
then balance + (select value from transaction where transfer_id=$1 and t_index=0)
else balance
end
where address =
(select address from transaction where transfer_id=$1 and t_index=0)
But this doesn't set the consumed flag. The simplest way to do that is to make this a multi-step transaction in your client application:
num_rows = txn.execute("UPDATE transfer SET consumed=true
WHERE id=$1 AND consumed=false", transfer_id)
if num_rows == 0: return
value, address = txn.query("SELECT value, address FROM transaction
WHERE transfer_id=$1 and t_index=0", transfer_id)
txn.execute("UPDATE balance_address SET balance = balance+$1
WHERE address = $2", value, address)
In PostgreSQL, I think you could get this into one big statement using common table expressions. However, CockroachDB 2.0 only supports a subset of CTEs, and I don't think it's possible to do this with a CTE in CockroachDB yet.
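For reference, in plain PostgreSQL a single-statement version using data-modifying CTEs might look roughly like this (a sketch only, assuming the table and column names from the query above; per the limitation just mentioned, it would not run on CockroachDB 2.0):

WITH consumed AS (
    UPDATE transfer
    SET consumed = true
    WHERE id = $1 AND consumed = false
    RETURNING id
), target AS (
    SELECT value, address
    FROM transaction
    WHERE transfer_id = $1 AND t_index = 0
)
UPDATE balance_address b
SET balance = b.balance + target.value
FROM target
WHERE b.address = target.address
  AND EXISTS (SELECT 1 FROM consumed);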
I have been trying to figure out a way to do something like this question: Delete all records except the most recent one?
But I have been unable to apply it to my circumstance.
My circumstance:
https://gyazo.com/178b2493e42aa4ec4e1a9ce0cbdb95d3
SELECT * FROM dayz_epoch.character_data;
CharacterID | PlayerUID         | InstanceID | Datestamp           | LastLogin           | Alive | Generation
5           | 76561198068668633 | 11         | 2016-05-31 18:21:37 | 2016-06-01 15:58:03 | 0     | 1
6           | 76561198068668633 | 11         | 2016-06-01 15:58:20 | 2016-10-08 21:30:36 | 0     | 2
7           | 76561198068668633 | 11         | 2016-10-08 21:30:52 | 2016-10-09 18:59:07 | 1     | 3
9           | 76561198010759031 | 11         | 2016-10-08 21:48:32 | 2016-10-08 21:53:31 | 0     | 2
10          | 76561198010759031 | 11         | 2016-10-08 21:53:55 | 2016-10-09 19:07:28 | 1     | 3
(Look at the image above.) I am currently trying to make a better method for deleting dead bodies from my database for my DayZ Epoch server. I need code to delete rows where Alive = 0 if that same PlayerUID has another row where Alive = 1.
The other thing the code could do is just delete all rows except the most recent one for each PlayerUID. I hope this makes sense; it's hard to explain. The first link explains it better than I can.
But basically, I want to delete any dead player that now has an alive player with the same PlayerUID. If I were better at coding, I could see several columns I could use, like PlayerUID (a must), Datestamp, Alive, and Generation. I probably only need two of those, one being PlayerUID.
Thanks a bunch.
The easiest approach to me seems like it would be something like: sort by PlayerUID and, for each PlayerUID, delete all rows except the one with the newest Datestamp.
This would keep the player stats from their dead body in case they do not create a new character before this script is called.
So basically, you need to be sure that on an insert (or an update of Alive to 1) for a player, you remove all previous rows (just in case; normally there should be only one) with the same PlayerUID as the new one.
The easiest way is to create a trigger that runs before the insert (and on UPDATE too, if it is possible to update Alive to 1 to revive a player), using the UID of the new player to run a delete on the table for that specific UID. It is that simple ;)
For the trigger, this should look like this:
CREATE TRIGGER CLEAR_PLAYER
BEFORE INSERT ON dayz_epoch.character_data
FOR EACH ROW
  DELETE FROM dayz_epoch.character_data
  WHERE PlayerUID = NEW.PlayerUID
    AND Alive = 0; -- just in case, but what should happen if there were a row with Alive = 1?
This will be executed before the insert into the table dayz_epoch.character_data (so it won't remove the new row). It will remove every row with the PlayerUID of the inserted row. If you want some extra safety, keep the AND Alive = 0 in the condition.
Edit:
I haven't written a trigger in a long time, but I use the official docs as a reminder. Take a look if you need.
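If you also want a one-off cleanup of the dead rows that already exist (rather than waiting for the trigger), a multi-table DELETE is one option. This is just a sketch, assuming MySQL and the columns shown above:

DELETE dead
FROM dayz_epoch.character_data AS dead
JOIN dayz_epoch.character_data AS alive
  ON alive.PlayerUID = dead.PlayerUID
 AND alive.Alive = 1
WHERE dead.Alive = 0;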
OK... So I have a calendar, in a custom PHP application built using CodeIgniter, which pulls its entries from a MySQL database. In the table containing the entry data, there are 3 fields that govern the date (start_date), time (start_time) and duration (duration) of the entry. Now, as the end-user moves an entry to a new date and/or time, the system needs to check against these 3 fields for any schedule conflicts.
Following are some vars used in the query:
$request_entry_id // the id of the entry being moved
$request_start_date // the requested new date
$request_start_time // the requested new time
$request_duration // the duration of the entry being moved (will remain the same)
$end_time = ($request_start_time + $request_duration); // the new end time of the entry being moved
My query used to check for a schedule conflict is:
SELECT t.id
FROM TABLE t
WHERE t.start_date = '$request_start_date'
AND (t.start_time BETWEEN '$request_start_time' AND '$end_time')
AND t.id <> $request_entry_id
The above query will check for any entry that starts on the same date and time as the request. However, I also want to check to make sure that the new request does not fall within the duration of an existing entry, in the most efficient way (there's the trick). Any suggestions? Thanks in advance!
It's easier to figure out the logic if you first think about the condition for when there is no conflict:
The new event ends before the existing one starts, or starts after the existing event ends.
For there to be a conflict, we take the negation of the above:
The new event ends after the existing one starts, and starts before the existing event ends.
In SQL:
SELECT t.id
FROM TABLE t
WHERE t.start_date = '$request_start_date'
AND ('$end_time' > t.start_time AND '$request_start_time' < addtime(t.start_time, t.duration))
AND t.id <> $request_entry_id
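As a quick sanity check with made-up values (an existing entry starting at 10:00:00 with a 01:00:00 duration, and a request from 10:30:00 to 11:30:00):

SELECT ('11:30:00' > '10:00:00')
   AND ('10:30:00' < ADDTIME('10:00:00', '01:00:00')) AS conflicts;
-- returns 1: the requested slot overlaps the existing entry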
I have a site with about 30,000 members to which I'm adding a functionality that involves sending a random message from a pool of 40 possible messages. Members can never receive the same message twice.
One table contains the 40 messages and another table maps the many-to-many relationship between messages and members.
A cron script runs daily, selects a member from the 30,000, selects a message from the 40 and then checks to see if this message has been sent to this user before. If not, it sends the message. If yes, it runs the query again until it finds a message that has not yet been received by this member.
What I'm worried about now is that this m-m table will become very big: at 30,000 members and 40 messages we already have 1.2 million rows through which we have to search to find a message that has not yet been sent.
Is this a case for denormalisation? In the members table I could add 40 columns (message_1, message_2 ... message_40) in which a 1 flag is set each time a message is sent. If I'm not mistaken, this would make the queries in the cron script run much faster?
I know that doesn't answer your original question, but wouldn't it be way faster if you selected all the messages that weren't yet sent to a user and then picked one of those randomly?
See this pseudo-mysql here:
SELECT
    GROUP_CONCAT(messages.id) AS unsent_messages,
    user.id AS user
FROM
    messages,
    user
WHERE
    messages.id NOT IN (
        SELECT
            sent_messages.id
        FROM
            sent_messages
        WHERE
            user.id = sent_messages.user
    )
GROUP BY user.id
You could also append the ids of the sent messages to a varchar field in the members table.
Despite being bad practice, this would make it easy to get a message that has not been sent yet to a specific member with one statement.
Just like this (if you surround the ids with '-'):
SELECT message.id
FROM member, message
WHERE member.id = 2321
AND member.sentmessages NOT LIKE CONCAT('%-', message.id, '-%')
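Appending after a successful send could then look something like this (a sketch; sentmessages is the varchar field described above, and 17 stands in for the id of the message that was just sent):

UPDATE member
SET sentmessages = CONCAT(IFNULL(sentmessages, ''), '-', 17, '-')
WHERE member.id = 2321;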
1.2M rows @ 8 bytes (+ overhead) per row is not a lot. It's so small I wouldn't even bet it needs indexing (but of course you should do it).
Normalization reduces redundancy, and it is what you do when you have a large amount of data, which seems to be your case. You need not denormalize. Let there be an M-to-M table between members and messages.
You can archive the old data as your M-to-M data grows. I don't even see any conflicts, because your cron job runs daily for this task and accounts only for the data for the current day, so you can archive the M-to-M table data every week.
I believe there will be maintenance issues if you denormalize by adding additional columns to the members table. I don't recommend it. Archiving old data can save you from that trouble.
You could store only the available (unsent) messages. This implies extra maintenance when you add or remove members or message types (nothing that can't be automated with foreign keys and triggers) but simplifies delivery: pick a random row for the user, send the message, and remove the row. Also, your database will get smaller as messages get sent ;-)
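A sketch of that approach, assuming a hypothetical unsent_messages(member_id, message_id) table that is prefilled for every member:

-- pick a random message that has not been sent to this member yet
SELECT message_id
FROM unsent_messages
WHERE member_id = ?
ORDER BY RAND()
LIMIT 1;

-- after sending it, delete the row so it can never be picked again
DELETE FROM unsent_messages
WHERE member_id = ? AND message_id = ?;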
You can achieve the effect of sending random messages by preallocating the random string in your m-m table and a pointer to the offset of the last message sent.
In more detail, create a table MemberMessages with columns:
memberId,
messageIdList char(80) or varchar,
lastMessage int,
with the primary key on memberId.
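In DDL terms that might look like this (a sketch; the exact types are a guess based on the description above):

CREATE TABLE MemberMessages (
    memberId      INT NOT NULL,
    messageIdList CHAR(80) NOT NULL,  -- 40 two-digit message ids, concatenated
    lastMessage   INT NOT NULL DEFAULT 0,
    PRIMARY KEY (memberId)
);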
Pseudo-code for the cron job then looks like this...
ONE. Select the next message for a member. If no row exists in MemberMessages for this member, go to step TWO. The SQL to select the next message looks like:
select substr(messageIdList, 2*lastMessage + 1, 2) as nextMessageId
from MemberMessages
where memberId = ?
Send the message identified by nextMessageId,
then update lastMessage, incrementing it by 1, unless you have reached 39, in which case reset it to zero:
update MemberMessages
set lastMessage = MOD(lastMessage + 1, 40)
where memberId = ?
TWO. Create a random list of messageIds as a String of couplets, like 2117390740... This is your random list of message IDs as an 80-char String. Insert a row into MemberMessages for your memberId, setting messageIdList to your 80-char String and lastMessage to 1.
Send the message identified by the first couplet from the list to the member.
You can create a kind of queue / heap.
ReceivedMessages
UserId
MessageId
then:
Pick a member and select a message to send:
SELECT * FROM Messages WHERE MessageId NOT IN (SELECT MessageId FROM ReceivedMessages WHERE UserId = #UserId) LIMIT 1
then insert the MessageId and UserId into ReceivedMessages
and do the send logic here
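For instance, the bookkeeping after a successful send could be as simple as this, using the same #placeholder style as the query above:

INSERT INTO ReceivedMessages (UserId, MessageId)
VALUES (#UserId, #MessageId);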
I hope that helps.
There are potentially easier ways to do this, depending on how random you want "random" to be.
Consider that at the beginning of the day you shuffle an array A, [0..39], which describes the order of the messages to be sent to users today.
Also, consider that you have at most 40 cron jobs, which are used to send messages to the users. Given the Nth cron job and ID, the selected user's numeric ID, you can choose M, the index of the message to send:
M = (A[N] + ID) % 40.
This way, a given ID would not receive the same message twice in the same day (because A[N] would be different), and two randomly selected users have a 1/40 chance of receiving the same message. If you want more "randomness" you can potentially use multiple arrays.