use multiple go routines for different operations on mysql


I have a piece of Go code with three functions: insertIntoMysql, updatefieldMysql and deleteFromMysql. I check the operation type and run one of these functions as needed.
I want to convert these ordinary functions into goroutines so that I can handle more operations.
But here is the issue:
If I convert them into goroutines, I will lose the sequence of operations.
For example, insert operations are much more frequent than delete operations, so inserts can queue up in the insert channel while the delete channel is empty. That makes it possible for my code to try to delete a row before it has been inserted (e.g. a row is inserted and then deleted 1 second later).
Any ideas on how to make sure my operations run on MySQL in the same order in which they were received?
Here is the code:
go insertIntoMysql(insertChan, fields, db, config.DestinationTable)
go updatefieldMysql(updateChan, fields, db, config.DestinationTable)
go deleteFromMysql(deleteChan, fields, db, config.DestinationTable)
for !opsDone {
    select {
    case op, open := <-mysqlChan:
        if !open {
            opsDone = true
            break // leaves the select; the loop condition then ends the loop
        }
        switch op.Operation {
        case "i":
            //fmt.Println("got insert operation")
            insertChan <- op
        case "u":
            fmt.Println("got update operation")
            updateChan <- op
        case "d":
            fmt.Println("got delete operation")
            deleteChan <- op
        case "c":
            fmt.Println("got command operation")
            //no logic yet
        }
    }
}
close(done)
close(insertChan)
close(deleteChan)
close(updateChan)
}

Looking at your code and your requirement: operations must stay in the order in which the original Go channel delivered them.
Staying in order while still firing off multiple goroutines to process the data could be done like this:
func processStatements(mysqlChan chan *instructions) {
    var wg sync.WaitGroup
    prevOp := "" // Previous operation
    for op := range mysqlChan { // the loop ends when mysqlChan is closed
        if prevOp != op.Operation {
            // Next operation type, unknown side effects on previous operations,
            // wait for previous to finish before continuing
            wg.Wait()
        }
        switch op.Operation {
        case "i":
            wg.Add(1)
            go processInsert(op, &wg) // pass a pointer; a WaitGroup must not be copied
        case "u":
            wg.Add(1)
            go processUpdate(op, &wg)
        case "d":
            wg.Add(1)
            go processDelete(op, &wg)
        case "c":
            // no logic yet
        }
        prevOp = op.Operation
    }
    // Previous goroutines might still be running, wait till done
    wg.Wait()
    // Close any channels
}
// processUpdate and processDelete follow the same pattern.
func processInsert(op *instructions, wg *sync.WaitGroup) {
    defer wg.Done()
    // Do actual insert etc
}
The main differences from your program are:
Processing is in order, which is safe against your base scenario.
Processing of the next operation type waits until the previous operation type has finished.
Multiple operations of the same type run in parallel (in your code, one goroutine per type runs in parallel, so operations of the same type run one after another). Depending on the data distribution, either approach can be faster or slower. For example, i,i,i,d,u means 3 waits in this code and also 3 waits in your code, but in your code the 3 inserts run sequentially, while here they run in parallel. (Look into insert-or-update for MySQL, since your inserts could now turn out to be updates, depending on your data.)
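The batching idea above can also be sketched outside Go. The following Python version is only an illustration under assumptions: process_in_order, run_demo and the in-memory dict standing in for the MySQL table are hypothetical names, not part of the original code. Consecutive operations of the same type are submitted to a thread pool together, and the loop waits for the whole batch before starting the next operation type.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby
import threading


def process_in_order(ops, handlers, max_workers=4):
    """Run consecutive ops of the same type in parallel, but wait for the
    whole batch to finish before starting the next operation type."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for op_type, batch in groupby(ops, key=lambda op: op[0]):
            handler = handlers.get(op_type)
            if handler is None:
                continue  # e.g. "c": no logic yet
            futures = [pool.submit(handler, key) for _, key in batch]
            for future in futures:
                future.result()  # barrier; also re-raises handler errors


def run_demo():
    table = {}               # hypothetical stand-in for the MySQL table
    lock = threading.Lock()  # protects the dict, like row locks would

    def insert(key):
        with lock:
            table[key] = "new"

    def update(key):
        with lock:
            if key not in table:
                raise KeyError(key)
            table[key] = "updated"

    def delete(key):
        with lock:
            del table[key]  # raises KeyError if the insert has not run yet

    ops = [("i", 1), ("i", 2), ("d", 1), ("c", 0), ("i", 3), ("u", 2)]
    process_in_order(ops, {"i": insert, "u": update, "d": delete})
    return table
```

Calling run_demo() finishes without a KeyError because the delete of row 1 only starts after both preceding inserts have completed.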

How would the one that calls delete know about the row in the first place? From a SQL perspective: if it is not committed, it does not exist. Something calls your code to insert. Your code should return only after the transaction is completed. At which point the caller could start select, update or delete.
For example, in a web server: only the client that called insert initially knows of the existence of the record. Other clients will only learn of its existence after running select. That select will only return the new record if the transaction that inserted it has been committed. Now they might decide to upsert or delete.
Many SQL databases take care of proper row locking, which ensures data integrity during concurrent access. So if you use goroutines to write to the same record, the writes will become sequential (in some order) on the DB side. Concurrent reads could still happen.
Please note that net/http already runs a goroutine for every request, and so do most other server front-ends for Go. If you are writing something different altogether, like a custom TCP listener, you could start your own goroutine for each request.

As far as I understand the problem, the following can be done as a resolution:
Call insert, delete and update as goroutines so that they run concurrently.
Ensure you have row-level locking in your MySQL database (InnoDB provides this).
For your delete and update operations, add an existence check that verifies the row exists before deleting or updating it; this gives you some level of control.
Implement a retry mechanism in your channel (preferably with jitter), so that if a delete operation runs before the corresponding insert, it fails the existence check and is retried later (after 0.5 or 1 second, or some configured delay). By then the insert should have been performed.
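A minimal sketch of the existence check with jittered retries, written in Python for brevity. The name retry_until_exists and its parameters are assumptions made for this illustration; row_exists and action stand for whatever check and delete/update you run against MySQL.

```python
import random
import time


def retry_until_exists(row_exists, action, attempts=5, base_delay=0.5,
                       sleep=time.sleep):
    """Run action() once row_exists() returns True.

    Between failed checks, wait base_delay plus random jitter so that many
    retrying workers do not all hit the database at the same moment.
    Returns True if the action ran, False if every attempt failed.
    """
    for attempt in range(attempts):
        if row_exists():
            action()
            return True
        if attempt < attempts - 1:
            # Jittered delay between 1x and 2x base_delay.
            sleep(base_delay * (1 + random.random()))
    return False
```

The sleep parameter is injectable only so the function can be exercised without real waiting; in production code the default time.sleep would be used.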
Hope this helps.

Related

Apps Script - Achieving Multiple Locks on Different Parts of Script

I am trying to limit concurrent use of apps script for different parts of the same script, but no matter what I do, one lock will lock the whole thing down. Here's my sample script:
function lockTest(event) {
    var r_eventRange = event.range;
    var value = r_eventRange.getValue();
    if (value == "A"){
        var lock1 = LockService.getScriptLock();
        lock1.waitLock(10000);
        //do stuff
        lock1.releaseLock();
    } else if (value == "B") {
        var lock2 = LockService.getScriptLock();
        lock2.waitLock(10000);
        //do different stuff
        lock2.releaseLock();
    }
}
Function 'lockTest' is triggered by an onEdit event. The intent for this script is that there can be two values: "A" and "B". If I get consecutive calls where both have value "A", the calls should be forced to execute one at a time. The same applies if I get consecutive calls where both have value "B". But if I get consecutive calls where the value is "A" in one and "B" in the other, the code should go ahead and run concurrently without waiting.
However, with this script the 'lock service' will not allow concurrent runs regardless of the value. In other words, if the first call has value "A", and the second call arrives before the first is done but has value "B", the lock service forces the second call to wait for the first call to complete, even though the locks are supposed to be different.
Is it possible to control the locking on different parts of the script?
Basically, I need value "A" to trigger lock1, which edits a specific set of shared resources. Value "B" should trigger lock2, which edits a completely different set of shared resources than lock1. That means that lock1 and lock2 should be able to run at the same time, because the resources they edit are completely different. In fact I need them to run at the same time, because lock1 will be triggered a lot more frequently than lock2 and lock1 takes only about 2 seconds to execute. lock2 is triggered much less frequently but takes 30-150 seconds to execute. So I need lock1 to be able to run even if lock2 is already running, because otherwise lock1 will either time out or so many instances of lock1 could be waiting in line while lock2 finishes that the 'lock service' will start throwing errors.
Interestingly enough, if I put a logger before the if() statement, I get logs instantly every time an edit event happens, meaning that the 'lock service' is only locking down the if statement and not the lines before it.
Google's documentation for 'lock service' says
getScriptLock() Gets a lock that prevents any user from concurrently
running a section of code. A code section guarded by a script lock
cannot be executed simultaneously regardless of the identity of the
user.
It seems like from this that we should be able to lock specific sections of code. I just can't figure out how to designate what those individual sections are. I assumed the releaseLock() method would tell it where to end the section. And maybe it does - I'm concerned that the 'lock service' doesn't support designating multiple independent locks, which is what I need here.
According to Oleg's suggestion:
function lockTest(event) {
    var r_eventRange = event.range;
    var value = r_eventRange.getValue();
    if (value == "A"){
        checkFormatting();
    } else {
        doAction();
    }
}
function checkFormatting(){
    var lock1 = LockService.getScriptLock();
    lock1.waitLock(10000);
    //do stuff
    lock1.releaseLock();
}
function doAction(){
    var lock2 = LockService.getScriptLock();
    lock2.waitLock(10000);
    //do different stuff
    lock2.releaseLock();
}
Unfortunately I'm having the same issue with the above code.

I have the same problem as you do. It seems the locks provided by LockService are singletons, and thus your lock.waitLock() calls will always wait regardless of where the lock was grabbed initially.
I recommend you look at this library providing "named locks", which is what you are looking for here. I have not used it yet, but it looks promising.
https://ramblings.mcpher.com/gassnippets2/using-named-locks-with-google-apps-scripts/

LockService provides a single mutex (aka lock) which you can lock/unlock.
Recall that multiple JS runtimes can run in your Google Sheet at a time. The guarantee that LockService provides is that when waitLock() succeeds, you know that you are the only JS runtime that has obtained the lock. All other runtimes are either doing something which doesn't care about the lock, or are waiting for you to release the lock.
const mutex = LockService.getScriptLock();
const TIMEOUT_MS = 10*1000;
function doStuff() {}
function doTheFirstThing() {
    // Process some local data. Don't need the lock yet.
    doStuff();
    // Increment a cell. We definitely need the lock here,
    // because we need to read and write in a row without
    // somebody changing the value under our nose.
    mutex.waitLock(TIMEOUT_MS);
    const range = SpreadsheetApp.getActiveSpreadsheet().getRange('A1');
    range.setValue(range.getValue()+1);
    mutex.releaseLock();
}
function doTheSecondThing() {
    // Process some different local data.
    // We also don't care about the lock yet.
    // Decrement a cell. We definitely need the lock here,
    // because we need to read and write in a row without
    // somebody changing the value under our nose.
    mutex.waitLock(TIMEOUT_MS);
    const range = SpreadsheetApp.getActiveSpreadsheet().getRange('A1');
    range.setValue(range.getValue()-1);
    mutex.releaseLock();
}
So a critical section is not really defined by a code block per se; it is defined by where you lock and unlock a specific mutex. In my example, doTheFirstThing() and doTheSecondThing() share a critical section. What you can do to achieve named locks (and thus distinct critical sections) is maintain a (synchronized) map of (lockName => isLocked) and use the one global lock you get to guard access to that map.
However, be wary of performance here, as locking any named lock will briefly lock the global lock.
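To make the named-lock idea concrete, here is a rough Python sketch. It is not Apps Script code: threading.Lock stands in for the single script lock, an in-memory dict stands in for the shared map (in Apps Script the map would have to live in storage shared between executions), and the NamedLocks class with its method names is hypothetical.

```python
import threading
import time


class NamedLocks:
    """Named locks built on one global lock and a name -> is_locked map."""

    def __init__(self):
        self._global = threading.Lock()  # stand-in for the one script lock
        self._held = {}                  # lock name -> True while held

    def try_acquire(self, name):
        # The global lock is held only while the map is read and updated.
        with self._global:
            if self._held.get(name, False):
                return False
            self._held[name] = True
            return True

    def acquire(self, name, timeout=10.0, poll=0.01):
        # Poll until the named lock is free or the timeout expires.
        deadline = time.monotonic() + timeout
        while not self.try_acquire(name):
            if time.monotonic() >= deadline:
                raise TimeoutError("could not obtain lock " + name)
            time.sleep(poll)

    def release(self, name):
        with self._global:
            self._held[name] = False
```

With this shape, holding lock "A" does not block a caller asking for lock "B"; both callers only contend for the short moment in which the map is updated.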

Is it more efficient to store MySQL results in an object during app start?

When I start my Node app, I store "static" MySQL data that won't be modified, like game quests and game monsters, in global objects. I'm not sure whether it is more efficient to do it this way or to retrieve the data each time I need it. Sample code:
global.monsters = null; // filled once the query below completes
doConn.query('SELECT * FROM monsters', function(error, results) {
    if (error) {
        throw error;
    }
    console.log('[MYSQL] Loaded monsters');
    global.monsters = results;
});

There's an important concept of code efficiency called loop-invariant. It refers to anything that remains the same in every iteration of a loop.
Example:
for (let i = 1; i <= 100; i++) {
    m = 42;
    // other statements...
}
m is assigned a fixed value 100 times. Why do it 100 times? Why not assign it once, either before or after the loop, and save 99% of that work?
m = 42;
for (let i = 1; i <= 100; i++) {
    // other statements...
}
Some kinds of compilers can factor this code out during the compilation step. But maybe not Node.js. Even if it could factor it out, the code would be more clear to the reader if you write it with loop-invariant statements outside the loop. Otherwise the reader will waste some of their attention trying to figure out if there's some reason the statement is inside the loop.
The example of m = 42 is very simple, but there could be more complex code that is still loop-invariant. Like querying data out of a database, the question you ask about.
There are exceptions to every rule. For example, some of the data about monsters could change frequently, even while players are playing, so your game might need to query repeatedly so it makes sure to have the latest data at all times.
But in general, if you can identify queries that are just as correct if you query them once at the start of the program, it's better to do that than to query them repeatedly.

Grails Immediate commit for objects in a transaction

In my project there is a table called process_detail. A row is inserted into this table as soon as a cron process starts, and it is updated when the cron process completes. We are using Grails, which handles transactions at the service-method level: a transaction starts at the start of the method, commits if the method executes successfully, and rolls back on any exception.
What happens is that if the transaction fails, this row is rolled back as well. I don't want that, because this is a kind of log table. I tried creating a nested transaction to save this row and update it at the end, but that fails with a lock acquisition exception.
I am thinking of using MyISAM for this particular table. That way I don't have to worry about transactions, because MyISAM does not support them: every write is committed immediately and no rollback is possible. Here's pseudo code for what I am trying to achieve.
def someProcess(){
//Transaction starts
saveProcessDetail(details); //Commit this immediately, should not rollback if below code fails.
someOtherWork;
updateProcessDetail(details); //Commit this immediately, should
//Transaction Ends
}
Pseudo code for saving and updating the process detail:
def saveProcessDetail(processName, processStatus){
    ProcessDetail pd = new ProcessDetail(processName, processStatus);
    pd.save();
}
def updateProcessDetail(processDetail, processStatus){
    processDetail.processStatus = processStatus;
    processDetail.save();
}
Please advise if there is a better way of doing this in InnoDB. The answer can be at the MySQL level; I can find the Grails solution myself. Let me know if any other info is required.

Make someProcess @NotTransactional, then manage the transactional nature yourself. Write the initial saveProcessDetail with flush:true, then make the remainder of the processing transactional, perhaps using withTransaction.
Or
@NotTransactional
def someProcess() {
    saveProcessDetail(details) // I'd still use a flush:true
    transactionalProcessWork()
}
@Transactional
def transactionalProcessWork() {
    someOtherWork()
    updateProcessDetail(details)
}

How to test MySQL transactions?

I have a question about testing the queries in a transaction. I've been using MySQL transactions for quite some time now, and every time I do this, I use something like:
$doCommit = true;
$error = "";
mysql_query("BEGIN");
/* repeat this part with the different queries in the transaction
this often involves updating of and inserting in multiple tables */
$query = "SELECT, UPDATE, INSERT, etc";
$result = mysql_query($query);
if(!$result){
$error .= mysql_error() . " in " . $query . "<BR>";
$doCommit = false;
}
/* end of repeating part */
if($doCommit){
mysql_query("COMMIT");
} else {
echo $error;
mysql_query("ROLLBACK");
}
Now, it often happens that I want to test my transaction, so I change mysql_query("COMMIT"); to mysql_query("ROLLBACK");, but I can imagine this is not a very good way to test this kind of thing. It's usually not really feasible to copy every table to a temp_table, update and insert into those tables, and delete them afterwards (for instance because the tables may be very large). Of course, when the code goes into production, proper error handling (instead of just printing the error) is put in place.
What's the best way to do stuff like this?

First of all, there is a bug in your implementation. If a query errors out, MySQL may roll back the current transaction implicitly (a deadlock, for example, rolls back the whole transaction). If that happens and you continue to execute queries, they will no longer be within the transaction (they will be committed to the DB), and your final ROLLBACK will silently do nothing. From the MySQL docs:
Rolling back can be a slow operation that may occur implicitly without the user
having explicitly asked for it (for example, when an error occurs).
Use the explicit ROLLBACK command only when your application determines that it needs to roll back (for reasons other than a query error). For example, if you're deducting funds from an account, you'd explicitly roll back if you found out the user didn't have enough funds to complete the exchange...
As for testing the transactions, I do copy the database. I create a new database and install a set of "dummy data". Then I run all the tests using an automated tool. The tool actually commits the transactions and forces rollbacks, and checks that the expected database state is maintained throughout the tests. Since it's harder to programmatically know the end state of a transaction if its input is unknown, testing against live (or even copied-from-live) data is not going to be easy. You can do it (and should), but don't depend on those results to determine whether your system is working. Use those results to build new test cases for the automated tester...
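A rough sketch of that kind of automated test, written in Python against an in-memory SQLite database as a stand-in for a dedicated MySQL test database. The accounts table, the dummy balances and the function names (make_test_db, transfer, balances) are assumptions for the illustration. The test exercises both the commit path and a forced rollback and checks the resulting state each time.

```python
import sqlite3


def make_test_db():
    """Create a throwaway database filled with known dummy data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                     [(1, 100), (2, 50)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move funds in one transaction; roll back if funds are insufficient."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src))
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?",
                (src,)).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst))
        return True
    except ValueError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM accounts"))
```

Because the starting data is known, the expected end state of each case can be written down in advance: a successful transfer changes both rows, and a failed one leaves the table exactly as it was.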

Maybe you could refactor your first example and use some DB access wrapper class?
In that wrapper class you can have a variable $normalCommit = true;
and a method SetCommitMode() which sets that $normalCommit variable.
And you have a method Commit() which commits if($normalCommit == true)
Or even have a variable $failTransaction which calls mysql_query("ROLLBACK"); if you wish (so you could pass/fail many sequential tests).
Then when you run the test, you can set somewhere in the test code file:
$myDBClass->SetCommitMode(false);
or
$myDBClass->RollBackNextOperation(true);
before the operation which you wish to fail, and it will just fail. In such a way the code which you are testing will not contain those fail/commit checks, only the DB class will contain them.
And normally ONLY the test code (especially if you do unit testing) should call the SetCommitMode and RollBackNextOperation methods, so that you do not accidentally leave those calls in the production code.
Or you could pass some crazy data to your method (if you are testing a method), like negative variables to save in UNSIGNED fields, and then your transaction should fail 100% if your code does not do commit after such an SQL error (but it should not).

Generally I use something like (I use pdo for my example):
$db->beginTransaction();
try {
$db->exec('INSERT/DELETE/UPDATE');
$db->commit();
}
catch (PDOException $e) {
$db->rollBack();
// rethrow the error or
}
Or if you have your own exception handler, use a special clause for your PDOExceptions, where to rollback the execution. Example:
function my_exception_handler($exception) {
if($exception instanceof PDOException) {
// assuming you have a registry class
Registry::get('database')->rollBack();
}
}

What is an idempotent operation?


In computing, an idempotent operation is one that has no additional effect if it is called more than once with the same input parameters. For example, removing an item from a set can be considered an idempotent operation on the set.
In mathematics, an idempotent operation is one where f(f(x)) = f(x). For example, the abs() function is idempotent because abs(abs(x)) = abs(x) for all x.
These slightly different definitions can be reconciled by considering that x in the mathematical definition represents the state of an object, and f is an operation that may mutate that object. For example, consider the Python set and its discard method. The discard method removes an element from a set, and does nothing if the element does not exist. So:
my_set.discard(x)
has exactly the same effect as doing the same operation twice:
my_set.discard(x)
my_set.discard(x)
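The two definitions can be checked side by side with a small helper. This is only a sketch: is_idempotent and discard_3 are names invented for the example, and discard_3 copies the set so that the state is passed in and returned explicitly, as in the mathematical definition.

```python
def is_idempotent(f, x):
    """Check f(f(x)) == f(x) for one particular input x."""
    return f(f(x)) == f(x)


def discard_3(state):
    """Return a copy of the set with 3 removed (no error if 3 is absent)."""
    new_state = set(state)
    new_state.discard(3)
    return new_state


def increment(n):
    return n + 1
```

Here is_idempotent(discard_3, {1, 2, 3}) and is_idempotent(abs, -5) both hold, while is_idempotent(increment, 0) does not. A check on single inputs cannot prove idempotence in general; it can only find counterexamples.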
Idempotent operations are often used in the design of network protocols, where a request to perform an operation is guaranteed to happen at least once, but might also happen more than once. If the operation is idempotent, then there is no harm in performing the operation two or more times.
See the Wikipedia article on idempotence for more information.

An idempotent operation can be repeated an arbitrary number of times and the result will be the same as if it had been done only once. In arithmetic, adding zero to a number is idempotent.
Idempotence is talked about a lot in the context of "RESTful" web services. REST seeks to maximally leverage HTTP to give programs access to web content, and is usually set in contrast to SOAP-based web services, which just tunnel remote procedure call style services inside HTTP requests and responses.
REST organizes a web application into "resources" (like a Twitter user, or a Flickr image) and then uses the HTTP verbs of POST, PUT, GET, and DELETE to create, update, read, and delete those resources.
Idempotence plays an important role in REST. If you GET a representation of a REST resource (eg, GET a jpeg image from Flickr), and the operation fails, you can just repeat the GET again and again until the operation succeeds. To the web service, it doesn't matter how many times the image is gotten. Likewise, if you use a RESTful web service to update your Twitter account information, you can PUT the new information as many times as it takes in order to get confirmation from the web service. PUT-ing it a thousand times is the same as PUT-ing it once. Similarly DELETE-ing a REST resource a thousand times is the same as deleting it once. Idempotence thus makes it a lot easier to construct a web service that's resilient to communication errors.
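The difference between the verbs can be sketched with a toy in-memory resource store. This is not an HTTP server; the ResourceStore class and its method names are hypothetical and only mimic the semantics described above: put and delete address a known key, while post creates a new resource each time.

```python
import itertools


class ResourceStore:
    """Toy in-memory store mimicking PUT, DELETE and POST semantics."""

    def __init__(self):
        self.items = {}
        self._ids = itertools.count(1)

    def put(self, key, value):
        # Replace the resource at a known key: repeating changes nothing more.
        self.items[key] = value

    def delete(self, key):
        # Deleting an already deleted resource leaves the store unchanged.
        self.items.pop(key, None)

    def post(self, value):
        # Each call creates a new resource under a fresh key: not idempotent.
        key = "item-%d" % next(self._ids)
        self.items[key] = value
        return key
```

Retrying put or delete after a lost response is harmless in this model, whereas retrying post leaves a duplicate behind.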
Further reading: RESTful Web Services, by Richardson and Ruby (idempotence is discussed on page 103-104), and Roy Fielding's PhD dissertation on REST. Fielding was one of the authors of HTTP 1.1, RFC-2616, which talks about idempotence in section 9.1.2.

No matter how many times you call the operation, the result will be the same.

Idempotence means that applying an operation once or applying it multiple times has the same effect.
Examples:
Multiplication by zero. No matter how many times you do it, the result is still zero.
Setting a boolean flag. No matter how many times you do it, the flag stays set.
Deleting a row from a database with a given ID. If you try it again, the row is still gone.
For pure functions (functions with no side effects) then idempotency implies that f(x) = f(f(x)) = f(f(f(x))) = f(f(f(f(x)))) = ...... for all values of x
For functions with side effects, idempotency furthermore implies that no additional side effects will be caused after the first application. You can consider the state of the world to be an additional "hidden" parameter to the function if you like.
Note that in a world where you have concurrent actions going on, you may find that operations you thought were idempotent cease to be so (for example, another thread could unset the value of the boolean flag in the example above). Basically whenever you have concurrency and mutable state, you need to think much more carefully about idempotency.
Idempotency is often a useful property in building robust systems. For example, if there is a risk that you may receive a duplicate message from a third party, it is helpful to have the message handler act as an idempotent operation so that the message effect only happens once.

A good example of understanding an idempotent operation might be locking a car with remote key.
log(Car.state) // unlocked
Remote.lock();
log(Car.state) // locked
Remote.lock();
Remote.lock();
Remote.lock();
log(Car.state) // locked
lock is an idempotent operation. Even if there are some side effects each time you run lock, like blinking, the car is still in the same locked state no matter how many times you run the lock operation.

An idempotent operation leaves the system in the same state even if you call it more than once, provided you pass in the same parameters.

An idempotent operation is an operation, action, or request that can be applied multiple times without changing the result, i.e. the state of the system, beyond the initial application.
EXAMPLES (WEB APP CONTEXT):
IDEMPOTENT:
Making multiple identical requests has the same effect as making a single request. A message in an email messaging system is opened and marked as "opened" in the database. One can open the message many times but this repeated action will only ever result in that message being in the "opened" state. This is an idempotent operation. The first time one PUTs an update to a resource using information that does not match the resource (the state of the system), the state of the system will change as the resource is updated. If one PUTs the same update to a resource repeatedly then the information in the update will match the information already in the system upon every PUT, and no change to the state of the system will occur. Repeated PUTs with the same information are idempotent: the first PUT may change the state of the system, subsequent PUTs should not.
NON-IDEMPOTENT:
If an operation always causes a change in state, like POSTing the same message to a user over and over, resulting in a new message sent and stored in the database every time, we say that the operation is NON-IDEMPOTENT.
NULLIPOTENT:
If an operation has no side effects, like purely displaying information on a web page without any change in a database (in other words you are only reading the database), we say the operation is NULLIPOTENT. All GETs should be nullipotent.
When talking about the state of the system we are obviously ignoring hopefully harmless and inevitable effects like logging and diagnostics.

Just wanted to throw out a real use case that demonstrates idempotence. In JavaScript, say you are defining a bunch of model classes (as in MVC model). The way this is often implemented is functionally equivalent to something like this (basic example):
function model(name) {
function Model() {
this.name = name;
}
return Model;
}
You could then define new classes like this:
var User = model('user');
var Article = model('article');
But if you were to try to get the User class via model('user'), from somewhere else in the code, it would fail:
var User = model('user');
// ... then somewhere else in the code (in a different scope)
var User = model('user');
Those two User constructors would be different. That is,
model('user') !== model('user');
To make it idempotent, you would just add some sort of caching mechanism, like this:
var collection = {};
function model(name) {
if (collection[name])
return collection[name];
function Model() {
this.name = name;
}
collection[name] = Model;
return Model;
}
By adding caching, every time you did model('user') it will be the same object, and so it's idempotent. So:
model('user') === model('user');

These are quite detailed and technical answers, so I'm just adding a simple definition.
Idempotent = Re-runnable
For example,
A Create operation by itself is not guaranteed to run without error if executed more than once.
But an operation such as CreateOrUpdate is re-runnable (idempotent).
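A small sketch of the Create versus CreateOrUpdate contrast, using Python's sqlite3 module as a stand-in database. The users table and the function names are assumptions for the illustration, and INSERT OR REPLACE is SQLite syntax (MySQL has its own upsert statements).

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    return conn


def create(conn, user_id, name):
    # Plain create: a second run with the same id violates the primary key.
    conn.execute("INSERT INTO users (id, name) VALUES (?, ?)",
                 (user_id, name))


def create_or_update(conn, user_id, name):
    # Re-runnable: inserts the row, or replaces it if the id already exists.
    conn.execute("INSERT OR REPLACE INTO users (id, name) VALUES (?, ?)",
                 (user_id, name))


def rows(conn):
    return conn.execute("SELECT id, name FROM users ORDER BY id").fetchall()
```

Running create twice with the same id raises sqlite3.IntegrityError, while running create_or_update twice leaves exactly one row.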

Idempotent operations: operations that have no additional effect if executed multiple times.
Example: an operation that retrieves values from a data resource and, say, prints them
Non-idempotent operations: operations whose effect accumulates, and could cause harm, if executed multiple times (because each execution changes some values or state)
Example: an operation that withdraws from a bank account

It is any operation for which every repeated application produces an output matching the result of the first application. For instance, the absolute value of -1 is 1. The absolute value of the absolute value of -1 is 1. The absolute value of the absolute value of the absolute value of -1 is 1. And so on. See also: When would be a really silly time to use recursion?

An idempotent operation over a set leaves its members unchanged when applied one or more times.
It can be a unary operation like absolute(x) where x belongs to a set of positive integers. Here absolute(absolute(x)) = x.
It can be a binary operation like union of a set with itself would always return the same set.
cheers

In short, an idempotent operation is one that produces the same result no matter how many times you perform it.
For example, according to the definition of the spec of HTTP, GET, HEAD, PUT, and DELETE are idempotent operations; however POST and PATCH are not. That's why sometimes POST is replaced by PUT.

An operation is said to be idempotent if executing it multiple times is equivalent to executing it once.
For eg: setting volume to 20.
No matter how many times the volume of TV is set to 20, end result will be that volume is 20. Even if a process executes the operation 50/100 times or more, at the end of the process the volume will be 20.
Counter example: increasing the volume by 1. If a process executes this operation 50 times, at the end volume will be initial Volume + 50 and if a process executes the operation 100 times, at the end volume will be initial Volume + 100. As you can clearly see that the end result varies based upon how many times the operation was executed. Hence, we can conclude that this operation is NOT idempotent.
If you think in terms of programming, let's say that I have an operation in which a function f takes foo as the input and the output of f is set to foo back. If at the end of the process (that executes this operation 50/100 times or more), my foo variable holds the value that it did when the operation was executed only ONCE, then the operation is idempotent, otherwise NOT.
foo = <some random value here, let's say -2>
{ foo = f( foo ) }   curly brackets outline the operation
if f returns the square of the input then the operation is NOT idempotent, because foo at the end will be (-2) raised to the power 2^n, where n is the number of times the operation is executed (4 after one run, 16 after two, 256 after three).
if f returns the absolute of the input then the operation is idempotent because no matter how many multiple times the operation is executed foo will be abs(-2).
Here, end result is defined as the final value of variable foo.
In mathematical sense, idempotence has a slightly different meaning of:
f(f(....f(x))) = f(x)
here output of f(x) is passed as input to f again which doesn't need to be the case always with programming.
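The foo = f(foo) experiment can be run directly. In this sketch, repeat and the small example functions are names made up for the illustration; each function takes the current value and returns the new one.

```python
def repeat(f, x, times):
    """Apply foo = f(foo) the given number of times and return foo."""
    for _ in range(times):
        x = f(x)
    return x


def set_volume_to_20(volume):
    return 20          # idempotent: the result ignores how often it runs


def increase_volume(volume):
    return volume + 1  # not idempotent: every run adds one more


def square(x):
    return x * x       # not idempotent: -2 -> 4 -> 16 -> 256 ...
```

Comparing repeat(f, x, 1) with repeat(f, x, 50) shows the difference: the results agree for set_volume_to_20 and abs, and differ for increase_volume and square.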

my 5c:
In integration and networking, idempotency is very important.
Several examples from real life:
Imagine we deliver data to the target system as a sequence of messages.
1. What happens if the sequence gets mixed up in the channel? (As network packets always do :) ). If the target system is idempotent, the result will not be different. If the target system depends on the right order of the sequence, we have to implement a resequencer on the target side to restore the right order.
2. What happens if there are duplicate messages? If the channel or the target system does not acknowledge in time, the source system (or the channel itself) usually sends another copy of the message. As a result, we can have duplicate messages on the target system side.
If the target system is idempotent, it takes care of this and the result will not be different.
If the target system is not idempotent, we have to implement a deduplicator on the target system side of the channel.
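A combined resequencer and deduplicator can be sketched in a few lines of Python. The assumption here, not stated in the text above, is that every message carries a sequence number starting at 1; the function name resequence is hypothetical.

```python
def resequence(messages, first_seq=1):
    """Yield payloads in sequence order and drop duplicates.

    messages is an iterable of (sequence_number, payload) pairs that may
    arrive out of order and may contain repeated sequence numbers.
    """
    pending = {}          # messages that arrived early, keyed by sequence
    next_seq = first_seq  # next sequence number to deliver
    for seq, payload in messages:
        if seq < next_seq or seq in pending:
            continue      # duplicate: already delivered or already buffered
        pending[seq] = payload
        while next_seq in pending:
            yield pending.pop(next_seq)
            next_seq += 1
```

For example, list(resequence([(2, "b"), (1, "a"), (2, "b"), (3, "c")])) delivers "a", "b", "c" once each. A message that never arrives would stall delivery, so a real implementation would also need a timeout or gap-handling policy.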

For a workflow manager (such as Apache Airflow), if an idempotent operation fails in your pipeline, the system can retry the task automatically without affecting the system. Even if the logs change, that is good, because you can see the incident.
The most important thing in this case is that your system can retry the failed task without messing up the pipeline (for example, a non-idempotent task would append the same data to a table on each retry).

Let's say the client makes a request to the "InstanceA" service, which processes the request, passes it to the DB, and shuts down before sending the response. Since the client does not see that the request was processed, it retries the same request. The load balancer forwards the request to another service instance, "InstanceB", which makes the same change to the same DB item.
We should use idempotency tokens. When a client sends a request to a service, the request should carry some kind of request ID that can be saved in the DB to show that the request has already been executed. If the client retries the request, "InstanceB" checks the request ID. Since that particular request has already been executed, it makes no change to the DB item. Requests of this kind are called idempotent requests: we can send the same request multiple times, but only the first one changes anything.
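The request-ID check can be sketched as follows. Everything here is illustrative: the ServiceInstance class, the add_to_balance operation and the plain dict standing in for the shared database are assumptions, and a real system would perform the check and the update inside one database transaction.

```python
class ServiceInstance:
    """One of several service instances sharing the same database."""

    def __init__(self, db):
        self.db = db  # shared dict standing in for the real DB

    def add_to_balance(self, request_id, account, amount):
        processed = self.db["processed"]
        if request_id in processed:
            # Already executed: return the stored result, change nothing.
            return processed[request_id]
        balances = self.db["balances"]
        balances[account] = balances.get(account, 0) + amount
        # Record the request ID together with its result.
        processed[request_id] = balances[account]
        return balances[account]
```

If "InstanceA" handles request "req-1" and crashes before replying, the client's retry reaches "InstanceB", which finds "req-1" in the processed table and returns the stored result without touching the balance again.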
