I need to figure out how to manage retries in NServiceBus.
If there is any exception in my flow, it should retry 10 times, once every 10 seconds. But when I search NServiceBus' website (http://docs.particular.net/nservicebus/errors/automatic-retries), there are two different retry mechanisms: First Level Retry (FLR) and Second Level Retry (SLR).
FLR is for transient errors. When you get an exception, it retries immediately, up to your MaxRetries parameter. This parameter should be 1 for me.
SLR is for errors that persist after FLR, where a small delay is needed between retries. A config parameter called "TimeIncrease" defines the delay between tries. However, NServiceBus increases that delay with every retry: when you set this parameter to 10 seconds, it retries after 10 seconds, 30 seconds, 60 seconds and so on.
What do you suggest so that my original requirement (a retry every 10 seconds) is met, with or without these mechanisms?
I found my answer:
Per the reply from Particular Software's community (John Simon): you need to apply a custom retry policy; have a look at http://docs.particular.net/nservicebus/errors/automatic-retries#second-level-retries-custom-retry-policy-simple-policy for an example.
I want my consumers to process large batches, so I aim to have the consumer listener wake up on, say, 1800mb of data or every 5min, whichever comes first.
Mine is a Kafka + Spring Boot application; the topic has 28 partitions, and this is the configuration I explicitly change:
Parameter                  | Value I set | Default Value | Why I set it this way
fetch.max.bytes            | 1801mb      | 50mb          | fetch.min.bytes + 1mb
fetch.min.bytes            | 1800mb      | 1b            | desired batch size
fetch.max.wait.ms          | 5min        | 500ms         | desired cadence
max.partition.fetch.bytes  | 1801mb      | 1mb           | unbalanced partitions
request.timeout.ms         | 5min+1sec   | 30sec         | fetch.max.wait.ms + 1sec
max.poll.records           | 10000       | 500           | 1500 found too low
max.poll.interval.ms       | 5min+1sec   | 5min          | fetch.max.wait.ms + 1sec
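(For concreteness, here is the same set of knobs written out as a bare kafka-python consumer. This is only a sketch to show the byte/millisecond values; my real application sets them through Spring Boot's consumer properties, and the topic name and broker address below are placeholders.)

from kafka import KafkaConsumer

MB = 1024 * 1024

# Same fetch/poll settings as the table above, spelled out in bytes and ms.
consumer = KafkaConsumer(
    "my-topic",                                 # placeholder topic name
    bootstrap_servers="localhost:9092",         # placeholder broker address
    fetch_min_bytes=1800 * MB,                  # desired batch size
    fetch_max_bytes=1801 * MB,                  # fetch.min.bytes + 1mb
    fetch_max_wait_ms=5 * 60 * 1000,            # desired cadence: 5 min
    max_partition_fetch_bytes=1801 * MB,        # unbalanced partitions
    request_timeout_ms=5 * 60 * 1000 + 1000,    # fetch.max.wait.ms + 1 sec
    max_poll_records=10000,
    max_poll_interval_ms=5 * 60 * 1000 + 1000,  # fetch.max.wait.ms + 1 sec
)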
Nevertheless, when I produce ~2gb of data to the topic, I see the consumer listener (a Batch Listener) being called many times per second, way more than the desired rate.
I logged the serialized-size of the ConsumerRecords<?,?> argument, and found that it is never more than 55mb.
This hints that I was not able to set fetch.max.bytes above the default 50mb.
Any idea how I can troubleshoot this?
Edit:
I found this question: Kafka MSK - a configuration of high fetch.max.wait.ms and fetch.min.bytes is behaving unexpectedly
Is it really impossible as stated?
Finally found the cause.
There is a broker fetch.max.bytes setting, and it defaults to 55mb. I only changed the consumer preferences, unaware of the broker-side limit.
See also the Kafka KIP and the actual commit.
(cross-posted: https://ai.stackexchange.com/questions/15693/state-reward-per-step-in-a-multiagnet-environment)
In a single agent environment, the agent takes an action, then observes the next state and reward:
for ep in range(num_episodes):
    action = dqn.select_action(state)
    next_state, reward = env.step(action)
Implicitly, the logic for moving the simulation (env) forward is embedded inside the env.step() function.
Now in the multiagent scenario, agent 1 ($a_1$) has to make a decision at time $t_{1a}$, which will finish at time $t_{2a}$, and agent 2 ($a_2$) makes a decision at time $t_{1b} < t_{1a}$ which is finished at $t_{2b} > t_{2a}$.
If both of their actions started and finished at the same time, it could easily be implemented as:
for ep in range(num_episodes):
    action1, action2 = dqn.select_action([state1, state2])
    next_state_1, reward_1, next_state_2, reward_2 = env.step([action1, action2])
because the env can execute both in parallel, wait till they are done, and then return the next states and rewards. But in the scenario that I described previously, it is not clear (at least to me) how to implement this. Here, we need to explicitly track time and check at every timepoint whether an agent needs to make a decision. Just to be concrete:
for ep in range(num_episodes):
    for t in range(total_time):
        action1 = dqn.select_action(state1)
        env.step(action1)  # this step might take 5t to complete,
                           # so step() won't return the reward until 5t later.
        # In the meantime, agent 2 arrives and has to make a decision;
        # its reward and next state won't be observed until 10t later.
To summarize: how would one implement a multi-agent environment with asynchronous actions/rewards per agent?
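Here is roughly the bookkeeping I have in mind, as a minimal self-contained sketch: each agent keeps a pending action with its own finish time, and the policy is only queried (and the reward handed out) when that action completes. All names, the durations, and the random rewards/state updates below are made up purely for illustration:

import random

# agent 1's actions take 5 ticks, agent 2's take 10 ticks (made-up durations)
ACTION_DURATION = {1: 5, 2: 10}

def select_action(agent, state):
    return random.choice([0, 1])          # stand-in for dqn.select_action

state = {1: 0, 2: 0}
pending = {}                              # agent -> (action, finish_time)
total_time = 100

for t in range(total_time):
    for agent in (1, 2):
        if agent in pending:
            action, finish = pending[agent]
            if t >= finish:               # the action is done: observe its outcome
                reward = random.random()  # the env would compute this
                state[agent] += 1         # the env would return next_state
                # store (s, a, r, s') in this agent's replay buffer here
                del pending[agent]
        if agent not in pending:          # an idle agent has to decide now
            action = select_action(agent, state[agent])
            pending[agent] = (action, t + ACTION_DURATION[agent])
    # advance the simulation itself by one tick here (the work env.step() used to do)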
I would love to be able to wait for a random amount of time (say a number between 5 and 12 seconds, chosen at random each time) before executing my next action in Puppeteer, in order to make the behaviour seem more like a real-world user's.
I'm aware of how to do it in plain JavaScript (as detailed in the Mozilla docs here), but can't seem to get it working in Puppeteer using the waitFor call (which I assume is what I'm supposed to use?).
Any help would be greatly appreciated! :)
You can use vanilla JS to wait a random 5-12 seconds between actions:
await page.waitFor((Math.floor(Math.random() * 8) + 5) * 1000)
Where:
5 is the lowest number of seconds
8 is the number of possible values (12 - 5 + 1), so the result is an integer from 5 to 12
1000 converts seconds to milliseconds
(PS: However, if your question is about waiting a random 5-12 seconds before every action, then you would need a wrapper class, which is a different issue unless you update your question.)
I see this code segment in proposer_task (xcom_base.c):
if(threephase || ep->p->force_delivery){
push_msg_3p(ep->site, ep->p, ep->prepare_msg, ep->msgno, normal);
}else{
push_msg_2p(ep->site, ep->p);
}
Here threephase is int const threephase = 0 and force_delivery == 0.
push_msg_3p is normal Paxos, including the prepare, accept and learn phases,
but push_msg_2p skips the prepare phase and directly sends the accept request.
I want to know why. Thanks a lot.
If you look at the paper Paxos Made Simple, page 10, paragraph 3 says:
A newly chosen leader executes phase 1 for infinitely many instances
of the consensus algorithm [...]
Then paragraph 4:
Since failure of the leader and election of a new one should be rare
events, the effective cost of executing a state machine command—that
is, of achieving consensus on the command/value—is the cost of
executing only phase 2 of the consensus algorithm. It can be shown
that phase 2 of the Paxos consensus algorithm has the minimum possible
cost of any algorithm for reaching agreement in the presence of faults.
Hence, the Paxos algorithm is essentially optimal.
This is saying that a leader only issues a prepare during a leader failover. After that it streams accept messages. It then has "optimal messaging" in that the leader only needs one round trip to know a value is chosen (the accept message and its acknowledgment).
In a three node cluster, a leader self-accepts instantaneously, then only needs one accept acknowledgment from a second node to have a majority. It then knows the value is chosen without having to await the response from the 3rd node (which could be down). That is as efficient as you can get. The value is known to be accepted at a second node with strong consistency.
Given that this is how Paxos works to get maximum efficiency, we should expect that MySQL xcom has a mode that skips the prepare message phase in steady state.
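To make that steady state concrete, here is a toy multi-Paxos sketch in Python (an illustration of the idea only, nothing to do with xcom's actual code): the leader runs the prepare round once for its ballot, then streams accept messages for successive log slots, and each value is chosen as soon as a majority acknowledges the accept.

# Toy multi-Paxos steady state (illustration only).
class Acceptor:
    def __init__(self):
        self.promised = -1                # highest ballot promised so far
        self.accepted = {}                # slot -> (ballot, value)

    def prepare(self, ballot):            # phase 1
        if ballot > self.promised:
            self.promised = ballot
            return True
        return False

    def accept(self, ballot, slot, value):  # phase 2
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted[slot] = (ballot, value)
            return True
        return False

acceptors = [Acceptor() for _ in range(3)]
ballot = 1

# Phase 1 happens once, when this node becomes leader.
assert sum(a.prepare(ballot) for a in acceptors) >= 2

# Steady state: every new command needs only phase 2, i.e. one round trip to a majority.
for slot, value in enumerate(["cmd-a", "cmd-b", "cmd-c"]):
    acks = sum(a.accept(ballot, slot, value) for a in acceptors)
    if acks >= 2:                         # majority of 3: the value is chosen
        print("slot", slot, "chosen:", value, "with", acks, "acks, no prepare needed")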
You can read more about the Paxos Made Simple techniques on my blog here.
You might also be interested in the latest developments of Paxos, where you don't need a majority response to accept messages in the cluster, using FPaxos and tricks like the even-nodes optimization.
I've got this kind of problem with Proton CEP: I currently have a "Sequence" EPA whose input is two events. But these events have different granularity: let's say I have A and B events; I receive N "A" events and M "B" events, where M << N.
So I'd like a rule like "if an event of type A is not consumed within X seconds, remove it"; otherwise I end up with a long queue of A events. I only need the rule to be evaluated for the temporally closest events.
In practice, I've got a fake room-temperature sensor that sends temperature updates every 5 seconds, and another program that checks the external weather and sends it every minute.
Any idea how to solve this situation?
Thank you very much!
I guess that by "consume" you mean arrival. Do you want to evaluate the time the A event took to get to the Proton processor, or the time between A events? Do you want to ensure that the A events are indeed continuous at a fixed rate? "Removing" an event means ignoring it, since events are not kept anywhere, just processed. In the end, what is it that you want to detect here? For example, the trend of room temperature compared to the outside temperature, and then emit output events accordingly?
Thanks.
All the relevant event instances are kept within the local state of the corresponding EPA.
For each EPA operand you have policies which dictate how the state is gathered and how the matching set for event derivation is built.
For example, the instance selection policy, which is defined per operand and has the values "Each", "First" and "Last", tells you whether all A instances are examined for a match with a B instance, or only the first (in order of arrival), or only the last.
The consumption policy says what to do with the operand state once a sequence is detected: should the instances of, say, A which participated in the sequence be removed from the EPA's state (the "consume" value of the policy), or should they remain?
Playing with combinations of those policies should give you the behaviour you require.
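As a generic illustration of how those two policies interact (plain Python, not Proton's API; the function and values are invented), here is how an incoming B instance would be matched against the buffered A instances under the different selection values, and how the "consume" policy empties the matched instances out of the operand state:

def match_sequence(a_buffer, b_event, selection="Last", consume=True):
    # a_buffer holds the A instances gathered so far, in arrival order
    if not a_buffer:
        return [], a_buffer
    if selection == "Each":
        matched = [(a, b_event) for a in a_buffer]
    elif selection == "First":
        matched = [(a_buffer[0], b_event)]
    else:                                  # "Last"
        matched = [(a_buffer[-1], b_event)]
    if consume:                            # drop the A instances that took part
        used = {id(a) for a, _ in matched}
        a_buffer = [a for a in a_buffer if id(a) not in used]
    return matched, a_buffer

# Example: three temperature updates buffered, then one weather event arrives.
temps = [20.1, 20.2, 20.4]
pairs, temps = match_sequence(temps, "sunny")
print(pairs, temps)                        # [(20.4, 'sunny')] [20.1, 20.2]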