Distributing push notifications on multiple workers - MySQL

Say you have millions of Android GCM device keys and you want to send push notifications to them from a management script. The script takes a long time to finish because it processes the keys in the DB as a single queue.
Question: How do you make this faster? How do you send these notifications in parallel? How do you get to near-real-time push notifications?
One solution would be to instantiate X Celery workers, where each worker is responsible for an offset Y at which it starts fetching from MySQL.
Example:
Worker 1: starts at offset 0,
Worker 2: starts at offset 10,000,
Worker 3: starts at offset 20,000,
Worker 4: starts at offset 30,000,
Worker 5: starts at offset 40,000,
Worker 1: restarts at offset 50,000,
Worker 2: restarts at offset 60,000,
... etc
Is this a viable solution?

Create a list of tasks as a Celery group. Also, because you have to retrieve all records from the Android model, it's good to create a separate Celery task which will do it in the background:
from celery import shared_task, group

from myapp.models import Android  # adjust the import to wherever your model lives

@shared_task
def push_notification(offset, limit):
    # Process one slice of devices; replace `pass` with the actual GCM send.
    for android in Android.objects.all()[offset:offset + limit]:
        pass

@shared_task
def push_notification_to_all():
    count = Android.objects.count()
    limit = 100
    # Fan out one sub-task per slice so the workers process the slices in parallel.
    group(push_notification.s(offset, limit) for offset in range(0, count, limit))()

push_notification_to_all.delay()
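With millions of rows, slicing the queryset with OFFSET/LIMIT makes MySQL scan and discard all the preceding rows for every chunk, so later tasks get progressively slower. A common alternative is to chunk by primary-key range instead. A rough sketch, reusing the imports above and assuming Android has an auto-increment id and a gcm_key field, with send_gcm as a hypothetical helper:
@shared_task
def push_notification_range(start_id, end_id):
    # Each task walks one contiguous id range; the primary-key index keeps this
    # cheap no matter how deep into the table the range is.
    for android in Android.objects.filter(id__gte=start_id, id__lt=end_id):
        send_gcm(android.gcm_key)  # hypothetical helper that calls GCM/FCM

@shared_task
def push_notification_to_all_by_id():
    last_id = Android.objects.order_by('-id').values_list('id', flat=True).first() or 0
    chunk = 10000
    group(push_notification_range.s(start, start + chunk)
          for start in range(0, last_id + 1, chunk))()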

Related

TransactionError when using Brownie on Optimism - Tx dropped without known replacement

I have a Python script using Brownie that occasionally triggers a swap on Uniswap by sending a transaction to Optimism Network.
It worked well for a few days (did multiple transactions successfully), but now each time it triggers a transaction, I get an error message:
TransactionError: Tx dropped without known replacement
However, the transaction goes through and gets validated, but the script stops.
swap_router = interface.ISwapRouter(router_address)
params = (
    weth_address,          # tokenIn
    dai_address,           # tokenOut
    3000,                  # pool fee (0.3%)
    account.address,       # recipient
    time.time() + 86400,   # deadline
    amount * 10 ** 18,     # amountIn
    0,                     # amountOutMinimum
    0,                     # sqrtPriceLimitX96
)
amountOut = swap_router.exactInputSingle(params, {"from": account})
There is a possibility that one of your methods fetches data off-chain and is being called prematurely, before the confirmation is received.
I had the same problem, and I managed to sort it out by adding
time.sleep(60)
at the end of the function that fetches data off-chain.
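Building on that sleep workaround, here is a minimal sketch of one way to keep the script alive when Brownie loses track of the pending transaction. The swap_with_retry wrapper and the 60-second wait are illustrative, and it assumes TransactionError is importable from brownie.exceptions:
import time

from brownie.exceptions import TransactionError  # raised as "Tx dropped without known replacement"

def swap_with_retry(swap_router, params, account, wait_s=60):
    try:
        tx = swap_router.exactInputSingle(params, {"from": account})
        tx.wait(1)          # block until at least one confirmation
        return tx.txid
    except TransactionError:
        # The swap usually still lands on-chain; give the network time to
        # confirm it before any follow-up off-chain lookups run.
        time.sleep(wait_s)
        return None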
"Dropped and replaced" means the transaction is being replaced by a new one, Eth is being overloaded with a new gas fee. My guess is that you need to increase your gas costs in order to average the price.

Testing k6 groups with separate throughput

I am targeting 10x load for an API. This API contains 6 endpoints which should be under test, but each endpoint has its own throughput, which should be multiplied by 10.
Right now I put all endpoints in one script file, but it doesn't make sense to have the same throughput for all endpoints. I want to run k6 and have it stop automatically when the target throughput is reached for a specific group.
Example:
api/GetUser > current 1k RPM > target 10k RPM
api/GetManyUsers > current 500 RPM > target 5k RPM
The main problem is that when I put each endpoint in a separate group in one single script, k6 iterates over both groups/endpoints with the same iteration count and the same virtual users, which leads to reaching 10x for both endpoints, which is not required at the moment.
One more thing: I already tried to separate the endpoints into separate scripts, but this is difficult to manage and makes monitoring harder, because all 6 endpoints should run in parallel.
What you need can currently be approximated roughly with the __ITER and/or __VU execution context variables. Have a single default function that does something like this:
if (__ITER % 3 == 0) {
    CallGetManyUsers(); // ~33% of iterations
} else {
    CallGetUser(); // ~67% of iterations
}
In the very near future we plan to also add a more elegant way of supporting multi-scenario tests in a single script: https://github.com/loadimpact/k6/pull/1007

State, reward per step in a multiagent environment

(crossposted: https://ai.stackexchange.com/questions/15693/state-reward-per-step-in-a-multiagnet-environment)
In a single agent environment, the agent takes an action, then observes the next state and reward:
for ep in range(num_episodes):
    action = dqn.select_action(state)
    next_state, reward = env.step(action)
Implicitly, the logic for moving the simulation (env) forward is embedded inside the env.step() function.
Now in the multiagent scenario, agent 1 ($a_1$) has to make a decision at time $t_{1a}$, which will finish at time $t_{2a}$, and agent 2 ($a_2$) makes a decision at time $t_{1b} < t_{1a}$ which is finished at $t_{2b} > t_{2a}$.
If both of their actions started and finished at the same time, then it could easily be implemented as:
for ep in range(num_episodes):
    action1, action2 = dqn.select_action([state1, state2])
    next_state_1, reward_1, next_state_2, reward_2 = env.step([action1, action2])
because the env can execute both in parallel, wait till they are done, and then return the next states and rewards. But in the scenario that I described previously, it is not clear (at least to me) how to implement this. Here, we need to explicitly track time and check at every timepoint whether an agent needs to make a decision. Just to be concrete:
for ep in range(num_episodes):
    for t in range(total_time):
        action1 = dqn.select_action(state1)
        env.step(action1)  # this step might take 5t to complete
As such, the step() function won't return the reward until 5t later. In the meantime, agent 2 arrives and has to make a decision; its reward and next state won't be observed until 10t later.
To summarize, how would one implement a multiagent environment with asynchronous actions/rewards per agent?
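One way to structure this is to drive the loop from an event queue keyed by action-completion times, instead of stepping every agent at every timestep. Below is a minimal sketch under that assumption; every method on env and the agents (observe, apply, store_transition, and select_action returning a duration) is a placeholder for illustration, not an existing API:
import heapq

def run_episode(env, agents, total_time):
    # Event queue of (finish_time, agent_id): an agent is only asked for a new
    # action once its previous one has completed.
    pending = [(0.0, i) for i in range(len(agents))]   # everyone decides at t=0
    heapq.heapify(pending)
    while pending:
        t, i = heapq.heappop(pending)
        if t > total_time:
            break
        state, reward = env.observe(i, t)            # outcome of agent i's last action (dummy at t=0)
        agents[i].store_transition(state, reward)
        action, duration = agents[i].select_action(state)
        env.apply(i, action, start=t)                # env keeps advancing asynchronously
        heapq.heappush(pending, (t + duration, i))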

host.json: meaning of batchSize

Does it make sense to set batchSize = 1 in case I would like to process files one at a time?
I tried batchSize = 1000 and batchSize = 1 - they seem to have the same effect.
{
  "version": "2.0",
  "functionTimeout": "00:15:00",
  "aggregator": {
    "batchSize": 1,
    "flushTimeout": "00:00:30"
  }
}
Edited:
Added to app settings:
WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT = 1
The function is still triggered simultaneously (using a blob trigger) when two more files are uploaded.
From https://github.com/Azure/azure-functions-host/wiki/Configuration-Settings
WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT = 1
Set a maximum number of instances that a function app can scale to. This limit is not yet fully supported - it does work to limit your scale out, but there are some cases where it might not be completely foolproof. We're working on improving this.
I think I can close this issue. There is no easy way to enforce one-message-at-a-time processing across multiple function app instances.
I think you misunderstand the meaning of batchSize under aggregator. That batchSize is the maximum number of requests to aggregate; the aggregator configures how the runtime aggregates data about function executions over a period of time (for Application Insights metrics).
From your description, what you want is the Azure Queue batchSize. It sets the number of queue messages that the Functions runtime retrieves simultaneously and processes in parallel, and if you want to avoid parallel execution for messages received on one queue, you can set that batchSize to 1 (one message at a time).
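For reference, the queue batchSize described here lives under extensions in a v2 host.json, not under aggregator. A minimal sketch; setting newBatchThreshold to 0 as well keeps the runtime from fetching a new batch before the current message finishes:
{
  "version": "2.0",
  "extensions": {
    "queues": {
      "batchSize": 1,
      "newBatchThreshold": 0
    }
  }
}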

Spark streaming maxRate is violated sometimes

I have a simple Spark Streaming process (1.6.1) which receives data from Azure Event Hub. I am experimenting with back pressure and maxRate settings. This is my configuration:
spark.streaming.backpressure.enabled = true
spark.streaming.backpressure.pid.minRate = 900
spark.streaming.receiver.maxRate = 1000
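For reference, a minimal PySpark sketch of how these properties would be set programmatically before the StreamingContext is created (the Event Hubs receiver wiring is omitted, and the 10-second batch interval is an assumption):
from pyspark import SparkConf, SparkContext
from pyspark.streaming import StreamingContext

conf = (SparkConf()
        .set("spark.streaming.backpressure.enabled", "true")
        .set("spark.streaming.backpressure.pid.minRate", "900")
        .set("spark.streaming.receiver.maxRate", "1000"))
sc = SparkContext(conf=conf)
ssc = StreamingContext(sc, batchDuration=10)  # assumed 10-second microbatches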
I use two receivers, therefore per microbatch I would expect 2000 messages in total. Most of the time this works fine (the total event count is below or equal to the maxRate value). However, sometimes there are spikes which violate the maxRate value.
My test case is as follows:
- send 10k events to azure event hub
- mock job/cluster downtime (no streaming job is running) 60sec delay
- start streaming job
- process events and assert events number smaller or equal to 2000
In that test I can observe that the total number of events is sometimes higher than 2000, for example: 2075, 2530, 2040. It is not significantly higher and the processing is not time consuming, but I would still expect the total number of events per microbatch to obey the maxRate value. Furthermore, sometimes the total number of events is smaller than backpressure.pid.minRate, for example: 811, 631.
Am I doing something wrong?