host.json: meaning of batchSize (Azure Functions)

Does it make sense to set batchSize = 1 if I would like to process files one at a time? I tried batchSize = 1000 and batchSize = 1 - both seem to have the same effect:
{
  "version": "2.0",
  "functionTimeout": "00:15:00",
  "aggregator": {
    "batchSize": 1,
    "flushTimeout": "00:00:30"
  }
}
Edit:
Added into app settings:
WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT = 1
The function is still triggered simultaneously when using the blob trigger: two more files were uploaded and both were processed in parallel.

From https://github.com/Azure/azure-functions-host/wiki/Configuration-Settings
WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT = 1
Set a maximum number of instances that a function app can scale to. This limit is not yet fully supported - it does work to limit your scale out, but there are some cases where it might not be completely foolproof. We're working on improving this.
I think I can close this issue. There is no easy way to enforce one-message-at-a-time processing across multiple function app instances.

I think you misunderstand the meaning of batchSize under aggregator. That batchSize is the maximum number of requests to aggregate; the aggregator section configures how the runtime aggregates data about function executions over a period of time.
From your description, what you want is the Azure Queue batchSize instead. It sets the number of queue messages that the Functions runtime retrieves simultaneously and processes in parallel. If you want to avoid parallel execution for messages received on one queue, you can set batchSize to 1 (one message at a time).
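For reference, a minimal host.json sketch (v2 schema) for that queue setting; newBatchThreshold is set to 0 so the runtime does not fetch a new batch while a message is still being processed:
{
  "version": "2.0",
  "extensions": {
    "queues": {
      "batchSize": 1,
      "newBatchThreshold": 0
    }
  }
}
Note that this only serializes processing within a single instance; across multiple scaled-out instances the limitation discussed above still applies.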


Testing k6 groups with separate throughput

I am targeting 10x load for an API. The API contains 6 endpoints that should be under test, but each endpoint has its own throughput, which should be multiplied by 10.
Right now I have all endpoints in one script file, but it doesn't make sense for all endpoints to share the same throughput. I want k6 to stop automatically once the needed throughput has been reached for a specific group.
Example:
api/GetUser > current 1k RPM > target 10k RPM
api/GetManyUsers > current 500 RPM > target 5k RPM
The main problem is that when I put each endpoint in a separate group in one single script, k6 iterates over both groups/endpoints with the same iteration count and the same virtual users, which drives both endpoints to 10x - and that is not what is required at the moment.
One more thing: I already tried splitting the endpoints into separate scripts, but that is difficult to manage and makes monitoring hard, because all 6 endpoints should run in parallel.
What you need can currently be approximated roughly with the __ITER and/or __VU execution context variables. Have a single default function that has something like this:
if (__ITER % 3 == 0) {
    CallGetManyUsers(); // ~33% of iterations
} else {
    CallGetUser(); // ~67% of iterations
}
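Expanded into a self-contained sketch (the base URL and the two helper functions are placeholders I've assumed; the 2:1 split matches the 10k vs 5k RPM targets above):
import http from "k6/http";

const BASE = "https://example.com/api"; // placeholder base URL

function CallGetUser() {
  http.get(`${BASE}/GetUser`);
}

function CallGetManyUsers() {
  http.get(`${BASE}/GetManyUsers`);
}

export default function () {
  // GetUser targets 10k RPM and GetManyUsers 5k RPM, a 2:1 ratio,
  // so route one iteration in three to the lower-volume endpoint.
  if (__ITER % 3 == 0) {
    CallGetManyUsers();
  } else {
    CallGetUser();
  }
}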
In the very near future we plan to also add a more elegant way of supporting multi-scenario tests in a single script: https://github.com/loadimpact/k6/pull/1007

Is the NVIDIA Kepler shuffle "destructive"?

I'm using an implementation of parallel reduction in CUDA that uses the new Kepler shuffle instructions, similar to this:
http://devblogs.nvidia.com/parallelforall/faster-parallel-reductions-kepler/
I was searching for the minima of the rows of a given matrix, and at the end of the kernel I had the following code:
my_register = min(my_register, __shfl_down(my_register,8,16));
my_register = min(my_register, __shfl_down(my_register,4,16));
my_register = min(my_register, __shfl_down(my_register,2,16));
my_register = min(my_register, __shfl_down(my_register,1,16));
My blocks are 16*16, so everything worked fine: with that code I was getting the minima of the two sub-rows within the very same kernel.
Now I also need to return the indices of the smallest elements in every row of the matrix, so I was going to replace min with an if statement and handle the indices in a similar fashion, but I got stuck at this code:
if (my_reg > __shfl_down(my_reg,8,16)){my_reg = __shfl_down(my_reg,8,16);};
if (my_reg > __shfl_down(my_reg,4,16)){my_reg = __shfl_down(my_reg,4,16);};
if (my_reg > __shfl_down(my_reg,2,16)){my_reg = __shfl_down(my_reg,2,16);};
if (my_reg > __shfl_down(my_reg,1,16)){my_reg = __shfl_down(my_reg,1,16);};
There are no CUDA errors whatsoever, but the kernel returns garbage now. Nevertheless, I have a fix for it:
myreg_tmp = __shfl_down(myreg,8,16);
if (myreg > myreg_tmp){myreg = myreg_tmp;};
myreg_tmp = __shfl_down(myreg,4,16);
if (myreg > myreg_tmp){myreg = myreg_tmp;};
myreg_tmp = __shfl_down(myreg,2,16);
if (myreg > myreg_tmp){myreg = myreg_tmp;};
myreg_tmp = __shfl_down(myreg,1,16);
if (myreg > myreg_tmp){myreg = myreg_tmp;};
So, introducing a new tmp variable to peek into the neighboring registers fixes everything for me.
Now the question: are the Kepler shuffle instructions destructive, in the sense that invoking the same instruction twice doesn't yield the same result? I haven't assigned anything to those registers by saying "my_reg > __shfl_down(my_reg,8,16)" - which adds to my confusion. Can anyone explain what the problem with invoking shuffle twice is? I'm pretty much a newbie in CUDA, so a detailed explanation for dummies is welcome.
The warp shuffle is not destructive. The operation, if repeated under the exact same conditions, will return the same result each time. The var value (myreg in your example) does not get modified by the warp shuffle function itself.
The problem you are experiencing is due to the fact that the number of participating threads on the second invocation of __shfl_down() in your first method is different than the other invocations, in either method.
First, let's remind ourselves of a key point in the documentation:
Threads may only read data from another thread which is actively participating in the __shfl() command. If the target thread is inactive, the retrieved value is undefined.
Now let's take a look at your first "broken" method:
if (my_reg > __shfl_down(my_reg,8,16)){my_reg = __shfl_down(my_reg,8,16);};
The first time you call __shfl_down() above (within the if-clause), all threads are participating. Therefore all values returned by __shfl_down() will be what you expect. However, once the if clause is complete, only threads that satisfied the if-clause will participate in the body of the if-statement. Therefore, on the second invocation of __shfl_down() within the if-statement body, only threads for which their my_reg value was greater than the my_reg value of the thread 8 lanes above them will participate. This means that some of these assignment statements probably will not return the value you expect, because the other thread may not be participating. (The participation of the thread 8 lanes above would be dependent on the result of the if comparison done by that thread, which may or may not be true.)
The second method you propose has no such issue, and works correctly according to your statements. All threads participate in each invocation of __shfl_down().
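For the original goal of also returning indices, the same temporary-variable pattern extends naturally: shuffle the value and the index together, unconditionally, so every thread participates in every shuffle. Here is a sketch under assumed conditions (one 16-wide row per block, launched as rowMinWithIndex<<<numRows, 16>>>; __shfl_down is the Kepler-era intrinsic from the question, which newer CUDA replaces with __shfl_down_sync):
__global__ void rowMinWithIndex(const int *mat, int *minVal, int *minIdx)
{
    int col = threadIdx.x;                 // 0..15, blockDim.x == 16 assumed
    int row = blockIdx.x;                  // one block per matrix row
    int my_reg = mat[row * 16 + col];
    int my_idx = col;

    // Shuffle unconditionally into temporaries; the comparison then
    // operates on plain registers, so no thread drops out of a shuffle.
    for (int offset = 8; offset > 0; offset >>= 1) {
        int other_val = __shfl_down(my_reg, offset, 16);
        int other_idx = __shfl_down(my_idx, offset, 16);
        if (other_val < my_reg) {
            my_reg = other_val;
            my_idx = other_idx;
        }
    }
    if (col == 0) {                        // lane 0 ends up with the row minimum
        minVal[row] = my_reg;
        minIdx[row] = my_idx;
    }
}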

Continuously read data from a serial port while loop is running

First, please refer to this block of code:
while (1) {
    lt = time(NULL);
    ptr = localtime(&lt);
    int n = read(fd, buf, sizeof(buf));
    strftime(str, 100, "%c", ptr);
    int temp = sprintf(tempCommand, "UPDATE roomtemp SET Temperature='%s' WHERE Date='Today'", buf);
    temp = sprintf(dateCommand, "UPDATE roomtemp SET Date='%s' WHERE Type='DisplayTemp'", str);
    printf("%s", buf);
    mysql_query(conn, tempCommand);
    mysql_query(conn, dateCommand);
}
The read function is actually reading data coming in from a serial port. It works great, but the problem I am experiencing (I think) is the time it takes for the loop to execute. I have data being sent to the serial port every second. Suppose the data is "22" every second. What this loop does is read in "2222" or sometimes "222222". What I think is happening is that the loop takes too long to iterate, and that causes data to accumulate in the serial buffer. The read statement reads in everything in the buffer, hence giving me repeated values.
Is there any way to get around this? Perhaps at the end of the loop, I can flush the buffer. But I am not certain I know how to do this. Or perhaps there is some way to cut down the code inside the loop in order to reduce the overall time each iteration takes in the first place. My guess is that the MySQL queries are what take the most time anyway.
To start with, you should check for errors from read and properly terminate the received "string" - read does not NUL-terminate the buffer for you.
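A minimal sketch of that first point (buffer size and recovery policy are up to you):
int n = read(fd, buf, sizeof(buf) - 1);  /* leave room for the terminator */
if (n < 0) {
    perror("read");                      /* report the error and decide how to recover */
} else if (n > 0) {
    buf[n] = '\0';                       /* read() does not NUL-terminate */
    /* ... only now is buf safe to use as a string ... */
}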
To continue with your problem, there are a couple of ways to solve it. One is to put either the reading from the serial port or the database updates in a separate thread, and then pass "messages" between the threads. Be careful, though: your database seems slow, so the message queue might build up. That build-up can be averted by having a message queue of size one which always contains the latest temperature read. Then you only need a single flag that the temperature-reading thread sets and the database-updating thread checks and clears.
Another solution is to modify the protocol used for the communication so that it includes a digit telling how big the message is.
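A sketch of that length-prefixed idea (the one-digit size header is an assumption about the protocol; the inner loop handles partial reads):
char lenbyte;
if (read(fd, &lenbyte, 1) == 1 && lenbyte >= '0' && lenbyte <= '9') {
    int msglen = lenbyte - '0';          /* sender announces the payload size */
    int got = 0;
    while (got < msglen) {               /* keep reading until the whole message arrived */
        int n = read(fd, buf + got, msglen - got);
        if (n <= 0)
            break;                       /* error or EOF: give up on this message */
        got += n;
    }
    buf[got] = '\0';                     /* terminate before using buf as a string */
}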

How are functions modified at run-time and then propagated to multiple threads?

With Clojure (and other Lisp dialects) you can modify running code. So, when a function is modified at runtime, is that change made visible to multiple threads?
I'm trying to figure out how it works technically in a concurrent setting: if several threads are using a function foo, what happens when I redefine (say using defn) the function foo?
There has to be some synchronization going on: when and how does such synchronization happen and what does it cost?
Say on a JVM, is the function referenced using a volatile reference? If so, does that mean one has to pay the volatile-read cost on every single "function lookup"?
In Clojure, functions are instances of the IFn class, and they are almost always stored in vars. Vars are Clojure's mechanism for thread-local values.
When you define a function, that sets the "root binding" of the var to reference the new function.
Other threads see whatever the current value of the root binding of the var is, but can't change that value. This prevents any two threads from having to fight over the value of the var, because only the root thread can set it.
Threads can choose to use a new value of the var if they need to by calling binding, which gives them their own thread-local value that they are free to change at will, because no other thread can read it.
A good understanding of vars is well worth a little study; they are a very useful concurrency device once you get used to them.
PS: the root thread is usually the REPL.
PPS: you are of course free to store your functions in something other than vars, if for instance you need to atomically update a group of functions, though this is rare.
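A small REPL sketch of that root-binding behavior (names are illustrative; this assumes ordinary var indirection, i.e. no direct linking):
(defn greet [] "hello")                  ; sets the root binding of #'greet

(def worker
  (future
    (dotimes [_ 5]
      (println (greet))                  ; each call looks the function up through the var
      (Thread/sleep 200))))

(defn greet [] "bonjour")                ; swaps the root binding
;; the already-running future starts printing "bonjour" on its
;; subsequent calls, with no extra synchronization code needed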

Perl multi-threaded script not running in parallel

I am completely new to Perl - an absolute newbie. I am trying to develop a system that reads a database and, according to the results, builds a queue which launches another script.
HERE is the source code.
The script works as expected, except I have noticed that it doesn't really run the threads in parallel. Whether I use 1 thread or 50, the execution time is the same; 1 thread is even faster.
When I have the script display which thread did what, I can see the threads don't run at the same time: it does thread 1, then 2, then 3, and so on.
Does anyone know what I did wrong here? Again, the script itself works, just not with parallel threads.
You need to learn what semaphores actually are before you start using them. You've explicitly told the threads not to run in parallel:
my $s = Thread::Semaphore->new;
#...
while ($queue_id_list->pending > 0) {
    $s->down;
    my $info = $queue_id_list->dequeue_nb;
    if (defined($info)) {
        my @details = split(/#/, $info);
        #my $result = system("./match_name db=user_".$details[0]." id=".$details[1]);
        # normally the script above would be launched; it is a PHP script run via php-cli that does some database work
        sleep(0.1);
        #print "Thread: ". threads->self->tid. " - Done user: ".$details[0]. " and addressbook id: ". $details[1]."\r\n";
        #print $queue_id_list->pending."\r\n";
    }
    $s->up;
}
You've created a semaphore $s, which by default has a count of 1. Then in the function you're trying to run, you call $s->down at the start -- which decreases the count by 1, or blocks if the count is already <1, and $s->up at the end, which increases the count by 1.
Once a thread has called down, no other thread can get past its own down call until the first thread calls up again - so only one thread is ever inside the loop body at a time.
You should carefully read the Thread::Semaphore docs, and probably the Wikipedia article on semaphores, too.
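A minimal sketch of the fix (not the asker's full script; the queue contents, worker count, and timing are made up). Thread::Queue's dequeue_nb is already thread-safe, so in this pattern the semaphore can simply be dropped, and the slow work then overlaps across threads:
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical work items in the same "db#id" shape as the question.
my $queue_id_list = Thread::Queue->new(map { "user$_#$_" } 1 .. 20);

sub worker {
    while (defined(my $info = $queue_id_list->dequeue_nb)) {
        my @details = split /#/, $info;   # parse outside any lock
        sleep(1);                         # stand-in for the slow external call
        print "Thread ", threads->self->tid, " done: $details[0]\n";
    }
}

my @workers = map { threads->create(\&worker) } 1 .. 5;
$_->join for @workers;
If a semaphore is genuinely needed to throttle access to some shared resource, Thread::Semaphore->new($n) creates one that lets up to $n threads hold it at once.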