Is long polling more resource efficient than classical polling? - comet

Why is it more efficient for me to hold an HTTP connection open until content arrives and then re-open the connection, rather than simply opening a connection periodically?
Of course the latter approach is more hit-or-miss, but I am asking purely from a resource-efficiency point of view.

By keeping a connection open, you are tying up resources, but you avoid the overhead of repeatedly tearing down and setting up connections. Opening and closing a socket is a lot more expensive than the function call suggests: closing means sending the close intent to the connection endpoint and freeing the kernel resources and memory associated with the socket, and opening does the same in reverse. Allocating kernel resources may involve serialized calls (depending on the kernel implementation), which can affect overall system performance. Last but not least, the hit-and-miss approach is not a deterministic model.

Let's say you have a thread blocked on a socket waiting for a response (as in Comet). During that time the thread isn't scheduled by the kernel, and other things on the machine can run. If you're polling instead, the thread keeps waking up for brief bursts of work and waiting. Polling also adds latency, because you won't know there is something to do until the next poll occurs.
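To make the contrast concrete, here is a minimal long-polling handler sketched in Go (the idea is language-agnostic; the /poll path, the events channel, and the 30-second timeout are illustrative assumptions, not part of any particular framework). The goroutine serving the request simply blocks until content arrives or the timeout fires, consuming no CPU in between, whereas a classic poller pays connection setup/teardown on every probe and still learns about new data only at the next probe.

// Minimal long-poll endpoint: the handler parks until data arrives or a
// timeout elapses; on timeout the client simply re-issues the request.
package main

import (
    "fmt"
    "net/http"
    "time"
)

var events = make(chan string) // hypothetical source of new content

func longPoll(w http.ResponseWriter, r *http.Request) {
    select {
    case msg := <-events:
        fmt.Fprintln(w, msg) // respond the moment content exists
    case <-time.After(30 * time.Second):
        w.WriteHeader(http.StatusNoContent) // nothing new; client retries
    }
}

func main() {
    http.HandleFunc("/poll", longPoll)
    http.ListenAndServe(":8080", nil)
}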

Related

What affects GCP Cloud Function memory usage

I recently redeployed a handful of Python GCP Cloud Functions and noticed they are taking about 50 MB more memory, triggering memory limit errors (I had to increase the memory allocation from 256 MB to 512 MB to get them to run). Unfortunately, that is 2x the cost.
I am trying to figure out what caused the memory increase. The only thing I can think of is a recent Python package upgrade. So I pinned all package versions in requirements.txt, based on my local virtual env, which has not changed lately. The memory usage increase remained.
Are there other factors that would lead to a memory utilization increase? The Python runtime is still 3.7, and the data that the functions process has not changed. It also doesn't seem to be a change that GCP has made to Cloud Functions in general, because it has only happened with functions I have redeployed.
I can point out a few possible causes of memory limit errors:
One reason for running out of memory in Cloud Functions is discussed in the documentation:
Files that you write consume memory available to your function, and sometimes persist between invocations. Failing to explicitly delete these files may eventually lead to an out-of-memory error and a subsequent cold start.
As mentioned in this Stack Overflow answer, anything you allocate in the global scope without deallocating it stays allocated and counts against future invocations of the same instance. To minimize memory usage, only allocate objects locally, so that they get cleaned up when the function completes. Memory leaks are often difficult to detect.
Also, Cloud Functions need to send a response when they're done; if they don't respond, their allocated resources are not freed. Any unhandled exception in a function may therefore contribute to memory limit errors.
You may also want to check Auto-scaling and Concurrency, which describes another relevant behavior:
Each instance of a function handles only one concurrent request at a time. This means that while your code is processing one request, there is no possibility of a second request being routed to the same instance. Thus the original request can use the full amount of resources (CPU and memory) that you requested.
Lastly, this may be caused by issues with logging. If you are logging objects, this may prevent these objects from being garbage collected. You may need to make the logging less verbose and use string representations to see if the memory usage gets better. Either way, you could try using the Profiler in order to get more information about what’s going on with your Cloud Function’s memory.

Handling lots of req/sec in Go or Node.js

I'm developing a web app that needs to handle bursts of very high load:
once per minute I get a burst of requests within a very few seconds (~1M-3M/sec), and then for the rest of the minute I get nothing.
What's my best strategy to handle as many req/sec as possible at each front server? Just send a reply and store the request in memory somehow, to be processed in the background by the DB writer worker later?
The aim is to do as little as possible during the burst, and write the requests to the DB as soon as possible after the burst.
Edit: the order of transactions is not important;
we can lose some transactions, but 99% need to be recorded;
the latency of getting all requests to the DB can be a few seconds after the last request has been received. Let's say no more than 15 seconds.
This question is kind of vague. But I'll take a stab at it.
1) You need limits. A simple implementation will open millions of connections to the DB, which will obviously perform badly. At the very least, each connection eats megabytes of RAM on the DB. Even with connection pooling, each 'thread' could take a lot of RAM to record its (incoming) state.
If your app server has a limited number of processing threads, you can use HAProxy to "pick up the phone" and buffer the request in a queue for a few seconds until there is a free thread on your app server to handle the request.
In fact, you could just use a web server like nginx to take the request and say "200 OK". Then later, a simple app reads the web log and inserts into DB. This will scale pretty well, although you probably want one thread reading the log and several threads inserting.
2) If your language has coroutines, it may be better to handle the buffering yourself. You should measure the overhead of relying on your language runtime for scheduling.
For example, if each HTTP request is 1K of headers + data, you want to parse it and throw away everything but the one or two pieces of data that you actually need (e.g. the DB ID). If you rely on your language's coroutines as an 'implicit' queue, it will hold a 1K buffer for each coroutine while the request is being parsed. In some cases, it's more efficient/faster to have a finite number of workers and manage the queue explicitly. When you have a million things to do, small overheads add up quickly, and the language runtime won't always be optimized for your app.
Also, Go will give you far better control over your memory than Node.js. (Structs are much smaller than objects. The 'overhead' for the Keys to your struct is a compile-time thing for Go, but a run-time thing for Node.js)
3) How do you know it's working? You want to be able to know exactly how you are doing. When you rely on the language co-routines, it's not easy to ask "how many threads of execution do I have and what's the oldest one?" If you make an explicit queue, those questions are much easier to ask. (Imagine a handful of workers putting stuff in the queue, and a handful of workers pulling stuff out. There is a little uncertainty around the edges, but the queue in the middle very explicitly captures your backlog. You can easily calculate things like "drain rate" and "max memory usage" which are very important to knowing how overloaded you are.)
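A rough sketch in Go of what points 2) and 3) look like together (the queue capacity, worker count, and job struct are assumptions for illustration): the front end pushes only the parsed DB ID into one explicit, bounded channel, a fixed pool of workers drains it, and the channel length plus the age of the oldest item give you backlog and drain-rate numbers directly.

// Explicit bounded queue with a fixed worker pool; only the extracted ID is
// queued, and len(queue) is the backlog you can report to monitoring.
package main

import (
    "fmt"
    "sync"
    "time"
)

type job struct {
    id       int64     // the one piece of data we actually keep per request
    received time.Time // lets monitoring compute the age of the oldest item
}

func main() {
    queue := make(chan job, 100000) // explicit, bounded backlog
    var wg sync.WaitGroup

    // A finite number of workers drain the queue; nothing else holds buffers.
    for w := 0; w < 8; w++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for j := range queue {
                _ = j // insert j.id into the DB (or hand it to a batcher) here
            }
        }()
    }

    // HTTP handlers would do: queue <- job{id: parsedID, received: time.Now()}
    // right after parsing, discarding the rest of the request immediately.
    fmt.Println("current backlog:", len(queue))

    close(queue)
    wg.Wait()
}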
My advice: Go with Go. Long term, Go will be a much better choice. The Go runtime is a bit immature right now, but every release is getting better. Node.js is probably slightly ahead in a few areas (maturity, size of community, libraries, etc.)
How about a channel with a buffer size equal to what the DB writer can handle in 15 seconds? When the request comes in, it is sent on the channel. If the channel is full, give some sort of "System Overloaded" error response.
Then the DB writer reads from the channel and writes to the database.
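A minimal sketch of that approach in Go (the /ingest path, the 500,000-element buffer, and the batch size of 1,000 are placeholder numbers; size the buffer to roughly 15 seconds of your measured DB write throughput): the handler does a non-blocking send into the buffered channel and replies immediately, returning an overload error when the buffer is full, while one background goroutine drains the channel and writes in batches. A real version would also flush partial batches on a timer so the tail of a burst doesn't linger in memory.

// Buffered channel as the burst absorber, with admission control when full.
package main

import (
    "net/http"
)

type record struct{ payload []byte }

var buf = make(chan record, 500000) // roughly what the DB writer absorbs in 15s

func handle(w http.ResponseWriter, r *http.Request) {
    select {
    case buf <- record{payload: []byte(r.URL.RawQuery)}:
        w.WriteHeader(http.StatusAccepted) // reply now, persist later
    default:
        http.Error(w, "system overloaded", http.StatusServiceUnavailable)
    }
}

func dbWriter() {
    batch := make([]record, 0, 1000)
    for rec := range buf {
        batch = append(batch, rec)
        if len(batch) == cap(batch) {
            // bulk-insert the batch into the database here, then reuse the slice
            batch = batch[:0]
        }
    }
}

func main() {
    go dbWriter()
    http.HandleFunc("/ingest", handle)
    http.ListenAndServe(":8080", nil)
}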

How can I make an SQL query thread start, then do other work before getting results?

I have a program that does a limited form of multithreading. It is written in Delphi, and uses libmysql.dll (the C API) to access a MySQL server. The program must process a long list of records, taking ~0.1s per record. Think of it as one big loop. All database access is done by worker threads which either prefetch the next records or write results, so the main thread doesn't have to wait.
At the top of this loop, we first wait for the prefetch thread, get the results, then have the prefetch thread execute the query for the next record. The idea being that the prefetch thread will send the query immediately, and wait for results while the main thread completes the loop.
It often does work that way. But note there's nothing to ensure that the prefetch thread runs right away. I found that often the query was not sent until the main thread looped around and started waiting for the prefetch.
I sort-of fixed that by calling Sleep(0) right after launching the prefetch thread. This way the main thread surrenders the remainder of its time slice, hoping that the prefetch thread will now run and send the query. That thread will then sleep while waiting, which allows the main thread to run again.
Of course, there's plenty more threads running in the OS, but this did actually work to some extent.
What I really want to happen is for the main thread to send the query, and then have the worker thread wait for the results. Using libmysql.dll I call
result := mysql_query(p.SqlCon,pChar(p.query));
in the worker thread. Instead, I'd like to have the main thread call something like
mysql_threadedquery(p.SqlCon,pChar(p.query),thread);
which would hand off the task as soon as the data went out.
Anybody know of anything like that?
This is really a scheduling problem, so one thing I could try is launching the prefetch thread at a higher priority, then having it reduce its priority after the query is sent. But again, I don't have any MySQL call that separates sending the query from receiving the results.
Maybe it's in there and I just don't know about it. Enlighten me, please.
Added Question:
Does anyone think this problem would be solved by running the prefetch thread at a higher priority than the main thread? The idea is that the prefetch would immediately preempt the main thread and send the query. Then it would sleep waiting for the server reply. Meanwhile the main thread would run.
Added: Details of current implementation
This program performs calculations on data contained in a MySQL DB. There are 33M items with more added every second. The program runs continuously, processing new items, and sometimes re-analyzing old items. It gets a list of items to analyze from a table, so at the beginning of a pass (current item) it knows the next item ID it will need.
As each item is independent, this is a perfect target for multiprocessing. The easiest way to do this is to run multiple instances of the program on multiple machines. The program is highly optimized via profiling, rewrites, and algorithm redesign. Still, a single instance utilizes 100% of a CPU core when not data-starved. I run 4-8 copies on two quad-core workstations. But at this rate they must spend time waiting on the MySQL server. (Optimization of the Server/DB schema is another topic.)
I implemented multi-threading in the process solely to avoid blocking on the SQL calls. That's why I called this "limited multi-threading". A worker thread has one task: send a command and wait for results. (OK, two tasks.)
It turns out there are 6 blocking tasks associated with 6 tables. Two of these read data and the other 4 write results. These are similar enough to be defined by a common Task structure. A pointer to this Task is passed to a threadpool manager which assigns a thread to do the work. The main thread can check the task status through the Task structure.
This makes the main thread code very simple. When it needs to perform Task1, it waits for Task1 to be not busy, puts the SQL command in Task1 and hands it off. When Task1 is no longer busy, it contains the results (if any).
The 4 tasks that write results are trivial. The main thread has a Task write records while it goes on to the next item. When done with that item it makes sure the previous write finished before starting another.
The 2 reading threads are less trivial. Nothing would be gained by passing the read to a thread and then waiting for the results. Instead, these tasks prefetch data for the next item. So the main thread, coming to these blocking tasks, checks whether the prefetch is done, waits if necessary for the prefetch to finish, then takes the data from the Task. Finally, it reissues the Task with the NEXT item ID.
The idea is for the prefetch task to immediately issue the query and wait for the MySQL server. Then the main thread can process the current Item and by the time it starts on the next Item the data it needs is in the prefetch Task.
So the threading, a thread pool, the synchronization, data structures, etc. are all done. And that all works. What I'm left with is a Scheduling Problem.
The Scheduling Problem is this: All the speed gain is in processing the current Item while the server is fetching the next Item. We issue the prefetch task before processing the current item, but how do we guarantee that it starts? The OS scheduler does not know that it's important for the prefetch task to issue the query right away, and then it will do nothing but wait.
The OS scheduler is trying to be "fair" and allow each task to run for an assigned time slice. My worst case is this: The main thread receives its slice and issues a prefetch, then finishes the current item and must wait for the next item. Waiting releases the rest of its time slice, so the scheduler starts the prefetch thread, which issues the query and then waits. Now both threads are waiting. When the server signals the query is done the prefetch thread restarts, and requests the Results (dataset) then sleeps. When the server provides the results the prefetch thread awakes, marks the Task Done and terminates. Finally, the main thread restarts and takes the data from the finished Task.
To avoid this worst-case scheduling I need some way to ensure that the prefetch query is issued before the main thread goes on with the current item. So far I've thought of three ways to do that:
Right after issuing the prefetch task, the main thread calls Sleep(0). This should relinquish the rest of its time slice. I then hope that the scheduler runs the prefetch thread, which will issue the query and then wait. Then the scheduler should restart the main thread (I hope.) As bad as it sounds, this actually works better than nothing.
I could possibly issue the prefetch thread at a higher priority than the main thread. That should cause the scheduler to run it right away, even if it must preempt the main thread. It may also have undesirable effects. It seems unnatural for a background worker thread to get a higher priority.
I could possibly issue the query asynchronously. That is, separate sending the query from receiving the results. That way I could have the main thread send the prefetch using mysql_send_query (non blocking) and go on with the current item. Then when it needed the next item it would call mysql_read_query, which would block until the data is available.
Note that solution 3 does not even use a worker thread. This looks like the best answer, but requires a rewrite of some low-level code. I'm currently looking for examples of such asynchronous client-server access.
I'd also like any experienced opinions on these approaches. Have I missed anything, or am I doing anything wrong? Please note that this is all working code. I'm not asking how to do it, but how to do it better/faster.
Still, a single instance utilizes 100% of a CPU core when not data-starved. I run 4-8 copies on two quad-core workstations.
I have a conceptual problem here. In your situation I would either create a multi-process solution, with each process doing everything in its single thread, or I would create a multi-threaded solution that is limited to a single instance on any particular machine. Once you decide to work with multiple threads and accept the added complexity and probability of hard-to-fix bugs, then you should make maximum use of them. Using a single process with multiple threads allows you to employ varying numbers of threads for reading from and writing to the database and to process your data. The number of threads may even change during the runtime of your program, and the ratio of database and processing threads may too. This kind of dynamic partitioning of the work is only possible if you can control all threads from a single point in the program, which isn't possible with multiple processes.
I implemented multi-threading in the process solely to avoid blocking on the SQL calls.
With multiple processes there wouldn't be a real need to do so. If your processes are I/O-bound some of the time they don't consume CPU resources, so you probably simply need to run more of them than your machine has cores. But then you have the problem to know how many processes to spawn, and that may again change over time if the machine does other work too. A threaded solution in a single process can be made adaptable to a changing environment in a relatively simple way.
So the threading, a thread pool, the synchronization, data structures, etc. are all done. And that all works. What I'm left with is a Scheduling Problem.
Which you should leave to the OS. Simply have a single process with the necessary pooled threads. Something like the following:
A number of threads reads records from the database and adds them to a producer-consumer queue with an upper bound, which is somewhere between N and 2*N where N is the number of processor cores in the system. These threads will block on the full queue, and they can have increased priority, so that they will be scheduled to run as soon as the queue has more room and they become unblocked. Since they will be blocked on I/O most of the time their higher priority shouldn't be a problem.
I don't know what that number of threads is; you would need to measure.
A number of processing threads, probably one per processor core in the system. They will take work items from the queue mentioned in the previous point, and block on that queue if it's empty. Processed work items should go to another queue.
A number of threads that take processed work items from the second queue and write data back to the database. There should probably be an upper bound for the second queue as well, so that a failure to write processed data back to the database will not cause processed data to pile up and fill your entire process memory space.
The number of threads needs to be determined, but all scheduling will be performed by the OS scheduler. The key is to have enough threads to utilise all CPU cores, and the necessary number of auxiliary threads to keep them busy and deal with their outputs. If these threads come from pools you are free to adjust their numbers at runtime too.
The Omni Thread Library has a solution for tasks, task pools, producer consumer queues and everything else you would need to implement this. Otherwise you can write your own queues using mutexes.
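Since the structure matters more than the language here, below is a compact sketch of that pipeline in Go, where goroutines and bounded channels stand in for the pooled threads and producer-consumer queues (the worker counts, queue bounds, and the 1000-item workload are placeholders; in Delphi the same shape maps onto TThread workers and the bounded queues OmniThreadLibrary provides). The bounded queues are what keep the readers from racing ahead and keep results from piling up when the writers lag.

// Reader -> processor -> writer pipeline with bounded queues between stages.
package main

import (
    "runtime"
    "sync"
)

type item struct{ id int64 }
type result struct{ id int64 }

func main() {
    n := runtime.NumCPU()
    toProcess := make(chan item, 2*n)  // readers block here when it fills up
    toWrite := make(chan result, 2*n)  // processors block here if writers lag

    var readers, workers, writers sync.WaitGroup

    // Two database readers prefetch items, partitioning the ID range.
    for i := 0; i < 2; i++ {
        readers.Add(1)
        go func(start int64) {
            defer readers.Done()
            for id := start; id < 1000; id += 2 {
                toProcess <- item{id: id} // fetch the row for id here
            }
        }(int64(i))
    }

    // One CPU-bound processing goroutine per core.
    for i := 0; i < n; i++ {
        workers.Add(1)
        go func() {
            defer workers.Done()
            for it := range toProcess {
                toWrite <- result{id: it.id} // the heavy computation goes here
            }
        }()
    }

    // Two writers drain results back to the database.
    for i := 0; i < 2; i++ {
        writers.Add(1)
        go func() {
            defer writers.Done()
            for range toWrite {
                // write the result row here
            }
        }()
    }

    readers.Wait()
    close(toProcess)
    workers.Wait()
    close(toWrite)
    writers.Wait()
}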
The Scheduling Problem is this: All the speed gain is in processing the current Item while the server is fetching the next Item. We issue the prefetch task before processing the current item, but how do we guarantee that it starts?
By giving it a higher priority.
The OS scheduler does not know that it's important for the prefetch task to issue the query right away
It will know if the thread has a higher priority.
The OS scheduler is trying to be "fair" and allow each task to run for an assigned time slice.
Only for threads of the same priority. No lower priority thread will get any slice of CPU while a higher priority thread in the same process is runnable.
[Edit: That's not completely true, more information at the end. However, it is close enough to the truth to ensure that your higher priority network threads send and receive data as soon as possible.]
Right after issuing the prefetch task, the main thread calls Sleep(0).
Calling Sleep() is a bad way to force threads to execute in a certain order. Set the thread priority according to the priority of the work they perform, and use OS primitives to block higher priority threads if they should not run.
I could possibly issue the prefetch thread at a higher priority than the main thread. That should cause the scheduler to run it right away, even if it must preempt the main thread. It may also have undesirable effects. It seems unnatural for a background worker thread to get a higher priority.
There is nothing unnatural about this. It is the intended way to use threads. You only must make sure that higher priority threads block sooner or later, and any thread that goes to the OS for I/O (file or network) does block. In the scheme I sketched above the high priority threads will also block on the queues.
I could possibly issue the query asynchronously.
I wouldn't go there. This technique may be necessary when you write a server for many simultaneous connections and a thread per connection is prohibitively expensive, but otherwise blocking network access in a threaded solution should work fine.
Edit:
Thanks to Jeroen Pluimers for the poke to look closer into this. As the information in the links he gave in his comment shows my statement
No lower priority thread will get any slice of CPU while a higher priority thread in the same process is runnable.
is not true. Lower priority threads that haven't been running for a long time get a random priority boost and will indeed sooner or later get a share of CPU, even though higher priority threads are runnable. For more information about this see in particular "Priority Inversion and Windows NT Scheduler".
To test this out I created a simple demo with Delphi:
type
  TForm1 = class(TForm)
    Label1: TLabel;
    Label2: TLabel;
    Label3: TLabel;
    Label4: TLabel;
    Label5: TLabel;
    Label6: TLabel;
    Timer1: TTimer;
    procedure FormCreate(Sender: TObject);
    procedure FormDestroy(Sender: TObject);
    procedure Timer1Timer(Sender: TObject);
  private
    fLoopCounters: array[0..5] of LongWord;
    fThreads: array[0..5] of TThread;
  end;

var
  Form1: TForm1;

implementation

{$R *.DFM}

// TTestThread

type
  TTestThread = class(TThread)
  private
    fLoopCounterPtr: PLongWord;
  protected
    procedure Execute; override;
  public
    constructor Create(ALowerPriority: boolean; ALoopCounterPtr: PLongWord);
  end;

constructor TTestThread.Create(ALowerPriority: boolean;
  ALoopCounterPtr: PLongWord);
begin
  inherited Create(True);
  if ALowerPriority then
    Priority := tpLower;
  fLoopCounterPtr := ALoopCounterPtr;
  Resume;
end;

procedure TTestThread.Execute;
begin
  // Busy-loop, counting iterations so the timer can display relative CPU share
  while not Terminated do
    InterlockedIncrement(PInteger(fLoopCounterPtr)^);
end;

// TForm1

procedure TForm1.FormCreate(Sender: TObject);
var
  i: integer;
begin
  for i := Low(fThreads) to High(fThreads) do
    // fThreads[i] := TTestThread.Create(True, @fLoopCounters[i]);
    fThreads[i] := TTestThread.Create(i >= 4, @fLoopCounters[i]);
end;

procedure TForm1.FormDestroy(Sender: TObject);
var
  i: integer;
begin
  for i := Low(fThreads) to High(fThreads) do begin
    if fThreads[i] <> nil then
      fThreads[i].Terminate;
  end;
  for i := Low(fThreads) to High(fThreads) do
    fThreads[i].Free;
end;

procedure TForm1.Timer1Timer(Sender: TObject);
begin
  Label1.Caption := IntToStr(fLoopCounters[0]);
  Label2.Caption := IntToStr(fLoopCounters[1]);
  Label3.Caption := IntToStr(fLoopCounters[2]);
  Label4.Caption := IntToStr(fLoopCounters[3]);
  Label5.Caption := IntToStr(fLoopCounters[4]);
  Label6.Caption := IntToStr(fLoopCounters[5]);
end;
This creates 6 threads (on my 4-core machine), either all with lower priority, or 4 with normal and 2 with lower priority. In the first case all 6 threads run, but with wildly different shares of CPU time.
In the second case the 4 normal-priority threads run with roughly equal shares of CPU time, and the other two threads get a little CPU as well.
However, their share of CPU time is very, very small, well below a percent of what the other threads receive.
And to get back to your question: A program using multiple threads with custom priority, coupled via producer-consumer queues, should be a viable solution. In the normal case the database threads will block most of the time, either on the network operations or on the queues. And the Windows scheduler will make sure that even a lower priority thread will not completely starve to death.
I don't know of any database access layer that permits this.
The reason is that each thread has its own thread-local storage (the threadvar keyword in Delphi; other languages have equivalents, and it is used in a lot of frameworks).
When you start something on one thread and continue it on another, these local storages get mixed up, causing all sorts of havoc.
The best you can do is this:
pass the query and parameters to the thread that will handle this (use the standard Delphi thread synchronization mechanisms for this)
have the actual query thread perform the query
return the results to the main thread (use the standard Delphi thread synchronization mechanisms for this)
The answers to this question explain thread synchronization in more detail.
Edit: (on the presumed slowness of starting something in another thread)
"Right away" is a relative term: it depends on how you do your thread synchronization and can be very fast (i.e. less than a millisecond).
Creating a new thread might take some time.
The solution is to have a threadpool of worker threads that is big enough to service a reasonable amount of requests in an efficient manner.
That way, if the system is not yet too busy, you will have a worker thread ready to start servicing your request almost immediately.
I have done this (even cross process) in a big audio application that required low latency response, and it works like a charm.
The audio server process runs at high priority waiting for requests. When it is idle, it doesn't consume CPU, but when it receives a request it responds really fast.
The answers to this question on changes with big improvements and this question on cross thread communication provide some interesting tips on how to get this asynchronous behaviour working.
Look for the words AsyncCalls, OmniThread and thread.
--jeroen
I'm putting in a second answer, for your second part of the question: your Scheduling Problem
This makes it easier to distinguish both answers.
First of all, you should read Consequences of the scheduling algorithm: Sleeping doesn't always help which is part of Raymond Chen's blog "The Old New Thing".
Sleeping versus polling is also good reading.
Basically all these make good reading.
If I understand your Scheduling Problem correctly, you have 3 kinds of threads:
Main Thread: makes sure the Fetch Threads always have work to do
Fetch Threads: (database bound) fetch data for the Processing Threads
Processing Threads: (CPU bound) process fetched data
The only way to keep 3 running is to have 2 fetch as much data as they can.
The only way to keep 2 fetching is to have 1 provide them enough entries to fetch.
You can use queues to communicate data between 1 and 2 and between 2 and 3.
Your problem now is two-fold:
finding the balance between the number of threads in category 2 and 3
making sure that 2 always have work to do
I think you have solved the former.
The latter comes down to making sure the queue between 1 and 2 is never empty.
A few tricks:
You can use Sleep(1) (see the blog article) as a simple way to "force" 2 to run
Never let the threads exit their Execute method: creating and destroying threads is expensive
choose your synchronization objects (often called IPC objects) carefully (Kudzu has a nice article on them)
--jeroen
You just have to use the standard thread synchronization mechanisms of Delphi threading.
Check your IDE help for TEvent class and its associated methods.

What happens during Stand-By and Hibernation?

It just hit me the other day. What actually happens when I tell the computer to go into Stand-By or to Hibernate?
More specifically, what implications, if any, does it have on code that is running? For example, if an application is compressing some files, encoding video, checking email, running a database query, generating reports, or just processing lots of data or doing complicated math, what happens? Can you end up with a bug in your video? Can the database query fail? Can data processing end up containing errors?
I'm asking this both out of general curiosity, but also because I started to wonder if this is something I should think about when I program myself.
You should remember that the OS (scheduler) freezes your program about a gazillion times each second. This means that your program can already function pretty well when the operating system freezes it. There isn't much difference, from your point of view, between stand-by, hibernate and context switching.
What is different is that you'll be frozen for a long time. And this is the only thing you need to think about. In most cases, this shouldn't be a problem.
If you have a network connection you'll probably need to re-establish it, and similar issues. But this just means checking for errors in all IO operations, which I'm sure you're already doing... :-)
My initial thought is that as long as your program and its ecosystem are contained within the PC that is going into stand-by or hibernation, then upon resume your program should not be affected.
However, if you are, say, updating a record in a database hosted on a separate machine, then hibernation/stand-by will likely result in a timeout.
If your program depends on such a change in power status, you can listen for the WM_POWERBROADCAST message, as mentioned on MSDN.
Stand-By keeps your "state" alive by keeping it in RAM. As a consequence if you lose power you'll lose your stored "state".
But it makes it quicker to achieve.
Hibernation stores your "state" in virtual RAM on the hard disk, so if you lose power you can still come back three days later. But it's slower.
I guess a limitation with Stand-By is how much RAM you've got, but I'm sure virtual RAM must be employed by Stand-By when it runs out of standard RAM. I'll look that up though and get back!
The Wikipedia article on ACPI contains the details about the different power savings modes which are present in modern PCs.
Here's the basic idea, from how I understand things:
The basic idea is to keep the current state of the system persisted, so when the machine is brought back into operation, it can resume at the state it was before the machine was put into sleep/standby/hibernation, etc. Think of it as serialization for your PC.
In standby, the computer will keep feeding power to the RAM, as the main memory is volatile memory that needs constant refreshing to hold on to its state. This means that the hard drives, CPU, and other components can be turned off, as long as there is enough power to keep the DRAM refreshed to keep its contents from disappearing.
In hibernation, the main memory will also be turned off, so its contents must be copied to permanent storage, such as a hard drive, before the system power is turned off. Other than that, the basic premise of hibernation is no different from standby -- store the current state of the machine and restore it at a later time.
With that in mind, it's probably not too likely that going into standby or hibernate will cause problems with tasks that are executing at the moment. However, it may not be a good idea to suspend in the middle of network activity, as depending on the protocol, your network connection could time out and be unable to resume when the system returns to its running state.
Also, there may be some machines that just have flaky power-savings drivers which may cause it to go to standby and never come back, but that's completely a different issue.
There are some implications for your code. Hibernation is more than just a context switch from the scheduler. Network connections will be closed, network drives or removable media might be disconnected during the hibernation, ...
I don't think your application can be notified of hibernation (but I might be wrong). What you should do is handle error scenarios (loss of network connectivity, for example) as gracefully as possible. And note that those error scenarios can occur during normal operation as well, not only when going into hibernation...

Is "Out Of Memory" A Recoverable Error?

I've been programming a long time, and the programs I see, when they run out of memory, attempt to clean up and exit, i.e. fail gracefully. I can't remember the last time I saw one actually attempt to recover and continue operating normally.
So much processing relies on being able to successfully allocate memory, especially in garbage collected languages, it seems that out of memory errors should be classified as non-recoverable. (Non-recoverable errors include things like stack overflows.)
What is the compelling argument for making it a recoverable error?
It really depends on what you're building.
It's not entirely unreasonable for a webserver to fail one request/response pair but then keep on going for further requests. You'd have to be sure that the single failure didn't have detrimental effects on the global state, however - that would be the tricky bit. Given that a failure causes an exception in most managed environments (e.g. .NET and Java) I suspect that if the exception is handled in "user code" it would be recoverable for future requests - e.g. if one request tried to allocate 10GB of memory and failed, that shouldn't harm the rest of the system. If the system runs out of memory while trying to hand off the request to the user code, however - that kind of thing could be nastier.
In a library, you want to efficiently copy a file. When you do that, you'll usually find that copying using a small number of big chunks is much more effective than copying a lot of smaller ones (say, it's faster to copy a 15MB file by copying 15 1MB chunks than copying 15'000 1K chunks).
But the code works with any chunk size. So while it may be faster with 1MB chunks, if you design for a system where a lot of files are copied, it may be wise to catch OutOfMemoryError and reduce the chunk size until you succeed.
Another place is a cache for objects stored in a database. You want to keep as many objects in the cache as possible, but you don't want to interfere with the rest of the application. Since these objects can be recreated, attaching the cache to an out-of-memory handler that drops entries until the rest of the app has enough room to breathe again is a smart way to conserve memory.
Lastly, for image manipulation, you want to load as much of the image into memory as possible. Again, an OOM-handler allows you to implement that without knowing in advance how much memory the user or OS will grant your code.
[EDIT] Note that I work under the assumption here that you've given the application a fixed amount of memory and this amount is smaller than the total available memory excluding swap space. If you can allocate so much memory that part of it has to be swapped out, several of my comments don't make sense anymore.
Users of MATLAB run out of memory all the time when performing arithmetic with large arrays. For example if variable x fits in memory and they run "x+1" then MATLAB allocates space for the result and then fills it. If the allocation fails MATLAB errors and the user can try something else. It would be a disaster if MATLAB exited whenever this use case came up.
OOM should be recoverable because shutdown isn't the only strategy to recovering from OOM.
There is actually a pretty standard solution to the OOM problem at the application level.
As part of your application design, determine a safe minimum amount of memory required to recover from an out-of-memory condition (e.g. the memory required to auto-save documents, bring up warning dialogs, log shutdown data).
At the start of your application or at the start of a critical block, pre-allocate that amount of memory. If you detect an out of memory condition release your guard memory and perform recovery. The strategy can still fail but on the whole gives great bang for the buck.
Note that the application need not shut down. It can display a modal dialog until the OOM condition has been resolved.
I'm not 100% certain but I'm pretty sure 'Code Complete' (required reading for any respectable software engineer) covers this.
P.S. You can extend your application framework to help with this strategy but please don't implement such a policy in a library (good libraries do not make global decisions without an applications consent)
I think that like many things, it's a cost/benefit analysis. You can program in attempted recovery from a malloc() failure - although it may be difficult (your handler had better not fall foul of the same memory shortage it's meant to deal with).
You've already noted that the commonest case is to clean up and fail gracefully. In that case it's been decided that the cost of aborting gracefully is lower than the combination of development cost and performance cost in recovering.
I'm sure you can think of your own examples of situations where terminating the program is a very expensive option (life support machine, spaceship control, long-running and time-critical financial calculation etc.) - although the first line of defence is of course to ensure that the program has predictable memory usage and that the environment can supply that.
I'm working on a system that allocates memory for IO cache to increase performance. Then, on detecting OOM, it takes some of it back, so that the business logic could proceed, even if that means less IO cache and slightly lower write performance.
I also worked on an embedded Java application that attempted to manage OOM by forcing garbage collection, optionally releasing some non-critical objects, like pre-fetched or cached data.
The main problems with OOM handling are:
1) being able to re-try in the place where it happened, or being able to roll back and re-try from a higher point. Most contemporary programs rely too much on the language to throw and don't really manage where they end up or how to re-try the operation. Usually the context of the operation will be lost if it wasn't designed to be preserved;
2) being able to actually release some memory. This means having a kind of resource manager that knows which objects are critical and which are not, and a system that can re-request the released objects when and if they later become critical.
Another important issue is to be able to roll back without triggering yet another OOM situation. This is something that is hard to control in higher level languages.
Also, the underlying OS must behave predictably with regard to OOM. Linux, for example, will not, if memory overcommit is enabled. Many swap-enabled systems will die sooner than reporting the OOM to the offending application.
And, there's the case when it is not your process that created the situation, so releasing memory does not help if the offending process continues to leak.
Because of all this, it's often the big and embedded systems that employ these techniques, for they have the control over the OS and memory to enable them, and the discipline/motivation to implement them.
It is recoverable only if you catch it and handle it correctly.
In some cases, for example, a request tried to allocate a lot of memory. That is quite predictable and you can handle it very well.
However, in many cases in a multi-threaded application, OOM may also happen on a background thread (including threads created by the system or a 3rd-party library).
That is almost impossible to predict, and you may be unable to recover the state of all your threads.
No.
An out of memory error from the GC should not generally be recoverable inside of the current thread. (Recoverable thread (user or kernel) creation and termination should be supported, though.)
Regarding the counter-examples: I'm currently working on a D programming language project which uses NVIDIA's CUDA platform for GPU computing. Instead of manually managing GPU memory, I've created proxy objects to leverage D's GC. So when the GPU returns an out of memory error, I run a full collect and only raise an exception if it fails a second time. But this isn't really an example of out of memory recovery; it's more one of GC integration. The other examples of recovery (caches, free-lists, stacks/hashes without auto-shrinking, etc.) are all structures that have their own methods of collecting/compacting memory, which are separate from the GC and tend not to be local to the allocating function.
So people might implement something like the following:
T new2(T)( lazy T old_new ) {
    T obj;
    try {
        obj = old_new;                   // first attempt at the allocation
    } catch(OutOfMemoryException oome) {
        // Ask every registered object to compact/release memory, then retry once
        foreach(compact; Global_List_Of_Delegates_From_Compatible_Objects)
            compact();
        obj = old_new;
    }
    return obj;
}
Which is a decent argument for adding support for registering/unregistering self-collecting/compacting objects to garbage collectors in general.
In the general case, it's not recoverable.
However, if your system includes some form of dynamic caching, an out-of-memory handler can often dump the oldest elements in the cache (or even the whole cache).
Of course, you have to make sure that the "dumping" process requires no new memory allocations :) Also, it can be tricky to recover the specific allocation that failed, unless you're able to plug your cache dumping code directly at the allocator level, so that the failure isn't propagated up to the caller.
It depends on what you mean by running out of memory.
When malloc() fails on most systems, it's because you've run out of address-space.
If most of that memory is taken by caching, or by mmap'd regions, you might be able to reclaim some of it by freeing your cache or unmapping. However, this really requires that you know what you're using that memory for - and as you've noticed, either most programs don't, or it doesn't make a difference.
If you used setrlimit() on yourself (to protect against unforeseen attacks, perhaps, or maybe root did it to you), you can relax the limit in your error handler. I do this very frequently - after prompting the user if possible, and logging the event.
On the other hand, catching stack overflow is a bit more difficult, and isn't portable. I wrote a posixish solution for ECL, and described a Windows implementation, if you're going this route. It was checked into ECL a few months ago, but I can dig up the original patches if you're interested.
Especially in garbage-collected environments, it's quite likely that if you catch the OutOfMemory error at a high level of the application, lots of stuff has gone out of scope and can be reclaimed to give you back memory.
In the case of single excessive allocations, the app may be able to continue working flawlessly. Of course, if you have a gradual memory leak, you'll just run into the problem again (more likely sooner than later), but it's still a good idea to give the app a chance to go down gracefully, save unsaved changes in the case of a GUI app, etc.
Yes, OOM is recoverable. As an extreme example, the Unix and Windows operating systems recover quite nicely from OOM conditions, most of the time. The applications fail, but the OS survives (assuming there is enough memory for the OS to properly start up in the first place).
I only cite this example to show that it can be done.
The problem of dealing with OOM is really dependent on your program and environment.
For example, in many cases the place where the OOM happens most likely is NOT the best place to actually recover from an OOM state.
Now, a custom allocator could possibly work as a central point within the code that can handle an OOM. The Java allocator will perform a full GC before it actually throws an OOM exception.
The more "application aware" your allocator is, the better suited it would be as a central handler and recovery agent for OOM. Using Java again, its allocator isn't particularly application aware.
This is where something like Java is readily frustrating. You can't override the allocator. So, while you could trap OOM exceptions in your own code, there's nothing saying that some library you're using is properly trapping, or even properly THROWING, an OOM exception. It's trivial to create a class that is forever ruined by an OOM exception, as some object gets set to null in a place where "that can never happen", and it's never recoverable.
So, yes, OOM is recoverable, but it can be VERY hard, particularly in modern environments like Java with its plethora of 3rd-party libraries of various quality.
The question is tagged "language-agnostic", but it's difficult to answer without considering the language and/or the underlying system.
If memory allocation is implicit, with no mechanism to detect whether a given allocation succeeded or not, then recovering from an out-of-memory condition may be difficult or impossible.
For example, if you call a function that attempts to allocate a huge array, most languages just don't define the behavior if the array can't be allocated. (In Ada this raises a Storage_Error exception, at least in principle, and it should be possible to handle that.)
On the other hand, if you have a mechanism that attempts to allocate memory and is able to report a failure to do so (like C's malloc() or C++'s new), then yes, it's certainly possible to recover from that failure. In at least the cases of malloc() and new, a failed allocation doesn't do anything other than report failure (it doesn't corrupt any internal data structures, for example).
Whether it makes sense to try to recover depends on the application. If the application just can't succeed after an allocation failure, then it should do whatever cleanup it can and terminate. But if the allocation failure merely means that one particular task cannot be performed, or if the task can still be performed more slowly with less memory, then it makes sense to continue operating.
A concrete example: Suppose I'm using a text editor. If I try to perform some operation within the editor that requires a lot of memory, and that operation can't be performed, I want the editor to tell me it can't do what I asked and let me keep editing. Terminating without saving my work would be an unacceptable response. Saving my work and terminating would be better, but is still unnecessarily user-hostile.
This is a difficult question. At first sight it seems that having no more memory means you're out of luck, but you must also see that one can get rid of a lot of memory-related baggage if one really insists. Take, for example, the otherwise broken function strtok, which at least has no memory-allocation problems of its own, and as a counterpart g_string_split from the GLib library, which depends heavily on allocating memory, as does nearly everything in GLib or GObject-based programs. One can definitely say that in more dynamic languages memory allocation is used much more than in less flexible languages, especially C. But let us look at the alternatives. If you just end the program when you run out of memory, even carefully developed code may stop working. But if you have a recoverable error, you can do something about it. So the argument is: making it recoverable means one can choose to "handle" that situation differently (e.g. putting aside a memory block for emergencies, or degrading to a less memory-intensive mode of operation).
So the most compelling reason is this: if you provide a way of recovering, one can at least try to recover; if you do not have that choice, everything depends on always getting enough memory...
Regards
It's just puzzling me now.
At work, we have a bundle of applications working together, and memory is running low. While the fix is either to make the application bundle 64-bit (and so be able to work beyond the 2 GB limit we have on a normal Win32 OS) and/or to reduce our use of memory, this problem of "how to recover from an OOM" won't quit my head.
Of course, I have no solution, but I still keep toying with finding one for C++ (because of RAII and exceptions, mainly).
Perhaps a process that is supposed to recover gracefully should break its processing down into atomic/rollback-able tasks (i.e. using only functions/methods giving the strong/nothrow exception guarantee), with a "buffer/pool of memory" reserved for recovery purposes.
Should one of the tasks fail, the C++ bad_alloc would unwind the stack and free some stack/heap memory through RAII. The recovery feature would then salvage as much as possible (saving the initial data of the task to disk, to use on a later try), and perhaps register the task data for a later retry.
I do believe the use of C++ strong/nothrow guarantees can help a process survive in low-available-memory conditions, even if it would be akin to memory swapping (i.e. slow, somewhat unresponsive, etc.), but of course this is only theory. I just need to get smarter on the subject before trying to simulate it (i.e. creating a C++ program with a custom new/delete allocator with limited memory, and then trying to do some work under those stressful conditions).
Well...
Out of memory normally means you have to quit whatever you were doing. If you are careful about cleanup, though, it can leave the program itself operational and able to respond to other requests. It's better to have a program say "Sorry, not enough memory to do that" than say "Sorry, out of memory, shutting down."
Out of memory can be caused either by free memory depletion or by trying to allocate an unreasonably big block (like one gig). In "depletion" cases the memory shortage is global to the system and usually affects other applications and system services, and the whole system might become unstable, so it's wise to forget about recovering and reboot. In "unreasonably big block" cases no shortage actually occurs and it's safe to continue. The problem is that you can't automatically detect which case you're in. So it's safer to make the error non-recoverable and find a workaround for each case where you encounter this error - make your program use less memory, or in some cases just fix bugs in the code that invokes memory allocation.
There are already many good answers here. But I'd like to contribute with another perspective.
Depletion of just about any reusable resource should be recoverable in general. The reasoning is that each and every part of a program is basically a subprogram. Just because one sub cannot complete to its end at this very point in time does not mean that the entire state of the program is garbage. Just because the parking lot is full of cars does not mean that you trash your car. Either you wait a while for a spot to be free, or you drive to a store further away to buy your cookies.
In most cases there is an alternative way. Making an out-of-memory error unrecoverable effectively removes a lot of options, and none of us like to have anyone decide for us what we can and cannot do.
The same applies to disk space. It's really the same reasoning. And contrary to your insinuation that stack overflow is unrecoverable, I would say that it's an arbitrary limitation. There is no good reason why you should not be able to throw an exception (popping a lot of frames) and then use another, less efficient approach to get the job done.
My two cents :-)
If you are really out of memory, you are doomed, since you cannot free anything anymore.
If you are out of memory, but something like a garbage collector can kick in and free up some memory, you are not dead yet.
The other problem is fragmentation. Although you might not be out of memory overall, fragmentation might still leave you unable to allocate the huge chunk you want.
I know you asked for arguments for, but I can only see arguments against.
I don't see any way to achieve this in a multi-threaded application. How do you know which thread is actually responsible for the out-of-memory error? One thread could be allocating new memory constantly and have GC roots to 99% of the heap, but the first allocation that fails could occur in another thread.
A practical example: whenever I have encountered an OutOfMemoryError in our Java application (running on a JBoss server), it's not like one thread dies and the rest of the server continues to run: no, there are several OOMEs, killing several threads (some of which are JBoss' internal threads). I don't see what I as a programmer could do to recover from that - or even what JBoss could do to recover from it. In fact, I am not even sure you CAN: the javadoc for VirtualMachineError suggests that the JVM may be "broken" after such an error is thrown. But maybe the question was more targeted at language design.
uClibc has an internal static buffer of 8 bytes or so for file I/O when there is no more memory to be allocated dynamically.
What is the compelling argument for making it a recoverable error?
In Java, a compelling argument for not making it a recoverable error is that Java allows OOM to be signalled at any time, including at times where the result could be your program entering an inconsistent state. Reliable recovery from an OOM is therefore impossible; if you catch the OOM exception, you cannot rely on any of your program state. See
No-throw VirtualMachineError guarantees
I'm working on SpiderMonkey, the JavaScript VM used in Firefox (and gnome and a few others). When you're out of memory, you may want to do any of the following things:
Run the garbage collector. We don't run the garbage collector all the time, as it would kill performance and battery, so by the time you're reaching an out-of-memory error, some garbage may have accumulated.
Free memory. For instance, get rid of some of the in-memory cache.
Kill or postpone non-essential tasks. For instance, unload from memory some tabs that haven't been used in a long time.
Log things to help the developer troubleshoot the out-of-memory error.
Display a semi-nice error message to let the user know what's going on.
...
So yes, there are many reasons to handle out-of-memory errors manually!
I have this:
#include <stddef.h>  /* size_t */
#include <stdlib.h>  /* malloc */
#include <unistd.h>  /* sleep  */

/* Keep retrying the allocation, sleeping a second between attempts,
   in the hope that another process or subsystem frees memory meanwhile. */
void *smalloc(size_t size) {
    void *mem = NULL;
    for (;;) {
        mem = malloc(size);
        if (mem == NULL) {
            sleep(1);
        } else
            break;
    }
    return mem;
}
Which has saved a system a few times already. Just because you're out of memory now doesn't mean some other part of the system, or another process running on the system, won't give some memory back soon. You had better be very, very careful before attempting such tricks, though, and keep full control over every piece of memory you allocate in your program.