How to speed up the "Waiting for resources" step of a triggered build? - palantir-foundry

Why does the "Waiting for resources" step during a build take so long at times?
When I run my build it occasionally will start running and in other cases it will be waiting for resources for quite some time. What is the cause of this?

Waiting for resources means that your build is queued behind other jobs. If the jobs ahead of yours are computationally heavy, your job may have to wait for a long time. Scheduling the build to run at a time when fewer resource-intensive jobs are queued will help.

Related

Schedule jobs between GPUs for PyTorch Models

I'm trying to build up a system that trains deep models on requests. A user comes to my web site, clicks a button and a training process starts.
However, I have two GPUs and I'm not sure which is the best way to queue/handle jobs between the two GPUs: start a job when at least one GPU is available, queue the job if there are currently no GPUs available. I'd like to use one GPU per job request.
Is this something I can do in combination with Celery? I've used this in the past but I'm not sure how to handle this GPU related problem.
Thanks a lot!
Not sure about Celery as I've never used it, but conceptually here is what seems reasonable (the question is quite open-ended anyway):
create a thread (or threads) responsible solely for receiving requests and distributing tasks to specific GPUs
if any GPU is free, assign the task to it immediately
if both are occupied, estimate the time the task (neural network training) will probably take to finish
assign it to the GPU with the smallest estimated total time
Time estimation
The ETA of the current task can be approximated quite well given a fixed number of samples and epochs. If that's not the case (e.g. with early stopping) it becomes harder, possibly much harder, and would need some heuristic.
When GPUs are overloaded (say each has 5 tasks in queue), what I would do is:
Stop the process currently running on the GPU
Run the new process for a few batches of data to get a rough estimate of how long it might take to finish
Add this estimate to those of all tasks already queued on that GPU
Now, this depends on the traffic. If it's heavy and would interrupt on-going processes too often, you should simply add new tasks to the GPU queue with the fewest tasks (some heuristic would be needed here as well; you should have estimated the possible number of requests by now, and with only 2 GPUs it probably cannot be huge).
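The dispatcher described above can be sketched in a few lines. This is a minimal illustration, not a Celery integration: the task names, the fixed GPU count, and the per-task `estimated_time` values are all hypothetical, and actual training is not simulated; only the "assign to the GPU with the smallest estimated total time" rule is shown.

```python
import threading
import queue

NUM_GPUS = 2  # assumption: two GPUs, as in the question

def gpu_dispatcher(tasks, results):
    """Assign each incoming task to the GPU with the smallest summed ETA."""
    gpu_load = [0.0] * NUM_GPUS   # summed estimated time per GPU
    while True:
        task = tasks.get()
        if task is None:          # sentinel: shut the dispatcher down
            break
        name, estimated_time = task
        gpu = min(range(NUM_GPUS), key=lambda g: gpu_load[g])
        gpu_load[gpu] += estimated_time
        results.append((name, gpu))

tasks = queue.Queue()
results = []
worker = threading.Thread(target=gpu_dispatcher, args=(tasks, results))
worker.start()
for job in [("a", 5.0), ("b", 3.0), ("c", 1.0)]:
    tasks.put(job)                # (task name, estimated run time)
tasks.put(None)
worker.join()
print(results)  # -> [('a', 0), ('b', 1), ('c', 1)]
```

In a real system the worker would launch the training process pinned to the chosen GPU (e.g. via `CUDA_VISIBLE_DEVICES`) and subtract the task's estimate from `gpu_load` when it completes.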

Is it possible to make an application that never has any hard page faults or has them all upon starting the application?

Is it possible to make an application that never has any hard page faults or has them all upon starting the application?
Sure, it can happen very easily.
In fact, most simple processes, when run back to back on a system with a page cache, will have zero hard faults, since everything (code and data) is already in the disk cache.
Here's a simple example on my system:
$ perf stat -e faults,major-faults,minor-faults true

 Performance counter stats for 'true':

                42      faults
                 0      major-faults
                42      minor-faults

       0.000569273 seconds time elapsed
There were 42 faults in total, but all of them were soft (minor) faults.
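The same counters are visible from inside a process via getrusage. A small Python sketch (Unix-only, using the standard resource module): on a warm page cache a short script like this typically reports zero major (hard) faults, matching the perf output above.

```python
import resource

# ru_majflt counts hard (major) page faults that required disk I/O;
# ru_minflt counts soft (minor) faults served from memory.
usage = resource.getrusage(resource.RUSAGE_SELF)
print("minor faults:", usage.ru_minflt)
print("major faults:", usage.ru_majflt)
```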

What is the difference between Schedulers.io() and Schedulers.computation()

I use Observables in Couchbase.
What is the difference between Schedulers.io() and Schedulers.computation()?
Brief introduction of RxJava schedulers.
Schedulers.io() – This is used to perform non-CPU-intensive operations like making network calls, reading disk files, database operations, etc. It maintains a pool of threads.
Schedulers.newThread() – Using this, a new thread will be created each time a task is scheduled. It's usually suggested not to use this scheduler unless there is a very long-running operation. The threads created via newThread() won't be reused.
Schedulers.computation() – This scheduler can be used to perform CPU-intensive operations like processing huge data, bitmap processing, etc. The number of threads created by this scheduler depends on the number of CPU cores available.
Schedulers.single() – This scheduler executes all tasks in the sequential order they are added. It can be used when sequential execution is required.
Schedulers.immediate() – This scheduler executes the task immediately in a synchronous way by blocking the main thread.
Schedulers.trampoline() – It executes the tasks in First In – First Out manner. All the scheduled tasks will be executed one by one by limiting the number of background threads to one.
Schedulers.from() – This allows us to create a scheduler from an executor by limiting the number of threads to be created. When the thread pool is occupied, tasks will be queued.
From the documentation of rx:
Schedulers.computation( ) - meant for computational work such as event-loops and callback processing; do not use this scheduler for I/O (use Schedulers.io( ) instead); the number of threads, by default, is equal to the number of processors
Schedulers.io( ) - meant for I/O-bound work such as asynchronous performance of blocking I/O, this scheduler is backed by a thread-pool that will grow as needed; for ordinary computational work, switch to Schedulers.computation( ); Schedulers.io( ) by default is a CachedThreadScheduler, which is something like a new thread scheduler with thread caching
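The computation-vs-io distinction can be mimicked with plain thread pools. The following is a rough Python analogy, not RxJava itself: a pool bounded by the CPU count stands in for Schedulers.computation(), and a larger pool (an approximation, since Python's ThreadPoolExecutor has no truly unbounded cached variant) stands in for Schedulers.io(). The function names and the 64-worker cap are illustrative choices.

```python
import os
from concurrent.futures import ThreadPoolExecutor

# "computation": bounded by core count, for CPU-bound work
computation = ThreadPoolExecutor(max_workers=os.cpu_count())
# "io": a larger pool for blocking work; grows use of threads that mostly wait
io = ThreadPoolExecutor(max_workers=64)

def cpu_task(n):
    return sum(i * i for i in range(n))   # keeps a core busy

def io_task(path):
    with open(path) as f:                 # thread mostly blocks on the disk
        return len(f.read())

result = computation.submit(cpu_task, 10_000).result()
computation.shutdown()
io.shutdown()
```

The design rationale mirrors the documentation quoted above: adding more threads than cores to CPU-bound work only adds context-switch overhead, whereas blocking I/O threads spend most of their time idle, so a larger (cached) pool improves throughput.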

Metro App BackgroundTask TimeTrigger/MaintenanceTrigger Usage

I read an article on BackgroundTasks: TimeTrigger and MaintenanceTrigger.
Here they demonstrate how these triggers can be used to download email. I'm trying to understand the practicality and appropriateness of this approach.
Quotas for BackgroundTasks on LockScreen are 2 seconds CPU time and non-LockScreen is 1 second CPU time.
Given this restriction, how is it possible that one can download emails in this amount of time? Surely, just establishing a connection to the remote server will take more time than that?
Am I misunderstanding something about how BackgroundTasks work, or is this article inaccurate?
http://blogs.msdn.com/b/windowsappdev/archive/2012/05/24/being-productive-in-the-background-background-tasks.aspx
CPU Time is not the same as the amount of seconds that have passed. Your link references a Word Document, Introduction to Background Tasks, which contains the following:
CPU usage time refers to the amount of CPU time used by the app and not the wall clock time of the background task. For example, if the background task is waiting in its code for the remote server to respond, and it is not actually using the CPU, then the wait time is not counted against the CPU quota because the background task is not using the CPU.
If you are establishing a connection to the mail server (and waiting for it to respond), then you are not using any CPU. This means the time that you spent waiting is not counted against you.
Of course, you will want to test your background task to make sure that it stays within the limits.
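The CPU-time-versus-wall-clock distinction is easy to demonstrate in any language; here is a small Python sketch (illustrative only, not WinRT code). Sleeping stands in for "waiting for the mail server to respond": half a second of wall time passes while almost no CPU time is consumed, which is why such waits don't count against the quota.

```python
import time

wall_start = time.perf_counter()   # wall-clock time
cpu_start = time.process_time()    # CPU time actually used by this process
time.sleep(0.5)                    # "waiting for the remote server"
wall_used = time.perf_counter() - wall_start
cpu_used = time.process_time() - cpu_start
print(f"wall: {wall_used:.2f}s, cpu: {cpu_used:.4f}s")
```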

How does sleep(), wait() and pause() work?

How do the sleep(), wait() and pause() functions work?
We can see the sleep operation from a more abstract point of view: it is an operation that lets you wait for an event.
The event in question is triggered when the time elapsed since the sleep invocation exceeds the sleep parameter.
When a process is active (ie: it owns a CPU) it can wait for an event in an active or in a passive way:
An active wait is when a process actively/explicitly waits for the event:
sleep( t ):
while not [event: elapsedTime > t ]:
NOP // no operation - do nothing
This is a trivial algorithm and can be implemented anywhere in a portable way, but it has the issue that while your process is actively waiting it still owns the CPU, wasting it (your process doesn't really need the CPU, while other tasks could use it).
Usually this should be done only by processes that cannot wait passively (see the point below).
A passive wait, instead, is done by asking something else to wake you up when the event happens, and suspending yourself (i.e. releasing the CPU):
sleep( t ):
system.wakeMeUpWhen( [event: elapsedTime > t ] )
release CPU
In order to implement a passive wait you need some external support: you must be able to release your CPU and to ask somebody else to wake you up when the event happens.
This may not be possible on single-task devices (like many embedded devices) unless the hardware provides a wakeMeUpWhen operation, since there is nobody to release the CPU to or to ask to be woken up.
x86 processors (and most others) offer a HLT instruction that lets the CPU sleep until an external interrupt is triggered. This way even operating system kernels can sleep to keep the CPU cool.
Modern operating systems are multitasking, which means they appear to run multiple programs simultaneously. In fact, your computer (traditionally, at least) has only one CPU, so it can execute only one instruction from one program at a time.
The way the OS makes it appear that multiple things (browsing the web, listening to music, downloading files) are happening at once is by executing each task for a very short time slice (say 10 ms). This fast switching makes it appear that everything is happening simultaneously when it is in fact happening sequentially (with obvious differences on multi-core systems).
As for the answer to the question: with sleep, wait or synchronous I/O, the program is basically telling the OS to execute other tasks and not to run it again until X ms have elapsed, the event has been signaled, or the data is ready.
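The passive-wait pattern described in the pseudocode above can be shown concretely with a condition-style primitive. This is a sketch using Python's standard threading.Event; the 0.2 s producer delay and the 5 s timeout are arbitrary illustrative values. The waiting thread releases the CPU and is woken by the runtime either when the event is signaled or when the timeout (the "sleep parameter") elapses, with no busy loop involved.

```python
import threading
import time

data_ready = threading.Event()

def producer():
    time.sleep(0.2)        # simulate slow I/O or a long computation
    data_ready.set()       # "wake me up when the event happens"

threading.Thread(target=producer).start()
# Passive wait: the calling thread is suspended, not spinning.
woken = data_ready.wait(timeout=5.0)
print("woken by event:", woken)   # True if set() fired before the timeout
```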
sleep() causes the calling thread to be removed from the operating system's ready queue and inserted into another queue, where the OS periodically checks whether the sleep() has timed out, after which the thread is made ready again. While the thread is off the ready queue, the operating system schedules other ready threads during the sleep period, including the 'idle' thread, which is always in the ready queue.
These are system calls. Look up their implementations in open-source code such as Linux or OpenBSD.