Learning garbage collection theory [closed] - language-agnostic

I want to learn the theory behind garbage collection. How do I go about it? The obvious answer is a compiler textbook... The question is: is it necessary to learn lexical analysis, parsing and the other stages that usually precede garbage collection in such a text?
In short, what are the prerequisites to learning about Garbage collection theory?
P.S. I do know what the purpose of parsing, lexical analysis, etc. is; just not how they are implemented.

Read these papers in order. They are in progressive subject matter/difficulty order (not chronological).
List taken directly from Prof. Kathryn McKinley's Memory Management course page here, where you'll find links to all the articles.
I took the course last semester, so I read all these and I have to say I learned what I set out to learn!
Note that links to freely-available copies of most of the papers below are included in the garbage-collection tag wiki at https://stackoverflow.com/tags/garbage-collection/info.
List processing in real time on a serial computer, Baker, CACM, 21(4) 280--294, 1978.
A nonrecursive list compacting algorithm, Cheney, CACM, 13(11): 677--678, 1970.
A Real-time garbage collector based on the lifetimes of objects, Lieberman & Hewitt, CACM, 26(6): 419--429, 1983.
Generation scavenging: A non-disruptive high-performance storage reclamation algorithm, Ungar, Proceedings of the first ACM SIGSOFT/SIGPLAN Software Engineering Symposium on Practical Software Development Environments, 1984, pages 157--167.
Simple generational garbage collection and fast allocation, Appel, Software--Practice and Experience 19(2):171-183, February 1989.
Age-Based Garbage Collection, D. Stefanovic, K. S. McKinley, J. E. B. Moss, ACM Conference on Object-Oriented Programming Systems, Languages and Applications. (OOPSLA), pp. 370--381. Denver CO, November 1999.
Older-first Garbage Collection in Practice: Evaluation in a Java Virtual Machine, D. Stefanovic, M. Hertz, S. M. Blackburn, K. S. McKinley, and J. E. B. Moss, Memory System Performance, Berlin, Germany, pp. 175--184, June 2002.
Beltway: Getting Around Garbage Collection Gridlock, S. M. Blackburn, R. Jones, K. S. McKinley, and J. E. B. Moss, ACM Conference on Programming Language Design and Implementation, Berlin, Germany, pp. 153--164, June 2002.
An efficient machine-independent procedure for garbage collection in various list structures, Schorr & Waite, CACM, 10(8): 501--506, 1967.
Comparison of compacting algorithms for garbage collection, Cohen & Nicolau, ACM Transactions on Programming Languages and Systems (TOPLAS), Volume 5, Issue 4, pages 532--553, October 1983.
MC2: High-Performance Garbage Collection for Memory-Constrained Environments, Sachindran, Berger & Moss, ACM Conference on Object-Oriented Programming Systems, Languages and Applications, pp. 81-96, Vancouver, BC, October 2004.
Immix: A Mark-Region Garbage Collector with Space Efficiency, Fast Collection, and Mutator Performance, Blackburn & McKinley, ACM Conference on Programming Language Design and Implementation, pp.22--32, Tucson, AZ, June 2008.
An Efficient Incremental Automatic Garbage Collector, Deutsch & Bobrow, CACM, 19(9): 522--526, September 1976.
Ulterior Reference Counting: Fast Garbage Collection without the Wait, S. M. Blackburn and K. S. McKinley, Proceedings of the ACM 2003 SIGPLAN Conference on Object-Oriented Programming Systems, Languages and Applications, pp. 344-359, Anaheim, CA, October 2003.
Cycle Tracing: Efficient Concurrent Mark-Sweep Cycle Collection, Frampton & Blackburn, 2009. (In submission to ISMM.)
Multiprocessing compactifying garbage collection, Guy L. Steele, Jr., CACM 18(9): 495-508, 1975.
On-the-fly garbage collection: an exercise in cooperation, E. W. Dijkstra, L. Lamport, A. J. Martin, C. S. Scholten, and E. F. M. Steffens, Communications of the ACM, 21(11):966--975, November 1978.
Correctness-Preserving Derivation of Concurrent Garbage Collection Algorithms, Vechev, Yahav, and Bacon, ACM Conference on Programming Language Design and Implementation, Ottawa, Ontario, pp. 341-353, 2006.
A Real-time Garbage Collector with Low Overhead and Consistent Utilization, Bacon, Cheng, and Rajan, ACM Symposium on Principles of Programming Languages, New Orleans, Louisiana, pp. 285-298, 2003.
Tax-and-spend: democratic scheduling for real-time garbage collection, Auerbach, Bacon, Cheng, Grove, Biron, Gracie, McCloskey, Micic, and Sciampacone, ACM International Conference On Embedded Software, Atlanta, GA, pp. 245-254, 2008.
Garbage collection in an uncooperative environment, H. Boehm and M. Weiser, Software Practice and Experience, 18(9):807-820, 1988.
Hoard: A Scalable Memory Allocator for Multithreaded Applications, E. D. Berger, K. S. McKinley, R. D. Blumofe, and P. R. Wilson, The Ninth International Conference on Architectural Support for Programming Languages and Operating Systems, Cambridge, MA, pp. 117--128, November 2000.
Cork: Dynamic Memory Leak Detection for Garbage-Collected Languages, Jump & McKinley, in submission to ACM Transactions on Software Practice & Experience, 2009. (Abbreviated version appears in ACM Conference on Programming Languages, Nice, France, January 2009.)
Leak Pruning, Bond & McKinley, ACM Conference on Architecture Support for Programming Languages and Operating Systems, Washington, DC, March 2009. (To appear.)
Free-me: A Static Analysis for Individual Object Reclamation, Guyer & McKinley, ACM Conference on Programming Language Design and Implementation, Ottawa, Canada, pp. 364-375, June 2006.
Garbage collection can be faster than stack allocation, Appel, Information Processing Letters 25(4):275-279, 17 June 1987.
The Garbage Collection Advantage: Improving Program Locality, Huang, Blackburn, McKinley, Moss, Wang, & Cheng, ACM Conference on Object-Oriented Programming Systems, Languages, & Applications, Vancouver, BC, pp. 69-80, October 2004.
Demystifying Magic: High-level Low-level Programming, Daniel Frampton, Stephen M. Blackburn, Perry Cheng, Robin Garner, David P. Grove, J. Eliot B. Moss & Sergey I. Salishev. ACM International Conference on Virtual Execution Environments, Washington DC, March 2009. (To appear.)
Myths and Realities: The Performance Impact of Garbage Collection, S. M. Blackburn, P. Cheng, and K. S. McKinley, ACM SIGMETRICS Conference on Measurement & Modeling Computer Systems, pp. 25--36, New York, NY, June 2004.
A unified theory of garbage collection, Bacon, Cheng, & Rajan, ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications, Vancouver, BC, Canada, pp. 50-68, 2004.

I want to learn the theory behind garbage collection. How do I go about it?
I am also a dabbler interested in garbage collection (to the point that I wrote my own garbage collected VM called HLVM). I learned by reading as many research papers on garbage collection as I could get my hands on and by playing with the ideas myself, both raw in my virtual machine and also by writing memory-safe high-level simulations.
The obvious answer is - a compiler textbook... The question is, is it necessary to learn lexical analysis, parsing and other stuff that usually precedes garbage collection in a text?
The lexical analysis, parsing and other stuff is not relevant to garbage collection. You might get an outdated cursory overview of garbage collection from a compiler book but you need to read research papers to get an up-to-date view, e.g. with regard to multicore.
In short, what are the prerequisites to learning about Garbage collection theory?
You need to know about basic graph theory, pointers, stacks, threads and (if you're interested in multi-threading) low-level concurrency primitives such as locks.
Garbage collection is all about determining reachability. When a program can no longer obtain a reference to a value because that value has become unreachable, the GC can recycle the memory that value is occupying. Reachability is determined by traversing the heap starting from a set of "global roots": global variables, plus pointers on the threads' stacks and in the cores' registers.
GC design has many facets, but you might begin with the four main garbage collection algorithms (a toy sketch of the first follows the list):
Mark-and-sweep (McCarthy, 1960)
Mark-and-compact (Haddon and Waite, 1967)
Stop-and-copy (Cheney, 1970)
Mark-region (McKinley et al., 2007)
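Here is a minimal, memory-safe simulation of mark-and-sweep in Python, in the spirit of the high-level simulations mentioned above. All names are illustrative; a real collector works on raw memory, not Python objects.

```python
# Toy mark-and-sweep: objects are nodes in a graph, the heap is a
# list of allocated nodes, and the roots stand in for globals,
# thread stacks and registers.

class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []       # outgoing pointers to other objects
        self.marked = False

def mark(roots):
    """Mark phase: flag every object reachable from the roots."""
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if not obj.marked:
            obj.marked = True
            stack.extend(obj.refs)

def sweep(heap):
    """Sweep phase: keep marked objects, reclaim the rest."""
    live = [o for o in heap if o.marked]
    for o in live:
        o.marked = False     # reset for the next collection cycle
    return live

# Build a tiny heap: a -> b is live, c is unreachable garbage.
a, b, c = Obj("a"), Obj("b"), Obj("c")
a.refs.append(b)
heap, roots = [a, b, c], [a]

mark(roots)
heap = sweep(heap)
print([o.name for o in heap])   # ['a', 'b'] -- c was collected
```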
Perhaps the most notable evolution of these basic ideas is generational garbage collection, which was the de facto standard design for many years.
My personal feeling is that some of the more obscure work on garbage collection conveys just as much useful information, so I'd also highly recommend:
Baker's treadmill (a beautiful real-time GC).
VCGC (a completely different tri-color marking scheme).
You might also like to study the three kinds of write barrier (Dijkstra's, Steele's and Yuasa's) and look at the card marking and remembered set techniques.
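To make the idea concrete, here is a sketch of Dijkstra's incremental-update write barrier under tri-color marking. This is a toy model, not any particular VM's implementation; the types and the work list are illustrative.

```python
# Tri-color invariant: a black object must never point directly at a
# white object, so every pointer store shades the new target grey.
WHITE, GREY, BLACK = "white", "grey", "black"

class Obj:
    def __init__(self):
        self.color = WHITE
        self.fields = {}

grey_queue = []              # the collector's work list

def shade(obj):
    """Shade a white object grey so the collector will scan it."""
    if obj is not None and obj.color == WHITE:
        obj.color = GREY
        grey_queue.append(obj)

def write(src, field, target):
    """Mutator pointer store, wrapped by the write barrier."""
    shade(target)            # Dijkstra: shade the *new* target
    src.fields[field] = target
```

Steele's barrier instead reverts the black source to grey (forcing a re-scan), and Yuasa's snapshot-at-the-beginning barrier shades the *overwritten* value rather than the new one.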
Then you might also like to examine the actual design decisions some implementors chose for language implementations like Java and .NET as well as the SML/NJ compiler for Standard ML, the OCaml compiler, the Glasgow Haskell compiler and others. The differences between the techniques adopted are as big as the similarities between them!
There are also some great tangentially-related papers like Henderson's Accurate Garbage Collection in an Uncooperative Environment. I used that technique to avoid having to write a stack walker for HLVM.
The memorymanagement.org website is an invaluable resource, particularly the glossary of GC-related terms.

There is a whole book on garbage collection, and a quite good one, if I may add:
Richard Jones & Rafael Lins, Garbage Collection,
Wiley and Sons (1996), ISBN 0471941484
Richard Jones also maintains a nice site collecting garbage collection resources.
Most early garbage collection papers are eminently readable. You could start with Paul Wilson's survey of "Uniprocessor Garbage Collection Techniques" (1992, LNCS vol. 637) and then dive into the original literature on topics that sound interesting.

You might also want to take a look at Squeak: Open Personal Computing, which covers the Squeak Smalltalk garbage collector, among other ST design issues. You should also take a look at Squeak itself - it is almost completely written in Smalltalk, and all the source, including the GC, is freely available and easy to study using the Smalltalk browsers.

Related

In Diffusion Models literature, what do they mean by "the reversal of the diffusion process has the identical functional form as the forward process"?

I am trying to understand diffusion probabilistic models, in particular the paper by Sohl-Dickstein et al. (2015, https://arxiv.org/abs/1503.03585). When defining the backward diffusion model, they take the conditional probability of one step of de-noising to be normally distributed:
[equation from the paper: the reverse conditional is taken to be Gaussian, p(x_{t-1} | x_t) = N(x_{t-1}; f_μ(x_t, t), f_Σ(x_t, t))]
They argue that this can be done because the reversal of the diffusion process has the identical functional form as the forward diffusion. What do they mean by this? Does someone have a good reference where this is explained in more detail?
Many thanks for your help!
I tried looking at the reference given in the paper:
Feller, W. On the theory of stochastic processes, with particular reference to applications. In Proceedings of the [First] Berkeley Symposium on Mathematical Statistics and Probability. The Regents of the University of California, 1949.
However, I am struggling to find a satisfactory answer within that particular reference.
It means that the reversal of the diffusion process can be modeled with the same distribution family as the forward process: Gaussian in the continuous case (binomial in the discrete case), provided each diffusion step is sufficiently small.
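To spell that out, here is the shape of the two kernels in the Gaussian case, using the paper's notation (β_t is the per-step noise variance; f_μ and f_Σ are learned):

```latex
% Forward (diffusion) kernel: a small Gaussian perturbation
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\; x_{t-1}\sqrt{1-\beta_t},\; \beta_t I\right)

% Reverse (denoising) kernel: for small \beta_t it is also Gaussian,
% with learned mean and covariance
p(x_{t-1} \mid x_t) = \mathcal{N}\!\left(x_{t-1};\; f_\mu(x_t, t),\; f_\Sigma(x_t, t)\right)
```

"Identical functional form" means both kernels are Gaussians; the Feller reference is cited for the result that this holds in the limit of small step size.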

What does the term "Real-Time Software Development" refer to?

I saw a job description with the term Real-Time Software Development:
Software Engineers at Boeing develop solutions that provide world class performance and capability to customers around the world.
Boeing Defense, Space and Security in St. Louis is looking for
software engineers to join the growing and talented teams developing
modeling and simulation software for a variety of applications,
including flight control and aerodynamic performance, weapon and
sensor systems, simulation tools and more. The software is integrated
with live assets to enable a next-generation virtual battle
environment to explore new system concepts and optimal engineering
solutions.
Our software engineers are responsible for full life-cycle software development which means you will have a hand in defining the
requirements; designing, implementing and testing the software. You
will work with a team in a casual but professional environment where
there is long-term potential for career growth into management or
technical leadership positions.
**Languages & Databases**
Real-time SW Development Tool
Real-time Target Environment
Job: Software Engineer
I can't figure out what that means in this context. What does real-time software development mean?
The links in the comments give some useful information. The real problem with real time is that it has far fewer uses than ordinary scientific or data-processing applications, and so there are fewer specialists around.
I used a real-time development environment many years ago, and a friend of mine used another one more recently. The generic characteristics were:
the development system is an IDE more or less like any other IDE
you have the ability to get the precise time any routine will take, because if you use an RT system, it is because you need deterministic processing times
you have an emulator that allows you to run the program, or more exactly simulate it running on the real system, with different inputs (including hardware inputs) and control both the outputs and the timings
you generally mix high-level programming (C or others) for non-critical parts with low-level assembly routines for time-critical parts.
The remaining really depended on the simulated system.
Real time in this context means software whose timing is deterministic: a given piece of code always runs in the same, predictable amount of time. Normal server and desktop OSes such as Mac, Linux, and Windows multitask without exact scheduling guarantees, making it impossible to say exactly how long a piece of code will take to run. In a real-time OS, the time a piece of code takes is bounded and repeatable.
This is used in space craft, aircraft and similar areas.
Not to be confused with real-time processing speed, e.g. encoding video in real time means encoding it as fast as the frames arrive.
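A toy illustration of the mindset (plain Python on a desktop OS, so it can only *detect* a missed deadline; a real-time OS would let the scheduler *guarantee* it):

```python
import time

PERIOD = 0.010   # 10 ms control period (illustrative)

def control_step():
    pass             # read sensors, compute, write actuators

next_release = time.monotonic()
for _ in range(100):
    control_step()
    next_release += PERIOD
    slack = next_release - time.monotonic()
    if slack < 0:
        # On a desktop OS this can happen at any time; on an RTOS a
        # correctly scheduled task must never reach this branch.
        print(f"deadline missed by {-slack * 1000:.2f} ms")
    else:
        time.sleep(slack)
```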

What techniques exist for the software-driven locomotion of a bipedal robot?

I'm programming a software agent to control a robot player in a simulated game of soccer. Ultimately I hope to enter it in the RoboCup competition.
Amongst the various challenges involved in creating such an agent, the motion of its body is one of the first I'm facing. The simulation I'm targeting uses a Nao robot body with 22 hinges to control: six in each leg, four in each arm and two in the neck.
[diagram of the Nao robot's hinge layout (source: sourceforge.net)]
I have an interest in machine learning and believe there must be some techniques available to control this guy.
At any point in time, it is known:
The angle of all 22 hinges
The X,Y,Z output of an accelerometer located in the robot's chest
The X,Y,Z output of a gyroscope located in the robot's chest
The location of certain landmarks (corners, goals) via a camera in the robot's head
A vector for the force applied to the bottom of each foot, along with a vector giving the position of the force on the foot's sole
The types of tasks I'd like to achieve are:
Running in a straight line as fast as possible
Moving at a defined speed (that is, one function that handles fast and slow walking depending upon an additional input)
Walking backwards
Turning on the spot
Running along a simple curve
Stepping sideways
Jumping as high as possible and landing without falling over
Kicking a ball that's in front of your feet
Making 'subconscious' stabilising movements when subjected to unexpected forces (hit by ball or another player), ideally in tandem with one of the above
For each of these tasks I believe I could come up with a suitable fitness function, but not a set of training inputs with expected outputs. That is, any machine learning approach would need to offer unsupervised learning.
I've seen some examples in open-source projects of circular functions (sine waves) wired into each hinge's angle with differing amplitudes and phases. These seem to walk in straight lines OK, but they all look a bit clunky, and it's not an approach that would work for all of the tasks I mention above.
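For reference, the sine-wave approach looks roughly like this. The hinge names and parameter values are made up for illustration; in practice they would be tuned per robot:

```python
import math

# Open-loop sine-wave gait: each hinge follows its own oscillator.
# (amplitude in degrees, phase in radians, offset in degrees)
GAIT = {
    "left_hip_pitch":   (25.0, 0.0,         0.0),
    "right_hip_pitch":  (25.0, math.pi,     0.0),
    "left_knee_pitch":  (40.0, -math.pi/2, 30.0),
    "right_knee_pitch": (40.0, math.pi/2,  30.0),
}
FREQ = 1.2   # gait cycles per second

def hinge_targets(t):
    """Target angle (degrees) for each hinge at time t (seconds)."""
    return {
        name: offset + amp * math.sin(2 * math.pi * FREQ * t + phase)
        for name, (amp, phase, offset) in GAIT.items()
    }

print(hinge_targets(0.25))
```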
Some teams apparently use inverse kinematics, though I don't know much about that.
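For the curious, inverse kinematics means solving for the joint angles that place an end effector (here, a foot) at a desired position. A minimal two-link planar example (leg segment lengths are illustrative):

```python
import math

def two_link_ik(x, y, l1, l2):
    """Joint angles placing the foot of a two-link planar leg at
    (x, y). Returns (hip, knee) in radians; raises if unreachable."""
    d2 = x * x + y * y
    cos_knee = (d2 - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    if not -1.0 <= cos_knee <= 1.0:
        raise ValueError("target out of reach")
    knee = math.acos(cos_knee)       # one of the two solutions
    hip = math.atan2(y, x) - math.atan2(l2 * math.sin(knee),
                                        l1 + l2 * math.cos(knee))
    return hip, knee

print(two_link_ik(0.15, -0.30, 0.2, 0.2))
```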
So, what approaches are there for robot biped locomotion/ambulation?
As an aside, I wrote and published a .NET library called TinMan that provides basic interaction with the soccer simulation server. It has a simple programming model for the sensors and actuators of the robot's 22 hinges.
You can read more about RoboCup's 3D Simulated Soccer League:
http://en.wikipedia.org/wiki/RoboCup_3D_Soccer_Simulation_League
http://simspark.sourceforge.net/wiki/index.php/Main_Page
http://code.google.com/p/tin-man/
There is a significant body of research literature on robot motion planning and robot locomotion.
General Robot Locomotion Control
For bipedal robots, there are at least two major approaches to robot design and control (whether the robot is simulated or physically real):
Zero Moment Point - a dynamics-based approach to locomotion stability and control.
Biologically-inspired locomotion - a control approach modeled after biological neural networks in mammals, insects, etc., that focuses on use of central pattern generators modified by other motor control programs/loops to control overall walking and maintain stability.
Motion Control for Bipedal Soccer Robot
There are really two aspects to handling the control issues for your simulated biped robot:
Basic walking and locomotion control
Task-oriented motion planning
The first part is just about handling the basic control issues for maintaining robot stability (assuming you are using some physics-based model with gravity), walking in a straight-line, turning, etc. The second part is focused on getting your robot to accomplish specific tasks as a soccer player, e.g., run toward the ball, kick the ball, block an opposing player, etc. It is probably easiest to solve these separately and link the second part as a higher-level controller that sends trajectory and goal directives to the first part.
There are a lot of relevant papers and books which could be suggested, but I've listed some potentially useful ones below that you may wish to include in whatever research you have already done.
Reading Suggestions
LaValle, Steven Michael (2006). Planning Algorithms, Cambridge University Press.
Raibert, Marc (1986). Legged Robots that Balance. MIT Press.
Vukobratovic, Miomir and Borovac, Branislav (2004). "Zero-Moment Point - Thirty Five Years of its Life", International Journal of Humanoid Robotics, Vol. 1, No. 1, pp 157–173.
Hirose, Masato and Takenaka, T (2001). "Development of the humanoid robot ASIMO", Honda R&D Technical Review, vol 13, no. 1.
Wu, QiDi and Liu, ChengJu and Zhang, JiaQi and Chen, QiJun (2009). "Survey of locomotion control of legged robots inspired by biological concept", Science in China Series F: Information Sciences, vol 52, no. 10, pp 1715--1729, Springer.
Wahde, Mattias and Pettersson, Jimmy (2002) "A brief review of bipedal robotics research", Proceedings of the 8th Mechatronics Forum International Conference, pp 480-488.
Shan, J., Junshi, C. and Jiapin, C. (2000). "Design of central pattern generator for humanoid robot walking based on multi-objective GA", In: Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 1930–1935.
Chestnutt, J., Lau, M., Cheung, G., Kuffner, J., Hodgins, J., and Kanade, T. (2005). "Footstep planning for the Honda ASIMO humanoid", Proceedings of the 2005 IEEE International Conference on Robotics and Automation (ICRA 2005), pp 629-634.
I was working on a project not that dissimilar from this (making a robotic tuna), and one of the methods we were exploring was using a genetic algorithm to tune the performance of an artificial central pattern generator (in our case the pattern was a number of sine waves operating on each joint of the tail). It might be worth giving it a shot; genetic algorithms are another one of those tools that can be incredibly powerful, if you are careful about selecting a fitness function.
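A bare-bones version of that idea, evolving per-joint (amplitude, phase) pairs; the fitness function here is a placeholder where "distance walked in simulation before falling" would go:

```python
import random

N_JOINTS = 22

def random_genome():
    """One (amplitude, phase) oscillator setting per joint."""
    return [(random.uniform(0, 45), random.uniform(0, 6.28))
            for _ in range(N_JOINTS)]

def fitness(genome):
    # Placeholder: in practice, run the gait in the simulator and
    # return the distance travelled before the robot falls over.
    return -sum((amp - 20.0) ** 2 for amp, _ in genome)

def mutate(genome, rate=0.1):
    return [(a + random.gauss(0, 2), p + random.gauss(0, 0.2))
            if random.random() < rate else (a, p)
            for a, p in genome]

pop = [random_genome() for _ in range(50)]
for generation in range(100):
    pop.sort(key=fitness, reverse=True)
    elite = pop[:10]                  # truncation selection
    pop = elite + [mutate(random.choice(elite)) for _ in range(40)]

print("best fitness:", fitness(pop[0]))
```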
Here's a great paper from 1999 by Peter Nordin and Mats G. Nordahl that outlines an evolutionary approach to controlling a humanoid robot, based on their experience building the ELVIS robot:
An Evolutionary Architecture for a Humanoid Robot
I've been thinking about this for quite some time now and I realized that you need at least two intelligent "agents" to make this work properly. The basic idea is that you have two types of intelligent activity here:
Subconscious Motor Control (SMC).
Conscious Decision Making (CDM).
Training for the SMC could be done on-line... if you really think about it: defining success within motor control basically happens when you provide a signal to your robot, and it evaluates that signal and either accepts it or rejects it. If your robot accepts a signal and it results in a "failure", then your robot goes "offline" and it can't accept any more signals. Defining "failure" and "offline" could be tricky, but I was thinking that it would be a failure if, for example, a sensor on the robot indicates that the robot is immobile (lying on the ground).
So your fitness function for the SMC might be something of the sort: numAcceptedSignals/numGivenSignals + numFailure
The CDM is another AI agent that generates signals and the fitness function for it could be: (numSignalsAccepted/numSignalsGenerated)/(numWinGoals/numLossGoals)
So what you do is you run the CDM and all the output that comes out of it goes to the SMC... at the end of a game you run your fitness functions. Alternatively, you can combine the SMC and the CDM into a single agent and make a composite fitness function based on the other two fitness functions. I don't know how else you could do it...
Finally, you have to determine what constitutes a learning session: is it half a game, full game, just a few moves, etc. If a game lasts 1 minute and you have a total of 8 players on the field, then the process of training could be VERY slow!
Update
Here is a quick reference to a paper that used genetic programming to create "softbots" that play soccer: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.36.136&rep=rep1&type=pdf
With regards to your comments: I was thinking that for the subconscious motor control (SMC), the signals would come from the conscious decision maker (CDM). This way you're evolving your SMC agent to properly handle the CDM agent's commands (signals). You want to maximize the up-time of the SMC agent regardless of what the CDM agent says.
The SMC agent receives an input, for example a vector force on a joint, and it then runs it through its processing unit to determine if it should execute that input or reject it. The SMC should only execute inputs that it "thinks" it can recover from, and it should reject inputs that it "thinks" would lead to a "catastrophic failure".
Now the SMC agent has an output: accept or reject a signal (1 or 0). The CDM can use that signal for its own training... the CDM wants to maximize the number of signals that the SMC accepts and it also wants to satisfy a goal: a high score for its own team and a low score for the opposing team. So the CDM has its own processing unit that is being evolved to satisfy both of those needs. Your reference provided a 3-layer design, while mine is only a 2-layer one... I think mine was a step in the right direction towards the 3-layer design.
One more thing to note here: is falling really a "catastrophic failure"? What if your robot falls, but the CDM makes it stand up again? I think that would be a valid behavior, so you shouldn't penalize the robot for falling... perhaps a better thing to do is penalize it for the amount of time it takes in order to perform a goal (not necessarily a soccer goal).
There is this tutorial on humanoid locomotion control that describes the software stack used on the HRP-4 humanoid (which can walk or climb stairs). It consists mainly of:
Linear inverted pendulum: a simplified model for balancing. It involves only the center of mass (COM) and ZMP already mentioned in other answers.
Trajectory optimization: the robot computes what it wants to do, ideally, for the next 2 seconds or so. It keeps recomputing this trajectory as it moves, which is known as model predictive control.
Balance control: the last stage that corrects the robot's posture based on sensor measurements and the desired trajectory.
Follow links to the academic papers and source code to learn more.
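To give a flavor of the first item, here is a minimal Euler simulation of the linear inverted pendulum model, whose dynamics are x'' = (g / z_c)(x - p), with x the COM position and p the ZMP. The COM height is illustrative:

```python
G   = 9.81    # gravity, m/s^2
Z_C = 0.80    # constant COM height, m (illustrative)

def simulate(x, v, zmp, dt=0.005, steps=400):
    """Integrate the linear inverted pendulum for `steps` steps."""
    for _ in range(steps):
        a = (G / Z_C) * (x - zmp)    # COM accelerates away from ZMP
        v += a * dt
        x += v * dt
    return x, v

# COM starts 5 cm ahead of the ZMP and diverges exponentially --
# the controller must move the ZMP (i.e., step) to catch it.
print(simulate(x=0.05, v=0.0, zmp=0.0))
```

This divergence is exactly what the trajectory optimization and balance-control stages exist to manage.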

Basis for claim that the number of bugs per line of code is constant regardless of the language used

I've heard people say (although I can't recall who in particular) that the number of bugs per line of code is roughly constant regardless of what language is used. What is the research that backs this up?
Edited to add: I don't have access to it, but apparently the authors of this paper
"asked the question whether the number of bugs per lines of code (LOC) is the same for programs written in different programming languages or not."
In his book Code Complete (quoting from the 2nd Edition), in the chapter "Developer Testing," Steve McConnell cites a handful of studies across a variety of languages:
Industry average experience is about 1-25 errors per 1000 lines of code for delivered software. The software has usually been developed using a hodgepodge of techniques (Boehm 1981, Gremillion 1984, Yourdon 1989a, Jones 1998, Jones 2000, Weber 2003). Cases that have one-tenth as many errors as this are rare; cases that have 10 times more tend not to be reported. (They probably aren't ever completed!)
The Applications Division at Microsoft experiences about 10–20 defects per 1000 lines of code during in-house testing and 0.5 defects per 1000 lines of code in released product (Moore 1992). The technique used to achieve this level is a combination of the code-reading techniques described in Other Kinds of Collaborative Development Practices, and independent testing.
Harlan Mills pioneered "cleanroom development," a technique that has been able to achieve rates as low as 3 defects per 1000 lines of code during in-house testing and 0.1 defects per 1000 lines of code in released product (Cobb and Mills 1990).
These studies ranged from high-level languages like Java, down to C++ and C, all the way down to assembly. Considering the massive impact of Code Complete on software engineering as a discipline, I suspect it is responsible for popularizing this idea.
One possible source would be Les Hatton's 1995 paper "Computer programming languages and safety-related systems", in which he concludes that language choice is at least close to irrelevant and other factors (chiefly fluency in the chosen language) are the controlling factors.
About all I could add to that would be to summarize various other papers, in which defect rates for individual projects (and such) are given. I've done a bit of looking, and never found a correlation between language and defect rate, but that's not really the same as saying the defect rate is constant across languages (i.e., they may be different, but they vary so widely within each language that I've never been able to prove a difference).

Financial applications on GPGPU

I want to know what sort of financial applications can be implemented using a GPGPU. I'm aware of option pricing / stock price estimation using Monte Carlo simulation on a GPGPU with CUDA. Can someone enumerate the various possibilities of utilizing a GPGPU for applications in the finance domain?
There are many financial applications that can be run on the GPU in various fields, including pricing and risk. There are some links from NVIDIA's Computational Finance page.
It's true that Monte Carlo is the most obvious starting point for many people. Monte Carlo is a very broad class of applications, many of which are amenable to the GPU. Many lattice-based problems can also be run on the GPU. Explicit finite difference methods run well and are simple to implement; there are many examples on NVIDIA's site as well as in the SDK. They are also used a lot in oil & gas codes, so there is plenty of material. Implicit finite difference methods can also work well depending on the exact nature of the problem; Mike Giles has a 3D ADI solver on his site, which also has other useful finance material.
GPUs are also good for linear algebra type problems, especially where you can leave the data on the GPU to do reasonable work. NVIDIA provide cuBLAS with the CUDA Toolkit and you can get cuLAPACK too.
Basically, anything that requires a lot of parallel mathematics is a fit. As you originally stated, Monte Carlo simulation of options that cannot be priced with closed-form solutions is an excellent candidate. Anything that involves large matrices and operations upon them will be ideal; after all, 3D graphics use a lot of matrix mathematics.
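To show why this workload maps so well, here is a NumPy sketch of Monte Carlo pricing of a European call under Black-Scholes (geometric Brownian motion) dynamics. Every path is independent, which is exactly the structure a GPU kernel exploits; this CPU version is illustrative, not a production pricer:

```python
import numpy as np

def mc_european_call(s0, k, r, sigma, t, n_paths=1_000_000, seed=0):
    """Monte Carlo price of a European call option.
    s0: spot, k: strike, r: risk-free rate, sigma: vol, t: maturity."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_paths)
    # Terminal price under GBM, sampled in one vectorized step.
    st = s0 * np.exp((r - 0.5 * sigma**2) * t + sigma * np.sqrt(t) * z)
    payoff = np.maximum(st - k, 0.0)
    return np.exp(-r * t) * payoff.mean()    # discounted expectation

print(mc_european_call(s0=100.0, k=105.0, r=0.05, sigma=0.2, t=1.0))
```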
Given that many trader desktops sometimes have 'workstation'-class GPUs in order to drive several monitors, possibly with video feeds and limited 3D graphics (volatility surfaces, etc.), it would make sense to run some of the pricing analytics on the GPU rather than pushing the responsibility onto a compute grid; in my experience the compute grids are frequently struggling under the weight of EVERYONE in the bank trying to use them, and some of the grid computing products leave a lot to be desired.
Outside of this particular problem, there's not a great deal more that can be easily achieved with GPUs, because the instruction set and pipelines are more limited in their functional scope compared to a regular CISC CPU.
The problem with adoption has been one of standardisation; NVIDIA had CUDA, ATI had Stream. Most banks have enough vendor lock-in to deal with without hooking their derivative analytics (which many regard as extremely sensitive IP) into a gfx card vendor's acceleration technology. I suppose with the availability of OpenCL as an open standard this may change.
F# is used a lot in finance, so you might check out these links
http://blogs.msdn.com/satnam_singh/archive/2009/12/15/gpgpu-and-x64-multicore-programming-with-accelerator-from-f.aspx
http://tomasp.net/blog/accelerator-intro.aspx
High-end GPUs are starting to offer ECC memory (a serious consideration for financial and, eh, military applications) and high-precision types.
But it really is all about Monte Carlo at the moment.
You can go to workshops on it, and from their descriptions see that it'll focus on Monte Carlo.
A good start would be probably to check NVIDIA's website:
CUDA's Finance Showcases
CUDA's Finance Tutorials
Using a GPU introduces limitations on the architecture, deployment and maintenance of your app.
Think twice before you invest effort in such a solution.
E.g. if you're running in a virtual environment, it would require all physical machines to have GPU hardware installed, plus special vGPU hardware and software support and licenses.
What if you decide to host your service in the cloud (e.g. Azure, Amazon)?
In many cases it is worth building your architecture in advance to support scale-out and be flexible and scalable (with some overhead, of course), rather than scaling up and squeezing as much as you can from your hardware.
Answering the complement of your question: anything that involves accounting can't be done on a GPGPU (or in binary floating point, for that matter), because money requires exact decimal arithmetic.
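A two-line demonstration of the point (Python's decimal module standing in for any exact decimal type):

```python
from decimal import Decimal

# Binary floating point cannot represent 0.10 exactly, so repeated
# sums of money drift; exact decimal arithmetic does not.
print(sum(0.10 for _ in range(100)))             # slightly off 10.0
print(sum(Decimal("0.10") for _ in range(100)))  # exactly 10.00
```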