NVIDIA’s TESLA and Compute Unified Device Architecture
While the war over the latest and greatest video cards for the current generation of graphics-intensive games seems always to ebb and flow between nVidia and its arch-rival ATI, I’ve long preferred nVidia for their better support of Linux. Thus, all of my machines have some sort of nVidia Graphics Processing Unit (GPU) in them.
For those who spend their workdays in the markets and their weekends pondering derivatives pricing, latency, oceans of market data, portfolio optimization, and how to make every last damn thing faster, a preference for nVidia cards could yield an unexpected benefit.
nVidia has recently unveiled a product line dubbed “TESLA” which leverages their absurdly fast GPUs to provide a supercomputer-like High Performance Computing (HPC) platform at a previously unimaginable price point. TESLA computers are regular machines that have a set of slightly modified GPUs in them; modified such that they have no video out, but instead become additive processing clusters which the machine can use for compute-intensive tasks. For about $10K you can buy a 1U machine with some 4 teraflops of capacity. By way of comparison, this is over 20 times faster than the funky Helmer project I’d been drooling over a few months ago, and it comes in a production-worthy package ready for the server room today.
So, TESLA refers to the machines built with these specialized GPUs. Making all this power usable is what CUDA is about…
CUDA is nVidia’s model for parallel programming that provides a C-based software environment for the development of applications that can take advantage of nVidia’s GPU architecture. These advantages center around parallelism, torrential memory bandwidth and obscene floating point performance. The graphs below illustrate these advantages against high-end Intel hardware.

Although I haven’t yet taken the plunge and purchased one of the trick new TESLA machines, I have been able to dip my toe into the waters as CUDA will work with most newer nVidia cards. All you need to do is download the CUDA environment and SDK and you can very quickly get up and running. They have an expansive and well-documented set of examples including many of interest to algorithmic traders, particularly surrounding derivatives pricing and Monte Carlo methods. I haven’t yet tried it under Linux, but was able to successfully modify their sample programs under Vista after less than an hour of fiddling.
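To give a flavor of the programming model, here is a minimal sketch (not taken from the SDK samples) that prices a large batch of European calls with the closed-form Black-Scholes formula, one GPU thread per option. The names `norm_cdf`, `bs_call`, `price_calls` and `CHECK`, along with all the input values, are made up for illustration, and the code assumes a reasonably current CUDA toolkit that compiles C++ with `nvcc`.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t e = (call);                                       \
        if (e != cudaSuccess) {                                       \
            fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(e)); \
            exit(1);                                                  \
        }                                                             \
    } while (0)

// Standard normal CDF via the complementary error function.
__host__ __device__ inline float norm_cdf(float x) {
    return 0.5f * erfcf(-x * 0.70710678f);  // 0.70710678 ~= 1/sqrt(2)
}

// Black-Scholes price of a European call on a non-dividend-paying asset.
__host__ __device__ inline float bs_call(float S, float K, float T,
                                         float r, float sigma) {
    float sqrtT = sqrtf(T);
    float d1 = (logf(S / K) + (r + 0.5f * sigma * sigma) * T) / (sigma * sqrtT);
    float d2 = d1 - sigma * sqrtT;
    return S * norm_cdf(d1) - K * expf(-r * T) * norm_cdf(d2);
}

// Kernel: each thread prices exactly one option.
__global__ void price_calls(const float* S, const float* K, const float* T,
                            float r, float sigma, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = bs_call(S[i], K[i], T[i], r, sigma);
}

int main() {
    const int n = 1 << 20;                 // about a million options
    const float r = 0.02f, sigma = 0.30f;  // illustrative rate and volatility
    std::vector<float> S(n), K(n), T(n), price(n);
    for (int i = 0; i < n; ++i) {
        S[i] = 50.0f + (i % 100);          // spot from 50 to 149
        K[i] = 100.0f;                     // fixed strike
        T[i] = 0.25f + 0.25f * (i % 8);    // 3 months to 2 years
    }

    const size_t bytes = n * sizeof(float);
    float *dS, *dK, *dT, *dOut;
    CHECK(cudaMalloc((void**)&dS, bytes));
    CHECK(cudaMalloc((void**)&dK, bytes));
    CHECK(cudaMalloc((void**)&dT, bytes));
    CHECK(cudaMalloc((void**)&dOut, bytes));
    CHECK(cudaMemcpy(dS, S.data(), bytes, cudaMemcpyHostToDevice));
    CHECK(cudaMemcpy(dK, K.data(), bytes, cudaMemcpyHostToDevice));
    CHECK(cudaMemcpy(dT, T.data(), bytes, cudaMemcpyHostToDevice));

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;  // round up to cover all n
    price_calls<<<blocks, threads>>>(dS, dK, dT, r, sigma, dOut, n);
    CHECK(cudaGetLastError());
    // This blocking copy also waits for the kernel to finish.
    CHECK(cudaMemcpy(price.data(), dOut, bytes, cudaMemcpyDeviceToHost));

    // Sanity check against the same formula evaluated on the CPU.
    float max_diff = 0.0f;
    for (int i = 0; i < n; ++i)
        max_diff = fmaxf(max_diff,
                         fabsf(price[i] - bs_call(S[i], K[i], T[i], r, sigma)));
    printf("max |gpu - cpu| = %g\n", max_diff);

    cudaFree(dS); cudaFree(dK); cudaFree(dT); cudaFree(dOut);
    return max_diff < 1e-2f ? 0 : 1;
}
```

Saved as, say, `bs.cu`, this should build with `nvcc bs.cu -o bs`. The kernel itself is the easy part; most of the code is the host-side shuffle of allocating device memory and copying data back and forth, and a problem this small per option may well spend longer on those copies than on the pricing. The payoff comes when each thread has real work to do, as in a Monte Carlo simulation.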
I had seen some flashes of interest in CUDA on the QuantLib mailing lists, but I’m not sure if there’s been any real follow-up. Even if it takes some time for the open source community to get plugged in, there are already a number of finance-oriented commercial offerings including a plugin to MATLAB which allows some operations within MATLAB to be offloaded to whatever CUDA subsystems are available.
If you’re comfortable with C and happen to have an nVidia card handy, I encourage you to take a look-see. The price/performance advantage is remarkable and if you have a problem that can profitably make use of it, one of these boxes might pay for itself in a big hurry!