Archive for the ‘technology’ Category

lock free

February 16th, 2010

One of the recurring technology themes in these pages has been the ongoing and dramatic move from single to multi-core systems and the need to seriously increase the parallelism in our software designs. For me, one of the seminal, large-grained design patterns was SEDA (the staged event-driven architecture). For years, this informed my systems’ designs and formed a conceptual backbone for development. That said, I’ve been broadly aware for some time that SEDA’s golden age has (incredibly!) already passed us by, but I haven’t identified what might replace it as a reference point for my design efforts.

Before considering tools, languages or patterns that might help, we need to reflect on the problem(s) we’re trying to solve. The problems inside an EMS look to me, after years of development, a lot like network routing problems. Indeed, my current view is that this (not just concurrency as I’d suggested at the time) is why the unfortunate Aleynikov & co. at GS were using erlang.

Why network routing? Think about the load on an EMS. The main issue is that you’re getting many thousands of teeny little messages per second and only a relatively small number of them matter to only a relatively small subset of ‘agents’ within the system. Reducing latency is all about making sure the time you spend on each message is minimized, and that the agents who are interested in a particular message needn’t wait for each other to do whatever they care to do based on the message. So, really you’re trying to route each message through your system with as few ‘hops’ as possible and as much parallelism as you can muster under the (radically!) new assumption that you may have hundreds or thousands of cores available to you during the lifetime of the design.

I spent some time thinking (hoping) that languages might furnish at least a partial answer, perhaps via a move to a functional language like erlang, ocaml or scala. But erlang is slow and peculiar, ocaml (as of this writing) can’t run threads in parallel within a single process, and scala looks like a bloated language on a bloated platform (jvm+java class library). And none of them seems to have achieved anything near the critical mass which is so crucial for the development of usable libraries and the availability of skilled developers with long experience in the technology. Naturally, reasonable people will disagree about such things, but this is my view (today). Java is ok (and certainly sells servers), but it’s not obvious how it’s going to help me offload my work onto a GPU anytime soon (and jni is both painful and slow) and I’ve never been able to get comfortable with just how damn big VMs get. Image size isn’t free and if we’re looking to go deep into sub-millisecond response times while running thousands of concurrent strategies, it seems we need to disintermediate the VMs and interpreters of the world. If they’re really necessary, they can be happily used for the analysis process (as I currently use R), or they can be lit up and bridged from some lower-level language for batch-like services.

The good people at Intel have been thinking about this problem for a while, as have many other seriously over-educated people. One of the (sensible sounding) conclusions reached as people look for ways to solve problems similar to my own is that in such systems we should keep messages waiting as little as possible – ideally, not at all(!). This can be a problem in SEDA-like architectures, which are basically made up of (non-blocking, asynchronous) i/o processes feeding pools of workers that are linked to one another by (blocking) queues. Blocking queues can pile up and cause all sorts of problems like priority inversion and other such enigmatically named nasties. Lock-free queues and other data structures, algos and techniques promise some ways around this and I’ve been spending time looking into how they might be employed to address my issues.
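To make the contrast with a blocking queue concrete, here is a minimal sketch of a bounded single-producer/single-consumer ring buffer. It is an illustration only, not FastFlow's or TBB's code: the names `SpscQueue`, `try_push` and `try_pop` are hypothetical, it assumes a compiler with C++11 atomics, and the 64-byte alignment is an assumed cache-line size. The point is that neither side ever takes a lock or sleeps; a full or empty queue is reported to the caller, who decides what to do next.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration: a bounded queue for exactly one producer thread
// (try_push) and exactly one consumer thread (try_pop).
// T must be default-constructible and copy-assignable.
template <typename T>
class SpscQueue {
public:
    explicit SpscQueue(std::size_t capacity)
        : buf_(capacity + 1), head_(0), tail_(0) {
        assert(capacity > 0);
    }

    // Producer side: returns false instead of blocking when the queue is full.
    bool try_push(const T& value) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (tail + 1) % buf_.size();
        if (next == head_.load(std::memory_order_acquire)) return false;  // full
        buf_[tail] = value;
        tail_.store(next, std::memory_order_release);  // publish the filled slot
        return true;
    }

    // Consumer side: returns false instead of blocking when the queue is empty.
    bool try_pop(T& out) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        if (head == tail_.load(std::memory_order_acquire)) return false;  // empty
        out = buf_[head];
        head_.store((head + 1) % buf_.size(), std::memory_order_release);
        return true;
    }

private:
    std::vector<T> buf_;  // one slot stays unused to tell "full" from "empty"
    alignas(64) std::atomic<std::size_t> head_;  // written only by the consumer
    alignas(64) std::atomic<std::size_t> tail_;  // written only by the producer
};
```

A feed handler thread would call `try_push` for each inbound message and a strategy thread would spin or yield on `try_pop`. Whether spinning beats blocking depends on how many cores you can afford to dedicate, which is exactly the assumption under discussion here.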

Before I’m besieged by throngs of angry erlang/ocaml/scala/java developers, allow me one last observation on the topic.  (Peeved python and ruby users may rant away – vous m’amusez  ;^)

Why might a lock free algorithm be better than an equivalent, hardware-based locking implementation?  The answer isn’t obvious.  If locking is implemented in hardware as is typical (eg, with a compare-and-swap (CAS) instruction), then its explicit cost is measurable in (few) nanoseconds.  Hardware is fast.  The issue isn’t the speed of execution of the underlying primitives so much as it’s a consequence of the side effects of these operations at a very low level.  For real performance, cache coherence is King.  See here for an accessible discussion by IBM’s Paul McKenney and here for some remarkable examples from Igor Ostrovsky.  This indicates that if you want the highest possible performance, you need to be aware of what is happening ‘in the metal.’  So we need to use a system-level language and erlang, java & friends lose their candidacy in spite of any fantastic benefits they might offer.

Given that even the DoD has mostly given up on Ada, we’re left with C/C++.
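As a rough illustration of the two points above (CAS as the underlying primitive, and cache lines as the thing that actually costs you), here is a small sketch in modern C++. Everything in it is an assumption made for illustration: the names `store_max`, `PaddedCounter` and `total` are hypothetical, and the 64-byte line size is typical of x86 but not guaranteed.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Assumed cache-line size; common on x86, but check your own hardware.
constexpr std::size_t kCacheLine = 64;

// Lock-free "remember the largest value seen", built on a CAS retry loop.
inline void store_max(std::atomic<long>& target, long candidate) {
    long seen = target.load(std::memory_order_relaxed);
    // On failure compare_exchange_weak refreshes 'seen', so the loop
    // re-checks whether the candidate is still larger before retrying.
    while (seen < candidate &&
           !target.compare_exchange_weak(seen, candidate,
                                         std::memory_order_relaxed)) {
    }
}

// One counter per worker, each aligned to its own cache line so that a write
// by one core does not invalidate the line a neighbour is using (false sharing).
struct alignas(kCacheLine) PaddedCounter {
    std::atomic<long> value{0};
};

// Sum the per-worker counters; intended to be called by a reader thread.
inline long total(const PaddedCounter* counters, std::size_t n) {
    assert(counters != nullptr);
    long sum = 0;
    for (std::size_t i = 0; i < n; ++i)
        sum += counters[i].value.load(std::memory_order_relaxed);
    return sum;
}
```

The CAS in `store_max` is cheap in isolation. What hurts is many cores hammering the same `target`, because each successful write pulls the line away from everyone else. Giving each worker its own padded counter and summing on read trades a little memory for far less coherence traffic.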

Ok, so language doesn’t seem to resolve much for us. (Indeed, it was mostly hopeful thinking on my part – design is mostly language agnostic and hardware is hardware…)

Apart from Intel’s own Threading Building Blocks (TBB) framework, there are a variety of toolkits available for exploiting lock free parallelism. Perhaps the newest and least known is called FastFlow, which is a C++ template library that provides a variety of facilities for writing efficient lock-free network models. It also claims to be faster than TBB, Cilk and OpenMP while holding out the promise of one day becoming CUDA- (or more generally, GPU-) aware which would be an incredible win. Finally, it is very small – the current version (not including tests and examples), weighs in at ~5K lines of (mostly) C++ templates.  Thus, it seems to me particularly well-suited for some experimentation to assess the fit of these techniques in this space and the level of difficulty of doing so.

In the remainder of this post, I’ll briefly describe the FF design and then illustrate a sample C++ program which uses FastFlow to ‘architecturally prototype’ a feed handler interacting with strategies inside an EMS / strategy container.

Read more…

EMS Internals, technology

Kooderive

February 3rd, 2010
photo by Simon Rogerson

Some time back, I’d written about NVidia’s CUDA noting that it looked ideal for many asset-pricing and monte-carlo type problems in finance.  At the time, I was hopeful that it would be quickly integrated into existing open source efforts like QuantLib, but adoption has proved slower than I’d hoped, most likely because implementing non-trivial problems on CUDA is, well, even less trivial than doing them without.

LMM on CUDA

Happily, I’ve just seen a promising first step in this direction as Über-quant and C++ artisan Mark Joshi recently announced an open-source project, Kooderive, which looks to implement the LIBOR Market Model (LMM) on top of CUDA.  His announcement on the QuantLib mailing lists reads:

Dear All,

various people have shown interest in the use of CUDA with QuantLib. I
have now made some progress on a CUDA implementation of the LIBOR
market model.

In particular, I now have a path generator for the LMM working which
does 16384 paths for 40 rates, 40 steps, 5 factor model, displaced
diffusion predictor-corrector that takes 0.1 seconds on my Quadro 4600.

The state of the project is code fragments that can be called from
other code. Those who are interested can get the code via
the subversion repository on kooderive.sourceforge.net .  The only
project file is currently for VC9 x64. It also uses thrust and the
CUDA SDK.

The next stage will be writing routines, that use QuantLib for the CPU
stuff and kooderive for the GPU stuff,  to actually price things.

A gentle reminder that I will be giving a course on the LMM and
QuantLib in June in London, and I will include a session on kooderive
if there is sufficient interest.

I am happy to take code contributions for kooderive. However, I am not
looking for a redesign of the library or contributions which introduce
dependence on other libraries. I am interested in contributions of
separate routines and of optimizations of existing routines that do
not change interfaces.

regards

Mark

Pricing exotic interest rate derivatives – The LIBOR Market Model in
QuantLib June 2010, London,
http://www.moneyscience.com/training/index.html

Assoc Prof Mark Joshi
Centre for Actuarial Studies
University of Melbourne
My website is www.markjoshi.com


EMS Internals, dereferenced, monte-carlo methods, open-source software, options pricing, technology

core arb

December 15th, 2009
core arbitrage?

FIX interface?

Cloud computing looks to have turned yet another interesting corner.   This time the turn leads towards the development of a liquid, fully electronic new marketplace in “spot instances”.

‘Spot’ means what you would expect it to in the context of trading: the current pricing for immediate delivery of a commodity.  ‘Instance’ is the atomic element within Amazon’s cloud environment; an instance is the smallest chunk of computing capability which can be provisioned within the cloud.

Amazon is making markets in cores and they’re exposing functionality just as a regular exchange would: both through user interface ‘screens’ as well as programmable APIs.

From their announcement:

Spot Instances enable you to bid for unused Amazon EC2 capacity. Instances are charged the Spot Price set by Amazon EC2, which fluctuates periodically depending on the supply of and demand for Spot Instance capacity. To use Spot Instances, you place a Spot Instance request, specifying the instance type, the region desired, the number of Spot Instances you want to run, and the maximum price you are willing to pay per instance hour. To determine how that maximum price compares to past Spot Prices, the Spot Price history is available via the Amazon EC2 API and the AWS Management Console. If your maximum price bid exceeds the current Spot Price, your request is fulfilled and your instances will run until either you choose to terminate them or the Spot Price increases above your maximum price (whichever is sooner).

embedded optionality

While the inclusion of, effectively, a market data service is neat, probably the most interesting aspect of the initial protocol they’ve designed is that it contains embedded optionality and behaves a bit like barrier options.  That is, when I set up an ‘order’, I need to specify a maximum price I’m willing to pay.  When the spot price drops below my max, I get “knocked-into” a contract and instances are allocated to me.  If the spot price rises above my max while I’m running, I get “knocked-out” of the contract and my jobs get terminated.

The intent is to allow for low-priority jobs to be dynamically run whenever pricing drops below a user’s threshold, but the (intended?) consequence is that it adds the delicious and malleable tang of path dependency to these instruments…
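The knock-in/knock-out behaviour is easy to play with in a toy model. The sketch below is not Amazon's API: `SpotRun` and `simulate_spot` are hypothetical names, and it assumes one instance, one price observation per billing period, and that you pay the prevailing spot price for each period you run (as the announcement describes). It follows the quoted rule literally: a request is fulfilled when the bid exceeds the spot price, and a running instance is terminated when the spot price rises above the bid.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical summary of how one spot request fared over a price path.
struct SpotRun {
    std::size_t knock_ins = 0;    // times the request was fulfilled
    std::size_t knock_outs = 0;   // times running instances were terminated
    std::size_t periods_run = 0;  // pricing periods spent running
    double cost = 0.0;            // sum of spot prices paid while running
};

// Walk a path of spot prices with a fixed maximum bid.
inline SpotRun simulate_spot(const std::vector<double>& spot_prices,
                             double max_bid) {
    assert(max_bid > 0.0);
    SpotRun r;
    bool running = false;
    for (double spot : spot_prices) {
        if (!running && max_bid > spot) {         // knocked in
            running = true;
            ++r.knock_ins;
        } else if (running && spot > max_bid) {   // knocked out
            running = false;
            ++r.knock_outs;
        }
        if (running) {
            ++r.periods_run;
            r.cost += spot;  // charged the spot price, not the bid
        }
    }
    return r;
}
```

Feeding it two price paths with the same average but a different ordering gives different knock-out counts and costs for the same bid, which is the path dependency mentioned above in miniature.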

secondary markets, FIX, arbitrage..?

Amazon currently controls the market entirely, but it’s not hard to imagine a secondary market evolving.  Given that others are beginning to copy Amazon’s APIs, one can also imagine markets which operate across providers …  perhaps accessed via FIX?…

Who knows?  In the not-too-distant future, we may well be able to implement ‘core arb‘ strategies…or make markets in cores… or find that we can effectively hedge with disciplined exposure to the ‘core market’ or …

FIX Protocol, dereferenced, technology

ready to launch

November 8th, 2009
he wasn't ready...

poor Jorge wasn't ready...

In this post I’m going to revisit some of the topics discussed in the recent ‘containing a strategy‘ and ‘multi-strategy trading with regimes‘ posts, focusing on the process of assembling a strategy and its context in preparation for its launch into any of a variety of modes.

I recently realized that – from the perspective of a strategy container – the process of walk-forward testing is remarkably similar to the regime-switching model we’d discussed previously.  Up until now, I’ve employed walk-forward testing in an ad-hoc manner by taking an existing strategy and then writing a little driver, very much like a unit-test scaffolding, which would walk the strategy forward, permuting parameters based on previous performance.  Not a general solution, but straightforward, as the driver simply uses stratbox’s strategy parameter optimizer as a toolkit.
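For readers who haven't written one, the kind of driver described above can be sketched in a few lines. This is a generic illustration and not stratbox code: `Score`, `Step` and `walk_forward` are hypothetical names, parameters are reduced to a single `int`, and the scoring hook stands in for whatever backtest metric you optimize. Each step picks the best parameter on an in-sample window and then records how that choice performs on the next, unseen window.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical scoring hook: performance of parameter p on data[begin, end).
using Score = std::function<double(int p, std::size_t begin, std::size_t end)>;

struct Step {
    std::size_t test_begin;  // first out-of-sample index of this step
    int chosen_param;        // best parameter on the preceding in-sample window
    double out_of_sample;    // score of that parameter on the unseen window
};

// Slide a (train, test) window pair over n observations, advancing by `test`.
inline std::vector<Step> walk_forward(std::size_t n, std::size_t train,
                                      std::size_t test,
                                      const std::vector<int>& params,
                                      const Score& score) {
    assert(train > 0 && test > 0 && !params.empty());
    std::vector<Step> steps;
    for (std::size_t start = 0; start + train + test <= n; start += test) {
        const std::size_t split = start + train;
        // Optimize on the in-sample window [start, split).
        int best = params.front();
        double best_score = score(best, start, split);
        for (std::size_t i = 1; i < params.size(); ++i) {
            const double s = score(params[i], start, split);
            if (s > best_score) {
                best_score = s;
                best = params[i];
            }
        }
        // Evaluate the chosen parameter on the out-of-sample window.
        steps.push_back({split, best, score(best, split, split + test)});
    }
    return steps;
}
```

The sequence of `chosen_param` values is what makes this look like regime switching from the container's point of view: at each boundary the strategy is effectively relaunched with a new configuration.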

I sat down to write one of these walk-forward scaffolds yesterday and started to think about how I could generalize the solution and roll it into stratbox’s GUI, when it occurred to me that I could likely kill two birds with one stone…

Read more…

EMS Internals, back-testing, regime-switching, technology

easy money

October 27th, 2009

you, hf-trading

There seems to be a developing meme out there suggesting that algorithmic-, and in particular high-frequency, trading is some kind of gold-rush route to easy money which brings to mind…

…this revision of a paper I’d read previously: “Statistical Arbitrage in the US Equities Market” by Avellaneda and Lee.   It’s a detailed and thoroughly worked (and now re-worked) paper illustrating the development and analysis of a US equity stat-arb strategy based on Principal Component Analysis (PCA) and then revised to use ETFs.

I came across this paper as I have still never used PCA in any of my own strategy development work and read Carol Alexander’s excellent Market Models over my summer vacation with an eye towards giving a PCA hedging model a spin in the near-term. Thus, I wanted another look at this paper as a reference point.  Although it’s an excellent paper, I’m not going to urge you to go out and read it immediately unless you have a reasonably pressing practical interest.  Instead, I find it interesting largely because of one of its authors – Professor Avellaneda – and its conclusions in the form of its strategies’ performance.

I’ve seen Prof Avellaneda speak a number of times at a variety of quant meetups organized by the relevant Columbia/NYU financial engineering depts.  His paper reminds me that at least once during my noisome adolescent years, my father intoned darkly that:

the streets are littered with brilliant minds

Read more…

books, dereferenced, strategy development, technology

our solid-state future

September 4th, 2009
Mmmm... hardware..

I’ve never been a hardware guy. Hardware has gotten so fast throughout my professional life that it has just never been a big issue. Also, on wall st we had a robust annual budget for h/w so I’d routinely sign off on hundreds of thousands of dollars for all sorts of machines I’d never lay eyes on and somehow they always did the trick.

Before 9/11, they’d be in server racks in the building or down the street, but since then they might also be in increasingly far-flung places like weehawken or long island, tampa, even texas or beyond. The machines always seemed unbelievably overpriced – I remember over the years pretty consistently paying something like $40K for a low-end db server.  But that’s what it cost and you could only purchase approved products from approved channels, so nobody spent much thought on it.  Now that I don’t have the same kinds of constraints – or budgets! – I increasingly have to think of hardware.

As a software engineer, the hardware itself is also insisting that I pay some uncharacteristic attention to it.  The evolution of processors has reached a point where the programming paradigms many of us have fruitfully employed over many years are no longer suited for getting full performance out of today’s machines.  The recent introduction of remarkably powerful and inexpensive parallel-computing platforms based on GPUs, like nvidia’s cuda, also outlines a future that even current university training doesn’t address in a fashion practically adapted for institutional application.  Cores are multiplying like Tribbles.

The lines between persistent storage and main memory are also blurring as consumer SSDs push up from the ‘low’-end while exotic ioDrives and the like offer a glimpse of a world where the performance gap between the two approaches nil and after their long reign myriad metallic platters will spin no more.

Read more…

EMS Internals, market data, technology

containing a strategy

August 19th, 2009

My son recently had his first birthday and amazes me daily with his new feats as he runs around increasingly stably exploring the world around him.  It occurs to me that the system I use to trade every day, Stratbox, is approaching its fourth “birthday” in the next few months.  I hadn’t originally intended to write a system – an algorithmic trading platform – but found that existing products were limited, expensive and didn’t fit my mental model of what they should do.

This isn’t surprising as I wanted the system to support all of the activities associated with our algorithmic trading.  It turns out that that’s a lot to ask of a system.  It also turns out that you learn as you go and so the system continues to evolve.  A few years ago I’d posted about the basics of a strategy container and in this post I’m going to come back to this topic and describe some of the layers of code and thought developed since then.

First, let’s consider the role of a strategy container.  Its job is to intermediate between trading strategies and the external environments with which they interact.  It must also provide services that strategies can use (e.g., position management) and that it wouldn’t make sense for each strategy to re-implement.  In the past I’ve focused on the former responsibility of adapting strategies to external environments.  Why is this necessary and interesting?  Because it allows us to take the same exact strategy and run it live, or in simulation or in backtest, etc.  Interesting and necessary, but not what I want to focus on this time.  Instead, I want to look at the services provided to strategies; the ‘ecosystem’ a strategy container provides in the hope that strategies might flourish within it.

Read more…

EMS Internals, FIX Protocol, portfolio management, strategy development, technology

real battlebots

August 17th, 2009

wedding party popper

There’s been a lot of attention focused on  trading battlebots recently.  It’s important to keep in mind that this is part of a long-standing, broad and arguably inexorable trend that is now spreading rapidly away from its successful base in industrial manufacturing to every other conceivable field from scheduling and logistics, to CAD and on to more aggressive pursuits like trading and battlefield operations.  Perhaps looking at the state of the art in related fields can inform us about the direction of our algo bots.

This article in Foreign Policy illustrates an area where automation is making great strides into historically human undertakings.  The use of so-called drone aircraft for recon and tactical missile strikes has reached a remarkable milestone: this year, the US Air Force will train more “pilots” for unmanned aircraft than for real fighters or bombers.  Evidently there’s good reason for this change:

By 2013, software and communications improvements will allow the Air Force’s unmanned-aircraft pilots to simultaneously fly three drones at one time, and four in an emergency. Another factor supporting the likely proliferation of drones such as the Predator, Reaper, and Global Hawk is their low cost compared with new manned aircraft such as the F-35 Joint Strike Fighter.

According to the Government Accountability Office, $24.5 million will purchase a set of four MQ-9 Reaper hunter-killer drones plus a ground station and satellite relay. (See page 117 of this report.) The latest guess of the price for a single F-35 fighter-bomber is $100 million. (See page 93.) This gap in cost led Defense Secretary Robert Gates to demand the cancellation of the manned F-22 Raptor program in order to fund the purchase of more drones for service in Afghanistan and Iraq.

Read more…

dereferenced, technology

the trading frequency spectrum

July 28th, 2009

I’ve been saving the above image in a stubbed-out blog post I’ve wanted to write since a conversation I’d had in Jerusalem last fall.  The recent attention to high frequency trading and all of its attendant evils has reminded me that the topic is relevant and so I relate various thoughts at the risk of jumping on a cacophonous bandwagon of rumbling misinformation.

First of all, the conversation.  It was with a talented guy who acted as the CFO for a variety of companies including a small startup hedge fund which traded US equities at a high frequency.   Although he was a part-time cfo, he seemed pretty plugged-into their trading operations and noted that they use an agency-only brokerage service for automated traders I’m familiar with and that they were “looking at full data for many” hundred stocks concurrently. He remarked that their trading was going well but that their hit rate was something like 4% and dropping.  By hit rate, he meant that they were placing limits frequently and generally pulling the orders if they didn’t get hit immediately.  He didn’t specify, but I imagine that “immediately” might range from milliseconds out to a second or twenty.  If the market is composed of makers and takers, then these guys were definitely makers of liquidity in the strict sense that they were placing limits and making markets.

At the time I thought it was interesting because it seemed that so many people were focused on the very, very short term trade that the frequency was becoming saturated.  It looked like a reminder that trading frequencies populate a spectrum; in this case, this part of the spectrum was becoming so saturated that returns were becoming increasingly difficult to obtain as more players crowded into it.  I’m not sure how this hedge fund has fared, but at the time I remember thinking that they were going to have a tough time competing if they were only geared for high-frequency trading as the space becomes increasingly expensive to play in as the inevitable talent and technology arms race marches on.

Lo and Khandani provide the below image illustrating this phenomenon happening to a class of contrarian strategies Lo & MacKinlay had described in 1990.  The strategies stop working as people squeeze out the alpha.

Read more…

hedge funds, our managed markets, startup, strategy development, technology

the other interesting thing about the Serge Aleynikov story

July 8th, 2009
his haunted house

as suspected, the seat of evil can be found in NJ

There’s a whole bunch of interesting things about this story of how a programmer has allegedly stolen some of the code at the place he’d worked.  One is the remarkable reverb it’s created amongst bloggers.  The house pictured left is evidently the diabolical mastermind’s home according to the NJ Real Estate Report.  Another is the fact that a programmer stealing some code is news. Funny what becomes news (apparently a fêted pedophile died) and what doesn’t (we are creating millions of refugees in Pakistan).

One angle that I haven’t seen highlighted in all of the commentary is Mr Aleynikov’s choice of weapon.  Seems that he was an erlang guy with an interest in ocaml.  Choosing functional programming for algo trading systems is an interesting but not unique choice.

Read more…

our managed markets, strategy development, technology