basic economics of an algorithmic trading startup

March 2nd, 2010

or: how to quit your job to riches!

Recently I had a thought-provoking email exchange with a reader of this blog.  It was with a fellow who wanted to start up an algorithmic trading business and was seeking my advice.  Its arrival was timely for me and I hope the advice I provided the aspiring entrepreneur was helpful.

The conversation forced me to reconsider familiar terrain from a different perspective, so I share it along with some further thoughts in the hope that it might act as something of a counter-weight to articles and blogs entreating you to “learn algo-trading!” (as though it were a fun! weekend hobby) or “make a million % trading the xyz” and more nefarious advertisements selling the trading equivalents of male-enhancement pills & potions (“I turned a used bus ticket and some pocket change into $7M in 2 months trading the e-mini!”).

I’ve changed the writer’s name and all identifying characteristics for obvious reasons.  He wrote: Read more…

startup

lock free

February 16th, 2010

One of the recurring technology themes in these pages has been the ongoing and dramatic move from single to multi-core systems and the need to seriously increase the parallelism in our software designs. For me, one of the seminal, large-grained design patterns was the SEDA Architecture. For years, this informed my systems’ designs and formed a conceptual backbone for development. That said, I’ve been broadly aware for some time that SEDA’s golden age has (incredibly!) already passed us by, but haven’t identified what might replace it as a reference point for my design efforts.

Before considering tools, languages or patterns that might help, we need to reflect on the problem(s) we’re trying to solve. The problems inside an EMS look to me, after years of development, a lot like network routing problems. Indeed, my current view is that this (not just concurrency as I’d suggested at the time) is why the unfortunate Aleynikov & co. at GS were using erlang.

Why network routing? Think about the load on an EMS. The main issue is that you’re getting many thousands of teeny little messages per second and only a relatively small number of them matter to only a relatively small subset of ‘agents’ within the system. Reducing latency is all about making sure the time you spend on each message is minimized, and that the agents who are interested in a particular message needn’t wait for each other to do whatever they care to do based on the message. So, really you’re trying to route each message through your system with as few ‘hops’ as possible and as much parallelism as you can muster under the (radically!) new assumption that you may have hundreds or thousands of cores available to you during the lifetime of the design.
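To make the routing picture concrete, here is a minimal C++ sketch of a symbol-keyed router. The names `Tick`, `Router`, `subscribe` and `route` are hypothetical and chosen only for illustration; nothing here comes from a real EMS or library.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical message type: one price update for one symbol.
struct Tick {
    std::string symbol;
    double price;
};

// Routes each tick only to the agents subscribed to its symbol,
// so uninterested agents never spend any time on it.
class Router {
public:
    using Handler = std::function<void(const Tick&)>;

    void subscribe(const std::string& symbol, Handler h) {
        routes_[symbol].push_back(std::move(h));
    }

    // Returns how many agents received the tick (one hop each).
    std::size_t route(const Tick& t) const {
        auto it = routes_.find(t.symbol);
        if (it == routes_.end()) return 0;  // nobody cares: drop it immediately
        for (const auto& h : it->second) h(t);
        return it->second.size();
    }

private:
    std::unordered_map<std::string, std::vector<Handler>> routes_;
};
```

This version calls the handlers one after another on the caller's thread, so it shows only the 'few hops' half of the goal. To keep interested agents from waiting on each other, each handler would presumably just enqueue the tick onto that agent's own queue and return.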

I spent some time thinking (hoping) that languages might help furnish an answer. Perhaps a move to a functional language like erlang, ocaml or scala might help furnish at least a partial answer. But erlang is slow and peculiar, ocaml doesn’t support intra-process concurrency and scala looks like a bloated language on a bloated platform (jvm+java class library). And none of them seem to have achieved anything near the critical mass which is so crucial for the development of usable libraries and the availability of skilled developers with long experience in the technology.  Naturally, reasonable people will disagree about such things, but this is my view (today). Java is ok (and certainly sells servers), but it’s not obvious how it’s going to help me offload my work onto a GPU anytime soon (and jni is both painful and slow) and I’ve never been able to get comfortable with just how damn big VMs get.  Image size isn’t free and if we’re looking to go deep into the sub-millisecond response time, while running thousands of concurrent strategies, it seems we need to disintermediate the VMs and interpreters of the world. If they’re really necessary, they can be happily used for the analysis process (as I currently use R), or they can be lit-up and bridged from some lower-level language for batch-like services.

The good people at Intel have been thinking about this problem for a while as have many other seriously over-educated people. One of the (sensible sounding) conclusions reached as people look for ways to solve problems similar to my own is that in such systems we should keep messages waiting as little as possible – ideally, not at all(!). This can be a problem in SEDA-like architectures which are basically made up of (non-blocking, asynchronous) i/o processes linked to (blocking) queues linking pools of workers. Blocking queues can pile up and cause all sorts of problems like priority inversion and other such enigmatically named nasties. Lock-free queues and other data structures, algos and techniques promise some ways around this and I’ve been spending time looking into how they might be employed to address my issues.
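For readers who haven't seen one, a lock-free queue can be surprisingly small. The sketch below is a generic single-producer/single-consumer ring buffer in C++17. It is not FastFlow's or TBB's implementation, the name `SpscQueue` is made up for this example, and the 64-byte cache line is an assumption about the target hardware.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <optional>

// Single-producer / single-consumer ring buffer. Neither side ever blocks:
// push() returns false when full and pop() returns nullopt when empty.
template <typename T, std::size_t N>
class SpscQueue {
    static_assert(N >= 2, "one slot is kept empty to tell full from empty");
public:
    bool push(const T& v) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t next = (head + 1) % N;
        if (next == tail_.load(std::memory_order_acquire)) return false;  // full
        buf_[head] = v;
        head_.store(next, std::memory_order_release);  // publish the slot
        return true;
    }

    std::optional<T> pop() {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire)) return std::nullopt;  // empty
        T v = buf_[tail];
        tail_.store((tail + 1) % N, std::memory_order_release);  // free the slot
        return v;
    }

private:
    std::array<T, N> buf_{};
    // Assumed 64-byte cache lines: keep the two indices on separate lines
    // so the producer and consumer do not false-share.
    alignas(64) std::atomic<std::size_t> head_{0};  // written by the producer only
    alignas(64) std::atomic<std::size_t> tail_{0};  // written by the consumer only
};
```

It holds at most N-1 items and is only safe with exactly one pushing thread and one popping thread. When the queue is full the producer gets an immediate `false` and must decide what to do (drop, retry, spill elsewhere) instead of parking on a lock.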

Before I’m besieged by throngs of angry erlang/ocaml/scala/java developers, allow me one last observation on the topic.  (Peeved python and ruby users may rant away – vous m’amusez  ;^)

Why might a lock free algorithm be better than an equivalent, hardware-based locking implementation?  The answer isn’t obvious.  If locking is implemented in hardware as is typical (eg, with a compare-and-swap (CAS) instruction), then its explicit cost is measurable in (few) nanoseconds.  Hardware is fast.  The issue isn’t the speed of execution of the underlying primitives so much as it’s a consequence of the side effects of these operations at a very low level.  For real performance, cache coherence is King.  See here for an accessible discussion by IBM’s Paul McKenney and here for some remarkable examples from Igor Ostrovsky.  This indicates that if you want the highest possible performance, you need to be aware of what is happening ‘in the metal.’  So we need to use a system-level language and erlang, java & friends lose their candidacy in spite of any fantastic benefits they might offer.
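As a small illustration of using CAS directly instead of wrapping it in a lock, here is a hedged C++ sketch of a lock-free 'highest value seen' update. The function name `update_max` is invented for this example, and the memory orderings are one reasonable choice, not the only one.

```cpp
#include <atomic>
#include <cstdint>

// Lock-free "record the highest value seen", built directly on CAS.
// Returns true if this call raised the stored maximum.
inline bool update_max(std::atomic<std::int64_t>& maximum, std::int64_t candidate) {
    std::int64_t seen = maximum.load(std::memory_order_relaxed);
    while (candidate > seen) {
        // On failure compare_exchange_weak reloads `seen` with the current value,
        // so the loop re-checks against whatever another thread just wrote.
        if (maximum.compare_exchange_weak(seen, candidate,
                                          std::memory_order_release,
                                          std::memory_order_relaxed)) {
            return true;
        }
    }
    return false;
}
```

No thread ever waits for another to release anything here, but the hidden cost is still present: a successful CAS needs exclusive ownership of the cache line holding `maximum`, so under heavy contention that line bounces between cores.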

Given that even the DoD has mostly given up on Ada, we’re left with C/C++.

Ok, so language doesn’t seem to resolve much for us. (Indeed, it was mostly hopeful thinking on my part – design is mostly language agnostic and hardware is hardware…)

Apart from Intel’s own Threading Building Blocks (TBB) framework, there are a variety of toolkits available for exploiting lock-free parallelism. Perhaps the newest and least known is called FastFlow, which is a C++ template library that provides a variety of facilities for writing efficient lock-free network models. It also claims to be faster than TBB, Cilk and OpenMP while holding out the promise of one day becoming CUDA- (or more generally, GPU-) aware which would be an incredible win. Finally, it is very small – the current version (not including tests and examples) weighs in at ~5K lines of (mostly) C++ templates.  Thus, it seems to me particularly well-suited for some experimentation to assess the fit of these techniques in this space and the level of difficulty of doing so.

In the remainder of this post, I’ll briefly describe the FF design and then illustrate a sample C++ program which uses FastFlow to ‘architecturally prototype’ a feed handler interacting with strategies inside an EMS / strategy container.

Read more…

EMS Internals, technology

transitions

February 8th, 2010

Today we return to our series on regime switching and the topic of managing portfolios of strategies.  In particular, we build on the examples illustrated in sensitivity testing and steppin’ out, in which we showed historical and then real-time ‘forward-walking’ of strategies.  The next step we’d described was to evolve the techniques illustrated to support the real-time management of a portfolio of strategies.

In the example below, we look at another ‘meta’ strategy named StrategyPortfolio which maintains a dynamic portfolio – P – of strategies which it will select from a set of strategies – S – running concurrently in simulation.  The constituents of P as well as their cash allocations and parameterizations will be rebalanced/adjusted regularly after an initial ‘out-of-sample’ period during which only the S strategies are run.
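The rebalancing step can be sketched in a few lines of C++. The selection rule used here (keep the top k simulated strategies with positive trailing PnL and split cash equally) is purely an assumption for illustration; the post does not say how StrategyPortfolio actually picks or sizes its constituents, and `SimResult`, `Allocation` and `rebalance` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record for one member of S running in simulation.
struct SimResult {
    std::string name;
    double trailing_pnl;  // simulated PnL over the look-back window
};

struct Allocation {
    std::string name;
    double cash;
};

// Assumed rule (not from the post): keep the top `k` simulated strategies with
// positive trailing PnL and split the available cash equally among them.
inline std::vector<Allocation> rebalance(std::vector<SimResult> s, std::size_t k, double cash) {
    std::sort(s.begin(), s.end(), [](const SimResult& a, const SimResult& b) {
        return a.trailing_pnl > b.trailing_pnl;
    });
    std::vector<Allocation> p;
    for (const auto& r : s) {
        if (p.size() == k || r.trailing_pnl <= 0.0) break;
        p.push_back({r.name, 0.0});
    }
    for (auto& a : p) a.cash = cash / static_cast<double>(p.size());
    return p;  // empty means "stay in cash" until something qualifies
}
```

Calling this at every rebalance date yields a new P. Comparing consecutive results shows which strategies have to be wound down and which started up, and that hand-off is where the real difficulty lies.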

Apart from education, the intention of this strategy, as I’d originally suggested here, is to ‘back-into’ a regime-switching strategy without attempting to directly quantify the regimes explicitly.

This has proved to be even more interesting than I’d expected, not so much because it performs particularly well (though it’s promising), but because of all of the things it has taught us.  In particular, the transitions are a killer and there are properties of strategies which (dis-)qualify them from being effective in such a scheme…

Read more…

EMS Internals, portfolio management, regime-switching, strategy development

Kooderive

February 3rd, 2010
photo by Simon Rogerson


Some time back, I’d written about NVidia’s CUDA noting that it looked ideal for many asset-pricing and monte-carlo type problems in finance.  At the time, I was hopeful that it would be quickly integrated into existing open source efforts like QuantLib, but adoption has proved slower than I’d hoped, most likely because implementing non-trivial problems on CUDA is, well, even less trivial than doing them without..

LMM on CUDA

Happily, I’ve just seen a promising first step in this direction as Über-quant and C++ artisan Mark Joshi recently announced an open-source project, Kooderive which looks to implement the LIBOR Market Model (LMM)  on top of CUDA.  His announcement on the QuantLib mailing lists reads:

Dear All,

various people have shown interest in the use of CUDA with QuantLib. I
have now made some progress on a CUDA implementation of the LIBOR
market model.

In particular, I now have a path generator for the LMM working which
does 16384 paths for 40 rates, 40 steps, 5 factor model, displaced
diffusion predictor-corrector that takes 0.1 seconds on my Quadro 4600.

The state of the project is code fragments that can be called from
other code. Those who are interested can get the code via
the subversion repository on kooderive.sourceforge.net .  The only
project file is currently for VC9 x64. It also uses thrust and the
CUDA SDK.

The next stage will be writing routines, that use QuantLib for the CPU
stuff and kooderive for the GPU stuff,  to actually price things.

A gentle reminder that I will be giving a course on the LMM and
QuantLib in June in London, and I will include a session on kooderive
if there is sufficient interest.

I am happy to take code contributions for kooderive. However, I am not
looking for a redesign of the library or contributions which introduce
dependence on other libraries. I am interested in contributions of
separate routines and of optimizations of existing routines that do
not change interfaces.

regards

Mark

Pricing exotic interest rate derivatives – The LIBOR Market Model in
QuantLib June 2010, London,
http://www.moneyscience.com/training/index.html

Assoc Prof Mark Joshi
Centre for Actuarial Studies
University of Melbourne
My website is www.markjoshi.com


EMS Internals, dereferenced, monte-carlo methods, open-source software, options pricing, technology

dingbat kabuki

January 28th, 2010

Like many Americans, last night I dutifully switched on my TV at 9pm to see the State of our Union.  Always a spectacle, America’s leadership have upped the surreality ante with the bizarre backdrop of Biden lip-synching amiably in the background whilst Madame Speaker sat with all the calm collection of a fish on a hook and never seemed fully in control of herself or her eyebrows.  The spectacle of would’ve-been king McCain sitting there and glowering openly at the lectern as his confederates sat in stony silence while their ‘opposition’ applauded like drunken high schoolers at a homecoming at every mundane utterance proved a bit much and I had turned off the glowing beacon of groupthink by 9:25 and gone to investigate something on my computer.  I was surprised and delighted to see that it was still available: dingbatkabuki.com

Dingbat Kabuki and other structural market hacks

When I first started puppetmaster trading, one of my dearest friends, a Yale-educated economist and professor of same, asked me an important question.  He asked:

In the markets, there are always ‘insiders’ who have the ability to trade on knowledge that you can’t know or with an advantage that you can’t have.  How are you going to compete with these players?

I provided a variety of answers, but at the time my conception of the universe of people with both inside knowledge and the ability to trade on it was limited to cases like that of Mr Rajaratnam.  I believed that cases like these were constrained by clear laws that were duly surveilled and prosecuted by the appropriate authorities.  The problem seemed like a very real one, but constrained in size and not essential to my enterprise.  I still hope that my belief of the time was true, but since then I’ve certainly understood that there’s more than one way to hack the market.

For some, a market hack might consist of some kind of simple (or complex) algorithm(s) applied to some set of markets.  But this really isn’t a hack so much as it’s a trading strategy – like many that have long existed – only that it’s now implemented in software where originally it would have been implemented in wetware.  Implementing trading strategies in software does open up new vistas in terms of the kinds of strategies that you can look to implement – computers are faster than people by a noteworthy amount in many tasks – but, for the most part, you’re really still just trading and when you take on positions, you are still bearing risk.  You might be ‘hacking’ but it’s really not a market hack as I’ve come to appreciate.

Read more…

execution quality, market structure, our managed markets, post-trade analysis

“the SEC made Madoff”

January 17th, 2010

Bill Harts, a friend of mine who has, as they say, forgotten more about electronic trading and market structure than most will ever be burdened by, has recently taken an interest in the public letters written to the SEC in response to their requests for public comments on dark pools.  Mostly, these letters are funny and reveal people’s propensity to point, shoot and aim in that untidy order.

But some are revealing and one in particular is 100% required reading for anyone interested in electronic markets.

The writer introduces himself thusly:

I am Steve Wunsch, the principal inventor of two SEC‐regulated stock exchanges, the Arizona Stock Exchange “AZX” (originally called Wunsch Auction Systems, Inc. “WASI”) and the ISE Stock Exchange, both of which include dark pools. In fact, both of them, like all modern stock exchanges, have both lit and dark components and, thus, have provided me with potentially useful perspective on the dark pool question and on transparency in general. I will focus heavily on the latter, for it is impossible to understand the dark pool issues raised without understanding the value of transparency or, if improperly applied, the lack thereof. The AZX experience was, I believe, particularly instructive in this regard. Its highly transparent call market structure, combined with its unique regulatory status as a “low volume exempt” exchange, enabled me to see transparency and the role of regulation in promoting it from a perspective that I don’t believe anyone else has.

He deftly mixes snark and a historical perspective on regulation with an opinionated and informed view on the forces driving current equity markets’ microstructure arguing that the worst issues are due to regulatory failures.  He concludes, logically enough, that the SEC should be disbanded.  Perhaps his most inflammatory bit is his claim that the “SEC made Madoff.”  For effect, the section is entitled “An American Oligarchy”:

AN AMERICAN OLIGARCHY

It is not in the Commission’s interest to admit failures of policy, such as the ones I have described in this letter, and I have never seen it done. It was not in the Commission’s interest to admit that Bernie Madoff was the SEC’s most trusted and intimate confidante in formulating and selling transparency, electronic trading and the whole NMS concept to Wall Street, the public and Congress. His legitimate business was the epitome of the kind of transparent electronic competition that NMS’s leveling policies were trying to create, and he occupied the most favored place of all industry advisors on policy and rules as NMS was being created. In a very real and literal sense, Madoff’s legitimate business and NMS were made for each other. NMS cleared a path for the application of continuous transparency by new electronic competitors, very visibly led by Madoff, enabling him to become at one time the third largest market in the United States, even though he wasn’t officially registered as anything but a broker‐dealer.

Had the SEC not emasculated the rules by which the NYSE controlled its members, Madoff would never have happened. In the time before NMS, when the exchange had Rule 390 or the stronger Rule 394 before it, diverting orders away from the floor or selling them to Madoff would have been banned. But on antitrust principles, the SEC wanted to foster NYSE‐busting competition in NMS, and Madoff became its PosterBoy for such competition. In order to make way for him, the SEC opened up a variety of loopholes that allowed orders to be diverted from NYSE to Madoff and printed on regionals like Cincinnati. Rules 19c‐1, 19c‐2 and 19c‐3 were in this vein. There were perennial attempts by the NYSE to plug the loopholes and rein in the membership, but the SEC batted them all away, enabling Madoff to continually grow his business. Eventually, the NMS environment forced the NYSE to abandon Rule 394, then Rule 390 and ultimately its membership organization altogether when it demutualized. This was all very good for Madoff. And Madoff was very good for NMS, giving it industry cred far in excess of what this poorly articulated socialist leveling theory could have had without his support.

In spite of a 457‐page SEC investigation into Madoff and how his Ponzi scheme was missed, the most obvious reasons were not considered, namely, that Madoff played a central role in helping the Commission design and sell NMS, and that NMS made him rich long before the Ponzi scheme. Most importantly, the credibility that the Commission’s collaboration with Madoff on NMS conferred on him was the principal factor enabling him to bring in money for the Ponzi scheme. Although the investigation’s report notes his credibility in the industry, it is mentioned as if it were just a fact of life and was already there. Not mentioned is that his superior access to the SEC and apparent influence over the Commission, both of which were implicitly proved by his ability to get rich on NMS, are the most important reasons that he had such extraordinary credibility in the industry. The truth is that the SEC made Madoff. He could not have existed as a threat to investors without the Commission’s active and dedicated support over several decades.

Although, in typical blogger fashion, I’ve highlighted his spiciest claim, the rest of the letter is more technical and informative while just as entertaining.  I encourage you to read it and then engage in a thought experiment in which you are the designer of an electronic exchange and must balance the needs of a very heterogeneous set of users and stakeholders while ensuring transparency, liquidity, profitability, “fairness”, performance (he references an exchange targeting 100M executions per second) and utterly fail-safe transactional integrity…

I have embedded the full letter below the break…

Read more…

dereferenced, our managed markets

core arb

December 15th, 2009
core arbitrage?

FIX interface?

Cloud computing looks to have turned yet another interesting corner.   This time the turn leads towards the development of a liquid, fully electronic new marketplace in “spot instances”.

‘Spot’ means what you would expect it to in the context of trading: the current pricing for immediate delivery of a commodity.  ‘Instance’ is the atomic element within Amazon’s cloud environment; an instance is the smallest chunk of computing capability which can be provisioned within the cloud.

Amazon is making markets in cores and they’re exposing functionality just as a regular exchange would: both through user interface ‘screens’ as well as programmable APIs.

From their announcement:

Spot Instances enable you to bid for unused Amazon EC2 capacity. Instances are charged the Spot Price set by Amazon EC2, which fluctuates periodically depending on the supply of and demand for Spot Instance capacity. To use Spot Instances, you place a Spot Instance request, specifying the instance type, the region desired, the number of Spot Instances you want to run, and the maximum price you are willing to pay per instance hour. To determine how that maximum price compares to past Spot Prices, the Spot Price history is available via the Amazon EC2 API and the AWS Management Console. If your maximum price bid exceeds the current Spot Price, your request is fulfilled and your instances will run until either you choose to terminate them or the Spot Price increases above your maximum price (whichever is sooner).

embedded optionality

While the inclusion of, effectively, a market data service is neat, probably the most interesting aspect of the initial protocol they’ve designed is that it contains embedded optionality and behaves a bit like barrier options.  That is, when I set up an ‘order’, I need to specify a maximum price I’m willing to pay.  When the spot price drops below my max, I get “knocked-into” a contract and instances are allocated to me.  If the spot price rises above my max while I’m running, I get “knocked-out” of the contract and my jobs get terminated.
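The knock-in/knock-out behaviour is easy to play with in code. This C++ sketch walks a made-up hourly price path for a single instance. The hourly granularity, the handling of a price exactly equal to the bid (nothing changes), and the names `SpotRun` and `simulate_spot` are all assumptions of mine; this is a toy model and not Amazon's API.

```cpp
#include <vector>

struct SpotRun {
    int hours_run = 0;
    int knock_ins = 0;
    int knock_outs = 0;
    double cost = 0.0;  // charged at the spot price, not at the bid
};

// Walks an hourly spot-price path for one instance with a fixed maximum bid.
inline SpotRun simulate_spot(const std::vector<double>& hourly_spot, double max_bid) {
    SpotRun r;
    bool running = false;
    for (double spot : hourly_spot) {
        if (!running && spot < max_bid) {        // knocked in
            running = true;
            ++r.knock_ins;
        } else if (running && spot > max_bid) {  // knocked out
            running = false;
            ++r.knock_outs;
        }
        if (running) {
            ++r.hours_run;
            r.cost += spot;
        }
    }
    return r;
}
```

Feed it two price paths with the same average but different shapes and the hours run and the total cost come out differently, which is the path dependency in miniature.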

The intent is to allow for low-priority jobs to be dynamically run whenever pricing drops below a user’s threshold, but the (intended?) consequence is that it adds the delicious and malleable tang of path dependency to these instruments…

secondary markets, FIX, arbitrage..?

Amazon currently controls the market entirely, but it’s not hard to imagine a secondary market evolving.  Given that others are beginning to copy Amazon’s APIs, one can also imagine markets which operate across providers …  perhaps accessed via FIX?…

Who knows?  In the not-too-distant future, we may well be able to implement ‘core arb’ strategies…or make markets in cores… or find that we can effectively hedge with disciplined exposure to the ‘core market’ or …

FIX Protocol, dereferenced, technology

peaky

December 8th, 2009

messages per second across "all" feeds

I came across this compelling site which uses a hardware-based ticker plant (Exegy) in a colo environment to measure peak bandwidth across scads of NA feeds and then, every minute, updates a chart like the above to capture the average messages/sec across all of them.  Pretty swank.

While the uninformed may rail against colocation rather than focus on less intriguing issues like banana-variety corruption, they miss the basic point that colo can be done by anyone with the checkbook and the wish to do so.

unfair advantage?


It’s sort of like that boat in Forrest Gump.  Forrest wanted to be a shrimper.  So he invested in a boat.  With his initial capital, hard work, perseverance and a bit of luck, Forrest made a go of it.  He might easily have not made it. Colo is like that.  You can shrimp without a boat if you have a mask and fins, but it’s likely not a sustainable model… either way, it’s hard to see the harm in Gump’s boat.  Or colocation.

Hat-tip to Rodrick’s Web Log !! for spotting the market data peaks site.

dereferenced, market data, our managed markets

steppin’ out

November 25th, 2009

We’ve been looking at what we’ve been calling “meta-strategies” – strategies that act upon other strategies – with the goal of implementing something like we’d described in the recent regime-switching post.  (Please note that since then I’ve added a category to capture this thread.)

Last time we saw an example of historical forward-walking of a portfolio-oriented day-trading strategy which utilized daily data.  This time we do something a bit more interesting and correspondingly complex.  Today we’ll look at a real-time forward-walk of a moderate-frequency strategy (trades perhaps a few hundred times in a day) which looks at the top-of-the-book but doesn’t use market-depth.  The strategy is a simple mean-reverter that we’ve described before though we’ve had to make some small changes to get it to behave in the context we’re looking at now…

Read more…

EMS Internals, regime-switching, strategy development

sensitivity testing

November 14th, 2009

'optimization' or 'search'?

We’ve been looking at how a strategy container might view and implement a variety of modes for strategies it will launch and contain.  Last time I documented a uniform initialization process for many of them, including a posited walk-forward parameter optimization mode.  I’ve implemented an initial version of this that I’ll illustrate through a screencast (first ever – be gentle) below, but before continuing want to raise a couple of cautionary notes about the slope we’re traversing here.

From the very first post on this blog I’ve tried to underline the danger that over ‘optimization’ poses in view of the simple unalterable fact that if you look at enough random junk, you are bound to see things that look impossibly good.  Doesn’t mean they’re actually good.  In the context of trading strategy development, this is a particular danger as strategy parameter optimizers are easy to come by and can be very misleading if employed naively.  I think this is in part due to the term ‘optimization’ which is really a stretch for what these tools do.  They’re better described as search tools as they are really searching through a tuple-space of possible parameter combinations that you’ve specified, and then ranking them by some criteria you specify.

They’re still useful, but less as ‘optimizers’ and more as tools for judging the sensitivity of the strategy to different parameterizations.  If the strategy demonstrates good performance and stability over a variety of market conditions and parameterizations, you may just have found yourself a winner.
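To show how little 'optimization' is really going on, here is a C++ sketch of the search-and-rank step plus one crude sensitivity reading. The two parameters (`lookback`, `threshold`), the `backtest` callback and the plateau measure are hypothetical stand-ins; this is not the container's actual optimizer.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// One point in the parameter tuple-space and the score it produced.
struct Trial {
    int lookback;      // hypothetical parameter
    double threshold;  // hypothetical parameter
    double score;      // whatever ranking criterion the caller supplies
};

// Exhaustively evaluates every combination and returns the trials ranked best-first.
inline std::vector<Trial> grid_search(const std::vector<int>& lookbacks,
                                      const std::vector<double>& thresholds,
                                      const std::function<double(int, double)>& backtest) {
    std::vector<Trial> trials;
    for (int lb : lookbacks)
        for (double th : thresholds)
            trials.push_back({lb, th, backtest(lb, th)});
    std::stable_sort(trials.begin(), trials.end(),
                     [](const Trial& a, const Trial& b) { return a.score > b.score; });
    return trials;
}

// Sensitivity reading: the fraction of trials scoring at least `ratio` of the best.
// Near 1.0 suggests a broad plateau; near 0 suggests an isolated, suspect peak.
inline double plateau_fraction(const std::vector<Trial>& ranked, double ratio) {
    if (ranked.empty() || ranked.front().score <= 0.0) return 0.0;
    const double cutoff = ratio * ranked.front().score;
    const auto n = std::count_if(ranked.begin(), ranked.end(),
                                 [cutoff](const Trial& t) { return t.score >= cutoff; });
    return static_cast<double>(n) / static_cast<double>(ranked.size());
}
```

The top-ranked trial on its own says very little. A high plateau fraction across several market periods is closer to the stability described above, though it is still no guarantee of anything.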

Anyway, I felt that had to be said…

Read more…

EMS Internals, back-testing, performance analysis, portfolio management, regime-switching, strategy development