As I’ve written before, I’m not a particularly big fan of technical analysis or any of the many and varied charting techniques people espouse. That said, we are working with a proprietary futures trading company and some of the successful (non-algo) trading that they do involves point-and-figure charts. Although a trading algorithm doesn’t care about graphical representations, I wasn’t familiar with the technique and decided that the best way to understand it was to try to implement it, which is how I spent my Saturday evening …
The above applet re-uses the one I’d written previously in discussing simple stochastic processes. This time, it illustrates a point & figure chart below the regular line chart. Point & figure charts expose two characteristics: a “box size” (in ticks) and a “reversal” (in boxes). The applet allows you to vary both and then generate a day’s worth of random/synthetic data to view it. One of the nice features of JFreeChart is that you can easily “zoom” into a chart by dragging within the chart. I’ve disabled this in the line chart but you can try it in the p&f chart. (Note: you should right-click and “Auto-Range-Both Axes” before you generate new data or you’ll stay in the zoomed segment of the chart.)
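For readers who prefer code to pictures, here is a minimal sketch of how box size and reversal might turn a price series into columns. It is not the applet's code. It assumes prices are integer ticks, that only a single price per observation is used (no high/low), and that the first column starts as soon as price moves a full box away from the first observation. The class and method names (PointAndFigure, Column, build) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class PointAndFigure {
    /** One chart column: a run of X's (rising) or O's (falling) between two box levels, inclusive. */
    static final class Column {
        final boolean rising;
        int low, high;
        Column(boolean rising, int low, int high) {
            this.rising = rising; this.low = low; this.high = high;
        }
        @Override public String toString() {
            return (rising ? "X" : "O") + "[" + low + ".." + high + "]";
        }
    }

    /** Builds columns from prices in ticks; boxSize is in ticks, reversal is in boxes. */
    static List<Column> build(int[] ticks, int boxSize, int reversal) {
        List<Column> cols = new ArrayList<>();
        if (ticks.length == 0) return cols;
        int anchorUp = Math.floorDiv(ticks[0], boxSize);    // level at or below the first price
        int anchorDown = -Math.floorDiv(-ticks[0], boxSize); // level at or above the first price
        Column cur = null;
        for (int i = 1; i < ticks.length; i++) {
            int up = Math.floorDiv(ticks[i], boxSize);      // highest level an X could fill
            int down = -Math.floorDiv(-ticks[i], boxSize);  // lowest level an O could fill
            if (cur == null) {
                // Assumption: the first full-box move decides the first column's direction.
                if (up > anchorUp) cur = new Column(true, anchorUp, up);
                else if (down < anchorDown) cur = new Column(false, down, anchorDown);
                if (cur != null) cols.add(cur);
            } else if (cur.rising) {
                if (up > cur.high) {
                    cur.high = up;                          // extend the X column
                } else if (down <= cur.high - reversal) {
                    cur = new Column(false, down, cur.high - 1); // reverse into O's
                    cols.add(cur);
                }
            } else {
                if (down < cur.low) {
                    cur.low = down;                         // extend the O column
                } else if (up >= cur.low + reversal) {
                    cur = new Column(true, cur.low + 1, up); // reverse into X's
                    cols.add(cur);
                }
            }
        }
        return cols;
    }

    public static void main(String[] args) {
        int[] ticks = {100, 112, 125, 131, 118, 99, 104, 131};
        System.out.println(build(ticks, 10, 3));
    }
}
```

With a box size of 10 ticks and a 3-box reversal, the sample series prints [X[10..13], O[10..12], X[11..13]]. The dip to 118 is ignored because it is less than three boxes below the top of the X column, which is exactly the noise filtering the two parameters are meant to provide.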
Now that I think I understand the basics of point & figure charting, it will be interesting to see what an algo might do with it…
open-source software, strategy development, technology

One Big Table (and chair)
Last time I described the trajectory of my research into using hdf5 for large amounts of tick data. This time I describe the basic design of the prototype I implemented and some of its performance characteristics.
EMS Internals, market data, open-source software, technology

One of the nicest things about the holiday season (Happy New Year, btw) is that it provides a lovely opportunity to spend some quality time with a project that’s a bit more exploratory than might be meaningfully undertaken while trading in lively markets.
A number of months ago, I mentioned using HDF5 to manage tick data as RDBMSes just aren’t up to the task and specialized Tick DBs are absurdly expensive. While I’d spent some time exploring this idea through the fall, I never had a discrete chunk of time to really explore the technology beyond determining that its Java interfaces weren’t production-worthy. This meant that we’d have to drop into C to access the functionality we’re interested in and that we’d have to come up with our own bridge out into Java for access by StratBox while StratCloud could access it directly.
Below, I describe what I’ve learned through my holiday geek-spelunking-trek including some timings on various configurable characteristics of HDF5 (e.g., compression and “chunking”).
EMS Internals, market data, open-source software, post-trade analysis, technology
I hope you’ll forgive me a post as off-topic as the picture accompanying it is peripatetic. As far as I can tell, it has no algorithmic trading application. But I’ve spent my life around software and it’s pretty rare that I find something that truly wows me.
And Shazam is one of those rare beasts.
I don’t know how they do what they do, but what they do is really something else. Install a free iPhone application (I think there are other implementations available as well), start it up and point the phone’s mic at the source of some music. In a few seconds, the software will identify the song, download its album cover and assorted info, and point you to where you can purchase the song. It’s not perfect – it couldn’t identify the beginning of Beethoven’s 5th, but it was able to identify Carmina Burana and had no problems with any popular music I pointed it at.
I’m certain that if you had a contest between this system and your favorite neighborhood audiophile, Shazam would win hands down for speed and accuracy. It’s really something else.

dereferenced, startup, technology
While the war over the latest+greatest video cards for the current generation of graphics intensive games seems always to ebb and flow between nVidia and its arch-rival ATI, I’ve long preferred nVidia for their better support of Linux. Thus, all of my machines have some sort of nVidia Graphics Processing Unit (GPU) in them.
For those who spend their workdays in the markets and their weekends pondering derivatives pricing, latency, oceans of market data, portfolio optimization, and how to make every last damn thing faster, a preference for nVidia cards could prove to yield an unexpected benefit.
nVidia has recently unveiled a product line dubbed “TESLA” which leverages their absurdly fast GPUs to provide a supercomputer-like High Performance Computing (HPC) platform at a previously unimaginable price point. TESLA computers are regular machines that have a set of slightly modified GPUs in them; modified such that they have no video out, but instead become additive processing clusters which the machine can use for compute intensive tasks. For about $10K you can buy a 1U machine with some 4 teraflops of capacity. By way of comparison, this is over 20 times faster than the funky Helmer project I’d been drooling over a few months ago in a production-worthy package ready for the server room today.
So, TESLA refers to the machines built with these specialized GPUs. Making all this power usable is what CUDA is about…
monte-carlo methods, options pricing, portfolio management, technology
Inevitably one of the first ideas people have when they start thinking about how to write a trading algorithm turns out to be among the hardest: trading the news. The problems are many and in some cases not so obvious…but the natural appeal of the idea seems universally compelling.
Just after the dot.com craze, a brilliant friend of mine (who had just sold his web consulting startup) decided to write a book. The premise was glorious. A bunch of clever college-age kids formed a startup to predict the stock market. The method they used was to constantly comb the web with ultra-sophisticated algorithms which would run across giant server farms overnight and ultimately generate tomorrow’s headlines. Based on the headlines that their system generated, they would place trades that would take advantage of these predicted events.
Sadly, my friend never went on to complete his book, so I don’t know how it all turned out. (Instead, he went on to start another successful company, this time in the field of robotics.) While he was writing it, I loved getting new drafts as they were filled with clever ideas. But the core idea of predicting headlines and then using those headlines to trade always struck me as especially cute.
For those of us without access to news-predicting algos, writing strategies based on the news is rather less straightforward, though there is a growing variety of products and services aiming to fill the gaps. Today must have been trading-the-news-day as I found a few articles on the topic in my mailbox and even received a cold call from a vendor, Need to Know News, with just such an offering. Below I’ll look at some of these offerings and consider some of the issues involved in writing trading strategies based on the news.
back-testing, market data, startup, strategy development, technology

I came across a cute tool, Wordle, while reading this post on the Big Picture and decided to run wordle against my blog. The above picture is how wordle depicts this blog. Given my last post, it’s pretty funny that “data” wins by a landslide.
Wanting to always be a paragon of netiquette, I decided to also run it against the Big Picture. This yielded an interesting insight into that blog’s focus:

Update – I’ve fixed the link to wordle (.net not .com), thanks to Luca’s eagle-eyed editing.
dereferenced, technology

While Carl Sagan’s famous formulation introduced a generation to the vastness of the cosmos, more recent history suggests that his memorable term might now be more aptly applied to financial extents: our deficits and debts, perhaps, to the economically or politically minded. But for those of us with the markets on our mind, the term has to evoke the enormity of the data we create and must manage every day. We’ve recently been working with the NYSE’s TAQ data in an effort to integrate it into StratBox’s back-testing and optimization capabilities. And the enormity of the data is really just staggering.
Each day, the NYSE publishes all of the day’s quotes and trades as well as some reference data. Compressed, the data will just about fit onto a DVD. For one day. A DVD. Compressed. It’s really mind-boggling. A year of the stuff, uncompressed, will require over a petabyte of storage. Over 1,125,899,906,842,624 bytes. And that’s just the US Equities markets. You want options data, too? I hope your uncle is named EMC, because just managing the data is going to be a challenge…
back-testing, market data, open-source software, post-trade analysis, technology

One of these days I’m going to give an overview of all the excellent open-source software I use on a daily basis. Until that day comes, I’ll observe that finance remains one of the big areas where open-source software has made relatively limited inroads.
Two production-quality packages fight that unhappy state: QuantLib – a comprehensive framework for quantitative finance – and QuickFIX – a full-featured FIX engine. Both are C++ libraries and both provide very nice interfaces to facilitate integration with other languages, including Java. QuantLib is a big and complicated library and integrating it with Java is not totally obvious. Below, I’ll describe how to build and use QuantLib from Java.
These instructions are based on a Unix installation. I’m not really a Windows developer and don’t have all the shiny tools that Windows developers use, so it’s not an area of focus for me. That said, I have managed to build QuantLib under Windows by using MinGW+MSYS but it wasn’t terribly easy and I don’t currently have a working installation, so I won’t cover that here. If this is your aim, don’t be dismayed as it is possible and it had all the functionality I enjoy under Linux.
Using QuantLib from Java (on Linux)
- Build QuantLib
- Build QuantLib-SWIG
- Requires a working copy of SWIG. Again, look to the SWIG instructions, but it should be easy.
- Once SWIG is available, building the QuantLib/SWIG interfaces should only require:
sh autogen.sh
./configure \
--with-jdk-include=${JAVA_HOME}/include \
--with-jdk-system-include=${JAVA_HOME}/include/linux
make -C Java
sudo make install
- Programs which call QuantLib functionality will also need to explicitly load the QuantLib libraries. This can be done with something like the following static block appearing before your main method:
static { // Load QuantLib
try { System.loadLibrary("QuantLibJNI"); }
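// loadLibrary reports a missing library with UnsatisfiedLinkError, which is an Error, not a RuntimeException
catch (UnsatisfiedLinkError e) { e.printStackTrace(); }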
}
- That’s it. Now test your configuration by running the examples in QuantLib-SWIG/Java/examples.
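If the examples fail at startup, the usual culprit is that the JVM cannot find the native library. The sketch below is one way to make that failure easier to diagnose. The NativeLoader class and tryLoad method are hypothetical helpers, not part of QuantLib or QuantLib-SWIG; they use only standard JDK calls.

```java
class NativeLoader {
    /**
     * Tries to load a native library by its platform-independent name.
     * Returns false (after printing diagnostics) instead of throwing.
     */
    static boolean tryLoad(String name) {
        try {
            System.loadLibrary(name);
            return true;
        } catch (UnsatisfiedLinkError e) {
            // mapLibraryName shows the file name the JVM looks for on this platform.
            System.err.println("Could not load '" + name + "' ("
                    + System.mapLibraryName(name) + "): " + e.getMessage());
            System.err.println("java.library.path = "
                    + System.getProperty("java.library.path"));
            return false;
        }
    }

    public static void main(String[] args) {
        // "QuantLibJNI" is the library name used in the static block above.
        if (!tryLoad("QuantLibJNI")) {
            System.exit(1);
        }
        System.out.println("QuantLib native library loaded.");
    }
}
```

When the load fails, compare the printed search path with the directory where the install step placed the shared library (this depends on your configure settings), and point the JVM at it with -Djava.library.path=... if necessary.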
It’s worth understanding how QuantLib is being used from Java. SWIG is creating a JNI interface into those methods within QuantLib which have been exposed through their declaration in the SWIG *.i files. These files are found in QuantLib-SWIG/SWIG and they determine what functionality from QuantLib will be available to you. You’ll likely need to get familiar with a subset of those files that you care about. If you find that some functionality you care about isn’t exposed in those files, you may need to expose it yourself.
There’s a learning curve, but it’s worth traversing so you can get at all the rich functionality so many smart people have put together.
FIX Protocol, monte-carlo methods, open-source software, options pricing, technology

I came across this article on a Go-playing program and thought it was interesting. Particularly this aspect of it (from Wikipedia):
One major alternative to using hand-coded knowledge and searches is the use of Monte-Carlo methods. This is done by generating a list of potential moves, and for each move playing out thousands of games at random on the resulting board. The move which leads to the best set of random games for the current player is chosen as the best move. The advantage of this technique is that it requires very little domain knowledge or expert input, the tradeoff being increased memory and processor requirements.
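To make the idea concrete, here is a small sketch of the same scheme on a toy game rather than Go. The game is an assumption chosen for brevity: a single pile of stones, each player removes one to three per turn, and whoever takes the last stone wins. For each candidate move the code plays out many uniformly random games and picks the move with the best win rate. The names (MonteCarloNim, bestMove) are made up for illustration.

```java
import java.util.Random;

class MonteCarloNim {
    static final int MAX_TAKE = 3;

    /**
     * Plays uniformly random moves until the pile is empty.
     * Returns true if the player who moves first from this position takes the last stone.
     */
    static boolean moverWinsRandomPlayout(int stones, Random rng) {
        boolean moverToPlay = true;
        while (stones > 0) {
            int take = 1 + rng.nextInt(Math.min(MAX_TAKE, stones));
            stones -= take;
            if (stones == 0) return moverToPlay;
            moverToPlay = !moverToPlay;
        }
        return false; // empty pile: the previous player already took the last stone
    }

    /** Evaluates every legal move with random playouts and returns the one with the best win rate. */
    static int bestMove(int stones, int playoutsPerMove, Random rng) {
        int best = -1;
        double bestRate = -1.0;
        for (int take = 1; take <= Math.min(MAX_TAKE, stones); take++) {
            int wins = 0;
            for (int i = 0; i < playoutsPerMove; i++) {
                // After our move the opponent moves first; we win whenever they lose.
                if (!moverWinsRandomPlayout(stones - take, rng)) wins++;
            }
            double rate = (double) wins / playoutsPerMove;
            if (rate > bestRate) {
                bestRate = rate;
                best = take;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println("Best move from 5 stones: "
                + bestMove(5, 2000, new Random(42)));
    }
}
```

Note that the code knows nothing about strategy beyond the rules, which is the point of the quoted passage: with five stones it settles on taking one, because leaving four gives the random opponent the worst odds. The price is the playout count, which grows quickly for a game the size of Go.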
dereferenced, monte-carlo methods, technology