<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: managing tick data with hdf5</title>
	<atom:link href="http://www.puppetmastertrading.com/blog/index.php/2009/01/04/managing-tick-data-with-hdf5/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.puppetmastertrading.com/blog/2009/01/04/managing-tick-data-with-hdf5/</link>
	<description>Algorithmic trading experiences</description>
	<lastBuildDate>Wed, 28 Jul 2010 01:46:39 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: tito</title>
		<link>http://www.puppetmastertrading.com/blog/2009/01/04/managing-tick-data-with-hdf5/comment-page-1/#comment-6625</link>
		<dc:creator>tito</dc:creator>
		<pubDate>Sat, 31 Oct 2009 12:24:45 +0000</pubDate>
		<guid isPermaLink="false">http://www.puppetmastertrading.com/blog/?p=262#comment-6625</guid>
		<description>I&#039;ve given up irrevocably on trying to fit this kind of data into a relational db.  Round hole, square peg as they say.

100M records is a small fraction of the *daily* US equity activity (top of book only) data delivered by TAQ, which has thousands of instruments.  If you&#039;re interested in ~30 instruments, then an RDBMS can be made to work, but not if you want a broader view.

That said, I knew a very smart guy who had a scheme for encoding one minute&#039;s worth of tick data into a blob which was stored in MySQL.  Again, up to some size, such schemes can be made to work, but not for larger applications...

Just for fun, how fast can you get a merged stream of ticks from your db?  With HDF5 in C it was over 1M ticks/sec (including the lookup and merge).  I&#039;m guessing that MySQL is going to be at least an order of magnitude off that mark...</description>
		<content:encoded><![CDATA[<p>I&#8217;ve given up irrevocably on trying to fit this kind of data into a relational db.  Round hole, square peg as they say.</p>
<p>100M records is a small fraction of the *daily* US equity activity (top of book only) data delivered by TAQ, which has thousands of instruments.  If you&#8217;re interested in ~30 instruments, then an RDBMS can be made to work, but not if you want a broader view.</p>
<p>That said, I knew a very smart guy who had a scheme for encoding one minute&#8217;s worth of tick data into a blob which was stored in MySQL.  Again, up to some size, such schemes can be made to work, but not for larger applications&#8230;</p>
<p>Just for fun, how fast can you get a merged stream of ticks from your db?  With HDF5 in C it was over 1M ticks/sec (including the lookup and merge).  I&#8217;m guessing that MySQL is going to be at least an order of magnitude off that mark&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Calimera</title>
		<link>http://www.puppetmastertrading.com/blog/2009/01/04/managing-tick-data-with-hdf5/comment-page-1/#comment-6624</link>
		<dc:creator>Calimera</dc:creator>
		<pubDate>Sat, 31 Oct 2009 12:04:18 +0000</pubDate>
		<guid isPermaLink="false">http://www.puppetmastertrading.com/blog/?p=262#comment-6624</guid>
		<description>Have you tested MySQL?
I have one table for ticks, partitioned monthly...
I store 100,000,000 ticks each month...
I collect 30 instruments...</description>
		<content:encoded><![CDATA[<p>Have you tested MySQL?<br />
I have one table for ticks, partitioned monthly&#8230;<br />
I store 100,000,000 ticks each month&#8230;<br />
I collect 30 instruments&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: tito</title>
		<link>http://www.puppetmastertrading.com/blog/2009/01/04/managing-tick-data-with-hdf5/comment-page-1/#comment-964</link>
		<dc:creator>tito</dc:creator>
		<pubDate>Wed, 11 Mar 2009 10:09:57 +0000</pubDate>
		<guid isPermaLink="false">http://www.puppetmastertrading.com/blog/?p=262#comment-964</guid>
		<description>David, I agree that it would be nice to see an open source version of a tick db, but I&#039;m not too optimistic.  It&#039;s a reasonable effort, has a very limited potential audience, and just possibly people have had their fill of handouts for Wall St. types!

The time-synchronous use case you mention seems to me a special case of my one use-case...</description>
		<content:encoded><![CDATA[<p>David, I agree that it would be nice to see an open source version of a tick db, but I&#8217;m not too optimistic.  It&#8217;s a reasonable effort, has a very limited potential audience, and just possibly people have had their fill of handouts for Wall St. types!</p>
<p>The time-synchronous use case you mention seems to me a special case of my one use-case&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: David Blentham</title>
		<link>http://www.puppetmastertrading.com/blog/2009/01/04/managing-tick-data-with-hdf5/comment-page-1/#comment-903</link>
		<dc:creator>David Blentham</dc:creator>
		<pubDate>Sat, 07 Mar 2009 18:17:20 +0000</pubDate>
		<guid isPermaLink="false">http://www.puppetmastertrading.com/blog/?p=262#comment-903</guid>
		<description>A common application for a tick database is being able to create an array of time-synchronous prices across securities (taking into account daylight saving time, etc).  Someone needs to create a template db in HDF5 for time series.  This would fulfill a huge need in the financial market.  I have to agree that third-party vendors are very expensive; the trick they employ is massive indexing.</description>
		<content:encoded><![CDATA[<p>A common application for a tick database is being able to create an array of time-synchronous prices across securities (taking into account daylight saving time, etc).  Someone needs to create a template db in HDF5 for time series.  This would fulfill a huge need in the financial market.  I have to agree that third-party vendors are very expensive; the trick they employ is massive indexing.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Hack the market &#187; tick data &#38; hdf5 (part 2)</title>
		<link>http://www.puppetmastertrading.com/blog/2009/01/04/managing-tick-data-with-hdf5/comment-page-1/#comment-422</link>
		<dc:creator>Hack the market &#187; tick data &#38; hdf5 (part 2)</dc:creator>
		<pubDate>Tue, 13 Jan 2009 19:30:09 +0000</pubDate>
		<guid isPermaLink="false">http://www.puppetmastertrading.com/blog/?p=262#comment-422</guid>
		<description>[...] Last time I described the trajectory of my research into using hdf5 for large amounts of tick data.  This time I describe the basic design of the prototype I implemented and some of its performance characteristics. [...]</description>
		<content:encoded><![CDATA[<p>[...] Last time I described the trajectory of my research into using hdf5 for large amounts of tick data.  This time I describe the basic design of the prototype I implemented and some of its performance characteristics. [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: tito</title>
		<link>http://www.puppetmastertrading.com/blog/2009/01/04/managing-tick-data-with-hdf5/comment-page-1/#comment-389</link>
		<dc:creator>tito</dc:creator>
		<pubDate>Sun, 04 Jan 2009 21:04:04 +0000</pubDate>
		<guid isPermaLink="false">http://www.puppetmastertrading.com/blog/?p=262#comment-389</guid>
		<description>A buy-vs-build decision is always going to be specific to the organizational and technical contexts for which it&#039;s being made and the use-cases required, not to mention cost considerations.  This might be especially true for a start-up.  It&#039;s certainly impossible to economically reproduce the rich functionality of kdb+/q and the attempt would be foolish.  But if what&#039;s needed is a sufficiently small subset of their functionality integrated with an existing platform, then it just might be reasonable...  Or not.  In any case, a non-commercial license is swell for educational purposes, but does nothing to solve the problems I&#039;m looking to address.</description>
		<content:encoded><![CDATA[<p>A buy-vs-build decision is always going to be specific to the organizational and technical contexts for which it&#8217;s being made and the use-cases required, not to mention cost considerations.  This might be especially true for a start-up.  It&#8217;s certainly impossible to economically reproduce the rich functionality of kdb+/q and the attempt would be foolish.  But if what&#8217;s needed is a sufficiently small subset of their functionality integrated with an existing platform, then it just might be reasonable&#8230;  Or not.  In any case, a non-commercial license is swell for educational purposes, but does nothing to solve the problems I&#8217;m looking to address.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: brainyoga</title>
		<link>http://www.puppetmastertrading.com/blog/2009/01/04/managing-tick-data-with-hdf5/comment-page-1/#comment-387</link>
		<dc:creator>brainyoga</dc:creator>
		<pubDate>Sun, 04 Jan 2009 19:45:34 +0000</pubDate>
		<guid isPermaLink="false">http://www.puppetmastertrading.com/blog/?p=262#comment-387</guid>
		<description>Sounds like you are travelling the well-trodden path of investigating tick databases, which usually ends with the conclusion that it is cheaper to buy one than build one. You mentioned in a previous post that dev licenses can cost a six-figure sum - Kx Systems has a non-commercial license available for download free of charge. It has some limitations (e.g. a 2-hour timeout) but allows you to get your feet wet with it.</description>
		<content:encoded><![CDATA[<p>Sounds like you are travelling the well-trodden path of investigating tick databases, which usually ends with the conclusion that it is cheaper to buy one than build one. You mentioned in a previous post that dev licenses can cost a six-figure sum &#8211; Kx Systems has a non-commercial license available for download free of charge. It has some limitations (e.g. a 2-hour timeout) but allows you to get your feet wet with it.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
