<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/'><id>tag:blogger.com,1999:blog-860423771829255614.post8908268768859093340..comments</id><updated>2010-05-27T12:27:15.778-07:00</updated><category term='apache'/><category term='linux'/><category term='couchdb'/><category term='xml'/><category term='macos'/><category term='xsl'/><category term='openhug'/><category term='java'/><category term='erlang'/><category term='ec2'/><category term='lucene'/><category term='music'/><category term='xslt'/><category term='fosdem'/><category term='vserver'/><category term='hadoop'/><category term='home'/><category term='katta'/><category term='iphone'/><category term='bigtable'/><category term='nosql'/><category term='eclipse'/><category term='aws'/><category term='work'/><category term='hbase'/><category term='xen'/><title type='text'>Comments on Lineland: HBase vs. 
CouchDB in Berlin</title><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://www.larsgeorge.com/feeds/8908268768859093340/comments/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/860423771829255614/8908268768859093340/comments/default'/><link rel='alternate' type='text/html' href='http://www.larsgeorge.com/2009/03/hbase-vs-couchdb-in-berlin.html'/><author><name>Lars George</name><uri>http://www.blogger.com/profile/18168538475015227467</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='24' height='32' src='http://1.bp.blogspot.com/_Cib_A77V54U/StyCL3z7l0I/AAAAAAAAAEQ/mbc8p6guQDg/S220/2236d0070436f2f003151.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>5</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-860423771829255614.post-9042287959627217696</id><published>2010-05-27T12:27:15.764-07:00</published><updated>2010-05-27T12:27:15.764-07:00</updated><title type='text'>Hi,

I am struggling with the difference between s...</title><content type='html'>Hi,&lt;br /&gt;&lt;br /&gt;I am struggling with the difference between several systems. I am trying to get a grasp on the storage of CouchDB (document-oriented), C-Store (column-oriented), and HBase (which says it is row and column but, in the end, is a storage map array).&lt;br /&gt;&lt;br /&gt;Thanks</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/860423771829255614/8908268768859093340/comments/default/9042287959627217696'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/860423771829255614/8908268768859093340/comments/default/9042287959627217696'/><link rel='alternate' type='text/html' href='http://www.larsgeorge.com/2009/03/hbase-vs-couchdb-in-berlin.html?showComment=1274988435764#c9042287959627217696' title=''/><author><name>Robert</name><uri>http://www.blogger.com/profile/16048188804919417057</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.larsgeorge.com/2009/03/hbase-vs-couchdb-in-berlin.html' ref='tag:blogger.com,1999:blog-860423771829255614.post-8908268768859093340' source='http://www.blogger.com/feeds/860423771829255614/posts/default/8908268768859093340' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-2001887455'/></entry><entry><id>tag:blogger.com,1999:blog-860423771829255614.post-5558883860927498354</id><published>2010-03-07T22:16:22.436-08:00</published><updated>2010-03-07T22:16:22.436-08:00</updated><title type='text'>Hi,

I really struggled with HBase because of my R...</title><content type='html'>Hi,&lt;br /&gt;&lt;br /&gt;I really struggled with HBase because of my RDBMS &amp;amp; SQL background, and I finally found an analogy between the two: think of an HBase table as one you can only query in SQL by &amp;quot;filtering&amp;quot; on a unique ROW_ID... &lt;br /&gt;&lt;br /&gt;Meaning that the only valid pseudo-SQL query for an HBase table looks like this one: &lt;br /&gt;&amp;quot;select col1, col2 from hbasetable where row_id = 1&amp;quot;; &lt;br /&gt;You cannot specify anything other than row_id in the WHERE part of your SQL query...&lt;br /&gt;&lt;br /&gt;What do you think of my argument? Is it valid?</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/860423771829255614/8908268768859093340/comments/default/5558883860927498354'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/860423771829255614/8908268768859093340/comments/default/5558883860927498354'/><link rel='alternate' type='text/html' href='http://www.larsgeorge.com/2009/03/hbase-vs-couchdb-in-berlin.html?showComment=1268028982436#c5558883860927498354' title=''/><author><name>tunga</name><uri>http://www.blogger.com/profile/15089901581731764174</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='07852702495117944109'/><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.larsgeorge.com/2009/03/hbase-vs-couchdb-in-berlin.html' ref='tag:blogger.com,1999:blog-860423771829255614.post-8908268768859093340' source='http://www.blogger.com/feeds/860423771829255614/posts/default/8908268768859093340' type='text/html'/><gd:extendedProperty 
xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-926378745'/></entry><entry><id>tag:blogger.com,1999:blog-860423771829255614.post-1919879198804921163</id><published>2009-04-24T06:54:00.000-07:00</published><updated>2009-04-24T06:54:00.000-07:00</updated><title type='text'>Thanks Lars,

That does explain a lot.

RE: CouchD...</title><content type='html'>Thanks Lars,&lt;br /&gt;&lt;br /&gt;That does explain a lot.&lt;br /&gt;&lt;br /&gt;RE: CouchDB clustering, I suppose you could stick a load balancer in front of a number of CouchDB nodes to achieve some form of clustering? Since it's REST based, this should be trivial. But, I get your point that this isn't suitable for certain use cases. (Adding nodes won't give you more storage space, for example.)&lt;br /&gt;&lt;br /&gt;On HBase... I'm more interested in doing a 'join' between two tables, so the lookup table solution would work. I could also have a M/R job that would create the lookup tables from scratch.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/860423771829255614/8908268768859093340/comments/default/1919879198804921163'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/860423771829255614/8908268768859093340/comments/default/1919879198804921163'/><link rel='alternate' type='text/html' href='http://www.larsgeorge.com/2009/03/hbase-vs-couchdb-in-berlin.html?showComment=1240581240000#c1919879198804921163' title=''/><author><name>p7a</name><uri>http://www.blogger.com/profile/05421864805907825028</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.larsgeorge.com/2009/03/hbase-vs-couchdb-in-berlin.html' ref='tag:blogger.com,1999:blog-860423771829255614.post-8908268768859093340' source='http://www.blogger.com/feeds/860423771829255614/posts/default/8908268768859093340' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' 
value='pid-1970981629'/></entry><entry><id>tag:blogger.com,1999:blog-860423771829255614.post-793195723451542891</id><published>2009-04-23T23:34:00.000-07:00</published><updated>2009-04-23T23:34:00.000-07:00</updated><title type='text'>Hi p7a,

CouchDB handles replication for you, but ...</title><content type='html'>Hi p7a,&lt;br /&gt;&lt;br /&gt;CouchDB handles replication for you, but not the clustering. So you can replicate across many machines that run independently of each other or in remote locations. But it does not form a scalable cluster as I would define it. HBase is a cluster of machines that act as one system. As you add more machines to it, it scales to handle more storage and load. It implicitly handles splitting the data into regions and distributing them across the whole cluster. That is the part that CouchDB - as of now - does not do. You would have to put a layer on top to shard the data into many replicated instances. Then you are facing the same issues as when scaling a conventional relational database. &lt;br /&gt;&lt;br /&gt;HBase's secondary indexes are an add-on that is - again, as of now - not the fastest, as it relies on another addition called Transactional HBase. The thing with HBase is to not think of it as a database with many ways to find data quickly. It is more like a Java ArrayList where you have to iterate over it to process each entry or ask for one entry by its key.&lt;br /&gt;&lt;br /&gt;If you need the secondary indexes you can either design this into your code by filling extra look-up tables with the data you need (although you still get no wild card matches etc.) or use Lucene, for example, to generate a searchable index that you can query any way you like.&lt;br /&gt;&lt;br /&gt;While CouchDB and HBase have some traits in common, they fit different needs, and they overlap less in specific features than in the general functionality they provide, i.e. 
storing data.&lt;br /&gt;&lt;br /&gt;Does that help explain it a bit better?</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/860423771829255614/8908268768859093340/comments/default/793195723451542891'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/860423771829255614/8908268768859093340/comments/default/793195723451542891'/><link rel='alternate' type='text/html' href='http://www.larsgeorge.com/2009/03/hbase-vs-couchdb-in-berlin.html?showComment=1240554840000#c793195723451542891' title=''/><author><name>Lars George</name><uri>http://www.blogger.com/profile/18168538475015227467</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.larsgeorge.com/2009/03/hbase-vs-couchdb-in-berlin.html' ref='tag:blogger.com,1999:blog-860423771829255614.post-8908268768859093340' source='http://www.blogger.com/feeds/860423771829255614/posts/default/8908268768859093340' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-2071041050'/></entry><entry><id>tag:blogger.com,1999:blog-860423771829255614.post-4842952449327917429</id><published>2009-04-23T08:33:00.000-07:00</published><updated>2009-04-23T08:33:00.000-07:00</updated><title type='text'>I'm still digging into CouchDB but as I understand...</title><content type='html'>I'm still digging into CouchDB but as I understand it, it handles clustering for you.&lt;br /&gt;&lt;br /&gt;One thing I really liked about CouchDB is that it incrementally updates the secondary index for you.&lt;br /&gt;&lt;br /&gt;HBase 0.19 apparently supports secondary indices, but I can't find enough literature on how it works.&lt;br /&gt;&lt;br /&gt;For the moment, you need 
to declare the secondary index when the column group is created, but apparently the devs are working on a fix for that.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/860423771829255614/8908268768859093340/comments/default/4842952449327917429'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/860423771829255614/8908268768859093340/comments/default/4842952449327917429'/><link rel='alternate' type='text/html' href='http://www.larsgeorge.com/2009/03/hbase-vs-couchdb-in-berlin.html?showComment=1240500780000#c4842952449327917429' title=''/><author><name>p7a</name><uri>http://www.blogger.com/profile/05421864805907825028</uri><email>noreply@blogger.com</email><gd:image xmlns:gd='http://schemas.google.com/g/2005' rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.larsgeorge.com/2009/03/hbase-vs-couchdb-in-berlin.html' ref='tag:blogger.com,1999:blog-860423771829255614.post-8908268768859093340' source='http://www.blogger.com/feeds/860423771829255614/posts/default/8908268768859093340' type='text/html'/><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='blogger.itemClass' value='pid-1970981629'/></entry></feed>
