Scott Adams Blog: The Illusion of Winning 08/30/2010

Let’s say that you and I decide to play pool. We agree to play eight-ball, best of five games. Our perception is that what follows is a contest to see who will do something called winning.

But I don’t see it that way. I always imagine the outcome of eight-ball to be predetermined, to about 95% certainty, based on who has practiced that specific skill the most over his lifetime. The remaining 5% is mostly luck, and playing a best of five series eliminates most of the luck too.

I found this really insightful. It seems obvious when you read it but for some reason it doesn’t seem to occur naturally to me.

Posted via email from Sijin Joseph

High Scalability – High Scalability – Pomegranate – Storing Billions and Billions of Tiny Little Files

Pomegranate is a novel distributed file system built over distributed tabular storage that acts an awful lot like a NoSQL system. It’s targeted at increasing the performance of tiny object access in order to support applications like online photo and micro-blog services, which require high concurrency, high throughput, and low latency. Their tests seem to indicate it works:

We have demonstrate that file system over tabular storage performs well for highly concurrent access. In our test cluster, we observed linearly increased more than 100,000 aggregate read and write requests served per second (RPS). 

Rather than sitting atop the file system like almost every other K-V store, Pomegranate is baked into file system. The idea is that the file system API is common to every platform so it wouldn’t require a separate API to use. Every application could use it out of the box.

The features of Pomegranate are:

  • It handles billions of small files efficiently, even in one directory;
  • It provide separate and scalable caching layer, which can be snapshot-able;
  • The storage layer uses log structured store to absorb small file writes to utilize the disk bandwidth;
  • Build a global namespace for both small files and large files;
  • Columnar storage to exploit temporal and spatial locality;
  • Distributed extendible hash to index metadata;
  • Snapshot-able and reconfigurable caching to increase parallelism and tolerant failures;
  • Pomegranate should be the first file system that is built over tabular storage, and the building experience should be worthy for file system community. 

Very cool technology. This reminded me of a distributed filesystem Google Tech Talk (http://www.youtube.com/watch?v=3xKZ4KGkQY8) on Wuala (http://www.wuala.com/) that I found fascinating for all the little problems they had to overcome to make this work.

Posted via email from Sijin Joseph