Saturday, October 31, 2009

Beyond BigTable - Spanner

BigTable is Google's distributed storage system, designed to manage petabytes of data spread across thousands of servers. It is currently used in over 100 projects, including Google Earth, Google Maps, Blogger, Orkut, and many more.

However, Google is now working on a new storage and computation system called Spanner. Some of its main characteristics:
  • Automation: moves and replicates data based on constraints and usage patterns
  • Hierarchical directories (instead of rows, as in BigTable)
  • Support for distributed transactions
  • Fine-grained access control on the data
  • Strong consistency across tablet replicas
  • Scales to 10M machines and 1,000 petabytes of data, across thousands of locations

Jeff Dean of Google gave a keynote talk, "Design, Lessons and Advice from Building Large Distributed Systems", that covers large distributed systems in general and includes material on BigTable and Spanner. You can view the slides of the presentation here.
