Wednesday, January 03, 2007

Scaling vertically or horizontally?

A couple of recent articles on the evolution of the eBay architecture and the Flickr database architecture remind me of a lesson I rarely see documented but many developers learn the hard way at some point in their careers: if you have large sets of data then it is nearly always better to scale the application horizontally (adding more servers) instead of vertically (buying a bigger server).

Can all applications be scaled horizontally? No. It only works if the elements of your data set are largely independent of each other, allowing each data element to be hosted on an arbitrary server. For example, the stock price of company A will usually have little to do with the stock price of company B. (In some cases two companies will be related by ownership or industry or something else, but the system you're building may not care about such relationships.)
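
To make the idea concrete, here is a minimal sketch of how independent data elements might be routed to servers by hashing their key. The host names and the `shard_for` function are made up for illustration, and this is not how eBay or Flickr necessarily do it. Simple modulo hashing like this also moves most keys around when you add a server, which real systems tend to avoid with lookup tables or consistent hashing.

```python
import hashlib

# Hypothetical database hosts; the names are illustrative only.
SHARDS = [
    "db0.example.internal",
    "db1.example.internal",
    "db2.example.internal",
    "db3.example.internal",
]


def shard_for(key, shards=SHARDS):
    """Return the host responsible for `key`.

    A stable hash (MD5 here, used only for distribution, not security)
    guarantees that the same key always maps to the same host.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


if __name__ == "__main__":
    # Each ticker symbol is independent of the others, so it can live anywhere.
    for symbol in ("AAA", "BBB", "CCC"):
        print(symbol, "->", shard_for(symbol))
```

Because company A's price never needs company B's row, a lookup touches exactly one server.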

Scaling horizontally also has other advantages, like not bringing down your whole site if a single db/node fails: only the data hosted on the failed node becomes unavailable.
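
A toy simulation of that failure isolation, assuming in-memory dictionaries stand in for database nodes. The `ShardedStore` class and its methods are hypothetical names invented for this sketch.

```python
import zlib


class ShardedStore:
    """Toy key/value store spread over several in-memory 'nodes'."""

    def __init__(self, shard_count):
        self.shards = [dict() for _ in range(shard_count)]
        self.down = set()  # indexes of nodes that are currently failed

    def _index(self, key):
        # CRC32 is deterministic, so a key always maps to the same node.
        return zlib.crc32(key.encode("utf-8")) % len(self.shards)

    def mark_down(self, index):
        self.down.add(index)

    def put(self, key, value):
        i = self._index(key)
        if i in self.down:
            raise ConnectionError("node %d is down" % i)
        self.shards[i][key] = value

    def get(self, key):
        i = self._index(key)
        if i in self.down:
            raise ConnectionError("node %d is down" % i)
        return self.shards[i][key]


def count_reachable(store, keys):
    """Count how many keys can still be read despite failed nodes."""
    reachable = 0
    for key in keys:
        try:
            store.get(key)
            reachable += 1
        except ConnectionError:
            pass
    return reachable
```

With four nodes and one marked down, only the keys that hashed to the failed node raise an error; every other read still succeeds.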

eBay have made some interesting and extreme scaling design decisions including:
  • No business logic in the database. No stored procs!
  • Moved CPU-intensive work from the database to the application tier. Referential integrity, joins and sorting all done in the application tier.
  • No client side transactions, no distributed transactions. Single db transactions managed through anonymous PL/SQL blocks - I'd love to see their O/R mapping layer!
  • No session state in the application tier. Transient state maintained in cookies or scratch db.
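
To give a feel for what "joins and sorting in the application tier" might look like, here is a small sketch. It assumes the rows have already been fetched from two separate databases as lists of dictionaries. The field names and the `join_items_with_sellers` function are invented for illustration and say nothing about eBay's actual code.

```python
from operator import itemgetter


def join_items_with_sellers(items, sellers):
    """Join item rows to seller rows in application code, sorted by price.

    `items` and `sellers` are assumed to come from different databases,
    so the database cannot perform the join or enforce the foreign key.
    """
    # Build a hash index on the seller primary key (a hash join).
    sellers_by_id = {seller["id"]: seller for seller in sellers}

    joined = []
    for item in items:
        seller = sellers_by_id.get(item["seller_id"])
        if seller is None:
            # Referential integrity check done by the application.
            raise ValueError("item %r references unknown seller %r"
                             % (item["title"], item["seller_id"]))
        joined.append({
            "title": item["title"],
            "price": item["price"],
            "seller": seller["name"],
        })

    # Sorting also happens here rather than in an ORDER BY clause.
    joined.sort(key=itemgetter("price"))
    return joined
```

The trade-off is more application code and more data crossing the network, in exchange for keeping the hard-to-scale database tier doing as little work as possible.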
