September 26, 2005

distributed caches

It seems there are lots of little companies popping up everywhere touting the next great solution to scalability woes -- the transparent distributed object cache! An interesting debate has ensured on various blogs about whether it's appropriate to provide such technology with an API that explicitly distinguishes between what's cached from what's not, or if it should be done in a transparent "API-less" fashion.

This debate is an old one, and reminds me a lot of Jim Waldo et al's old Note on Distributed Computing that was very influential in distributed computing circles around the time.

Simply put, it is highly unlikely to provide a general transparent distributed object mechanism that preserves identity, takes into account latency and partial failure, and highly scalable concurrency. It strikes me that proponents of these distributed caches get way too caught up in the coolness of implementation details and don't really look at the broader implications, which really is Billy's point here.

The best case I've seen of a general mainstream distributed object cache with parallel operations is the Oracle Database's Real Application Clusters. And the whole reason they can pull this off is because the relational model and SQL completely takes algorithmic control out of the hands of the developer and keeps it in the hands of the SQL optimizer. And secondly, they rely on multi-version concurrency controlled transactions as their management model, which prevents readers & writers from blocking each other.

Yes, as a developer, you can provide hints, or re-write SQL in ways that the optimizer can better work with, and as an administrator you can declare certain preferred storage & caching settings, but in the end, it is the runtime framework that figures out the most optimal and scalable way to access the data.

As soon as you lift the layer of abstraction and give algorithmic control to a developer at the Java language level, you give up the transactional illusion (Java isn't naturally a transactional language), you give up the consistency illusion (object identity is NOT preserved across local/remote and it's requires a lot of runtime dancing to make it happen), and you're exposed to concurrency, latency, and partial failure issues that no runtime can paper over. So you'd better be an expert developer to handle this.

Perhaps the solution is to take an approach similar to where Microsoft is going with their recently-announced LINQ -- provide declarative query semantics and transactions as a native part of the Java language, and allow vendors to compete on the plumbing to make it work in a distributed and concurrent environment.

Posted by stu at 02:52 PM