November 04, 2003

PDC: architecture

So it took me a few days to get settled back in Toronto after my three-month stint in Tokyo. I have a few things I'd like to say about the PDC Architecture Symposium that was held on Friday.

The morning talks by Pat Helland and David Campbell were two of the best talks on architecture I've heard, period. Together they gave an excellent analysis of the troubles facing enterprise architects today and tomorrow with the advent of "internet scale services". They were also talks by seasoned veterans who aren't buying the "SOAs everywhere, death to objects" rhetoric we see floating out of various groups from time to time. I'll discuss this in a moment.

The final panel discussion on "What is Service Oriented Analysis and Design?" really didn't seem to have a coherent message. I noticed most of the applause went to Martin Fowler, who had the most pragmatic message: services are about distributed systems integration. Gartner seemed to see it as a way of creating some kind of new "composite application". One other panelist saw SOAs everywhere and even wanted their mouse driver to be a service. I think this might be a case of the classic cognitive problem: "when you have a hammer, everything looks like a nail".

Pat Helland's talk was full, and I barely had room to stand outside to watch the slides and listen. The gist of the talk was his service-master/service-agent (a.k.a. fiefdoms/emissaries) model of services and data, which he's been working on for some time.

Data is divided broadly into four categories: resource data (i.e. volatile "state of the business" data), activity data (i.e. private to a business process), reference data (i.e. versioned/timestamped data), and request/response data (the stuff inside messages).

Services are divided into two groups: service-masters (resource-data and activity-data, high concurrency, pessimistic locking), and service-agents (activity-data only, optimistic locking, low concurrency).
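
To make the four data categories and two service roles above a bit more concrete, here is a rough Python sketch. It is only my own illustration of the taxonomy as I've summarized it: the names DataKind, ServiceRole and may_own are hypothetical, not anything taken from the talk.

```python
from dataclasses import dataclass
from enum import Enum


class DataKind(Enum):
    """The four broad data categories, as summarized in this post."""
    RESOURCE = "resource"          # volatile "state of the business" data
    ACTIVITY = "activity"          # private to a business process
    REFERENCE = "reference"        # versioned/timestamped data
    MESSAGE = "request/response"   # the stuff inside messages


@dataclass(frozen=True)
class ServiceRole:
    """Hypothetical description of a service role and what it manages."""
    name: str
    owns: frozenset      # DataKind values this role manages
    locking: str         # "pessimistic" or "optimistic"
    concurrency: str     # "high" or "low"


SERVICE_MASTER = ServiceRole(
    name="service-master",
    owns=frozenset({DataKind.RESOURCE, DataKind.ACTIVITY}),
    locking="pessimistic",
    concurrency="high",
)

SERVICE_AGENT = ServiceRole(
    name="service-agent",
    owns=frozenset({DataKind.ACTIVITY}),
    locking="optimistic",
    concurrency="low",
)


def may_own(role: ServiceRole, kind: DataKind) -> bool:
    """Return True if the given role manages the given kind of data."""
    return kind in role.owns
```

With this in place, may_own(SERVICE_AGENT, DataKind.RESOURCE) is False, which is the whole point of the split: only the master touches the volatile resource data.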

What really impressed me was that they have created some very workable categories for types of data and a way to structure your system to start to reason about the "bounded uncertainty" necessary when dealing with widely distributed large-scale systems. Traditional distributed systems are "local" and "trusted" - they can use guaranteed techniques such as two-phase distributed transactions for agreement. Internet-scale systems unfortunately can't rely on these guarantees because transaction isolation typically implies locks, and locks imply denial of service. So, the idea is to use asynchronous communication, durable queues, and compensations to deal with this uncertainty. This is effectively how sites like eBay and Amazon.com scale.
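
The queue-plus-compensation idea above can be sketched in a few lines of Python. This is a toy under loud assumptions: an in-memory deque stands in for a durable queue, and the step names (reserve_stock, charge_card and so on) are made up for illustration. It is not how eBay or Amazon.com actually implement it.

```python
from collections import deque


def run_with_compensation(steps):
    """Run (action, compensation) pairs in order.

    If an action fails, run the compensations of the steps that already
    completed, in reverse order, instead of holding locks across the
    whole unit of work. Returns True on success, False if compensated.
    """
    done = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(done):
                undo()
            return False
        done.append(compensate)
    return True


log = []


def reserve_stock():
    log.append("reserve stock")


def release_stock():
    log.append("release stock")


def charge_card():
    # Simulated failure of a remote, untrusted service.
    raise RuntimeError("payment service unavailable")


def refund_card():
    log.append("refund card")


# Assumption: a real system would use a durable (persistent) queue here.
queue = deque()
queue.append([(reserve_stock, release_stock), (charge_card, refund_card)])

# A worker drains the queue asynchronously from whoever enqueued the work.
while queue:
    ok = run_with_compensation(queue.popleft())
    log.append("committed" if ok else "compensated")
```

Here the stock reservation succeeds, the charge fails, and the reservation is released again, so the refund never runs because the charge never completed. Nothing is locked while waiting on the other party; the uncertainty is handled after the fact.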

David Campbell's talk covered the role of the different forms of data out there: relations, XML, and objects. He spoke highly of object persistence (object/relational mapping) within service-agents for activity-oriented data, relations for resource-oriented data, and XML for data that requires multiple combined schemas (i.e. extensibility), such as request/response messages that need to evolve over time. I really want to review the PowerPoint slides for this talk, because it went by quite quickly, but they're not online!!! Pat Helland's talk seems to be online, thankfully. I guess I can wait for the DVD...

Posted by stu at November 4, 2003 09:22 AM