October 31, 2005

Let confusion reign

The Enterprise Service Bus (ESB) debacle is a prevailing sign of the integration industry's utter disorganization and confusion. Customers & vendors do not seem to know or agree upon what they want in the integration space -- only that it involves some magical mixture of reliable messaging middleware, business process orchestration, and XML-aware routing and data transformation. So, no one can really agree on what an ESB is, other than that it's some sort of bundle of features that might be implemented by one or more products and tied together in an "architecture" (whatever that is).

Despite this frustration, I tend to think it might be a good thing (in the long run). Indecision and acrimony are usually an indication that something is important. How many things that are important in life are nearly impossible to define in an agreed upon manner? What I would like to address today are the arguments against the ESB, and also the caveats to consider when adopting one. Buzzword bingo follows; please try not to cry (much). Also, I come from a biased background (BEA consulting), but that doesn't mean what I'm saying has anything to do with BEA's agenda; it's just my interpretation of the market.

The ESB opponents seem to have three arguments, not always held simultaneously:

a. ESB is not a product, it's a pattern (aka. I can do that stuff today with [insert favorite tool here] )
b. ESB is proprietary, web standards only should be used (aka. the "fabric" approach).
c. ESB is unnecessary, as is all of SOAP and WS-*, we all should be using REST-style XML+HTTP+SSL.

In short, my answers are:
a. FUD.
b. Standards are absolutely necessary but can sometimes be overrated, or solidified too quickly, before the industry knows what it's doing.
c. B.S.

Argument (a) is a game of FUD to me: vendors and interest groups are trying to protect their turf. For example, Microsoft claims they have everything an ESB has with BizTalk, which is true, but it's disingenuous. Respected developers are falling in love with BizTalk 2004, but it's not like this is particularly new -- BEA has had everything BizTalk has (and thus an ESB has) with WebLogic Integration (WLI) 8.1, since mid-2003. Yet BEA doesn't claim WLI is an ESB (though you could build an ESB with it). BEA claims the AquaLogic Service Bus is an ESB. It supports stateless multi-transport / multi-format transformation & routing in an appliance-like manner -- no custom code other than XPath, XQuery and a graphical pipeline language. IBM recently announced 2 ESB's -- one based on WebSphere Application Server (for standards-based interop) and one based on MQ (for proprietary + standards interop). Then there is the plethora of smaller ESB's from Cape Clear, Sonic, Polarlake, IONA, Fiorano, etc. They're all ESB-like and yet none of them fit the broadest definition. Some are more optimized to certain use cases than others. Some ESB vendors still require code for the "last mile", others need it for cases such as long running processes. And one has to wonder whether BPEL support is required or not to be an ESB, even though it's not even standardized yet!

So the story isn't over yet as to what an ESB should / should not be. The takeaway is that many ESB products make life simpler than using traditional brokers, but no ESB products (yet) cover all popular integration use cases. YMMV.

Argument (b) has some merit to it, which is why I will address it at length.

Many ESB's push their proprietary messaging heritage and have features that virtually ensure lock-in. Cape Clear distinguishes between "service-centric vs. message-centric" ESBs, and has a paper on the subject. It is marketing focused (being a thinly-veiled attack on Sonic), and it pushes the purist "fabric" approach to ESB a bit too much, but there are good points. My view on this debate is to look at it in terms of feature lock-in.

There are two classes of features: fundamental and instrumental. Fundamental features are "required" features to do anything useful with a product. Instrumental features are "tooling", and not required for the core operation of the product.

Fundamental features tend to be the core policies or abstractions exposed by a product. In an ESB, the fundamental features are routing, transformation, security, auditing, SLA enforcement, and management. These features must have, at the very least, a clearly-delineated mode that is on track with a standards-based "least common denominator" approach, or else you will be locked in to a vendor's infrastructure.

For example, Sonic ESB likes to push "distributed SOA" or "itinerary-based routing" as a core feature of their ESB, one that no other vendor can touch. Itinerary-based routing is the idea that a message is like an "agent", it has a series of endpoints in its header and the ESB infrastructure will read this itinerary as it routes the message through the network. This approach to routing has some intuitive appeal, and was also pushed, briefly, by Microsoft with their WS-Routing specification.

However, all of this is irrelevant now -- the standards process has jettisoned WS-Routing in favour of WS-Addressing. Itineraries are inherently insecure because every ESB intermediary must modify the message header as the message works through its itinerary. WS-Addressing adopts the "next hop" approach to routing, the same one that TCP/IP adopted: the addresses in the IP header are not rewritten in transit, and routing decisions are made by intermediaries (such as a Cisco switch or router). Microsoft published a paper in mid-2004 explaining how to handle web services routing with this model.
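To make the contrast concrete, here is a minimal sketch of the two routing styles. It is only an illustration of the idea, not how Sonic, WS-Routing or WS-Addressing actually work on the wire; the `Message` class, the function names and the node names ("gateway", "hub", "billing") are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    destination: str   # logical address; no intermediary rewrites it
    body: str
    # Hypothetical itinerary header, used only by the itinerary style.
    itinerary: list = field(default_factory=list)


def route_by_itinerary(message):
    """Itinerary style: every hop consumes its own entry from the header."""
    visited = []
    while message.itinerary:
        hop = message.itinerary.pop(0)   # the header is mutated at each hop
        visited.append(hop)
    return visited


def route_by_next_hop(message, tables, start):
    """Next-hop style: each node looks up the next node in its own table."""
    visited = [start]
    node = start
    while node != message.destination:
        node = tables[node][message.destination]   # header is left untouched
        visited.append(node)
    return visited


# Each intermediary only knows the next hop towards a destination.
tables = {
    "gateway": {"billing": "hub"},
    "hub": {"billing": "billing"},
}
msg = Message("billing", "<invoice/>", itinerary=["gateway", "hub", "billing"])
print(route_by_next_hop(msg, tables, "gateway"))
print(route_by_itinerary(msg))
```

Both calls visit the same three nodes, but only the itinerary version leaves the message header changed (emptied, in this toy case) by the time it arrives, which is the property that makes end-to-end protection of the header awkward.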

The moral of this story is that routing is a fundamental feature, and "next hop" routing is the only standard way to approach it in a transport-neutral fashion. Arguments that claim other ESBs or BPM engines like BizTalk are "hub and spoke", and not "truly distributed" are disingenuous. The entire internet is based on "next hop" routing with registry-based lookup (DNS). The web services world will likely follow the same approach when ESB's start integrating with UDDI registries.

Sonic does make one good argument, though flawed, about the benefits of their approach. Itinerary-based routing facilitates a global process view instead of a splintered process view. That is, you can "orchestrate the orchestrations" across multiple intermediaries. The flaw in this is that the argument really isn't about routing, it's about global interaction management. All of these modern ESB/BPM hybrid engines, whether BizTalk, WLI, or BPEL-based, are about "orchestration". They all require a central conductor to manage the process state. BPEL is just another way to implement an orchestration -- something you could also do with BizTalk's XLANG or WLI's JPD. But one doesn't need the drawbacks of itinerary-based routing to get a global view, one just needs a contractual set of interactions -- something also known as choreography. Perhaps WS-CDL (Choreography Description Language) will eventually catch on to fill this void. Perhaps a future BPEL extension will -- I've noticed that IBM released a WS-BPEL 2.0 sub-process extension just a few weeks ago. Until the industry figures out how it wants to handle choreography, which will probably require a number of years, itinerary-based routing is 100% proprietary, will only work on a single vendor's ESB (though you might be able to write a lot of custom code to bridge the gap) and will likely never be standardized. Use it if it makes sense to you, but understand the risks.

Instrumental features, as I mentioned, tend to be more pluggable -- they're tooling, they're "nice to have", but they're not absolutely necessary for the product to operate. That is, assuming one can separate the policy from the underlying implementation, supporting proprietary protocols, data formats or transports can effectively (but not completely) become instrumental. The key is to ensure that there is a very clear demarcation between what is core to the ESB and what is not, and any dependency on instrumental features has a clear abstraction. This is arguably where Java EE has always shined -- in creating a market for pluggable device drivers, whether database (JDBC), messaging (JMS), or general connectivity (JCA), allowing you to choose whatever core programmatic model you'd like for your application.
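The kind of demarcation described above can be sketched in a few lines. This is a toy illustration under my own assumptions, not any vendor's framework: `ReliableTransport`, `QueueTransport`, `WsRmTransport`, `TRANSPORTS` and `deliver` are hypothetical names, and the two transports only return strings instead of talking to real middleware.

```python
from abc import ABC, abstractmethod


class ReliableTransport(ABC):
    """The abstraction the core depends on; implementations are pluggable."""

    @abstractmethod
    def send(self, endpoint, payload):
        ...


class QueueTransport(ReliableTransport):
    """Stand-in for a proprietary or JMS-style queue binding."""

    def send(self, endpoint, payload):
        return f"queue://{endpoint} <- {payload}"


class WsRmTransport(ReliableTransport):
    """Stand-in for a WS-ReliableMessaging-over-HTTP binding."""

    def send(self, endpoint, payload):
        return f"wsrm+http://{endpoint} <- {payload}"


# The choice of binding lives in configuration, not in service code.
TRANSPORTS = {"jms": QueueTransport, "ws-rm": WsRmTransport}


def deliver(config, payload):
    """Core logic: knows only the abstraction, never a concrete transport."""
    transport = TRANSPORTS[config["reliability"]]()
    return transport.send(config["endpoint"], payload)


print(deliver({"reliability": "jms", "endpoint": "orders"}, "<order/>"))
print(deliver({"reliability": "ws-rm", "endpoint": "orders"}, "<order/>"))
```

Switching the `reliability` value is the only change needed to move between bindings; `deliver` itself is untouched, which is the sense in which the transport has become instrumental rather than fundamental.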

So, in an ESB built on Java EE, security could be made pluggable between SSL and WS-Security, and reliability between JMS and WS-ReliableMessaging -- assuming your ESB vendor chooses to do so in its core framework. One can use JMS in the short term, and move to a WS-ReliableMessaging endpoint once it is more widely adopted. An ESB should allow this without requiring any code changes. Routing can be generalized to rely on any metadata in any one of the transport-specific headers (HTTP, JMS, etc.), the standard SOAP headers, or the content of the message itself. An ESB intermediary should be able to expose both a REST-based XML+HTTPS endpoint and a SOAP-based WS-Security endpoint to the same service -- so long as the underlying service has some known way of handling security, and the ESB knows how to translate between approaches. Last-mile connectivity to legacy enterprise systems or packaged applications can be made effectively (but again, not completely) instrumental by using API-level standards such as the Java Connector Architecture (JCA), again assuming there is a general way of mapping non-XML data into XML data and vice-versa.
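As a rough sketch of what "routing on any metadata" might look like, the snippet below matches rules against transport headers, SOAP headers, or the message body. The message layout, the rule tuples and the endpoint names are assumptions made up for the example; a real ESB would more likely express the body rules in XPath or XQuery, whereas Python's `xml.etree.ElementTree` only supports a limited path subset.

```python
import xml.etree.ElementTree as ET


def extract(message, source, key):
    """Pull one value from the transport headers, SOAP headers, or body."""
    if source == "transport":
        return message["transport_headers"].get(key)
    if source == "soap":
        return message["soap_headers"].get(key)
    if source == "body":
        # key is a simple element path relative to the body's root element
        node = ET.fromstring(message["body"]).find(key)
        return None if node is None else node.text
    raise ValueError(f"unknown metadata source: {source}")


def route(message, rules, default):
    """Return the endpoint of the first matching rule, else the default."""
    for source, key, expected, endpoint in rules:
        if extract(message, source, key) == expected:
            return endpoint
    return default


# Hypothetical rules: (where to look, what to read, expected value, endpoint)
rules = [
    ("transport", "JMSType", "invoice", "billing"),
    ("soap", "Action", "urn:ship", "shipping"),
    ("body", "customer/region", "EU", "eu-orders"),
]
message = {
    "transport_headers": {},
    "soap_headers": {},
    "body": "<order><customer><region>EU</region></customer></order>",
}
print(route(message, rules, "dead-letter"))
```

The point of the design is that the rule table is data: adding a new routing criterion means adding a row, not writing transport-specific code.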

One key fundamental area in the ESB market that is lacking good standards support is manageability and SLA enforcement. The standards here (such as WSDM) are rather poorly adopted and lack a lot of what is needed. It will likely be years before standards evolve here, so every ESB vendor will have the opportunity to at least "try" to provide some level of interoperability with sub-optimal standards like JMX or SNMP.

A trend you'll notice here is that API-level standards are very flexible but potentially have a lot of labour associated with them. This to me is contrary to the whole point of an ESB -- to reduce the amount of labour required to integrate applications! If your ESB requires you to write a lot of custom code, it's not doing its job as well as it should.

Another trend is that the ESB vendor has to expose its own set of abstractions to implement transformation, routing, and management. Consistent with my earlier point, these abstractions should NOT have their primary exposure through an API; they should be exposed themselves as standard services or through some form of management interface. But, having said that, there will probably be some level of lock-in depending on how well the ESB vendor manages the distinction between the fundamental features of management, transformation, routing, and endpoint bindings.

There have been early attempts at standardizing this, with mixed results. There's the BPEL 1.1 draft and Apache WSIF, both of which are useful, have some teething problems, and probably will never be adopted by a standards body in their current form. OASIS is working on WS-BPEL 2.0 which has some very significant changes over 1.1. And I believe Java Business Integration (JBI) is hoped to be a generalized alternative to WSIF. Nevertheless, in theory, you can port a BPEL 1.1 + WSIF process between Oracle and IBM's BPEL engines, though I'd be curious about how well that would work in practice. But both WSIF and JBI assume your ESB is implemented on Java EE! There will be no standard way to port a BPEL 1.1 or WS-BPEL 2.0 process that uses WSIF or JBI onto BizTalk, for example.

The moral of the standards story: this stuff is too new to expect a truly portable ESB execution language. WS-BPEL 2.0 will be close, but in practice it probably will only be portable among Java EE based containers. That might be OK -- SQL is a standard that isn't the same everywhere and certainly isn't portable, but it has been a success in terms of adoption. But WS-BPEL is arguably not appropriate for stateless ESB's like AquaLogic. Should an ESB vendor adopt BPEL for all message exchange patterns, or should it have separate products that optimize stateful vs. stateless processing? We have a cart-before-the-horse standard, yet again! Hence proprietary extensions will abound.

My final point on argument (b) is that the "fabric" approach seems to only be pushed by small vendors that have nothing to lose, but also don't have a long track record. The WS-* standards aren't completely ready yet, so there needs to be the ability for an enterprise to choose de facto and/or proprietary standards that are suitable for them in the short run. This of course is only appropriate for intra-enterprise services, or tightly-coupled cross-enterprise integration -- which is why SaaS proponents often discount this usage of ESBs!

One claim is that only "pure" WS-ReliableMessaging implementations should be adopted. Bigger vendors are basing their WS-RM implementations on their older MOM technology, such as MQ Series or JMS, and this is somehow a bad thing. I don't understand this line of reasoning at all. The infrastructure underneath a high speed reliable messaging protocol is sophisticated and requires a lot of investment to develop. Older MOM's are proven. Why throw them out? The lesson of recent years is that interoperability is achieved at the protocol layer, not the API. Who really cares if my underlying MOM has a JMS binding? The point is that it must eventually expose WS-ReliableMessaging over TCP, UDP, or HTTP to be interoperable. Having said that, WS-ReliableMessaging alone has had 3 major revisions since 2003, and all the various ESB/Fabric players support it at varying levels. Until the big vendors such as IBM, Microsoft, BEA, and Oracle have their WS-RM implementations shipping, this standard is too new to be your "sole" approach to reliable messaging.

Cape Clear actually turns API-based pluggability into a feature: it is pluggable with any JMS-based middleware engine. For some, this might be compelling, especially if their MOM vendor uses proprietary approaches to fundamental features. It has the downside of (potentially) performing worse than a fully-integrated stack. For example, I'd actually be interested to see how Cape Clear performs on WebLogic Server 9 vs. the AquaLogic Service Bus 2.x. It would indicate how much of the performance increase ALSB is showing is due to WLS9 vs. path-length and memory allocation improvements over WLI 8.1's dynamic transformation and routing.

Argument (c) to me is irrelevant if an ESB supports REST-style XML+HTTP+SSL. I believe this is the case with some vendors (though not all), so that's all I will say there.

Posted by stu at October 31, 2005 11:14 AM