SOAP Intermediaries

So far, we have addressed SOAP headers as a means for vertical extensibility within SOAP messages. There is another related notion, however: horizontal extensibility. Vertical extensibility is about the ability to introduce new pieces of information within a SOAP message, and horizontal extensibility is about targeting different parts of the same SOAP message to different recipients. Horizontal extensibility is provided by SOAP intermediaries.

The Need for Intermediaries

SOAP intermediaries are applications that can process parts of a SOAP message as it travels from its origination point to its final destination point (see Figure 3.7). Intermediaries can both accept and forward SOAP messages. Three key use-cases define the need for SOAP intermediaries: crossing trust domains, ensuring scalability, and providing value-added services along the SOAP message path.

Figure 3.7. Intermediaries on the SOAP message path.

Crossing trust domains is a common issue faced while implementing security in distributed systems. Consider the relation between a corporate or departmental network and the Internet. For small organizations, it is likely that the IT department has put most computers on the network within a single trusted security domain. Employees can see their co-workers' computers as well as the IT servers, and they can freely exchange information between them without the need for separate logons. On the other hand, the corporate network probably treats all computers on the Internet as part of a separate security domain that is not trusted. Before an Internet request reaches the network, it needs to cross from its untrustworthy domain to the trusted domain of the network. Corporate firewalls and virtual private network (VPN) gateways are the Cerberean guards of the gates to the network's riches. Their job is to let some requests cross the trust domain boundary and deny access to others.

Another important need for intermediaries arises because of the scalability requirements of distributed systems. A simplistic view of distributed systems could identify two types of entities: those that request some work to be done (clients) and those that do the work (servers). Clients send messages directly to the servers with which they want to communicate. Servers, in turn, get some work done and respond. In this naïve universe, there is little need for distributed computing infrastructure. Alas, you cannot use this model to build highly scalable distributed systems.

Take basic e-mail as an example—the service we've grown to depend on so much in the Net era. When someone@company.com sends an e-mail message to myfriend@london.co.uk, it is definitely not the case that their e-mail client locates the mail server london.co.uk and sends the message to it. Instead, the client sends the message to its e-mail server at company.com. Based on the priority of the message and how busy the mail server is, the message will leave either by itself or in a batch of other messages. Messages are often batched to improve performance. It is likely that the message will make a few hops through different nodes on the Internet before it gets to the mail server in London.

The lesson from this example is that highly scalable distributed systems (such as e-mail) require flexible buffering of messages and routing based not only on message parameters such as origin, destination, and priority but also on the state of the system measured by parameters such as the availability and load of its nodes as well as network traffic information. Intermediaries hidden from the eyes of the originators and final recipients of messages perform all this work behind the scenes.

Last but not least, you need intermediaries so that you can provide value-added services in a distributed system. The type of services can vary significantly. Here are a couple of common examples:

Securing message exchanges, particularly when transmitting messages through untrustworthy domains, for example over HTTP or SMTP on the Internet. You could secure SOAP messages by passing them through an intermediary that first encrypts them and then digitally signs them. On the receiving side, an intermediary will perform the inverse operations: checking the digital signature and, if it is valid, decrypting the message.
Providing message-tracing facilities. Tracing allows the recipient of a message to find out the exact path that the message took, complete with detailed timings of arrivals at and departures from the intermediaries along the way. This information is indispensable for tasks such as measuring quality of service (QoS), auditing systems, and identifying scalability bottlenecks.

Intermediaries in SOAP

As the previous section has shown, intermediaries are an extremely important concept in distributed systems. SOAP is specifically designed with intermediaries in mind. It has simple yet flexible facilities that address the three key aspects of an intermediary-enabled architecture:

How do you pass information to intermediaries?
How do you identify who should process what?
What happens to information that is processed by intermediaries?

From the discussion of intermediaries, you can see that most of the information that intermediaries require is completely orthogonal to the information contained in SOAP message bodies. For example, whether logging of inventory check requests is enabled is irrelevant to the inventory check service. Therefore, only information in SOAP headers can be explicitly targeted at intermediaries. This does not mean that an intermediary cannot look at, process, or change the SOAP message body; it certainly can. However, SOAP itself defines no mechanism to instruct an intermediary to do so. Contrast this with a header entry that a SOAP message explicitly targets at an intermediary, with the understanding that the intermediary must at least attempt to process it. The question then becomes how to identify the recipient of a particular header.

All header elements can optionally have the SOAP-ENV:actor attribute. The value of the attribute is a URI that identifies who should handle the header entry. Essentially, that URI is the "name" of the intermediary. The special value http://schemas.xmlsoap.org/soap/actor/next indicates that the header entry's recipient is the next SOAP application that processes the message. This is useful for hop-by-hop processing required, for example, by message tracing. Of course, omitting the actor attribute implies that the final recipient of the SOAP message should process the header entry. The message body is intended for the final recipient of the SOAP message.
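
The targeting rules just described can be sketched in a few lines of Python. This is only an illustration using the standard library's ElementTree parser, not part of any SOAP toolkit; the function name headers_for, the ex: header entries, and the urn:X-example URIs are all invented for the example.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
ACTOR_NEXT = "http://schemas.xmlsoap.org/soap/actor/next"
ACTOR_ATTR = f"{{{SOAP_ENV}}}actor"


def headers_for(envelope_xml, my_actor_uris, is_final_recipient=False):
    """Return the header entries this SOAP application should process.

    my_actor_uris is the (hypothetical) set of actor URIs this node answers to.
    """
    envelope = ET.fromstring(envelope_xml)
    header = envelope.find(f"{{{SOAP_ENV}}}Header")
    if header is None:
        return []
    targeted = []
    for entry in header:
        actor = entry.get(ACTOR_ATTR)
        if actor is None:
            # No actor attribute: the entry is meant for the final recipient.
            if is_final_recipient:
                targeted.append(entry)
        elif actor == ACTOR_NEXT or actor in my_actor_uris:
            # "next" addresses whichever application handles the message next.
            targeted.append(entry)
    return targeted


def local_names(entries):
    """Strip the {namespace} part from each element tag for display."""
    return [entry.tag.split("}")[1] for entry in entries]


# Invented message: one hop-by-hop entry, one named-actor entry, one untargeted entry.
SAMPLE = """<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:ex="urn:X-example:headers">
  <SOAP-ENV:Header>
    <ex:Trace SOAP-ENV:actor="http://schemas.xmlsoap.org/soap/actor/next"/>
    <ex:Route SOAP-ENV:actor="urn:X-example:router"/>
    <ex:Session/>
  </SOAP-ENV:Header>
  <SOAP-ENV:Body/>
</SOAP-ENV:Envelope>"""

if __name__ == "__main__":
    print(local_names(headers_for(SAMPLE, {"urn:X-example:router"})))
    print(local_names(headers_for(SAMPLE, set(), is_final_recipient=True)))
```

An intermediary named urn:X-example:router would pick up the Trace and Route entries, while the final recipient would pick up Trace (if it is still present) and Session.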

The issue of what happens to a header that is processed by an intermediary is a little trickier. The SOAP specification states, "the role of a recipient of a header element is similar to that of accepting a contract in that it cannot be extended beyond the recipient." This means that the intermediary should remove any header targeted for it that it has processed. The intermediary is free to introduce a new header in the message that looks the same but then this constitutes a contract between the intermediary and the next application. The goal here is to reduce system complexity by requiring that contracts about the presence, absence, and content of information in SOAP messages be very narrow in scope—from the originator of that information to the first SOAP application that handles it and not beyond.
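
As a rough sketch of this rule, the following Python function removes the header entries addressed to an intermediary before the message is forwarded. It assumes, for simplicity, that the intermediary has successfully processed every entry targeted at it; consume_headers, the ex: entries, and the urn:X-example URIs are hypothetical names chosen for the illustration.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
ACTOR_NEXT = "http://schemas.xmlsoap.org/soap/actor/next"
ACTOR_ATTR = f"{{{SOAP_ENV}}}actor"

# Keep the conventional prefix when the envelope is serialized again.
ET.register_namespace("SOAP-ENV", SOAP_ENV)


def consume_headers(envelope_xml, my_actor_uri):
    """Drop the entries targeted at this intermediary; return (consumed tags, new XML)."""
    envelope = ET.fromstring(envelope_xml)
    header = envelope.find(f"{{{SOAP_ENV}}}Header")
    consumed = []
    if header is not None:
        # Iterate over a copy because entries are removed inside the loop.
        for entry in list(header):
            if entry.get(ACTOR_ATTR) in (ACTOR_NEXT, my_actor_uri):
                header.remove(entry)
                consumed.append(entry.tag)
    return consumed, ET.tostring(envelope, encoding="unicode")


# Invented message with entries for "next", for a named actor, and for the final recipient.
SAMPLE = """<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:ex="urn:X-example:headers">
  <SOAP-ENV:Header>
    <ex:Trace SOAP-ENV:actor="http://schemas.xmlsoap.org/soap/actor/next"/>
    <ex:Route SOAP-ENV:actor="urn:X-example:router"/>
    <ex:Session/>
  </SOAP-ENV:Header>
  <SOAP-ENV:Body/>
</SOAP-ENV:Envelope>"""

if __name__ == "__main__":
    consumed, forwarded = consume_headers(SAMPLE, "urn:X-example:router")
    print(consumed)
    print(forwarded)
```

Only the untargeted Session entry survives the hop. If the next application also needs tracing information, the intermediary would have to insert a fresh entry of its own, which is then a new contract between those two parties.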

Putting It All Together

To get a better sense of how you might use intermediaries in the real world, let's consider the potentially realistic albeit contrived example of SkatesTown's overall B2B integration architecture. Please keep in mind that all XML in the example is purely fictional—currently there isn't a standardized way to handle security and routing of SOAP messages.

SkatesTown needs to integrate various applications in several of its departments with some of its partners' applications (see Figure 3.8). Silver Bullet Consulting started working with the purchasing department building Web services to automate business functions such as checking inventory. Following the success of this engagement, Silver Bullet Consulting has been asked to use Web services to automate processes in other departments such as customer service. SkatesTown's corporate IT department is demanding centralized control over the entry point of all Web service requests to the company. They also require that all SOAP messages be transmitted over HTTPS for security reasons.

Figure 3.8. SkatesTown's system integration architecture.

At the same time, individual departments demand that their own IT units control the servers that run their own Web services. These servers have their own trust domains and are sitting deep inside the corporate network, invisible to the outside world. To address this issue, Silver Bullet Consulting develops a partner interface gateway SOAP application that acts as an intermediary between the partner applications sending SOAP messages and the department-level applications that are handling them. The gateway application is hosted on an application server that is visible to the partner applications. This server is managed by the corporate IT department. A firewall is configured to allow access to the gateway application from the partner networks only.

The gateway application has the responsibility to validate partners' security credentials and to route messages to the appropriate departmental SOAP applications. Security information and department server locations are available from SkatesTown's enterprise directory.

Here is an example message the gateway application might receive:

POST /bws/inventory/InventoryCheck HTTP/1.0
Host: partnergateway.skatestown.com
Content-Type: text/xml; charset="utf-8"
Content-Length: nnnn
SOAPAction: "/doCheck"

<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope
   SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
   xmlns:xsd="http://www.w3.org/2001/XMLSchema"
   xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
   <SOAP-ENV:Header>
       <td:TargetDepartment
         xmlns:td="http://www.skatestown.com/ns/partnergateway"
         SOAP-ENV:actor="urn:X-SkatesTown:PartnerGateway"
         SOAP-ENV:mustUnderstand="1">
           Purchasing
       </td:TargetDepartment>
       <ai:AuthenticationInformation
         xmlns:ai="http://www.skatestown.com/ns/security"
         SOAP-ENV:actor="urn:X-SkatesTown:PartnerGateway"
         SOAP-ENV:mustUnderstand="1">
           <username>PartnerA</username>
           <password>LongLiveSOAP</password>
       </ai:AuthenticationInformation>
   </SOAP-ENV:Header>
   <SOAP-ENV:Body>
      <doCheck>
         <arg0 xsi:type="xsd:string">947-TI</arg0>
         <arg1 xsi:type="xsd:int">1</arg1>
      </doCheck>
   </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

There are two header entries. The first identifies the target department as purchasing, and the second passes the authentication information of the message originator, partner A in this case. Both header entries are marked with mustUnderstand="1" because they are critical to the successful processing of the message. The partner gateway application is identified by the actor attribute as the place to process these.
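
To make the role of mustUnderstand concrete, here is a minimal Python sketch of the check a gateway could run before doing any work. SOAP 1.1 requires an application that cannot honor a mandatory entry addressed to it to fail, reporting a SOAP-ENV:MustUnderstand fault. The function name not_understood, the UNDERSTOOD set, and the extra ex:Priority entry are assumptions made for this example; the message is a trimmed version of the one above.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
ACTOR_ATTR = f"{{{SOAP_ENV}}}actor"
MUST_UNDERSTAND = f"{{{SOAP_ENV}}}mustUnderstand"
GATEWAY_ACTOR = "urn:X-SkatesTown:PartnerGateway"

# Header entries the (hypothetical) gateway implementation knows how to handle.
UNDERSTOOD = {
    "{http://www.skatestown.com/ns/partnergateway}TargetDepartment",
    "{http://www.skatestown.com/ns/security}AuthenticationInformation",
}


def not_understood(envelope_xml, actor=GATEWAY_ACTOR, understood=UNDERSTOOD):
    """List mandatory entries addressed to `actor` that it cannot process."""
    envelope = ET.fromstring(envelope_xml)
    header = envelope.find(f"{{{SOAP_ENV}}}Header")
    if header is None:
        return []
    problems = []
    for entry in header:
        if entry.get(ACTOR_ATTR) != actor:
            continue  # addressed to someone else
        if entry.get(MUST_UNDERSTAND) == "1" and entry.tag not in understood:
            problems.append(entry.tag)
    return problems


MESSAGE = """<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Header>
    <td:TargetDepartment
        xmlns:td="http://www.skatestown.com/ns/partnergateway"
        SOAP-ENV:actor="urn:X-SkatesTown:PartnerGateway"
        SOAP-ENV:mustUnderstand="1">Purchasing</td:TargetDepartment>
    <ai:AuthenticationInformation
        xmlns:ai="http://www.skatestown.com/ns/security"
        SOAP-ENV:actor="urn:X-SkatesTown:PartnerGateway"
        SOAP-ENV:mustUnderstand="1"/>
  </SOAP-ENV:Header>
  <SOAP-ENV:Body/>
</SOAP-ENV:Envelope>"""

# An invented mandatory entry the gateway has never heard of.
EXTRA = ('<ex:Priority xmlns:ex="urn:X-example:headers" '
         'SOAP-ENV:actor="urn:X-SkatesTown:PartnerGateway" '
         'SOAP-ENV:mustUnderstand="1">high</ex:Priority>')
MESSAGE_WITH_EXTRA = MESSAGE.replace(
    "</SOAP-ENV:Header>", EXTRA + "</SOAP-ENV:Header>")

if __name__ == "__main__":
    print(not_understood(MESSAGE))
    print(not_understood(MESSAGE_WITH_EXTRA))
```

A non-empty result means the gateway should reject the message with a MustUnderstand fault rather than forward it.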

After processing the message, the partner gateway application might forward the following message:

POST /bws/services/InventoryCheck HTTP/1.0
Host: purchasing.skatestown.com
Content-Type: text/xml; charset="utf-8"
Content-Length: nnnn
SOAPAction: "/doCheck"

<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope
   SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
   xmlns:xsd="http://www.w3.org/2001/XMLSchema"
   xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
   <SOAP-ENV:Header>
      <cc:ClientCredentials
         xmlns:cc="http://schemas.security.org/soap/security"
         SOAP-ENV:mustUnderstand="1">
         <ClientID>/External/Partners/PartnerA</ClientID>
      </cc:ClientCredentials>
   </SOAP-ENV:Header>
   <SOAP-ENV:Body>
      <doCheck>
         <arg0 xsi:type="xsd:string">947-TI</arg0>
         <arg1 xsi:type="xsd:int">1</arg1>
      </doCheck>
   </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

Note how the previous two header entries have disappeared. They were meant for the gateway application only. Having extracted the purchasing department's location from the enterprise directory, the gateway application forwards the message to purchasing.skatestown.com. A new header entry is meant for the final recipient of the message. The entry specifies the security identity of the message originator as /External/Partners/PartnerA. This identity was presumably obtained from SkatesTown's security system following the successful authentication of partner A. The applications in the purchasing department will use this identity to check whether partner A is authorized to perform the operation requested in the SOAP message body.
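
The gateway's transformation can be approximated in Python as follows. This is a sketch under several assumptions, not SkatesTown's actual implementation: the two dictionaries stand in for the enterprise directory, gateway_forward is an invented name, and the body is simplified to omit the xsi:type attributes. ElementTree does not preserve namespace declarations that are referenced only from attribute values such as xsi:type="xsd:string", so a real gateway would need an XML library that keeps them.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
ACTOR_ATTR = f"{{{SOAP_ENV}}}actor"
MUST_UNDERSTAND = f"{{{SOAP_ENV}}}mustUnderstand"
GATEWAY_ACTOR = "urn:X-SkatesTown:PartnerGateway"
TD_NS = "http://www.skatestown.com/ns/partnergateway"
AI_NS = "http://www.skatestown.com/ns/security"
CC_NS = "http://schemas.security.org/soap/security"

ET.register_namespace("SOAP-ENV", SOAP_ENV)
ET.register_namespace("cc", CC_NS)

# Hypothetical stand-ins for SkatesTown's enterprise directory.
DEPARTMENT_HOSTS = {"Purchasing": "purchasing.skatestown.com"}
PARTNER_IDS = {("PartnerA", "LongLiveSOAP"): "/External/Partners/PartnerA"}


def gateway_forward(envelope_xml):
    """Authenticate, route, and rewrite headers; return (target host, new XML)."""
    envelope = ET.fromstring(envelope_xml)
    header = envelope.find(f"{{{SOAP_ENV}}}Header")
    if header is None:
        raise ValueError("message has no SOAP header")
    # Element text may carry surrounding whitespace, so strip it.
    department = header.findtext(f"{{{TD_NS}}}TargetDepartment", default="").strip()
    auth = header.find(f"{{{AI_NS}}}AuthenticationInformation")
    if auth is None:
        raise ValueError("missing authentication header")
    key = (auth.findtext("username", default=""),
           auth.findtext("password", default=""))
    client_id = PARTNER_IDS.get(key)
    if client_id is None:
        raise PermissionError("authentication failed")
    host = DEPARTMENT_HOSTS.get(department)
    if host is None:
        raise LookupError(f"unknown department: {department!r}")
    # Remove the entries that were addressed to the gateway itself.
    for entry in list(header):
        if entry.get(ACTOR_ATTR) == GATEWAY_ACTOR:
            header.remove(entry)
    # Add a new entry, with no actor, for the final recipient.
    credentials = ET.SubElement(header, f"{{{CC_NS}}}ClientCredentials")
    credentials.set(MUST_UNDERSTAND, "1")
    ET.SubElement(credentials, "ClientID").text = client_id
    return host, ET.tostring(envelope, encoding="unicode")


INCOMING = """<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Header>
    <td:TargetDepartment
        xmlns:td="http://www.skatestown.com/ns/partnergateway"
        SOAP-ENV:actor="urn:X-SkatesTown:PartnerGateway"
        SOAP-ENV:mustUnderstand="1">
      Purchasing
    </td:TargetDepartment>
    <ai:AuthenticationInformation
        xmlns:ai="http://www.skatestown.com/ns/security"
        SOAP-ENV:actor="urn:X-SkatesTown:PartnerGateway"
        SOAP-ENV:mustUnderstand="1">
      <username>PartnerA</username>
      <password>LongLiveSOAP</password>
    </ai:AuthenticationInformation>
  </SOAP-ENV:Header>
  <SOAP-ENV:Body>
    <doCheck><arg0>947-TI</arg0><arg1>1</arg1></doCheck>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""

if __name__ == "__main__":
    host, forwarded = gateway_forward(INCOMING)
    print(host)
    print(forwarded)
```

Note that the body passes through untouched: the sketch never inspects doCheck, which mirrors the point that the gateway needs no knowledge of inventory checking.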

This example scenario shows that intermediaries bring significant capabilities to SOAP-enabled applications and can be introduced and implemented at a fairly low cost. The inventory check service implementation does not need to change. The partner gateway does not need to know anything about inventory checking; it only understands the target department and authentication headers. Inventory check clients only need to add a couple of headers to the messages they are sending to fit in the new architecture.