SOAP Protocol Bindings

So far in this chapter, we have only shown SOAP being transmitted over HTTP. SOAP, however, is transport-independent and can be bound to any protocol type. This section looks at some of the issues involved in building Web services and transporting SOAP messages over various protocols.

General Considerations

The key issue in deciding how to bind SOAP to a particular protocol has to do with identifying how the requirements for a Web service (RPC or not, interaction pattern, synchronicity, and so on) map to the capabilities of the underlying transport protocol. In particular, the task at hand is to determine how much of the total information needed to successfully execute the Web service needs to go in the SOAP message versus somewhere else.

As Figure 3.18 shows with an HTTP example, many protocols have a packaging notion. If SOAP is to be transmitted over such protocols, a distinction needs to be made between physical (transport-level) and logical (SOAP) messages. Context information can be passed in both. In the case of HTTP, context information is passed via the target URI and the SOAPAction header. Security information might come as HTTP username and password headers. In the case of SOAP, context information is passed as SOAP headers.

Figure 3.18. Logical versus physical messages.

graphics/03fig18.gif

Sometimes, SOAP messages have to be passed over protocols whose physical messages do not have any mechanism for storing context. Consider pure sockets-based exchanges. By default, in these cases the physical and the logical message are one and the same. In these cases, you have four options for passing context information:

By convention, as in, "When listening on port 12345, I know that I have to invoke service X."
By entirely using SOAP's header-based extensibility mechanism to pass all context information.
By custom-building a very light physical protocol under SOAP messages, as in, "The first CRLF delimited line of message will be the target URI; the rest will be the SOAP message."
By using a lightweight protocol that can be layered on top of the physical protocol and can be used to move SOAP messages. Examples of such protocols are Simple MIME Exchange Protocol (SMXP) or Blocks Extensible Exchange Protocol (BEEP).

As in most cases in the software industry, reinventing the wheel is a bad idea. Therefore, the second and fourth approaches listed here typically make the most sense. The first approach is not extensible and can leave you in a tight spot if requirements change. The third approach smells of reinventing the wheel. The cost of going with the second approach is that you have to make sure that all clients interacting with your Web service will be able to support the necessary extensions. The cost of going with the fourth approach is that it might require additional infrastructure for both requestors and providers.

Another consideration that comes into play is the interaction pattern supported by the transport protocol. For example, HTTP is a request-response protocol. It makes RPCs and request-response messaging interactions very simple. For other protocols, you might have to explicitly manage the association of requests and responses. As we mentioned in the previous section, Chapter 6 discusses this topic in more detail.

Contrary to popular belief, Web services do not have to involve stateless interactions. For example, Web services could be designed in a session-oriented manner. This is probably not the best design for a high-volume Web service, but it could work fine in many cases. HTTP sessions can be leveraged to provide context information related to the session. Otherwise, you will have to use a session ID of some kind, much in the same way a message conversation ID is used.

Finally, when choosing transport protocols for Web services, think carefully about external requirements. You may discover important factors entirely outside the needs of the Web service engine. For example, when considering Web services over sockets as a higher-performance alternative to Web services over HTTP (requests and responses don't have to go through the Web server), you might want to consider the following factors:

If services have to be available over a public unsecured network, is it an acceptable risk to open a hole through the firewall for Web service traffic?
Can clients support SSL to ensure the privacy of messages? Surprisingly, some clients can speak HTTPS but not straight SSL.
What are the back-end load balancing and failover requirements? Straight sockets-based communication requires sticky load balancing. You establish a session with one server and you have to keep using this server. This approach potentially compromises scalability and failover, unless steps are taken to build request redirection and session persistence and failover capabilities into the system.

As with most things in the software industry, there is no single correct approach and no single right answer. Investigate your requirements carefully and do not be easily tempted by seemingly exciting, out-of-the-ordinary solutions. The rest of this section provides some more details about how certain protocols can be used with SOAP.

HTTP/S

This chapter has shown many examples of SOAP over HTTP. The SOAP specification defines a binding of SOAP over HTTP with the following set of rules:

The MIME media type of both HTTP requests and responses (defined in the Content-Type HTTP header) must be text/xml.
Requests must come as HTTP POST operations.
The SOAPAction header is reserved as a hint to the SOAP processor as to which Web service is being invoked. The value of the header can be any URI; it is implementation-specific.
Successful SOAP message processing must return an HTTP error code in the 200 range. Typically, this is 200 OK.
In the case of an error generated while processing the SOAP message, the HTTP response code must be 500 Internal Server Error and it must include a SOAP message with a Fault element describing the error.

In addition to these simple rules, the SOAP specification defines how SOAP messages can be exchanged over HTTP using the HTTP Extension Framework (RFC 2774, http://www.normos.org/ietf/rfc/rfc2774.txt), but this information is not very relevant to us.

In short, HTTP is the most commonly used mechanism for exchanging SOAP messages. It is aided by the industry's experience building relatively secure, scalable, reliable networks to handle HTTP traffic and by the fact that traditional Web applications and application servers primarily use HTTP. HTTP is not perfect, but we are very good at working around its limitations.

For secure message exchanges, you can use HTTPS instead of HTTP. The most common extension on top of what the SOAP specification describes is the use of HTTP usernames and passwords to authenticate Web service clients. Combined with HTTPS, this approach offers a good-enough level of security for most e-commerce scenarios. Chapter 5 discusses the role of HTTPS in Web services.

SOAP Messages with Attachments

SOAP messages will often have attachments of various types. The prototypical example is an insurance claim form in XML format that has an accident picture associated with it and/or a scanned copy of the signed accident report form. The SOAP Messages with Attachments specification defines a simple mechanism for encoding a SOAP message in a MIME multipart structure and associating this message with any number of parts (attachments) in that structure. These attachments can be in their native format, which is typically binary.

Without going into too many details, the SOAP message becomes the root of the multipart/related MIME structure. The message refers to attachments using a URI with the cid: prefix, which stands for "content ID" and uniquely identifies the parts of the MIME structure. Here is how this is done. Note that some long lines (such as the Content-Type header) have been broken in two for better readability:

MIME-Version: 1.0
Content-Type: Multipart/Related; boundary=MIME_boundary; type=text/xml;
        start="<claim061400a.xml@claiming-it.com>"
Content-Description: This is the optional message description.

--MIME_boundary
Content-Type: text/xml; charset=UTF-8
Content-Transfer-Encoding: 8bit
Content-ID: <claim061400a.xml@claiming-it.com>

<?xml version='1.0' ?>
<SOAP-ENV:Envelope
xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Body>
..
<theSignedForm href="cid:claim061400a.tiff@claiming-it.com"/>
..
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

--MIME_boundary
Content-Type: image/tiff
Content-Transfer-Encoding: binary
Content-ID: <claim061400a.tiff@claiming-it.com>

...binary TIFF image...
--MIME_boundary--

One excellent thing about encapsulating SOAP messages in a MIME structure is that the packaging is independent of an actual transport protocol. In a sense, the MIME package is another logical message on top of the SOAP message. This type of MIME structure can then be bound to any number of other protocols. The specification defines a binding to HTTP, an example of which is shown here:

POST /insuranceClaims HTTP/1.1
Host: www.risky-stuff.com
Content-Type: Multipart/Related; boundary=MIME_boundary; type=text/xml;
        start="<claim061400a.xml@claiming-it.com>"
Content-Length: XXXX
SOAPAction: http://schemas.risky-stuff.com/Auto-Claim
...

SOAP over SMTP

E-mail is pervasive on the Internet. The important e-mail-related protocols are Simple Mail Transfer Protocol (SMTP), Post Office Protocol (POP), and Internet Message Access Protocol (IMAP). E-mail is a great way to exchange SOAP messages when synchronicity is not required because:

E-mail messages can easily carry SOAP messages.
E-mail messages have extensible headers that can be used to transmit context information outside the SOAP message body.
Both sending and receiving of e-mail messages can be configured to require authentication. Further, using S/MIME with e-mail provides additional security for a range of applications.
E-mail can support one-to-one and one-to-many participant configurations.
E-mail messaging is buffered and queued with reliable dispatch that automatically includes multiple retries and failed delivery notification.
The Internet e-mail server infrastructure is highly scalable.

Together, these factors make e-mail a very suitable alternative to HTTP for asynchronous Web service messaging applications.

Other Protocols

Despite its low-tech nature, FTP can be very useful for simple one-way messaging using Web services. Access to FTP servers can be authenticated. Further, roles-based restrictions can be applied to particular directories on the FTP server. When using FTP, SOAP messages are mapped onto the files that are being transferred. Typically, the file names indicate the target of the SOAP message.

In addition, with companies such as Microsoft backing SMXP for their Hailstorm initiatives, the protocol is emerging as a potential candidate to layer on top of straight socket-based communications for transmission of SOAP messages.

Finally, sophisticated messaging infrastructures such as IBM's MQSeries, Microsoft's Message Queue (MSMQ), and the Java Messaging Service (JMS) are well-suited for the transport of SOAP messages. Chapter 5 shows an example of SOAP messaging using JMS.

The key constraint limiting the wide deployment of SOAP bindings to protocols other than HTTP and e-mail is the requirement of Web service interoperability. HTTP and e-mail are so pervasive that they are likely to remain the preferred choices for SOAP message transport for the foreseeable future.