In early 2002, Sun Microsystems released the first version of its Java Web Services Developer Pack (JWSDP). This large download contained everything that a developer needed to begin creating web services using the Java platform. When it appeared, the questions that most developers immediately asked were: what exactly is a web service, and why should I be interested in finding out how to build or use one? If, at that time, you looked around the bookstores and the Internet for an answer to these questions, your conclusion would most likely have been that there was plenty of hype, promise, and marketing talk from companies interested in promoting web services to other companies (and in particular to their Chief Technology Officers), but very little that would be of real use to hands-on developers trying to come to terms with a new technology.
Even today, a full year later, it is still difficult to find a consistent definition of what constitutes a web service. The most useful definition that I have been able to find is the following, which appears in the Web Services Architecture document published by the World Wide Web Consortium (W3C), available from their web site at http://www.w3.org/TR/ws-arch:
A web service is a software system identified by a URI, whose public interfaces and bindings are defined and described using XML. Its definition can be discovered by other software systems. These systems may then interact with the Web service in a manner prescribed by its definition, using XML-based messages conveyed by Internet protocols.
In essence, then, a web service is something that provides an interface defined in terms of XML messages and that can be accessed over the Internet (or, of course, an Intranet). What about looking at some real examples of web services to see what they are actually being used for? This is where it gets a little bit harder. At the present time, there aren't many real web services deployed and available on the Internet, although it is expected that this situation will change as web service standards, in particular those related to security, are published and start being implemented over the next year or so.
A good place to look for example web services is the XMethods web site at http://www.xmethods.com, which describes itself as a "virtual laboratory" for developers, allowing them to showcase the ways in which web services can be used. Here, you'll find a wide range of services implemented using various technologies; for example:
A route finder that provides an optimal route between two or more locations in the form of directions or a map
A service that locates synonyms for a given word
A stock quote service that provides stock prices, updated every 15 minutes
An online dictionary
A weather service
A POP3 client that allows you to access your mailbox
If you go to the XMethods web site and follow a link to any of these services, you won't find yourself directly connected to the service itself. Instead, you'll be presented with a page that tells you, among other things, where to get a service definition and who to contact for further information. If you want to actually use the service, you'll need to write your own client application. Services like these are, of course, already freely available to anyone with a web browser and a connection to the Internet (or even a cell phone!). Why bother to invent a new way of delivering them, which also puts the onus on the service consumer to write or obtain the client-side software? The answer to this question goes to the very heart of the movement towards web services — the need to perform business-to-business transactions using open but secure protocols over the Internet.
To see why the current web application model is not sufficient for business-to-business commerce and why it is also quite limiting when your client is a human consumer, consider the case of the online bookstore Amazon.com. Amazon.com has one of the best-known web sites on the Internet. Book buyers use its facilities to browse for and purchase books (and, these days, a wide range of other products), while publishers and authors use its sales ranking and reader reviews to get a feel for public reaction to their work. If you want to find a good Java book, all you have to do is use the site's search facilities to locate a few titles, read the reviews, and place your order. At each stage, the site sends you a page of HTML that your browser renders for you, and you respond by clicking a link or filling in a form to move to the next stage.
Although this is convenient for low-volume searches conducted by humans, it is not nearly as useful if you want to extract and collate information from the site. Suppose, for example, that you are a publisher (or an author) wanting to keep track of the sales rankings of a group of books on a daily basis. To achieve this using the HTML-based interface provided by Amazon's web site, you need to bookmark the page for each book you are interested in, reload each of those pages every day, and manually extract the sales ranking and the latest customer reviews. If you are a little more technically minded, you could automate this process somewhat by writing a client application that reads the HTML and extracts the information using screen-scraping techniques.[1] While this is perfectly feasible, it is less than ideal, for the following reasons:
[1] You'll find an example that demonstrates how to write such a client for a cell phone in J2ME in a Nutshell, by Kim Topley (O'Reilly).
Amazon.com web pages contain a lot of content. This makes them large documents — often in excess of 10 kilobytes. In reality, you need only a very small portion of the information that each of them contains.
Screen-scraping programs, by their nature, are very reliant on the layout of the information source that they are analyzing (in this case, the HTML produced by Amazon.com's web servers). Unfortunately, web site designers have a habit of changing their page layouts from time to time, and these changes can invalidate the algorithm that your application uses to locate the small part of the information buried in the HTML markup that it actually needs.
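The fragility described above can be illustrated with a minimal sketch. It assumes a purely hypothetical page layout in which the ranking appears as `<b>Sales Rank:</b> 1,234`; this is not Amazon.com's actual markup, and the class and method names are invented for illustration.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative screen scraper; the HTML layout it expects is hypothetical. */
class SalesRankScraper {

    // Tied to one specific (assumed) layout: <b>Sales Rank:</b> 1,234
    private static final Pattern RANK =
            Pattern.compile("<b>Sales Rank:</b>\\s*(\\d[\\d,]*)");

    /** Returns the sales rank, or empty if the expected layout is not found. */
    static OptionalInt extractRank(String html) {
        Matcher m = RANK.matcher(html);
        if (!m.find()) {
            return OptionalInt.empty();
        }
        // Strip thousands separators before parsing the number.
        return OptionalInt.of(Integer.parseInt(m.group(1).replace(",", "")));
    }

    public static void main(String[] args) {
        String original = "<html><body><h1>Some Book</h1>"
                + "<b>Sales Rank:</b> 1,234</body></html>";
        // Same information, but the designer has changed the markup.
        String redesigned = "<html><body><h1>Some Book</h1>"
                + "<span class=\"rank\">Sales Rank: 1,234</span></body></html>";

        System.out.println(extractRank(original));
        System.out.println(extractRank(redesigned));
    }
}
```

The first call finds the ranking, while the second comes back empty even though the page still contains the same number: a cosmetic redesign is enough to defeat the scraper.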
The root cause of these problems is the use of HTML to convey data. HTML is, of course, reasonably good at the job it was designed for — combining raw data with markup that specifies how it is to be presented and links that allow related information to be obtained. If you're only looking for the sales ranking of your book, all you really want is a single number — you certainly don't need lots of additional tags that tell you how to present the information. This is exactly the kind of situation in which, if you had control over the server, you would choose to use XML rather than HTML to encapsulate the data, so that a client that is interested only in the raw content would not need to concern itself with stripping out the markup.
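To make the contrast concrete, here is a rough sketch of what a client might do if the same data arrived as XML. The `<book>` document and its `<salesRank>` element are assumptions made up for this example and do not reflect any real Amazon.com format; only the standard JAXP DOM parser is used.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Reads a sales rank from a hypothetical XML book description. */
class SalesRankXmlReader {

    static int readRank(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            // The element name "salesRank" is an assumption for this sketch.
            Node rank = doc.getElementsByTagName("salesRank").item(0);
            if (rank == null) {
                throw new IllegalArgumentException("no <salesRank> element found");
            }
            return Integer.parseInt(rank.getTextContent().trim());
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse book XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<book><title>Some Book</title>"
                + "<salesRank>1234</salesRank></book>";
        System.out.println(readRank(xml));
    }
}
```

Because the client asks for an element by name rather than searching for it among presentation tags, the provider can add or reorder other elements without breaking this code.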
If you look back at the definition of a web service cited earlier in this chapter, you'll see that if Amazon.com provided a web service interface to its bookstore and exposed the appropriate information in XML form, authors and publishers would have an easier way to find out how their books are performing. In fact, in mid-2002, Amazon.com did exactly that. The Amazon.com web service is one of the few commercial web services currently available on the Internet. As well as writing private client applications to extract specific book-related information, web service developers can use this service to create their own web sites that incorporate information obtained from Amazon.com, without having to present it in the same way as it appears on the Amazon.com web site. Figure 1-1 shows an example software architecture that might be used to do this.
In this diagram, an end user using a web browser visits the web site of MyXMLBooks.com, a fictional company that, among other things, is a member of the Amazon.com Associates program. This allows MyXMLBooks.com to earn royalties on sales of books made via its own web site. MyXMLBooks.com has previously used click-through links that display Amazon.com's own web pages when the user selects a book advertised on its web site, but now wants to make use of Amazon's web service to obtain raw information and present it in a way that is more consistent with the other pages on its site.
When the user selects a book from one of MyXMLBooks.com's web pages, the HTTP request generated is routed via a controlling servlet on the MyXMLBooks.com web server, which determines that it needs to retrieve raw book data from Amazon.com. The servlet obtains this data by using a web service client implemented by MyXMLBooks.com's developers. This client uses the web service interface published by Amazon.com to invoke a method on its server that returns the required information. The method invocation is performed by creating an XML message that contains the method name and any required parameters and then sending it to Amazon.com's server using the SOAP protocol, which is discussed later in this chapter. The value (or values) returned by the method call are then wrapped in another XML message and sent back to MyXMLBooks.com's web service client, which extracts the information that it needs and uses a JSP to render it as HTML. The HTML is then returned to the user's browser.[2]
[2] Although this example uses a web browser as the client, it is equally possible to create a rich client (using Swing, for example) that would connect directly to the web service and present the results on the user's desktop rather than via a browser.
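As a rough illustration of the XML request message described above, the following sketch builds a SOAP 1.1 envelope that carries a method name and one parameter. The method name `getBookInfo`, the parameter `isbn`, and the namespace `urn:example:bookstore` are hypothetical and are not taken from Amazon.com's actual interface. The sketch uses only the standard DOM and transformation APIs and does not send the message anywhere; a real client would more likely use a SOAP toolkit than build the message by hand.

```java
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Builds an illustrative SOAP 1.1 request; service names are hypothetical. */
class SoapRequestSketch {

    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";
    static final String SERVICE_NS = "urn:example:bookstore"; // assumed namespace

    static String buildRequest(String isbn) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();

            // <soap:Envelope><soap:Body> ... </soap:Body></soap:Envelope>
            Element envelope = doc.createElementNS(SOAP_NS, "soap:Envelope");
            envelope.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI,
                    "xmlns:soap", SOAP_NS);
            doc.appendChild(envelope);
            Element body = doc.createElementNS(SOAP_NS, "soap:Body");
            envelope.appendChild(body);

            // The method name becomes an element; parameters become its children.
            Element method = doc.createElementNS(SERVICE_NS, "bk:getBookInfo");
            method.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI,
                    "xmlns:bk", SERVICE_NS);
            body.appendChild(method);
            Element param = doc.createElementNS(SERVICE_NS, "bk:isbn");
            param.setTextContent(isbn);
            method.appendChild(param);

            // Serialize the DOM tree to a string.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("could not build SOAP request", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildRequest("0000000000")); // placeholder ISBN
    }
}
```

The response would travel back in a similar envelope, with the returned values taking the place of the method element inside the body.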
Figure 1-1 represents what will probably be a fairly typical use of a web service. Notice in particular that the direct user of the web service is not a human, but a web server. In fact, this diagram shows both a Business-to-Consumer (B2C) transaction performed using HTML over HTTP, and a Business-to-Business (B2B) transaction, which is the domain of web services and uses XML-based messaging.
Once MyXMLBooks.com adopts this architecture, which separates the presentation of information from the means by which it is obtained, it is relatively simple for it to add further features. For example, if other online booksellers begin to offer a web service interface, MyXMLBooks.com could provide a consolidated service that routes user requests to the vendor that provides the best price or shortest delivery time for the items that the user wants to buy, or it could query all of the available providers for their prices and delivery time commitments and then allow the user to make the choice. Although all of this could be done using screen scraping, using a web service instead has the following advantages:
Less data will need to be transferred because the useful information does not need to be accompanied by presentation markup.
The code required to make a request of a web service is much simpler than that required to extract data from an HTML page.
If, in the future, a standard interface were to be defined for online booksellers, MyXMLBooks.com would need to write only a single client in order to talk to multiple booksellers.
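The last point, together with the consolidated best-price service mentioned earlier, can be sketched in a few lines. No standard bookseller interface is defined here, so `BooksellerService`, its methods, and the stub vendors below are all hypothetical; in a real system each implementation would wrap a web service call rather than a fixed price table.

```java
import java.math.BigDecimal;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical common interface that every bookseller would expose. */
interface BooksellerService {
    String vendorName();

    /** Price for the given ISBN, or empty if the vendor does not stock it. */
    Optional<BigDecimal> priceFor(String isbn);
}

/** Stub vendor backed by a fixed price table; stands in for a remote service. */
class FixedPriceBookseller implements BooksellerService {
    private final String name;
    private final Map<String, BigDecimal> prices;

    FixedPriceBookseller(String name, Map<String, BigDecimal> prices) {
        this.name = name;
        this.prices = prices;
    }

    public String vendorName() {
        return name;
    }

    public Optional<BigDecimal> priceFor(String isbn) {
        return Optional.ofNullable(prices.get(isbn));
    }
}

class BestPriceFinder {
    /** Queries every vendor through the same interface and picks the cheapest. */
    static Optional<BooksellerService> cheapest(List<BooksellerService> vendors,
                                                String isbn) {
        return vendors.stream()
                .filter(v -> v.priceFor(isbn).isPresent())
                .min(Comparator.comparing(
                        (BooksellerService v) -> v.priceFor(isbn).get()));
    }

    public static void main(String[] args) {
        String isbn = "0000000000"; // placeholder ISBN
        List<BooksellerService> vendors = List.of(
                new FixedPriceBookseller("VendorA", Map.of(isbn, new BigDecimal("39.95"))),
                new FixedPriceBookseller("VendorB", Map.of(isbn, new BigDecimal("34.50"))));

        cheapest(vendors, isbn)
                .ifPresent(v -> System.out.println("Best price from " + v.vendorName()));
    }
}
```

Adding another bookseller then means adding one more element to the list, with no changes to the code that compares prices.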