Why web services?
Overview
The popularity of component-based programming continues to grow. It is rare to find an application that doesn’t use components from several vendors. As applications have become more complex, they have also come to rely on components that reside on remote machines.
One example of a component-based application is an end-to-end e-commerce solution. An e-commerce application running on a Web farm must submit orders to a back-end Enterprise Resource Planning (ERP) application. The ERP application may reside on different hardware or run on a different operating system.
Microsoft Distributed Component Object Model (DCOM) is a distributed object infrastructure that allows an application to invoke Component Object Model (COM) components installed on another server. It has been ported to a number of non-Windows platforms, but it has never gained wide acceptance on them, so it is seldom used for communication between Windows and non-Windows computers. Instead, ERP software vendors frequently create components for Windows that communicate with the back-end system using a proprietary protocol.
Some services that an e-commerce application relies on might not reside in the datacenter at all. If the e-commerce application accepts credit card payments for goods purchased by customers, it must contact the merchant bank to process the customer’s credit card information. Technologies such as DCOM, CORBA, and Java RMI are limited to applications and components within the corporate datacenter, for two main reasons: by default they use proprietary protocols, and those protocols are inherently connection-oriented.
Clients that communicate with a server over the Internet face many potential barriers. Network administrators around the world have configured corporate firewalls and routers to block virtually every form of Internet communication. It often takes an act of God to get a network administrator to open ports beyond the bare minimum.
Even if you are lucky enough to get a network administrator to open the right ports for your service, your clients may not be so fortunate. As a result, proprietary protocols such as those used by DCOM, CORBA, and Java RMI are not practical for Internet scenarios.
These technologies are also connection-oriented, so they cannot handle network interruptions gracefully. The Internet is outside your control, so you can’t make assumptions about its quality or reliability. If a network interruption occurs, the client’s next call to the server might fail.
Their connection-oriented nature also makes it difficult to build the load-balanced infrastructures needed for high scalability. Once the connection between client and server has been severed, you cannot simply route the next request to another server.
Developers have tried to overcome these limitations with a model known as stateless programming. They have had limited success, however, because the technologies are fairly heavy and make it expensive to reestablish a connection with a remote object.
Because the customer’s credit card is processed by a remote server on the Internet, DCOM is not a good fit for communication between the e-commerce client and the credit card processing server. As with the ERP solution, a third-party component is usually installed in the client’s datacenter, in this case by the credit card processing provider. This component is little more than a proxy that facilitates communication between the e-commerce software and the merchant bank via a proprietary protocol.
Do you see a pattern? Because existing technologies are so limited in facilitating communication between computer systems, software vendors often resort to building their own infrastructure. Effort that could have gone into adding functionality to the ERP system or the credit card processing system goes into communication plumbing instead.
Microsoft initially tried to support these Internet scenarios by augmenting its existing technologies. This strategy included COM Internet Services (CIS), which allows a DCOM connection to be established between a client and a remote component over port 80. For a variety of reasons, CIS was not widely adopted.
It became clear that a new approach was needed, so Microsoft decided to tackle the problem from the bottom up. Let’s look at the requirements the solution had to meet in order to succeed.
Interoperability: The remote service must be accessible by clients on other platforms.
Internet friendliness: The solution should work well for clients that access the remote service over the Internet.
Strongly typed interfaces: There should be no ambiguity about the type of data sent to and received from a remote service. Datatypes defined by the remote service should also map reasonably well to the datatypes defined by most procedural programming languages.
Use of existing Internet standards: The implementation of the remote service should take advantage of existing Internet standards wherever possible and avoid reinventing solutions to problems that have already been solved. A solution built on widely adopted standards can make use of existing toolsets and products.
Support for any language: The solution should not be tied to one programming language. Java RMI, for instance, is tightly coupled to Java, so it would be difficult to access a remote Java object from Visual Basic or Perl. A client should be able to implement a new Web service or use an existing one regardless of the programming language in which the client was written.
Support for any distributed component infrastructure: The solution should not be tied to any one component infrastructure. You shouldn’t have to buy, install, or maintain a distributed object infrastructure just to create a remote service or use an existing one. The underlying protocols should also allow a basic level of communication between existing distributed object infrastructures, such as DCOM and CORBA.
It should come as no surprise that Microsoft’s solution is known as Web services. A Web service exposes an interface through which a client can invoke a specific activity. A client accesses the Web service using Internet standards.
Building blocks for web services
This graphic illustrates the basic building blocks required to enable remote communication between two applications.
Let’s talk about the purpose of each one of these building blocks. Many readers will be familiar with DCOM so I will also include the DCOM equivalent for each building block.
Discovery: A client application that needs functionality exposed by a Web service must be able to resolve the location of the remote service. This is done through a process generally called discovery. Discovery can be facilitated by a centralized directory as well as by more ad hoc methods. In DCOM, the Service Control Manager (SCM) provides discovery services.
Description: Once the end point of a Web service has been resolved, the client needs enough information to interact with it properly. The description of a Web service includes structured metadata about the interface that client applications are meant to use, as well as written documentation, including usage examples. A DCOM component exposes structured metadata about its interfaces through a type library (typelib). The metadata in a component’s typelib is stored in a proprietary binary format and is accessed through a proprietary application programming interface (API).
Message format: To exchange data, a client and a server must agree on a common way to format and encode messages. A standard format ensures that the server can properly interpret data sent by the client. In DCOM, messages sent between a client and a server are formatted according to the DCOM Object RPC (ORPC) protocol.
Without a standard message format, it is almost impossible to build a toolkit that abstracts the developer from the underlying protocols. An abstraction layer between the developer and the underlying protocols lets the developer concentrate on the business problem rather than on the infrastructure needed to implement the solution.
Encoding: Data sent between the client and the server must be encoded into the body of the message. DCOM uses a binary encoding scheme to serialize the parameter data exchanged between client and server.
Transport: Once the message has been formatted and the data serialized into its body, the message must be transmitted between the client and the server over some transport protocol. DCOM supports a number of proprietary protocols bound to network protocols such as TCP, SPX, and NetBEUI.
Web Services Design Decisions
Let’s talk about some of the design decisions that went into these building blocks for web services.
How to Choose Transport Protocols
First, determine how the client and the server will communicate. The client and server can reside on the same LAN, but the client might also communicate with the server over the Internet. The transport protocol must therefore be suited to both LAN and Internet environments.
Technologies such as DCOM, CORBA, and Java RMI are not well suited to communication between clients and servers over the Internet. Protocols such as Hypertext Transfer Protocol (HTTP) and Simple Mail Transfer Protocol (SMTP), by contrast, are proven Internet protocols. HTTP defines a request/response messaging pattern: you submit a request and receive a response. SMTP defines a routable messaging protocol for asynchronous communication. Let’s look at why HTTP and SMTP are well suited to the Internet.
HTTP-based Web applications are inherently stateless. They don’t rely on a continuous connection between the client and the server. This makes HTTP an ideal protocol for high-availability configurations such as load-balanced server farms. If the server that handled the client’s original request becomes unavailable, subsequent requests can automatically be routed to another server without the client even knowing.
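The failover behavior described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular load balancer works: the `Server` class, the `route` function, and the server names are hypothetical, and each request carries all of the state needed to process it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Server:
    """Hypothetical stand-in for one machine in a Web farm."""
    name: str
    available: bool = True

    def handle(self, request: dict) -> str:
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        # The request carries everything needed to process it, so the
        # server keeps no per-client state between calls.
        return f"{self.name} processed order {request['order_id']}"


def route(request: dict, servers: List[Server]) -> str:
    """Try each server in turn until one handles the request."""
    for server in servers:
        try:
            return server.handle(request)
        except ConnectionError:
            continue  # fail over to the next server
    raise RuntimeError("no server available")


farm = [Server("web1"), Server("web2")]
first = route({"order_id": 1001}, farm)

farm[0].available = False  # web1 goes down between requests
second = route({"order_id": 1002}, farm)

print(first)
print(second)
```

Because no server holds state on behalf of the client, the second request succeeds on `web2` even though `web1` handled the first one.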
Almost all companies support communication over the Internet via SMTP. SMTP is well suited to asynchronous communication. If service is interrupted, the e-mail infrastructure handles retries automatically. Unlike with HTTP, you can hand an SMTP message to a local mail server, which will attempt to deliver it on your behalf.
Another advantage of both HTTP and SMTP is their ubiquity. Employees have come to depend on both their Web browsers and e-mail, so network administrators are comfortable supporting these services. Technologies such as proxy servers and network address translation (NAT) provide HTTP access to the Internet from within corporate LANs. Administrators will often expose an SMTP server that sits inside the firewall. Messages sent to this server are then routed to their final destination via the Internet.
In the case of credit card processing software, an immediate response from the merchant bank is needed to decide whether the order should be submitted to the ERP system. HTTP, with its request/response messaging pattern, is well suited to this task.
Most ERP software packages cannot handle the volume of orders that an e-commerce application could generate. Nor is it necessary for orders to be submitted to the ERP system in real time. SMTP can therefore be used to queue orders so that the ERP system can process them serially.
If the ERP system supports distributed transactions, Microsoft Message Queue Server (MSMQ) is another option. As long as the e-commerce application and the ERP system reside on the same LAN, connectivity over non-Internet protocols is less of an issue. The advantage of MSMQ over SMTP is that messages can be added to and removed from the queue within the scope of a transaction. If an attempt to process a message pulled from the queue fails, the message is automatically placed back in the queue.
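The put-back behavior can be illustrated with a small in-memory sketch. It does not use MSMQ or any real queuing API: `TransactionalQueue`, `send`, and `receive` are hypothetical names chosen only to show the idea that a failed transaction returns the message to the queue.

```python
from collections import deque
from contextlib import contextmanager


class TransactionalQueue:
    """Toy in-memory queue; a hypothetical model, not the MSMQ API."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    @contextmanager
    def receive(self):
        """Yield the next message; put it back if processing raises."""
        message = self._messages.popleft()
        try:
            yield message
        except Exception:
            # Transaction aborted: return the message to the front.
            self._messages.appendleft(message)
            raise

    def __len__(self):
        return len(self._messages)


orders = TransactionalQueue()
orders.send({"order_id": 1001})

try:
    with orders.receive() as order:
        raise IOError("ERP system rejected the order")
except IOError:
    pass

remaining_after_failure = len(orders)  # the message was put back

with orders.receive() as order:
    processed = order["order_id"]  # processing succeeds this time

remaining_after_success = len(orders)
```

The order survives the failed attempt and is consumed only when processing completes without an error.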
How to choose an encoding scheme
HTTP and SMTP both provide a way to send data between a client and a server. However, neither protocol specifies how the data in the body of the message should be encoded. Microsoft needed a standard, platform-neutral way to encode data exchanged between the client and the server.
Because the goal was to build on Internet-based protocols, Extensible Markup Language (XML) was the natural choice. XML offers many benefits, including cross-platform support, a common type system, and support for industry-standard character sets.
Binary encoding schemes, such as those used by DCOM, CORBA, and Java RMI, must deal with compatibility issues between different hardware platforms. For example, different hardware platforms use different internal binary representations of multi-byte numbers. Intel platforms order the bytes of a multi-byte number using the little endian convention; many RISC processors use the big endian convention.
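The byte-order problem is easy to demonstrate with Python's standard `struct` module. The value `0x12345678` is an arbitrary example; the sketch shows only that the same number has two byte layouts and that reading one as the other silently produces a wrong result.

```python
import struct

value = 0x12345678

little = struct.pack("<I", value)  # little endian, as on Intel platforms
big = struct.pack(">I", value)     # big endian, as on many RISC processors

print(little.hex())  # 78563412
print(big.hex())     # 12345678

# Reading little-endian bytes as if they were big endian gives a wrong value.
misread = struct.unpack(">I", little)[0]
print(hex(misread))  # 0x78563412
```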
XML avoids these binary encoding issues because it uses a text-based encoding scheme built on standard character sets. In addition, some transport protocols, such as SMTP, can carry only text-based messages.
Binary encoding methods such as those used by DCOM and CORBA are cumbersome and require a supporting infrastructure to abstract the developer from the details. XML is much lighter and easier to handle because it can be created and consumed using standard text-parsing techniques.
A variety of XML parsers are also available to simplify the creation and consumption of XML documents on virtually every platform. Because XML is lightweight and well supported by tools, XML encoding gives you incredible reach: almost any client on any platform can communicate with your Web service.
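As a rough illustration of how little is needed to produce and consume XML, the following sketch uses the `xml.etree.ElementTree` parser from the Python standard library. The `Order`, `OrderId`, and `Amount` element names are invented for the example and do not come from any real service.

```python
import xml.etree.ElementTree as ET

# Build a small order document; the element names are illustrative only.
order = ET.Element("Order")
ET.SubElement(order, "OrderId").text = "1001"
ET.SubElement(order, "Amount").text = "49.95"

# Serialize to text. The result contains only characters, so it can
# travel over text-only transports.
xml_text = ET.tostring(order, encoding="unicode")
print(xml_text)

# Any XML parser on the receiving side can read the values back.
parsed = ET.fromstring(xml_text)
order_id = int(parsed.findtext("OrderId"))
amount = float(parsed.findtext("Amount"))
```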
How to choose a formatting convention
It is often necessary to include additional metadata along with the body of a message. For example, you might want to include information about the services the Web service must provide to fulfill your request, such as enlisting in a transaction or routing the message. XML by itself provides no mechanism for distinguishing the body of a message from its associated data.
Transport protocols such as HTTP offer an extensible mechanism for header data, but some data associated with a message is not specific to any one transport protocol. For example, a client might send a message that must be routed to multiple destinations, possibly over different transport protocols. If the routing information were placed in an HTTP header, it would have to be translated before the message was sent to the next intermediary over another transport protocol such as SMTP. Because the routing information is specific to the message and not to the transport protocol, it should be part of the message.
Simple Object Access Protocol (SOAP) provides a protocol-agnostic way to associate header information with the body of a message. Every SOAP message must define an envelope. The envelope contains a body, which holds the payload of the message, and may contain a header, which holds metadata about the message.
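A minimal sketch of this envelope structure, built with the Python standard library, might look as follows. The envelope namespace is the one defined by SOAP 1.1; the `urn:example:orders` namespace and the `TransactionId`, `SubmitOrder`, and `OrderId` elements are hypothetical and stand in for whatever a real service would define.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
APP_NS = "urn:example:orders"                           # hypothetical

ET.register_namespace("soap", SOAP_NS)

envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")

# Metadata about the message goes in the header (hypothetical element).
ET.SubElement(header, f"{{{APP_NS}}}TransactionId").text = "abc-123"

# The payload goes in the body (hypothetical elements).
order = ET.SubElement(body, f"{{{APP_NS}}}SubmitOrder")
ET.SubElement(order, f"{{{APP_NS}}}OrderId").text = "1001"

message = ET.tostring(envelope, encoding="unicode")
print(message)

# A receiver can separate metadata from payload regardless of transport.
parsed = ET.fromstring(message)
header_children = [child.tag for child in parsed.find(f"{{{SOAP_NS}}}Header")]
body_children = [child.tag for child in parsed.find(f"{{{SOAP_NS}}}Body")]
```

Nothing in the message depends on HTTP or SMTP, so the same text can be handed to either transport unchanged.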
SOAP imposes no restrictions on how the message body is formatted. This is a potential concern: without a consistent way to encode data, it is difficult to create a toolkit that abstracts you from the underlying protocols. You might end up spending more time getting up to speed on a Web service’s interface than solving the business problem.
What was needed was a standard way of formatting a remote procedure call (RPC) message and encoding its list of parameters. This is precisely what Section 7 of the SOAP specification provides. It defines a standard naming convention and encoding style for procedure-oriented messages.
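The naming convention can be sketched as follows, assuming SOAP 1.1: the call is represented by a single element named after the method, with one child element per parameter, and by convention the response element is named after the method with `Response` appended. The `AuthorizePayment` method, its parameters, and the `urn:example:merchant-bank` namespace are hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/"
BANK_NS = "urn:example:merchant-bank"  # hypothetical service namespace


def rpc_request(method: str, params: dict) -> ET.Element:
    """Build a SOAP 1.1 RPC-style request envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # The call is modeled as one element named after the method...
    call = ET.SubElement(body, f"{{{BANK_NS}}}{method}")
    call.set(f"{{{SOAP_NS}}}encodingStyle", SOAP_ENC)
    # ...with one child element per parameter.
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return envelope


def response_name(method: str) -> str:
    """By convention the response element is the method name + 'Response'."""
    return method + "Response"


request = rpc_request(
    "AuthorizePayment",
    {"cardNumber": "4111111111111111", "amount": 49.95},
)
call = request.find(f"{{{SOAP_NS}}}Body")[0]
param_names = [param.tag for param in call]
print(ET.tostring(request, encoding="unicode"))
```

Because the element names follow mechanically from the method signature, a toolkit can generate and parse such messages without the developer writing any XML by hand.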
Because SOAP provides a standard format for serializing data into an XML message, platforms such as ASP.NET and Remoting can hide these details from you.
How to choose the description mechanism
SOAP provides a standard way of formatting messages exchanged between the Web service and the client. However, the client needs additional information to serialize the request correctly and interpret the response. XML Schema provides a way to create schemas that describe the contents of a message.
XML Schema defines a set of built-in datatypes that can be used to describe the contents of a message. You can also create your own datatypes. For example, the merchant bank could create a complex datatype to describe the content and structure of the message used to submit a credit card payment request.
A schema is a collection of element and datatype definitions. The schema is used by a Web service to communicate the data expected to be in a message and to validate the messages.
However, a schema alone does not provide enough information to describe a Web service effectively. The schema does not describe the message patterns between the client and the server. For example, a client needs to know whether to expect a response when an order is submitted to the ERP system. The client also needs to know which transport protocol the Web service expects requests to arrive over, as well as the address at which the Web service can be reached.
This information is provided by a Web Services Description Language (WSDL) document. WSDL is an XML document format for describing a specific Web service. Tools such as ASP.NET WSDL.exe and Remoting SOAPSUDS.exe can consume a WSDL document and automatically build proxies for the developer.
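To give a feel for what such a tool reads, the sketch below parses a heavily trimmed WSDL 1.1 document with the Python standard library and pulls out the operation names and the service address. The namespaces are those of WSDL 1.1; the `OrderService` document, its operations, and the `example.com` address are invented for illustration. It is not a substitute for WSDL.exe or SOAPSUDS.exe and generates no proxy code.

```python
import xml.etree.ElementTree as ET

# A heavily trimmed, hypothetical WSDL 1.1 document. A real one also
# contains <types>, <message>, and <binding> sections.
WSDL_TEXT = """\
<definitions name="OrderService"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:tns="urn:example:orders"
    targetNamespace="urn:example:orders">
  <portType name="OrderPortType">
    <operation name="SubmitOrder"/>
    <operation name="GetOrderStatus"/>
  </portType>
  <service name="OrderService">
    <port name="OrderPort" binding="tns:OrderBinding">
      <soap:address location="http://example.com/orders"/>
    </port>
  </service>
</definitions>
"""

NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",
}

root = ET.fromstring(WSDL_TEXT)

# Which operations does the service offer?
operations = [
    op.get("name") for op in root.findall("wsdl:portType/wsdl:operation", NS)
]

# Where should requests be sent?
address = root.find("wsdl:service/wsdl:port/soap:address", NS).get("location")

print(operations)
print(address)
```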
As with any component used to build software, a Web service should also be accompanied by written documentation for the developers who program against it. The documentation should explain what the Web service does and describe the interfaces it exposes, and it should include examples of how to use it. Good documentation is especially important if the Web service is exposed to clients over the Internet.
How to choose discovery mechanisms
Once you have developed and documented a Web service, how can potential clients find it? If the Web service is meant to be consumed only by a member of your development team, your approach can be quite informal: for example, sharing the URL of the WSDL document with a peer a few cubicles down. When the potential clients are on the Internet, however, advertising your Web service is a different matter.
What is needed is a common way of advertising Web services. Universal Description, Discovery, and Integration (UDDI) provides just such a mechanism. UDDI is an industry-standard directory service that can be used to advertise and locate Web services. It lets users search for Web services using a variety of criteria, including company name, category, and type of Web service.
Web services can also be advertised through DISCO, a proprietary XML document format that Microsoft created to allow Web sites to advertise the services they expose. DISCO defines a simple protocol that uses hyperlink-style references to locate resources. The primary consumer of DISCO is Microsoft Visual Studio .NET. A developer can target a specific Web server and navigate through the Web services it exposes.
What is missing from Web Services?
You may have noticed that some key pieces of a distributed component infrastructure are not defined by Web services. Two of the most noticeable omissions are a well-defined API for creating and consuming Web services and a set of component services, such as support for distributed transactions. Let’s talk about each of these missing pieces.
Web service-specific API: Most distributed component infrastructures define an API for tasks such as initializing the runtime, creating an instance of a component, and reflecting the metadata that describes the component. Because most high-level programming languages offer some interoperability with C, the API is typically exposed as a set of C method signatures. RMI goes a step further and couples its API tightly to a single high-level language, Java.
To ensure that Web services remain programming language-agnostic, Microsoft has left it up to individual software vendors to bind support for Web services to a particular platform. Later in the book, I will discuss two Web services implementations for the .NET platform: ASP.NET and Remoting.
Component services: The Web services platform does not provide many of the services commonly found in distributed component infrastructures, such as remote object lifetime management, object pooling, and support for distributed transactions. These services are left for the distributed component infrastructure to implement.
Some services, such as support for distributed transactions, may be added later as the technology matures. Others, such as object pooling and object lifetime management, can be considered implementation details of the platform. For example, Remoting defines extensions that support object lifetime management, and Microsoft Component Services supports object pooling.
Summary
Component-based programming has proven to be a great way to increase developer productivity, but some services cannot be encapsulated in a component that resides within the client’s datacenter. Legacy technologies such as DCOM, CORBA, and Java RMI are not well suited to giving clients access to services over the Internet. Microsoft therefore decided to start from the bottom up and build an industry-standard way of accessing remote services.
Web services is an umbrella term for a set of industry-standard protocols and services used to enable a basic level of interoperability among applications. Web services have received unprecedented industry support, and they facilitate interoperability among applications regardless of platform.