If you’re just joining this article series, it is one response to the gap between how development and operations view technology and measure their success: it is wholly possible for development and operations to be individually successful while the organization fails. So what can we do to better align development and operations so that they speak the same language and work towards the success of the organization as a whole? This series addresses a portion of this problem by giving operations teams insight into how specific architecture and development decisions affect the day-to-day operational requirements of an application.
This article is the second part of a two-part installment on Service-Oriented Architecture (SOA) and the impact of SOA on operations and DevOps. The first installment reviewed the Simple Object Access Protocol (SOAP) and the challenges that operations faces when managing a SOAP application. This installment reviews the other leading SOA technology: REpresentational State Transfer (REST). SOAP and REST behave differently and each has its own management challenges. This article will help you better understand those differences and, specifically, how they affect the production operation of each.
REpresentational State Transfer (REST)
In the previous article we looked at SOAP and its remote procedure call (RPC) service operation model. On the surface, REST and SOAP appear to differ in their supported protocols and their message payloads: REST is used almost exclusively over HTTP, while SOAP can run on top of other transports as well, and REST payloads are usually lightweight XML or JSON, whereas SOAP messages are heavier-weight XML envelopes. But this is only a surface-level analysis.
When you dive deeper into the architectural drivers behind REST and SOAP you’ll learn that they differ fundamentally in how they represent the system on top of which they run. SOAP services usually cannot be converted directly into proper REST services; rather, the problem being solved needs to be redefined. REST defines resources, and the HTTP methods (verbs) define the types of operations that can be performed on those resources. So while I have been saying “REST services”, it would be more appropriate to say “RESTful resources”.
The classic example that is used to illustrate the architectural approach to REST is to consider a blog. A user can create a blog entry, retrieve a blog entry, update a blog entry, and delete a blog entry. If we define our resource as a “blog” then the following HTTP operations can be used to manipulate our blog:
- POST /blog: create a new blog entry with the body of the POST containing the blog details; the HTTP Location Header will have the link to the blog entry that was created. For the purposes of this discussion let’s assume that we’re creating /blog/1
- GET /blog/1: retrieves the blog entry with id 1
- GET /blogs: returns all blog entries
- GET /blogs?searchCriteria=value&… : search for blogs that match the specified criteria
- PUT /blog/1: updates the contents of the blog entry with id 1 by replacing it with the body of the PUT
- PATCH /blog/1: updates a part of the blog entry with id 1; for example, a PATCH might just change the title of the blog and not affect its body
- DELETE /blog/1: deletes the blog entry with id 1
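To make the mapping above concrete, here is a minimal sketch of the blog resource as an in-memory dispatcher. The class name BlogStore, the handle method, and the choice of 204 for a successful DELETE are assumptions made for illustration; a real service would sit behind a web framework and a database.

```python
from typing import Any, Dict, Optional, Tuple

Response = Tuple[int, Dict[str, str], Any]  # (status, headers, payload)


class BlogStore:
    """Hypothetical in-memory stand-in for the blog resource."""

    def __init__(self) -> None:
        self._entries: Dict[int, dict] = {}
        self._next_id = 1

    def handle(self, method: str, path: str, body: Optional[dict] = None) -> Response:
        parts = [p for p in path.split("/") if p]

        # GET /blogs: return all entries
        if parts == ["blogs"] and method == "GET":
            return 200, {}, list(self._entries.values())

        # POST /blog: create an entry and point to it with the Location header
        if parts == ["blog"] and method == "POST":
            entry_id = self._next_id
            self._next_id += 1
            self._entries[entry_id] = dict(body or {})
            return 201, {"Location": f"/blog/{entry_id}"}, None

        # /blog/{id}: operate on a single entry
        if len(parts) == 2 and parts[0] == "blog" and parts[1].isdigit():
            entry_id = int(parts[1])
            if entry_id not in self._entries:
                return 404, {}, None
            if method == "GET":
                return 200, {}, self._entries[entry_id]
            if method == "PUT":  # replace the whole entry
                self._entries[entry_id] = dict(body or {})
                return 200, {}, self._entries[entry_id]
            if method == "PATCH":  # change only the supplied fields
                self._entries[entry_id].update(body or {})
                return 200, {}, self._entries[entry_id]
            if method == "DELETE":
                del self._entries[entry_id]
                return 204, {}, None
            return 405, {"Allow": "GET, PUT, PATCH, DELETE"}, None

        # Anything else is treated as not found in this sketch
        return 404, {}, None
```

Note that the same URI (/blog/1) serves four different operations; only the verb changes. That is why access logs and monitoring for a REST application need to record the HTTP method alongside the path.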
This approach is very HTTP-centric. HTTP has proven its ability to scale, as the World Wide Web demonstrates, and the model is very natural to us as Web consumers. But as you can see, instead of defining RPC style operations like “Create New Blog Entry” or “Validate Credit Card”, we approach the problem differently by starting with the resources rather than the operations.
Resources are identified by a Uniform Resource Identifier (URI), which for all practical purposes here is the portion of your URL after the host name, such as “/blog/1” or “/ticket/123”. HTTP verbs then define the operations that can be performed on those resources:
- GET: retrieve a resource
- POST: create a new resource
- PUT: update a resource in its entirety (can also be used to create a resource whose identifier you already know)
- PATCH: update a part of the resource
- DELETE: delete a resource
- HEAD: check for the existence of a resource by retrieving the HTTP headers for the resource that would be returned with a GET
- OPTIONS: request information about the communication options available for the resource; for all intents and purposes OPTIONS returns the valid HTTP verbs for the resource
At this point you can probably already tell that REST is going to result in chatty conversations. For example, if we want to change the value of a resource, such as adding an amount to the total number of sales for the day, we need to retrieve the current state (GET) and then write back the new total (PUT or PATCH). In SOAP we could define a single operation such as “AddToTotal” that completes the action in a single call, but it is a two-step process with REST. One benefit, however, is that REST allows the client, or an HTTP server or proxy sitting in front of the application, to cache GET responses when a time-to-live is defined for the resource (for example with the Expires header or Cache-Control: max-age). If we have a resource that does not change very often then we may only need to access the web application once every 5 minutes rather than on every request, as we would need to do with a SOAP request. Remember to ask your developers how volatile each resource is and to define time-to-live values for those resources!
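The effect of a time-to-live on origin traffic can be sketched with a small cache. This is an illustration only: TtlCache, its fetch callback, and the injectable clock are hypothetical names, and a real HTTP cache would take the lifetime from the response headers rather than from a constructor argument.

```python
import time
from typing import Callable, Dict, Tuple


class TtlCache:
    """Caches GET results per URI until their time-to-live runs out."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch          # stands in for the real GET to the application
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the sketch can be tested
        self._entries: Dict[str, Tuple[float, str]] = {}
        self.origin_hits = 0         # how many requests reached the application

    def get(self, uri: str) -> str:
        now = self._clock()
        cached = self._entries.get(uri)
        if cached is not None and now < cached[0]:
            return cached[1]         # still fresh: served without touching the origin
        self.origin_hits += 1
        value = self._fetch(uri)
        self._entries[uri] = (now + self._ttl, value)
        return value
```

With a 300-second lifetime, any number of GETs inside a five-minute window costs the application a single request. The trade-off is that clients may see data that is up to five minutes old, which is why the volatility of each resource matters.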
There is a problem in the example above: what happens if two processes try to update the total at the same time? Figure 1 shows this graphically.
Figure 1. Concurrent Modifications
In figure 1 we see the following:
1. A requests the value of the total
2. The total sales resource returns the value of 5
3. B requests the value of the total
4. The total sales resource again returns the value of 5
5. A adds 1 to the total and updates the total sales resource by PATCHing it with a value of 6
6. B does the same thing: it adds 1 to the total and updates the total sales resource by PATCHing it with a value of 6
The real value should be 7, but because the operation is not atomic (it is broken into a GET followed by a PATCH), the PATCH operations can overwrite each other. The solution is for the resource to return an entity tag, or ETag for short, that identifies the current version of the resource, and then to require the PUT or PATCH call to include an “If-Match” header carrying the ETag the client last saw. If that ETag does not match the current version of the resource then the server rejects the update with a 412 Precondition Failed response (some APIs return 409 Conflict instead), which requires the client to re-GET the resource, make its changes, and re-PATCH the resource. ETags are the mechanism that REST uses for optimistic concurrency control, which is as close as plain HTTP comes to managing transactions.
To further illustrate how chatty REST is, but also how powerful it is as an architectural pattern, you need to understand how “RESTful” the services are. The “RESTfulness” of a solution is gauged with what is called the Richardson Maturity Model (RMM), which defines the following levels:
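The scenario in Figure 1 and its ETag-based fix can be sketched as follows. The sketch assumes the standard HTTP conditional-request mechanism, in which the client sends the ETag back in an If-Match header and the server answers 412 Precondition Failed when it is stale. The TotalSales class, the version-counter ETag, and the add_to_total helper are hypothetical names used only for illustration.

```python
from typing import Tuple


class TotalSales:
    """Hypothetical total-sales resource guarded by an ETag."""

    def __init__(self, total: int = 0) -> None:
        self.total = total
        self.version = 1

    def etag(self) -> str:
        # Assumption: the ETag is just a quoted version counter
        return f'"{self.version}"'

    def get(self) -> Tuple[int, str, int]:
        """Return (status, etag, total)."""
        return 200, self.etag(), self.total

    def patch(self, new_total: int, if_match: str) -> Tuple[int, str]:
        """Apply the update only if the caller saw the current version."""
        if if_match != self.etag():
            return 412, self.etag()      # stale ETag: reject the write
        self.total = new_total
        self.version += 1
        return 200, self.etag()


def add_to_total(resource: TotalSales, amount: int, max_attempts: int = 3) -> bool:
    """Client-side loop: GET, PATCH with If-Match, start over on 412."""
    for _ in range(max_attempts):
        _, etag, current = resource.get()
        status, _ = resource.patch(current + amount, if_match=etag)
        if status == 200:
            return True
    return False
```

Replaying Figure 1 against this sketch, A's PATCH succeeds and B's is rejected because B still holds the old ETag; B then has to GET the value 6 and PATCH 7. From an operations point of view, every rejected write is at least two extra requests, so a burst of 412 responses on a hot resource is a sign of contention rather than a fault.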
- Level 0: Uses HTTP as the transport system, but does not leverage any of the mechanisms defined for the web; uses POST for all of its operations
- Level 1: Uses HTTP as the transport system and defines the concept of resources, but continues to use POST for all of its operations
- Level 2: Uses HTTP, resources, and properly uses the HTTP verbs as described above
- Level 3: All of the above, but also adds Hypermedia Controls
Hypermedia Controls are powerful, but they hide behind an ugly acronym: HATEOAS (Hypermedia As The Engine Of Application State). In other words, every resource that you access provides links to other related resources as well as links to resources that can modify the state of the resource you’re looking at. Think about what made the World Wide Web so powerful: hyperlinks! From the page that you are viewing you can find links to other pages that provide you with more or related information. HATEOAS hypermedia links serve the same purpose for RESTful resources: they answer the question “where can I go from here?”
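As a rough sketch of what a Level 3 representation might look like, the function below returns a blog entry together with the links a client can follow from it. The "links" layout with rel, href, and method keys is an assumption made for illustration; real APIs use a variety of hypermedia formats, and the function names are hypothetical.

```python
from typing import Any, Dict, List, Optional


def blog_representation(entry_id: int, title: str, has_next: bool) -> Dict[str, Any]:
    """Build a blog entry representation that carries hypermedia links."""
    links: List[Dict[str, str]] = [
        {"rel": "self", "href": f"/blog/{entry_id}", "method": "GET"},
        {"rel": "update", "href": f"/blog/{entry_id}", "method": "PUT"},
        {"rel": "delete", "href": f"/blog/{entry_id}", "method": "DELETE"},
        {"rel": "collection", "href": "/blogs", "method": "GET"},
    ]
    if has_next:
        # Only advertise transitions that are actually available
        links.append({"rel": "next", "href": f"/blog/{entry_id + 1}", "method": "GET"})
    return {"id": entry_id, "title": title, "links": links}


def follow(representation: Dict[str, Any], rel: str) -> Optional[str]:
    """Answer "where can I go from here?" for a given link relation."""
    for link in representation["links"]:
        if link["rel"] == rel:
            return link["href"]
    return None
```

A client written this way discovers URIs from the responses instead of hard-coding them, which is what makes the application dynamic, and also why each step tends to cost another request.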
If you are interested in looking at detailed requests and responses in the context of ordering coffee, there is an excellent article on InfoQ.com by Jim Webber, Savas Parastatidis, and Ian Robinson entitled “How to GET a Cup of Coffee”.
All of this is to say that from an operations perspective, the following observations are true:
- REST is chatty because it operates on resources at a Create Read Update Delete (CRUD) level. What might take a single operation in SOAP takes multiple operations in REST
- REST encourages chattiness with its support for HATEOAS: resources do not stand alone, but each resource tells you where you can go to next
- REST makes extensive use of HTTP headers, such as ETag, Expires, and Location on the way out and If-Match, If-Modified-Since, Accept, and Content-Type on the way in (just to name a few)
- REST makes extensive use of HTTP response codes: you will see standard 200 responses, but you’ll frequently see different 4xx and 5xx status codes as well as differentiation between things like 200 OK and 201 Created, and so forth
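Because status codes carry so much meaning in a REST application, monitoring usually has to look past "200 or not". The following is a minimal sketch of how responses might be bucketed by status class; the function names and bucket labels are hypothetical, and the input is assumed to be a plain list of status codes already extracted from an access log.

```python
from collections import Counter
from typing import Iterable


def status_class(code: int) -> str:
    """Map an HTTP status code to its class."""
    if 200 <= code < 300:
        return "success"          # 200 OK, 201 Created, 204 No Content, ...
    if 300 <= code < 400:
        return "redirection"      # includes 304 Not Modified from conditional GETs
    if 400 <= code < 500:
        return "client_error"     # 404 Not Found, 412 Precondition Failed, ...
    if 500 <= code < 600:
        return "server_error"
    return "other"


def summarize(codes: Iterable[int]) -> Counter:
    """Count responses per status class."""
    return Counter(status_class(code) for code in codes)
```

In a REST application a steady trickle of 4xx responses can be entirely normal (a 404 on a deleted blog entry, for example), so alert thresholds are usually better set on the 5xx share and on sudden changes in the 4xx rate than on the mere presence of non-200 responses.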
SOAP or REST?
The first article in this installment presented an overview of SOAP and this article presented an overview of REST so you might have already formulated some opinions about when one technology might be better than another. It is almost like discussing politics or religion, but there are times when using one technology is better than the other.
SOAP is best when:
- A formal contract is required between the client and server
- There are auditing requirements (because auditing tools already exist that can parse SOAP messages)
- You are going to perform integration between components that are independently developed over a long period of time
- You are integrating with existing systems that are RPC in nature
REST is best when:
- Scalability is of paramount concern
- The application is dynamic and can make good use of HATEOAS
- There are a number of direct web client consumers
- You are developing new systems in which you can control the representation of resources
Of course these are opinions, but they are based on building and deploying both types of systems over the past decade.
Service-Oriented Architecture (SOA) is a strategy that can be used to integrate disparate software products. Typically, an organization builds a set of services in front of existing systems (or as part of the development effort for a new system) and then other systems can integrate with the services rather than the destination system itself. In order to facilitate this interaction in an easy-to-use manner, a common service paradigm needed to be defined, which prompted the definition of the Simple Object Access Protocol (SOAP) and later REpresentational State Transfer (REST). Both SOAP and REST impact the demands placed on the operation and sustainment of the application and those operational impacts need to be understood by operations and DevOps.
This two-part installment on SOA presented an overview of SOAP and an overview of REST with the aim of showing you the operational impacts of each on your production environment. When working with your developers, I hope that I have given you the correct questions to ask and empowered you with a knowledge of how you are going to need to configure your deployment environment, caching solution, and web servers to support either solution.