If you’re just joining this article series, it is one part of a response to the gap between how development and operations view technology and measure their success: it is entirely possible for development and operations to each succeed individually while the organization as a whole fails. So what can we do to better align development and operations so that they speak the same language and work towards the success of the organization as a whole? This article series attempts to address a portion of this problem by giving operations teams insight into how specific architecture and development decisions affect the day-to-day operational requirements of an application.
The past three articles have looked at how different systems can be integrated. Specifically they focused on two core technologies: service-oriented architecture (SOA) and event-driven architecture (EDA), and two SOA implementations: Simple Object Access Protocol (SOAP) and REpresentational State Transfer (REST).
Now that we have a plan for how we might integrate the different systems in our enterprise, the next step is to measure the impact of the additional load generated by new consumers of our services and to consider how we might mitigate that load. This article begins by looking at the challenges in understanding the data in each system and the need for “one data model to rule them all”. It then reviews the Canonical Data Model enterprise integration pattern as a solution to the data model problem and the construction of a Canonical Data Store as a solution to manage load on existing systems. Finally, it reviews the challenges in managing a Canonical Data Store on its own, as well as in managing the complete SOA, EDA, and Canonical Data Store deployment.
System Integration and Data Model Challenges
Exposing services on top of your application dramatically helps you integrate systems by providing a standards-based mechanism for communication, but it means that in order to talk to that application, you need to speak its language. What I mean is that if you are communicating with a reservation system, you need to talk in terms of its representation of a reservation; if you then want to talk to a property management system, which also has reservations, you need to talk in terms of its representation of a reservation. If you ever want to integrate the reservation system with the property management system (which is probably one of the reasons for putting services in front of those systems in the first place), then each needs to translate its representation of a reservation into the other’s before calling the other’s services. This is shown graphically in figure 1.
Figure 1. Data Model Transformations between two systems
Figure 1 shows that the reservation system needs to translate its reservation (RS) to the property management system’s reservation (PMS) before it can invoke the property management system’s services. Likewise, the property management system’s reservation needs to be translated to the reservation system’s reservation before calling its services.
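To make the translation concrete, here is a minimal sketch of the two point-to-point translations from figure 1. The field names (`reservation_id`, `confirmationNo`, and so on) are invented for illustration; they do not come from any real reservation or property management system.

```python
# Hypothetical point-to-point translations between the reservation system (RS)
# model and the property management system (PMS) model. Field names are
# invented for illustration.

def rs_to_pms(rs_reservation):
    """Translate an RS reservation into the PMS representation."""
    return {
        "confirmationNo": rs_reservation["reservation_id"],
        "guestName": rs_reservation["guest"]["name"],
        "arrival": rs_reservation["check_in"],
        "departure": rs_reservation["check_out"],
    }

def pms_to_rs(pms_reservation):
    """Translate a PMS reservation back into the RS representation."""
    return {
        "reservation_id": pms_reservation["confirmationNo"],
        "guest": {"name": pms_reservation["guestName"]},
        "check_in": pms_reservation["arrival"],
        "check_out": pms_reservation["departure"],
    }
```

Notice that even this toy pair of translations is specific to these two systems; every additional system needs its own pair against every system it talks to.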
Now consider what happens if we want to integrate a sales system, which has its own representation of a reservation, with both the reservation and the property management systems. Figure 2 shows the headache that this causes.
Figure 2. Data Model Transformations between three systems
Any time a system wants to integrate with another system, it needs to perform the aforementioned transformations on its data model and, as the number of systems increases, the number of transformations grows quadratically: n systems that all talk to each other require n × (n − 1) translations.
The solution to this problem is to define a canonical representation of the data model; then, when a new system needs to be integrated, it only needs to provide two transformations: one from its internal representation to the canonical representation and one from the canonical representation back to its internal representation. This is shown in figure 3.
Figure 3. Data Model Transformations to a Canonical Model
The amount of work to build a canonical model is overkill when integrating two systems, but it already simplifies the integration of three. And you can probably guess that if you had 10 systems then you would not even consider directly integrating each pair!
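The arithmetic behind that claim can be sketched in a few lines, counting one translation per direction between systems:

```python
def point_to_point_translations(n):
    # Each ordered pair of distinct systems needs its own translation,
    # so n systems require n * (n - 1) translations in total.
    return n * (n - 1)

def canonical_translations(n):
    # Each system needs exactly one translation to the canonical model
    # and one from it, regardless of how many other systems exist.
    return 2 * n

# With 2 systems the canonical model costs more (4 vs 2 translations),
# with 3 systems the counts are even (6 vs 6), and from there the
# point-to-point approach falls further behind with every new system.
```

At 10 systems the comparison is 90 point-to-point translations versus 20 canonical ones, and each additional system adds only two canonical translations instead of a translation pair against every existing system.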
Canonical Data Store
A major transformation that companies sometimes undergo is the exposure of more and more of their internal functionality via the web. When I was working at an insurance company, we had a quoting application that was used by our insurance agencies, but we eventually wanted to attract new direct business, so we created an online quoting application and exposed it to the world. Likewise, in the large-scale project I worked on over the past four years that I have been talking about in this article series, we had several legacy systems whose functionality was ultimately to be exposed to the Internet. The problem, however, is that many of these systems could not sustain the load. Furthermore, these systems were not always 100% available; for example, one system had a four-hour offline period for daily reconciliation in the middle of the night. This did not affect users because they were at their hotels sleeping, but in the new world, the Internet never sleeps!
So how do we handle excessive load and deal with down times in underlying systems of record? The solution that we arrived at was to build a Canonical Data Store. The Canonical Data Store implements the Canonical Data Model enterprise integration pattern and creates a document store that is built from the ground up to support the load that it will be subjected to. The Canonical Data Store integrates with underlying systems of record via the event-driven architecture presented in the previous article. Figure 4 shows the architecture for this environment.
Figure 4. Architecture with Canonical Data Store
When changes are made to the underlying systems of record, those systems of record raise events to topics that are processed by the canonical listener. The canonical listener leverages services to insert the updated resources into a canonical data store. Clients then interact with services to access the canonical representation of each object. In this way, the canonical layer is able to service all “read” requests without needing the systems of record to be online. Furthermore, because all read requests are satisfied by the canonical layer, the SORs are not burdened with that read traffic.
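The listener-and-store flow can be sketched as follows. This is a minimal illustration using an in-memory dictionary in place of the real document store; all names here are hypothetical, not taken from any particular product.

```python
# In-memory stand-in for the canonical document store.
canonical_store = {}

def to_canonical(sor_name, record):
    # A real implementation would dispatch to a per-SOR translator that
    # maps the SOR's model onto the canonical model.
    return {"id": "%s:%s" % (sor_name, record["id"]),
            "source": sor_name,
            "data": record}

def on_event(sor_name, record):
    """Invoked by the canonical listener for each event raised on a topic."""
    doc = to_canonical(sor_name, record)
    canonical_store[doc["id"]] = doc   # insert-or-update ("upsert")

def read(resource_id):
    # Client reads are served entirely from the canonical store,
    # never from the underlying SOR.
    return canonical_store.get(resource_id)
```

The key property is visible even in this sketch: once `on_event` has run, `read` can answer without the SOR being online at all.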
An important side effect of this strategy is that the canonical data model shields clients from the underlying systems with which it is ultimately interacting. So if we later decide to replace the ticketing system or the reservation system, clients are already using the canonical ticket or the canonical reservation, so the client impact is minimal. If you do decide to use a canonical data model, make sure that your model is truly system independent and not a thin wrapper around the SOR’s model, otherwise replacing that system will impact clients.
One caveat to the whole EDA model and the canonical data store is that writes are eventually consistent. This means that the canonical data store will eventually have the correct data, but it will not always have the latest data. Consider how you use Facebook: you post an update, which you are able to see immediately, but your friends may not see the update until the canonical store is synchronized. If your application requires that all data be strictly consistent (always showing the current state of the data), then this model will not work. But if you can tolerate a certain amount of latency in your consistency, then this model affords a significant amount of scalability. Whether eventual consistency satisfies your business needs depends on your business model.
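The window of inconsistency is easy to demonstrate with a toy simulation: a write lands in the system of record immediately, but the canonical store only catches up when the raised event is processed. Everything here (the queue standing in for the message bus, the store names) is illustrative.

```python
from collections import deque

sor = {}                 # system of record: sees the write immediately
canonical = {}           # canonical data store: lags behind
events = deque()         # stand-in for the ESB/topic carrying events

def write(key, value):
    """A client writes through to the SOR, which raises an event."""
    sor[key] = value
    events.append((key, value))   # raised but not yet processed

def drain_events():
    """The canonical listener processes pending events."""
    while events:
        key, value = events.popleft()
        canonical[key] = value
```

Between `write` and `drain_events`, a read against the canonical store returns stale data; after the listener runs, the stores agree. That gap is exactly the latency your business model has to tolerate.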
Impacts on Operations
This has been an interesting discussion on software architecture and a review of your technical business model, but the question remains: what is the impact on operations for such an application? If your business model allows for eventual consistency and can support an EDA with a canonical store, then how do you manage it?
Let’s consider the constituent parts:
• Systems of record
• Canonical data store
• Service infrastructure to support canonical data store
The ESB will need to support events raised by the various SORs. Depending on the SOR, these events may be as simple as a notification that a resource was created, or they may be chatty enough to inform you of every change to that SOR’s resources. From an operations perspective, this means that the ESB needs a considerable amount of processing power. It also means that the ESB and the SORs should not be too far apart; you may want to collocate the ESB close to the SORs so that raising events does not create unnecessary network overhead. Finally, if several different systems leverage the ESB, you might want to consider segmenting the ESB to shield each system from load generated by the others. You can learn more about ESB management options here.
As mentioned in the previous article, SORs can generate lightweight events that necessitate callbacks, or heavyweight events that contain the contents of the modified resource in the payload. Your decision will depend on whether the SOR can support the callbacks; the choice also impacts clients in terms of how much work they will need to perform. If the event is lightweight, the client can simply make the callback to retrieve the latest version of the resource. If the event is heavyweight, the client will need to ensure that it only processes the latest version of the resource and ignores stale versions. If your SOR can support the additional load, the client is relieved of this burden.
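The stale-version check for heavyweight events can be sketched as follows. This assumes each event carries a monotonically increasing version number per resource; the field names are hypothetical.

```python
# Highest version already applied, per resource id.
latest_seen = {}

def process_heavyweight_event(event):
    """Apply a heavyweight event only if it is newer than anything already
    processed for that resource. Returns True if applied, False if stale.
    Assumes event["version"] increases monotonically per resource."""
    rid, version = event["resource_id"], event["version"]
    if version <= latest_seen.get(rid, -1):
        return False          # stale: an equal-or-newer version was already handled
    latest_seen[rid] = version
    # ...upsert event["payload"] into the canonical store here...
    return True
```

With lightweight events this bookkeeping disappears from the client, because the callback always returns the SOR’s current state; the cost simply moves back onto the SOR as read load.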
As SORs raise events, the current state of modified resources will be reflected in the canonical data store. This means that the canonical data store listeners will need to be collocated close to the underlying systems of record and that the canonical data store will need significant processing power and network bandwidth. One thing to keep in mind about the canonical data store, and the reason I dedicated an article to it, is that it will hold the current state of every object across the systems you are integrating. Furthermore, it will need to respond to every event that modifies resources. In other words, the canonical data store is going to be a core component of your infrastructure and it will need a lot of processing power, a lot of memory, and a very fast database. My recommendation, and this comes from lessons learned in my last project, is to store data in a NoSQL document store, such as MongoDB. I will have a full article coming up on MongoDB, but it is currently one of my favorite options for something like a canonical data store because it is built for massive scale and it has very reasonable querying capabilities (the type you won’t find in most key/value stores). And if you need more advanced querying capabilities, then you might want to integrate it with a search index such as Apache Solr.
We found that the canonical data store and the ESB were the most challenging pieces of this puzzle to maintain, so do not underestimate the amount of work here!
Finally, the service infrastructure that supports the management of the canonical data store and that services client requests has significant bandwidth and scalability requirements. The canonical data store is accessible via its services and the fact that it sits between clients and the underlying SORs means that the amount of load it is expected to receive is substantial. You need to ensure that these are powerful machines!
This article brings to a close a complete enterprise application architecture: service-oriented architecture, event-driven architecture, and the canonical data store. Considering how many applications are moving to the cloud and the massive amount of load they are being subjected to, this type of architecture is becoming more common.
In the next few articles we’ll review core technologies and how to manage them from an operations perspective. I’ve already mentioned a couple of topics that I plan on covering, MongoDB and Apache Solr, but also look for articles on Hadoop, HBase, Apache Storm, and more!