In the previous post in this series we introduced the Microservices Architectural Pattern. We outlined both the benefits and the drawbacks of opting for microservices and we discussed how, despite their complexity, microservices make ideal solutions for many complex legacy applications. In this post we will introduce the concept of the API Gateway.
When building an application as a set of microservices, you need to decide how the application’s clients will interact with those services. A traditional monolithic application has only one set of endpoints, which are typically replicated behind a load balancer that distributes traffic among them. In a microservices architecture, however, each service exposes its own set of more granular endpoints. We will examine how this affects client-to-application communication and introduce an approach that relies on an API Gateway to manage these endpoints.
To keep this in context, we will use the Military Health System as a domain example. Let’s imagine that we are developing a native mobile client for a pharmacy application. It is likely that we will need to implement a patient profile page that displays detailed information about prescription refills, refill reminders, text and email notifications, and so on. For example, the associated image depicts what a user might see when scrolling through our pharmacy mobile app.
Even though this only represents a sample smartphone application, you can imagine that the pharmacy details page would need to display a lot of information. Besides basic patient profile information, such as patient and dependent demographic details (view family members), the page must retrieve and display the number of prescriptions ready to be refilled as well as the corresponding individual order history. Additionally, the application can be expected to manage:
Number of items currently in the patient’s cart
Drug Interaction Warnings
Generic recommendations, if applicable.
When using a monolithic application architecture, the mobile client would retrieve this data by making a single REST call to the application, such as:
A load balancer routes the request to one of several identical application instances. The application then queries various database tables and returns the entire collated response to the client. Conversely, when using a microservices architecture pattern, the data displayed on the details page is owned by multiple microservices:
Pharmacy Shopping Cart Service - Number of items in the shopping cart
Pharmacy Order Service - Refill history
Interaction Service - Basic drug interaction information
Generic Comparison Service - Generic Recommendations
Direct Client-to-Microservice Communication
In theory, a mobile client (or any other client) could make a request to each of the microservices directly. The individual microservice would have a public endpoint such as:
This URL would map to the microservice’s load balancer, which would then distribute requests across the available instances. To retrieve the information for the patient profile page, the mobile client would make requests to each of the services listed above.
This approach has inherent challenges and limitations. The first is a mismatch between the needs of the client and the granular APIs exposed by each of the microservices. The client in our example has to make four separate requests; in a more complex application it would have to make many more. As a commercial example, Amazon describes how hundreds of services are involved in rendering a single product page. While a client might be able to make that many requests over a private LAN, doing so is in all probability far too inefficient over the public Internet and even more impractical on a 3G or 4G mobile network. The direct client-to-service approach also makes the client code considerably more complex.
Another problem with calling the microservices directly is that, in the absence of tight governance, development teams are free to employ protocols that are not necessarily web-friendly. One service might use Thrift binary RPC while another might choose the AMQP messaging protocol. Neither protocol is particularly browser-friendly or firewall-friendly, and both are best suited for internal use only, if at all. Typically, applications should use protocols such as HTTP, HTTPS, and WebSocket outside of the firewall.
The direct-to-service approach also makes it difficult to refactor the microservices if you eventually need to change how the system is partitioned. For example, we might want to merge two services or split a single service into two or more services. If the connected clients are configured to communicate directly with the services, that kind of refactoring becomes considerably more difficult.
To retain maximum integration flexibility, it rarely makes sense to configure a client to talk directly to a microservice.
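To make the cost of direct communication concrete, here is a minimal Python sketch of what the mobile client would have to do. The hostnames, paths, and the `RecordingTransport` class are assumptions made up for illustration; the transport only records calls rather than touching the network.

```python
from dataclasses import dataclass, field

# Hypothetical public endpoints, one per microservice (illustration only).
SERVICE_URLS = {
    "cart": "https://cart.pharmacy.example/items/count",
    "orders": "https://orders.pharmacy.example/refills",
    "interactions": "https://interactions.pharmacy.example/warnings",
    "generics": "https://generics.pharmacy.example/recommendations",
}


@dataclass
class RecordingTransport:
    """Stand-in for an HTTP client: records each round trip, no network I/O."""
    calls: list = field(default_factory=list)

    def get(self, url: str, params: dict) -> dict:
        self.calls.append(url)
        return {"source": url, "params": params}


def load_profile_page(transport, patient_id: str) -> dict:
    """Client-side composition: one request per service, merged on the device."""
    page = {}
    for name, url in SERVICE_URLS.items():
        page[name] = transport.get(url, {"patientId": patient_id})
    return page
```

Note that the client has to know every service location and pays one round trip per service, so any change to how the backend is partitioned forces a client release.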
Introducing the API Gateway
A much better approach is to establish what is commonly known as an API Gateway. An API Gateway in this sense is a server component that acts as the single entry point into the entire system. It is similar to the Facade pattern from object-oriented design. The API Gateway encapsulates the internal system architecture and provides an API that is tailored to each client. It might also take on other responsibilities such as authentication, monitoring, load balancing, caching, request shaping and management, and static response handling.
The API Gateway is responsible for request routing, composition, and protocol translation. All requests from the clients are first routed through the API Gateway, which then routes them to the appropriate microservice. An API Gateway will often handle a request by invoking multiple microservices and aggregating the results. It can also translate between web-friendly protocols such as HTTPS and the less web-friendly protocols that might be used internally.
A properly designed API Gateway can also provide each client with a custom API. For mobile clients, this is typically a coarse-grained API. Consider our patient details scenario for the pharmacy app. The API Gateway can provide an endpoint (/pharmacydetails?prescriptionid=xxx) that enables a mobile client to retrieve all of the prescription details (order history, available refills, drug interactions, and generic suggestions) with a single request. The API Gateway handles such a request by invoking the various services and combining the results.
An often-cited example of a well-defined API Gateway is the one in use at Netflix. The Netflix streaming service is available on hundreds of different kinds of devices, including smart TVs, streaming devices such as Roku boxes and Amazon Fire Sticks, smartphones, gaming consoles, and tablets. At first, Netflix tried to provide a one-size-fits-all API to support its streaming service. However, the company quickly discovered that the diverse range of devices came with their own unique requirements, and the single-API approach presented considerable challenges. Netflix has since adopted an API Gateway that provides APIs specifically tailored for each device. These APIs run device-specific adapter code. Each adapter typically handles a request by invoking, on average, six to seven backend services. In aggregate, the Netflix API Gateway consistently handles billions of requests every day.
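The composition step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four `fetch_*` functions are hypothetical stand-ins for calls to the backend services, their payloads are invented, and a real gateway would sit behind an HTTP framework rather than expose a bare coroutine.

```python
import asyncio

# Stand-ins for calls to the four backend services. In a real gateway these
# would be network requests; the function names and payloads are hypothetical.
async def fetch_cart_count(prescription_id: str) -> int:
    return 2


async def fetch_refill_history(prescription_id: str) -> list:
    return [{"prescriptionId": prescription_id, "status": "filled"}]


async def fetch_interactions(prescription_id: str) -> list:
    return []


async def fetch_generic_recommendations(prescription_id: str) -> list:
    return ["generic-equivalent"]


async def pharmacy_details(prescription_id: str) -> dict:
    """Sketch of a handler behind /pharmacydetails?prescriptionid=..."""
    # gather() runs the four calls concurrently and returns results in
    # the same order as its arguments.
    cart, history, interactions, generics = await asyncio.gather(
        fetch_cart_count(prescription_id),
        fetch_refill_history(prescription_id),
        fetch_interactions(prescription_id),
        fetch_generic_recommendations(prescription_id),
    )
    # Combine the fine-grained answers into one coarse-grained response.
    return {
        "prescriptionId": prescription_id,
        "cartItemCount": cart,
        "refillHistory": history,
        "interactionWarnings": interactions,
        "genericRecommendations": generics,
    }
```

Calling `asyncio.run(pharmacy_details("rx-1"))` returns a single dictionary, which is what lets the mobile client make one request instead of four.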
Pros and Cons of Implementing an API Gateway
Implementing an API Gateway has both benefits and a few known drawbacks. One of the key benefits is that the gateway encapsulates the internal structure of the application: rather than invoking specific services, clients simply interact with the gateway. The API Gateway provides each type of client with a specific API. This level of customization reduces the number of round trips between the client and the application, which improves performance considerably in environments that need to support a diverse array of clients. It also simplifies the client-side code.
On the downside, an API Gateway is yet another highly available component, so it must be developed, deployed, and managed just like any other application. Because the API Gateway can grow to support a wide variety of individual clients, each with its own unique set of requirements, there is at least some risk that the gateway itself becomes a development bottleneck. Part of the reason is that developers must update the API Gateway in order to expose each microservice endpoint. The process for updating the API Gateway needs to be as lightweight as possible, or developers will end up queuing for gateway modifications unnecessarily. Even so, the drawbacks of the API Gateway approach are relatively minor, and for most real-world application scenarios, employing a gateway is a pragmatic decision.
Implementing a Practical API Gateway
The Reactive Programming Model
As previously discussed, all microservices-based applications are distributed systems, and as such they must use some form of inter-process communication. There are two basic approaches to inter-process communication. One is an asynchronous, messaging-based mechanism, such as JMS or AMQP with a message broker. There are also several brokerless options in which services communicate with each other directly. The other approach is a synchronous mechanism such as HTTP or Thrift. In many cases, a system ends up supporting both asynchronous and synchronous mechanisms and may even support multiple implementations of each. As a result, your API Gateway will need to support a variety of communication mechanisms, depending on the use cases your solution needs to satisfy.
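One way to keep that variety manageable is to hide each mechanism behind a common asynchronous interface inside the gateway. The Python sketch below is an assumption-laden illustration, not a reference design: `ServiceClient`, `BlockingRpcClient`, `QueueMessagingClient`, and `queue_worker` are hypothetical names, the blocking call stands in for an HTTP or Thrift client, and an in-memory `asyncio.Queue` stands in for a message broker. It relies on `asyncio.to_thread`, which requires Python 3.9 or later.

```python
import asyncio
from typing import Callable, Protocol


class ServiceClient(Protocol):
    """Common shape the gateway codes against, whatever the transport."""
    async def call(self, request: dict) -> dict: ...


class BlockingRpcClient:
    """Wraps a synchronous request/response call (stand-in for HTTP or Thrift)."""

    def __init__(self, blocking_call: Callable[[dict], dict]):
        self._blocking_call = blocking_call

    async def call(self, request: dict) -> dict:
        # Run the blocking call in a worker thread so the event loop stays free.
        return await asyncio.to_thread(self._blocking_call, request)


class QueueMessagingClient:
    """Request/reply over an in-memory queue (stand-in for a message broker)."""

    def __init__(self, requests: asyncio.Queue):
        self._requests = requests

    async def call(self, request: dict) -> dict:
        # Pair each request with a future that the consumer resolves.
        reply = asyncio.get_running_loop().create_future()
        await self._requests.put((request, reply))
        return await reply


async def queue_worker(requests: asyncio.Queue, handler: Callable[[dict], dict]):
    """Hypothetical consumer that plays the role of the backend service."""
    while True:
        request, reply = await requests.get()
        reply.set_result(handler(request))


async def demo() -> list:
    requests = asyncio.Queue()
    worker = asyncio.create_task(
        queue_worker(requests, lambda r: {"via": "messaging", **r})
    )
    clients = [
        BlockingRpcClient(lambda r: {"via": "rpc", **r}),
        QueueMessagingClient(requests),
    ]
    # The calling code is identical for both transports.
    results = [await c.call({"prescriptionId": "rx-1"}) for c in clients]
    worker.cancel()
    return results
```

Running `asyncio.run(demo())` sends the same request through both transports; the routing and composition code never needs to know which mechanism a given service uses.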
Service discovery is critical to a successful gateway implementation, since the API Gateway needs to know the IP address and port of each microservice it communicates with. In a traditional N-tier application, developers could essentially hardwire the locations (although such practices present significant security concerns). In a cloud-based microservices application, however, finding the needed endpoints is not a trivial problem. Infrastructure services such as ESBs and message brokers will often have a static location that can be specified in an OS environment variable. Application services, by contrast, have dynamically assigned locations, and the set of instances of a service can also change dynamically because of auto-scaling or upgrades. As a result, the API Gateway, like any other client in the system, relies on the system’s service discovery mechanism: either server-side discovery or client-side discovery. Because service discovery is such a critical concept in the microservices pattern, we will address the topic in greater detail in a follow-on post. In the meantime, it is worth noting that if the system uses client-side discovery, then the API Gateway must be able to query the service registry (essentially a database of all microservice instances and their respective locations).
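As a rough illustration of client-side discovery, the Python sketch below models the service registry as an in-memory table and has the gateway look up instances at call time. Everything here is hypothetical: `ServiceRegistry` and `DiscoveringClient` are made-up names, the round-robin selection is just one possible policy, and a production registry would be a separate networked component with health checks.

```python
import itertools


class ServiceRegistry:
    """In-memory stand-in for a service registry (hypothetical API)."""

    def __init__(self):
        self._instances = {}  # service name -> list of (host, port)

    def register(self, service: str, host: str, port: int) -> None:
        self._instances.setdefault(service, []).append((host, port))

    def deregister(self, service: str, host: str, port: int) -> None:
        self._instances.get(service, []).remove((host, port))

    def lookup(self, service: str) -> list:
        # Return a copy so callers cannot mutate the registry's state.
        return list(self._instances.get(service, []))


class DiscoveringClient:
    """Gateway-side helper: query the registry per call, rotate across instances."""

    def __init__(self, registry: ServiceRegistry):
        self._registry = registry
        self._counter = itertools.count()

    def pick_instance(self, service: str) -> tuple:
        instances = self._registry.lookup(service)
        if not instances:
            raise LookupError(f"no registered instances for {service!r}")
        # Simple round-robin over whatever is registered right now.
        return instances[next(self._counter) % len(instances)]
```

Because the lookup happens on every call, instances that appear or disappear through auto-scaling or upgrades are picked up without reconfiguring the gateway.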
Handling Partial Failures
Perhaps the greatest challenge in implementing an API Gateway is teaching your teams how to handle partial failure. Partial failure is a concern in all distributed systems and arises whenever one service calls another service that is either responding slowly or is wholly unavailable. From a practical perspective, your API Gateway should never block indefinitely while waiting for a downstream service. How it handles the failure, however, depends on the specific scenario and on which service is failing. Using our pharmacy app example, if the drug interaction service is unresponsive, the API Gateway should return the rest of the patient details to the client, since they are still useful to the application. If, on the other hand, the unresponsive service owns data that the response cannot do without, the API Gateway should return an error to the client.
Along the same lines, the API Gateway should return cached data when it is available. Drug details, for the most part, change rather infrequently, so the API Gateway could be configured to return cached drug detail data if the underlying microservice is unavailable. The pharmacy detail data can be cached by the API Gateway itself or stored in an external cache, such as Redis or Memcached. Configuring the gateway to return either default data or cached data helps ensure that system failures are not disruptive to the end user.
Again, Netflix has an incredibly useful library (Hystrix) for writing code that invokes remote services. Hystrix times out service calls that exceed a predetermined threshold. It also implements the “circuit breaker” pattern, which stops clients from waiting needlessly for unresponsive services.
Additionally, if the error rate for a particular service exceeds your specified threshold, Hystrix trips the circuit breaker, which causes all requests to fail immediately for a specified period of time. The library lets you define a fallback action for when a request fails, such as reading from a cache or returning a default value. Consider using a library like Hystrix in JVM environments (or an equivalent library in non-JVM environments).
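The timeout, fallback, and circuit breaker ideas can be sketched in Python as follows. This is not Hystrix’s API and not a substitute for a maintained library; it is a simplified illustration under stated assumptions. In particular, the hypothetical `CircuitBreaker` class trips on a count of consecutive failures rather than an error rate, and it simply closes again after a cool-down period instead of probing the service first.

```python
import asyncio
import time


class CircuitBreaker:
    """Simplified breaker: opens after N consecutive failures, then cools down."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after  # seconds the circuit stays open
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def allow_request(self) -> bool:
        if self._opened_at is None:
            return True
        if self._clock() - self._opened_at >= self.reset_after:
            # Cool-down elapsed: close the circuit and let calls through again.
            self._opened_at = None
            self._failures = 0
            return True
        return False

    def record_success(self) -> None:
        self._failures = 0

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()


async def call_with_fallback(breaker, remote_call, fallback, timeout=0.5):
    """Invoke remote_call with a timeout; use fallback() on failure or open circuit."""
    if not breaker.allow_request():
        return fallback()  # fail fast without touching the slow service
    try:
        result = await asyncio.wait_for(remote_call(), timeout)
    except (asyncio.TimeoutError, ConnectionError):
        breaker.record_failure()
        return fallback()  # e.g. cached or default data
    breaker.record_success()
    return result
```

In the pharmacy example, `remote_call` would be the request to the drug interaction service and `fallback` would read the last known drug details from the gateway’s cache, so the rest of the patient details page can still be returned.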
In summary, for most microservices-based applications it makes sense to implement an API Gateway to act as a single entry point into your system. The API Gateway takes care of request routing, composition, and protocol translation, and it provides each of the application’s clients with a custom API. The API Gateway is also instrumental in masking backend failures from the user by returning cached or default data when one or more services are unavailable.