Route 53 is an often under-appreciated AWS service. It is easy to mistake it for just another DNS service. In fact, when you use Route 53 for global load balancing, you can reduce latency and improve availability for your application stack.
How is this done? This article will give you an overview of how it can be set up, and hopefully, it will provide a few tips to help you along the way.
What is Global Load Balancing?
If you’re unfamiliar with load balancing or global load balancing, a quick explanation is in order.
Load balancing is a method of distributing application workload across multiple computing resources. This is typically done in a few different ways:
- DNS. With this method, a list of application endpoints is maintained in DNS, and client requests are distributed across your infrastructure by policies that determine which endpoint is returned in the DNS response. This can be done in a simple round-robin fashion, or by using AWS Route 53 routing policies.
- Client-side load balancing. A list of application endpoints is maintained by a client application. The client then selects an endpoint to connect to based on logic programmed into the client.
- Server-side load balancing. Clients connect to a load balancer, which then forwards requests to the application servers based on a list maintained on the load balancer.
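As a rough illustration of the simplest DNS policy in the list above, the sketch below hands out endpoints in round-robin order, one per query. The hostnames and the `round_robin` helper are hypothetical, and this models only the selection logic, not a real DNS server.

```python
from itertools import cycle

# Hypothetical application endpoints; the hostnames are illustrative only.
ENDPOINTS = ["app-1.example.com", "app-2.example.com", "app-3.example.com"]


def round_robin(endpoints):
    """Return an iterator that repeats the endpoints in order, forever."""
    return cycle(endpoints)


# Each call to next() stands in for answering one DNS query.
resolver = round_robin(ENDPOINTS)
answers = [next(resolver) for _ in range(5)]
print(answers)
```

Round robin ignores endpoint health and client location, which is where routing policies come in.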
Global load balancing involves routing application traffic to geographically diverse servers or data centers. This can be done with both physical and virtual infrastructure. We’ll discuss combining DNS and server-side load balancing, using Route 53 and Elastic Load Balancing.
Benefits of Global Load Balancing
There are a number of use cases where you can benefit from global load balancing.
- Application latency. You might want to reduce application latency by locating your application servers in close geographic proximity to the application clients.
- Geolocation. Your requirements may dictate that clients in a specific geographic region are routed to a subset of your application stack. This can come in handy when there are legal requirements dictating whether content is available to clients from that region, or where certain types of information (such as PII) are stored.
- Scalability. You want to spread your server footprint across multiple regions and datacenters as a means to scale your application. A use case for this would be one-time events, where you spin up datacenters in a region to handle traffic for a special event in that region, then spin them back down once the event is over.
- Application maintenance. You can reduce downtime in your application during maintenance cycles by shuffling traffic away from a regional installation while you perform updates or maintenance to that region’s servers, then moving the traffic back once your updates are complete.
- High availability. You can distribute application load across your datacenters and use monitoring to determine endpoint availability. Through the use of routing policies, you can have Route 53 automatically fail over to a known good region if another region becomes unavailable.
- Disaster recovery. You can maintain a primary and a backup location, with data replicated from the primary to the backup. If you are using Route 53 endpoint monitoring, traffic can fail over from the primary region to the backup region when the primary becomes unavailable.
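The failover idea in the last two items can be sketched as a small decision function. This is a simplified model, not Route 53’s implementation: the `Endpoint` class and `resolve_with_failover` are hypothetical names, the `healthy` flag stands in for the result of a health check, and what Route 53 itself does when every endpoint is unhealthy is not modeled here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Endpoint:
    name: str
    healthy: bool  # stands in for the latest health check result


def resolve_with_failover(primary: Endpoint, backup: Endpoint) -> Optional[str]:
    """Prefer the primary; fall back to the backup only when the primary is unhealthy."""
    if primary.healthy:
        return primary.name
    if backup.healthy:
        return backup.name
    return None  # simplified: nothing healthy to return


primary = Endpoint("aws-use1.example.com", healthy=False)
backup = Endpoint("aws-usw1.example.com", healthy=True)
print(resolve_with_failover(primary, backup))
```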
Getting Started with Global Load Balancing
Say you want to set up virtual datacenters in multiple global AWS regions. While it is possible to globally load balance between both physical and virtual data centers, for the purposes of this discussion, our infrastructure will exist solely on AWS.
In this example, we will be using four different AWS regions to provide application availability to our clients. These regions are:
- US-West-1 (Northern California)
- US-East-1 (Northern Virginia)
- AP-Northeast-1 (Tokyo)
- EU-Central-1 (Frankfurt)
Using these regions will provide reasonable coverage for our theoretical globally located web clients. Naturally, you will need to assess where your clients are located for your own application implementation and which regions are best suited to serve them. You may only have US clients, in which case, US-based AWS regions may be more appropriate than the global locations that have been chosen here.
Plan and deploy your application stack in each location. Some things you will need to consider are redundancy, auto-scaling groups in each availability zone, AWS elastic load balancer deployment and cross-AZ load balancing, data replication between AZs and regions, VPC layout, connectivity between regions, application and instance monitoring, and your deployment and configuration management system.
As with everything on this list, be sure that you give careful consideration to your IP addressing scheme when connecting VPCs together. It is easy to overlook this when planning your deployment, and IP address conflicts could result if you simply use the default IP addressing in each VPC. This will make routing and network connectivity between regions difficult, if not impossible.
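One way to catch address conflicts before connecting VPCs is to check the planned CIDR blocks for overlap. The sketch below uses Python’s standard `ipaddress` module; the CIDR blocks and the `find_overlaps` helper are hypothetical, and one overlap is included deliberately to show what a conflict looks like.

```python
from ipaddress import ip_network
from itertools import combinations

# Hypothetical per-region VPC CIDR blocks.
VPC_CIDRS = {
    "us-west-1": "10.10.0.0/16",
    "us-east-1": "10.20.0.0/16",
    "ap-northeast-1": "10.30.0.0/16",
    "eu-central-1": "10.30.128.0/17",  # deliberately overlaps ap-northeast-1
}


def find_overlaps(cidrs):
    """Return every pair of regions whose CIDR blocks overlap."""
    networks = {region: ip_network(block) for region, block in cidrs.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(networks.items(), 2)
        if net_a.overlaps(net_b)
    ]


print(find_overlaps(VPC_CIDRS))
```

An empty result means no two regions share address space under the plan being checked.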
Our primary domain to be resolved by the web clients will be example.com. It is helpful to become familiar with Route 53 routing policies prior to attempting to configure your DNS.
The DNS records will be as follows:
- Primary domain: example.com
- Latency based domains: aws-usw1.example.com, aws-use1.example.com, aws-apne1.example.com, aws-euc1.example.com
- example.com points to the latency-based domains, which are then configured in Route 53 with a latency-based routing policy to route the DNS traffic to the AWS region with the lowest latency for that client.
- Weighted domains: aws-elb-usw1.example.com, aws-elb-use1.example.com, aws-elb-apne1.example.com, aws-elb-euc1.example.com
The latency-based domains will point to the weighted domains. Each latency-based domain will have listed all of the weighted domain records given above.
Weighted domains give you the ability to route local client traffic to the region of your choice. For example, if you want to take your datacenter in EU-Central-1 offline for maintenance, you could send that region’s traffic to another region, such as US-East-1, by changing the weighting of your records.
In the case above, under the EU latency-based domain you might give a weighting of 100 to aws-elb-euc1.example.com and 0 to all of the other domains. When you want to redirect the traffic for your EU clients, you would increase the weighting of US-East-1 to 100, then reduce the EU-Central-1 weighting to 0.
Traffic will then shift to the domain with the heavier weight.
One helpful tip: Be sure to increase the weight of your target domain before reducing the weight of the one from which you want to direct traffic away. Otherwise, you run the risk of not giving an endpoint for clients to connect to, effectively blackholing the traffic.
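The ordering in this tip can be written down as a small plan and checked before anyone touches production. The sketch below is a local model only: `shift_plan` and `apply_steps` are hypothetical helpers, the weights mirror the EU example above, and nothing here calls Route 53.

```python
def shift_plan(weights, source, target, full=100):
    """Return ordered (record, new_weight) steps: raise the target first, then lower the source."""
    if source not in weights or target not in weights:
        raise KeyError("source and target must both be existing weighted records")
    return [(target, full), (source, 0)]


def apply_steps(weights, steps):
    """Apply steps one at a time, refusing any step that leaves every weight at zero."""
    current = dict(weights)
    for record, weight in steps:
        current[record] = weight
        if not any(w > 0 for w in current.values()):
            raise RuntimeError(f"setting {record} to {weight} leaves no weighted target")
    return current


# Weighted records under the EU latency-based domain, as in the example above.
weights = {
    "aws-elb-usw1.example.com": 0,
    "aws-elb-use1.example.com": 0,
    "aws-elb-apne1.example.com": 0,
    "aws-elb-euc1.example.com": 100,
}

steps = shift_plan(
    weights,
    source="aws-elb-euc1.example.com",
    target="aws-elb-use1.example.com",
)
after = apply_steps(weights, steps)
print(steps)
print(after)
```

Running the same steps in reverse order makes `apply_steps` raise, which is the mistake the tip warns about.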
- Elastic load balancers: These will be the actual ELB endpoints listed in Route 53. Each of your weighted domains will have its own respective ELB endpoint specific to that region.
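To make the record chain concrete, here is a minimal sketch that builds the latency-based and weighted record sets as plain dictionaries, in the shape that boto3’s `change_resource_record_sets` accepts as its `ChangeBatch` argument. Several things are assumptions rather than part of the design above: the records are modeled as alias A records (a zone apex such as example.com cannot hold a CNAME), the hosted zone ID is a placeholder, and the set identifiers are invented. The final hop from each weighted domain to its ELB is omitted, and the code only builds the data; it does not call AWS.

```python
# Short region codes used in the record names, mapped to AWS region names.
REGIONS = {
    "usw1": "us-west-1",
    "use1": "us-east-1",
    "apne1": "ap-northeast-1",
    "euc1": "eu-central-1",
}
HOSTED_ZONE_ID = "Z0000000EXAMPLE"  # hypothetical placeholder, not a real zone


def alias(dns_name):
    """Alias target pointing at another record in the same (assumed) hosted zone."""
    return {
        "HostedZoneId": HOSTED_ZONE_ID,
        "DNSName": dns_name,
        "EvaluateTargetHealth": True,
    }


def latency_records():
    """example.com -> aws-<region>.example.com, one latency record per region."""
    return [
        {
            "Name": "example.com.",
            "Type": "A",
            "SetIdentifier": short,
            "Region": region,
            "AliasTarget": alias(f"aws-{short}.example.com."),
        }
        for short, region in REGIONS.items()
    ]


def weighted_records(home):
    """aws-<home>.example.com -> every aws-elb-* domain; the home region gets weight 100."""
    return [
        {
            "Name": f"aws-{home}.example.com.",
            "Type": "A",
            "SetIdentifier": f"elb-{short}",
            "Weight": 100 if short == home else 0,
            "AliasTarget": alias(f"aws-elb-{short}.example.com."),
        }
        for short in REGIONS
    ]


def change_batch(records):
    """Wrap record sets in UPSERT changes."""
    return {"Changes": [{"Action": "UPSERT", "ResourceRecordSet": r} for r in records]}


records = latency_records() + [r for home in REGIONS for r in weighted_records(home)]
batch = change_batch(records)
print(len(batch["Changes"]))
```

Generating the records from one region table keeps the names consistent, which helps avoid the kind of hand-typed mismatch that is easy to introduce with this many records.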
You should be cognizant of a few things when setting up your global load balancing infrastructure.
- Familiarize yourself with AWS Route 53 routing policies, then read them again. A thorough understanding of how they work is critical to a project of this type.
- Configure AWS Route 53 health checks to give your application resiliency and automated failover.
- Configure short TTLs in your DNS. Typically, 300 seconds should be sufficient to move traffic away from a datacenter relatively quickly, but your use case may require longer or shorter TTLs. Keep in mind that the longer the TTL, the longer it will take for clients to migrate to their new application end point. Shorter TTLs also mean more requests to the DNS infrastructure, and therefore more load. Except in extreme cases, DNS load should not be an issue for Amazon, but additional requests to Route 53 will mean additional costs for you. Plan accordingly.
- Even with short TTLs, traffic migration from one datacenter to another using DNS is not instantaneous. Depending on your TTLs and on client behavior, it could take as much as 24 to 48 hours for all traffic to move from a given datacenter when operating at scale. Plan your maintenance windows accordingly.
- It is possible to black hole client traffic if you don’t set up your DNS records in exactly the right way. One recommendation is to have your DNS plan peer reviewed, and then have it reviewed again during implementation to ensure no mistakes are made in the routing policies.
- As with the previous item, it is a good idea to have changes to the DNS weighting peer reviewed to prevent mistakes when changing your production traffic patterns. This can’t be stressed enough, as it is very easy to give a record an incorrect weighting and potentially black hole some, or all, of your client traffic.
- Pay attention to application scalability within each of your datacenters. If you have your total global application load distributed amongst all of your datacenters, moving traffic from one to another could result in overload for the datacenter to which you are moving traffic. Be sure your regional stack can handle the additional load.
- Consider using AWS Route 53 Traffic Flow to configure your routing policies. This simplifies the setup and administration of your global DNS routing, and can help you avoid mistakes that can occur when setting up your DNS infrastructure to support global load balancing.
- If you are running large-scale workloads, purchase Business Support from Amazon. It will save you a great deal of trouble when you run into issues and you can’t find the answers.
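Two of the items above, the TTL trade-off and regional capacity, lend themselves to quick back-of-the-envelope checks before a maintenance window. The sketch below is illustrative only: the function names, the 20% headroom default, and all of the numbers are assumptions, and the query estimate is a simple upper bound that assumes every caching resolver re-queries as soon as its cached answer expires.

```python
SECONDS_PER_DAY = 86_400


def max_lookups_per_day(ttl_seconds, caching_resolvers):
    """Upper bound on daily DNS queries if each resolver refreshes once per TTL."""
    return (SECONDS_PER_DAY // ttl_seconds) * caching_resolvers


def can_absorb(capacity_rps, current_rps, incoming_rps, headroom=0.2):
    """True if the target region stays within capacity, minus headroom, after taking the moved traffic."""
    return current_rps + incoming_rps <= capacity_rps * (1 - headroom)


# Shortening the TTL from 300 to 60 seconds multiplies the worst-case query volume by five.
print(max_lookups_per_day(300, caching_resolvers=1_000))
print(max_lookups_per_day(60, caching_resolvers=1_000))

# Hypothetical load figures, in requests per second.
print(can_absorb(capacity_rps=10_000, current_rps=4_000, incoming_rps=3_000))
print(can_absorb(capacity_rps=10_000, current_rps=6_000, incoming_rps=3_000))
```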
Conclusion: What’s Next?
In this article, we’ve covered load balancing, global load balancing, their use cases, AWS Route 53, and a variety of tricks and tips to consider when you look to plan and set up your AWS Route 53 infrastructure.