AWS Regions

Chapter 1: Introduction to AWS (Regions)

AWS Global Infrastructure


Thoughts on Regions

AWS is a global cloud player with lots of data centres offering services around the world in multiple regions. Regions are geographic areas with 2 or more availability zones. What's the difference between regions? The main differences are (logically) location, the number of services offered, the number of availability zones in each region, the price of services, and the latency between regions.

Why would you pick one or more regions to deploy an application in? Probably for a combination of these factors, including high availability across multiple regions. In terms of performance, it makes sense to try to reduce the latency for the majority of users, or for users with a more latency-sensitive end user experience (e.g. mobile devices and interactive applications).
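One rough way to put numbers on this trade-off is to weight each candidate region's latency by the share of users closest to each region. The sketch below does that for three regions. The latencies are taken from the table later in this post, but the user shares are invented purely for illustration, and the function names are hypothetical.

```python
# Average latency in ms from the region nearest the user (outer key) to a
# candidate deployment region (inner key), from the latency table in this post.
LATENCY_MS = {
    "N. Virginia": {"N. Virginia": 0, "Ireland": 80.55, "Tokyo": 145.3},
    "Ireland": {"N. Virginia": 80.52, "Ireland": 0, "Tokyo": 212.4},
    "Tokyo": {"N. Virginia": 145.3, "Ireland": 212.4, "Tokyo": 0},
}

# ASSUMPTION: hypothetical share of users closest to each region (not measured data).
USER_SHARE = {"N. Virginia": 0.5, "Ireland": 0.3, "Tokyo": 0.2}


def weighted_latency(candidate):
    """Average latency to `candidate`, weighted by where the users are."""
    return sum(
        share * LATENCY_MS[user_region][candidate]
        for user_region, share in USER_SHARE.items()
    )


def best_region():
    """Candidate region with the lowest user-weighted average latency."""
    return min(LATENCY_MS, key=weighted_latency)


for region in LATENCY_MS:
    print(f"{region}: {weighted_latency(region):.1f} ms weighted average")
print("Lowest:", best_region())
```

This treats latency within a region as zero (as the table does) and ignores price, service availability and high availability, so it only covers the latency part of the decision.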

These two blogs examine inter region latency in more detail:




This AWS blog introduces multi-region latency based routing:

Here's the data from the 2nd blog, "Latency Between AWS Global Regions", in table form:

| FROM/TO (average latency in ms) | US East (N. Virginia) | US West (N. California) | US West (Oregon) | EU (Ireland) | EU (Frankfurt) | Asia Pacific (Tokyo) | Asia Pacific (Seoul) | Asia Pacific (Singapore) | Asia Pacific (Sydney) | South America (São Paulo) |
|---|---|---|---|---|---|---|---|---|---|---|
| US East (N. Virginia) | 0 | 72.74 | 86.98 | 80.55 | 88.66 | 145.3 | 178.4 | 216.7 | 230 | 119.5 |
| US West (N. California) | 71.63 | 0 | 19.46 | 153.2 | 166.6 | 102.5 | 131.5 | 174 | 157.5 | 192.7 |
| US West (Oregon) | 88.68 | 19.2 | 0 | 137 | 159.5 | 89.1 | 118.3 | 161.4 | 162.2 | 182.7 |
| EU (Ireland) | 80.52 | 153.2 | 137 | 0 | 19.56 | 212.4 | 242.4 | 239 | 309.6 | 191.3 |
| EU (Frankfurt) | 88.62 | 166.6 | 159.5 | 19.55 | 0 | 236.5 | 266.2 | 325.9 | 323.5 | 194.9 |
| Asia Pacific (Tokyo) | 145.3 | 102.5 | 89.16 | 212.4 | 236.6 | 0 | 31.44 | 73.79 | 103.9 | 256.8 |
| Asia Pacific (Seoul) | 176.6 | 132.6 | 118.2 | 242.4 | 265.3 | 31.32 | 0 | 71.87 | 134 | 286.3 |
| Asia Pacific (Singapore) | 216.7 | 173.9 | 161.4 | 238.1 | 325.9 | 73.81 | 71.29 | 0 | 175.3 | 328.1 |
| Asia Pacific (Sydney) | 229.7 | 157.8 | 161.9 | 309.6 | 323.2 | 103.9 | 134 | 175.4 | 0 | 322.5 |
| South America (São Paulo) | 119.5 | 192.7 | 181.7 | 191.6 | 194.9 | 256.7 | 286.3 | 327.9 | 322.5 | 0 |

A few observations.

Clusters of Regions with lower average latencies

Some of the regions form clusters with lower average latencies between them (picking an arbitrary cutoff of 100ms): all the US regions, both EU regions, and a few cross-continent pairs that are probably geographically closer, such as US East+EU and US West (Oregon)+Tokyo. Three of the Asia Pacific regions (Tokyo, Seoul and Singapore) are also within 100ms of each other. Sydney and South America fall outside the cutoff: Sydney's closest region is Tokyo (about 104ms) and South America's closest is US East (about 120ms).
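The pairs under the cutoff can be read straight off the table. Here is a minimal sketch that lists every pair of regions whose average latency is below a cutoff in both directions. The matrix is copied from the table above; the function name and the choice to use the worse of the two directions are my own assumptions.

```python
# Average latencies in ms from the table above (rows are FROM, columns are TO).
REGIONS = ["N. Virginia", "N. California", "Oregon", "Ireland", "Frankfurt",
           "Tokyo", "Seoul", "Singapore", "Sydney", "São Paulo"]
LATENCY_MS = [
    [0, 72.74, 86.98, 80.55, 88.66, 145.3, 178.4, 216.7, 230, 119.5],
    [71.63, 0, 19.46, 153.2, 166.6, 102.5, 131.5, 174, 157.5, 192.7],
    [88.68, 19.2, 0, 137, 159.5, 89.1, 118.3, 161.4, 162.2, 182.7],
    [80.52, 153.2, 137, 0, 19.56, 212.4, 242.4, 239, 309.6, 191.3],
    [88.62, 166.6, 159.5, 19.55, 0, 236.5, 266.2, 325.9, 323.5, 194.9],
    [145.3, 102.5, 89.16, 212.4, 236.6, 0, 31.44, 73.79, 103.9, 256.8],
    [176.6, 132.6, 118.2, 242.4, 265.3, 31.32, 0, 71.87, 134, 286.3],
    [216.7, 173.9, 161.4, 238.1, 325.9, 73.81, 71.29, 0, 175.3, 328.1],
    [229.7, 157.8, 161.9, 309.6, 323.2, 103.9, 134, 175.4, 0, 322.5],
    [119.5, 192.7, 181.7, 191.6, 194.9, 256.7, 286.3, 327.9, 322.5, 0],
]


def pairs_below(cutoff_ms):
    """Pairs of regions whose latency is below the cutoff in both directions,
    sorted from lowest to highest latency."""
    found = []
    for i in range(len(REGIONS)):
        for j in range(i + 1, len(REGIONS)):
            # Use the slower of the two directions as the pair's latency.
            worst = max(LATENCY_MS[i][j], LATENCY_MS[j][i])
            if worst < cutoff_ms:
                found.append((REGIONS[i], REGIONS[j], worst))
    return sorted(found, key=lambda pair: pair[2])


close_pairs = pairs_below(100)
for a, b, ms in close_pairs:
    print(f"{a} <-> {b}: {ms} ms")
```

With the 100ms cutoff this gives ten pairs, none of which involve Sydney or São Paulo. Changing the cutoff shows how sensitive the "clusters" are to that arbitrary choice.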

Only 2 pairs of regions come close to 17ms latencies

What does this mean in practice? For applications where low latency really matters (e.g. think interactive HTML5 applications aiming for a smooth 60FPS with no jitter, which requires under 17ms per frame), the users had better be in the same region as the application. The exceptions are the 2 US West regions and the 2 EU regions, which average about 19 to 20ms between them, so you could just about get away with having users and application in different regions (on average). For high availability for other applications with less critical latency requirements the clusters above may work OK, but outside those it is likely that the increased latency would start having a negative impact on user experience.
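The 17ms figure is just the time available per frame at 60 frames per second. A small sketch of the arithmetic, using the lowest inter-region averages from the table above (the function names are hypothetical):

```python
import math


def frame_budget_ms(fps):
    """Time available per frame at a given frame rate, in milliseconds."""
    return 1000.0 / fps


def frames_elapsed(latency_ms, fps=60):
    """Whole frames that go by while waiting `latency_ms` for a response."""
    return math.ceil(latency_ms / frame_budget_ms(fps))


# Lowest inter-region averages from the table above (ms).
LOWEST_MS = {
    "Oregon -> N. California": 19.2,
    "N. California -> Oregon": 19.46,
    "Frankfurt -> Ireland": 19.55,
    "Ireland -> Frankfurt": 19.56,
    "Seoul -> Tokyo": 31.32,
}

print(f"Budget at 60FPS: {frame_budget_ms(60):.2f} ms per frame")
for route, ms in LOWEST_MS.items():
    print(f"{route}: {ms} ms spans {frames_elapsed(ms)} frames")
```

Even the best inter-region averages are slightly over one frame's budget, which is why "just about get away with it" is the most that can be said for them.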

It would be interesting to graph the latency distances between regions in more visually useful ways and do some cluster analysis as well, maybe next time.

This (flattened) contour diagram shows the approximate clusters identified above (blue regions < 100ms).  Think of the blue regions as plains, and orange, grey and yellow regions are hills, mountains, and peaks.  Less effort is required to walk around the plains, but increasing effort is needed to climb hills and cross mountain ranges from one plain to another.


Here's a 3d contour version which gives a better idea and shows the main "plains" (dark blue and orange: US, Europe, Asia) and walls between them:


Postscript

Here's a similar analysis (from 2015, so it may need to be updated soon) looking at transfer speeds between EC2 and S3 across regions: http://blog.takipi.com/amazon-ec2-2015-benchmark-testing-speeds-between-aws-ec2-and-s3-regions/
What is interesting is that they have repeated the experiments for 3 years, so you can see the speedups over time (about 40% on average, more for the Sydney region), and that they used real file uploads, not just "ping" times. They also point out that between some regions it may actually be faster (but not cheaper?) to go via an intermediate region rather than directly (Does this really work? How would you do it? What about the time overhead of going through the intermediate node/region itself? Would someone like to do the experiment?):

"The direct upload path from one point to another might not be the fastest one. Uploads from Australia to Brazil takes 61.24s. However, if you take the same path through Singapore, it will take you 22.56s, almost 3x faster. I wonder what Dijkstra would say."


Ok, I'll take the bait, I wonder what Dijkstra would say?

Maybe "Computer science is no more about computers than astronomy is about telescopes."? Oh, no, that was about something else. Maybe "What on earth is 'cloud computing'?"

Ah, the reference is to Dijkstra's shortest path algorithm (which according to the following blog he invented in about 20 minutes while sipping coffee): https://motherboard.vice.com/en_us/article/the-simple-elegant-algorithm-that-makes-google-maps-possible

I suspect that Dijkstra would say that it depends on how you define the "shortest" path, i.e. the shortest path is not always the quickest, and the algorithm works perfectly well if you just replace "distance" by "time" (I'm sure this is what Dijkstra is pointing out in the lecture below). Just watch out for "Braess' paradox" (if everyone picks the fastest route you get instant congestion).
Or not! It's possible that under high demands some sort of "crowd effect" takes over:
https://phys.org/news/2010-09-scientist-braess-paradox-high-traffic.html
Ok this appears to be an adventure down a Rabbit Hole (for computer scientists at least), see this blog which introduces game theory to the problem:
https://agtb.wordpress.com/2010/10/09/computer-scientists-and-braesss-paradox/
which refers to the paper "How Bad Is Selfish Routing?" (as usual for an academic paper the answer doesn't appear to be simple or obvious, i.e. not bad, catastrophic, or something else in between):
http://theory.stanford.edu/~tim/papers/routing.pdf
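Coming back to Dijkstra: here is a minimal sketch of his algorithm with edges weighted by time rather than distance, run over four regions from the latency table above. It assumes that the time for a relayed path is simply the sum of the average latencies of its legs. That ignores any overhead at the intermediate region, which is exactly the open question raised earlier, so treat the result as a hint rather than a measurement. The function name is hypothetical.

```python
import heapq

# Average latencies in ms from the table above (FROM -> TO), four regions only.
LATENCY_MS = {
    "Frankfurt": {"Ireland": 19.55, "Tokyo": 236.5, "Singapore": 325.9},
    "Ireland": {"Frankfurt": 19.56, "Tokyo": 212.4, "Singapore": 239},
    "Tokyo": {"Frankfurt": 236.6, "Ireland": 212.4, "Singapore": 73.79},
    "Singapore": {"Frankfurt": 325.9, "Ireland": 238.1, "Tokyo": 73.81},
}


def quickest_path(graph, source, target):
    """Dijkstra's algorithm where each edge weight is a time, not a distance.
    Returns (total_time, [source, ..., target])."""
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        elapsed, node = heapq.heappop(queue)
        if node == target:
            break
        if elapsed > best.get(node, float("inf")):
            continue  # stale queue entry, a quicker route was already found
        for neighbour, cost in graph[node].items():
            candidate = elapsed + cost
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                previous[neighbour] = node
                heapq.heappush(queue, (candidate, neighbour))
    if target not in best:
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return best[target], path[::-1]


total_ms, route = quickest_path(LATENCY_MS, "Frankfurt", "Singapore")
print(" -> ".join(route), f"{total_ms:.2f} ms")
print("Direct:", LATENCY_MS["Frankfurt"]["Singapore"], "ms")
```

On these averages the quickest Frankfurt to Singapore route goes via Ireland (19.55 + 239 = 258.55ms) rather than direct (325.9ms), so the table itself contains at least one case where the direct path is not the quickest on paper.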





And (finally?) another paper with the snappy title  "Analyzing the Network for AWS Distributed Cloud Computing" (from ACM SIGMETRICS Performance Evaluation Review, Vol 43, Issue 3, December 2015), which also takes measurements between regions but then provides an answer for how to optimise placement of customers and applications in different regions. They also claim the results can be generalised to > 2 regions and different upload/download patterns (cool!):
https://www3.cs.stonybrook.edu/~anshul/dcc15_aws.pdf
