Chapter 10: Amazon ElastiCache



(Photo: Tiberinus, god of the Tiber river with a Cornucopia, a classical cache)

This chapter is about in-memory caching. It introduces in-memory cache engines and caching patterns. ElastiCache supports Memcached and Redis, so I guess it's just an API abstraction over either of these?

Documentation is here. There are a total of 73 operations and data types, and there is also Java client documentation.


What do I know about caching? Well, a long time ago I designed and implemented a file system cache for the ABC D-Cart system. I was also a J2EE expert for a few years and did some R&D and benchmarking of J2EE patterns, architectural alternatives and different vendor products. One of the features of the J2EE standard, as I recall, was that it supported different caching options out of the box. I did some research and experiments and published a paper at the Middleware 2001 conference on the impact of different caching options, cache sizes, and variations across products.

Brebner P., Ran S. (2001) Entity Bean A, B, C’s: Enterprise Java Beans Commit Options and Caching. In: Guerraoui R. (eds) Middleware 2001. Middleware 2001. Lecture Notes in Computer Science, vol 2218. Springer, Berlin, Heidelberg

Online version.

And a good overview of the Borland Application Server cache options used to produce the paper results.

From memory, one of the problems with J2EE caching was that it didn't scale well horizontally (if at all), but there were a few 3rd party products starting to address this space, along with an increasing number of Java/POJO persistence and ORM frameworks.

However, from an architectural perspective the J2EE caching was just invisible: all you had to do was turn it on or not. There were no code or other major changes required. Of course, it didn't always improve performance (depending on options, hit rates, cache sizes, vendor products, JVM settings etc). The fundamental problem with caches is making sure you don't read stale data from them, which is related to cache coherence (caches holding the same data in multiple locations). Caches also introduce state into otherwise potentially stateless architectures, which is potentially bad, isn't it?
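
To make the stale data problem concrete, here is a minimal sketch in Java. It is not how J2EE containers or ElastiCache work internally; `TtlCache` and `StaleReadDemo` are hypothetical names, a `HashMap` stands in for the database, and the clock is injected so the example is deterministic. It shows a time-to-live (TTL) cache returning an old value after the source of truth has changed, and only dropping it once the TTL has passed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Minimal TTL cache: entries at least ttlMillis old are treated as missing. */
final class TtlCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long storedAt;
        Entry(V value, long storedAt) { this.value = value; this.storedAt = storedAt; }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injected so the demo does not depend on wall-clock time

    TtlCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    void put(K key, V value) {
        entries.put(key, new Entry<>(value, clock.getAsLong()));
    }

    /** Returns the cached value, or null if absent or expired. */
    V get(K key) {
        Entry<V> e = entries.get(key);
        if (e == null) return null;
        if (clock.getAsLong() - e.storedAt >= ttlMillis) {
            entries.remove(key); // expired: forget it so the caller reloads
            return null;
        }
        return e.value;
    }
}

final class StaleReadDemo {
    /** Returns what a reader sees while the entry is fresh, then after expiry. */
    static String[] run() {
        long[] now = {0};                                // fake clock, in milliseconds
        Map<String, String> database = new HashMap<>();  // stand-in for the source of truth
        TtlCache<String, String> cache = new TtlCache<>(1000, () -> now[0]);

        database.put("price", "10");
        cache.put("price", database.get("price"));

        database.put("price", "12");  // the database changes, but nobody tells the cache
        now[0] = 500;
        String withinTtl = cache.get("price");  // still the old value: a stale read

        now[0] = 1000;
        String afterTtl = cache.get("price");   // expired: the caller must reload
        return new String[] { withinTtl, afterTtl };
    }
}
```

The TTL only bounds how long a stale value can be served; it does not prevent stale reads. Whether that bound is acceptable depends on the application.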

Is caching in general an "architectural" level feature (i.e. something that is difficult to do or change)? Well, I guess that depends on where and how it's implemented in your technology stack. With HTTP front ends it's also invisible (and therefore not really architectural), as you just choose to add CDN caching or not, with no code changes.

Does caching have an "architectural smell"?! Is this good or bad?

And maybe in web systems stale caches are OK (but not in enterprise middleware!)? Another overview of internet caching.

An interesting article on caching in web systems.

The initial problem I had with this chapter was that it didn't introduce "web" level caching very well. Sure, I'd heard of memcached and redis etc, but I had no idea how modern web applications are built with them, or what their pros/cons, internal architecture and APIs look like.

Are they written in Java? (No, C). Can you use them with/from Java? Where do they "fit" in your application architecture? Is it hard to move from one to the other (or to another that supersedes them in 6 months' time)?

And what's this Zeno thing from Netflix? And now Hollow?
https://adtmag.com/articles/2016/12/13/netflix-hollow.aspx
http://www.infoworld.com/article/3147370/open-source-tools/move-over-memcached-and-redis-here-comes-netflixs-hollow.html 

There are a couple of older surveys of web cache engines. Why nothing more recent? Too hard to keep up?

Maybe I'm looking for the wrong thing? How about web caching?
https://www.digitalocean.com/community/tutorials/web-caching-basics-terminology-http-headers-and-caching-strategies

A comparison of memcached, redis, etc.

Are there any Java/POJO in-memory caching engines? Apache JCS is one.


And I found this, which was interesting but not specifically about caching (although it does mention it): "AWS and Compartmentalization". It refers to a library for use with Route 53:

Amazon Route53 Infima is a library for managing service-level fault isolation using Amazon Route 53.

This brings us to Route 53 Infima. Infima is a library designed to model compartmentalization systematically and to help represent those kinds of configurations in DNS. With Infima, you assign endpoints to specific compartments such as availability zone. For advanced configurations you may also layer in additional compartmentalization dimensions; for example you may want to run two different software implementations of the same service (perhaps for blue/green deployments, for application-level redundancy) in each availability zone.

This looks kind of cool, but I still don't really understand what it's for and how it works.
It's also mentioned in these slides.
And more details here.

This all still seems complicated at a programming level. What's actually involved in adding an in-memory cache to, say, a Java enterprise application? Here's a good introduction. From Part 1, the answer is that adding memcached calls directly to your code is very intrusive. However, Part 2 provides a simpler solution: just use Hibernate and point it at a memcached engine:

Part 1: http://www.javaworld.com/article/2078565/open-source-tools/open-source-tools-use-memcached-for-java-enterprise-performance-part-1-architecture-and-setup.html

Part 2: http://www.javaworld.com/article/2078584/open-source-tools/open-source-tools-use-memcached-for-java-enterprise-performance-part-2-database-driven-web-apps.html
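
To give a feel for why the direct approach is intrusive, here is a minimal cache-aside sketch in Java. It is not the code from the articles above: `CustomerDao` and `CachedCustomerService` are hypothetical names, and a `ConcurrentHashMap` stands in for the memcached or Redis client (a real client would make network calls at the same points). The point is that every read path has to check and fill the cache, and every write path has to remember to invalidate it.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical DAO; the map stands in for a real database table. */
final class CustomerDao {
    private final Map<Integer, String> table = new ConcurrentHashMap<>();
    int reads = 0; // counts database reads so the effect of the cache is visible

    Optional<String> findName(int id) {
        reads++;
        return Optional.ofNullable(table.get(id));
    }

    void saveName(int id, String name) {
        table.put(id, name);
    }
}

/** Cache-aside: the application code itself checks, fills and invalidates the cache. */
final class CachedCustomerService {
    private final CustomerDao dao;
    // Assumption: an in-process map standing in for a memcached/Redis client.
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    CachedCustomerService(CustomerDao dao) {
        this.dao = dao;
    }

    private static String key(int id) {
        return "customer:" + id;
    }

    Optional<String> getName(int id) {
        String cached = cache.get(key(id));
        if (cached != null) {
            return Optional.of(cached);                      // hit: no database read
        }
        Optional<String> loaded = dao.findName(id);          // miss: go to the database
        loaded.ifPresent(name -> cache.put(key(id), name));  // fill for the next reader
        return loaded;
    }

    void rename(int id, String name) {
        dao.saveName(id, name);
        cache.remove(key(id)); // invalidate so the next read is not stale
    }
}
```

Multiply `getName` and `rename` by every entity and query in an enterprise application and you can see why pushing this down into the persistence layer (as with a Hibernate second-level cache) is attractive.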

Currently Redis seems to have the edge over memcached.

As usual I ask price and limitation questions.


PS
Patterns of usage may be the best place to START to understand this subject?
E.g. https://www.slideshare.net/AmazonWebServices/elasticache-deep-dive-best-practices-and-usage-patterns-march-2017-aws-online-tech-talks

And a whitepaper.

This book on patterns for AWS also covers this subject.

PPS
How do you know in advance if (and by how much) caching will make an impact? Over the last 10 years I've developed a software architecture-level performance modelling tool and used it with a large number of government and enterprise clients. Unsurprisingly, some of these clients were interested in the impact of caching on end user response times and resource usage for their technology stacks. Most of these examples I can't show publicly (e.g. there was one complex Defence problem where we showed that caching could halve end user response times), but we have developed a number of "demonstrations" using the Dynatrace easyTravel application.

We can automatically build performance models (simple or complex) from Dynatrace APM data. For this example we showed the potential improvement between a baseline version of the application (throughput, database server utilisation, and end user experience) and a version which was modified to include a CDN (for images and JavaScript, assuming a 100% cache hit rate once the cache was warmed up; other hit rates could be modelled). Here are some screenshots.


Baseline performance model (showing workloads, software components, and servers, left to right). Only a subset is shown, as it won't all fit on the screen.


Zooming in at the top of the model.


Metrics graphed after a simulation with a workload of 40TPS (average arrival rate). This is basically the maximum capacity of the system, as the database server is close to saturation.


Model with CDN added for images and javascript, metrics graphed after simulation at a higher workload of 60TPS (average arrival rate).

What does this show?

The baseline simulation at 40TPS has an average database server utilisation of 85%, and an end user experience (response time) with a median of 625ms and a 95th percentile of 7s. Not great.

The CDN simulation at a higher rate of 60TPS has an average database server utilisation of 75%, and an end user experience (response time) with a median of about 700ms and a 95th percentile of 2.1s. This is better overall: the median is slightly higher, but the 95th percentile response time is much lower, and capacity is at least 50% higher. You can actually push the load up to 70TPS with a 95th percentile response time still under 7s. How long did this take? About an hour to build the initial model from existing APM data, modify the model for the CDN, run the simulations and write the results up. Results in a table:

Model     TPS  DB U%  RT (ms, 50%)  RT (ms, 95%)
Baseline   40     85           625          7000
CDN50      50     63           270           970
CDN60      60     75           716          2100
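
As a back-of-envelope sanity check on the throughput and database utilisation figures, the utilisation law (U = X * D, where X is throughput and D is the service demand per transaction) can be applied to each row. This is not how the modelling tool works; it is just a sketch in Java, `UtilisationLaw` is a hypothetical name, and it assumes the database behaves as a single resource whose demand per transaction is constant with load.

```java
/** Utilisation law: U = X * D, so D = U / X and the saturation throughput is 1 / D. */
final class UtilisationLaw {
    /** Database service demand per transaction, in milliseconds. */
    static double demandMillis(double utilisation, double tps) {
        return utilisation / tps * 1000.0;
    }

    /** Throughput at which the database would reach 100% utilisation. */
    static double saturationTps(double utilisation, double tps) {
        return tps / utilisation;
    }

    public static void main(String[] args) {
        // TPS and database utilisation taken from the results table.
        String[] names = { "Baseline", "CDN50", "CDN60" };
        double[] tps = { 40, 50, 60 };
        double[] util = { 0.85, 0.63, 0.75 };
        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-8s demand=%.2f ms  saturation=%.1f TPS%n",
                names[i], demandMillis(util[i], tps[i]), saturationTps(util[i], tps[i]));
        }
    }
}
```

Under those assumptions the baseline works out to roughly 21ms of database time per transaction (saturating at about 47TPS), and both CDN rows to roughly 12.5ms (saturating at about 80TPS), which is at least consistent with the baseline being near its limit at 40TPS and the CDN version still coping at 70TPS.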

The model can also predict the load on the CDNs, and show resource usage for the CDN servers, which may be useful to ensure throttling doesn't occur.
