Spicing Up Autoscaling with Latency: Going beyond CPU and Memory

A talk by Charles Pretzer
Linkerd Service Mesh Sherpa and Field Engineer, Buoyant

About this talk

For autoscaling, CPU and memory are like salt and pepper: they're the beginning of a flavorful dish. Latency is the golden saffron that takes autoscaling to new heights.

This talk will show why it is critical to scale based on latency, as well as how to do it for your own service by combining Linkerd, Prometheus, and Kubernetes. We demonstrate how to use Linkerd to instrument your service and collect aggregated service latency, store those metrics in Prometheus, and expose them as custom metrics to the Horizontal Pod Autoscaler in Kubernetes. We then show how latency-based autoscaling outperforms CPU- and memory-based autoscaling under a variety of conditions, including live traffic from the attendees of this talk, and suggest ways to safely apply this technique to existing systems.
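The pipeline described above can be sketched as a HorizontalPodAutoscaler manifest that consumes a latency metric instead of CPU or memory. This is a minimal illustration, not the talk's exact configuration: the metric name `response_latency_p99`, the deployment name `my-service`, and the threshold are assumptions, and serving Prometheus metrics through the custom metrics API also requires a metrics adapter (such as prometheus-adapter) to be installed and configured.

```yaml
# Sketch only: assumes a metrics adapter exposes Linkerd's latency data
# via the custom.metrics.k8s.io API, and that the metric and deployment
# names below are adjusted to match your environment.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-service-latency-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-service                   # hypothetical deployment name
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: response_latency_p99   # hypothetical custom metric name
        target:
          type: AverageValue
          averageValue: 100m           # target: keep average p99 latency near ~100ms
```

With this in place, the HPA periodically queries the custom metrics API and adds replicas when the per-pod average of the latency metric rises above the target, rather than waiting for CPU or memory pressure to build.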
