Autoscaling

Autoscaling is an essential tool for modern applications that must adapt to fluctuating workloads. Monitoring system performance, provisioning or deprovisioning resources, and optimizing operational cost is labor-intensive and impractical to do manually. Autoscaling eases this management overhead by automating these tasks, so an operator does not need to continually monitor the system's performance and decide when to add or remove resources.

Scaling takes two forms: vertical scaling (scaling up or down) and horizontal scaling (scaling out or in). Vertical scaling involves redeploying the solution on different hardware, while horizontal scaling involves deploying the system on more or fewer resources. Vertical scaling is often disruptive because the system is temporarily unavailable while it is being redeployed, so it is uncommon to use autoscaling to implement a vertical scaling strategy. Horizontal scaling, on the other hand, is non-disruptive: the system can continue running without interruption while additional resources are provisioned.

To implement an effective autoscaling strategy, it is essential to have the following components in place:

instrumentation at the application level to capture key performance and scaling factors
monitoring components to observe these factors
decision-making logic to evaluate the monitored factors against predefined system thresholds
execution components responsible for carrying out tasks associated with scaling the system.

These components typically use tools and scripts to provision or deprovision resources, reconfigure the system, and test and validate the autoscaling strategy.
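The decision-making component described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a definitive implementation: the names ScalingPolicy and decide_instance_count, the threshold values, and the one-instance-at-a-time step are all hypothetical, and the utilization samples are assumed to come from the monitoring component as fractions between 0.0 and 1.0.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    """Hypothetical thresholds; the values are illustrative, not recommendations."""
    scale_out_above: float = 0.75  # average utilization that triggers scale-out
    scale_in_below: float = 0.30   # average utilization that triggers scale-in
    min_instances: int = 1
    max_instances: int = 10


def decide_instance_count(samples: list[float], current: int, policy: ScalingPolicy) -> int:
    """Return the desired instance count for a window of utilization samples."""
    if not samples:
        return current  # no monitoring data: leave the system unchanged

    average = sum(samples) / len(samples)
    if average > policy.scale_out_above:
        desired = current + 1
    elif average < policy.scale_in_below:
        desired = current - 1
    else:
        desired = current

    # Clamp so the execution component never leaves the allowed range.
    return max(policy.min_instances, min(policy.max_instances, desired))
```

Averaging over a window of samples rather than reacting to a single reading is one simple way to keep the system from scaling back and forth on short spikes. The returned count would then be handed to whatever execution component provisions or removes instances.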

It is important to implement an autoscaling strategy based on the specific requirements of the application rather than being driven by the features provided by any specific toolset. Scripting is still an essential skill, and a good autoscaling solution combines the features provided by the selected toolset with customizations in the form of scripts.

One critical consideration when designing an autoscaling solution is to make services stateless, so that a series of requests from an application does not have to be routed to the same instance of a service. If the solution implements a long-running task, it is crucial to design this task to support both scaling out and scaling in, because an instance may be removed while the task is still running. Refactoring a long-running task so that its processing is broken into smaller, discrete chunks can help avoid losing data if the process is forcibly terminated.
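One way to picture the chunking approach is the sketch below. It is an illustration under assumptions rather than a prescribed design: process_in_chunks, handle_chunk, and should_stop are hypothetical names, and the plain dictionary used as a checkpoint stands in for durable storage (a database row or a queue message, for example) that would survive the loss of the instance.

```python
from typing import Callable, Sequence


def process_in_chunks(
    items: Sequence[int],
    checkpoint: dict,
    handle_chunk: Callable[[Sequence[int]], None],
    should_stop: Callable[[], bool],
    chunk_size: int = 100,
) -> bool:
    """Process items in chunks, resuming from checkpoint["next_index"].

    Returns True when every item has been processed, or False if the
    task stopped early because should_stop() reported a shutdown.
    """
    index = checkpoint.get("next_index", 0)
    while index < len(items):
        if should_stop():
            return False  # instance is being removed; saved progress is kept
        chunk = items[index:index + chunk_size]
        handle_chunk(chunk)
        index += len(chunk)
        # Record progress only after the chunk has been handled successfully.
        checkpoint["next_index"] = index
    return True
```

If the instance is terminated between chunks, another instance can call the same function with the saved checkpoint and continue from the first unprocessed item. At most one chunk of work is repeated, so handle_chunk should be safe to run twice on the same data.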

It is also crucial to monitor the autoscaling process and log the details of each autoscaling event, so that the effectiveness of the autoscaling strategy can be measured and the strategy tuned if necessary. Additionally, be mindful of the time it takes to provision and start new instances of a service or add resources to a system, because the peak in demand may have passed by the time these additional resources are available. In the meantime, throttling the service can help manage the demand and ensure that the system maintains adequate performance and meets service level agreements.
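The text does not say how throttling should be implemented. A token bucket is one common technique, and the sketch below assumes it purely for illustration: the TokenBucket class, its parameters, and the injectable clock are hypothetical choices, not part of any particular toolset.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow short bursts up to `capacity`, then `rate_per_second` on average."""

    def __init__(
        self,
        rate_per_second: float,
        capacity: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate_per_second
        self.capacity = capacity
        self.clock = clock  # injectable so the behavior can be tested without sleeping
        self.tokens = float(capacity)
        self.last_refill = clock()

    def allow(self) -> bool:
        """Return True if a request may proceed, False if it should be rejected."""
        now = self.clock()
        elapsed = now - self.last_refill
        # Refill in proportion to elapsed time, never beyond the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A service could call allow() for each incoming request and reject or queue the request when it returns False, which keeps the existing instances within their capacity while new ones are still starting.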