AutoScale

NOTICE: AutoScale is an experimental feature and currently only works on Kubernetes deployments.

Overview

WeScale's AutoScale feature automatically adjusts the resources allocated to your database nodes based on real-time workload demands. By dynamically scaling CPU and memory resources, AutoScale ensures optimal performance and resource utilization without manual intervention.

Prerequisites

  • InPlacePodVerticalScaling: Ensure that the InPlacePodVerticalScaling feature is enabled on your Kubernetes cluster. As of now, Amazon EKS does not support this feature, but you can try it on Minikube.
  • Metrics Server: Install the Kubernetes Metrics Server to collect resource usage metrics.
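On a local Minikube cluster, the prerequisites can be set up roughly as follows. This is a sketch: the feature-gate flag and the Minikube addon are standard, but verify the exact Kubernetes version and manifest against your environment.

```shell
# Start Minikube with the InPlacePodVerticalScaling feature gate enabled
# (the feature requires Kubernetes v1.27 or later).
minikube start --kubernetes-version=v1.27.3 --feature-gates=InPlacePodVerticalScaling=true

# Install the Metrics Server. On Minikube the bundled addon is simplest;
# on other clusters, apply the official metrics-server manifest instead.
minikube addons enable metrics-server

# Confirm that resource usage metrics are being collected.
kubectl top pods -n default
```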

How AutoScale Works

WeScale's architecture consists of:

  • Proxy Node (VTGate): Acts as the entry point for client requests, handling query routing and load balancing.
  • Database Nodes (wesql-server): Store and manage your data.
  • Sidecar (VTTablet): Deployed alongside each wesql-server in the same Pod, handling traffic from the Proxy Node.

AutoScale operates within a Kubernetes (k8s) cluster and involves three main modules:

  1. Data Collection Module: Monitors the CPU and memory usage of wesql-server Pods.
  2. Instruction Module: Adjusts the CPU and memory requests of Pods using the Kubernetes API and the InPlacePodVerticalScaling feature, allowing resource updates without Pod restarts.
  3. Decision Module: Determines when to scale resources to maintain CPU and memory utilization at target levels (typically 90% for CPU and 75% for memory).
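The decision step can be sketched as follows. This is a hypothetical illustration of the logic described above, not WeScale's actual implementation; it assumes 1 compute unit = 1 vCPU + 4 GiB memory and the default coupled (non-relaxed) scaling mode.

```python
def desired_compute_units(cpu_usage_cores, mem_usage_gib,
                          cpu_ratio=0.9, mem_ratio=0.75,
                          lower=0.5, upper=10.0):
    """Return the compute units needed so that current usage sits at the
    target utilization ratios (90% CPU, 75% memory by default)."""
    units_for_cpu = cpu_usage_cores / cpu_ratio          # 1 unit = 1 vCPU
    units_for_mem = (mem_usage_gib / 4.0) / mem_ratio    # 1 unit = 4 GiB
    # In the default (non-relaxed) mode, CPU and memory scale together,
    # so take whichever dimension demands more.
    units = max(units_for_cpu, units_for_mem)
    # Never scale below the lower bound or above the upper bound.
    return min(max(units, lower), upper)
```

For example, a Pod using 0.9 cores and 3 GiB sits exactly at the default targets with 1 compute unit, so no resize is needed.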

Enabling AutoScale

To enable AutoScale, you need to configure the relevant parameters in your WeScale deployment. By default, AutoScale is disabled.

Steps to Enable

Use kubectl edit to modify the WeScale configuration ConfigMap. For example:

kubectl edit configmap wescale-config

Modify your WeScale configuration to include the AutoScale parameters with your desired settings.

enable_auto_scale: true
auto_scale_decision_making_interval: 5s
auto_scale_compute_unit_lower_bound: 0.5
auto_scale_compute_unit_upper_bound: 10
auto_scale_cpu_ratio: 0.9
auto_scale_memory_ratio: 0.75
auto_scale_use_relaxed_cpu_memory_ratio: false
auto_scale_cluster_namespace: default
auto_scale_data_node_pod_name: mycluster-wesql-0-0
auto_scale_data_node_stateful_set_name: mycluster-wesql-0
auto_scale_logger_node_pod_name: mycluster-wesql-1-0,mycluster-wesql-2-0
auto_scale_logger_node_stateful_set_name: mycluster-wesql-1,mycluster-wesql-2
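After saving your edits, you can confirm that the ConfigMap contains the expected values (assuming the ConfigMap name from the example above):

```shell
# Dump the full ConfigMap and check the AutoScale keys.
kubectl get configmap wescale-config -o yaml
```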

Observing AutoScale in Action

To help you observe the effects of AutoScale, follow this step-by-step example.

Step 1: Verify Initial Resource Allocation

View Desired Resources (Pod Spec)

Check the initial desired CPU and memory requests and limits specified in the Pod's spec:

kubectl get pod mycluster-wesql-0-0 -n default -o jsonpath='{.spec.containers[?(@.name=="mysql")].resources}'

This command outputs the resources as specified in the Pod's specification, which represent the desired state set by AutoScale.

View Current Allocated Resources

To see the resources that Kubernetes has allocated to the container, especially during an in-place resize, run:

kubectl get pod mycluster-wesql-0-0 -n default -o jsonpath='{.status.containerStatuses[?(@.name=="mysql")].allocatedResources}'

This shows the allocatedResources, indicating the resources that have been assigned to the container but may not yet be fully in use.

View Actual Resources in Use

To view the current actual resource limits and requests that the container is using:

kubectl get pod mycluster-wesql-0-0 -n default -o jsonpath='{.status.containerStatuses[?(@.name=="mysql")].resources}'

This displays the resources under status.containerStatuses, representing the actual resource configuration of the running container.

Step 2: Simulate Increased Workload

Generate a workload that increases CPU and memory usage. You can use a tool like sysbench to simulate load; run the prepare step once to create the test tables, then start the benchmark:

sysbench --mysql-host=<proxy_host> --mysql-user=<user> --mysql-password=<password> --mysql-db=<db> oltp_read_write prepare
sysbench --mysql-host=<proxy_host> --mysql-user=<user> --mysql-password=<password> --mysql-db=<db> --threads=50 --time=600 oltp_read_write run

Step 3: Observe AutoScale Adjustments

After the auto_scale_decision_making_interval (e.g., 5 seconds), AutoScale evaluates the resource utilization and adjusts the resource requests if necessary.

Re-verify Desired Resources

Check the updated desired resources:

kubectl get pod mycluster-wesql-0-0 -n default -o jsonpath='{.spec.containers[?(@.name=="mysql")].resources}'

Check Allocated Resources

View the allocatedResources to see if Kubernetes has allocated new resources:

kubectl get pod mycluster-wesql-0-0 -n default -o jsonpath='{.status.containerStatuses[?(@.name=="mysql")].allocatedResources}'

Check Actual Resources in Use

Verify if the container is now using the new resources:

kubectl get pod mycluster-wesql-0-0 -n default -o jsonpath='{.status.containerStatuses[?(@.name=="mysql")].resources}'

Step 4: Observe Pod Resize Process

Check Pod Resize Status

To see if a resource update is in progress, check the Pod's resize status:

kubectl get pod mycluster-wesql-0-0 -n default -o jsonpath='{.status.resize}'

If the output is InProgress, the resource adjustment is ongoing.
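To watch the resize settle without re-running the command by hand, you can poll the status in a small loop (a sketch, reusing the Pod name and namespace from the examples above):

```shell
# Poll the resize status once per second until it is no longer InProgress.
while [ "$(kubectl get pod mycluster-wesql-0-0 -n default -o jsonpath='{.status.resize}')" = "InProgress" ]; do
  echo "resize in progress..."
  sleep 1
done
echo "resize settled"
```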

Examine Pod Conditions

Check the Pod's conditions for ResourcesAllocated:

kubectl get pod mycluster-wesql-0-0 -n default -o jsonpath='{.status.conditions[?(@.type=="ResourcesAllocated")]}'

This condition indicates whether the new resources have been successfully allocated and are in use.

Step 5: Reduce Workload to Trigger Scaling Down

Stop or reduce the workload:

# If you used sysbench, stop it (e.g., with Ctrl+C) or let the --time limit expire

Wait for the auto_scale_decision_making_interval to pass, then check whether AutoScale reduces the resource requests.

Re-verify Desired Resources

kubectl get pod mycluster-wesql-0-0 -n default -o jsonpath='{.spec.containers[?(@.name=="mysql")].resources}'

Re-verify Allocated Resources

kubectl get pod mycluster-wesql-0-0 -n default -o jsonpath='{.status.containerStatuses[?(@.name=="mysql")].allocatedResources}'

Re-verify Actual Resources

kubectl get pod mycluster-wesql-0-0 -n default -o jsonpath='{.status.containerStatuses[?(@.name=="mysql")].resources}'

The CPU and memory requests should have decreased but not below the auto_scale_compute_unit_lower_bound.


Configuration Details

Configuration Parameters

Below are the parameters to configure AutoScale:

  • enable_auto_scale: Enables or disables the AutoScale feature.

    • Default: false
    • Options: true or false
  • auto_scale_decision_making_interval: The interval at which the decision module evaluates whether to scale resources.

    • Default: 5s
    • Example: 5s, 30s, 1m
  • auto_scale_compute_unit_lower_bound: The minimum compute units that can be allocated to a Pod.

    • Default: 0.5
    • Example: 0.5, 1, 2
  • auto_scale_compute_unit_upper_bound: The maximum compute units that can be allocated to a Pod.

    • Default: 10
    • Example: 4, 8, 16
  • auto_scale_cpu_ratio: The target CPU utilization ratio.

    • Default: 0.9 (90%)
    • Example: 0.75 (75%), 0.85 (85%)
  • auto_scale_memory_ratio: The target memory utilization ratio.

    • Default: 0.75 (75%)
    • Example: 0.70 (70%), 0.80 (80%)
  • auto_scale_use_relaxed_cpu_memory_ratio: When set to true, CPU and memory can be adjusted independently; otherwise, they are adjusted together, with every 1 core CPU paired with 4 GiB memory.

    • Default: false
    • Options: true or false
  • auto_scale_cluster_namespace: The Kubernetes namespace where your WeScale cluster is deployed.

    • Default: default
    • Example: production, database-cluster
  • auto_scale_data_node_pod_name: The name of the data node Pod to be scaled.

    • Example: mycluster-wesql-0-0
  • auto_scale_data_node_stateful_set_name: The name of the StatefulSet for the data node.

    • Example: mycluster-wesql-0
  • auto_scale_logger_node_pod_name: Comma-separated names of logger node Pods.

    • Example: mycluster-wesql-1-0,mycluster-wesql-2-0
  • auto_scale_logger_node_stateful_set_name: Comma-separated names of StatefulSets for logger nodes.

    • Example: mycluster-wesql-1,mycluster-wesql-2

Understanding Compute Units

  • Compute Unit: Represents a certain amount of CPU and memory resources (e.g., 1 compute unit = 1 vCPU and 4 GiB memory).
  • Lower and Upper Bounds: The auto_scale_compute_unit_lower_bound and auto_scale_compute_unit_upper_bound parameters prevent the system from allocating too few or too many resources, which could lead to performance issues or resource wastage.
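Translating compute units into the CPU and memory quantities Kubernetes expects can be sketched as below. This is illustrative only and assumes the 1 vCPU / 4 GiB unit described above.

```python
def units_to_resources(units):
    """Convert compute units to Kubernetes resource quantity strings,
    assuming 1 unit = 1 vCPU and 4 GiB memory (illustrative only)."""
    cpu_milli = int(units * 1000)       # vCPUs expressed in millicores
    mem_mib = int(units * 4 * 1024)     # 4 GiB per unit, expressed in MiB
    return {"cpu": f"{cpu_milli}m", "memory": f"{mem_mib}Mi"}
```

So the default lower bound of 0.5 units corresponds to a 500m CPU / 2048Mi memory request, and the upper bound of 10 units to 10000m / 40960Mi.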

Relaxed CPU and Memory Ratios

  • auto_scale_use_relaxed_cpu_memory_ratio: When set to true, the decision module allows CPU and memory to be adjusted independently, offering more flexibility in environments with unpredictable workloads.
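The difference between the two modes can be shown with a small sketch (hypothetical, not WeScale's implementation): in coupled mode the larger of the two demands is applied to both dimensions, while relaxed mode sizes each independently.

```python
def scale(units_for_cpu, units_for_mem, relaxed=False):
    """Coupled mode applies the larger demand to both CPU and memory;
    relaxed mode sizes each dimension independently (illustrative)."""
    if relaxed:
        return {"cpu_units": units_for_cpu, "mem_units": units_for_mem}
    both = max(units_for_cpu, units_for_mem)
    return {"cpu_units": both, "mem_units": both}
```

A CPU-heavy workload needing 2 units of CPU but only 0.5 of memory would, in coupled mode, receive 2 units of both; relaxed mode avoids that memory over-allocation.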

Best Practices

  • Set Realistic Bounds: Ensure that the compute unit bounds align with your cluster's node capacities.
  • Adjust Decision Interval: Set the auto_scale_decision_making_interval based on how quickly your workloads change. A shorter interval allows for more responsive scaling but may increase overhead.
  • Monitor Regularly: Regularly monitor resource utilization and scaling actions to fine-tune your configuration over time.

Limitations

  • Kubernetes Version: The InPlacePodVerticalScaling feature requires Kubernetes version 1.27 or later.
  • Resource Availability: Scaling is subject to the availability of resources on your Kubernetes nodes. Ensure that nodes have sufficient capacity to accommodate scaling.
  • Single Namespace: The current configuration assumes that all Pods are within the single namespace specified by auto_scale_cluster_namespace.
  • Metrics Server: Ensure that the Kubernetes Metrics Server is installed and running to collect resource usage data.

By following this guide and observing the AutoScale feature in action, you can effectively manage your WeScale deployment to handle fluctuating workloads efficiently. Remember to monitor the resource allocations and utilization closely to ensure that your applications run smoothly and resources are used optimally.