Auto-Scaling Guide

Updated: Oct 24, 2025

CloudNexus Auto-Scaling adjusts the number of running instances in your cluster based on real-time metrics, so your application can absorb traffic spikes while keeping costs down during low-demand periods.

Note: Auto-Scaling is available on Professional and Enterprise plans. The Starter plan supports manual scaling only.

Overview

Auto-Scaling works by monitoring defined metrics and executing scaling policies when thresholds are breached. The system supports three modes:

Target Tracking: Maintains a specific metric value.
Step Scaling: Adjusts capacity based on the size of the alarm breach.
Scheduled Scaling: Scales according to a predictable schedule.

Configuration

You can configure auto-scaling via the CloudNexus Console, CLI, or Infrastructure-as-Code (IaC).

Using the Console

1. Navigate to Compute > Clusters.
2. Select your target cluster.
3. Click the Scaling tab and toggle Enable Auto-Scaling.
4. Define min/max instances and select your scaling policy.
5. Click Save Configuration.

CLI Configuration

Use the cx scale command to configure auto-scaling from the terminal.

cx scale --cluster prod-web-cluster \
  --min-instances 2 \
  --max-instances 10 \
  --policy target-tracking \
  --metric cpu-utilization \
  --target-value 70

IaC Example (CloudNexus YAML)

version: v2
resources:
  cluster:
    name: production-app
    instances:
      min: 2
      max: 12
    autoScaling:
      enabled: true
      strategy: target-tracking
      cooldown: 300s
      metrics:
        - type: cpu
          threshold: 75
        - type: memory
          threshold: 85
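Before applying a configuration like the one above, a quick sanity check can catch common mistakes such as a minimum larger than the maximum. The following is a minimal sketch in Python that works on a dictionary mirroring the YAML example. The helper names (`parse_seconds`, `validate_autoscaling`) and the rules they enforce are assumptions made for illustration and are not part of any CloudNexus tooling.

```python
import re


def parse_seconds(value: str) -> int:
    """Parse a duration such as '300s' into whole seconds."""
    match = re.fullmatch(r"(\d+)s", value)
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    return int(match.group(1))


def validate_autoscaling(cluster: dict) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    instances = cluster.get("instances", {})
    low, high = instances.get("min"), instances.get("max")
    if low is None or high is None:
        problems.append("instances.min and instances.max are required")
    elif low > high:
        problems.append("instances.min must not exceed instances.max")

    scaling = cluster.get("autoScaling", {})
    if not scaling.get("enabled"):
        return problems

    # A missing cooldown is accepted here because the guide documents a default.
    cooldown = scaling.get("cooldown")
    if cooldown is not None:
        try:
            parse_seconds(cooldown)
        except (TypeError, ValueError) as exc:
            problems.append(f"autoScaling.cooldown: {exc}")

    # Assumed rule: thresholds are percentages in the range (0, 100].
    for metric in scaling.get("metrics", []):
        threshold = metric.get("threshold")
        if not isinstance(threshold, (int, float)) or not 0 < threshold <= 100:
            problems.append(
                f"metric {metric.get('type')!r}: threshold must be between 0 and 100"
            )
    return problems


# Mirrors the YAML example above.
example = {
    "name": "production-app",
    "instances": {"min": 2, "max": 12},
    "autoScaling": {
        "enabled": True,
        "strategy": "target-tracking",
        "cooldown": "300s",
        "metrics": [
            {"type": "cpu", "threshold": 75},
            {"type": "memory", "threshold": 85},
        ],
    },
}
print(validate_autoscaling(example))  # prints []
```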

Supported Metrics

CloudNexus supports a wide range of built-in and custom metrics for scaling decisions.

Metric Name	Description	Granularity
`CPU_Utilization`	Average CPU usage across instances	1 minute
`Memory_Usage`	RAM consumption percentage	1 minute
`Network_In`	Inbound network traffic (Mbps)	30 seconds
`Request_Count`	HTTP requests per second (per instance)	15 seconds
`Custom_Metric`	User-defined metrics via Agent	Variable

Scaling Policies

Target Tracking

Maintains a metric at a specific average value, for example keeping CPU utilization at 60%.

Recommendation: Target Tracking is the easiest to configure and works well for most steady-state workloads.
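To make the idea concrete, here is a minimal sketch of how a target-tracking decision could be computed. It uses a common proportional rule (scale the instance count by the ratio of the observed metric to the target, then clamp to the configured limits). This rule is an assumption for illustration; the guide does not state which formula CloudNexus uses internally, and the function name is hypothetical.

```python
import math


def target_tracking_desired(
    current_instances: int,
    observed: float,
    target: float,
    min_instances: int,
    max_instances: int,
) -> int:
    """Return the instance count that would bring the average metric near `target`."""
    if target <= 0:
        raise ValueError("target must be positive")
    # Proportional rule: if the metric is 1.5x the target, run 1.5x the instances.
    raw = math.ceil(current_instances * observed / target)
    # Never go below the configured minimum or above the configured maximum.
    return max(min_instances, min(max_instances, raw))


# 4 instances at 90% CPU with a 60% target suggests scaling out to 6.
print(target_tracking_desired(4, 90, 60, min_instances=2, max_instances=10))  # prints 6
```

With the same limits, 4 instances at 30% CPU would scale in to 2, and a very large breach is capped at the maximum of 10.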

Step Scaling

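Defines scaling adjustments based on how far the metric exceeds its threshold, so a small breach and a large breach can trigger different capacity changes. This is useful for workloads whose load varies widely.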
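The sketch below illustrates the step idea under stated assumptions: each step is a pair of (minimum breach, instances to add), and the step values shown are invented for the example rather than CloudNexus defaults. The function name is hypothetical.

```python
def step_adjustment(observed: float, threshold: float, steps: list) -> int:
    """Return how many instances to add for the current breach.

    `steps` is a list of (min_breach, instances_to_add) pairs sorted by
    ascending min_breach. The last step whose min_breach is reached wins.
    """
    breach = observed - threshold
    if breach <= 0:
        return 0  # metric is at or below the threshold: no scaling
    adjustment = 0
    for min_breach, instances_to_add in steps:
        if breach >= min_breach:
            adjustment = instances_to_add
    return adjustment


# Hypothetical steps for a 70% CPU threshold:
# up to 10 points over adds 1, 10 to 20 points over adds 2, 20 or more adds 4.
STEPS = [(0, 1), (10, 2), (20, 4)]

for cpu in (65, 75, 85, 95):
    print(cpu, step_adjustment(cpu, 70, STEPS))
```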

Scheduled Scaling

Scales the instance count at predefined times. This is ideal for predictable traffic patterns such as daily batch jobs or retail hours.

# Schedule scaling for business hours (09:00 - 18:00 UTC)
# Times are quoted so the shell passes "09:00 UTC" as a single argument.
cx scale schedule --cluster retail-api \
  --start "09:00 UTC" \
  --end "18:00 UTC" \
  --desired 5
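The window in the command above can be modeled with a few lines of Python. This is a sketch only: the guide does not say what capacity applies outside the window, so the `baseline` parameter is an assumption, as is the function name.

```python
from datetime import datetime, time, timezone


def scheduled_desired(
    now: datetime, start: time, end: time, desired: int, baseline: int
) -> int:
    """Return `desired` inside the [start, end) UTC window, else `baseline`."""
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    current = now.astimezone(timezone.utc).time()
    if start <= end:
        in_window = start <= current < end
    else:
        # Window wraps past midnight, e.g. 22:00 to 06:00.
        in_window = current >= start or current < end
    return desired if in_window else baseline


noon = datetime(2025, 10, 24, 12, 0, tzinfo=timezone.utc)
print(scheduled_desired(noon, time(9, 0), time(18, 0), desired=5, baseline=2))  # prints 5
```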

Best Practices

1. Set Realistic Minimums

Always set a minimum instance count that ensures high availability. For production workloads, a minimum of 2 instances is recommended to handle single-node failures.

2. Configure Cool-Down Periods

Set an appropriate cool-down period (default: 300s) to prevent flapping, that is, rapid alternation between scaling out and scaling in. The cool-down gives new instances time to initialize and metrics time to stabilize before the next scaling event.
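A cool-down can be pictured as a gate that rejects scaling requests arriving too soon after the previous one. The class below is a minimal sketch of that idea, not the CloudNexus implementation; the class and method names are hypothetical, and timestamps are plain seconds so the example stays deterministic.

```python
class CooldownGate:
    """Allow a scaling event only if the cool-down since the last one has elapsed."""

    def __init__(self, cooldown_seconds: float = 300.0):
        self.cooldown_seconds = cooldown_seconds
        self._last_event = None  # timestamp of the last allowed event

    def try_scale(self, now: float) -> bool:
        """Return True and record the event if scaling is allowed at time `now`."""
        if (
            self._last_event is not None
            and now - self._last_event < self.cooldown_seconds
        ):
            return False  # still cooling down: ignore this trigger
        self._last_event = now
        return True


gate = CooldownGate(cooldown_seconds=300)
print(gate.try_scale(0))    # prints True (first event is always allowed)
print(gate.try_scale(120))  # prints False (only 120s since the last event)
print(gate.try_scale(300))  # prints True (cool-down has elapsed)
```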

3. Monitor Health Checks

Ensure your load balancer health checks are properly configured. Auto-Scaling replaces unhealthy instances, so it depends on accurate health checks to decide which instances to replace.
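One common way to keep health checks accurate is to require several consecutive failures before treating an instance as unhealthy, so a single slow response does not trigger a replacement. The sketch below shows that rule. The threshold of 3 and the function name are assumptions for illustration; check your load balancer settings for the actual values.

```python
def should_replace(results: list, unhealthy_threshold: int = 3) -> bool:
    """Decide whether an instance should be replaced.

    `results` holds health check outcomes, oldest first (True = passed).
    The instance is considered unhealthy only if the most recent
    `unhealthy_threshold` checks all failed.
    """
    if unhealthy_threshold < 1:
        raise ValueError("unhealthy_threshold must be at least 1")
    recent = results[-unhealthy_threshold:]
    return len(recent) == unhealthy_threshold and not any(recent)


print(should_replace([True, False, False, False]))  # prints True
print(should_replace([False, True, False, False]))  # prints False
```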

4. Use Predictive Scaling

Enterprise users can enable Predictive Scaling, which uses ML to analyze historical data and scale proactively before traffic spikes occur.

Warning: Avoid overly aggressive scaling thresholds (e.g., scaling out when CPU exceeds 30%). These can cause unnecessary scaling events and increased costs. We recommend thresholds between 60% and 80%.

Troubleshooting

Auto-Scaling Not Triggering

Verify that the Auto-Scaling agent is installed and running on instances.
Check the Scaling Events logs in the console for errors.
Ensure metrics are being reported (check Monitoring dashboard).

Instances Terminating Unexpectedly

Check whether instances are failing health checks.
Check whether a scale-in policy is active during periods of low load.
Ensure Protection Mode is enabled on critical instances.

# Check scaling logs
cx logs scaling --cluster prod-web \
  --since 24h \
  --level error
