Comparing native autoscaling modes in AVD

Autoscaling is crucial for optimizing the cost of an AVD deployment. By default, session hosts incur both compute costs (for the VM) and storage costs (for the OS disk). Powering off (deallocating) VMs pauses compute costs, but storage costs persist. Significant cost savings come from automatically scaling out as demand grows and scaling in hosts when they are no longer in use.
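The billing behavior described above can be sketched as a small cost model. This is only an illustration: the `HostRates` type, the function name, and the hourly rates used below are hypothetical placeholders, not Azure prices.

```python
from dataclasses import dataclass


@dataclass
class HostRates:
    # Hypothetical hourly rates, for illustration only (not real Azure pricing).
    compute_per_hour: float
    disk_per_hour: float


def host_cost(rates: HostRates, hours_running: float, hours_deallocated: float) -> float:
    """Cost of one session host over a period.

    Compute is billed only while the VM is running; the OS disk is billed
    for as long as the VM exists, running or deallocated. A deleted host
    contributes no hours to either argument.
    """
    compute = rates.compute_per_hour * hours_running
    storage = rates.disk_per_hour * (hours_running + hours_deallocated)
    return compute + storage


rates = HostRates(compute_per_hour=0.50, disk_per_hour=0.01)
always_on = host_cost(rates, hours_running=720, hours_deallocated=0)
deallocated_when_idle = host_cost(rates, hours_running=200, hours_deallocated=520)
deleted_when_idle = host_cost(rates, hours_running=200, hours_deallocated=0)
```

With these made-up rates, deallocating idle hosts removes most of the compute bill, and deleting them also removes the disk charge for the idle hours.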

Two autoscaling options

Today, Azure natively has two autoscaling methods available:

Power management autoscaling (GA)
Dynamic autoscaling (Preview)

Power management autoscaling powers on and deallocates existing session hosts based on capacity thresholds. Dynamic autoscaling creates and deletes session hosts, up to a configured maximum, based on similar capacity thresholds.

Scaling plans

A scaling plan is the Azure resource that defines autoscaling logic for a host pool. Scaling plans have a one-to-many relationship with host pools: a scaling plan can be applied to multiple host pools, but a host pool can be associated with only one scaling plan.

In effect, you are limited to one set of rules per host pool. Scaling plans can be enabled on specific days of the week as needed. Each scaling plan has four phases, and admins control the time period for each:

Ramp-up
Peak
Ramp-down
Off-peak
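As a rough sketch of how the four phases partition a day, the function below maps a time of day to a phase. The start times, the phase keys, and the function name are assumptions for illustration, and the sketch assumes the four start times ascend within a single day.

```python
from datetime import time

# Hypothetical schedule: each phase runs from its start time until the next one begins.
SCHEDULE = {
    "ramp_up": time(7, 0),
    "peak": time(9, 0),
    "ramp_down": time(17, 0),
    "off_peak": time(20, 0),
}


def current_phase(now: time, starts: dict) -> str:
    """Return the phase in effect at `now`."""
    # Before ramp-up begins, the previous day's off-peak phase is still in effect.
    phase = "off_peak"
    for name in ("ramp_up", "peak", "ramp_down", "off_peak"):
        if now >= starts[name]:
            phase = name
    return phase
```

For example, `current_phase(time(8, 30), SCHEDULE)` falls in the ramp-up window, while any time before 07:00 is treated as off-peak.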

Power management autoscaling overview

Power management autoscaling uses defined thresholds to power session hosts on or off. It assumes that you have already created enough hosts to provide sufficient capacity.

For each phase, admins specify the minimum percentage of active hosts and a capacity threshold that triggers scale-out actions (where capacity = active sessions / total session capacity of powered-on hosts). Total session capacity is derived from the maximum session limit applied to each host and the number of active hosts at a given time.
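The capacity calculation above can be sketched as follows. The function and parameter names are hypothetical, and treating "breached" as strictly greater than the threshold is an assumption.

```python
def used_capacity_pct(active_sessions: int, active_hosts: int, max_session_limit: int) -> float:
    """Active sessions as a percentage of the session capacity of powered-on hosts."""
    total_capacity = active_hosts * max_session_limit
    if total_capacity == 0:
        # No powered-on hosts: any session demand counts as fully saturated.
        return 100.0 if active_sessions > 0 else 0.0
    return 100.0 * active_sessions / total_capacity


def should_scale_out(active_sessions: int, active_hosts: int,
                     max_session_limit: int, threshold_pct: float) -> bool:
    """True when used capacity exceeds the threshold (assumed strict comparison)."""
    return used_capacity_pct(active_sessions, active_hosts, max_session_limit) > threshold_pct
```

With 3 powered-on hosts and a maximum session limit of 10, 20 active sessions give roughly 66.7% used capacity, which would exceed a 60% threshold; 15 sessions (50%) would not.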

During ramp-up and peak phases, breadth-first load balancing is used to spread sessions across provisioned resources. In ramp-down and off-peak phases, depth-first load balancing is used to broker sessions away from empty hosts so that they can be powered off.
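To illustrate the difference between the two load-balancing modes, here is a simplified sketch of host selection for a new session. It is not the AVD broker's actual logic; the function name, the host-name keys, and the tie-breaking behavior are assumptions.

```python
from typing import Dict, Optional


def pick_host(sessions_by_host: Dict[str, int], max_session_limit: int,
              algorithm: str) -> Optional[str]:
    """Choose a running host for a new session, or None if every host is full."""
    # Only hosts below the maximum session limit can accept a new session.
    candidates = {h: n for h, n in sessions_by_host.items() if n < max_session_limit}
    if not candidates:
        return None
    if algorithm == "breadth_first":
        # Spread load: prefer the host with the fewest sessions.
        return min(candidates, key=candidates.get)
    if algorithm == "depth_first":
        # Fill hosts up: prefer the busiest host that still has room,
        # leaving lightly used hosts free to drain and power off.
        return max(candidates, key=candidates.get)
    raise ValueError(f"unknown algorithm: {algorithm}")
```

Given hosts with 3, 0, and 7 sessions and a limit of 8, breadth-first would pick the empty host while depth-first would pick the host with 7 sessions.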

Unused session hosts are powered off and only powered on again when the capacity threshold is breached or the minimum percentage of active hosts requires it.

Dynamic autoscaling overview

Dynamic autoscaling uses defined thresholds and host pool limits to automatically start, stop, create, or delete session hosts based on user demand and available capacity. Unlike power management autoscaling, dynamic autoscaling does not require pre-deployed hosts, since it can provision new ones.
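One way to picture how the two settings interact is to compute a target number of running hosts. This is a simplification under stated assumptions: the real service reacts incrementally rather than solving for a target, the names are hypothetical, and percentages are assumed to be whole numbers with a threshold above zero.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def target_running_hosts(total_hosts: int, active_sessions: int, max_session_limit: int,
                         min_active_pct: int, threshold_pct: int) -> int:
    """Running hosts needed to satisfy both the floor and the capacity threshold."""
    # Floor imposed by the minimum percentage of active hosts.
    floor_hosts = ceil_div(total_hosts * min_active_pct, 100)
    # Smallest host count n such that sessions / (n * limit) <= threshold.
    needed = ceil_div(active_sessions * 100, max_session_limit * threshold_pct)
    # Power management cannot run more hosts than already exist.
    return min(total_hosts, max(floor_hosts, needed))
```

For a pool of 10 hosts with a session limit of 10, a 20% minimum, and a 60% threshold, 45 active sessions call for 8 running hosts, while an idle pool settles at the floor of 2.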

For each phase, admins specify the minimum host pool size, in effect the capacity floor (including powered-off VMs); the maximum host pool size, which acts as the capacity ceiling; the minimum percentage of active hosts; and the capacity threshold that triggers scaling actions.

During ramp-up and peak phases, autoscale powers on existing stopped VMs and creates additional hosts as needed, up to the maximum host pool size. During the ramp-down and off-peak phases, autoscale consolidates sessions onto fewer hosts and powers off and/or deletes unused hosts. Depth-first load balancing is used in the latter phases to broker sessions away from hosts so that they can be deallocated or deleted.
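The scale-out order described above (start stopped hosts first, then create new ones up to the ceiling) can be sketched like this. The function and parameter names are hypothetical, and the sketch ignores timing, failures, and the ramp-down path.

```python
from typing import Tuple


def plan_scale_out(running: int, stopped: int, needed_running: int,
                   max_pool_size: int) -> Tuple[int, int]:
    """Return (hosts_to_start, hosts_to_create) to reach `needed_running`."""
    shortfall = max(0, needed_running - running)
    # Prefer starting existing stopped hosts, since they already exist in the pool.
    to_start = min(shortfall, stopped)
    # New hosts may only be created while the pool is below its maximum size.
    headroom = max(0, max_pool_size - (running + stopped))
    to_create = min(shortfall - to_start, headroom)
    return to_start, to_create
```

With 3 running and 2 stopped hosts, a need for 8 running hosts, and a maximum pool size of 6, the plan starts both stopped hosts and creates one more; the remaining demand goes unmet because the ceiling has been reached.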

Importantly, if the minimum percentage of active hosts is set to 100%, autoscale only creates or deletes hosts when host pool usage exceeds or drops below the defined capacity threshold.

Note: dynamic autoscaling requires Session Host Configuration, which is also in Preview.

Takeaways

Here’s a quick table summarizing the key differences for admins:

Category	Power Management	Dynamic
Requires pre-deployed hosts?	Yes. It only powers existing hosts on or off.	No. Autoscale can create hosts dynamically using the session host configuration.
Minimum percentage of active hosts	Defines the percentage of hosts to be started based on existing hosts and defined session limits.	Defines the percentage of hosts to be started (created) based on existing hosts and the maximum host pool size.
Maximum host pool size	N/A	Defines the ceiling for total active hosts, including those dynamically created.
Minimum host pool size	N/A	Defines the number of hosts that are always part of the pool (running or stopped).
Capacity threshold	Compared against active sessions divided by total session capacity; exceeding it triggers power-on actions.	Compared against active sessions divided by total session capacity; exceeding it triggers power-on or create actions.
Ramp-up & Peak	Breadth-first LB spreads sessions across running hosts.	Breadth-first LB spreads sessions; autoscale powers on stopped hosts and creates new ones as needed.
Ramp-down & Off-peak	Depth-first LB brokers sessions so that unused hosts can be shut down.	Depth-first LB consolidates sessions; autoscale powers off (and/or deletes) unused hosts.
Scaling actions	Power-on and shutdown.	Power-on and shutdown, create and delete.
Cost optimization	Reduces compute costs through host deallocation.	Reduces compute costs through host deallocation, and both compute and storage costs through deletion.
User experience	Users may trigger VM starts with Start VM on Connect and wait for the VM to power on.	Users may trigger VM starts or creations with Start VM on Connect. Creating takes longer than starting, so they may wait for either.

In closing

Persistent desktop environments in enterprise organizations are slowly migrating to Windows 365. Where persistent desktops remain on AVD, power management autoscaling is typically used because those VMs cannot be deleted. Enterprises may opt against Windows 365 for these use cases because of security requirements such as Confidential Compute.

In future posts, we will consider how native capabilities stack up against third-party solutions such as Hydra. Thanks for reading.