Autoscaling and Fly
Autoscaling as described here is no longer the default for new Fly deployments and should therefore be considered deprecated. Read the Scaling documentation for information on the current scaling system and how to manage the region pool.
Autoscaling
Autoscaling is based on a pool of regions where the application can be run. Using the selected mode, the system creates at least the minimum number of application instances across those regions. It can then create further instances up to the maximum count. The min and max are global parameters for the scaling. There are two scaling modes, Standard and Balanced.
Standard: Instances of the application, up to the minimum count, are evenly distributed among the regions in the pool. They are not relocated in response to traffic. New instances are added where there is demand, up to the maximum count.
Balanced: Instances of the application are, at first, evenly distributed among the regions in the pool up to the minimum count. Where traffic is high in a particular region, new instances are created there. Once the maximum count of instances has been reached, instances are moved from other regions to that region. This movement of instances is designed to balance the supply of compute power with demand for it.
Disabled: By default autoscaling is in Disabled mode and count-based scaling is in operation. You can turn autoscaling on by setting the autoscale mode to standard or balanced.
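The even distribution of the minimum count across the region pool can be illustrated with a small model. This is only a sketch, not Fly's scheduler: the function name `distribute_minimum` is hypothetical, and handing out any remainder round-robin in pool order is an assumption, since the text above does not say how uneven counts are split.

```python
def distribute_minimum(regions, min_count):
    """Spread min_count instances as evenly as possible over the region pool.

    Assumption: leftover instances are assigned round-robin in pool order.
    The real placement logic is not described in this document.
    """
    if not regions:
        raise ValueError("the region pool must not be empty")
    placement = {region: 0 for region in regions}
    for i in range(min_count):
        # Cycle through the pool so no region gets a second instance
        # before every region has received its first.
        placement[regions[i % len(regions)]] += 1
    return placement


print(distribute_minimum(["ams", "nrt", "sjc"], 5))
# {'ams': 2, 'nrt': 2, 'sjc': 1}
```

With a minimum of 3 and the same three regions, each region would hold exactly one instance.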
To determine what the current settings of an application are, run flyctl autoscale show:
flyctl autoscale show
Scale Mode: Standard
Min Count: 1
Max Count: 10
VM Size: shared-cpu-1x
This scaling plan uses Standard mode: instances are evenly distributed, with a minimum of 1 instance and up to 10 instances that can be created on demand.
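If the settings are needed in a script, output of this shape can be turned into a dictionary. The following is a minimal sketch that assumes the output keeps the "Key: Value" layout shown above. The helper name `parse_autoscale_show` is hypothetical and the sample text is copied from the example output rather than read from a live command.

```python
def parse_autoscale_show(text):
    """Turn 'Key: Value' lines into a dict, converting whole numbers to int."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip blank lines or anything without a colon
        key, value = key.strip(), value.strip()
        settings[key] = int(value) if value.isdigit() else value
    return settings


# Sample copied from the example output in this document.
sample = """\
Scale Mode: Standard
Min Count: 1
Max Count: 10
VM Size: shared-cpu-1x
"""

plan = parse_autoscale_show(sample)
print(plan["Scale Mode"], plan["Min Count"], plan["Max Count"])
# Standard 1 10
```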
Modifying The Scaling Plan
As mentioned above, the scaling mode controls how the regions in the pool are used for allocating instances. To set the mode use:
flyctl autoscale standard
or
flyctl autoscale balanced
Both of these commands set the scaling mode and can take extra settings that tune the mode, specifically setting the minimum count (min) and maximum count (max) of instances. For example, to set balanced mode with a minimum number of instances of 5, you would give this command:
flyctl autoscale balanced min=5
Want to set a maximum of 10 too? Then do this:
flyctl autoscale balanced min=5 max=10
If you just want to set the max or min for the currently selected mode, use the set sub-command:
flyctl autoscale set min=5 max=10
You can also turn off autoscaling and return to the recommended count-scaling option by disabling autoscaling:
flyctl autoscale disable
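When these commands are issued from automation, it can help to assemble the argument list in one place. The sketch below only builds the arguments for the sub-commands shown in this section (standard, balanced, set, disable) and does not run anything. The function name `autoscale_command` and its validation rules are illustrative assumptions, not part of flyctl.

```python
def autoscale_command(action, min_count=None, max_count=None):
    """Build the argument list for one of the flyctl autoscale commands above."""
    allowed = {"standard", "balanced", "set", "disable"}
    if action not in allowed:
        raise ValueError(f"unknown autoscale action: {action!r}")
    args = ["flyctl", "autoscale", action]
    if action == "disable":
        # Assumption: disabling takes no min/max settings.
        if min_count is not None or max_count is not None:
            raise ValueError("disable does not take min or max")
        return args
    if min_count is not None and max_count is not None and min_count > max_count:
        raise ValueError("min must not exceed max")
    if min_count is not None:
        args.append(f"min={min_count}")
    if max_count is not None:
        args.append(f"max={max_count}")
    return args


print(" ".join(autoscale_command("balanced", min_count=5, max_count=10)))
# flyctl autoscale balanced min=5 max=10
```

The returned list can be passed to `subprocess.run(args, check=True)` on a machine where flyctl is installed and authenticated.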
Viewing The Application's Scaled Status
To view where the instances of a Fly application are currently running, use flyctl status:
flyctl status
App
Name = hellofly
Owner = dj
Version = 299
Status = running
Hostname = hellofly.fly.dev
Deployment Status
ID = 59b60abf-ba4f-fb2f-9f78-35a249e2bef5
Version = v299
Status = successful
Description = Deployment completed successfully
Allocations = 3 desired, 3 placed, 3 healthy, 0 unhealthy
Allocations
ID VERSION REGION DESIRED STATUS HEALTH CHECKS CREATED
8a9358d1 299 ams run running 1 passing 15m36s ago
7c08ce47 299 nrt run running 1 passing 15m36s ago
1b17a5e6 299 sjc run running 1 passing 15m36s ago
If a region is listed with (b) following it, that means the instance is running in a backup region.
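To summarise how instances are spread across regions, the Allocations rows can be tallied per region. This is a rough sketch that assumes the region is the third whitespace-separated column, as in the listing above, and that a backup marker appears either attached to the region code or as the next field. The exact spacing of the (b) marker is an assumption, and `instances_per_region` is a hypothetical helper name.

```python
from collections import Counter


def instances_per_region(rows):
    """Count allocations per region and collect regions marked as backups."""
    counts = Counter()
    backups = set()
    for row in rows:
        fields = row.split()
        if len(fields) < 3:
            continue  # not an allocation row
        region = fields[2]
        is_backup = False
        if region.endswith("(b)"):
            # Marker attached to the region code, e.g. "ams(b)" (assumed form).
            region = region[: -len("(b)")]
            is_backup = True
        elif len(fields) > 3 and fields[3] == "(b)":
            # Marker printed as a separate field (assumed alternative form).
            is_backup = True
        if is_backup:
            backups.add(region)
        counts[region] += 1
    return dict(counts), backups


# Rows copied from the Allocations listing in this document.
rows = [
    "8a9358d1 299 ams run running 1 passing 15m36s ago",
    "7c08ce47 299 nrt run running 1 passing 15m36s ago",
    "1b17a5e6 299 sjc run running 1 passing 15m36s ago",
]

counts, backups = instances_per_region(rows)
print(counts)
# {'ams': 1, 'nrt': 1, 'sjc': 1}
```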