Autoscaling and Fly

Autoscaling as described here is no longer the default for new Fly deployments and should be considered deprecated. Read the Scaling documentation for information on the current scaling system and how to manage the region pool.

Autoscaling

Autoscaling is based on a pool of regions where the application can run. Using the selected mode, the system creates at least the minimum number of application instances across those regions and can then create further instances up to the maximum count. The min and max are global parameters for the scaling. There are two autoscaling modes, Standard and Balanced, and autoscaling can also be Disabled.

  • Standard: Instances of the application, up to the minimum count, are evenly distributed among the regions in the pool. They are not relocated in response to traffic. New instances are added where there is demand, up to the maximum count.

  • Balanced: Instances of the application are, at first, evenly distributed among the regions in the pool up to the minimum count. Where traffic is high in a particular region, new instances are created there. Once the maximum count has been reached, instances are moved from other regions to that region. This movement of instances is designed to balance the supply of compute power with the demand for it.

  • Disabled: By default, autoscaling is in Disabled mode and count-based scaling is in operation. You can turn autoscaling on by setting the autoscale mode to standard or balanced.
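
The even spread that Standard and Balanced both start from can be pictured with a small sketch. It is an illustration only: the function name spread_evenly is hypothetical, and giving any leftover instances to the first regions listed is an assumption, since this page does not say how the scheduler breaks ties.

```shell
# Illustrative only: spread COUNT instances evenly over the given regions.
# Assumption: leftover instances go to the first regions listed; the real
# scheduler's tie-breaking is not described in this document.
spread_evenly() {
  count=$1
  shift
  regions=$#
  [ "$regions" -gt 0 ] || return 1
  base=$((count / regions))     # instances every region receives
  extra=$((count % regions))    # regions that receive one more
  i=0
  for region in "$@"; do
    n=$base
    if [ "$i" -lt "$extra" ]; then
      n=$((n + 1))
    fi
    echo "$region $n"
    i=$((i + 1))
  done
}

# A minimum count of 5 over a pool of three regions.
spread_evenly 5 ams nrt sjc
```

With a minimum count of 5 and three regions, this sketch places two instances in each of the first two regions and one in the third.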

To see an application's current settings, run flyctl autoscale show:

flyctl autoscale show
     Scale Mode: Standard
      Min Count: 1
      Max Count: 10
        VM Size: shared-cpu-1x

This scaling plan uses Standard mode, with instances distributed evenly, a minimum of 1 instance, and up to 10 instances that can be created on demand.
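
If the settings are needed in a script, the output shown above can be turned into key=value pairs. The sketch below is a hedged example: parse_autoscale is a hypothetical helper, and it assumes the output keeps the "Label: value" layout shown above. It reads the text on stdin, so it can be tried without calling flyctl.

```shell
# Turn "   Label Name: value" lines into label_name=value pairs.
# Reads the text of `flyctl autoscale show` on stdin.
# Assumption: each setting is one "Label: value" line, as in the sample above.
parse_autoscale() {
  awk -F': *' 'NF == 2 {
    key = tolower($1)
    gsub(/^ +| +$/, "", key)   # trim padding around the label
    gsub(/ +/, "_", key)       # "scale mode" becomes "scale_mode"
    print key "=" $2
  }'
}

parse_autoscale <<'EOF'
     Scale Mode: Standard
      Min Count: 1
      Max Count: 10
        VM Size: shared-cpu-1x
EOF
```

In practice the input would come from piping flyctl autoscale show into the function rather than from a here-document.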

Modifying The Scaling Plan

As mentioned above, the scaling mode controls how the regions in the pool are used for allocating instances. To set the mode use:

flyctl autoscale standard

or

flyctl autoscale balanced

Both of these commands set the scaling mode and can take extra settings that tune it, specifically the minimum count (min) and maximum count (max) of instances. For example, to set balanced mode with a minimum of 5 instances, run:

flyctl autoscale balanced min=5

Want to set a maximum of 10 too? Then do this:

flyctl autoscale balanced min=5 max=10
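
When these commands are generated by a script, it can help to check the values before running anything. The following is a minimal sketch, not part of flyctl: autoscale_cmd is a hypothetical helper that only prints the command, and the rule that min must not exceed max is an assumption, because this page does not describe how flyctl validates the values.

```shell
# Hypothetical helper: print (not run) an autoscale command after basic checks.
# Assumption: min should not exceed max; flyctl's own validation is not
# described in this document.
autoscale_cmd() {
  mode=$1
  min=$2
  max=$3
  case "$mode" in
    standard|balanced) ;;
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  if [ "$min" -gt "$max" ]; then
    echo "min ($min) exceeds max ($max)" >&2
    return 1
  fi
  echo "flyctl autoscale $mode min=$min max=$max"
}

autoscale_cmd balanced 5 10
```

The call above prints the same command as the example it follows. Printing rather than executing keeps the sketch safe to try against a real application.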

If you just want to set the max or min for the currently selected mode, use the set subcommand:

flyctl autoscale set min=5 max=10

You can also turn off autoscaling and return to the recommended count-based scaling by disabling autoscaling:

flyctl autoscale disable

Viewing The Application's Scaled Status

To view where the instances of a Fly application are currently running, use flyctl status:

flyctl status
App
  Name     = hellofly
  Owner    = dj
  Version  = 299
  Status   = running
  Hostname = hellofly.fly.dev

Deployment Status
  ID          = 59b60abf-ba4f-fb2f-9f78-35a249e2bef5
  Version     = v299
  Status      = successful
  Description = Deployment completed successfully
  Allocations = 3 desired, 3 placed, 3 healthy, 0 unhealthy

Allocations
  ID         VERSION   REGION   DESIRED   STATUS    HEALTH CHECKS   CREATED
  8a9358d1   299       ams      run       running   1 passing       15m36s ago
  7c08ce47   299       nrt      run       running   1 passing       15m36s ago
  1b17a5e6   299       sjc      run       running   1 passing       15m36s ago

If a region is listed with (b) following it, that region is a backup region currently in use.
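
To see how instances are spread at a glance, the Allocations table can be summarised per region. This is a sketch under stated assumptions: count_by_region is a hypothetical helper, it assumes the table keeps the layout shown above with the region in the third column, and it treats a (b) marker attached to a region name as the same region. It reads the flyctl status text on stdin.

```shell
# Count allocations per region from `flyctl status` text on stdin.
# Assumption: the Allocations table keeps the layout shown above, with the
# region in the third column and a blank line (or end of input) after it.
count_by_region() {
  awk '
    /^Allocations$/ { in_table = 1; next }   # start of the table section
    !in_table { next }
    NF == 0 { in_table = 0; next }           # a blank line ends the table
    $1 == "ID" { next }                      # skip the column header row
    {
      region = $3
      sub(/\(b\)$/, "", region)              # drop an attached backup marker
      count[region]++
    }
    END { for (r in count) print r, count[r] }
  ' | sort
}

count_by_region <<'EOF'
Allocations
  ID         VERSION   REGION   DESIRED   STATUS    HEALTH CHECKS   CREATED
  8a9358d1   299       ams      run       running   1 passing       15m36s ago
  7c08ce47   299       nrt      run       running   1 passing       15m36s ago
  1b17a5e6   299       sjc      run       running   1 passing       15m36s ago
EOF
```

For the sample above, each of the three regions is reported with one allocation.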