Scaling and Fly
Autoscaling as previously described here is no longer the default for new Fly deployments. Read this documentation for information about the new "count"-based scaling system, which is the default. Autoscaling documentation now has its own page.
The previous "autoscaling"-based scaling system has been deprecated and replaced with a "count"-based scaling system.
Scaling Dimensions
There are multiple dimensions of scaling on Fly.
- Ensuring the application has instances running in one or more regions
- Anchoring application instances using volumes in one or more regions
- Increasing the CPU cores and memory size of application instances
Regions and Scaling
Your Fly application runs on servers in a pool of regions, selected from our available regions. That pool, which you can configure, represents the regions your app can be deployed in.
When you deploy an application for the first time, we pick a first region to deploy in, and some backup regions. Our selections are simple:
- If you’re turbo-charging a Heroku application, we pick regions close to the Heroku application (currently, this means `iad` for US Heroku applications (howdy!) and `ams` for European applications (hallo!)).
- Otherwise, we pick regions close to you, the human running the `flyctl` command.
You can confirm this by running `flyctl regions list`:

```cmd
flyctl regions list
```
```out
Region Pool:
lhr
Backup Region:
ams
fra
```

The create command, in this case, was issued in the UK, so London (`lhr` - London Heathrow) is the closest region.
Backup Regions
Continuing our previous example: if for any reason your application can't be deployed in `lhr`, Fly will attempt to bring it up in either `ams` (Amsterdam) or `fra` (Frankfurt). Users won't notice this! They’re directed to the nearest running instance automatically. Backup regions are selected based on their geographical closeness to the regions in your region pool.
Modifying The Region Pool
You can build your own region pool easily.
- `flyctl regions add ord iad` adds `ord` and `iad` to your region pool.
- `flyctl regions remove ord` removes `ord` from your region pool.

Both commands simply take a space-separated list of regions to add or remove.
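For example, to add Chicago (`ord`) and Ashburn (`iad`) to the pool and then confirm the change:

```cmd
flyctl regions add ord iad
flyctl regions list
```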
Count Scaling
Now that we have control over where our application runs, we can talk about how many instances of your app are running.
Your application has a “scale count”. The scale count defaults to 1, meaning 1 instance of your application runs on Fly, in one of the regions in your pool.
If you want to run more than 1 instance, change your scale count with the `flyctl scale count` command. `flyctl scale count 3` tells us to run 3 instances of your application.
When you bump up your scale count, we’ll place your app in different regions (based on your region pool). If there are three regions in the pool and the count is set to six, there will be two app instances in each region.
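A minimal sketch of bumping the count (the confirmation message may vary between flyctl versions):

```cmd
flyctl scale count 3
```
```out
Count changed to 3
```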
You can see your current scaling parameters with `flyctl scale show`:

```cmd
flyctl scale show
```
```out
VM Size: micro-2x
VM Memory: 512 MB
Count: 4
```
Anchor Scaling
Fly optimizes the placement of your app based on conditions on our network. But if you want finer-grained control over where your apps run, you can achieve that with Fly volumes.
If your app uses Fly volumes for persistent storage, you can use them to anchor your app to specific regions. So if you want three instances of an app in one region (say LHR) and one instance in another (say FRA), and the app looks for a volume named `example`, you can:

- Create three volumes in LHR named `example`
- Create one volume in FRA named `example`
- Set the scale count to 4
When an app which has storage starts up, it looks for a volume with a particular name. We’ll place your app to fit your available volumes. In that way, volumes act as anchors that attach your app to specific regions in specific numbers.
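As a sketch of that setup with `flyctl volumes create`, assuming a 10GB volume size (adjust to your needs):

```cmd
flyctl volumes create example --region lhr --size 10
flyctl volumes create example --region lhr --size 10
flyctl volumes create example --region lhr --size 10
flyctl volumes create example --region fra --size 10
flyctl scale count 4
```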
Scaling Virtual Machines
Each application instance on Fly runs in its own virtual machine. The number of cores and memory available in the virtual machine can be set for all application instances using the `flyctl scale vm` command.
Viewing The Current VM Size
Using `flyctl scale vm` on its own will display the details of the application's current VM sizing:

```cmd
flyctl scale vm
```
```out
Size: shared-cpu-1x
CPU Cores: 1
Memory: 256 MB
```

It shows the size name (`shared-cpu-1x`), the number of CPU cores, and the memory (in MB or GB).
Viewing Available VM Sizes
The `flyctl platform vm-sizes` command will display the various sizes with cores, memory, and current pricing:

```cmd
flyctl platform vm-sizes
```
```out
NAME              CPU CORES   MEMORY
shared-cpu-1x     1           256 MB
dedicated-cpu-1x  1           2 GB
dedicated-cpu-2x  2           4 GB
dedicated-cpu-4x  4           8 GB
dedicated-cpu-8x  8           16 GB
```
The CPU Cores column shows how many vCPU cores will be allocated to the virtual machine.
CPU types are either shared or dedicated. In a nutshell: shared-CPU instances run lighter-weight tasks and can have up to 2GB of memory. Dedicated-CPU instances handle more demanding applications and can scale up to 64GB of memory.
Upgrading a VM
You can easily change the size of your VMs. Just add the required size name to `flyctl scale vm` and we’ll take care of the rest. For example, to move our application from `shared-cpu-1x` to `dedicated-cpu-1x` and upgrade to 8GB of memory, we would run:

```cmd
flyctl scale vm dedicated-cpu-1x --memory=8192
```
```out
Scaled VM size to dedicated-cpu-1x
CPU Cores: 1
Memory: 8 GB
```
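You can then confirm the new sizing with `flyctl scale show`; illustrative output, following the format shown earlier:

```cmd
flyctl scale show
```
```out
VM Size: dedicated-cpu-1x
VM Memory: 8 GB
Count: 4
```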
Viewing The Application's Scaled Status
To view where the instances of a Fly application are currently running, use `flyctl status`:

```cmd
flyctl status
```
```out
App
  Name     = hellofly
  Owner    = personal
  Version  = 318
  Status   = running
  Hostname = hellofly.fly.dev

Deployment Status
  ID          = 8c3137c2-94d2-5fb6-e1ff-d46608def053
  Version     = v318
  Status      = running
  Description = Deployment is running
  Instances   = 3 desired, 3 placed, 2 healthy, 0 unhealthy

Instances
ID       VERSION REGION DESIRED STATUS  HEALTH CHECKS      RESTARTS CREATED
a592ecf4 318     iad    run     running 1 total, 1 passing 0        1m7s ago
382dd4ce 318     lhr    run     running 1 total, 1 passing 0        1m33s ago
075c8c53 318     sjc    run     running 1 total, 1 passing 0        3m1s ago
```
Deprecated: Autoscaling
Autoscaling is based on a pool of regions where the application can run. Using the selected scaling model, the system creates at least the minimum number of application instances across those regions, and can create further instances up to the maximum count. The minimum and maximum counts are global parameters for scaling. There are two scaling modes, Standard and Balanced.
Standard: Instances of the application, up to the minimum count, are evenly distributed among the regions in the pool. They are not relocated in response to traffic. New instances are added where there is demand, up to the maximum count.
Balanced: Instances of the application are, at first, evenly distributed among the regions in the pool up to the minimum count. Where traffic is high in a particular region, new instances will be created there and then, when the maximum count of instances has been used, instances will be moved from other regions to that region. This movement of instances is designed to balance supply of compute power with demand for it.
Disabled: By default, autoscaling is in Disabled mode and count-based scaling is in operation. You can turn autoscaling on by setting the autoscale mode to `standard` or `balanced`.
To determine what the current settings of an application are, run `flyctl autoscale show`:

```cmd
flyctl autoscale show
```
```out
Scale Mode: Standard
Min Count: 1
Max Count: 10
VM Size: shared-cpu-1x
```

This scaling plan sees standard, even distribution of instances, with a minimum of 1 instance and up to 10 instances that can be created on demand.
Modifying The Scaling Plan
The scaling plan is modified with the `flyctl autoscale` command:

- `flyctl autoscale standard` switches the application to the Standard scaling mode.
- `flyctl autoscale balanced` switches the application to the Balanced scaling mode.
- `flyctl autoscale balanced min=5` switches to Balanced mode and sets the minimum instance count to 5.
- `flyctl autoscale balanced min=5 max=10` switches to Balanced mode and sets the minimum count to 5 and the maximum count to 10.
- `flyctl autoscale set min=5 max=10` sets the minimum and maximum counts for the current scaling plan.

You can also turn off autoscaling and return to the recommended count-based scaling by disabling autoscaling:

`flyctl autoscale disable`
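For example, a minimal sketch of switching to Balanced mode with explicit bounds and then confirming the plan (output follows the format shown above and may vary with your flyctl version):

```cmd
flyctl autoscale balanced min=5 max=10
flyctl autoscale show
```
```out
Scale Mode: Balanced
Min Count: 5
Max Count: 10
VM Size: shared-cpu-1x
```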
---