Migration Forecaster

Benchmark storage throughput between VMware datastore pairs to forecast migration time

The Migration Forecaster measures actual storage throughput between VMware datastore pairs by running disk-copy benchmarks directly on your vCenter infrastructure. The results provide data-driven migration time estimates based on your real environment rather than theoretical calculations.

Applies to: V0.11.0
Last update (dd-mm-yyyy): 03-05-2026


How It Works

The forecaster creates temporary resources in your vCenter environment to benchmark disk copy speed between a source and target datastore. It repeats the copy multiple times and computes statistics (mean, median, confidence intervals) to produce a reliable throughput estimate.

High-Level Flow

[Diagram: high-level flow of a benchmark run]

Important: vCenter Resources Created

Resources Created Per Datastore Pair

| Resource | Location | Purpose | Lifetime |
|---|---|---|---|
| Filler VM | vCenter inventory | Boots Alpine Linux to fill the benchmark disk with random data | A few minutes; destroyed after the disk fill completes |
| Alpine boot disk | Source datastore | 256 MB thin VMDK for the filler VM OS | Deleted after the pair completes |
| Seed ISO | Source datastore | Cloud-init configuration for the filler VM | Deleted after the pair completes |
| Benchmark disk | Source datastore | Thin VMDK (user-specified size, default 10 GB) filled with random data | Deleted after the pair completes |
| Clone disk | Target datastore | Copy of the benchmark disk created during each iteration | Deleted after each iteration |
| Directories | Source and target datastores | forecaster-* directories that hold temporary files | Deleted after the pair completes |

Filler VM Specification

The temporary VM used to fill the benchmark disk:

  • Name: forecaster-filler-{timestamp}
  • OS: Alpine Linux (minimal, ~256 MB)
  • Resources: 1 vCPU, 256 MB RAM
  • Behavior: Boots, fills the benchmark disk with random data via dd, then powers off automatically
  • No network access: The VM does not require or use network connectivity

Cleanup Guarantees

  • All resources are cleaned up automatically when a benchmark pair completes (success or failure)
  • If the agent is terminated mid-benchmark, some temporary files or directories may remain on the datastores and require manual cleanup
  • Temporary resource names are prefixed with forecaster- or filler-image- for easy identification
  • The filler VM is always destroyed, even if the benchmark encounters errors
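
If the agent was terminated mid-benchmark, the prefixes listed above make leftovers easy to spot. The following is a minimal sketch that filters a list of names by those prefixes. The listing itself is hypothetical sample data, and how you obtain it (datastore browser, CLI, or API) is up to you.

```python
# Prefixes taken from the cleanup notes above.
TEMP_PREFIXES = ("forecaster-", "filler-image-")


def find_leftovers(names):
    """Return the names that start with one of the forecaster's temporary prefixes."""
    return [name for name in names if name.startswith(TEMP_PREFIXES)]


# Hypothetical datastore listing, for illustration only.
listing = ["forecaster-1714720000", "prod-db-01", "filler-image-alpine.vmdk", "iso"]
print(find_leftovers(listing))
```

Review the matches before deleting anything, since a benchmark that is still running uses the same prefixes.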

Required vSphere Privileges

The vCenter credentials used for forecasting must have the following privileges:

| Privilege | Why |
|---|---|
| Datastore.AllocateSpace | Create benchmark disks and directories |
| Datastore.Browse | List datastore contents |
| Datastore.DeleteFile | Clean up temporary files and directories |
| Datastore.FileManagement | Upload filler image and seed ISO |
| VirtualMachine.Inventory.Create | Create the temporary filler VM |
| VirtualMachine.Inventory.Delete | Destroy the filler VM after use |
| VirtualMachine.Provisioning.Clone | Copy disks between datastores |
| VirtualMachine.Interact.PowerOn | Boot the filler VM |
| VirtualMachine.Config.AddRemoveDevice | Attach/detach disks to the filler VM |

The credential verification endpoint (PUT /forecaster/credentials) checks these privileges before benchmarking starts and reports any missing ones.
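
To review a role before calling the endpoint, you can compare its privileges against the table above. This is a sketch of such a local pre-check, not the endpoint's own implementation. The `granted` list is a hypothetical input that you would export from your vCenter role definition.

```python
# Privilege names copied from the table above.
REQUIRED_PRIVILEGES = [
    "Datastore.AllocateSpace",
    "Datastore.Browse",
    "Datastore.DeleteFile",
    "Datastore.FileManagement",
    "VirtualMachine.Inventory.Create",
    "VirtualMachine.Inventory.Delete",
    "VirtualMachine.Provisioning.Clone",
    "VirtualMachine.Interact.PowerOn",
    "VirtualMachine.Config.AddRemoveDevice",
]


def missing_privileges(granted):
    """Return the required privileges absent from `granted`, in table order."""
    granted_set = set(granted)
    return [p for p in REQUIRED_PRIVILEGES if p not in granted_set]


# Hypothetical role that lacks the two VM inventory privileges.
granted = [p for p in REQUIRED_PRIVILEGES if "Inventory" not in p]
print(missing_privileges(granted))
```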


API Workflow

Step 1: Verify Credentials

Validate that the vCenter credentials have the required privileges.

curl -X PUT http://agent:3443/api/v1/forecaster/credentials \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://vcenter.example.com",
    "username": "administrator@vsphere.local",
    "password": "your-password"
  }'

Responses:

  • 200 OK: the credentials are valid.
  • 403 Forbidden: privileges are missing; the response includes a missingPrivileges list.
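
A client script can branch on these two outcomes. The sketch below interprets a status code and response body that you have already received. It assumes missingPrivileges is a top-level field of a JSON object in the 403 body; the exact response layout is not documented here, so adjust the lookup if your agent returns a different shape.

```python
import json


def interpret_credentials_response(status, body_text):
    """Map a credentials-check response to (ok, missing_privileges)."""
    if status == 200:
        return True, []
    if status == 403:
        try:
            body = json.loads(body_text) if body_text else {}
        except json.JSONDecodeError:
            body = {}
        # Assumption: the list sits at the top level of a JSON object.
        missing = body.get("missingPrivileges", []) if isinstance(body, dict) else []
        return False, list(missing)
    raise RuntimeError(f"unexpected status {status}")


# Hypothetical 403 body, for illustration only.
sample = '{"missingPrivileges": ["Datastore.Browse"]}'
print(interpret_credentials_response(403, sample))
```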

Step 2: List Datastores

Retrieve the datastores discovered from the environment inventory.

curl -X POST http://agent:3443/api/v1/forecaster/datastores

Returns a list of datastores with their type, capacity, vendor, and offload capabilities.

Step 3: Check Pair Capabilities (Optional)

Check what storage offload capabilities are available for specific datastore pairs.

curl -X POST http://agent:3443/api/v1/forecaster/capabilities \
  -H "Content-Type: application/json" \
  -d '{
    "pairs": [
      {
        "name": "local-to-nfs",
        "sourceDatastore": "datastore1",
        "targetDatastore": "datastore2"
      }
    ]
  }'

Step 4: Start Benchmark

Start the benchmark for one or more datastore pairs.

curl -X POST http://agent:3443/api/v1/forecaster \
  -H "Content-Type: application/json" \
  -d '{
    "pairs": [
      {
        "name": "local-to-nfs",
        "sourceDatastore": "datastore1",
        "targetDatastore": "datastore2",
        "host": "esxi-01.example.com"
      }
    ],
    "diskSizeGb": 10,
    "iterations": 5,
    "concurrency": 1
  }'

| Parameter | Default | Description |
|---|---|---|
| pairs | (required) | List of datastore pairs to benchmark |
| pairs[].host | (auto-selected) | Pin to a specific ESXi host (optional) |
| diskSizeGb | 10 | Size of the benchmark disk in GB |
| iterations | 5 | Number of copy iterations per pair |
| concurrency | 1 | Number of pairs to benchmark in parallel |

Returns 202 Accepted with the initial status. Returns 409 Conflict if a benchmark is already running.
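
When requests are generated from a script, it helps to build the body in one place with the documented defaults. The helper below is a sketch: the field names and defaults come from the table above, while the "positive integer" validation is an assumption on my part rather than a documented rule of the API.

```python
import json


def build_benchmark_request(pairs, disk_size_gb=10, iterations=5, concurrency=1):
    """Build the JSON body for POST /forecaster using the documented defaults."""
    if not pairs:
        raise ValueError("pairs is required")
    for name, value in (("diskSizeGb", disk_size_gb),
                        ("iterations", iterations),
                        ("concurrency", concurrency)):
        # Assumption: the agent expects positive integers for these fields.
        if not isinstance(value, int) or value < 1:
            raise ValueError(f"{name} must be a positive integer")
    body_pairs = []
    for pair in pairs:
        entry = {
            "name": pair["name"],
            "sourceDatastore": pair["sourceDatastore"],
            "targetDatastore": pair["targetDatastore"],
        }
        if pair.get("host"):  # host is optional; omit it to let the agent choose
            entry["host"] = pair["host"]
        body_pairs.append(entry)
    return {"pairs": body_pairs, "diskSizeGb": disk_size_gb,
            "iterations": iterations, "concurrency": concurrency}


request = build_benchmark_request([
    {"name": "local-to-nfs", "sourceDatastore": "datastore1",
     "targetDatastore": "datastore2"},
])
print(json.dumps(request, indent=2))
```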

Step 5: Poll Status

Monitor benchmark progress.

curl http://agent:3443/api/v1/forecaster

Each pair reports its current state:

| State | Meaning |
|---|---|
| pending | Queued, not yet started |
| preparing | Creating and filling the benchmark disk |
| running | Copying disk (iterations in progress) |
| completed | All iterations finished |
| error | Failed (see error field) |
| canceled | Stopped by user |

During preparing, prepBytesTotal and prepBytesUploaded track the disk fill progress. During running, completedRuns and totalRuns track iteration progress.
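
These fields can be turned into a progress fraction for a polling script. The sketch below works on one pair object from an already decoded status response. The key name `state` is an assumption, since the exact field that carries the state is not named here; the counter names are the ones given above.

```python
# States after which a pair no longer changes, per the table above.
TERMINAL_STATES = {"completed", "error", "canceled"}


def pair_progress(pair):
    """Return (state, fraction); fraction is None when it cannot be derived."""
    state = pair.get("state")  # assumed key name
    if state == "preparing":
        total = pair.get("prepBytesTotal") or 0
        done = pair.get("prepBytesUploaded") or 0
        return state, (done / total if total else None)
    if state == "running":
        total = pair.get("totalRuns") or 0
        done = pair.get("completedRuns") or 0
        return state, (done / total if total else None)
    if state == "completed":
        return state, 1.0
    return state, None


def all_finished(pairs):
    """True once every pair has reached a terminal state."""
    return all(p.get("state") in TERMINAL_STATES for p in pairs)


# Hypothetical status entries, for illustration only.
print(pair_progress({"state": "running", "completedRuns": 2, "totalRuns": 5}))
```

A polling loop would call the status endpoint every few seconds and stop once all_finished returns True.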

Step 6: View Results

Get benchmark run data.

curl "http://agent:3443/api/v1/forecaster/runs?pairName=local-to-nfs"

Get computed statistics and time estimates.

curl "http://agent:3443/api/v1/forecaster/stats?pairName=local-to-nfs"

The stats response includes mean/median/min/max throughput, standard deviation, 95% confidence interval, and estimated migration time per 1 TB (best case, expected, worst case).

Canceling a Benchmark

Cancel all running pairs:

curl -X DELETE http://agent:3443/api/v1/forecaster

Cancel a single pair:

curl -X DELETE http://agent:3443/api/v1/forecaster/pairs/local-to-nfs

Canceling stops the active benchmark. Resources are cleaned up before the operation completes.
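
Pair names appear in the URL path when canceling a single pair, so a script should encode them. This sketch builds both cancel URLs; whether the agent accepts percent-encoded pair names is an assumption, so names limited to letters, digits, and hyphens are the safest choice.

```python
from urllib.parse import quote

BASE_URL = "http://agent:3443/api/v1/forecaster"


def cancel_url(pair_name=None):
    """Return the DELETE URL for all pairs, or for one pair when a name is given."""
    if pair_name is None:
        return BASE_URL
    # safe="" also encodes "/" so a name cannot add path segments.
    return f"{BASE_URL}/pairs/{quote(pair_name, safe='')}"


print(cancel_url())
print(cancel_url("local-to-nfs"))
```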


Understanding Results

Throughput Statistics

After a benchmark completes, the stats endpoint provides:

| Metric | Description |
|---|---|
| meanMbps | Average throughput across all successful iterations |
| medianMbps | Middle value (less affected by outliers) |
| minMbps / maxMbps | Range of observed throughput |
| stddevMbps | Standard deviation (consistency measure) |
| ci95LowerMbps / ci95UpperMbps | 95% confidence interval for true throughput |
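
To see how such metrics relate to the raw iterations, the sketch below recomputes them from a list of per-iteration throughput values. It assumes the sample standard deviation and a Student-t interval around the mean; the forecaster's exact method is not documented here, so small differences from the reported values are possible.

```python
import math
import statistics

# Two-sided 95% Student-t critical values, keyed by degrees of freedom (n - 1).
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571, 6: 2.447, 7: 2.365,
        8: 2.306, 9: 2.262, 10: 2.228, 11: 2.201, 12: 2.179, 13: 2.160,
        14: 2.145, 15: 2.131, 16: 2.120, 17: 2.110, 18: 2.101, 19: 2.093}


def summarize(samples):
    """Summarize throughput samples using the metric names from the table above."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = statistics.mean(samples)
    stddev = statistics.stdev(samples)  # sample standard deviation (assumption)
    t = T_95.get(n - 1, 1.96)           # normal approximation beyond 20 samples
    half_width = t * stddev / math.sqrt(n)
    return {
        "meanMbps": mean,
        "medianMbps": statistics.median(samples),
        "minMbps": min(samples),
        "maxMbps": max(samples),
        "stddevMbps": stddev,
        "ci95LowerMbps": mean - half_width,
        "ci95UpperMbps": mean + half_width,
    }


# Hypothetical throughput values from five iterations.
samples = [100.0, 110.0, 90.0, 105.0, 95.0]
summary = summarize(samples)
print(summary)
```

With only five iterations the t critical value is 2.776, which is why the interval is noticeably wider than the one a normal approximation (1.96) would give.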

Time Estimates

The estimatePer1TB field provides migration time forecasts for 1 TB of data:

| Estimate | Based On |
|---|---|
| bestCase | Upper bound of 95% confidence interval |
| expected | Median throughput |
| worstCase | Lower bound of 95% confidence interval |
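
The mapping in the table follows from time = data volume / throughput: the higher the throughput, the shorter the copy, so the upper confidence bound yields the best case. The sketch below illustrates the arithmetic only. It assumes the throughput values are megabytes per second, that 1 TB is 10^6 MB, and that results are expressed in hours; none of these units are confirmed here, so check them against your agent's responses.

```python
def hours_per_tb(throughput_mb_per_s):
    """Hours needed to copy 1 TB (taken as 10**6 MB) at the given MB/s rate."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return 1_000_000 / throughput_mb_per_s / 3600


def estimate_per_1tb(stats):
    """Derive the three estimates from the statistics named in the table above."""
    return {
        "bestCase": hours_per_tb(stats["ci95UpperMbps"]),   # fastest plausible rate
        "expected": hours_per_tb(stats["medianMbps"]),
        "worstCase": hours_per_tb(stats["ci95LowerMbps"]),  # slowest plausible rate
    }


# Hypothetical statistics, for illustration only.
stats = {"ci95LowerMbps": 200.0, "medianMbps": 250.0, "ci95UpperMbps": 300.0}
estimate = estimate_per_1tb(stats)
print(estimate)
```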

Scale these estimates linearly for your actual data volume.
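
Linear scaling means multiplying each per-1-TB value by the number of terabytes you plan to move. A minimal sketch follows; the durations are hypothetical and shown in hours, and the result is in whatever unit the per-1-TB values use.

```python
def scale_estimate(per_1tb, data_tb):
    """Scale each per-1-TB duration linearly to `data_tb` terabytes."""
    if data_tb < 0:
        raise ValueError("data_tb must not be negative")
    return {case: value * data_tb for case, value in per_1tb.items()}


# Hypothetical per-1-TB estimates in hours, scaled to a 12 TB migration.
per_1tb = {"bestCase": 2.0, "expected": 2.5, "worstCase": 3.5}
print(scale_estimate(per_1tb, 12))
```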


Tips

  • Disk size: Larger benchmark disks produce more realistic results but take longer to prepare. 10 GB is a good default for quick estimates.
  • Iterations: More iterations improve statistical confidence. 5 is a reasonable default; use 10+ for production planning.
  • Host pinning: If your migration will use a specific ESXi host, pin the benchmark to that host for more accurate results.
  • Load conditions: Run benchmarks under load conditions representative of your planned migration window (for example, off-peak hours if you will migrate off-peak) to get realistic throughput numbers.
  • Multiple pairs: If migrating across different storage backends, benchmark each unique source-target combination separately.
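
For the last tip, the unique source-target combinations can be derived from a migration plan. The sketch below turns a list of (source, target) moves into deduplicated entries for the pairs field. The plan format and the "source-to-target" naming scheme are hypothetical choices made for this example.

```python
def unique_pairs(moves):
    """Return one pair entry per distinct (source, target), in first-seen order."""
    seen = set()
    pairs = []
    for source, target in moves:
        if (source, target) in seen:
            continue
        seen.add((source, target))
        pairs.append({
            "name": f"{source}-to-{target}",  # hypothetical naming scheme
            "sourceDatastore": source,
            "targetDatastore": target,
        })
    return pairs


# Hypothetical migration plan: one (source, target) tuple per VM.
moves = [("datastore1", "datastore2"), ("datastore1", "datastore2"),
         ("datastore3", "datastore2")]
print(unique_pairs(moves))
```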