Instance Types

Instance types define the CPU and memory resource limits for compute instances.

Name	CPU	Memory
`nano`	0.5 cores	256 MB
`micro`	1 core	512 MB
`small`	1 core	1 GB
`medium`	2 cores	2 GB
`large`	4 cores	4 GB
`xlarge`	8 cores	8 GB

List available types via CLI:

nimbus compute instance-types

The instance type is set with --type when creating a compute instance:

nimbus compute run --name web --swarm SWARM_ID --image nginx --type medium

In manifests, use the type field:

compute:
  web:
    swarm: prod
    image: nginx
    type: medium

GPU support

GPUs are requested independently from instance types using the --gpu flag. Any instance type can be combined with GPU requests:

nimbus compute run --name ml-train --swarm SWARM_ID \
  --image pytorch/pytorch --type large --gpu 2

In manifests, use the gpu field:

compute:
  ml-train:
    swarm: prod
    image: pytorch/pytorch
    type: large
    gpu: 2

GPU requests are separate from CPU/memory tiers because GPU workloads have widely varying resource profiles. A lightweight inference model might need small + 1 GPU, while distributed training might need xlarge + 4 GPUs.

What is a GPU slot?

When you request --gpu 1, you are requesting one GPU slot on a node. What a slot represents depends on the node's overcommit factor:

Overcommit	Physical GPUs	Slots	`--gpu 1` means
1 (default)	1	1	Exclusive access to the entire physical GPU. No other container can use it.
2	1	2	Time-shared access — 2 containers share 1 physical GPU. Each sees full VRAM but shares compute time.
4	1	4	4 containers share 1 physical GPU.
8	1	8	8 containers share 1 physical GPU (aggressive sharing).

Example: A node with 2x RTX 3090 and overcommit=4 has 2 × 4 = 8 GPU slots. You can run 8 containers each requesting --gpu 1, with pairs of containers time-sharing each physical GPU.

The overcommit factor is set per node:

nimbus node gpu-overcommit NODE_ID 4

Important: Time-slicing provides NO memory isolation. All containers sharing a GPU see the full VRAM but share it. If two containers each try to use 100% of the VRAM, one will OOM. Set the overcommit factor based on your workloads' known VRAM usage. For production training workloads, keep overcommit at 1 (default). For inference with small models, 2-4 is typical.

Failed GPUs

DockNimbus tracks individual GPU health. If a GPU is marked as failed, it is excluded from the allocatable count and won't receive workloads. Use nimbus node describe NODE_ID to see per-device health status.

Only NVIDIA GPUs are supported. See the GPU acceleration guide for details on provisioning and availability.

GPU support​

What is a GPU slot?​

Failed GPUs​

GPU support

What is a GPU slot?

Failed GPUs