Parallel Works

Nodes & GPUs

ACTIVATE provides visibility into the nodes running across your Kubernetes clusters and tools for managing NVIDIA GPU configurations, including Multi-Instance GPU (MIG) partitioning.

Admin Only

The Nodes view is available to organization admins and platform admins only.

Viewing Cluster Nodes

Navigate to Kubernetes > Nodes in the sidebar to view all nodes across your connected clusters.

Node Table Columns

Column | Description
Name | The node hostname. Click to open the node detail page.
Cluster | The cluster the node belongs to (hidden when filtering by a single cluster).
Kubernetes Version | The kubelet version running on the node (e.g., v1.28.4).
Container Runtime | The container runtime and version (e.g., containerd://1.7.2).
Internal IP | The node's internal network IP address.
Architecture | The CPU architecture (e.g., amd64, arm64).

Filtering Nodes

Use the filter bar to narrow results:

  • Clusters — Show nodes from specific clusters only
  • Search — Free-text search across node name and cluster name
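The two filters can be sketched as a small function over the rows of the node table. This is only an illustration: the NodeRow type and the filter_nodes function are hypothetical names, and case-insensitive substring matching is an assumption about how the free-text search behaves.

```python
from dataclasses import dataclass


@dataclass
class NodeRow:
    """One row of the node table (hypothetical type for illustration)."""
    name: str
    cluster: str


def filter_nodes(rows, clusters=None, search=""):
    """Keep rows that belong to the selected clusters and match the search text."""
    selected = set(clusters or [])
    needle = search.strip().lower()
    result = []
    for row in rows:
        # Cluster filter: an empty selection means "all clusters".
        if selected and row.cluster not in selected:
            continue
        # Free-text search across node name and cluster name (assumed case-insensitive).
        if needle and needle not in row.name.lower() and needle not in row.cluster.lower():
            continue
        result.append(row)
    return result


rows = [
    NodeRow("gpu-node-1", "prod-east"),
    NodeRow("cpu-node-1", "prod-east"),
    NodeRow("gpu-node-2", "dev-west"),
]
matches = filter_nodes(rows, clusters=["prod-east"], search="gpu")
print([row.name for row in matches])  # ['gpu-node-1']
```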

Node Detail Page

Click a node name to open its detail page. The page shows the node's system information, capacity and allocatable resources, and labels, along with GPU management actions for GPU-equipped nodes.

System Information

The detail page shows the following system-level properties:

Property | Description
OS Image | The operating system image (e.g., Ubuntu 22.04.3 LTS).
Kernel Version | The Linux kernel version.
Operating System | The OS type (e.g., linux).
Architecture | The CPU architecture.
Container Runtime Version | The container runtime and version.
Kubernetes Version | The kubelet version.
Internal IP | The node's internal IP address.
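These properties correspond to fields that Kubernetes reports on every Node object under status.nodeInfo and status.addresses. The sketch below shows that mapping on a hand-written node dictionary. The summarize_node helper is a hypothetical name, the kernel version and IP address are illustrative values, and whether ACTIVATE reads the fields exactly this way is an assumption.

```python
def summarize_node(node):
    """Map a Kubernetes Node object (as a dict) to the properties listed above."""
    info = node["status"]["nodeInfo"]
    # The internal IP is the address entry whose type is "InternalIP".
    internal_ip = next(
        (addr["address"] for addr in node["status"].get("addresses", [])
         if addr["type"] == "InternalIP"),
        None,
    )
    return {
        "OS Image": info["osImage"],
        "Kernel Version": info["kernelVersion"],
        "Operating System": info["operatingSystem"],
        "Architecture": info["architecture"],
        "Container Runtime Version": info["containerRuntimeVersion"],
        "Kubernetes Version": info["kubeletVersion"],
        "Internal IP": internal_ip,
    }


# Illustrative node object; kernelVersion and the address are made-up sample values.
sample_node = {
    "metadata": {"name": "gpu-node-1"},
    "status": {
        "nodeInfo": {
            "osImage": "Ubuntu 22.04.3 LTS",
            "kernelVersion": "5.15.0-91-generic",
            "operatingSystem": "linux",
            "architecture": "amd64",
            "containerRuntimeVersion": "containerd://1.7.2",
            "kubeletVersion": "v1.28.4",
        },
        "addresses": [
            {"type": "Hostname", "address": "gpu-node-1"},
            {"type": "InternalIP", "address": "10.0.0.12"},
        ],
    },
}

summary = summarize_node(sample_node)
for key, value in summary.items():
    print(f"{key}: {value}")
```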

Capacity and Allocatable Resources

Each node reports two sets of resource quantities:

  • Capacity — The total physical resources available on the node
  • Allocatable — The resources available for pod scheduling (capacity minus system-reserved resources)

Both sets include:

Resource | Format | Example
CPU | Number of cores | 8
Memory | Gigabytes | 32Gi
Ephemeral Storage | Gigabytes | 100Gi
Pods | Maximum pod count | 110
NVIDIA GPUs | GPU count (if present) | 4

Resource Overhead

Compare the capacity and allocatable values to understand how much overhead is reserved for system components like the kubelet and OS processes.
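As a rough sketch of that comparison, the snippet below parses Kubernetes-style quantity strings and subtracts allocatable from capacity. The capacity values reuse the examples from the table above; the allocatable values are made up for illustration, and parse_quantity is a simplified hypothetical helper that covers only the common suffixes.

```python
# Binary (Ki, Mi, ...) and decimal (k, M, ...) suffixes used by Kubernetes quantities.
BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
DECIMAL_SUFFIXES = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}


def parse_quantity(text):
    """Convert a quantity such as '32Gi', '7910m', or '8' to a float in base units."""
    text = text.strip()
    for suffix, factor in BINARY_SUFFIXES.items():
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * factor
    if text.endswith("m"):  # milli-units, e.g. 7910m CPU = 7.91 cores
        return float(text[:-1]) / 1000
    for suffix, factor in DECIMAL_SUFFIXES.items():
        if text.endswith(suffix):
            return float(text[:-1]) * factor
    return float(text)


def overhead(capacity, allocatable):
    """Return capacity minus allocatable for every resource present in both."""
    return {
        name: parse_quantity(value) - parse_quantity(allocatable[name])
        for name, value in capacity.items()
        if name in allocatable
    }


capacity = {"cpu": "8", "memory": "32Gi", "ephemeral-storage": "100Gi",
            "pods": "110", "nvidia.com/gpu": "4"}
# Illustrative allocatable values, not taken from a real node.
allocatable = {"cpu": "7910m", "memory": "31Gi", "ephemeral-storage": "92Gi",
               "pods": "110", "nvidia.com/gpu": "4"}

reserved = overhead(capacity, allocatable)
print(f"CPU reserved: {reserved['cpu']:.2f} cores")
print(f"Memory reserved: {reserved['memory'] / 2**30:.1f} Gi")
print(f"Ephemeral storage reserved: {reserved['ephemeral-storage'] / 2**30:.1f} Gi")
```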

Node Labels

The detail page displays all labels assigned to the node. Labels commonly include:

  • kubernetes.io/hostname — The node hostname
  • kubernetes.io/arch — CPU architecture
  • kubernetes.io/os — Operating system
  • node.kubernetes.io/instance-type — Instance type (on cloud providers)
  • nvidia.com/gpu.product — GPU model name (on GPU nodes)
  • nvidia.com/mig.config — Current MIG configuration label

GPU Management

For nodes equipped with NVIDIA GPUs, ACTIVATE provides tools to install and manage the NVIDIA GPU Operator and configure MIG partitioning directly from the node detail page.

NVIDIA GPU Operator

The GPU Operator automates the management of GPU drivers, container toolkits, and device plugins on Kubernetes. From the node detail page, you can install, upgrade, or roll back the GPU Operator Helm chart.

Installing the GPU Operator

  1. Navigate to the node detail page for a GPU-equipped node
  2. Click the GPU Operator button in the action bar
  3. Fill in the installation form:
     Field | Description | Default
     Helm Chart Version | The GPU Operator chart version to install | v25.3.0
     Namespace | The namespace for the GPU Operator deployment | gpu-operator
     Create Namespace | Whether to create the namespace if it does not exist | true
     Containerd Config | Path to the containerd configuration file (optional) | (none)
     Containerd Socket | Path to the containerd socket (optional) | (none)
  4. Click Install NVIDIA GPU Operator

The operator is installed from the https://helm.ngc.nvidia.com/nvidia Helm repository using the nvidia/gpu-operator chart.
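For readers who want to relate the form to the Helm CLI, the sketch below builds a roughly equivalent command line from the form fields. It is not ACTIVATE's implementation. The release name gpu-operator and the build_helm_args helper are assumptions, and mapping the containerd fields to the chart's toolkit.env values (CONTAINERD_CONFIG and CONTAINERD_SOCKET) follows the convention in NVIDIA's GPU Operator documentation rather than anything stated here.

```python
import shlex

REPO_URL = "https://helm.ngc.nvidia.com/nvidia"
CHART = "nvidia/gpu-operator"


def build_helm_args(version="v25.3.0", namespace="gpu-operator", create_namespace=True,
                    containerd_config=None, containerd_socket=None,
                    release="gpu-operator"):  # release name is an assumption
    """Build a 'helm upgrade --install' argument list from the form fields."""
    args = ["helm", "upgrade", "--install", release, CHART,
            "--version", version, "--namespace", namespace]
    if create_namespace:
        args.append("--create-namespace")
    # Assumed mapping of the optional containerd fields to toolkit.env entries.
    env = []
    if containerd_config:
        env.append(("CONTAINERD_CONFIG", containerd_config))
    if containerd_socket:
        env.append(("CONTAINERD_SOCKET", containerd_socket))
    for index, (name, value) in enumerate(env):
        args += ["--set", f"toolkit.env[{index}].name={name}",
                 "--set", f"toolkit.env[{index}].value={value}"]
    return args


# The chart reference nvidia/gpu-operator assumes the repository alias "nvidia".
repo_args = ["helm", "repo", "add", "nvidia", REPO_URL]
install_args = build_helm_args(containerd_config="/etc/containerd/config.toml")

print(shlex.join(repo_args))
print(shlex.join(install_args))
```

Using upgrade --install in the sketch means the same command covers both a first installation and a later upgrade.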

Upgrading the GPU Operator

If the GPU Operator is already installed, the same form appears with an Upgrade NVIDIA GPU Operator button instead. The upgrade uses the same Helm chart configuration.

Rolling Back

When the GPU Operator is installed, the drawer also shows the Release History table with all previous revisions. Click the rollback button next to any revision to revert to that version.
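In Helm CLI terms, the Release History table resembles the output of helm history and the rollback button resembles helm rollback. The sketch below checks a requested revision against a history listing before building the rollback command. The release name, the namespace, the rollback_args helper, and the sample history entries are all illustrative assumptions, and ACTIVATE may perform the rollback differently.

```python
import json


def rollback_args(history_json, target_revision,
                  release="gpu-operator", namespace="gpu-operator"):
    """Build a 'helm rollback' argument list after checking the revision exists."""
    history = json.loads(history_json)
    known_revisions = {entry["revision"] for entry in history}
    if target_revision not in known_revisions:
        raise ValueError(f"revision {target_revision} is not in the release history")
    return ["helm", "rollback", release, str(target_revision), "--namespace", namespace]


# Illustrative data shaped like 'helm history <release> -o json' output.
sample_history = json.dumps([
    {"revision": 1, "status": "superseded", "chart": "gpu-operator-v24.9.2",
     "description": "Install complete"},
    {"revision": 2, "status": "deployed", "chart": "gpu-operator-v25.3.0",
     "description": "Upgrade complete"},
])

print(" ".join(rollback_args(sample_history, 1)))
```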

MIG Configuration

NVIDIA Multi-Instance GPU (MIG) allows a single physical GPU to be partitioned into multiple isolated GPU instances, each with dedicated compute, memory, and bandwidth resources.

MIG Strategies

ACTIVATE supports two MIG strategies:

Strategy | Description
Single | All GPU instances on the node use the same MIG profile. Use this when all workloads on the node have identical GPU requirements.
Mixed | Different MIG profiles can coexist on the same GPU. Use this for heterogeneous workloads with varying GPU requirements.

Configuring MIG

  1. Navigate to the node detail page for a GPU node that has the GPU Operator installed

  2. Click the NVIDIA MIG button in the action bar

  3. Select a MIG Strategy (single or mixed)

  4. Enter a MIG Strategy Config value that specifies the MIG profile to apply

    Common configuration values include:

    • all-1g.6gb — All instances configured as 1g.6gb (smallest slice)
    • all-2g.12gb — All instances configured as 2g.12gb
    • all-3g.24gb — All instances configured as 3g.24gb
    • all-balanced — A balanced mix of MIG instance sizes (for mixed strategy)
  5. Click Configure MIG

GPU-Dependent Profiles

The available MIG profiles depend on the GPU model. The configuration drawer displays the default MIG partitioning options for the detected GPU type, loaded from the default-mig-parted-config ConfigMap managed by the GPU Operator.

What Happens During MIG Configuration

When you apply a MIG configuration, ACTIVATE performs two operations:

  1. Patches the cluster policy — Updates the clusterpolicies.nvidia.com/cluster-policy CRD to set the MIG strategy (e.g., mixed or single) at /spec/mig/strategy
  2. Labels the node — Applies the nvidia.com/mig.config label to the target node with the specified configuration value (e.g., all-balanced)

The GPU Operator detects these changes and automatically reconfigures the GPU partitioning on the node.
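The two operations can be pictured as two patch payloads. The sketch below only constructs them and does not contact a cluster. Expressing the first as a JSON Patch with a replace operation is inferred from the /spec/mig/strategy path notation, and sending the label as a merge patch is an assumption; build_mig_requests and the returned dictionary layout are hypothetical.

```python
import json

MIG_STRATEGIES = {"single", "mixed"}


def build_mig_requests(node_name, strategy, mig_config):
    """Return the two patch payloads described above, without sending them."""
    if strategy not in MIG_STRATEGIES:
        raise ValueError(f"unknown MIG strategy: {strategy!r}")
    # 1. JSON Patch against the cluster policy (the 'replace' op is an assumption).
    policy_patch = [{"op": "replace", "path": "/spec/mig/strategy", "value": strategy}]
    # 2. Merge patch that sets the MIG label on the target node (assumed patch type).
    node_patch = {"metadata": {"labels": {"nvidia.com/mig.config": mig_config}}}
    return {
        "cluster_policy": {
            "resource": "clusterpolicies.nvidia.com/cluster-policy",
            "content_type": "application/json-patch+json",
            "body": policy_patch,
        },
        "node": {
            "resource": f"nodes/{node_name}",
            "content_type": "application/merge-patch+json",
            "body": node_patch,
        },
    }


requests = build_mig_requests("gpu-node-1", "mixed", "all-balanced")
print(json.dumps(requests, indent=2))
```

Rejecting an unknown strategy before building either payload avoids patching the cluster policy and then failing before the node is labeled.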

Workload Disruption

Changing MIG configuration may temporarily disrupt GPU workloads running on the node. Plan MIG reconfiguration during maintenance windows when possible.

Cross-Cluster Queries

The nodes list view aggregates data from all connected clusters by default. The response metadata includes:

  • Total clusters queried — How many clusters were contacted
  • Successful clusters — How many responded successfully
  • Total nodes — The combined count of nodes returned

If a cluster is unreachable, the remaining clusters still return their results.
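This partial-failure behavior can be sketched as a fan-out that queries each cluster independently and records failures instead of aborting. The function name, the fetch_nodes callback, and the metadata key names are hypothetical; they mirror the three counters listed above rather than ACTIVATE's actual response schema.

```python
from concurrent.futures import ThreadPoolExecutor


def list_nodes_across_clusters(clusters, fetch_nodes):
    """Query every cluster in parallel and aggregate whatever succeeds."""

    def safe_fetch(cluster):
        # A failing cluster yields an error entry instead of raising.
        try:
            return cluster, fetch_nodes(cluster), None
        except Exception as exc:
            return cluster, [], str(exc)

    with ThreadPoolExecutor(max_workers=max(1, len(clusters))) as pool:
        results = list(pool.map(safe_fetch, clusters))

    nodes = []
    errors = {}
    for cluster, items, error in results:
        if error is None:
            nodes.extend(items)
        else:
            errors[cluster] = error

    return {
        "nodes": nodes,
        "metadata": {
            "total_clusters": len(clusters),
            "successful_clusters": len(clusters) - len(errors),
            "total_nodes": len(nodes),
        },
        "errors": errors,
    }


# Stand-in for a real cluster client: one cluster is unreachable.
FAKE_CLUSTERS = {"prod-east": ["gpu-node-1", "cpu-node-1"], "staging": ["cpu-node-2"]}


def fake_fetch(cluster):
    if cluster not in FAKE_CLUSTERS:
        raise ConnectionError(f"{cluster} is unreachable")
    return FAKE_CLUSTERS[cluster]


response = list_nodes_across_clusters(["prod-east", "dev-west", "staging"], fake_fetch)
print(response["metadata"])
print(response["errors"])
```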

See Also