Nodes & GPUs

ACTIVATE shows the nodes running across your Kubernetes clusters and provides tools for managing NVIDIA GPU configurations, including Multi-Instance GPU (MIG) partitioning.

Admin Only

The Nodes view is available to organization admins and platform admins only.

Viewing Cluster Nodes

Navigate to Kubernetes > Nodes in the sidebar to view all nodes across your connected clusters.

Node Table Columns

Column	Description
Name	The node hostname. Click to open the node detail page.
Cluster	The cluster the node belongs to (hidden when filtering by a single cluster).
Kubernetes Version	The kubelet version running on the node (e.g., `v1.28.4`).
Container Runtime	The container runtime and version (e.g., `containerd://1.7.2`).
Internal IP	The node's internal network IP address.
Architecture	The CPU architecture (e.g., `amd64`, `arm64`).

Filtering Nodes

Use the filter bar to narrow results:

Clusters — Show nodes from specific clusters only
Search — Free-text search across node name and cluster name
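The two filters can be pictured as a cluster match combined with a text match. The sketch below is illustrative only: the `name` and `cluster` keys are hypothetical, and treating the search as a case-insensitive substring match is an assumption rather than documented behavior.

```python
def filter_nodes(nodes, clusters=None, search=""):
    """Return the nodes that match the selected clusters and the search text.

    `nodes` is a list of dicts with hypothetical "name" and "cluster" keys.
    """
    selected = set(clusters or [])
    needle = search.strip().lower()
    result = []
    for node in nodes:
        # Clusters filter: keep only nodes from the selected clusters (if any).
        if selected and node["cluster"] not in selected:
            continue
        # Search filter. Assumption: case-insensitive substring match against
        # the node name or the cluster name.
        if needle and not (
            needle in node["name"].lower() or needle in node["cluster"].lower()
        ):
            continue
        result.append(node)
    return result


nodes = [
    {"name": "gpu-node-1", "cluster": "prod-east"},
    {"name": "cpu-node-1", "cluster": "prod-east"},
    {"name": "gpu-node-2", "cluster": "staging"},
]
print(filter_nodes(nodes, clusters=["prod-east"], search="gpu"))
```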

Node Detail Page

Click a node name to open its detail page. The page shows the node's system information, its capacity and allocatable resources, and its labels.

System Information

The detail page shows the following system-level properties:

Property	Description
OS Image	The operating system image (e.g., `Ubuntu 22.04.3 LTS`).
Kernel Version	The Linux kernel version.
Operating System	The OS type (e.g., `linux`).
Architecture	The CPU architecture.
Container Runtime Version	The container runtime and version.
Kubernetes Version	The kubelet version.
Internal IP	The node's internal IP address.
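These properties line up with fields that the Kubernetes API exposes on every Node object under `status.nodeInfo` and `status.addresses`. The following sketch shows that mapping on a Node represented as a plain dictionary. That ACTIVATE reads the values from these fields is an assumption, and the sample values are made up for illustration.

```python
def summarize_node(node):
    """Map a Kubernetes Node object (as a dict) to the properties listed above."""
    info = node["status"]["nodeInfo"]
    # A node can report several addresses; pick the one typed "InternalIP".
    internal_ip = next(
        (
            addr["address"]
            for addr in node["status"].get("addresses", [])
            if addr["type"] == "InternalIP"
        ),
        None,
    )
    return {
        "OS Image": info["osImage"],
        "Kernel Version": info["kernelVersion"],
        "Operating System": info["operatingSystem"],
        "Architecture": info["architecture"],
        "Container Runtime Version": info["containerRuntimeVersion"],
        "Kubernetes Version": info["kubeletVersion"],
        "Internal IP": internal_ip,
    }


# Illustrative Node object; all values are invented for this example.
example_node = {
    "status": {
        "nodeInfo": {
            "osImage": "Ubuntu 22.04.3 LTS",
            "kernelVersion": "5.15.0-91-generic",
            "operatingSystem": "linux",
            "architecture": "amd64",
            "containerRuntimeVersion": "containerd://1.7.2",
            "kubeletVersion": "v1.28.4",
        },
        "addresses": [
            {"type": "Hostname", "address": "gpu-node-1"},
            {"type": "InternalIP", "address": "10.0.0.12"},
        ],
    }
}

for key, value in summarize_node(example_node).items():
    print(f"{key}: {value}")
```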

Capacity and Allocatable Resources

Each node reports two sets of resource quantities:

Capacity — The total physical resources available on the node
Allocatable — The resources available for pod scheduling (capacity minus system-reserved resources)

Both sets include:

Resource	Format	Example
CPU	Number of cores	`8`
Memory	Gibibytes (Gi)	`32Gi`
Ephemeral Storage	Gibibytes (Gi)	`100Gi`
Pods	Maximum pod count	`110`
NVIDIA GPUs	GPU count (if present)	`4`

Resource Overhead

Compare the capacity and allocatable values to understand how much overhead is reserved for system components like the kubelet and OS processes.
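The comparison can be done by hand or scripted. The sketch below parses a subset of Kubernetes quantity strings (binary suffixes such as `Gi`, decimal suffixes such as `M`, and milli-units such as `7910m`) and subtracts allocatable from capacity. The sample values are invented, and the parser deliberately ignores less common forms such as exponent notation.

```python
from decimal import Decimal

# Multipliers for the binary and decimal suffixes handled by this sketch.
_SUFFIXES = {
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4,
    "k": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4,
}


def parse_quantity(value):
    """Parse a subset of Kubernetes quantity strings into a Decimal."""
    value = str(value)
    if value.endswith("m"):  # milli-units, e.g. "7910m" of CPU
        return Decimal(value[:-1]) / 1000
    for suffix, factor in _SUFFIXES.items():
        if value.endswith(suffix):
            return Decimal(value[: -len(suffix)]) * factor
    return Decimal(value)


def reserved(capacity, allocatable):
    """Return capacity minus allocatable for every resource present in both."""
    return {
        name: parse_quantity(capacity[name]) - parse_quantity(allocatable[name])
        for name in capacity
        if name in allocatable
    }


# Illustrative values only.
capacity = {"cpu": "8", "memory": "32Gi", "pods": "110"}
allocatable = {"cpu": "7910m", "memory": "31Gi", "pods": "110"}

overhead = reserved(capacity, allocatable)
print(overhead["cpu"])                 # CPU cores held back from scheduling
print(overhead["memory"] / 1024**2)    # memory held back, in MiB
```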

Node Labels

The detail page displays all labels assigned to the node. Labels commonly include:

kubernetes.io/hostname — The node hostname
kubernetes.io/arch — CPU architecture
kubernetes.io/os — Operating system
node.kubernetes.io/instance-type — Instance type (on cloud providers)
nvidia.com/gpu.product — GPU model name (on GPU nodes)
nvidia.com/mig.config — Current MIG configuration label
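The two NVIDIA labels are a convenient way to tell GPU nodes apart and to read the MIG configuration currently requested for a node. A minimal sketch, assuming the labels are available as a plain dictionary; the label values shown are illustrative.

```python
GPU_PRODUCT_LABEL = "nvidia.com/gpu.product"
MIG_CONFIG_LABEL = "nvidia.com/mig.config"


def gpu_summary(labels):
    """Summarize the GPU-related labels of a node."""
    return {
        "is_gpu_node": GPU_PRODUCT_LABEL in labels,
        "gpu_product": labels.get(GPU_PRODUCT_LABEL),
        "mig_config": labels.get(MIG_CONFIG_LABEL),
    }


# Illustrative label sets; the values are invented for this example.
gpu_labels = {
    "kubernetes.io/hostname": "gpu-node-1",
    "kubernetes.io/arch": "amd64",
    "nvidia.com/gpu.product": "NVIDIA-A30",
    "nvidia.com/mig.config": "all-balanced",
}
cpu_labels = {"kubernetes.io/hostname": "cpu-node-1", "kubernetes.io/arch": "amd64"}

print(gpu_summary(gpu_labels))
print(gpu_summary(cpu_labels))
```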

GPU Management

For nodes equipped with NVIDIA GPUs, ACTIVATE provides tools to install and manage the NVIDIA GPU Operator and configure MIG partitioning directly from the node detail page.

NVIDIA GPU Operator

The GPU Operator automates the management of GPU drivers, container toolkits, and device plugins on Kubernetes. From the node detail page, you can install, upgrade, or roll back the GPU Operator Helm chart.

Installing the GPU Operator

Navigate to the node detail page for a GPU-equipped node
Click the GPU Operator button in the action bar
Fill in the installation form:

Field	Description	Default
Helm Chart Version	The GPU Operator chart version to install	`v25.3.0`
Namespace	The namespace for the GPU Operator deployment	`gpu-operator`
Create Namespace	Whether to create the namespace if it does not exist	`true`
Containerd Config	Path to the containerd configuration file (optional)	—
Containerd Socket	Path to the containerd socket (optional)	—

Click Install NVIDIA GPU Operator

The operator is installed from the https://helm.ngc.nvidia.com/nvidia Helm repository using the nvidia/gpu-operator chart.
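For readers who want to relate the form to the Helm CLI, the sketch below assembles a roughly equivalent command line from the form fields. Several things here are assumptions rather than documented ACTIVATE behavior: the release name `gpu-operator`, the use of `helm upgrade --install`, and the mapping of the two containerd fields to the chart's `toolkit.env` values (`CONTAINERD_CONFIG` and `CONTAINERD_SOCKET`), which follows NVIDIA's GPU Operator installation guidance.

```python
def build_helm_command(
    version="v25.3.0",
    namespace="gpu-operator",
    create_namespace=True,
    containerd_config=None,
    containerd_socket=None,
    release="gpu-operator",  # hypothetical release name
):
    """Build an argv list for installing the nvidia/gpu-operator chart.

    Assumes the repository was added beforehand with:
      helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
    """
    cmd = [
        "helm", "upgrade", "--install", release, "nvidia/gpu-operator",
        "--namespace", namespace,
        "--version", version,
    ]
    if create_namespace:
        cmd.append("--create-namespace")

    # Assumption: optional containerd paths are passed as toolkit env vars.
    env = []
    if containerd_config:
        env.append(("CONTAINERD_CONFIG", containerd_config))
    if containerd_socket:
        env.append(("CONTAINERD_SOCKET", containerd_socket))
    for index, (name, value) in enumerate(env):
        cmd += [
            "--set", f"toolkit.env[{index}].name={name}",
            "--set", f"toolkit.env[{index}].value={value}",
        ]
    return cmd


print(" ".join(build_helm_command(containerd_socket="/run/containerd/containerd.sock")))
```

Building the command as a list of arguments rather than a single string avoids shell-quoting problems if it is later passed to `subprocess.run`.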

Upgrading the GPU Operator

If the GPU Operator is already installed, the same form appears with an Upgrade NVIDIA GPU Operator button instead. The upgrade uses the same Helm chart configuration.

Rolling Back

When the GPU Operator is installed, the drawer also shows the Release History table with all previous revisions. Click the rollback button next to any revision to revert to that version.

MIG Configuration

NVIDIA Multi-Instance GPU (MIG) allows a single physical GPU to be partitioned into multiple isolated GPU instances, each with dedicated compute, memory, and bandwidth resources.

MIG Strategies

ACTIVATE supports two MIG strategies:

Strategy	Description
Single	All GPU instances on the node use the same MIG profile. Use this when all workloads on the node have identical GPU requirements.
Mixed	Different MIG profiles can coexist on the same GPU. Use this for heterogeneous workloads with varying GPU requirements.
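The strategy also affects how workloads request GPU slices. With the NVIDIA device plugin, the single strategy advertises MIG devices under the generic `nvidia.com/gpu` resource, while the mixed strategy advertises one resource per profile, named `nvidia.com/mig-<profile>`. The helper below sketches that naming rule. It reflects the NVIDIA device plugin's naming convention and is not specific to ACTIVATE; the function name is hypothetical.

```python
def gpu_resource_name(strategy, profile=None):
    """Return the extended resource name a pod would request for a MIG slice."""
    if strategy == "single":
        # All MIG devices on the node are identical, so they share one name.
        return "nvidia.com/gpu"
    if strategy == "mixed":
        if not profile:
            raise ValueError("the mixed strategy needs a MIG profile, e.g. '1g.6gb'")
        # Each profile is exposed as its own resource.
        return f"nvidia.com/mig-{profile}"
    raise ValueError(f"unknown MIG strategy: {strategy!r}")


print(gpu_resource_name("single"))
print(gpu_resource_name("mixed", "1g.6gb"))
```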

Configuring MIG

Navigate to the node detail page for a GPU node that has the GPU Operator installed
Click the NVIDIA MIG button in the action bar
Select a MIG Strategy (single or mixed)
Enter a MIG Strategy Config value that specifies the MIG profile to apply

Common configuration values include (these examples use NVIDIA A30 profile names):
- all-1g.6gb: All instances configured as 1g.6gb (smallest slice)
- all-2g.12gb: All instances configured as 2g.12gb
- all-4g.24gb: All instances configured as 4g.24gb
- all-balanced: A balanced mix of MIG instance sizes (for mixed strategy)
Click Configure MIG

GPU-Dependent Profiles

The available MIG profiles depend on the GPU model. The configuration drawer displays the default MIG partitioning options for the detected GPU type, loaded from the default-mig-parted-config ConfigMap managed by the GPU Operator.

What Happens During MIG Configuration

When you apply a MIG configuration, ACTIVATE performs two operations:

Patches the cluster policy: Updates the ClusterPolicy custom resource clusterpolicies.nvidia.com/cluster-policy to set the MIG strategy (e.g., mixed or single) at /spec/mig/strategy
Labels the node: Applies the nvidia.com/mig.config label to the target node with the specified configuration value (e.g., all-balanced)

The GPU Operator detects these changes and automatically reconfigures the GPU partitioning on the node.
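The two operations can be expressed as two small patch documents. The sketch below builds them as plain Python data. Using a JSON Patch for the cluster policy and a merge patch for the node label is an assumption based on the `/spec/mig/strategy` path notation; the function name is hypothetical.

```python
import json


def build_mig_patches(strategy, mig_config):
    """Return (cluster_policy_patch, node_patch) for a MIG configuration change."""
    if strategy not in ("single", "mixed"):
        raise ValueError(f"unknown MIG strategy: {strategy!r}")

    # 1. JSON Patch against the ClusterPolicy named "cluster-policy".
    #    Roughly: kubectl patch clusterpolicies.nvidia.com/cluster-policy \
    #               --type=json -p '<this document>'
    cluster_policy_patch = [
        {"op": "replace", "path": "/spec/mig/strategy", "value": strategy}
    ]

    # 2. Merge patch that sets the MIG label on the target node.
    #    Roughly: kubectl label node <node> nvidia.com/mig.config=<value> --overwrite
    node_patch = {"metadata": {"labels": {"nvidia.com/mig.config": mig_config}}}

    return cluster_policy_patch, node_patch


policy_patch, node_patch = build_mig_patches("mixed", "all-balanced")
print(json.dumps(policy_patch))
print(json.dumps(node_patch))
```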

Workload Disruption

Changing MIG configuration may temporarily disrupt GPU workloads running on the node. Plan MIG reconfiguration during maintenance windows when possible.

Cross-Cluster Queries

The nodes list view aggregates data from all connected clusters by default. The response metadata includes:

Total clusters queried — How many clusters were contacted
Successful clusters — How many responded successfully
Total nodes — The combined count of nodes returned

If a cluster is unreachable, the remaining clusters still return their results.
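The partial-failure behavior can be sketched as a loop that tolerates errors from individual clusters while still counting them as queried. The function, the metadata key names, and the fetcher callables below are hypothetical and only illustrate the pattern; they are not ACTIVATE's actual response schema.

```python
def aggregate_nodes(fetchers):
    """Collect nodes from every cluster, skipping clusters that fail.

    `fetchers` maps a cluster name to a zero-argument callable that returns
    that cluster's list of nodes or raises on failure.
    """
    nodes = []
    successful = 0
    for cluster, fetch in fetchers.items():
        try:
            cluster_nodes = fetch()
        except Exception:
            # An unreachable cluster does not prevent the others from reporting.
            continue
        successful += 1
        nodes.extend(cluster_nodes)
    return {
        "nodes": nodes,
        "meta": {
            "total_clusters_queried": len(fetchers),
            "successful_clusters": successful,
            "total_nodes": len(nodes),
        },
    }


def unreachable():
    raise ConnectionError("cluster did not respond")


result = aggregate_nodes({
    "prod-east": lambda: ["gpu-node-1", "cpu-node-1"],
    "staging": lambda: ["gpu-node-2"],
    "edge": unreachable,
})
print(result["meta"])
```

A gap between the queried and successful cluster counts signals that the node list is incomplete.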
