Nodes & GPUs
ACTIVATE provides visibility into the nodes running across your Kubernetes clusters and tools for managing NVIDIA GPU configurations, including Multi-Instance GPU (MIG) partitioning.
Admin Only
The Nodes view is available to organization admins and platform admins only.
Viewing Cluster Nodes
Navigate to Kubernetes > Nodes in the sidebar to view all nodes across your connected clusters.
Node Table Columns
| Column | Description |
|---|---|
| Name | The node hostname. Click to open the node detail page. |
| Cluster | The cluster the node belongs to (hidden when filtering by a single cluster). |
| Kubernetes Version | The kubelet version running on the node (e.g., v1.28.4). |
| Container Runtime | The container runtime and version (e.g., containerd://1.7.2). |
| Internal IP | The node's internal network IP address. |
| Architecture | The CPU architecture (e.g., amd64, arm64). |
Filtering Nodes
Use the filter bar to narrow results:
- Clusters — Show nodes from specific clusters only
- Search — Free-text search across node name and cluster name
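The filter behavior described above can be sketched in a few lines of Python. This is only an illustration: the `Node` record and `filter_nodes` helper are hypothetical names, and the case-insensitive substring match is an assumption about how the free-text search behaves.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Node:
    """Hypothetical node record with only the fields the filter bar uses."""
    name: str
    cluster: str


def filter_nodes(nodes: Iterable[Node],
                 clusters: Optional[set[str]] = None,
                 search: str = "") -> list[Node]:
    """Keep nodes in the selected clusters whose name or cluster matches the search text."""
    needle = search.strip().lower()
    result = []
    for node in nodes:
        # Clusters filter: skip nodes outside the selected clusters (if any are selected).
        if clusters and node.cluster not in clusters:
            continue
        # Assumption: search is a case-insensitive substring match on name or cluster.
        if needle and needle not in node.name.lower() and needle not in node.cluster.lower():
            continue
        result.append(node)
    return result
```

Both filters combine with AND semantics in this sketch, so a node must be in a selected cluster and match the search text to be listed.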
Node Detail Page
Click a node name to open its detail page. This page displays comprehensive information about the selected node.
System Information
The detail page shows the following system-level properties:
| Property | Description |
|---|---|
| OS Image | The operating system image (e.g., Ubuntu 22.04.3 LTS). |
| Kernel Version | The Linux kernel version. |
| Operating System | The OS type (e.g., linux). |
| Architecture | The CPU architecture. |
| Container Runtime Version | The container runtime and version. |
| Kubernetes Version | The kubelet version. |
| Internal IP | The node's internal IP address. |
Capacity and Allocatable Resources
Each node reports two sets of resource quantities:
- Capacity — The total physical resources available on the node
- Allocatable — The resources available for pod scheduling (capacity minus system-reserved resources)
Both sets include:
| Resource | Format | Example |
|---|---|---|
| CPU | Number of cores | 8 |
| Memory | Gibibytes (Gi) | 32Gi |
| Ephemeral Storage | Gibibytes (Gi) | 100Gi |
| Pods | Maximum pod count | 110 |
| NVIDIA GPUs | GPU count (if present) | 4 |
Resource Overhead
Compare the capacity and allocatable values to understand how much overhead is reserved for system components like the kubelet and OS processes.
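As a rough sketch of that comparison, the Python below subtracts allocatable from capacity for each resource. The `parse_quantity` and `overhead` helpers are hypothetical, the parser handles only the common Kubernetes quantity suffixes, and the sample values are made up for illustration.

```python
from decimal import Decimal

# Multipliers for common Kubernetes quantity suffixes (binary, decimal, and milli).
_SUFFIXES = {
    "Ki": Decimal(2) ** 10, "Mi": Decimal(2) ** 20,
    "Gi": Decimal(2) ** 30, "Ti": Decimal(2) ** 40,
    "m": Decimal("0.001"), "k": Decimal(10) ** 3,
    "M": Decimal(10) ** 6, "G": Decimal(10) ** 9, "T": Decimal(10) ** 12,
}


def parse_quantity(value: str) -> Decimal:
    """Convert a quantity such as '8', '7910m', or '32Gi' to a plain number."""
    # Try the two-character suffixes before the one-character ones.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if value.endswith(suffix):
            return Decimal(value[: -len(suffix)]) * _SUFFIXES[suffix]
    return Decimal(value)


def overhead(capacity: dict[str, str], allocatable: dict[str, str]) -> dict[str, Decimal]:
    """Return capacity minus allocatable for every resource present in both sets."""
    return {
        name: parse_quantity(capacity[name]) - parse_quantity(allocatable[name])
        for name in capacity
        if name in allocatable
    }


# Illustrative values only; real nodes report their own quantities.
capacity = {"cpu": "8", "memory": "32Gi", "pods": "110"}
allocatable = {"cpu": "7910m", "memory": "31Gi", "pods": "110"}
reserved = overhead(capacity, allocatable)
print(reserved)
```

CPU overhead comes out in cores and memory overhead in bytes, so convert before comparing them with the values shown on the detail page.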
Node Labels
The detail page displays all labels assigned to the node. Labels commonly include:
- kubernetes.io/hostname — The node hostname
- kubernetes.io/arch — CPU architecture
- kubernetes.io/os — Operating system
- node.kubernetes.io/instance-type — Instance type (on cloud providers)
- nvidia.com/gpu.product — GPU model name (on GPU nodes)
- nvidia.com/mig.config — Current MIG configuration label
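If you export a node's labels, the GPU-related ones can be read with a small helper like the one below. This is a sketch: `gpu_summary` is a hypothetical name, and treating the presence of `nvidia.com/gpu.product` as the signal for a GPU node is an assumption based on the list above.

```python
from typing import Optional


def gpu_summary(labels: dict[str, str]) -> Optional[dict[str, Optional[str]]]:
    """Return the GPU model and MIG config from node labels, or None for non-GPU nodes."""
    # Assumption: only GPU nodes carry the nvidia.com/gpu.product label.
    product = labels.get("nvidia.com/gpu.product")
    if product is None:
        return None
    return {
        "product": product,
        # The MIG label may be absent if MIG has never been configured.
        "mig_config": labels.get("nvidia.com/mig.config"),
    }
```

The function returns `None` for nodes without the product label, which makes it easy to skip CPU-only nodes when scanning a list.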
GPU Management
For nodes equipped with NVIDIA GPUs, ACTIVATE provides tools to install and manage the NVIDIA GPU Operator and configure MIG partitioning directly from the node detail page.
NVIDIA GPU Operator
The GPU Operator automates the management of GPU drivers, container toolkits, and device plugins on Kubernetes. From the node detail page, you can install, upgrade, or roll back the GPU Operator Helm chart.
Installing the GPU Operator
- Navigate to the node detail page for a GPU-equipped node
- Click the GPU Operator button in the action bar
- Fill in the installation form:
| Field | Description | Default |
|---|---|---|
| Helm Chart Version | The GPU Operator chart version to install | v25.3.0 |
| Namespace | The namespace for the GPU Operator deployment | gpu-operator |
| Create Namespace | Whether to create the namespace if it does not exist | true |
| Containerd Config | Path to the containerd configuration file (optional) | — |
| Containerd Socket | Path to the containerd socket (optional) | — |
- Click Install NVIDIA GPU Operator
The operator is installed from the https://helm.ngc.nvidia.com/nvidia Helm repository using the nvidia/gpu-operator chart.
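For readers who want to relate the form to a Helm command line, the sketch below builds a roughly equivalent `helm upgrade --install` invocation. It is not ACTIVATE's implementation: the release name `gpu-operator` and the mapping of the containerd fields to the chart's `toolkit.env` values are assumptions, and `helm_install_args` is a hypothetical helper.

```python
import shlex
from typing import Optional

REPO_URL = "https://helm.ngc.nvidia.com/nvidia"


def helm_install_args(version: str = "v25.3.0",
                      namespace: str = "gpu-operator",
                      create_namespace: bool = True,
                      containerd_config: Optional[str] = None,
                      containerd_socket: Optional[str] = None) -> list[str]:
    """Build a `helm upgrade --install` argument list mirroring the form fields."""
    # Assumption: the release is named "gpu-operator".
    args = ["helm", "upgrade", "--install", "gpu-operator", "gpu-operator",
            "--repo", REPO_URL, "--namespace", namespace, "--version", version]
    if create_namespace:
        args.append("--create-namespace")
    # Assumption: the optional containerd fields are passed as toolkit.env entries.
    env = []
    if containerd_config:
        env.append(("CONTAINERD_CONFIG", containerd_config))
    if containerd_socket:
        env.append(("CONTAINERD_SOCKET", containerd_socket))
    for i, (name, value) in enumerate(env):
        args += ["--set", f"toolkit.env[{i}].name={name}",
                 "--set", f"toolkit.env[{i}].value={value}"]
    return args


# Print a shell-quoted command using the form defaults.
print(shlex.join(helm_install_args()))
```

Because `helm upgrade --install` installs the release when it is missing and upgrades it otherwise, the same argument list covers both the install and upgrade cases.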
Upgrading the GPU Operator
If the GPU Operator is already installed, the same form appears with an Upgrade NVIDIA GPU Operator button instead. The upgrade uses the same Helm chart configuration.
Rolling Back
When the GPU Operator is installed, the GPU Operator drawer also shows a Release History table listing all previous revisions. Click the rollback button next to a revision to revert the release to that revision.
MIG Configuration
NVIDIA Multi-Instance GPU (MIG) allows a single physical GPU to be partitioned into multiple isolated GPU instances, each with dedicated compute, memory, and bandwidth resources.
MIG Strategies
ACTIVATE supports two MIG strategies:
| Strategy | Description |
|---|---|
| Single | All GPU instances on the node use the same MIG profile. Use this when all workloads on the node have identical GPU requirements. |
| Mixed | Different MIG profiles can coexist on the same GPU. Use this for heterogeneous workloads with varying GPU requirements. |
Configuring MIG
- Navigate to the node detail page for a GPU node that has the GPU Operator installed
- Click the NVIDIA MIG button in the action bar
- Select a MIG Strategy (single or mixed)
- Enter a MIG Strategy Config value that specifies the MIG profile to apply
  Common configuration values include:
  - all-1g.6gb — All instances configured as 1g.6gb (smallest slice)
  - all-2g.12gb — All instances configured as 2g.12gb
  - all-3g.24gb — All instances configured as 3g.24gb
  - all-balanced — A balanced mix of MIG instance sizes (for mixed strategy)
- Click Configure MIG
GPU-Dependent Profiles
The available MIG profiles depend on the GPU model. The configuration drawer displays the default MIG partitioning options for the detected GPU type, loaded from the default-mig-parted-config ConfigMap managed by the GPU Operator.
What Happens During MIG Configuration
When you apply a MIG configuration, ACTIVATE performs two operations:
- Patches the cluster policy — Updates the clusterpolicies.nvidia.com/cluster-policy custom resource to set the MIG strategy (e.g., mixed or single) at /spec/mig/strategy
- Labels the node — Applies the nvidia.com/mig.config label to the target node with the specified configuration value (e.g., all-balanced)
The GPU Operator detects these changes and automatically reconfigures the GPU partitioning on the node.
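The two operations can be pictured as two patch payloads. The sketch below builds them in Python without sending anything. It assumes the cluster policy is updated with a JSON Patch `replace` operation and the node label with a merge patch; the actual request format ACTIVATE uses is not documented here, and `build_mig_requests` is a hypothetical helper.

```python
import json

MIG_STRATEGIES = {"single", "mixed"}


def build_mig_requests(strategy: str, mig_config: str) -> tuple[list[dict], dict]:
    """Return (JSON Patch for the cluster policy, merge patch for the node label)."""
    if strategy not in MIG_STRATEGIES:
        raise ValueError(f"unsupported MIG strategy: {strategy!r}")
    # Assumption: a JSON Patch "replace" targets the documented path.
    policy_patch = [{"op": "replace", "path": "/spec/mig/strategy", "value": strategy}]
    # Assumption: the node label is set with a merge patch on metadata.labels.
    node_patch = {"metadata": {"labels": {"nvidia.com/mig.config": mig_config}}}
    return policy_patch, node_patch


policy_patch, node_patch = build_mig_requests("mixed", "all-balanced")
print(json.dumps(policy_patch, indent=2))
print(json.dumps(node_patch, indent=2))
```

Validating the strategy up front mirrors the two options offered in the drawer and avoids patching the cluster policy with a value the GPU Operator would not recognize.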
Workload Disruption
Changing MIG configuration may temporarily disrupt GPU workloads running on the node. Plan MIG reconfiguration during maintenance windows when possible.
Cross-Cluster Queries
The nodes list view aggregates data from all connected clusters by default. The response metadata includes:
- Total clusters queried — How many clusters were contacted
- Successful clusters — How many responded successfully
- Total nodes — The combined count of nodes returned
If a cluster is unreachable, the remaining clusters still return their results.
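A minimal sketch of this partial-failure behavior is shown below. The `aggregate_nodes` helper, the per-cluster fetch callables, and the metadata key names are all hypothetical; they only mirror the three counts listed above.

```python
from typing import Callable


def aggregate_nodes(fetchers: dict[str, Callable[[], list[dict]]]) -> dict:
    """Query every cluster, skip the ones that fail, and summarize the results."""
    nodes: list[dict] = []
    successful = 0
    for cluster, fetch in fetchers.items():
        try:
            cluster_nodes = fetch()
        except Exception:
            # An unreachable cluster does not block results from the others.
            continue
        successful += 1
        # Tag each node with the cluster it came from.
        nodes.extend({**node, "cluster": cluster} for node in cluster_nodes)
    return {
        "nodes": nodes,
        "metadata": {
            "total_clusters": len(fetchers),
            "successful_clusters": successful,
            "total_nodes": len(nodes),
        },
    }
```

Comparing the total and successful cluster counts in the metadata is a quick way to tell whether the node list is complete or missing results from an unreachable cluster.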
See Also
- Resource Quotas — Set GPU limits per namespace
- Helm Charts — Manage Helm releases including the GPU Operator
- Managing Workloads — View GPU workloads running across your clusters