HAMi Introduces Dynamic MIG Support for NVIDIA GPUs

2025-02-10


Introduction

HAMi now supports dynamic MIG, using mig-parted to adjust MIG devices on the fly. This includes:

  • Dynamic MIG instance management: Users don't need to operate on the GPU node (for example, running 'nvidia-smi -i 0 -mig 1' or other commands) to manage MIG instances; everything is handled by the HAMi device plugin.
  • Dynamic MIG adjustment: Each MIG device managed by HAMi dynamically adjusts its MIG template according to the tasks submitted, when necessary.
  • Device MIG observation: Each MIG instance generated by HAMi is shown in the scheduler monitor, including task information, so users get a clear overview of MIG nodes.
  • Compatible with HAMi-core nodes: HAMi can manage a unified GPU pool of HAMi-core nodes and MIG nodes. A task can be scheduled to either kind of node unless it is pinned manually with the nvidia.com/vgpu-mode annotation.
  • Unified API with HAMi-core: No work is needed to make a job compatible with the dynamic MIG feature.

Design

How HAMi supports NVIDIA MIG

Prerequisites

  • NVIDIA Blackwell, Hopper™, or Ampere devices
  • HAMi >= v2.5.0 (HAMi v2.5.0 release)
  • NVIDIA Container Toolkit (nvidia-container-toolkit)

Enabling Dynamic-mig Support

Step 1: Install the chart using Helm. See the 'enabling vGPU support in kubernetes' section here.
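
For reference, a typical installation looks like the following. This is a minimal sketch that assumes the chart repository from the HAMi documentation and the default release name:

helm repo add hami-charts https://project-hami.github.io/HAMi/
helm repo update
helm install hami hami-charts/hami -n kube-system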

Step 2: In the device-plugin ConfigMap, set the operating mode to 'mig' for MIG nodes:

kubectl describe cm hami-device-plugin -n kube-system
{
    "nodeconfig": [
        {
            "name": "MIG-NODE-A",
            "operatingmode": "mig",
            "filterdevices": {
              "uuid": [],
              "index": []
            }
        }
    ]
}
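
To apply this setting, you can edit the ConfigMap in place (a minimal sketch; 'MIG-NODE-A' is a placeholder for your node name):

# Open the device-plugin ConfigMap and add or update the
# "nodeconfig" entry shown above for each MIG node.
kubectl edit cm hami-device-plugin -n kube-system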

Step 3: Restart the following pods for the change to take effect (one way to do this is sketched after the list):

  • hami-scheduler
  • hami-device-plugin on 'MIG-NODE-A'
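
The sketch below assumes the default resource names (a Deployment for the scheduler, a DaemonSet for the device plugin):

# Restart the scheduler.
kubectl -n kube-system rollout restart deployment hami-scheduler
# Find the device-plugin pod running on 'MIG-NODE-A' and delete it;
# the DaemonSet will recreate it with the new configuration.
kubectl -n kube-system get pods -o wide | grep hami-device-plugin
kubectl -n kube-system delete pod <hami-device-plugin-pod-on-MIG-NODE-A>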

Custom MIG configuration (Optional)

HAMi ships with a built-in MIG configuration.

You can customize it by following the steps below:

Change the content of 'device-configmap.yaml' in charts/hami/templates/scheduler, as follows:

    nvidia:
      resourceCountName: {{ .Values.resourceName }}
      resourceMemoryName: {{ .Values.resourceMem }}
      resourceMemoryPercentageName: {{ .Values.resourceMemPercentage }}
      resourceCoreName: {{ .Values.resourceCores }}
      resourcePriorityName: {{ .Values.resourcePriority }}
      overwriteEnv: false
      defaultMemory: 0
      defaultCores: 0
      defaultGPUNum: 1
      deviceSplitCount: {{ .Values.devicePlugin.deviceSplitCount }}
      deviceMemoryScaling: {{ .Values.devicePlugin.deviceMemoryScaling }}
      deviceCoreScaling: {{ .Values.devicePlugin.deviceCoreScaling }}
      knownMigGeometries:
      - models: [ "A30" ]
        allowedGeometries:
          - 
            - name: 1g.6gb
              memory: 6144
              count: 4
          - 
            - name: 2g.12gb
              memory: 12288
              count: 2
          - 
            - name: 4g.24gb
              memory: 24576
              count: 1
      - models: [ "A100-SXM4-40GB", "A100-40GB-PCIe", "A100-PCIE-40GB", "A100-SXM4-40GB" ]
        allowedGeometries:
          - 
            - name: 1g.5gb
              memory: 5120
              count: 7
          - 
            - name: 2g.10gb
              memory: 10240
              count: 3
            - name: 1g.5gb
              memory: 5120
              count: 1
          - 
            - name: 3g.20gb
              memory: 20480
              count: 2
          - 
            - name: 7g.40gb
              memory: 40960
              count: 1
      - models: [ "A100-SXM4-80GB", "A100-80GB-PCIe", "A100-PCIE-80GB"]
        allowedGeometries:
          - 
            - name: 1g.10gb
              memory: 10240
              count: 7
          - 
            - name: 2g.20gb
              memory: 20480
              count: 3
            - name: 1g.10gb
              memory: 10240
              count: 1
          - 
            - name: 3g.40gb
              memory: 40960
              count: 2
          - 
            - name: 7g.79gb
              memory: 80896
              count: 1

Note: Helm installation and upgrades are based on the configuration in this file, overwriting the chart's built-in configuration.

Note: Be aware that HAMi finds and uses the first MIG template that fits the task, in the order given in this ConfigMap. For example, a container requesting nvidia.com/gpumem: 8000 on an A100-40GB node matches the 2g.10gb geometry (10240 MB) before the larger 3g.20gb geometry is considered.

Running MIG jobs

A MIG instance can now be requested by a container in the same way as with hami-core, simply by specifying the nvidia.com/gpu and nvidia.com/gpumem resource types:

apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
  annotations:
    nvidia.com/vgpu-mode: "mig" # (Optional) If not set, this pod can be assigned to either a MIG instance or a hami-core instance
spec:
  containers:
    - name: ubuntu-container
      image: ubuntu:18.04
      command: ["bash", "-c", "sleep 86400"]
      resources:
        limits:
          nvidia.com/gpu: 2 
          nvidia.com/gpumem: 8000

In the example above, the task allocates two MIG instances, each with at least 8 GB of device memory.
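
To sanity-check the allocation, you can list the devices visible inside the container. This is a sketch that assumes the manifest above is saved as gpu-pod.yaml; the output shown is illustrative:

kubectl apply -f gpu-pod.yaml
# Once the pod is running, list the GPU devices visible to the container.
kubectl exec gpu-pod -- nvidia-smi -L
# Expected shape: one MIG device line per allocated instance, e.g.
# GPU 0: NVIDIA A100-SXM4-40GB (UUID: GPU-...)
#   MIG 2g.10gb Device 0: (UUID: MIG-...)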

Monitor MIG Instance

MIG instances managed by HAMi are displayed in the scheduler monitor (scheduler node IP:31993/metrics), as follows:

# HELP nodeGPUMigInstance GPU Sharing mode. 0 for hami-core, 1 for mig, 2 for mps
# TYPE nodeGPUMigInstance gauge
nodeGPUMigInstance{deviceidx="0",deviceuuid="GPU-936619fc-f6a1-74a8-0bc6-ecf6b3269313",migname="3g.20gb-0",nodeid="aio-node15",zone="vGPU"} 1
nodeGPUMigInstance{deviceidx="0",deviceuuid="GPU-936619fc-f6a1-74a8-0bc6-ecf6b3269313",migname="3g.20gb-1",nodeid="aio-node15",zone="vGPU"} 0
nodeGPUMigInstance{deviceidx="1",deviceuuid="GPU-30f90f49-43ab-0a78-bf5c-93ed41ef2da2",migname="3g.20gb-0",nodeid="aio-node15",zone="vGPU"} 1
nodeGPUMigInstance{deviceidx="1",deviceuuid="GPU-30f90f49-43ab-0a78-bf5c-93ed41ef2da2",migname="3g.20gb-1",nodeid="aio-node15",zone="vGPU"} 1

Notes

  1. You don't need to do anything on the MIG node; everything is managed by mig-parted in the hami-device-plugin.
  2. NVIDIA devices older than the Ampere architecture can't use 'mig' mode.
  3. You won't see any MIG resources (i.e., nvidia.com/mig-1g.10gb) on the node; HAMi uses a unified resource name for both 'mig' and 'hami-core' nodes.

Reference

To learn more about RiseUnion's GPU virtualization and computing power management solutions, reach out at contact@riseunion.io.