Project HAMi: Heterogeneous AI Computing Virtualization Middleware

2024-11-29



Introduction

HAMi, formerly known as 'k8s-vGPU-scheduler', is a heterogeneous device management middleware for Kubernetes. It can manage different types of heterogeneous devices (such as GPUs and NPUs), share them among pods, and make better scheduling decisions based on device topology and scheduling policies.

HAMi aims to bridge the gap between different heterogeneous devices and to provide a unified management interface that requires no changes to user applications. As of June 2024, HAMi has been widely adopted around the world in industries such as Internet services, cloud computing, finance, and manufacturing. More than 40 companies and institutions are not only end users but also active contributors.


HAMi is a Cloud Native Computing Foundation (CNCF) sandbox and landscape project, as well as a CNAI Landscape project.

Device Virtualization

HAMi provides device virtualization for several heterogeneous devices including GPU, supporting device sharing and device resource isolation. For the list of devices supporting device virtualization, please refer to supported devices.

Device Sharing Capabilities

  • Allows partial device allocation by specifying device memory
  • Hard isolation of computing resources
  • Permits partial device allocation by specifying device core usage
  • Requires zero changes to existing programs


Device Resource Isolation

HAMi supports hard isolation of device resources. Here is a simple demonstration using an NVIDIA GPU as an example. Submit a task defined as follows:

      resources:
        limits:
          nvidia.com/gpu: 1 # requesting 1 vGPU
          nvidia.com/gpumem: 3000 # Each vGPU contains 3000M device memory

Only 3G of device memory will be visible inside the container.


Project Architecture

[HAMi architecture diagram]

HAMi consists of several components: a unified mutating webhook, a unified scheduler, and device plugins together with in-container control components for the various heterogeneous computing devices. The overall architecture is shown in the diagram above.
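To make the webhook's role concrete, here is a hedged sketch of its effect on a submitted pod: a pod requesting HAMi-managed resources has its schedulerName rewritten so the unified scheduler handles placement. The scheduler name below is assumed to be the default installed by the Helm chart:

apiVersion: v1
kind: Pod
metadata:
  name: vgpu-example
spec:
  schedulerName: hami-scheduler # injected by the mutating webhook (assumed default name)
  containers:
    - name: cuda
      image: nvidia/cuda:11.6.2-base-ubuntu20.04
      command: ["bash", "-c", "sleep 86400"]
      resources:
        limits:
          nvidia.com/gpu: "1" # requesting a HAMi-managed resource triggers the mutation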

Quick Start

Choose Your Cluster Scheduler

HAMi works with either the default kube-scheduler or the volcano-scheduler.

Installation Requirements

  • NVIDIA drivers >= 440
  • nvidia-docker version > 2.0
  • nvidia configured as the default runtime for your containerd/docker/cri-o container runtime (see the sketch after this list)
  • Kubernetes version >= 1.16
  • glibc >= 2.17 & glibc < 2.30
  • kernel version >= 3.10
  • helm > 3.0
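For Docker, setting nvidia as the default runtime typically means editing /etc/docker/daemon.json and restarting the daemon. A minimal sketch, assuming nvidia-container-runtime is installed at its usual path:

{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}

Restart Docker afterwards (for example, systemctl restart docker). Equivalent settings exist for containerd and cri-o.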

Installation

  1. Label your GPU nodes for scheduling with HAMi by adding the label "gpu=on". Without this label, the nodes cannot be managed by our scheduler.
kubectl label nodes {nodeid} gpu=on
  2. Add our repo in helm
helm repo add hami-charts https://project-hami.github.io/HAMi/
  3. Use the following command for deployment
helm install hami hami-charts/hami -n kube-system

You can customize your installation by adjusting the configs (an example follows below).

  4. Verify your installation using the following command
kubectl get pods -n kube-system

If both vgpu-device-plugin and vgpu-scheduler pods are in the Running state, your installation is successful.
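As an example of adjusting the configs, the monitoring port described in the Monitoring section below is exposed through a chart value; a hedged sketch of overriding it at install time (the value name devicePlugin.service.httpPort and its default 31993 come from that section):

helm install hami hami-charts/hami -n kube-system --set devicePlugin.service.httpPort=31993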

WebUI

HAMi-WebUI is available starting from HAMi v2.4.0; see the deployment instructions for details.


Example Task Submission

NVIDIA vGPUs can now be requested by containers using the resource type nvidia.com/gpu:

apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  containers:
    - name: ubuntu-container
      image: ubuntu:18.04
      command: ["bash", "-c", "sleep 86400"]
      resources:
        limits:
          nvidia.com/gpu: "2" # requesting 2 vGPUs
          nvidia.com/gpumem: "3000" # Each vGPU contains 3000M device memory (optional, integer)
          nvidia.com/gpucores: "30" # Each vGPU uses 30% of actual GPU computing power (optional, integer)
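To submit the task, save the manifest as a file (for example gpu-pod.yaml, a name assumed here) and apply it:

kubectl apply -f gpu-pod.yaml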

If your task cannot run on any node (for example, if the task's nvidia.com/gpu request exceeds the actual GPU count of every GPU node in the cluster), the task will remain in the Pending state.

You can now execute the nvidia-smi command in the container to compare the difference between vGPU and actual GPU memory size.
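For example, for the gpu-pod above (nvidia-smi is injected into the container by the NVIDIA container runtime):

kubectl exec -it gpu-pod -- nvidia-smi

The reported total memory should match the nvidia.com/gpumem request rather than the physical GPU's capacity.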

Note:

  • If you use the privileged field, the task will not be scheduled, since a privileged container can see all GPUs and would affect other tasks.
  • Do not set the nodeName field; use nodeSelector for similar requirements.

For more examples, click here: Examples

Monitoring

Monitoring is automatically enabled after installation. Obtain cluster information overview by visiting:

http://{scheduler ip}:{monitorPort}/metrics

The default monitorPort is 31993; other values can be set using --set devicePlugin.service.httpPort during installation.
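For example, from a machine that can reach the scheduler node:

curl http://{scheduler ip}:31993/metrics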

[Grafana dashboard example]

Note: The vGPU status of a node is only collected after the node has run a vGPU task.

Notes

  • If you don't request vGPUs when using the device plugin with NVIDIA images, all GPUs on the machine may be exposed inside your container
  • Currently, A100 MIG mode is only supported in "none" and "mixed" modes
  • Tasks with the "nodeName" field cannot be scheduled at the moment; please use "nodeSelector" instead (see the sketch after this list)
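If a task must land on a specific node, a standard Kubernetes nodeSelector on the node's hostname label achieves the effect of nodeName while still going through the scheduler. A minimal sketch (node1 is a placeholder for your node's name):

spec:
  nodeSelector:
    kubernetes.io/hostname: node1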

Contributing

RiseUnion, one of the core developers in the HAMi open-source community, continues to help build and grow the HAMi community.

If you want to become a contributor to HAMi, please refer to: Contributor Guide.

For more details, please refer to: HAMi github.


HAMi Enterprise Edition

Rise VAST is the HAMi Enterprise Edition, launched by RiseUnion in collaboration with 4Paradigm and built on the open-source version. It adds numerous enterprise-level features, including compute and memory oversubscription, compute scaling and preemption, compute specification definitions, NVLink topology awareness, differentiated scheduling strategies, enterprise-grade isolation, resource quota control, multi-cluster management, audit logs, high-availability guarantees, and fine-grained operational analysis. By providing unified management, shared allocation, on-demand distribution, and rapid scheduling of compute clusters, it fully unleashes the potential of heterogeneous computing power and accelerates the modernization and intelligent transformation of AI infrastructure. See the related report on the HAMi Enterprise Edition jointly released by RiseUnion and 4Paradigm.