2026-02-04
HAMi v2.8.0 is here! Since the release of v2.7, the project has made significant strides in architectural integrity, scheduling reliability, and ecosystem alignment. v2.8 delivers systemic enhancements in Kubernetes native standardization, heterogeneous device support, and production-grade observability, making HAMi even more suitable for long-running, stability-critical AI production clusters.
Overview of key features in v2.8:
- HAMi-DRA, driving HAMi's evolution from "custom scheduling logic" to Kubernetes native standard interfaces.
- New and updated components: HAMi-DRA, mock-device-plugin, ascend-device-plugin, HAMi-WebUI, and more.

DRA (Dynamic Resource Allocation) is the next-generation device resource declaration and allocation mechanism being advanced by the Kubernetes community, aiming to provide a more standardized, composable, and scalable resource management model for devices like GPUs and AI accelerators.
Traditional Kubernetes device management has limitations: devices can only be requested as opaque integer counts via limits[nvidia.com/gpu], which cannot express complex needs like separate memory and compute requirements.

DRA introduces new APIs like ResourceClaim and DeviceClass to standardize device declaration, allocation, and management, offering greater flexibility and scalability.
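For illustration, a DeviceClass and a matching ResourceClaim in the upstream DRA API look roughly like this (a sketch against the resource.k8s.io/v1beta1 API; the class name and CEL expression are illustrative, not HAMi-specific):

```yaml
apiVersion: resource.k8s.io/v1beta1
kind: DeviceClass
metadata:
  name: gpu.example.com            # illustrative class name
spec:
  selectors:
  - cel:
      expression: device.driver == "gpu.example.com"
---
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaim
metadata:
  name: gpu-claim
spec:
  devices:
    requests:
    - name: gpu
      deviceClassName: gpu.example.com
```

A DeviceClass describes a category of devices via CEL selectors, while a ResourceClaim requests a concrete allocation from that class, which is how DRA separates "what a device is" from "what a workload needs".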
HAMi-DRA is the HAMi community's standalone DRA implementation. It uses a Mutating Webhook architecture to automatically convert traditional nvidia.com/gpu, nvidia.com/gpumem, and nvidia.com/gpucores requests into DRA ResourceClaims.

DRA Usage Example
When submitting a Pod, the HAMi-DRA Webhook automatically converts it to use DRA ResourceClaim.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  containers:
  - name: gpu-container
    image: nvidia/cuda:11.8.0-base-ubuntu22.04
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 2
        nvidia.com/gpumem: 4096
        nvidia.com/gpucores: 80
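After mutation, the Pod references a generated claim instead of extended resources. The result looks roughly like this (a sketch only; the claim and template names are illustrative, not HAMi's actual generated names):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  resourceClaims:
  - name: gpu-claim                              # illustrative name
    resourceClaimTemplateName: gpu-claim-template
  containers:
  - name: gpu-container
    image: nvidia/cuda:11.8.0-base-ubuntu22.04
    command: ["nvidia-smi"]
    resources:
      claims:
      - name: gpu-claim
```

The container's resources.claims entry points at the Pod-level spec.resourceClaims entry, which in turn references a ResourceClaimTemplate that the DRA driver resolves to an actual device allocation.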
For large-scale clusters or HA deployments, HAMi v2.8.0 introduces Leader Election for multiple Scheduler instances. Using Kubernetes Lease mechanisms, it ensures only one Scheduler instance is active at any given time to make scheduling decisions.
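Under the hood, the competing Scheduler instances coordinate through a standard coordination.k8s.io Lease object, which looks roughly like this (a sketch; the lease name, namespace, and holder identity are illustrative):

```yaml
apiVersion: coordination.k8s.io/v1
kind: Lease
metadata:
  name: hami-scheduler               # illustrative lease name
  namespace: kube-system
spec:
  holderIdentity: hami-scheduler-0   # the currently active instance
  leaseDurationSeconds: 15
```

Only the instance whose identity matches holderIdentity makes scheduling decisions; if it stops renewing the lease within leaseDurationSeconds, a standby instance acquires the lease and takes over.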
HAMi v2.8.0 adds support for NVIDIA CDI mode. CDI is a container device interface standard maintained by CNCF TAG, providing a standardized way to inject devices. Users can enable this via global.deviceListStrategy: cdi-annotations.
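In a Helm-based install, the option named above can be set in values.yaml (a minimal sketch; only the key path stated in this release note is assumed):

```yaml
# values.yaml for the HAMi Helm chart
global:
  deviceListStrategy: cdi-annotations
```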
HAMi v2.8.0 introduces the Mock Device Plugin, lowering the barrier for developers and CI/test environments to simulate devices.
HAMi v2.8.0 systemically enhances observability, adding build info metrics and deprecating obsolete ones.
A new hami_build_info metric exposes the version, build time, and Git commit. The obsolete vGPUMemoryAllocated and vGPUCoreAllocated metrics are deprecated.

HAMi v2.8 enhances Iluvatar GPU support:
Device Info: Added podInfos to DeviceUsage for better scheduling decisions.
[Thanks] @qiangwei1983 @Kyrie336 for contributions to Iluvatar support!
Continued enhancements for MetaX GPUs:
[Thanks] @Kyrie336 for contributions to MetaX support!
The ascend-device-plugin project now supports vNPU (virtual NPU) features, compatible with both HAMi and Volcano schedulers.
[Thanks] @DSFans2014 @archlitchi for contributions to Ascend support!
Kueue is a batch job queue management project by Kubernetes SIG Scheduling. The HAMi community contributed enhancements to Kueue to natively support HAMi's device resource management model. Kueue's ResourceTransformation can now automatically convert HAMi vGPU requests (e.g., converting nvidia.com/gpu + nvidia.com/gpucores to nvidia.com/total-gpucores) for unified management.
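Conceptually, the conversion described above can be expressed in Kueue's Configuration (a sketch against Kueue's config.kueue.x-k8s.io/v1beta1 configuration API; the multiplier value is illustrative):

```yaml
apiVersion: config.kueue.x-k8s.io/v1beta1
kind: Configuration
resources:
  transformations:
  - input: nvidia.com/gpucores
    strategy: Replace                 # replace the input resource in quota accounting
    outputs:
      nvidia.com/total-gpucores: "1"  # illustrative multiplier
```

With a Replace strategy, Kueue accounts for the transformed resource (nvidia.com/total-gpucores) in ClusterQueue quotas instead of the per-device input resource, enabling unified quota management across HAMi vGPU workloads.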
HAMi v2.8 fixes several vLLM compatibility issues, including the handling of CUDA_VISIBLE_DEVICES.

[Thanks] @archlitchi for contributions to vLLM compatibility!
v2.8 addresses issues from real-world production environments:
[Thanks] @litaixun @luohua13 @FouoF @Shouren for contributions to stability fixes!
Once again, we would like to thank everyone who actively contributes to the community. It is because of you that HAMi continues to break through and grow.
Reference: https://mp.weixin.qq.com/s/hvpMl4bRpMENZAbdWR2peg.
To learn more about RiseUnion's vGPU resource pooling, virtualization, and AI compute management solutions, please contact us at contact@riseunion.io.