2025-04-06
HAMi supports virtualization of Huawei Ascend 910A, 910B series (910B, 910B2, 910B3, 910B4), and 310P devices, providing vGPU-like capabilities such as device memory slicing, template-based partitioning, device UUID selection, and health monitoring.
Install the chart using Helm; see the 'enabling vGPU support in kubernetes' section for details.
kubectl label node {ascend-node} accelerator=huawei-Ascend910
Deploy Ascend docker runtime
wget https://raw.githubusercontent.com/Project-HAMi/ascend-device-plugin/master/build/ascendplugin-910-hami.yaml
kubectl apply -f ascendplugin-910-hami.yaml
HAMi includes a built-in virtualization configuration file for NPUs.
HAMi also supports customizing virtualization parameters through the following method:
The directory structure is as follows:
tree -L 1
.
├── Chart.yaml
├── files
├── templates
└── values.yaml
The content is as follows:
vnpus:
  - chipName: 910B
    commonWord: Ascend910A
    resourceName: huawei.com/Ascend910A
    resourceMemoryName: huawei.com/Ascend910A-memory
    memoryAllocatable: 32768
    memoryCapacity: 32768
    aiCore: 30
    templates:
      - name: vir02
        memory: 2184
        aiCore: 2
      - name: vir04
        memory: 4369
        aiCore: 4
      - name: vir08
        memory: 8738
        aiCore: 8
      - name: vir16
        memory: 17476
        aiCore: 16
  - chipName: 910B2
    commonWord: Ascend910B2
    resourceName: huawei.com/Ascend910B2
    resourceMemoryName: huawei.com/Ascend910B2-memory
    memoryAllocatable: 65536
    memoryCapacity: 65536
    aiCore: 24
    aiCPU: 6
    templates:
      - name: vir03_1c_8g
        memory: 8192
        aiCore: 3
        aiCPU: 1
      - name: vir06_1c_16g
        memory: 16384
        aiCore: 6
        aiCPU: 1
      - name: vir12_3c_32g
        memory: 32768
        aiCore: 12
        aiCPU: 3
  - chipName: 910B3
    commonWord: Ascend910B
    resourceName: huawei.com/Ascend910B
    resourceMemoryName: huawei.com/Ascend910B-memory
    memoryAllocatable: 65536
    memoryCapacity: 65536
    aiCore: 20
    aiCPU: 7
    templates:
      - name: vir05_1c_16g
        memory: 16384
        aiCore: 5
        aiCPU: 1
      - name: vir10_3c_32g
        memory: 32768
        aiCore: 10
        aiCPU: 3
  - chipName: 910B4
    commonWord: Ascend910B4
    resourceName: huawei.com/Ascend910B4
    resourceMemoryName: huawei.com/Ascend910B4-memory
    memoryAllocatable: 32768
    memoryCapacity: 32768
    aiCore: 20
    aiCPU: 7
    templates:
      - name: vir05_1c_8g
        memory: 8192
        aiCore: 5
        aiCPU: 1
      - name: vir10_3c_16g
        memory: 16384
        aiCore: 10
        aiCPU: 3
  - chipName: 310P3
    commonWord: Ascend310P
    resourceName: huawei.com/Ascend310P
    resourceMemoryName: huawei.com/Ascend310P-memory
    memoryAllocatable: 21527
    memoryCapacity: 24576
    aiCore: 8
    aiCPU: 7
    templates:
      - name: vir01
        memory: 3072
        aiCore: 1
        aiCPU: 1
      - name: vir02
        memory: 6144
        aiCore: 2
        aiCPU: 2
      - name: vir04
        memory: 12288
        aiCore: 4
        aiCPU: 4
Helm installation and upgrades use the configuration in this file, overriding HAMi's built-in configuration.
HAMi supports configuring NPU resource allocation through predefined device templates. Each template defines a name, a device memory size, and the number of AI cores (and, for some chips, AI CPUs) it allocates.
When a user requests a specific memory size, the system automatically aligns the requested memory to the nearest template size. For example, if a user requests 2000MB of memory, the system will select the smallest template with memory size greater than or equal to 2000MB.
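The alignment rule above can be sketched in a few lines of Python. This is an illustrative sketch, not HAMi's actual code; the template values are taken from the 910B (Ascend910A) section of the configuration file shown earlier.

```python
# Sketch of the template-alignment rule: a memory request is rounded up
# to the smallest template whose memory is >= the requested amount.

TEMPLATES_910B = [
    {"name": "vir02", "memory": 2184, "aiCore": 2},
    {"name": "vir04", "memory": 4369, "aiCore": 4},
    {"name": "vir08", "memory": 8738, "aiCore": 8},
    {"name": "vir16", "memory": 17476, "aiCore": 16},
]

def align_to_template(requested_mb, templates):
    """Return the smallest template with memory >= requested_mb, or None."""
    candidates = [t for t in templates if t["memory"] >= requested_mb]
    return min(candidates, key=lambda t: t["memory"]) if candidates else None

print(align_to_template(2000, TEMPLATES_910B)["name"])  # vir02
```

A request for 2000MB therefore lands on the vir02 template (2184MB), matching the example above.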
For specific configurations, refer to the official Ascend virtualization templates.
The compute share of each template is the ratio of the template's aiCore value to the total aiCore count of its chip type (chipName).
Ascend310P devices (Atlas inference series products) support multiple granularity partitions, including 1/8, 1/4, and 1/2 of a card. Allocated memory automatically aligns to the nearest granularity above the requested amount.
You can request Ascend 910B resources using the huawei.com/Ascend910B and huawei.com/Ascend910B-memory resource types:
apiVersion: v1
kind: Pod
metadata:
  name: ascend910A-pod
spec:
  containers:
    - name: ubuntu-container
      image: ascendhub.huawei.com/public-ascendhub/ascend-mindspore:23.0.RC3-centos7
      command: ["bash", "-c", "sleep 86400"]
      resources:
        limits:
          huawei.com/Ascend910A: 1 # requesting 1 vNPU
          huawei.com/Ascend910A-memory: 2000 # requesting 2000MB of device memory
---
apiVersion: v1
kind: Pod
metadata:
  name: ascend910B2-pod
spec:
  containers:
    - name: ubuntu-container
      image: ascendhub.huawei.com/public-ascendhub/ascend-mindspore:23.0.RC3-centos7
      command: ["bash", "-c", "sleep 86400"]
      resources:
        limits:
          huawei.com/Ascend910B2: 1 # requesting 1 vNPU
          huawei.com/Ascend910B2-memory: 2000 # requesting 2000MB of device memory
---
apiVersion: v1
kind: Pod
metadata:
  name: ascend910B-pod
spec:
  containers:
    - name: ubuntu-container
      image: ascendhub.huawei.com/public-ascendhub/ascend-mindspore:23.0.RC3-centos7
      command: ["bash", "-c", "sleep 86400"]
      resources:
        limits:
          huawei.com/Ascend910B: 1 # requesting 1 vNPU
          huawei.com/Ascend910B-memory: 2000 # requesting 2000MB of device memory
---
apiVersion: v1
kind: Pod
metadata:
  name: ascend910B4-pod
spec:
  containers:
    - name: ubuntu-container
      image: ascendhub.huawei.com/public-ascendhub/ascend-mindspore:23.0.RC3-centos7
      command: ["bash", "-c", "sleep 86400"]
      resources:
        limits:
          huawei.com/Ascend910B4: 1 # requesting 1 vNPU
          huawei.com/Ascend910B4-memory: 2000 # requesting 2000MB of device memory
---
apiVersion: v1
kind: Pod
metadata:
  name: ascend310P-pod
spec:
  containers:
    - name: ubuntu-container
      image: ascendhub.huawei.com/public-ascendhub/ascend-mindspore:23.0.RC3-centos7
      command: ["bash", "-c", "sleep 86400"]
      resources:
        limits:
          huawei.com/Ascend310P: 1 # requesting 1 vNPU
          huawei.com/Ascend310P-memory: 2000 # requesting 2000MB of device memory
HAMi supports health monitoring for Ascend NPU devices, ensuring that only healthy devices are allocated to Pods.
HAMi also collects resource usage statistics for Ascend NPU devices. These statistics can be used for resource scheduling decisions and performance optimization.
HAMi implements a node locking mechanism to prevent resource allocation conflicts. When a Pod requests Ascend NPU resources, the system locks the corresponding node to prevent other Pods from using the same device resources simultaneously.
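The locking mechanism can be illustrated with a minimal sketch. This is not HAMi's actual implementation; the annotation key hami.io/mutex.lock is an assumption used here only to show the pattern of acquiring a lock by writing a node annotation and releasing it when allocation completes.

```python
# Illustrative sketch of an annotation-based node lock: before binding a
# Pod, the scheduler writes a lock annotation on the node; concurrent
# allocations on the same node fail until the lock is released.

import time

LOCK_KEY = "hami.io/mutex.lock"  # assumed key, for illustration only

def try_lock(node_annotations, key=LOCK_KEY):
    """Acquire the lock by writing a timestamp annotation; fail if held."""
    if key in node_annotations:
        return False  # another Pod's allocation is in progress
    node_annotations[key] = str(time.time())
    return True

def unlock(node_annotations, key=LOCK_KEY):
    """Release the lock so other Pods can allocate on this node."""
    node_annotations.pop(key, None)

node = {}
assert try_lock(node)      # first allocation acquires the lock
assert not try_lock(node)  # a concurrent allocation is rejected
unlock(node)
assert try_lock(node)      # the lock can be re-acquired after release
```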
You can specify which Ascend devices to use or exclude using Pod annotations:
apiVersion: v1
kind: Pod
metadata:
  name: ascend-pod
  annotations:
    # Use specific Ascend devices (comma-separated list)
    hami.io/use-Ascend910B-uuid: "device-uuid-1,device-uuid-2"
    # Or exclude specific Ascend devices (comma-separated list)
    hami.io/no-use-Ascend910B-uuid: "device-uuid-3,device-uuid-4"
spec:
  # ... rest of pod spec
NOTE: The device UUID format depends on the device type (e.g. Ascend910B, Ascend910B2, Ascend910B3, Ascend910B4). You can find the available device UUIDs in the node status.
Here is a complete example demonstrating how to use the UUID selection feature:
apiVersion: v1
kind: Pod
metadata:
  name: ascend-pod
  annotations:
    hami.io/use-Ascend910B-uuid: "device-uuid-1,device-uuid-2"
spec:
  containers:
    - name: ubuntu-container
      image: ascendhub.huawei.com/public-ascendhub/ascend-mindspore:23.0.RC3-centos7
      command: ["bash", "-c", "sleep 86400"]
      resources:
        limits:
          huawei.com/Ascend910B: 1
          huawei.com/Ascend910B-memory: 2000
In this example, the Pod will only run on Ascend910B devices with UUID device-uuid-1 or device-uuid-2.
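The filtering behavior of the use/no-use annotations can be sketched as follows. This is an illustrative sketch, assuming only that the annotation values are comma-separated UUID lists as shown in the examples above.

```python
# Sketch of UUID-based device filtering: a device is schedulable only if
# it appears in the "use" list (when given) and not in the "no-use" list.

def filter_devices(device_uuids, use=None, no_use=None):
    """Return the device UUIDs permitted by the use/no-use annotations."""
    allowed = set(use.split(",")) if use else set(device_uuids)
    blocked = set(no_use.split(",")) if no_use else set()
    return [u for u in device_uuids if u in allowed and u not in blocked]

devices = ["device-uuid-1", "device-uuid-2", "device-uuid-3"]
print(filter_devices(devices, use="device-uuid-1,device-uuid-2"))
# ['device-uuid-1', 'device-uuid-2']
```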
You can find the Ascend device UUIDs on a node using the following command:
kubectl describe node <node-name> | grep -A 10 "Allocated resources"
Or by viewing the node annotations:
kubectl get node <node-name> -o yaml | grep -A 10 "annotations:"
In the node annotations, look for hami.io/node-register-Ascend910B or similar annotations, which contain the device UUID information.
Note: huawei.com/Ascend910B-memory is only effective when huawei.com/Ascend910B=1. Requests for multiple devices (huawei.com/Ascend910B > 1) do not support vNPU mode.
To learn more about RiseUnion's GPU pooling, virtualization, and computing power management solutions, please contact us: contact@riseunion.io