Assign Devices to Pods and Containers
1 - Set Up DRA in a Cluster
Kubernetes v1.34 [stable] (enabled by default: true)
This page shows you how to configure dynamic resource allocation (DRA) in a Kubernetes cluster by enabling API groups and configuring classes of devices. These instructions are for cluster administrators.
About DRA
Dynamic resource allocation (DRA) is a Kubernetes feature that lets you request and share resources among Pods. These resources are often attached devices like hardware accelerators.
With DRA, device drivers and cluster admins define device classes that are available to claim in workloads. Kubernetes allocates matching devices to specific claims and places the corresponding Pods on nodes that can access the allocated devices.
Ensure that you're familiar with how DRA works and with DRA terminology like DeviceClasses, ResourceClaims, and ResourceClaimTemplates. For details, see Dynamic Resource Allocation (DRA).
Before you begin
You need to have a Kubernetes cluster, and the kubectl command-line tool must be configured to communicate with your cluster. Run this task on a cluster with at least two nodes that are not acting as control plane hosts. If you do not already have a cluster, you can create one by using minikube or use an online Kubernetes playground.
Your Kubernetes server must be at or later than version v1.34. To check the version, enter kubectl version.
- Directly or indirectly attach devices to your cluster. To avoid potential issues with drivers, wait until you set up the DRA feature for your cluster before you install drivers.
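The version requirement above can also be checked in a script. The following is a minimal sketch, assuming the server version is available as a string of the form vMAJOR.MINOR.PATCH (for example, the serverVersion.gitVersion value printed by kubectl version -o json). The function names are hypothetical helpers, not part of any Kubernetes tooling.

```python
import re

# Minimum server version stated on this page.
MINIMUM = (1, 34)


def parse_major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from a version string such as 'v1.34.1'."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version: str, minimum: tuple[int, int] = MINIMUM) -> bool:
    """Return True if the version is at or later than the minimum."""
    return parse_major_minor(version) >= minimum


if __name__ == "__main__":
    # Hypothetical version strings for illustration.
    for candidate in ("v1.33.4", "v1.34.0", "v1.35.1"):
        print(candidate, meets_minimum(candidate))
```

Comparing (major, minor) tuples avoids the pitfalls of comparing version strings lexically, where "1.9" would sort after "1.34".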
Optional: enable legacy DRA API groups
DRA graduated to stable in Kubernetes 1.34 and is enabled by default. Some older DRA drivers or workloads might still need the v1beta1 API from Kubernetes 1.32 or the v1beta2 API from Kubernetes 1.33. Enable the following API groups only if you need that support:
* `resource.k8s.io/v1beta1`
* `resource.k8s.io/v1beta2`
For more information, see Enabling or disabling API groups.
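API groups are typically enabled through the --runtime-config flag of kube-apiserver, which takes comma-separated group/version=true pairs. The following is a minimal sketch that builds the flag value for the two legacy groups listed above. The helper name is hypothetical, and how you pass the flag to kube-apiserver depends on how your cluster is deployed.

```python
# Legacy DRA API groups listed on this page.
LEGACY_DRA_API_GROUPS = ["resource.k8s.io/v1beta1", "resource.k8s.io/v1beta2"]


def runtime_config_flag(api_groups: list[str]) -> str:
    """Build a kube-apiserver --runtime-config flag that enables each group/version."""
    pairs = [f"{group}=true" for group in api_groups]
    return "--runtime-config=" + ",".join(pairs)


if __name__ == "__main__":
    print(runtime_config_flag(LEGACY_DRA_API_GROUPS))
    # prints: --runtime-config=resource.k8s.io/v1beta1=true,resource.k8s.io/v1beta2=true
```

If you only need one of the legacy versions, pass a single-element list so that the other version stays disabled.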
Verify that DRA is enabled
To verify that the cluster is configured correctly, try to list DeviceClasses:
kubectl get deviceclasses
If the component configuration was correct, the output is similar to the following:
No resources found
If DRA isn't correctly configured, the output of the preceding command is similar to the following:
error: the server doesn't have a resource type "deviceclasses"
Try the following troubleshooting steps:
- Reconfigure and restart the kube-apiserver component.
- If the complete .spec.resourceClaims field gets removed from Pods, or if Pods get scheduled without considering the ResourceClaims, then verify that the DynamicResourceAllocation feature gate is not turned off for kube-apiserver, kube-controller-manager, kube-scheduler, or the kubelet.
Install device drivers
After you enable DRA for your cluster, you can install the drivers for your attached devices. For instructions, check the documentation of your device owner or the project that maintains the device drivers. The drivers that you install must be compatible with DRA.
To verify that your installed drivers are working as expected, list ResourceSlices in your cluster:
kubectl get resourceslices
The output is similar to the following:
NAME                                                  NODE                DRIVER               POOL                             AGE
cluster-1-device-pool-1-driver.example.com-lqx8x      cluster-1-node-1    driver.example.com   cluster-1-device-pool-1-r1gc     7s
cluster-1-device-pool-2-driver.example.com-29t7b      cluster-1-node-2    driver.example.com   cluster-1-device-pool-2-446z     8s
If no ResourceSlices are listed, or the output isn't what you expect, try the following troubleshooting steps:
- Check the health of the DRA driver and look for error messages about publishing ResourceSlices in its log output. The vendor of the driver may have further instructions about installation and troubleshooting.
Create DeviceClasses
You can define categories of devices that your application operators can claim in workloads by creating DeviceClasses. Some device driver providers might also instruct you to create DeviceClasses during driver installation.
The ResourceSlices that your driver publishes contain information about the devices that the driver manages, such as capacity, metadata, and attributes. You can use Common Expression Language to filter for properties in your DeviceClasses, which can make finding devices easier for your workload operators.
- To find the device properties that you can select in DeviceClasses by using CEL expressions, get the specification of a ResourceSlice:

    kubectl get resourceslice <resourceslice-name> -o yaml

  The output is similar to the following:

    apiVersion: resource.k8s.io/v1
    kind: ResourceSlice
    # lines omitted for clarity
    spec:
      devices:
      - attributes:
          type:
            string: gpu
        capacity:
          memory:
            value: 64Gi
        name: gpu-0
      - attributes:
          type:
            string: gpu
        capacity:
          memory:
            value: 64Gi
        name: gpu-1
      driver: driver.example.com
      nodeName: cluster-1-node-1
    # lines omitted for clarity

  You can also check the driver provider's documentation for available properties and values.
- Review the following example DeviceClass manifest, which selects any device that's managed by the driver.example.com device driver:

    apiVersion: resource.k8s.io/v1
    kind: DeviceClass
    metadata:
      name: example-device-class
    spec:
      selectors:
      - cel:
          expression: |-
            device.driver == "driver.example.com"

- Create the DeviceClass in your cluster:

    kubectl apply -f https://k8s.io/examples/dra/deviceclass.yaml
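To illustrate how the properties in a ResourceSlice map to CEL expressions, the following minimal sketch lists the accessors that a selector could reference. It assumes the ResourceSlice specification has already been loaded into a Python dictionary, and that unqualified attribute and capacity names belong to the driver's own domain, which matches the selectors used in the example manifests in this section. The function name is a hypothetical helper.

```python
# ResourceSlice spec from the example output, written as a Python dictionary.
RESOURCE_SLICE_SPEC = {
    "driver": "driver.example.com",
    "nodeName": "cluster-1-node-1",
    "devices": [
        {
            "name": "gpu-0",
            "attributes": {"type": {"string": "gpu"}},
            "capacity": {"memory": {"value": "64Gi"}},
        },
        {
            "name": "gpu-1",
            "attributes": {"type": {"string": "gpu"}},
            "capacity": {"memory": {"value": "64Gi"}},
        },
    ],
}


def cel_accessors(spec: dict) -> list[str]:
    """Return sorted CEL accessor strings for every attribute and capacity in the slice."""
    driver = spec["driver"]
    found = set()
    for device in spec.get("devices", []):
        for name in device.get("attributes", {}):
            found.add(f'device.attributes["{driver}"].{name}')
        for name in device.get("capacity", {}):
            found.add(f'device.capacity["{driver}"].{name}')
    return sorted(found)


if __name__ == "__main__":
    for accessor in cel_accessors(RESOURCE_SLICE_SPEC):
        print(accessor)
```

For the example slice, this prints one accessor for the type attribute and one for the memory capacity, which are the two properties that a DeviceClass or claim selector could filter on.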
Clean up
To delete the DeviceClass that you created in this task, run the following command:
kubectl delete -f https://k8s.io/examples/dra/deviceclass.yaml
2 - Allocate Devices to Workloads with DRA
Kubernetes v1.34 [stable] (enabled by default: true)
This page shows you how to allocate devices to your Pods by using dynamic resource allocation (DRA). These instructions are for workload operators. Before reading this page, familiarize yourself with how DRA works and with DRA terminology like ResourceClaims and ResourceClaimTemplates. For more information, see Dynamic Resource Allocation (DRA).
About device allocation with DRA
As a workload operator, you can claim devices for your workloads by creating ResourceClaims or ResourceClaimTemplates. When you deploy your workload, Kubernetes and the device drivers find available devices, allocate them to your Pods, and place the Pods on nodes that can access those devices.
Before you begin
You need to have a Kubernetes cluster, and the kubectl command-line tool must be configured to communicate with your cluster. Run this task on a cluster with at least two nodes that are not acting as control plane hosts. If you do not already have a cluster, you can create one by using minikube or use an online Kubernetes playground.
Your Kubernetes server must be at or later than version v1.34. To check the version, enter kubectl version.
- Ensure that your cluster admin has set up DRA, attached devices, and installed drivers. For more information, see Set Up DRA in a Cluster.
Identify devices to claim
Your cluster administrator or the device drivers create DeviceClasses that define categories of devices. You can claim devices by using Common Expression Language to filter for specific device properties.
Get a list of DeviceClasses in the cluster:
kubectl get deviceclasses
The output is similar to the following:
NAME                 AGE
driver.example.com   16m
If you get a permission error, you might not have access to get DeviceClasses. Check with your cluster administrator or with the driver provider for available device properties.
Claim resources
You can request resources from a DeviceClass by using ResourceClaims. To create a ResourceClaim, do one of the following:
- Manually create a ResourceClaim if you want multiple Pods to share access to the same devices, or if you want a claim to exist beyond the lifetime of a Pod.
- Use a ResourceClaimTemplate to let Kubernetes generate and manage per-Pod ResourceClaims. Create a ResourceClaimTemplate if you want every Pod to have access to separate devices that have similar configurations. For example, you might want simultaneous access to devices for Pods in a Job that uses parallel execution.
If you directly reference a specific ResourceClaim in a Pod, that ResourceClaim must already exist in the cluster. If a referenced ResourceClaim doesn't exist, the Pod remains in a pending state until the ResourceClaim is created. You can reference an auto-generated ResourceClaim in a Pod, but this isn't recommended because auto-generated ResourceClaims are bound to the lifetime of the Pod that triggered the generation.
To create a workload that claims resources, select one of the following options:
To use a ResourceClaimTemplate, review the following example manifest:
apiVersion: resource.k8s.io/v1
kind: ResourceClaimTemplate
metadata:
  name: example-resource-claim-template
spec:
  spec:
    devices:
      requests:
      - name: gpu-claim
        exactly:
          deviceClassName: example-device-class
          selectors:
            - cel:
                expression: |-
                  device.attributes["driver.example.com"].type == "gpu" &&
                  device.capacity["driver.example.com"].memory == quantity("64Gi")                  
This manifest creates a ResourceClaimTemplate that requests devices in the
example-device-class DeviceClass that match both of the following parameters:
- Devices that have a driver.example.com/type attribute with a value of gpu.
- Devices that have 64Gi of memory capacity.
To create the ResourceClaimTemplate, run the following command:
kubectl apply -f https://k8s.io/examples/dra/resourceclaimtemplate.yaml
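The two-part selector in the manifest can be illustrated outside the cluster. The following minimal sketch mirrors the selector logic in plain Python. It is not how Kubernetes evaluates CEL, the quantity parser handles only plain integers and binary suffixes, and the device list (including gpu-small and nic-0) is hypothetical data for illustration.

```python
# Binary suffixes only; real Kubernetes quantities support more formats.
BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_quantity(text: str) -> int:
    """Convert a quantity such as '64Gi' to a number of bytes."""
    for suffix, factor in BINARY_SUFFIXES.items():
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * factor
    return int(text)


def matches_selector(device: dict) -> bool:
    """Mirror the example selector: type == 'gpu' and memory == 64Gi."""
    device_type = device.get("attributes", {}).get("type", {}).get("string")
    memory = device.get("capacity", {}).get("memory", {}).get("value")
    if device_type != "gpu" or memory is None:
        return False
    return parse_quantity(memory) == parse_quantity("64Gi")


# Hypothetical devices; only gpu-0 satisfies both conditions.
DEVICES = [
    {"name": "gpu-0", "attributes": {"type": {"string": "gpu"}},
     "capacity": {"memory": {"value": "64Gi"}}},
    {"name": "gpu-small", "attributes": {"type": {"string": "gpu"}},
     "capacity": {"memory": {"value": "32Gi"}}},
    {"name": "nic-0", "attributes": {"type": {"string": "nic"}},
     "capacity": {"memory": {"value": "64Gi"}}},
]

if __name__ == "__main__":
    print([d["name"] for d in DEVICES if matches_selector(d)])
    # prints: ['gpu-0']
```

Because the two conditions are joined with a logical AND, a device that satisfies only one of them, such as a GPU with 32Gi of memory, is not selected.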
To use a manually created ResourceClaim, review the following example manifest:
apiVersion: resource.k8s.io/v1
kind: ResourceClaim
metadata:
  name: example-resource-claim
spec:
  devices:
    requests:
    - name: single-gpu-claim
      exactly:
        deviceClassName: example-device-class
        allocationMode: All
        selectors:
        - cel:
            expression: |-
              device.attributes["driver.example.com"].type == "gpu" &&
              device.capacity["driver.example.com"].memory == quantity("64Gi")              
This manifest creates a ResourceClaim that requests devices in the
example-device-class DeviceClass that match both of the following parameters:
- Devices that have a driver.example.com/type attribute with a value of gpu.
- Devices that have 64Gi of memory capacity.
To create the ResourceClaim, run the following command:
kubectl apply -f https://k8s.io/examples/dra/resourceclaim.yaml
Request devices in workloads using DRA
To request device allocation, specify a ResourceClaim or a ResourceClaimTemplate
in the resourceClaims field of the Pod specification. Then, request a specific
claim by name in the resources.claims field of a container in that Pod.
You can specify multiple entries in the resourceClaims field and use specific
claims in different containers.
- Review the following example Job:

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: example-dra-job
    spec:
      completions: 10
      parallelism: 2
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: container0
            image: ubuntu:24.04
            command: ["sleep", "9999"]
            resources:
              claims:
              - name: separate-gpu-claim
          - name: container1
            image: ubuntu:24.04
            command: ["sleep", "9999"]
            resources:
              claims:
              - name: shared-gpu-claim
          - name: container2
            image: ubuntu:24.04
            command: ["sleep", "9999"]
            resources:
              claims:
              - name: shared-gpu-claim
          resourceClaims:
          - name: separate-gpu-claim
            resourceClaimTemplateName: example-resource-claim-template
          - name: shared-gpu-claim
            resourceClaimName: example-resource-claim

  Each Pod in this Job has the following properties:
  - Makes a ResourceClaimTemplate named separate-gpu-claim and a ResourceClaim named shared-gpu-claim available to containers.
  - Runs the following containers:
    - container0 requests the devices from the separate-gpu-claim ResourceClaimTemplate.
    - container1 and container2 share access to the devices from the shared-gpu-claim ResourceClaim.
- Create the Job:

    kubectl apply -f https://k8s.io/examples/dra/dra-example-job.yaml
Try the following troubleshooting steps:
- When the workload does not start as expected, drill down from Job to Pods to ResourceClaims and check the objects at each level with kubectl describe to see whether there are any status fields or events which might explain why the workload is not starting.
- When creating a Pod fails with must specify one of: resourceClaimName, resourceClaimTemplateName, check that all entries in pod.spec.resourceClaims have exactly one of those fields set. If they do, then it is possible that the cluster has a mutating Pod webhook installed which was built against APIs from Kubernetes < 1.32. Work with your cluster administrator to check this.
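The check on pod.spec.resourceClaims described in the troubleshooting steps can be run locally before you submit a manifest. The following is a minimal sketch, assuming the Pod specification has been loaded into a Python dictionary. The function name and the wording of the problem messages are hypothetical and are not the messages that the API server returns.

```python
CLAIM_SOURCES = ("resourceClaimName", "resourceClaimTemplateName")


def check_pod_claims(pod_spec: dict) -> list[str]:
    """Return a list of problems found in the Pod's resource claim wiring."""
    problems = []
    declared = set()
    for entry in pod_spec.get("resourceClaims", []):
        name = entry.get("name", "<unnamed>")
        declared.add(name)
        # Each entry must set exactly one of the two source fields.
        sources = [key for key in CLAIM_SOURCES if entry.get(key)]
        if len(sources) != 1:
            problems.append(f"resourceClaims entry {name}: set exactly one of "
                            + ", ".join(CLAIM_SOURCES))
    for container in pod_spec.get("containers", []):
        for claim in container.get("resources", {}).get("claims", []):
            # Containers may only use claims declared at the Pod level.
            if claim["name"] not in declared:
                problems.append(f"container {container['name']}: claim "
                                f"{claim['name']} is not declared in resourceClaims")
    return problems


# Pod template spec from the example Job, reduced to the fields being checked.
JOB_POD_SPEC = {
    "containers": [
        {"name": "container0", "resources": {"claims": [{"name": "separate-gpu-claim"}]}},
        {"name": "container1", "resources": {"claims": [{"name": "shared-gpu-claim"}]}},
        {"name": "container2", "resources": {"claims": [{"name": "shared-gpu-claim"}]}},
    ],
    "resourceClaims": [
        {"name": "separate-gpu-claim",
         "resourceClaimTemplateName": "example-resource-claim-template"},
        {"name": "shared-gpu-claim", "resourceClaimName": "example-resource-claim"},
    ],
}

if __name__ == "__main__":
    print(check_pod_claims(JOB_POD_SPEC))
    # prints: []
```

An empty list means every entry names exactly one source and every container claim refers to a declared entry. If the check passes locally but Pod creation still fails with the error above, the mutating webhook explanation becomes more likely.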
Clean up
To delete the Kubernetes objects that you created in this task, follow these steps:
- Delete the example Job:

    kubectl delete -f https://k8s.io/examples/dra/dra-example-job.yaml

- To delete your resource claims, run one of the following commands:
  - Delete the ResourceClaimTemplate:

      kubectl delete -f https://k8s.io/examples/dra/resourceclaimtemplate.yaml

  - Delete the ResourceClaim:

      kubectl delete -f https://k8s.io/examples/dra/resourceclaim.yaml