package kubeletplugin
import "k8s.io/dynamic-resource-allocation/kubeletplugin"
Package kubeletplugin provides helper functions for running a dynamic resource allocation kubelet plugin.
A DRA driver using this package can be deployed as a DaemonSet on suitable nodes. Node labeling, for example through NFD (https://github.com/kubernetes-sigs/node-feature-discovery), can be used to run the driver only on nodes which have the necessary hardware.
The service account of the DaemonSet must have sufficient RBAC permissions to read ResourceClaims and to create and update ResourceSlices, if the driver intends to publish per-node ResourceSlices. It is good security practice (but not required) to limit access to ResourceSlices associated with the node a specific Pod is running on. This can be done with a Validating Admission Policy (VAP). For more information, see the deployment of the DRA example driver (https://github.com/kubernetes-sigs/dra-example-driver/tree/main/deployments/helm/dra-example-driver/templates).
Traditionally, the kubelet has not supported rolling updates of plugins. Therefore the DaemonSet must not set `maxSurge` to a value larger than zero. With the default `maxSurge: 0`, updating the DaemonSet of the driver will first shut down the old driver Pod, then start the replacement.
This leads to a short downtime for operations that need the driver:
- Pods cannot start unless the claims they depend on were already prepared for use.
- Cleanup after the last pod which used a claim gets delayed until the driver is available again. The pod is not marked as terminated. This prevents reusing the resources used by the pod for other pods.
- Running pods are *not* affected as far as Kubernetes is concerned. However, a DRA driver might provide required runtime services. Vendors need to document this.
Note that the second point also means that draining a node should first evict normal pods, then the driver DaemonSet Pod.
Starting with Kubernetes 1.33, the kubelet supports rolling updates such that the old and new Pods run at the same time for a short while and hand over work gracefully, with no downtime. However, there is no mechanism for determining in advance whether the node the DaemonSet runs on supports that. Trying to do a rolling update with a kubelet which does not support it yet will fail because shutting down the old Pod unregisters the driver even though the new Pod is running. See https://github.com/kubernetes/kubernetes/pull/129832 for details (TODO: link to doc after merging instead).
A DRA driver can either require 1.33 as minimal Kubernetes version or provide two variants of its DaemonSet deployment. In the variant with support for rolling updates, `maxSurge` can be set to a non-zero value. Administrators have to be careful about running the right variant.
Index ¶
- Constants
- type DRAPlugin
- type Device
- type Helper
- func Start(ctx context.Context, plugin DRAPlugin, opts ...Option) (result *Helper, finalErr error)
- func (d *Helper) PublishResources(_ context.Context, resources resourceslice.DriverResources) error
- func (d *Helper) RegistrationStatus() *registerapi.RegistrationStatus
- func (d *Helper) Stop()
- type NamespacedObject
- type Option
- func DriverName(driverName string) Option
- func FlockDirectoryPath(path string) Option
- func GRPCInterceptor(interceptor grpc.UnaryServerInterceptor) Option
- func GRPCStreamInterceptor(interceptor grpc.StreamServerInterceptor) Option
- func GRPCVerbosity(level int) Option
- func KubeClient(kubeClient kubernetes.Interface) Option
- func NodeName(nodeName string) Option
- func NodeUID(nodeUID types.UID) Option
- func NodeV1beta1(enabled bool) Option
- func PluginDataDirectoryPath(path string) Option
- func PluginListener(listen func(ctx context.Context, path string) (net.Listener, error)) Option
- func RegistrarDirectoryPath(path string) Option
- func RegistrarListener(listen func(ctx context.Context, path string) (net.Listener, error)) Option
- func RegistrarSocketFilename(name string) Option
- func RollingUpdate(uid types.UID) Option
- func Serialize(enabled bool) Option
- type PrepareResult
Constants ¶
const (
	// KubeletPluginsDir is the default directory for [PluginDataDirectoryPath].
	KubeletPluginsDir = "/var/lib/kubelet/plugins"
	// KubeletRegistryDir is the default for [RegistrarDirectoryPath]
	KubeletRegistryDir = "/var/lib/kubelet/plugins_registry"
)
Types ¶
type DRAPlugin ¶
type DRAPlugin interface {
	// PrepareResourceClaims is called to prepare all resources allocated
	// for the given ResourceClaims. This is used to implement
	// the gRPC NodePrepareResources call.
	//
	// It gets called with the complete list of claims that are needed
	// by some pod. In contrast to the gRPC call, the helper has
	// already retrieved the actual ResourceClaim objects.
	//
	// In addition to that, the helper also:
	// - verifies that all claims are really allocated
	// - increments a numeric counter for each call and
	//   adds its value to a per-context logger with "requestID" as key
	// - adds the method name with "method" as key to that logger
	// - logs the gRPC call and response (configurable with GRPCVerbosity)
	// - serializes all gRPC calls unless the driver explicitly opted out of that
	//
	// This call must be idempotent because the kubelet might have to ask
	// for preparation multiple times, for example if it gets restarted.
	//
	// A DRA driver should verify that all devices listed in a
	// [resourceapi.DeviceRequestAllocationResult] are not already in use
	// for some other ResourceClaim. Kubernetes tries very hard to ensure
	// that, but if something went wrong, then the DRA driver is the last
	// line of defense against using the same device for two different
	// unrelated workloads.
	//
	// If an error is returned, the result is ignored. Otherwise the result
	// must have exactly one entry for each claim, identified by the UID of
	// the corresponding ResourceClaim. For each claim, preparation
	// can be either successful (no error set in the per-ResourceClaim PrepareResult)
	// or can be reported as failed.
	//
	// It is possible to create the CDI spec files which define the CDI devices
	// on-the-fly in PrepareResourceClaims. UnprepareResourceClaims then can
	// remove them. Container runtimes may cache CDI specs but must reload
	// files in case of a cache miss. To avoid false cache hits, the unique
	// name in the CDI device ID should not be reused. A DRA driver can use
	// the claim UID for it.
	PrepareResourceClaims(ctx context.Context, claims []*resourceapi.ResourceClaim) (result map[types.UID]PrepareResult, err error)

	// UnprepareResourceClaims must undo whatever work PrepareResourceClaims did.
	//
	// At the time when this gets called, the original ResourceClaims may have
	// been deleted already. They also don't get cached by the kubelet. Therefore
	// parameters for each ResourceClaim are only the UID, namespace and name.
	// It is the responsibility of the DRA driver to cache whatever additional
	// information it might need about prepared resources.
	//
	// This call must be idempotent because the kubelet might have to ask
	// for un-preparation multiple times, for example if it gets restarted.
	// Therefore it is not an error if this gets called for a ResourceClaim
	// which is not currently prepared.
	//
	// As with PrepareResourceClaims, the helper takes care of logging
	// and serialization.
	//
	// The conventions for returning one overall error and several per-ResourceClaim
	// errors are the same as in PrepareResourceClaims.
	UnprepareResourceClaims(ctx context.Context, claims []NamespacedObject) (result map[types.UID]error, err error)
}
DRAPlugin is the interface that needs to be implemented by a DRA driver to use this helper package. The helper package then implements the gRPC interface expected by the kubelet by wrapping the DRAPlugin implementation.
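The idempotency and "last line of defense" requirements described in the interface comments can be sketched with a small in-memory cache. This is a minimal illustration using only the standard library: the types `driverState` and `preparedClaim` and the methods `prepare` and `unprepare` are hypothetical stand-ins, not part of this package, and a real driver would key its cache by `types.UID` and return `PrepareResult` values.

```go
package main

import (
	"errors"
	"fmt"
)

// preparedClaim is a hypothetical record of what was prepared for one claim.
type preparedClaim struct {
	devices []string
}

// driverState is a hypothetical cache of prepared claims, keyed by claim UID.
type driverState struct {
	prepared map[string]preparedClaim
	inUse    map[string]string // device name -> UID of the claim using it
}

func newDriverState() *driverState {
	return &driverState{
		prepared: map[string]preparedClaim{},
		inUse:    map[string]string{},
	}
}

var errDeviceInUse = errors.New("device already in use by another claim")

// prepare is idempotent: repeating it for an already prepared claim
// returns the cached devices. It refuses devices owned by a different claim.
func (s *driverState) prepare(claimUID string, devices []string) ([]string, error) {
	if p, ok := s.prepared[claimUID]; ok {
		return p.devices, nil
	}
	for _, d := range devices {
		if owner, ok := s.inUse[d]; ok && owner != claimUID {
			return nil, fmt.Errorf("%w: device %q is held by claim %q", errDeviceInUse, d, owner)
		}
	}
	for _, d := range devices {
		s.inUse[d] = claimUID
	}
	s.prepared[claimUID] = preparedClaim{devices: append([]string(nil), devices...)}
	return s.prepared[claimUID].devices, nil
}

// unprepare is idempotent: calling it for an unknown claim is not an error.
func (s *driverState) unprepare(claimUID string) {
	for _, d := range s.prepared[claimUID].devices {
		delete(s.inUse, d)
	}
	delete(s.prepared, claimUID)
}

func main() {
	s := newDriverState()
	_, err := s.prepare("uid-1", []string{"gpu-0"})
	fmt.Println("first prepare:", err)
	_, err = s.prepare("uid-2", []string{"gpu-0"})
	fmt.Println("conflicting prepare:", err)
	s.unprepare("uid-1")
	s.unprepare("uid-1") // repeated call is harmless
}
```

The cache lives in memory here only to keep the sketch short. A driver that must survive restarts or rolling updates would persist the same information.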
type Device ¶
type Device struct {
	// Requests lists the names of requests or subrequests in the
	// ResourceClaim that this device is associated with. The subrequest
	// name may be included here, but it is also okay to just return
	// the request name.
	//
	// A DRA driver can get this string from the Request field in
	// [resourceapi.DeviceRequestAllocationResult], which includes the
	// subrequest name if there is one.
	//
	// If empty, the device is associated with all requests.
	Requests []string

	// PoolName identifies the DRA driver's pool which contains the device.
	// Must not be empty.
	PoolName string

	// DeviceName identifies the device inside that pool.
	// Must not be empty.
	DeviceName string

	// CDIDeviceIDs lists all CDI devices associated with this DRA device.
	// Each ID must be of the form "<vendor ID>/<class>=<unique name>".
	// May be empty.
	CDIDeviceIDs []string
}
Device provides the CDI device IDs for one request in a ResourceClaim.
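As a rough illustration of the "<vendor ID>/<class>=<unique name>" form, the following sketch builds a CDI device ID whose unique name embeds the claim UID, as the DRAPlugin comments suggest. The helper `cdiDeviceID` is hypothetical, and its checks are deliberately minimal; they are an assumption of this sketch and do not implement the full CDI naming rules.

```go
package main

import (
	"fmt"
	"strings"
)

// cdiDeviceID is a hypothetical helper which returns
// "<vendor>/<class>=<claimUID>-<deviceName>". Embedding the claim UID
// keeps the unique name from being reused across claims.
func cdiDeviceID(vendor, class, claimUID, deviceName string) (string, error) {
	for _, part := range []string{vendor, class, claimUID, deviceName} {
		if part == "" {
			return "", fmt.Errorf("all parts of a CDI device ID must be non-empty")
		}
	}
	// Minimal sanity check only: the separators must not occur inside parts.
	if strings.ContainsAny(vendor+class, "/=") || strings.ContainsAny(claimUID+deviceName, "/=") {
		return "", fmt.Errorf("parts must not contain '/' or '='")
	}
	return fmt.Sprintf("%s/%s=%s-%s", vendor, class, claimUID, deviceName), nil
}

func main() {
	id, err := cdiDeviceID("example.com", "gpu", "uid-1", "gpu-0")
	fmt.Println(id, err)
}
```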
type Helper ¶
type Helper struct {
// contains filtered or unexported fields
}
Helper combines the kubelet registration service and the DRA node plugin service and implements them by calling a DRAPlugin implementation.
func Start ¶
Start sets up two gRPC servers (one for registration, one for the DRA node client) and implements them by calling a DRAPlugin implementation.
The context and/or Helper.Stop can be used to stop all background activity. Stop also blocks until that activity has stopped. A logger can be stored in the context to add values or a name to all log entries.
If the plugin will be used to publish resources, KubeClient and NodeName options are mandatory. Otherwise only DriverName is mandatory.
func (*Helper) PublishResources ¶
func (d *Helper) PublishResources(_ context.Context, resources resourceslice.DriverResources) error
PublishResources may be called one or more times to publish resource information in ResourceSlice objects. If it never gets called, then the kubelet plugin does not manage any ResourceSlice objects.
PublishResources does not block, so it might still take a while after it returns before all information is actually written to the API server.
It is the responsibility of the caller to ensure that the pools and slices described in the driver resources parameters are valid according to the restrictions defined in the resource.k8s.io API.
Invalid ResourceSlices will be rejected by the apiserver during publishing, which happens asynchronously and thus does not get returned as error here. The only error returned here is when publishing was not set up properly, for example missing KubeClient or NodeName options.
The caller may modify the resources after this call returns.
func (*Helper) RegistrationStatus ¶
func (d *Helper) RegistrationStatus() *registerapi.RegistrationStatus
RegistrationStatus returns the result of registration, nil if none received yet.
func (*Helper) Stop ¶
func (d *Helper) Stop()
Stop ensures that all spawned goroutines are stopped and frees resources.
type NamespacedObject ¶
type NamespacedObject struct {
	types.NamespacedName
	UID types.UID
}
NamespacedObject comprises a resource name with a mandatory namespace and optional UID. It gets rendered as "<namespace>/<name>:[<uid>]" (text output) or as an object (JSON output).
func (NamespacedObject) MarshalLog ¶
func (n NamespacedObject) MarshalLog() interface{}
MarshalLog emits a struct containing the required key/value pairs.
func (NamespacedObject) String ¶
func (n NamespacedObject) String() string
String returns the general purpose string representation.
type Option ¶
type Option func(o *options) error
Option implements the functional options pattern for Start.
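For readers unfamiliar with the functional options pattern, here is a minimal, self-contained sketch of how such options are typically collected and validated. Everything in it (`options`, `option`, `withDriverName`, `withSerialize`, `start`) is a simplified, hypothetical stand-in and not this package's implementation. It only mirrors two documented facts: a driver name must be set, and serialization is on by default.

```go
package main

import (
	"errors"
	"fmt"
)

// options is a hypothetical, reduced set of settings.
type options struct {
	driverName string
	serialize  bool
}

// option mirrors the shape of the Option type: a function which
// modifies the settings and may reject invalid input.
type option func(o *options) error

func withDriverName(name string) option {
	return func(o *options) error {
		o.driverName = name
		return nil
	}
}

func withSerialize(enabled bool) option {
	return func(o *options) error {
		o.serialize = enabled
		return nil
	}
}

// start applies defaults, then each option in order, then validates.
func start(opts ...option) (*options, error) {
	o := options{serialize: true} // default: serialize calls
	for _, opt := range opts {
		if err := opt(&o); err != nil {
			return nil, err
		}
	}
	if o.driverName == "" {
		return nil, errors.New("driver name must be set")
	}
	return &o, nil
}

func main() {
	_, err := start()
	fmt.Println("without options:", err)

	o, err := start(withDriverName("gpu.example.com"), withSerialize(false))
	fmt.Println(o.driverName, o.serialize, err)
}
```

The benefit of the pattern is that new settings can be added later without changing the signature of the function which accepts them.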
func DriverName ¶
DriverName defines the driver name for the dynamic resource allocation driver. Must be set.
func FlockDirectoryPath ¶
FlockDirectoryPath changes where lock files are created and locked. A lock file is needed when serializing gRPC calls and rolling updates are enabled. The directory must exist and be reserved for exclusive use by the driver. The default is the plugin data directory.
func GRPCInterceptor ¶
func GRPCInterceptor(interceptor grpc.UnaryServerInterceptor) Option
GRPCInterceptor is called for each incoming gRPC method call. This option may be used more than once and each interceptor will get called.
func GRPCStreamInterceptor ¶
func GRPCStreamInterceptor(interceptor grpc.StreamServerInterceptor) Option
GRPCStreamInterceptor is called for each gRPC streaming method call. This option may be used more than once and each interceptor will get called.
func GRPCVerbosity ¶
GRPCVerbosity sets the verbosity for logging gRPC calls. Default is 4. A negative value disables logging.
func KubeClient ¶
func KubeClient(kubeClient kubernetes.Interface) Option
KubeClient grants the plugin access to the API server. This is needed for syncing ResourceSlice objects. It's the responsibility of the DRA driver developer to ensure that this client has permission to read, write, patch and list such objects. It also needs permission to read node objects. Ideally, a validating admission policy should be used to limit write access to ResourceSlices which belong to the node.
func NodeName ¶
NodeName tells the plugin on which node it is running. This is needed for syncing ResourceSlice objects.
func NodeUID ¶
NodeUID tells the plugin the UID of the v1.Node object. This is used when syncing ResourceSlice objects, but doesn't have to be used. If not supplied, the controller will look up the object once.
func NodeV1beta1 ¶
NodeV1beta1 explicitly chooses whether the DRA gRPC API v1beta1 gets enabled. True by default.
This is used in Kubernetes for end-to-end testing. The default should be fine for DRA drivers.
func PluginDataDirectoryPath ¶
PluginDataDirectoryPath sets the path where the DRA driver creates the "dra.sock" socket that the kubelet connects to for the DRA-specific gRPC calls. It is also used to coordinate between different Pods when using rolling updates. It must not be shared with other kubelet plugins.
The default is /var/lib/kubelet/plugins/<driver name>. This directory does not need to be inside the kubelet data directory, as long as the kubelet can access it.
This path must be the same inside and outside of the driver's container. The directory must exist.
func PluginListener ¶
PluginListener configures how to create the plugin socket. The default is to remove the file if it exists and to then create a socket.
This is used in Kubernetes for end-to-end testing. The default should be fine for DRA drivers.
func RegistrarDirectoryPath ¶
RegistrarDirectoryPath sets the path to the directory where the kubelet expects to find registration sockets of plugins. Typically this is /var/lib/kubelet/plugins_registry with /var/lib/kubelet being the kubelet's data directory.
This is also the default. Some Kubernetes clusters may use a different data directory. This path must be the same inside and outside of the driver's container. The directory must exist.
func RegistrarListener ¶
RegistrarListener configures how to create the registrar socket. The default is to remove the file if it exists and to then create a socket.
This is used in Kubernetes for end-to-end testing. The default should be fine for DRA drivers.
func RegistrarSocketFilename ¶
RegistrarSocketFilename sets the name of the socket inside the directory where the kubelet watches for registration sockets (see RegistrarDirectoryPath).
Usually DRA drivers should not need this option. It is provided to support updates from an installation which used an older release of the helper code.
The default is <driver name>-reg.sock. When rolling updates are enabled, it is <driver name>-<uid>-reg.sock.
This option and RollingUpdate are mutually exclusive.
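The default naming rule above can be written down as a small function. This is only a sketch of the documented naming scheme; `registrarSocketFilename` here is a hypothetical helper, not the package's internal code, and the directory is the documented default for RegistrarDirectoryPath.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// registrarSocketFilename is a hypothetical helper which follows the
// documented defaults: "<driver name>-reg.sock" without rolling updates,
// "<driver name>-<uid>-reg.sock" when a pod UID enables rolling updates.
func registrarSocketFilename(driverName, podUID string) string {
	if podUID == "" {
		return driverName + "-reg.sock"
	}
	return driverName + "-" + podUID + "-reg.sock"
}

func main() {
	registryDir := "/var/lib/kubelet/plugins_registry"
	fmt.Println(filepath.Join(registryDir, registrarSocketFilename("gpu.example.com", "")))
	fmt.Println(filepath.Join(registryDir, registrarSocketFilename("gpu.example.com", "1234-abcd")))
}
```

Including the pod UID is what lets two instances register side by side during a rolling update.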
func RollingUpdate ¶
RollingUpdate can be used to enable support for running two plugin instances in parallel while a newer instance replaces the older. When enabled, both instances must share the same plugin data directory and driver name. They create different sockets to allow the kubelet to connect to both at the same time.
There is no guarantee which of the two instances is used by the kubelet. For example, it can happen that a claim gets prepared by one instance and then needs to be unprepared by the other. The kubelet then may fall back to the first one again for some other operation. In practice this means that each instance must be entirely stateless across method calls. Serialization (on by default, see Serialize) ensures that methods are serialized across all instances through file locking. The plugin implementation can load shared state from a file at the start of a call, execute and then store the updated shared state again.
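The load, execute, store cycle could look roughly like the sketch below. It assumes a JSON file inside a directory shared by both instances; the type `sharedState`, the file layout and all function names are hypothetical. The sketch does no locking of its own and relies on calls being serialized across instances, as described above.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// sharedState is a hypothetical on-disk record: claim UID -> prepared devices.
type sharedState struct {
	Prepared map[string][]string `json:"prepared"`
}

// loadState reads the state file. A missing file means empty state.
func loadState(path string) (sharedState, error) {
	st := sharedState{Prepared: map[string][]string{}}
	data, err := os.ReadFile(path)
	if errors.Is(err, fs.ErrNotExist) {
		return st, nil
	}
	if err != nil {
		return st, err
	}
	if err := json.Unmarshal(data, &st); err != nil {
		return st, err
	}
	if st.Prepared == nil {
		st.Prepared = map[string][]string{}
	}
	return st, nil
}

// storeState writes to a temporary file and renames it into place so
// that readers never observe a partially written file.
func storeState(path string, st sharedState) error {
	data, err := json.Marshal(st)
	if err != nil {
		return err
	}
	tmp, err := os.CreateTemp(filepath.Dir(path), ".state-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // fails harmlessly after a successful rename
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// updateState is one complete load, execute, store cycle.
func updateState(path string, fn func(*sharedState)) error {
	st, err := loadState(path)
	if err != nil {
		return err
	}
	fn(&st)
	return storeState(path, st)
}

// roundTrip runs two cycles in a temporary directory and returns
// how many claims are recorded afterwards.
func roundTrip() (int, error) {
	dir, err := os.MkdirTemp("", "dra-state-")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "state.json")
	for _, uid := range []string{"uid-1", "uid-2"} {
		uid := uid
		if err := updateState(path, func(st *sharedState) { st.Prepared[uid] = []string{"gpu-0"} }); err != nil {
			return 0, err
		}
	}
	st, err := loadState(path)
	return len(st.Prepared), err
}

func main() {
	n, err := roundTrip()
	fmt.Println("claims recorded:", n, "error:", err)
}
```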
Passing a non-empty uid enables rolling updates, an empty uid disables it. The uid must be the pod UID. A DaemonSet can pass that into the driver container via the downward API (https://kubernetes.io/docs/concepts/workloads/pods/downward-api/#downwardapi-fieldRef).
Because new instances cannot remove stale sockets of older instances, it is important that each pod shuts down cleanly: it must catch SIGINT/TERM and stop the helper instead of quitting immediately.
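A clean shutdown can be sketched with `signal.NotifyContext` from the standard library. The function `waitThenStop` is a hypothetical helper; in a real driver the `stop` callback would call the helper's Stop method. So that the example terminates on its own, the process sends SIGTERM to itself, which assumes a Unix-like system.

```go
package main

import (
	"context"
	"fmt"
	"os/signal"
	"syscall"
)

// waitThenStop blocks until done is closed and then runs stop.
// In a real driver, stop would call Stop on the *Helper returned by Start.
func waitThenStop(done <-chan struct{}, stop func()) {
	<-done
	stop()
}

func main() {
	// The context gets cancelled when SIGINT or SIGTERM arrives.
	ctx, cancel := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
	defer cancel()

	// Demonstration only: deliver SIGTERM to this process.
	if err := syscall.Kill(syscall.Getpid(), syscall.SIGTERM); err != nil {
		fmt.Println("sending signal failed:", err)
		return
	}

	waitThenStop(ctx.Done(), func() {
		fmt.Println("signal received, stopping the helper")
	})
}
```

Because the handler is installed before the signal is sent, the process does not quit immediately and the stop callback gets a chance to remove the sockets.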
This depends on support in the kubelet which was added in Kubernetes 1.33. Don't use this if it is not certain that the kubelet has that support!
This option and RegistrarSocketFilename are mutually exclusive.
func Serialize ¶
Serialize overrides whether the helper serializes the prepare and unprepare calls. The default is to serialize.
A DRA driver can opt out of that to speed up parallel processing, but then must handle concurrency itself.
type PrepareResult ¶
type PrepareResult struct {
	// Err, if non-nil, describes a problem that occurred while preparing
	// the ResourceClaim. The devices are then ignored and the kubelet will
	// try to prepare the ResourceClaim again later.
	Err error

	// Devices contains the IDs of CDI devices associated with specific requests
	// in a ResourceClaim. Those IDs will be passed on to the container runtime
	// by the kubelet.
	//
	// The empty slice is also valid.
	Devices []Device
}
PrepareResult contains the result of preparing one particular ResourceClaim.
Source Files ¶
doc.go draplugin.go endpoint.go namespacedobject.go noderegistrar.go nonblockinggrpcserver.go registrationserver.go
- Version: v0.33.0 (latest)
- Published: Apr 23, 2025
- Platform: linux/amd64
- Imports: 19 packages