🔬This is a nightly-only experimental API. (gpu_offload #131513)
This module provides support for GPU offloading. For technical details on the offload_kernel
macro and how to use it, see its documentation.
§General usage
The offload_kernel macro can be applied to a function to generate the necessary code to launch a
kernel on the target device.
#[offload_kernel]
fn kernel(x: *mut [f64; 256]) {
    // SAFETY:
    // calling our `arch` functions and dereferencing a raw pointer is unsafe
    unsafe {
        let n = (*x).len();
        let i = (thread_idx_x() + block_idx_x() * block_dim_x()) as usize;
        if i < n {
            (*x)[i] = i as f64;
        }
    }
}

To launch an offloaded kernel, the only way at present is the core::intrinsics::offload
intrinsic (note that using intrinsics is discouraged outside the standard library). It
lets you specify the grid and block dimensions and pass the required arguments to the device.
let mut x = [0.0f64; 256];
core::intrinsics::offload::<_, _, ()>(kernel, [256, 1, 1], [1, 1, 1], (&mut x as *mut [f64; 256],));

For precise information on the offload intrinsic, see its documentation.
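The indexing used in the kernel body can be checked on the host without any GPU support. The following is a minimal sketch in plain stable Rust, not a use of the offload API: `kernel_body` and `emulate_launch` are hypothetical names introduced here for illustration, and the thread, block, and block-dimension values are passed in as ordinary arguments instead of being read from the `arch` functions. It only assumes a one-dimensional launch and the same global-index formula as the kernel above.

```rust
/// Host-only stand-in for what one GPU thread does in the kernel body.
/// `thread_idx`, `block_idx` and `block_dim` are plain arguments here
/// (an assumption of this sketch); on the device they would come from
/// the `arch` functions.
fn kernel_body(x: &mut [f64; 256], thread_idx: u32, block_idx: u32, block_dim: u32) {
    let n = x.len();
    // Same global-index formula as in the kernel.
    let i = (thread_idx + block_idx * block_dim) as usize;
    // The bounds check turns surplus threads into no-ops.
    if i < n {
        x[i] = i as f64;
    }
}

/// Hypothetical helper: runs `kernel_body` sequentially for every
/// (block, thread) pair of a one-dimensional launch with `grid_x`
/// blocks of `block_x` threads each.
fn emulate_launch(x: &mut [f64; 256], grid_x: u32, block_x: u32) {
    for block_idx in 0..grid_x {
        for thread_idx in 0..block_x {
            kernel_body(x, thread_idx, block_idx, block_x);
        }
    }
}
```

In this emulation, `emulate_launch(&mut x, 256, 1)` visits every index exactly once. A launch with more threads than elements, such as 5 blocks of 64, writes the same values because of the `i < n` guard, while a launch with fewer threads leaves the tail of the array untouched.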
§Current limitations:
- Usage is restricted to types supported by the current device-mapping implementation.
- Generics and functions accepting dyn Trait are not supported.
- Kernel execution is currently restricted to intrinsics usage, which is discouraged outside of the standard library.
Modules§
- offload Experimental - This module provides support for GPU offloading. For technical details on the offload_kernel macro and how to use it, see its documentation.
Attribute Macros§
- offload_kernel Experimental - The offload_kernel macro is applied to a function to generate two separate definitions: a host-side wrapper for dispatch and a device-side kernel.