core::arch::x86_64

Function _tile_dpbuud

pub unsafe fn _tile_dpbuud(const DST: i32, const A: i32, const B: i32)
🔬This is a nightly-only experimental API. (x86_amx_intrinsics #126622)
Available on x86-64 and target feature amx-int8 only.

Compute the dot product of bytes in tiles with a source/destination accumulator. Multiply groups of 4 adjacent unsigned 8-bit integers in tile A with the corresponding unsigned 8-bit integers in tile B, producing 4 intermediate 32-bit results. Sum these 4 results with the corresponding 32-bit integer in tile DST, and store the 32-bit result back to tile DST. The const parameters DST, A, and B select the tile registers used as accumulator and sources.
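The arithmetic described above can be illustrated with a scalar reference model that runs on stable Rust without AMX hardware. This is only a sketch under stated assumptions: `tile_dpbuud_ref` and `example` are hypothetical helpers (not part of `core::arch`), tiles are modelled as plain vectors rather than hardware tile registers, the shapes are assumed to be M rows by N dwords for the accumulator, M rows by 4K bytes for the first source, and K rows by 4N bytes for the second source, and the 32-bit accumulation is assumed to wrap on overflow.

```rust
/// Hypothetical scalar model of the byte dot-product, for illustration only.
/// dst: M rows of N u32 accumulators
/// a:   M rows of 4*K unsigned bytes
/// b:   K rows of 4*N unsigned bytes
fn tile_dpbuud_ref(dst: &mut [Vec<u32>], a: &[Vec<u8>], b: &[Vec<u8>]) {
    for (m, dst_row) in dst.iter_mut().enumerate() {
        // Each group of 4 bytes in a row of `a` pairs with one row of `b`.
        for k in 0..a[m].len() / 4 {
            for (n, acc) in dst_row.iter_mut().enumerate() {
                for i in 0..4 {
                    // Zero-extend both unsigned bytes to 32 bits before multiplying.
                    let x = a[m][4 * k + i] as u32;
                    let y = b[k][4 * n + i] as u32;
                    // Assumption: accumulation wraps rather than saturates.
                    *acc = acc.wrapping_add(x * y);
                }
            }
        }
    }
}

/// Small worked case: one accumulator row with two dwords.
fn example() -> Vec<Vec<u32>> {
    let mut dst = vec![vec![10u32, 100u32]];
    let a = vec![vec![1u8, 2, 3, 4]];
    let b = vec![vec![1u8, 1, 1, 1, 2, 2, 2, 2]];
    tile_dpbuud_ref(&mut dst, &a, &b);
    dst
}
```

In `example`, the first accumulator becomes 10 + (1 + 2 + 3 + 4) = 20 and the second becomes 100 + (2 + 4 + 6 + 8) = 120. A model like this can serve as a reference when checking results produced by the real intrinsic on supporting hardware.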

Intel’s documentation