__tile_cmmrlfp16ps

Function __tile_cmmrlfp16ps 

pub unsafe fn __tile_cmmrlfp16ps(
    dst: *mut __tile1024i,
    a: __tile1024i,
    b: __tile1024i,
)
🔬This is a nightly-only experimental API. (x86_amx_intrinsics #126622)
Available on x86-64 and target feature amx-complex only.
Perform matrix multiplication of two tiles containing complex elements and accumulate the results into a packed single-precision tile. Each dword element in input tiles a and b is interpreted as a complex number with an FP16 real part and an FP16 imaginary part. This function calculates the real part of the result.

For each possible combination of (row of a, column of b), it performs a set of multiplications and accumulations on all corresponding complex numbers (one from a and one from b). The real part of the a element is multiplied with the real part of the corresponding b element, and the negated imaginary part of the a element is multiplied with the imaginary part of the corresponding b element. The two products are added, and the sum is accumulated into the corresponding row and column of dst.

The shape of each tile is specified in its __tile1024i struct. The tile register is allocated by the compiler.
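The accumulation described above can be illustrated with a scalar reference model. This is a minimal sketch, not the intrinsic itself: the `Complex` struct, the `cmmrl_reference` and `example` functions, and the `Vec`-of-rows layout are all hypothetical names introduced for illustration, and `f32` stands in for the FP16 inputs (assumed to be already widened). It does not use AMX tiles or require the `amx-complex` target feature.

```rust
/// Hypothetical stand-in for one dword element of an input tile.
/// The hardware stores both parts as FP16; this sketch assumes they
/// have already been widened to f32.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Complex {
    re: f32,
    im: f32,
}

/// Scalar model of the real-part accumulation:
/// dst[m][n] += a[m][k].re * b[k][n].re + (-a[m][k].im) * b[k][n].im
/// summed over k. `a` is M x K, `b` is K x N, `dst` is M x N.
fn cmmrl_reference(dst: &mut [Vec<f32>], a: &[Vec<Complex>], b: &[Vec<Complex>]) {
    assert_eq!(dst.len(), a.len(), "dst and a must have the same number of rows");
    for m in 0..dst.len() {
        assert_eq!(a[m].len(), b.len(), "columns of a must match rows of b");
        for k in 0..b.len() {
            for n in 0..dst[m].len() {
                // Real part times real part.
                dst[m][n] += a[m][k].re * b[k][n].re;
                // Negated imaginary part of a times imaginary part of b.
                dst[m][n] += (-a[m][k].im) * b[k][n].im;
            }
        }
    }
}

/// Small worked case: a is 1 x 2, b is 2 x 1, dst starts at 10.0.
fn example() -> Vec<Vec<f32>> {
    let mut dst = vec![vec![10.0f32]];
    let a = vec![vec![
        Complex { re: 1.0, im: 2.0 },
        Complex { re: 3.0, im: 4.0 },
    ]];
    let b = vec![
        vec![Complex { re: 5.0, im: 6.0 }],
        vec![Complex { re: 7.0, im: 8.0 }],
    ];
    cmmrl_reference(&mut dst, &a, &b);
    dst
}
```

In `example`, the single output element is 10 + (1·5 − 2·6) + (3·7 − 4·8) = −8, which is the existing dst value plus the real part of (1+2i)(5+6i) + (3+4i)(7+8i). The real instruction may differ from this model in rounding and intermediate precision.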

Intel’s documentation