pub unsafe fn _mm256_clmulepi64_epi128(
    a: __m256i,
    b: __m256i,
    const IMM8: i32
) -> __m256i
🔬This is a nightly-only experimental API. (stdarch_x86_avx512 #111137)
Available on x86 or x86-64 with target feature vpclmulqdq only.

Performs a carry-less multiplication of two 64-bit polynomials over the finite field GF(2) in each of the two 128-bit lanes, producing one 128-bit product per lane.

The immediate byte determines which 64-bit half of each lane of a and b is used: bit 0 selects the half of a and bit 4 selects the half of b (0 for the low half, 1 for the high half). All other immediate bits are ignored. Both lanes share the same immediate byte.
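
The half selection and the carry-less product can be illustrated with a portable scalar model. This is only a sketch of the documented semantics and does not call the intrinsic: `clmul64` and `clmul_lanes_model` are hypothetical helper names, and each 256-bit vector is assumed to be represented as four `u64` values, least-significant first.

```rust
/// Carry-less multiplication of two 64-bit polynomials over GF(2).
/// Hypothetical helper for illustration; not part of `core::arch`.
fn clmul64(a: u64, b: u64) -> u128 {
    let mut acc: u128 = 0;
    for i in 0..64 {
        if (b >> i) & 1 == 1 {
            // Addition in GF(2) is XOR, so partial products never carry.
            acc ^= (a as u128) << i;
        }
    }
    acc
}

/// Scalar model of the 256-bit operation (an assumption-based sketch).
/// Elements 0..2 form lane 0 and elements 2..4 form lane 1.
fn clmul_lanes_model(a: [u64; 4], b: [u64; 4], imm8: i32) -> [u64; 4] {
    // Bit 0 picks the half of `a`, bit 4 picks the half of `b`;
    // every other bit of the immediate is ignored.
    let a_half = (imm8 & 0x01) as usize;
    let b_half = ((imm8 >> 4) & 0x01) as usize;

    let mut out = [0u64; 4];
    for lane in 0..2 {
        let base = lane * 2;
        // The same immediate is applied to both lanes.
        let product = clmul64(a[base + a_half], b[base + b_half]);
        out[base] = product as u64; // low 64 bits of the lane
        out[base + 1] = (product >> 64) as u64; // high 64 bits of the lane
    }
    out
}
```

For example, with `a = [3, 7, 5, 1]` and `b = [3, 2, 1, 4]`, an immediate of `0x00` multiplies the low halves of each lane (`3 ⊗ 3` and `5 ⊗ 1`), while `0x11` multiplies the high halves (`7 ⊗ 2` and `1 ⊗ 4`).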

Intel’s documentation