pub fn _mm256_dp_ps(a: __m256, b: __m256, const IMM8: i32) -> __m256
Available on x86-64 and target feature avx only.
Conditionally multiplies the packed single-precision (32-bit) floating-point
elements in a and b using the high 4 bits of IMM8, sums the four products,
and conditionally returns the sum using the low 4 bits of IMM8. The operation
is performed independently on each 128-bit lane of the 256-bit vectors.
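
The following is a minimal sketch of how the mask might be used, not an official example. The helper names dot_per_lane and dot_per_lane_avx are hypothetical, and the mask 0xF1 is chosen only for illustration: its high 4 bits (0xF) include all four products of each 128-bit lane in the sum, and its low 4 bits (0x1) write that sum to the lowest element of the lane, leaving the other elements zero. The intrinsic is called with the const-generic turbofish form, and the call is wrapped in unsafe so the sketch also compiles on toolchains where the intrinsic is still declared unsafe.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Hypothetical helper: computes one 4-element dot product per 128-bit lane.
/// Must only be called when the CPU supports AVX.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx")]
unsafe fn dot_per_lane_avx(a: &[f32; 8], b: &[f32; 8]) -> [f32; 8] {
    unsafe {
        // Unaligned loads, since plain arrays are not guaranteed 32-byte aligned.
        let va = _mm256_loadu_ps(a.as_ptr());
        let vb = _mm256_loadu_ps(b.as_ptr());
        // 0xF1: multiply all four element pairs in each lane (high bits 1111),
        // store the sum only in element 0 of each lane (low bits 0001).
        let r = _mm256_dp_ps::<0xF1>(va, vb);
        let mut out = [0.0f32; 8];
        _mm256_storeu_ps(out.as_mut_ptr(), r);
        out
    }
}

/// Hypothetical safe wrapper: returns None when AVX is unavailable
/// (or when compiled for a non-x86-64 target).
pub fn dot_per_lane(a: &[f32; 8], b: &[f32; 8]) -> Option<[f32; 8]> {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx") {
            // Safe to call: AVX support was just verified at runtime.
            return Some(unsafe { dot_per_lane_avx(a, b) });
        }
    }
    None
}
```

With a = [1, 2, 3, 4, 5, 6, 7, 8] and b = [1, 1, 1, 1, 2, 2, 2, 2], the lower lane sums to 1 + 2 + 3 + 4 = 10 and the upper lane to 2 * (5 + 6 + 7 + 8) = 52, so the wrapper yields [10, 0, 0, 0, 52, 0, 0, 0] on a machine with AVX. Because each lane is reduced separately, a full 8-element dot product would still require adding the two lane results together.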