Module core::arch::x86_64 (since 1.27.0)
Platform-specific intrinsics for the x86_64 platform.
See the module documentation for more details.
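As a minimal sketch of how these intrinsics are typically called, the hypothetical helper `add4` below adds two four-lane integer arrays. It assumes the SSE2 intrinsics `_mm_loadu_si128`, `_mm_add_epi32` and `_mm_storeu_si128` from this module (they appear further down the full function list), and it falls back to a scalar loop on other architectures so the example compiles everywhere.

```rust
/// Adds two 4-lane `i32` arrays with SSE2 (hypothetical helper for illustration).
#[cfg(target_arch = "x86_64")]
pub fn add4(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    use core::arch::x86_64::{__m128i, _mm_add_epi32, _mm_loadu_si128, _mm_storeu_si128};
    let mut out = [0i32; 4];
    // SSE2 is part of the x86_64 baseline, so no runtime feature check is needed.
    unsafe {
        // Unaligned loads: the arrays are not guaranteed to be 16-byte aligned.
        let va = _mm_loadu_si128(a.as_ptr() as *const __m128i);
        let vb = _mm_loadu_si128(b.as_ptr() as *const __m128i);
        // Lane-wise 32-bit addition; overflow wraps around.
        let sum = _mm_add_epi32(va, vb);
        _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, sum);
    }
    out
}

/// Scalar fallback with the same wrapping behaviour for non-x86_64 targets.
#[cfg(not(target_arch = "x86_64"))]
pub fn add4(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    let mut out = [0i32; 4];
    for i in 0..4 {
        out[i] = a[i].wrapping_add(b[i]);
    }
    out
}
```

Intrinsics that need more than the baseline feature set (AVX, AVX-512 and so on) would additionally need a feature check before the call; that part is not shown here.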
Structs
__m512  Experimental 512-bit wide set of sixteen 
__m512d  Experimental 512-bit wide set of eight 
__m512i  Experimental 512-bit wide integer vector type, x86-specific 
CpuidResult  Result of the 
__m128  128-bit wide set of four 
__m128d  128-bit wide set of two 
__m128i  128-bit wide integer vector type, x86-specific 
__m256  256-bit wide set of eight 
__m256d  256-bit wide set of four 
__m256i  256-bit wide integer vector type, x86-specific 
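The vector types above are opaque: their lanes are read and written through load and store intrinsics rather than by field access. The sketch below, using the hypothetical helpers `vector_widths_in_bits` and `roundtrip_lanes`, checks the widths implied by the type names and moves four `f32` lanes through an `__m128` value. It assumes the SSE intrinsics `_mm_loadu_ps` and `_mm_storeu_ps` from this module; the non-x86_64 fallbacks exist only so the example compiles on other targets.

```rust
/// Bit widths of the stable vector types, in listing order (hypothetical helper).
#[cfg(target_arch = "x86_64")]
pub fn vector_widths_in_bits() -> [usize; 6] {
    use core::arch::x86_64::{__m128, __m128d, __m128i, __m256, __m256d, __m256i};
    [
        core::mem::size_of::<__m128>() * 8,
        core::mem::size_of::<__m128d>() * 8,
        core::mem::size_of::<__m128i>() * 8,
        core::mem::size_of::<__m256>() * 8,
        core::mem::size_of::<__m256d>() * 8,
        core::mem::size_of::<__m256i>() * 8,
    ]
}

/// Loads four `f32` lanes into an `__m128` and stores them back unchanged.
#[cfg(target_arch = "x86_64")]
pub fn roundtrip_lanes(lanes: [f32; 4]) -> [f32; 4] {
    use core::arch::x86_64::{__m128, _mm_loadu_ps, _mm_storeu_ps};
    let mut out = [0.0f32; 4];
    unsafe {
        let v: __m128 = _mm_loadu_ps(lanes.as_ptr());
        _mm_storeu_ps(out.as_mut_ptr(), v);
    }
    out
}

// Fallbacks for other targets: the types do not exist there, so the widths
// implied by the type names are returned directly.
#[cfg(not(target_arch = "x86_64"))]
pub fn vector_widths_in_bits() -> [usize; 6] {
    [128, 128, 128, 256, 256, 256]
}

#[cfg(not(target_arch = "x86_64"))]
pub fn roundtrip_lanes(lanes: [f32; 4]) -> [f32; 4] {
    lanes
}
```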
Constants
_MM_CMPINT_EQ  Experimental Equal 
_MM_CMPINT_FALSE  Experimental False 
_MM_CMPINT_LE  Experimental Less-than-or-equal 
_MM_CMPINT_LT  Experimental Less-than 
_MM_CMPINT_NE  Experimental Not-equal 
_MM_CMPINT_NLE  Experimental Not less-than-or-equal 
_MM_CMPINT_NLT  Experimental Not less-than 
_MM_CMPINT_TRUE  Experimental True 
_MM_MANT_NORM_1_2  Experimental interval [1, 2) 
_MM_MANT_NORM_P5_1  Experimental interval [0.5, 1) 
_MM_MANT_NORM_P5_2  Experimental interval [0.5, 2) 
_MM_MANT_NORM_P75_1P5  Experimental interval [0.75, 1.5) 
_MM_MANT_SIGN_NAN  Experimental DEST = NaN if sign(SRC) = 1 
_MM_MANT_SIGN_SRC  Experimental sign = sign(SRC) 
_MM_MANT_SIGN_ZERO  Experimental sign = 0 
_MM_PERM_AAAA  Experimental 
_MM_PERM_AAAB  Experimental 
_MM_PERM_AAAC  Experimental 
_MM_PERM_AAAD  Experimental 
_MM_PERM_AABA  Experimental 
_MM_PERM_AABB  Experimental 
_MM_PERM_AABC  Experimental 
_MM_PERM_AABD  Experimental 
_MM_PERM_AACA  Experimental 
_MM_PERM_AACB  Experimental 
_MM_PERM_AACC  Experimental 
_MM_PERM_AACD  Experimental 
_MM_PERM_AADA  Experimental 
_MM_PERM_AADB  Experimental 
_MM_PERM_AADC  Experimental 
_MM_PERM_AADD  Experimental 
_MM_PERM_ABAA  Experimental 
_MM_PERM_ABAB  Experimental 
_MM_PERM_ABAC  Experimental 
_MM_PERM_ABAD  Experimental 
_MM_PERM_ABBA  Experimental 
_MM_PERM_ABBB  Experimental 
_MM_PERM_ABBC  Experimental 
_MM_PERM_ABBD  Experimental 
_MM_PERM_ABCA  Experimental 
_MM_PERM_ABCB  Experimental 
_MM_PERM_ABCC  Experimental 
_MM_PERM_ABCD  Experimental 
_MM_PERM_ABDA  Experimental 
_MM_PERM_ABDB  Experimental 
_MM_PERM_ABDC  Experimental 
_MM_PERM_ABDD  Experimental 
_MM_PERM_ACAA  Experimental 
_MM_PERM_ACAB  Experimental 
_MM_PERM_ACAC  Experimental 
_MM_PERM_ACAD  Experimental 
_MM_PERM_ACBA  Experimental 
_MM_PERM_ACBB  Experimental 
_MM_PERM_ACBC  Experimental 
_MM_PERM_ACBD  Experimental 
_MM_PERM_ACCA  Experimental 
_MM_PERM_ACCB  Experimental 
_MM_PERM_ACCC  Experimental 
_MM_PERM_ACCD  Experimental 
_MM_PERM_ACDA  Experimental 
_MM_PERM_ACDB  Experimental 
_MM_PERM_ACDC  Experimental 
_MM_PERM_ACDD  Experimental 
_MM_PERM_ADAA  Experimental 
_MM_PERM_ADAB  Experimental 
_MM_PERM_ADAC  Experimental 
_MM_PERM_ADAD  Experimental 
_MM_PERM_ADBA  Experimental 
_MM_PERM_ADBB  Experimental 
_MM_PERM_ADBC  Experimental 
_MM_PERM_ADBD  Experimental 
_MM_PERM_ADCA  Experimental 
_MM_PERM_ADCB  Experimental 
_MM_PERM_ADCC  Experimental 
_MM_PERM_ADCD  Experimental 
_MM_PERM_ADDA  Experimental 
_MM_PERM_ADDB  Experimental 
_MM_PERM_ADDC  Experimental 
_MM_PERM_ADDD  Experimental 
_MM_PERM_BAAA  Experimental 
_MM_PERM_BAAB  Experimental 
_MM_PERM_BAAC  Experimental 
_MM_PERM_BAAD  Experimental 
_MM_PERM_BABA  Experimental 
_MM_PERM_BABB  Experimental 
_MM_PERM_BABC  Experimental 
_MM_PERM_BABD  Experimental 
_MM_PERM_BACA  Experimental 
_MM_PERM_BACB  Experimental 
_MM_PERM_BACC  Experimental 
_MM_PERM_BACD  Experimental 
_MM_PERM_BADA  Experimental 
_MM_PERM_BADB  Experimental 
_MM_PERM_BADC  Experimental 
_MM_PERM_BADD  Experimental 
_MM_PERM_BBAA  Experimental 
_MM_PERM_BBAB  Experimental 
_MM_PERM_BBAC  Experimental 
_MM_PERM_BBAD  Experimental 
_MM_PERM_BBBA  Experimental 
_MM_PERM_BBBB  Experimental 
_MM_PERM_BBBC  Experimental 
_MM_PERM_BBBD  Experimental 
_MM_PERM_BBCA  Experimental 
_MM_PERM_BBCB  Experimental 
_MM_PERM_BBCC  Experimental 
_MM_PERM_BBCD  Experimental 
_MM_PERM_BBDA  Experimental 
_MM_PERM_BBDB  Experimental 
_MM_PERM_BBDC  Experimental 
_MM_PERM_BBDD  Experimental 
_MM_PERM_BCAA  Experimental 
_MM_PERM_BCAB  Experimental 
_MM_PERM_BCAC  Experimental 
_MM_PERM_BCAD  Experimental 
_MM_PERM_BCBA  Experimental 
_MM_PERM_BCBB  Experimental 
_MM_PERM_BCBC  Experimental 
_MM_PERM_BCBD  Experimental 
_MM_PERM_BCCA  Experimental 
_MM_PERM_BCCB  Experimental 
_MM_PERM_BCCC  Experimental 
_MM_PERM_BCCD  Experimental 
_MM_PERM_BCDA  Experimental 
_MM_PERM_BCDB  Experimental 
_MM_PERM_BCDC  Experimental 
_MM_PERM_BCDD  Experimental 
_MM_PERM_BDAA  Experimental 
_MM_PERM_BDAB  Experimental 
_MM_PERM_BDAC  Experimental 
_MM_PERM_BDAD  Experimental 
_MM_PERM_BDBA  Experimental 
_MM_PERM_BDBB  Experimental 
_MM_PERM_BDBC  Experimental 
_MM_PERM_BDBD  Experimental 
_MM_PERM_BDCA  Experimental 
_MM_PERM_BDCB  Experimental 
_MM_PERM_BDCC  Experimental 
_MM_PERM_BDCD  Experimental 
_MM_PERM_BDDA  Experimental 
_MM_PERM_BDDB  Experimental 
_MM_PERM_BDDC  Experimental 
_MM_PERM_BDDD  Experimental 
_MM_PERM_CAAA  Experimental 
_MM_PERM_CAAB  Experimental 
_MM_PERM_CAAC  Experimental 
_MM_PERM_CAAD  Experimental 
_MM_PERM_CABA  Experimental 
_MM_PERM_CABB  Experimental 
_MM_PERM_CABC  Experimental 
_MM_PERM_CABD  Experimental 
_MM_PERM_CACA  Experimental 
_MM_PERM_CACB  Experimental 
_MM_PERM_CACC  Experimental 
_MM_PERM_CACD  Experimental 
_MM_PERM_CADA  Experimental 
_MM_PERM_CADB  Experimental 
_MM_PERM_CADC  Experimental 
_MM_PERM_CADD  Experimental 
_MM_PERM_CBAA  Experimental 
_MM_PERM_CBAB  Experimental 
_MM_PERM_CBAC  Experimental 
_MM_PERM_CBAD  Experimental 
_MM_PERM_CBBA  Experimental 
_MM_PERM_CBBB  Experimental 
_MM_PERM_CBBC  Experimental 
_MM_PERM_CBBD  Experimental 
_MM_PERM_CBCA  Experimental 
_MM_PERM_CBCB  Experimental 
_MM_PERM_CBCC  Experimental 
_MM_PERM_CBCD  Experimental 
_MM_PERM_CBDA  Experimental 
_MM_PERM_CBDB  Experimental 
_MM_PERM_CBDC  Experimental 
_MM_PERM_CBDD  Experimental 
_MM_PERM_CCAA  Experimental 
_MM_PERM_CCAB  Experimental 
_MM_PERM_CCAC  Experimental 
_MM_PERM_CCAD  Experimental 
_MM_PERM_CCBA  Experimental 
_MM_PERM_CCBB  Experimental 
_MM_PERM_CCBC  Experimental 
_MM_PERM_CCBD  Experimental 
_MM_PERM_CCCA  Experimental 
_MM_PERM_CCCB  Experimental 
_MM_PERM_CCCC  Experimental 
_MM_PERM_CCCD  Experimental 
_MM_PERM_CCDA  Experimental 
_MM_PERM_CCDB  Experimental 
_MM_PERM_CCDC  Experimental 
_MM_PERM_CCDD  Experimental 
_MM_PERM_CDAA  Experimental 
_MM_PERM_CDAB  Experimental 
_MM_PERM_CDAC  Experimental 
_MM_PERM_CDAD  Experimental 
_MM_PERM_CDBA  Experimental 
_MM_PERM_CDBB  Experimental 
_MM_PERM_CDBC  Experimental 
_MM_PERM_CDBD  Experimental 
_MM_PERM_CDCA  Experimental 
_MM_PERM_CDCB  Experimental 
_MM_PERM_CDCC  Experimental 
_MM_PERM_CDCD  Experimental 
_MM_PERM_CDDA  Experimental 
_MM_PERM_CDDB  Experimental 
_MM_PERM_CDDC  Experimental 
_MM_PERM_CDDD  Experimental 
_MM_PERM_DAAA  Experimental 
_MM_PERM_DAAB  Experimental 
_MM_PERM_DAAC  Experimental 
_MM_PERM_DAAD  Experimental 
_MM_PERM_DABA  Experimental 
_MM_PERM_DABB  Experimental 
_MM_PERM_DABC  Experimental 
_MM_PERM_DABD  Experimental 
_MM_PERM_DACA  Experimental 
_MM_PERM_DACB  Experimental 
_MM_PERM_DACC  Experimental 
_MM_PERM_DACD  Experimental 
_MM_PERM_DADA  Experimental 
_MM_PERM_DADB  Experimental 
_MM_PERM_DADC  Experimental 
_MM_PERM_DADD  Experimental 
_MM_PERM_DBAA  Experimental 
_MM_PERM_DBAB  Experimental 
_MM_PERM_DBAC  Experimental 
_MM_PERM_DBAD  Experimental 
_MM_PERM_DBBA  Experimental 
_MM_PERM_DBBB  Experimental 
_MM_PERM_DBBC  Experimental 
_MM_PERM_DBBD  Experimental 
_MM_PERM_DBCA  Experimental 
_MM_PERM_DBCB  Experimental 
_MM_PERM_DBCC  Experimental 
_MM_PERM_DBCD  Experimental 
_MM_PERM_DBDA  Experimental 
_MM_PERM_DBDB  Experimental 
_MM_PERM_DBDC  Experimental 
_MM_PERM_DBDD  Experimental 
_MM_PERM_DCAA  Experimental 
_MM_PERM_DCAB  Experimental 
_MM_PERM_DCAC  Experimental 
_MM_PERM_DCAD  Experimental 
_MM_PERM_DCBA  Experimental 
_MM_PERM_DCBB  Experimental 
_MM_PERM_DCBC  Experimental 
_MM_PERM_DCBD  Experimental 
_MM_PERM_DCCA  Experimental 
_MM_PERM_DCCB  Experimental 
_MM_PERM_DCCC  Experimental 
_MM_PERM_DCCD  Experimental 
_MM_PERM_DCDA  Experimental 
_MM_PERM_DCDB  Experimental 
_MM_PERM_DCDC  Experimental 
_MM_PERM_DCDD  Experimental 
_MM_PERM_DDAA  Experimental 
_MM_PERM_DDAB  Experimental 
_MM_PERM_DDAC  Experimental 
_MM_PERM_DDAD  Experimental 
_MM_PERM_DDBA  Experimental 
_MM_PERM_DDBB  Experimental 
_MM_PERM_DDBC  Experimental 
_MM_PERM_DDBD  Experimental 
_MM_PERM_DDCA  Experimental 
_MM_PERM_DDCB  Experimental 
_MM_PERM_DDCC  Experimental 
_MM_PERM_DDCD  Experimental 
_MM_PERM_DDDA  Experimental 
_MM_PERM_DDDB  Experimental 
_MM_PERM_DDDC  Experimental 
_MM_PERM_DDDD  Experimental 
_XABORT_CAPACITY  Experimental Transaction abort due to the transaction using too much memory. 
_XABORT_CONFLICT  Experimental Transaction abort due to a memory conflict with another thread. 
_XABORT_DEBUG  Experimental Transaction abort due to a debug trap. 
_XABORT_EXPLICIT  Experimental Transaction explicitly aborted with xabort. The parameter passed to xabort is available with

_XABORT_NESTED  Experimental Transaction abort in an inner nested transaction. 
_XABORT_RETRY  Experimental Transaction retry is possible. 
_XBEGIN_STARTED  Experimental Transaction successfully started. 
_CMP_EQ_OQ  Equal (ordered, non-signaling) 
_CMP_EQ_OS  Equal (ordered, signaling) 
_CMP_EQ_UQ  Equal (unordered, non-signaling) 
_CMP_EQ_US  Equal (unordered, signaling) 
_CMP_FALSE_OQ  False (ordered, non-signaling) 
_CMP_FALSE_OS  False (ordered, signaling) 
_CMP_GE_OQ  Greater-than-or-equal (ordered, non-signaling) 
_CMP_GE_OS  Greater-than-or-equal (ordered, signaling) 
_CMP_GT_OQ  Greater-than (ordered, non-signaling) 
_CMP_GT_OS  Greater-than (ordered, signaling) 
_CMP_LE_OQ  Less-than-or-equal (ordered, non-signaling) 
_CMP_LE_OS  Less-than-or-equal (ordered, signaling) 
_CMP_LT_OQ  Less-than (ordered, non-signaling) 
_CMP_LT_OS  Less-than (ordered, signaling) 
_CMP_NEQ_OQ  Not-equal (ordered, non-signaling) 
_CMP_NEQ_OS  Not-equal (ordered, signaling) 
_CMP_NEQ_UQ  Not-equal (unordered, non-signaling) 
_CMP_NEQ_US  Not-equal (unordered, signaling) 
_CMP_NGE_UQ  Not-greater-than-or-equal (unordered, non-signaling) 
_CMP_NGE_US  Not-greater-than-or-equal (unordered, signaling) 
_CMP_NGT_UQ  Not-greater-than (unordered, non-signaling) 
_CMP_NGT_US  Not-greater-than (unordered, signaling) 
_CMP_NLE_UQ  Not-less-than-or-equal (unordered, non-signaling) 
_CMP_NLE_US  Not-less-than-or-equal (unordered, signaling) 
_CMP_NLT_UQ  Not-less-than (unordered, non-signaling) 
_CMP_NLT_US  Not-less-than (unordered, signaling) 
_CMP_ORD_Q  Ordered (non-signaling) 
_CMP_ORD_S  Ordered (signaling) 
_CMP_TRUE_UQ  True (unordered, non-signaling) 
_CMP_TRUE_US  True (unordered, signaling) 
_CMP_UNORD_Q  Unordered (non-signaling) 
_CMP_UNORD_S  Unordered (signaling) 
_MM_EXCEPT_DENORM  See 
_MM_EXCEPT_DIV_ZERO  See 
_MM_EXCEPT_INEXACT  See 
_MM_EXCEPT_INVALID  See 
_MM_EXCEPT_MASK  
_MM_EXCEPT_OVERFLOW  See 
_MM_EXCEPT_UNDERFLOW  See 
_MM_FLUSH_ZERO_MASK  
_MM_FLUSH_ZERO_OFF  See 
_MM_FLUSH_ZERO_ON  See 
_MM_FROUND_CEIL  round up and do not suppress exceptions 
_MM_FROUND_CUR_DIRECTION  use MXCSR.RC; see 
_MM_FROUND_FLOOR  round down and do not suppress exceptions 
_MM_FROUND_NEARBYINT  use MXCSR.RC and suppress exceptions; see 
_MM_FROUND_NINT  round to nearest and do not suppress exceptions 
_MM_FROUND_NO_EXC  suppress exceptions 
_MM_FROUND_RAISE_EXC  do not suppress exceptions 
_MM_FROUND_RINT  use MXCSR.RC and do not suppress exceptions; see

_MM_FROUND_TO_NEAREST_INT  round to nearest 
_MM_FROUND_TO_NEG_INF  round down 
_MM_FROUND_TO_POS_INF  round up 
_MM_FROUND_TO_ZERO  truncate 
_MM_FROUND_TRUNC  truncate and do not suppress exceptions 
_MM_HINT_NTA  See 
_MM_HINT_T0  See 
_MM_HINT_T1  See 
_MM_HINT_T2  See 
_MM_MASK_DENORM  See 
_MM_MASK_DIV_ZERO  See 
_MM_MASK_INEXACT  See 
_MM_MASK_INVALID  See 
_MM_MASK_MASK  
_MM_MASK_OVERFLOW  See 
_MM_MASK_UNDERFLOW  See 
_MM_ROUND_DOWN  See 
_MM_ROUND_MASK  
_MM_ROUND_NEAREST  See 
_MM_ROUND_TOWARD_ZERO  See 
_MM_ROUND_UP  See 
_SIDD_BIT_MASK  Mask only: return the bit mask 
_SIDD_CMP_EQUAL_ANY  For each character in 
_SIDD_CMP_EQUAL_EACH  The strings defined by 
_SIDD_CMP_EQUAL_ORDERED  Search for the defined substring in the target 
_SIDD_CMP_RANGES  For each character in 
_SIDD_LEAST_SIGNIFICANT  Index only: return the least significant bit (Default) 
_SIDD_MASKED_NEGATIVE_POLARITY  Negates results only before the end of the string 
_SIDD_MASKED_POSITIVE_POLARITY  Do not negate results before the end of the string 
_SIDD_MOST_SIGNIFICANT  Index only: return the most significant bit 
_SIDD_NEGATIVE_POLARITY  Negates results 
_SIDD_POSITIVE_POLARITY  Do not negate results (Default) 
_SIDD_SBYTE_OPS  String contains signed 8-bit characters 
_SIDD_SWORD_OPS  String contains signed 16-bit characters 
_SIDD_UBYTE_OPS  String contains unsigned 8-bit characters (Default) 
_SIDD_UNIT_MASK  Mask only: return the byte mask 
_SIDD_UWORD_OPS  String contains unsigned 16-bit characters 
_XCR_XFEATURE_ENABLED_MASK 
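The "ordered" and "unordered" labels on the `_CMP_*` predicates above describe what a comparison yields when either operand is NaN: ordered predicates yield false, unordered predicates yield true. The signaling and non-signaling variants give the same result and differ only in whether a quiet NaN raises the invalid-operation exception. The scalar model below is only a sketch of that rule for a handful of predicates; the `Pred` enum and `cmp_model` function are hypothetical names and are not part of `core::arch`.

```rust
/// A few `_CMP_*` predicates, modelled on scalars (hypothetical type for illustration).
#[derive(Clone, Copy, Debug)]
pub enum Pred {
    EqOq,   // _CMP_EQ_OQ
    EqUq,   // _CMP_EQ_UQ
    LtOq,   // _CMP_LT_OQ
    NgeUq,  // _CMP_NGE_UQ
    OrdQ,   // _CMP_ORD_Q
    UnordQ, // _CMP_UNORD_Q
}

/// Returns the per-lane result the predicate would produce for `a` and `b`.
pub fn cmp_model(p: Pred, a: f64, b: f64) -> bool {
    // A pair is "unordered" when at least one operand is NaN.
    let unordered = a.is_nan() || b.is_nan();
    match p {
        Pred::EqOq => !unordered && a == b,
        Pred::EqUq => unordered || a == b,
        Pred::LtOq => !unordered && a < b,
        Pred::NgeUq => unordered || !(a >= b),
        Pred::OrdQ => !unordered,
        Pred::UnordQ => unordered,
    }
}
```

For example, `cmp_model(Pred::EqOq, f64::NAN, f64::NAN)` is false while `cmp_model(Pred::EqUq, f64::NAN, f64::NAN)` is true, which is the difference between `_CMP_EQ_OQ` and `_CMP_EQ_UQ`.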

Functions
_MM_SHUFFLE  Experimental A utility function for creating masks to use with Intel shuffle and permute intrinsics. 
_bittest^{⚠}  Experimental Returns the bit in position 
_bittest64^{⚠}  Experimental Returns the bit in position 
_bittestandcomplement^{⚠}  Experimental Returns the bit in position 
_bittestandcomplement64^{⚠}  Experimental Returns the bit in position 
_bittestandreset^{⚠}  Experimental Returns the bit in position 
_bittestandreset64^{⚠}  Experimental Returns the bit in position 
_bittestandset^{⚠}  Experimental Returns the bit in position 
_bittestandset64^{⚠}  Experimental Returns the bit in position 
_kand_mask16^{⚠}  Experimental avx512f Compute the bitwise AND of 16-bit masks a and b, and store the result in k. 
_kandn_mask16^{⚠}  Experimental avx512f Compute the bitwise NOT of 16-bit masks a and then AND with b, and store the result in k. 
_knot_mask16^{⚠}  Experimental avx512f Compute the bitwise NOT of 16-bit mask a, and store the result in k. 
_kor_mask16^{⚠}  Experimental avx512f Compute the bitwise OR of 16-bit masks a and b, and store the result in k. 
_kxnor_mask16^{⚠}  Experimental avx512f Compute the bitwise XNOR of 16-bit masks a and b, and store the result in k. 
_kxor_mask16^{⚠}  Experimental avx512f Compute the bitwise XOR of 16-bit masks a and b, and store the result in k. 
_mm256_cvtph_ps^{⚠}  Experimental f16c Converts the 8 x 16-bit half-precision float values in the 128-bit vector

_mm256_cvtps_ph^{⚠}  Experimental f16c Converts the 8 x 32-bit float values in the 256-bit vector 
_mm256_madd52hi_epu64^{⚠}  Experimental avx512ifma,avx512vl Multiply packed unsigned 52-bit integers in each 64-bit element of

_mm256_madd52lo_epu64^{⚠}  Experimental avx512ifma,avx512vl Multiply packed unsigned 52-bit integers in each 64-bit element of

_mm512_abs_epi32^{⚠}  Experimental avx512f Computes the absolute values of packed 32-bit integers in 
_mm512_abs_epi64^{⚠}  Experimental avx512f Compute the absolute value of packed signed 64-bit integers in a, and store the unsigned results in dst. 
_mm512_abs_pd^{⚠}  Experimental avx512f Finds the absolute value of each packed double-precision (64-bit) floating-point element in v2, storing the results in dst. 
_mm512_abs_ps^{⚠}  Experimental avx512f Finds the absolute value of each packed single-precision (32-bit) floating-point element in v2, storing the results in dst. 
_mm512_add_epi32^{⚠}  Experimental avx512f Add packed 32-bit integers in a and b, and store the results in dst. 
_mm512_add_epi64^{⚠}  Experimental avx512f Add packed 64-bit integers in a and b, and store the results in dst. 
_mm512_add_pd^{⚠}  Experimental avx512f Add packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst. 
_mm512_add_ps^{⚠}  Experimental avx512f Add packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst. 
_mm512_add_round_pd^{⚠}  Experimental avx512f Add packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst. 
_mm512_add_round_ps^{⚠}  Experimental avx512f Add packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst. 
_mm512_alignr_epi32^{⚠}  Experimental avx512f Concatenate a and b into a 128-byte intermediate result, shift the result right by imm8 32-bit elements, and store the low 64 bytes (16 elements) in dst. 
_mm512_alignr_epi64^{⚠}  Experimental avx512f Concatenate a and b into a 128-byte intermediate result, shift the result right by imm8 64-bit elements, and store the low 64 bytes (8 elements) in dst. 
_mm512_and_epi32^{⚠}  Experimental avx512f Compute the bitwise AND of packed 32-bit integers in a and b, and store the results in dst. 
_mm512_and_epi64^{⚠}  Experimental avx512f Compute the bitwise AND of 512 bits (composed of packed 64-bit integers) in a and b, and store the results in dst. 
_mm512_and_si512^{⚠}  Experimental avx512f Compute the bitwise AND of 512 bits (representing integer data) in a and b, and store the result in dst. 
_mm512_andnot_epi32^{⚠}  Experimental avx512f Compute the bitwise NOT of packed 32-bit integers in a and then AND with b, and store the results in dst. 
_mm512_andnot_epi64^{⚠}  Experimental avx512f Compute the bitwise NOT of 512 bits (composed of packed 64-bit integers) in a and then AND with b, and store the results in dst. 
_mm512_andnot_si512^{⚠}  Experimental avx512f Compute the bitwise NOT of 512 bits (representing integer data) in a and then AND with b, and store the result in dst. 
_mm512_broadcast_f32x4^{⚠}  Experimental avx512f Broadcast the 4 packed single-precision (32-bit) floating-point elements from a to all elements of dst. 
_mm512_broadcast_f64x4^{⚠}  Experimental avx512f Broadcast the 4 packed double-precision (64-bit) floating-point elements from a to all elements of dst. 
_mm512_broadcast_i32x4^{⚠}  Experimental avx512f Broadcast the 4 packed 32-bit integers from a to all elements of dst. 
_mm512_broadcast_i64x4^{⚠}  Experimental avx512f Broadcast the 4 packed 64-bit integers from a to all elements of dst. 
_mm512_broadcastd_epi32^{⚠}  Experimental avx512f Broadcast the low packed 32-bit integer from a to all elements of dst. 
_mm512_broadcastq_epi64^{⚠}  Experimental avx512f Broadcast the low packed 64-bit integer from a to all elements of dst. 
_mm512_broadcastsd_pd^{⚠}  Experimental avx512f Broadcast the low double-precision (64-bit) floating-point element from a to all elements of dst. 
_mm512_broadcastss_ps^{⚠}  Experimental avx512f Broadcast the low single-precision (32-bit) floating-point element from a to all elements of dst. 
_mm512_castpd128_pd512^{⚠}  Experimental avx512f Cast vector of type __m128d to type __m512d; the upper 384 bits of the result are undefined. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. 
_mm512_castpd256_pd512^{⚠}  Experimental avx512f Cast vector of type __m256d to type __m512d; the upper 256 bits of the result are undefined. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. 
_mm512_castpd512_pd128^{⚠}  Experimental avx512f Cast vector of type __m512d to type __m128d. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. 
_mm512_castpd512_pd256^{⚠}  Experimental avx512f Cast vector of type __m512d to type __m256d. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. 
_mm512_castpd_ps^{⚠}  Experimental avx512f Cast vector of type __m512d to type __m512. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. 
_mm512_castpd_si512^{⚠}  Experimental avx512f Cast vector of type __m512d to type __m512i. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. 
_mm512_castps128_ps512^{⚠}  Experimental avx512f Cast vector of type __m128 to type __m512; the upper 384 bits of the result are undefined. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. 
_mm512_castps256_ps512^{⚠}  Experimental avx512f Cast vector of type __m256 to type __m512; the upper 256 bits of the result are undefined. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. 
_mm512_castps512_ps128^{⚠}  Experimental avx512f Cast vector of type __m512 to type __m128. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. 
_mm512_castps512_ps256^{⚠}  Experimental avx512f Cast vector of type __m512 to type __m256. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. 
_mm512_castps_pd^{⚠}  Experimental avx512f Cast vector of type __m512 to type __m512d. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. 
_mm512_castps_si512^{⚠}  Experimental avx512f Cast vector of type __m512 to type __m512i. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. 
_mm512_castsi128_si512^{⚠}  Experimental avx512f Cast vector of type __m128i to type __m512i; the upper 384 bits of the result are undefined. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. 
_mm512_castsi256_si512^{⚠}  Experimental avx512f Cast vector of type __m256i to type __m512i; the upper 256 bits of the result are undefined. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. 
_mm512_castsi512_pd^{⚠}  Experimental avx512f Cast vector of type __m512i to type __m512d. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. 
_mm512_castsi512_ps^{⚠}  Experimental avx512f Cast vector of type __m512i to type __m512. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. 
_mm512_castsi512_si128^{⚠}  Experimental avx512f Cast vector of type __m512i to type __m128i. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. 
_mm512_castsi512_si256^{⚠}  Experimental avx512f Cast vector of type __m512i to type __m256i. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. 
_mm512_cmp_epi32_mask^{⚠}  Experimental avx512f Compare packed signed 32-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k. 
_mm512_cmp_epi64_mask^{⚠}  Experimental avx512f Compare packed signed 64-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k. 
_mm512_cmp_epu32_mask^{⚠}  Experimental avx512f Compare packed unsigned 32-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k. 
_mm512_cmp_epu64_mask^{⚠}  Experimental avx512f Compare packed unsigned 64-bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k. 
_mm512_cmp_pd_mask^{⚠}  Experimental avx512f Compare packed double-precision (64-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in mask vector k. 
_mm512_cmp_ps_mask^{⚠}  Experimental avx512f Compare packed single-precision (32-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in mask vector k. 
_mm512_cmp_round_pd_mask^{⚠}  Experimental avx512f Compare packed double-precision (64-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in mask vector k. 
_mm512_cmp_round_ps_mask^{⚠}  Experimental avx512f Compare packed single-precision (32-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in mask vector k. 
_mm512_cmpeq_epi32_mask^{⚠}  Experimental avx512f Compare packed 32-bit integers in a and b for equality, and store the results in mask vector k. 
_mm512_cmpeq_epi64_mask^{⚠}  Experimental avx512f Compare packed 64-bit integers in a and b for equality, and store the results in mask vector k. 
_mm512_cmpeq_epu32_mask^{⚠}  Experimental avx512f Compare packed unsigned 32-bit integers in a and b for equality, and store the results in mask vector k. 
_mm512_cmpeq_epu64_mask^{⚠}  Experimental avx512f Compare packed unsigned 64-bit integers in a and b for equality, and store the results in mask vector k. 
_mm512_cmpeq_pd_mask^{⚠}  Experimental avx512f Compare packed double-precision (64-bit) floating-point elements in a and b for equality, and store the results in mask vector k. 
_mm512_cmpeq_ps_mask^{⚠}  Experimental avx512f Compare packed single-precision (32-bit) floating-point elements in a and b for equality, and store the results in mask vector k. 
_mm512_cmpge_epi32_mask^{⚠}  Experimental avx512f Compare packed signed 32-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k. 
_mm512_cmpge_epi64_mask^{⚠}  Experimental avx512f Compare packed signed 64-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k. 
_mm512_cmpge_epu32_mask^{⚠}  Experimental avx512f Compare packed unsigned 32-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k. 
_mm512_cmpge_epu64_mask^{⚠}  Experimental avx512f Compare packed unsigned 64-bit integers in a and b for greater-than-or-equal, and store the results in mask vector k. 
_mm512_cmpgt_epi32_mask^{⚠}  Experimental avx512f Compare packed signed 32-bit integers in a and b for greater-than, and store the results in mask vector k. 
_mm512_cmpgt_epi64_mask^{⚠}  Experimental avx512f Compare packed signed 64-bit integers in a and b for greater-than, and store the results in mask vector k. 
_mm512_cmpgt_epu32_mask^{⚠}  Experimental avx512f Compare packed unsigned 32-bit integers in a and b for greater-than, and store the results in mask vector k. 
_mm512_cmpgt_epu64_mask^{⚠}  Experimental avx512f Compare packed unsigned 64-bit integers in a and b for greater-than, and store the results in mask vector k. 
_mm512_cmple_epi32_mask^{⚠}  Experimental avx512f Compare packed signed 32-bit integers in a and b for less-than-or-equal, and store the results in mask vector k. 
_mm512_cmple_epi64_mask^{⚠}  Experimental avx512f Compare packed signed 64-bit integers in a and b for less-than-or-equal, and store the results in mask vector k. 
_mm512_cmple_epu32_mask^{⚠}  Experimental avx512f Compare packed unsigned 32-bit integers in a and b for less-than-or-equal, and store the results in mask vector k. 
_mm512_cmple_epu64_mask^{⚠}  Experimental avx512f Compare packed unsigned 64-bit integers in a and b for less-than-or-equal, and store the results in mask vector k. 
_mm512_cmple_pd_mask^{⚠}  Experimental avx512f Compare packed double-precision (64-bit) floating-point elements in a and b for less-than-or-equal, and store the results in mask vector k. 
_mm512_cmple_ps_mask^{⚠}  Experimental avx512f Compare packed single-precision (32-bit) floating-point elements in a and b for less-than-or-equal, and store the results in mask vector k. 
_mm512_cmplt_epi32_mask^{⚠}  Experimental avx512f Compare packed signed 32-bit integers in a and b for less-than, and store the results in mask vector k. 
_mm512_cmplt_epi64_mask^{⚠}  Experimental avx512f Compare packed signed 64-bit integers in a and b for less-than, and store the results in mask vector k. 
_mm512_cmplt_epu32_mask^{⚠}  Experimental avx512f Compare packed unsigned 32-bit integers in a and b for less-than, and store the results in mask vector k. 
_mm512_cmplt_epu64_mask^{⚠}  Experimental avx512f Compare packed unsigned 64-bit integers in a and b for less-than, and store the results in mask vector k. 
_mm512_cmplt_pd_mask^{⚠}  Experimental avx512f Compare packed double-precision (64-bit) floating-point elements in a and b for less-than, and store the results in mask vector k. 
_mm512_cmplt_ps_mask^{⚠}  Experimental avx512f Compare packed single-precision (32-bit) floating-point elements in a and b for less-than, and store the results in mask vector k. 
_mm512_cmpneq_epi32_mask^{⚠}  Experimental avx512f Compare packed 32-bit integers in a and b for not-equal, and store the results in mask vector k. 
_mm512_cmpneq_epi64_mask^{⚠}  Experimental avx512f Compare packed signed 64-bit integers in a and b for not-equal, and store the results in mask vector k. 
_mm512_cmpneq_epu32_mask^{⚠}  Experimental avx512f Compare packed unsigned 32-bit integers in a and b for not-equal, and store the results in mask vector k. 
_mm512_cmpneq_epu64_mask^{⚠}  Experimental avx512f Compare packed unsigned 64-bit integers in a and b for not-equal, and store the results in mask vector k. 
_mm512_cmpneq_pd_mask^{⚠}  Experimental avx512f Compare packed double-precision (64-bit) floating-point elements in a and b for not-equal, and store the results in mask vector k. 
_mm512_cmpneq_ps_mask^{⚠}  Experimental avx512f Compare packed single-precision (32-bit) floating-point elements in a and b for not-equal, and store the results in mask vector k. 
_mm512_cmpnle_pd_mask^{⚠}  Experimental avx512f Compare packed double-precision (64-bit) floating-point elements in a and b for not-less-than-or-equal, and store the results in mask vector k. 
_mm512_cmpnle_ps_mask^{⚠}  Experimental avx512f Compare packed single-precision (32-bit) floating-point elements in a and b for not-less-than-or-equal, and store the results in mask vector k. 
_mm512_cmpnlt_pd_mask^{⚠}  Experimental avx512f Compare packed double-precision (64-bit) floating-point elements in a and b for not-less-than, and store the results in mask vector k. 
_mm512_cmpnlt_ps_mask^{⚠}  Experimental avx512f Compare packed single-precision (32-bit) floating-point elements in a and b for not-less-than, and store the results in mask vector k. 
_mm512_cmpord_pd_mask^{⚠}  Experimental avx512f Compare packed double-precision (64-bit) floating-point elements in a and b to see if neither is NaN, and store the results in mask vector k. 
_mm512_cmpord_ps_mask^{⚠}  Experimental avx512f Compare packed single-precision (32-bit) floating-point elements in a and b to see if neither is NaN, and store the results in mask vector k. 
_mm512_cmpunord_pd_mask^{⚠}  Experimental avx512f Compare packed double-precision (64-bit) floating-point elements in a and b to see if either is NaN, and store the results in mask vector k. 
_mm512_cmpunord_ps_mask^{⚠}  Experimental avx512f Compare packed single-precision (32-bit) floating-point elements in a and b to see if either is NaN, and store the results in mask vector k. 
_mm512_cvt_roundepi32_ps^{⚠}  Experimental avx512f Convert packed signed 32-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst. 
_mm512_cvt_roundepu32_ps^{⚠}  Experimental avx512f Convert packed unsigned 32-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst. 
_mm512_cvt_roundpd_epi32^{⚠}  Experimental avx512f Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst. 
_mm512_cvt_roundpd_epu32^{⚠}  Experimental avx512f Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst. 
_mm512_cvt_roundpd_ps^{⚠}  Experimental avx512f Convert packed double-precision (64-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst. 
_mm512_cvt_roundph_ps^{⚠}  Experimental avx512f Convert packed half-precision (16-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst. 
_mm512_cvt_roundps_epi32^{⚠}  Experimental avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst. 
_mm512_cvt_roundps_epu32^{⚠}  Experimental avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst. 
_mm512_cvt_roundps_pd^{⚠}  Experimental avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed double-precision (64-bit) floating-point elements, and store the results in dst. 
_mm512_cvt_roundps_ph^{⚠}  Experimental avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed half-precision (16-bit) floating-point elements, and store the results in dst. 
_mm512_cvtepi8_epi32^{⚠}  Experimental avx512f Sign extend packed 8-bit integers in a to packed 32-bit integers, and store the results in dst. 
_mm512_cvtepi8_epi64^{⚠}  Experimental avx512f Sign extend packed 8-bit integers in the low 8 bytes of a to packed 64-bit integers, and store the results in dst. 
_mm512_cvtepi16_epi32^{⚠}  Experimental avx512f Sign extend packed 16-bit integers in a to packed 32-bit integers, and store the results in dst. 
_mm512_cvtepi16_epi64^{⚠}  Experimental avx512f Sign extend packed 16-bit integers in a to packed 64-bit integers, and store the results in dst. 
_mm512_cvtepi32_epi8^{⚠}  Experimental avx512f Convert packed 32-bit integers in a to packed 8-bit integers with truncation, and store the results in dst. 
_mm512_cvtepi32_epi16^{⚠}  Experimental avx512f Convert packed 32-bit integers in a to packed 16-bit integers with truncation, and store the results in dst. 
_mm512_cvtepi32_epi64^{⚠}  Experimental avx512f Sign extend packed 32-bit integers in a to packed 64-bit integers, and store the results in dst. 
_mm512_cvtepi32_pd^{⚠}  Experimental avx512f Convert packed signed 32-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst. 
_mm512_cvtepi32_ps^{⚠}  Experimental avx512f Convert packed signed 32-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst. 
_mm512_cvtepi32lo_pd^{⚠}  Experimental avx512f Performs element-by-element conversion of the lower half of packed 32-bit integer elements in v2 to packed double-precision (64-bit) floating-point elements, storing the results in dst. 
_mm512_cvtepi64_epi8^{⚠}  Experimentalavx512f Convert packed 64bit integers in a to packed 8bit integers with truncation, and store the results in dst. 
_mm512_cvtepi64_epi16^{⚠}  Experimentalavx512f Convert packed 64bit integers in a to packed 16bit integers with truncation, and store the results in dst. 
_mm512_cvtepi64_epi32^{⚠}  Experimentalavx512f Convert packed 64bit integers in a to packed 32bit integers with truncation, and store the results in dst. 
_mm512_cvtepu8_epi32^{⚠}  Experimentalavx512f Zero extend packed unsigned 8bit integers in a to packed 32bit integers, and store the results in dst. 
_mm512_cvtepu8_epi64^{⚠}  Experimentalavx512f Zero extend packed unsigned 8bit integers in the low 8 bytes of a to packed 64bit integers, and store the results in dst. 
_mm512_cvtepu16_epi32^{⚠}  Experimentalavx512f Zero extend packed unsigned 16bit integers in a to packed 32bit integers, and store the results in dst. 
_mm512_cvtepu16_epi64^{⚠}  Experimentalavx512f Zero extend packed unsigned 16bit integers in a to packed 64bit integers, and store the results in dst. 
_mm512_cvtepu32_epi64^{⚠}  Experimentalavx512f Zero extend packed unsigned 32bit integers in a to packed 64bit integers, and store the results in dst. 
_mm512_cvtepu32_pd^{⚠}  Experimentalavx512f Convert packed unsigned 32bit integers in a to packed doubleprecision (64bit) floatingpoint elements, and store the results in dst. 
_mm512_cvtepu32_ps^{⚠}  Experimentalavx512f Convert packed unsigned 32bit integers in a to packed singleprecision (32bit) floatingpoint elements, and store the results in dst. 
_mm512_cvtepu32lo_pd^{⚠}  Experimentalavx512f Performs elementbyelement conversion of the lower half of packed 32bit unsigned integer elements in v2 to packed doubleprecision (64bit) floatingpoint elements, storing the results in dst. 
_mm512_cvtpd_ps^{⚠}  Experimentalavx512f Convert packed doubleprecision (64bit) floatingpoint elements in a to packed singleprecision (32bit) floatingpoint elements, and store the results in dst. 
_mm512_cvtpd_pslo^{⚠}  Experimentalavx512f Performs an elementbyelement conversion of packed doubleprecision (64bit) floatingpoint elements in v2 to singleprecision (32bit) floatingpoint elements and stores them in dst. The elements are stored in the lower half of the results vector, while the remaining upper half locations are set to 0. 
_mm512_cvtph_ps^{⚠}  Experimentalavx512f Convert packed halfprecision (16bit) floatingpoint elements in a to packed singleprecision (32bit) floatingpoint elements, and store the results in dst. 
_mm512_cvtps_epi32^{⚠}  Experimentalavx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed 32bit integers, and store the results in dst. 
_mm512_cvtps_epu32^{⚠}  Experimentalavx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed unsigned 32bit integers, and store the results in dst. 
_mm512_cvtps_pd^{⚠}  Experimentalavx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed doubleprecision (64bit) floatingpoint elements, and store the results in dst. 
_mm512_cvtps_ph^{⚠}  Experimentalavx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed halfprecision (16bit) floatingpoint elements, and store the results in dst. 
_mm512_cvtpslo_pd^{⚠}  Experimentalavx512f Performs elementbyelement conversion of the lower half of packed singleprecision (32bit) floatingpoint elements in v2 to packed doubleprecision (64bit) floatingpoint elements, storing the results in dst. 
_mm512_cvtsepi32_epi8^{⚠}  Experimentalavx512f Convert packed signed 32bit integers in a to packed 8bit integers with signed saturation, and store the results in dst. 
_mm512_cvtsepi32_epi16^{⚠}  Experimentalavx512f Convert packed signed 32bit integers in a to packed 16bit integers with signed saturation, and store the results in dst. 
_mm512_cvtsepi64_epi8^{⚠}  Experimentalavx512f Convert packed signed 64bit integers in a to packed 8bit integers with signed saturation, and store the results in dst. 
_mm512_cvtsepi64_epi16^{⚠}  Experimentalavx512f Convert packed signed 64bit integers in a to packed 16bit integers with signed saturation, and store the results in dst. 
_mm512_cvtsepi64_epi32^{⚠}  Experimentalavx512f Convert packed signed 64bit integers in a to packed 32bit integers with signed saturation, and store the results in dst. 
_mm512_cvtt_roundpd_epi32^{⚠}  Experimentalavx512f Convert packed doubleprecision (64bit) floatingpoint elements in a to packed 32bit integers with truncation, and store the results in dst. 
_mm512_cvtt_roundpd_epu32^{⚠}  Experimentalavx512f Convert packed doubleprecision (64bit) floatingpoint elements in a to packed unsigned 32bit integers with truncation, and store the results in dst. 
_mm512_cvtt_roundps_epi32^{⚠}  Experimentalavx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed 32bit integers with truncation, and store the results in dst. 
_mm512_cvtt_roundps_epu32^{⚠}  Experimentalavx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed unsigned 32bit integers with truncation, and store the results in dst. 
_mm512_cvttpd_epi32^{⚠}  Experimentalavx512f Convert packed doubleprecision (64bit) floatingpoint elements in a to packed 32bit integers with truncation, and store the results in dst. 
_mm512_cvttpd_epu32^{⚠}  Experimentalavx512f Convert packed doubleprecision (64bit) floatingpoint elements in a to packed unsigned 32bit integers with truncation, and store the results in dst. 
_mm512_cvttps_epi32^{⚠}  Experimentalavx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed 32bit integers with truncation, and store the results in dst. 
_mm512_cvttps_epu32^{⚠}  Experimentalavx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed unsigned 32bit integers with truncation, and store the results in dst. 
_mm512_cvtusepi32_epi8^{⚠}  Experimentalavx512f Convert packed unsigned 32bit integers in a to packed unsigned 8bit integers with unsigned saturation, and store the results in dst. 
_mm512_cvtusepi32_epi16^{⚠}  Experimentalavx512f Convert packed unsigned 32bit integers in a to packed unsigned 16bit integers with unsigned saturation, and store the results in dst. 
_mm512_cvtusepi64_epi8^{⚠}  Experimentalavx512f Convert packed unsigned 64bit integers in a to packed unsigned 8bit integers with unsigned saturation, and store the results in dst. 
_mm512_cvtusepi64_epi16^{⚠}  Experimentalavx512f Convert packed unsigned 64bit integers in a to packed unsigned 16bit integers with unsigned saturation, and store the results in dst. 
_mm512_cvtusepi64_epi32^{⚠}  Experimentalavx512f Convert packed unsigned 64bit integers in a to packed unsigned 32bit integers with unsigned saturation, and store the results in dst. 
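The narrowing conversions above differ only in how out-of-range values are handled: truncation, signed saturation, or unsigned saturation. The following is a minimal scalar sketch of one lane of the 32bit to 8bit variants. It does not call the intrinsics; the function names are hypothetical and chosen for illustration only.

```rust
/// Per-lane model of truncating narrowing (as described for
/// `_mm512_cvtepi32_epi8`): keep the low 8 bits, discard the rest.
fn narrow_truncate(x: i32) -> i8 {
    x as i8
}

/// Per-lane model of signed saturation (as described for
/// `_mm512_cvtsepi32_epi8`): clamp to the i8 range, then narrow.
fn narrow_signed_saturate(x: i32) -> i8 {
    x.clamp(i8::MIN as i32, i8::MAX as i32) as i8
}

/// Per-lane model of unsigned saturation (as described for
/// `_mm512_cvtusepi32_epi8`): the source is treated as unsigned
/// and clamped to the u8 range.
fn narrow_unsigned_saturate(x: u32) -> u8 {
    x.min(u8::MAX as u32) as u8
}
```

For example, the input 300 truncates to 44 (its low 8 bits), but saturates to 127 as a signed value and to 255 as an unsigned value.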
_mm512_div_pd^{⚠}  Experimentalavx512f Divide packed doubleprecision (64bit) floatingpoint elements in a by packed elements in b, and store the results in dst. 
_mm512_div_ps^{⚠}  Experimentalavx512f Divide packed singleprecision (32bit) floatingpoint elements in a by packed elements in b, and store the results in dst. 
_mm512_div_round_pd^{⚠}  Experimentalavx512f Divide packed doubleprecision (64bit) floatingpoint elements in a by packed elements in b, and store the results in dst. 
_mm512_div_round_ps^{⚠}  Experimentalavx512f Divide packed singleprecision (32bit) floatingpoint elements in a by packed elements in b, and store the results in dst. 
_mm512_extractf32x4_ps^{⚠}  Experimentalavx512f Extract 128 bits (composed of 4 packed singleprecision (32bit) floatingpoint elements) from a, selected with imm8, and store the result in dst. 
_mm512_extractf64x4_pd^{⚠}  Experimentalavx512f Extract 256 bits (composed of 4 packed doubleprecision (64bit) floatingpoint elements) from a, selected with imm8, and store the result in dst. 
_mm512_extracti32x4_epi32^{⚠}  Experimentalavx512f Extract 128 bits (composed of 4 packed 32bit integers) from a, selected with imm8, and store the result in dst. 
_mm512_extracti64x4_epi64^{⚠}  Experimentalavx512f Extract 256 bits (composed of 4 packed 64bit integers) from a, selected with imm8, and store the result in dst. 
_mm512_fixupimm_pd^{⚠}  Experimentalavx512f Fix up packed doubleprecision (64bit) floatingpoint elements in a and b using packed 64bit integers in c, and store the results in dst. imm8 is used to set the required flags reporting. 
_mm512_fixupimm_ps^{⚠}  Experimentalavx512f Fix up packed singleprecision (32bit) floatingpoint elements in a and b using packed 32bit integers in c, and store the results in dst. imm8 is used to set the required flags reporting. 
_mm512_fixupimm_round_pd^{⚠}  Experimentalavx512f Fix up packed doubleprecision (64bit) floatingpoint elements in a and b using packed 64bit integers in c, and store the results in dst. imm8 is used to set the required flags reporting. 
_mm512_fixupimm_round_ps^{⚠}  Experimentalavx512f Fix up packed singleprecision (32bit) floatingpoint elements in a and b using packed 32bit integers in c, and store the results in dst. imm8 is used to set the required flags reporting. 
_mm512_fmadd_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, add the intermediate result to packed elements in c, and store the results in dst. 
_mm512_fmadd_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, add the intermediate result to packed elements in c, and store the results in dst. 
_mm512_fmadd_round_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, add the intermediate result to packed elements in c, and store the results in dst. 
_mm512_fmadd_round_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, add the intermediate result to packed elements in c, and store the results in dst. 
_mm512_fmaddsub_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst. 
_mm512_fmaddsub_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst. 
_mm512_fmaddsub_round_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst. 
_mm512_fmaddsub_round_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst. 
_mm512_fmsub_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst. 
_mm512_fmsub_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst. 
_mm512_fmsub_round_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst. 
_mm512_fmsub_round_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst. 
_mm512_fmsubadd_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst. 
_mm512_fmsubadd_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst. 
_mm512_fmsubadd_round_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst. 
_mm512_fmsubadd_round_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst. 
_mm512_fnmadd_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst. 
_mm512_fnmadd_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst. 
_mm512_fnmadd_round_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst. 
_mm512_fnmadd_round_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst. 
_mm512_fnmsub_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst. 
_mm512_fnmsub_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst. 
_mm512_fnmsub_round_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst. 
_mm512_fnmsub_round_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst. 
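The fused multiply families above differ only in the sign applied to the product and to c. The following scalar sketch models them on eight f64 lanes. It does not call the intrinsics, and the helper names are hypothetical. It assumes that in the fmaddsub form the even-indexed lanes subtract c and the odd-indexed lanes add it, with fmsubadd reversed; the descriptions above state only that the two operations alternate.

```rust
/// Apply `op(lane_index, a, b, c)` to each of the eight lanes.
fn lanewise(
    a: [f64; 8],
    b: [f64; 8],
    c: [f64; 8],
    op: impl Fn(usize, f64, f64, f64) -> f64,
) -> [f64; 8] {
    let mut dst = [0.0f64; 8];
    for i in 0..8 {
        dst[i] = op(i, a[i], b[i], c[i]);
    }
    dst
}

// `mul_add` computes x * y + z with a single rounding.

/// (a * b) + c
fn fmadd(a: [f64; 8], b: [f64; 8], c: [f64; 8]) -> [f64; 8] {
    lanewise(a, b, c, |_, x, y, z| x.mul_add(y, z))
}

/// (a * b) - c
fn fmsub(a: [f64; 8], b: [f64; 8], c: [f64; 8]) -> [f64; 8] {
    lanewise(a, b, c, |_, x, y, z| x.mul_add(y, -z))
}

/// -(a * b) + c
fn fnmadd(a: [f64; 8], b: [f64; 8], c: [f64; 8]) -> [f64; 8] {
    lanewise(a, b, c, |_, x, y, z| (-x).mul_add(y, z))
}

/// -(a * b) - c
fn fnmsub(a: [f64; 8], b: [f64; 8], c: [f64; 8]) -> [f64; 8] {
    lanewise(a, b, c, |_, x, y, z| (-x).mul_add(y, -z))
}

/// Even lanes: (a * b) - c, odd lanes: (a * b) + c (assumed lane order).
fn fmaddsub(a: [f64; 8], b: [f64; 8], c: [f64; 8]) -> [f64; 8] {
    lanewise(a, b, c, |i, x, y, z| {
        if i % 2 == 0 { x.mul_add(y, -z) } else { x.mul_add(y, z) }
    })
}

/// Even lanes: (a * b) + c, odd lanes: (a * b) - c (assumed lane order).
fn fmsubadd(a: [f64; 8], b: [f64; 8], c: [f64; 8]) -> [f64; 8] {
    lanewise(a, b, c, |i, x, y, z| {
        if i % 2 == 0 { x.mul_add(y, z) } else { x.mul_add(y, -z) }
    })
}
```

With every lane of a set to 2, b to 3 and c to 1, fmadd gives 7 in each lane, fnmadd gives -5, and fmaddsub alternates between 5 and 7.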
_mm512_getexp_pd^{⚠}  Experimentalavx512f Convert the exponent of each packed doubleprecision (64bit) floatingpoint element in a to a doubleprecision (64bit) floatingpoint number representing the integer exponent, and store the results in dst. This intrinsic essentially calculates floor(log2(x)) for each element. 
_mm512_getexp_ps^{⚠}  Experimentalavx512f Convert the exponent of each packed singleprecision (32bit) floatingpoint element in a to a singleprecision (32bit) floatingpoint number representing the integer exponent, and store the results in dst. This intrinsic essentially calculates floor(log2(x)) for each element. 
_mm512_getexp_round_pd^{⚠}  Experimentalavx512f Convert the exponent of each packed doubleprecision (64bit) floatingpoint element in a to a doubleprecision (64bit) floatingpoint number representing the integer exponent, and store the results in dst. This intrinsic essentially calculates floor(log2(x)) for each element. 
_mm512_getexp_round_ps^{⚠}  Experimentalavx512f Convert the exponent of each packed singleprecision (32bit) floatingpoint element in a to a singleprecision (32bit) floatingpoint number representing the integer exponent, and store the results in dst. This intrinsic essentially calculates floor(log2(x)) for each element. 
_mm512_getmant_pd^{⚠}  Experimentalavx512f Normalize the mantissas of packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst. This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. 
_mm512_getmant_ps^{⚠}  Experimentalavx512f Normalize the mantissas of packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst. This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. The mantissa is normalized to the interval specified by interv, which can take the following values: _MM_MANT_NORM_1_2 (interval [1, 2)), _MM_MANT_NORM_P5_2 (interval [0.5, 2)), _MM_MANT_NORM_P5_1 (interval [0.5, 1)), _MM_MANT_NORM_P75_1P5 (interval [0.75, 1.5)). The sign is determined by sc, which can take the following values: _MM_MANT_SIGN_SRC (sign = sign(src)), _MM_MANT_SIGN_ZERO (sign = 0), _MM_MANT_SIGN_NAN (dst = NaN if sign(src) = 1). 
_mm512_getmant_round_pd^{⚠}  Experimentalavx512f Normalize the mantissas of packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst. This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. 
_mm512_getmant_round_ps^{⚠}  Experimentalavx512f Normalize the mantissas of packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst. This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. 
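The getexp and getmant entries above split a floating-point value into its exponent and its normalized mantissa. The following scalar sketch models one f64 lane by reading the IEEE 754 bit fields directly. It does not call the intrinsics and the function names are hypothetical. It assumes a normal, finite, nonzero input (zeros, subnormals, infinities and NaN are not handled), and the getmant model covers only the _MM_MANT_NORM_1_2 interval with _MM_MANT_SIGN_SRC.

```rust
/// Model of getexp for one normal f64 lane: the unbiased exponent,
/// returned as a float. This equals floor(log2(|x|)).
fn getexp_normal(x: f64) -> f64 {
    let biased = ((x.to_bits() >> 52) & 0x7ff) as i64;
    (biased - 1023) as f64
}

/// Model of getmant for one normal f64 lane with interval [1, 2)
/// and the sign taken from the source: keep the sign and fraction
/// bits and force the exponent field to the bias (2^0).
fn getmant_norm_1_2_sign_src(x: f64) -> f64 {
    let sign_and_fraction = x.to_bits() & 0x800f_ffff_ffff_ffff;
    f64::from_bits(sign_and_fraction | (1023u64 << 52))
}
```

For 10.0, which is 1.25 * 2^3, the two functions return 3.0 and 1.25; multiplying the mantissa by 2 raised to the exponent recovers the input.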
_mm512_i32gather_epi32^{⚠}  Experimentalavx512f Gather 32bit integers from memory using 32bit indices. 32bit elements are loaded from addresses starting at base_addr and offset by each 32bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst. scale should be 1, 2, 4 or 8. 
_mm512_i32gather_epi64^{⚠}  Experimentalavx512f Gather 64bit integers from memory using 32bit indices. 64bit elements are loaded from addresses starting at base_addr and offset by each 32bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst. scale should be 1, 2, 4 or 8. 
_mm512_i32gather_pd^{⚠}  Experimentalavx512f Gather doubleprecision (64bit) floatingpoint elements from memory using 32bit indices. 64bit elements are loaded from addresses starting at base_addr and offset by each 32bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst. scale should be 1, 2, 4 or 8. 
_mm512_i32gather_ps^{⚠}  Experimentalavx512f Gather singleprecision (32bit) floatingpoint elements from memory using 32bit indices. 32bit elements are loaded from addresses starting at base_addr and offset by each 32bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst. scale should be 1, 2, 4 or 8. 
_mm512_i32scatter_epi32^{⚠}  Experimentalavx512f Scatter 32bit integers from a into memory using 32bit indices. 32bit elements are stored at addresses starting at base_addr and offset by each 32bit element in vindex (each index is scaled by the factor in scale). scale should be 1, 2, 4 or 8. 
_mm512_i32scatter_epi64^{⚠}  Experimentalavx512f Scatter 64bit integers from a into memory using 32bit indices. 64bit elements are stored at addresses starting at base_addr and offset by each 32bit element in vindex (each index is scaled by the factor in scale). scale should be 1, 2, 4 or 8. 
_mm512_i32scatter_pd^{⚠}  Experimentalavx512f Scatter doubleprecision (64bit) floatingpoint elements from a into memory using 32bit indices. 64bit elements are stored at addresses starting at base_addr and offset by each 32bit element in vindex (each index is scaled by the factor in scale). scale should be 1, 2, 4 or 8. 
_mm512_i32scatter_ps^{⚠}  Experimentalavx512f Scatter singleprecision (32bit) floatingpoint elements from a into memory using 32bit indices. 32bit elements are stored at addresses starting at base_addr and offset by each 32bit element in vindex (each index is scaled by the factor in scale). scale should be 1, 2, 4 or 8. 
_mm512_i64gather_epi32^{⚠}  Experimentalavx512f Gather 32bit integers from memory using 64bit indices. 32bit elements are loaded from addresses starting at base_addr and offset by each 64bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst. scale should be 1, 2, 4 or 8. 
_mm512_i64gather_epi64^{⚠}  Experimentalavx512f Gather 64bit integers from memory using 64bit indices. 64bit elements are loaded from addresses starting at base_addr and offset by each 64bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst. scale should be 1, 2, 4 or 8. 
_mm512_i64gather_pd^{⚠}  Experimentalavx512f Gather doubleprecision (64bit) floatingpoint elements from memory using 64bit indices. 64bit elements are loaded from addresses starting at base_addr and offset by each 64bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst. scale should be 1, 2, 4 or 8. 
_mm512_i64gather_ps^{⚠}  Experimentalavx512f Gather singleprecision (32bit) floatingpoint elements from memory using 64bit indices. 32bit elements are loaded from addresses starting at base_addr and offset by each 64bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst. scale should be 1, 2, 4 or 8. 
_mm512_i64scatter_epi32^{⚠}  Experimentalavx512f Scatter 32bit integers from a into memory using 64bit indices. 32bit elements are stored at addresses starting at base_addr and offset by each 64bit element in vindex (each index is scaled by the factor in scale). scale should be 1, 2, 4 or 8. 
_mm512_i64scatter_epi64^{⚠}  Experimentalavx512f Scatter 64bit integers from a into memory using 64bit indices. 64bit elements are stored at addresses starting at base_addr and offset by each 64bit element in vindex (each index is scaled by the factor in scale). scale should be 1, 2, 4 or 8. 
_mm512_i64scatter_pd^{⚠}  Experimentalavx512f Scatter doubleprecision (64bit) floatingpoint elements from a into memory using 64bit indices. 64bit elements are stored at addresses starting at base_addr and offset by each 64bit element in vindex (each index is scaled by the factor in scale). scale should be 1, 2, 4 or 8. 
_mm512_i64scatter_ps^{⚠}  Experimentalavx512f Scatter singleprecision (32bit) floatingpoint elements from a into memory using 64bit indices. 32bit elements are stored at addresses starting at base_addr and offset by each 64bit element in vindex (each index is scaled by the factor in scale). scale should be 1, 2, 4 or 8. 
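The gather and scatter entries above all use the same addressing rule: the byte address of each lane is base_addr plus the lane's index multiplied by scale. The following scalar sketch models that rule for sixteen 32bit integers with 32bit indices over a byte slice. It does not call the intrinsics; the function names are hypothetical, and it assumes non-negative indices and little-endian element storage, as on x86_64.

```rust
/// Byte offset of one lane: index * scale, with scale restricted to 1, 2, 4 or 8.
fn lane_offset(index: i32, scale: usize) -> usize {
    assert!(matches!(scale, 1 | 2 | 4 | 8), "scale should be 1, 2, 4 or 8");
    assert!(index >= 0, "this sketch handles non-negative indices only");
    index as usize * scale
}

/// Scalar model of a 16-lane gather of 32-bit integers.
fn gather_i32(base: &[u8], vindex: &[i32; 16], scale: usize) -> [i32; 16] {
    let mut dst = [0i32; 16];
    for (lane, &index) in vindex.iter().enumerate() {
        let o = lane_offset(index, scale);
        dst[lane] = i32::from_le_bytes([base[o], base[o + 1], base[o + 2], base[o + 3]]);
    }
    dst
}

/// Scalar model of a 16-lane scatter of 32-bit integers.
fn scatter_i32(base: &mut [u8], vindex: &[i32; 16], a: &[i32; 16], scale: usize) {
    for (lane, &index) in vindex.iter().enumerate() {
        let o = lane_offset(index, scale);
        base[o..o + 4].copy_from_slice(&a[lane].to_le_bytes());
    }
}
```

With scale set to 4, the indices address consecutive 32bit elements; with scale set to 8, each index step skips one element.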
_mm512_insertf32x4^{⚠}  Experimentalavx512f Copy a to dst, then insert 128 bits (composed of 4 packed singleprecision (32bit) floatingpoint elements) from b into dst at the location specified by imm8. 
_mm512_insertf64x4^{⚠}  Experimentalavx512f Copy a to dst, then insert 256 bits (composed of 4 packed doubleprecision (64bit) floatingpoint elements) from b into dst at the location specified by imm8. 
_mm512_inserti32x4^{⚠}  Experimentalavx512f Copy a to dst, then insert 128 bits (composed of 4 packed 32bit integers) from b into dst at the location specified by imm8. 
_mm512_inserti64x4^{⚠}  Experimentalavx512f Copy a to dst, then insert 256 bits (composed of 4 packed 64bit integers) from b into dst at the location specified by imm8. 
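The extract and insert entries above select one aligned block of lanes with imm8. The following scalar sketch models the 128bit single-precision variants on an array of sixteen f32 lanes. It does not call the intrinsics; the function names are hypothetical, and it assumes only the low two bits of imm8 select among the four 128bit blocks.

```rust
/// Model of extracting one 128-bit block (4 f32 lanes) selected by imm8.
fn extract_f32x4(a: [f32; 16], imm8: usize) -> [f32; 4] {
    let start = (imm8 & 0b11) * 4;
    [a[start], a[start + 1], a[start + 2], a[start + 3]]
}

/// Model of copying `a` to dst, then overwriting the block selected by imm8 with `b`.
fn insert_f32x4(a: [f32; 16], b: [f32; 4], imm8: usize) -> [f32; 16] {
    let mut dst = a;
    let start = (imm8 & 0b11) * 4;
    dst[start..start + 4].copy_from_slice(&b);
    dst
}
```

Extracting block 2 from lanes numbered 0 to 15 yields lanes 8 to 11; inserting at block 1 overwrites lanes 4 to 7 and leaves the others unchanged.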
_mm512_int2mask^{⚠}  Experimentalavx512f Converts integer mask into bitmask, storing the result in dst. 
_mm512_kand^{⚠}  Experimentalavx512f Compute the bitwise AND of 16bit masks a and b, and store the result in k. 
_mm512_kandn^{⚠}  Experimentalavx512f Compute the bitwise NOT of 16bit mask a and then AND with b, and store the result in k. 
_mm512_kmov^{⚠}  Experimentalavx512f Copy 16bit mask a to k. 
_mm512_knot^{⚠}  Experimentalavx512f Compute the bitwise NOT of 16bit mask a, and store the result in k. 
_mm512_kor^{⚠}  Experimentalavx512f Compute the bitwise OR of 16bit masks a and b, and store the result in k. 
_mm512_kortestc^{⚠}  Experimentalavx512f Performs bitwise OR between k1 and k2, storing the result in dst. CF flag is set if dst consists of all 1's. 
_mm512_kunpackb^{⚠}  Experimentalavx512f Unpack and interleave 8 bits from masks a and b, and store the 16bit result in k. 
_mm512_kxnor^{⚠}  Experimentalavx512f Compute the bitwise XNOR of 16bit masks a and b, and store the result in k. 
_mm512_kxor^{⚠}  Experimentalavx512f Compute the bitwise XOR of 16bit masks a and b, and store the result in k. 
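The 16bit mask operations above are ordinary bitwise operations on a 16bit value. The following scalar sketch models the less obvious ones on u16. It does not call the intrinsics; the names are hypothetical. For the unpack operation it assumes that the low 8 bits of b form the low byte of the result and the low 8 bits of a form the high byte.

```rust
type Mask16 = u16;

/// NOT of `a`, then AND with `b`.
fn kandn(a: Mask16, b: Mask16) -> Mask16 {
    !a & b
}

/// XNOR: bits are set where `a` and `b` agree.
fn kxnor(a: Mask16, b: Mask16) -> Mask16 {
    !(a ^ b)
}

/// Combine the low 8 bits of `a` (high byte) and `b` (low byte);
/// the byte order is an assumption of this sketch.
fn kunpackb(a: Mask16, b: Mask16) -> Mask16 {
    ((a & 0x00ff) << 8) | (b & 0x00ff)
}

/// Model of the carry-flag result of an OR test: true when the OR
/// of the two masks consists of all 1's.
fn kortestc(k1: Mask16, k2: Mask16) -> bool {
    (k1 | k2) == 0xffff
}
```

For example, kandn applied to 0b1100 and 0b1010 leaves only bit 1 set, because that is the only bit set in b and clear in a.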
_mm512_load_epi32^{⚠}  Experimentalavx512f Load 512bits (composed of 16 packed 32bit integers) from memory into dst. mem_addr must be aligned on a 64byte boundary or a generalprotection exception may be generated. 
_mm512_load_epi64^{⚠}  Experimentalavx512f Load 512bits (composed of 8 packed 64bit integers) from memory into dst. mem_addr must be aligned on a 64byte boundary or a generalprotection exception may be generated. 
_mm512_load_pd^{⚠}  Experimentalavx512f Load 512bits (composed of 8 packed doubleprecision (64bit) floatingpoint elements) from memory into dst. mem_addr must be aligned on a 64byte boundary or a generalprotection exception may be generated. 
_mm512_load_ps^{⚠}  Experimentalavx512f Load 512bits (composed of 16 packed singleprecision (32bit) floatingpoint elements) from memory into dst. mem_addr must be aligned on a 64byte boundary or a generalprotection exception may be generated. 
_mm512_load_si512^{⚠}  Experimentalavx512f Load 512bits of integer data from memory into dst. mem_addr must be aligned on a 64byte boundary or a generalprotection exception may be generated. 
_mm512_loadu_epi32^{⚠}  Experimentalavx512f Load 512bits (composed of 16 packed 32bit integers) from memory into dst. mem_addr does not need to be aligned on any particular boundary. 
_mm512_loadu_epi64^{⚠}  Experimentalavx512f Load 512bits (composed of 8 packed 64bit integers) from memory into dst. mem_addr does not need to be aligned on any particular boundary. 
_mm512_loadu_pd^{⚠}  Experimentalavx512f Loads 512bits (composed of 8 packed doubleprecision (64bit) floatingpoint elements) from memory into result. 
_mm512_loadu_ps^{⚠}  Experimentalavx512f Loads 512bits (composed of 16 packed singleprecision (32bit) floatingpoint elements) from memory into result. 
_mm512_loadu_si512^{⚠}  Experimentalavx512f Load 512bits of integer data from memory into dst. mem_addr does not need to be aligned on any particular boundary. 
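The aligned load entries above require mem_addr to sit on a 64byte boundary, while the loadu variants accept any address. The following is a minimal sketch of one way to obtain and check such an address in Rust. The type and function names are hypothetical and it does not call the intrinsics.

```rust
/// A 64-byte buffer whose alignment is raised to 64 bytes, so a pointer
/// to its contents satisfies the aligned-load requirement.
#[repr(C, align(64))]
struct Aligned64([i32; 16]);

/// Returns true when `ptr` lies on a 64-byte boundary.
fn is_aligned_to_64<T>(ptr: *const T) -> bool {
    (ptr as usize) % 64 == 0
}
```

A pointer obtained from a plain array or slice carries no such guarantee, which is the case the loadu variants are meant for.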
_mm512_madd52hi_epu64^{⚠}  Experimentalavx512ifma Multiply packed unsigned 52bit integers in each 64bit element of
_mm512_madd52lo_epu64^{⚠}  Experimentalavx512ifma Multiply packed unsigned 52bit integers in each 64bit element of
_mm512_mask2_permutex2var_epi32^{⚠}  Experimentalavx512f Shuffle 32bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from idx when the corresponding mask bit is not set). 
_mm512_mask2_permutex2var_epi64^{⚠}  Experimentalavx512f Shuffle 64bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from idx when the corresponding mask bit is not set). 
_mm512_mask2_permutex2var_pd^{⚠}  Experimentalavx512f Shuffle doubleprecision (64bit) floatingpoint elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from idx when the corresponding mask bit is not set). 
_mm512_mask2_permutex2var_ps^{⚠}  Experimentalavx512f Shuffle singleprecision (32bit) floatingpoint elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from idx when the corresponding mask bit is not set). 
_mm512_mask2int^{⚠}  Experimentalavx512f Converts bit mask k1 into an integer value, storing the results in dst. 
_mm512_mask3_fmadd_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmadd_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmadd_round_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmadd_round_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmaddsub_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmaddsub_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmaddsub_round_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmaddsub_round_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmsub_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmsub_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmsub_round_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmsub_round_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmsubadd_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmsubadd_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmsubadd_round_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmsubadd_round_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fnmadd_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fnmadd_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fnmadd_round_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fnmadd_round_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fnmsub_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fnmsub_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fnmsub_round_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fnmsub_round_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask_abs_epi32^{⚠}  Experimentalavx512f Computes the absolute value of packed 32bit integers in 
_mm512_mask_abs_epi64^{⚠}  Experimentalavx512f Compute the absolute value of packed signed 64bit integers in a, and store the unsigned results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_abs_pd^{⚠}  Experimentalavx512f Finds the absolute value of each packed doubleprecision (64bit) floatingpoint element in v2, storing the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_abs_ps^{⚠}  Experimentalavx512f Finds the absolute value of each packed singleprecision (32bit) floatingpoint element in v2, storing the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_add_epi32^{⚠}  Experimentalavx512f Add packed 32bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_add_epi64^{⚠}  Experimentalavx512f Add packed 64bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_add_pd^{⚠}  Experimentalavx512f Add packed doubleprecision (64bit) floatingpoint elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_add_ps^{⚠}  Experimentalavx512f Add packed singleprecision (32bit) floatingpoint elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_add_round_pd^{⚠}  Experimentalavx512f Add packed doubleprecision (64bit) floatingpoint elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_add_round_ps^{⚠}  Experimentalavx512f Add packed singleprecision (32bit) floatingpoint elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_alignr_epi32^{⚠}  Experimentalavx512f Concatenate a and b into a 128byte intermediate result, shift the result right by imm8 32bit elements, and store the low 64 bytes (16 elements) in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_alignr_epi64^{⚠}  Experimentalavx512f Concatenate a and b into a 128byte intermediate result, shift the result right by imm8 64bit elements, and store the low 64 bytes (8 elements) in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_and_epi32^{⚠}  Experimentalavx512f Performs elementbyelement bitwise AND between packed 32bit integer elements of v2 and v3, storing the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_and_epi64^{⚠}  Experimentalavx512f Compute the bitwise AND of packed 64bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_andnot_epi32^{⚠}  Experimentalavx512f Compute the bitwise NOT of packed 32bit integers in a and then AND with b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_andnot_epi64^{⚠}  Experimentalavx512f Compute the bitwise NOT of packed 64bit integers in a and then AND with b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_blend_epi32^{⚠}  Experimentalavx512f Blend packed 32bit integers from a and b using control mask k, and store the results in dst. 
_mm512_mask_blend_epi64^{⚠}  Experimentalavx512f Blend packed 64bit integers from a and b using control mask k, and store the results in dst. 
_mm512_mask_blend_pd^{⚠}  Experimentalavx512f Blend packed doubleprecision (64bit) floatingpoint elements from a and b using control mask k, and store the results in dst. 
_mm512_mask_blend_ps^{⚠}  Experimentalavx512f Blend packed singleprecision (32bit) floatingpoint elements from a and b using control mask k, and store the results in dst. 
_mm512_mask_broadcast_f32x4^{⚠}  Experimentalavx512f Broadcast the 4 packed singleprecision (32bit) floatingpoint elements from a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_broadcast_f64x4^{⚠}  Experimentalavx512f Broadcast the 4 packed doubleprecision (64bit) floatingpoint elements from a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_broadcast_i32x4^{⚠}  Experimentalavx512f Broadcast the 4 packed 32bit integers from a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_broadcast_i64x4^{⚠}  Experimentalavx512f Broadcast the 4 packed 64bit integers from a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_broadcastd_epi32^{⚠}  Experimentalavx512f Broadcast the low packed 32bit integer from a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_broadcastq_epi64^{⚠}  Experimentalavx512f Broadcast the low packed 64bit integer from a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_broadcastsd_pd^{⚠}  Experimentalavx512f Broadcast the low doubleprecision (64bit) floatingpoint element from a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_broadcastss_ps^{⚠}  Experimentalavx512f Broadcast the low singleprecision (32bit) floatingpoint element from a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cmp_epi32_mask^{⚠}  Experimentalavx512f Compare packed signed 32bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmp_epi64_mask^{⚠}  Experimentalavx512f Compare packed signed 64bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmp_epu32_mask^{⚠}  Experimentalavx512f Compare packed unsigned 32bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmp_epu64_mask^{⚠}  Experimentalavx512f Compare packed unsigned 64bit integers in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmp_pd_mask^{⚠}  Experimentalavx512f Compare packed doubleprecision (64bit) floatingpoint elements in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmp_ps_mask^{⚠}  Experimentalavx512f Compare packed singleprecision (32bit) floatingpoint elements in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmp_round_pd_mask^{⚠}  Experimentalavx512f Compare packed doubleprecision (64bit) floatingpoint elements in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmp_round_ps_mask^{⚠}  Experimentalavx512f Compare packed singleprecision (32bit) floatingpoint elements in a and b based on the comparison operand specified by imm8, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpeq_epi32_mask^{⚠}  Experimentalavx512f Compare packed 32bit integers in a and b for equality, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpeq_epi64_mask^{⚠}  Experimentalavx512f Compare packed 64bit integers in a and b for equality, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpeq_epu32_mask^{⚠}  Experimentalavx512f Compare packed unsigned 32bit integers in a and b for equality, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpeq_epu64_mask^{⚠}  Experimentalavx512f Compare packed unsigned 64bit integers in a and b for equality, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpeq_pd_mask^{⚠}  Experimentalavx512f Compare packed doubleprecision (64bit) floatingpoint elements in a and b for equality, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpeq_ps_mask^{⚠}  Experimentalavx512f Compare packed singleprecision (32bit) floatingpoint elements in a and b for equality, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpge_epi32_mask^{⚠}  Experimentalavx512f Compare packed signed 32bit integers in a and b for greaterthanorequal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpge_epi64_mask^{⚠}  Experimentalavx512f Compare packed signed 64bit integers in a and b for greaterthanorequal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpge_epu32_mask^{⚠}  Experimentalavx512f Compare packed unsigned 32bit integers in a and b for greaterthanorequal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpge_epu64_mask^{⚠}  Experimentalavx512f Compare packed unsigned 64bit integers in a and b for greaterthanorequal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpgt_epi32_mask^{⚠}  Experimentalavx512f Compare packed signed 32bit integers in a and b for greaterthan, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpgt_epi64_mask^{⚠}  Experimentalavx512f Compare packed signed 64bit integers in a and b for greaterthan, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpgt_epu32_mask^{⚠}  Experimentalavx512f Compare packed unsigned 32bit integers in a and b for greaterthan, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpgt_epu64_mask^{⚠}  Experimentalavx512f Compare packed unsigned 64bit integers in a and b for greaterthan, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmple_epi32_mask^{⚠}  Experimentalavx512f Compare packed signed 32bit integers in a and b for lessthanorequal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmple_epi64_mask^{⚠}  Experimentalavx512f Compare packed signed 64bit integers in a and b for lessthanorequal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmple_epu32_mask^{⚠}  Experimentalavx512f Compare packed unsigned 32bit integers in a and b for lessthanorequal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmple_epu64_mask^{⚠}  Experimentalavx512f Compare packed unsigned 64bit integers in a and b for lessthanorequal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmple_pd_mask^{⚠}  Experimentalavx512f Compare packed doubleprecision (64bit) floatingpoint elements in a and b for lessthanorequal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmple_ps_mask^{⚠}  Experimentalavx512f Compare packed singleprecision (32bit) floatingpoint elements in a and b for lessthanorequal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmplt_epi32_mask^{⚠}  Experimentalavx512f Compare packed signed 32bit integers in a and b for lessthan, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmplt_epi64_mask^{⚠}  Experimentalavx512f Compare packed signed 64bit integers in a and b for lessthan, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmplt_epu32_mask^{⚠}  Experimentalavx512f Compare packed unsigned 32bit integers in a and b for lessthan, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmplt_epu64_mask^{⚠}  Experimentalavx512f Compare packed unsigned 64bit integers in a and b for lessthan, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmplt_pd_mask^{⚠}  Experimentalavx512f Compare packed doubleprecision (64bit) floatingpoint elements in a and b for lessthan, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmplt_ps_mask^{⚠}  Experimentalavx512f Compare packed singleprecision (32bit) floatingpoint elements in a and b for lessthan, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpneq_epi32_mask^{⚠}  Experimentalavx512f Compare packed 32bit integers in a and b for notequal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpneq_epi64_mask^{⚠}  Experimentalavx512f Compare packed signed 64bit integers in a and b for notequal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpneq_epu32_mask^{⚠}  Experimentalavx512f Compare packed unsigned 32bit integers in a and b for notequal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpneq_epu64_mask^{⚠}  Experimentalavx512f Compare packed unsigned 64bit integers in a and b for notequal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpneq_pd_mask^{⚠}  Experimentalavx512f Compare packed doubleprecision (64bit) floatingpoint elements in a and b for notequal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpneq_ps_mask^{⚠}  Experimentalavx512f Compare packed singleprecision (32bit) floatingpoint elements in a and b for notequal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpnle_pd_mask^{⚠}  Experimentalavx512f Compare packed doubleprecision (64bit) floatingpoint elements in a and b for notlessthanorequal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpnle_ps_mask^{⚠}  Experimentalavx512f Compare packed singleprecision (32bit) floatingpoint elements in a and b for notlessthanorequal, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpnlt_pd_mask^{⚠}  Experimentalavx512f Compare packed doubleprecision (64bit) floatingpoint elements in a and b for notlessthan, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpnlt_ps_mask^{⚠}  Experimentalavx512f Compare packed singleprecision (32bit) floatingpoint elements in a and b for notlessthan, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpord_pd_mask^{⚠}  Experimentalavx512f Compare packed doubleprecision (64bit) floatingpoint elements in a and b to see if neither is NaN, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpord_ps_mask^{⚠}  Experimentalavx512f Compare packed singleprecision (32bit) floatingpoint elements in a and b to see if neither is NaN, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpunord_pd_mask^{⚠}  Experimentalavx512f Compare packed doubleprecision (64bit) floatingpoint elements in a and b to see if either is NaN, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpunord_ps_mask^{⚠}  Experimentalavx512f Compare packed singleprecision (32bit) floatingpoint elements in a and b to see if either is NaN, and store the results in mask vector k using zeromask k1 (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_compress_epi32^{⚠}  Experimentalavx512f Contiguously store the active 32bit integers in a (those with their respective bit set in writemask k) to dst, and pass through the remaining elements from src. 
_mm512_mask_compress_epi64^{⚠}  Experimentalavx512f Contiguously store the active 64bit integers in a (those with their respective bit set in writemask k) to dst, and pass through the remaining elements from src. 
_mm512_mask_compress_pd^{⚠}  Experimentalavx512f Contiguously store the active doubleprecision (64bit) floatingpoint elements in a (those with their respective bit set in writemask k) to dst, and pass through the remaining elements from src. 
_mm512_mask_compress_ps^{⚠}  Experimentalavx512f Contiguously store the active singleprecision (32bit) floatingpoint elements in a (those with their respective bit set in writemask k) to dst, and pass through the remaining elements from src. 
_mm512_mask_cvt_roundepi32_ps^{⚠}  Experimentalavx512f Convert packed signed 32bit integers in a to packed singleprecision (32bit) floatingpoint elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvt_roundepu32_ps^{⚠}  Experimentalavx512f Convert packed unsigned 32bit integers in a to packed singleprecision (32bit) floatingpoint elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvt_roundpd_epi32^{⚠}  Experimentalavx512f Convert packed doubleprecision (64bit) floatingpoint elements in a to packed 32bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvt_roundpd_epu32^{⚠}  Experimentalavx512f Convert packed doubleprecision (64bit) floatingpoint elements in a to packed unsigned 32bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvt_roundpd_ps^{⚠}  Experimentalavx512f Convert packed doubleprecision (64bit) floatingpoint elements in a to packed singleprecision (32bit) floatingpoint elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvt_roundph_ps^{⚠}  Experimentalavx512f Convert packed halfprecision (16bit) floatingpoint elements in a to packed singleprecision (32bit) floatingpoint elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvt_roundps_epi32^{⚠}  Experimentalavx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed 32bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvt_roundps_epu32^{⚠}  Experimentalavx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed unsigned 32bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvt_roundps_pd^{⚠}  Experimentalavx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed doubleprecision (64bit) floatingpoint elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvt_roundps_ph^{⚠}  Experimentalavx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed halfprecision (16bit) floatingpoint elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvtepi8_epi32^{⚠}  Experimentalavx512f Sign extend packed 8bit integers in a to packed 32bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvtepi8_epi64^{⚠}  Experimentalavx512f Sign extend packed 8bit integers in the low 8 bytes of a to packed 64bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvtepi16_epi32^{⚠}  Experimentalavx512f Sign extend packed 16bit integers in a to packed 32bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvtepi16_epi64^{⚠}  Experimentalavx512f Sign extend packed 16bit integers in a to packed 64bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvtepi32_epi8^{⚠}  Experimentalavx512f Convert packed 32bit integers in a to packed 8bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvtepi32_epi16^{⚠}  Experimentalavx512f Convert packed 32bit integers in a to packed 16bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvtepi32_epi64^{⚠}  Experimentalavx512f Sign extend packed 32bit integers in a to packed 64bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvtepi32_pd^{⚠}  Experimentalavx512f Convert packed signed 32bit integers in a to packed doubleprecision (64bit) floatingpoint elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvtepi32_ps^{⚠}  Experimentalavx512f Convert packed signed 32bit integers in a to packed singleprecision (32bit) floatingpoint elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvtepi32lo_pd^{⚠}  Experimentalavx512f Performs elementbyelement conversion of the lower half of packed 32bit integer elements in v2 to packed doubleprecision (64bit) floatingpoint elements, storing the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvtepi64_epi8^{⚠}  Experimentalavx512f Convert packed 64bit integers in a to packed 8bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvtepi64_epi16^{⚠}  Experimentalavx512f Convert packed 64bit integers in a to packed 16bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvtepi64_epi32^{⚠}  Experimentalavx512f Convert packed 64bit integers in a to packed 32bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvtepu8_epi32^{⚠}  Experimental avx512f Zero extend packed unsigned 8-bit integers in a to packed 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtepu8_epi64^{⚠}  Experimental avx512f Zero extend packed unsigned 8-bit integers in the low 8 bytes of a to packed 64-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtepu16_epi32^{⚠}  Experimental avx512f Zero extend packed unsigned 16-bit integers in a to packed 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtepu16_epi64^{⚠}  Experimental avx512f Zero extend packed unsigned 16-bit integers in a to packed 64-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtepu32_epi64^{⚠}  Experimental avx512f Zero extend packed unsigned 32-bit integers in a to packed 64-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtepu32_pd^{⚠}  Experimental avx512f Convert packed unsigned 32-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtepu32_ps^{⚠}  Experimental avx512f Convert packed unsigned 32-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtepu32lo_pd^{⚠}  Experimental avx512f Performs element-by-element conversion of the lower half of 32-bit unsigned integer elements in v2 to packed double-precision (64-bit) floating-point elements, storing the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtpd_ps^{⚠}  Experimental avx512f Convert packed double-precision (64-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtpd_pslo^{⚠}  Experimental avx512f Performs an element-by-element conversion of packed double-precision (64-bit) floating-point elements in v2 to single-precision (32-bit) floating-point elements and stores them in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). The elements are stored in the lower half of the results vector, while the remaining upper half locations are set to 0.
_mm512_mask_cvtph_ps^{⚠}  Experimental avx512f Convert packed half-precision (16-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtps_epi32^{⚠}  Experimental avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtps_epu32^{⚠}  Experimental avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtps_pd^{⚠}  Experimental avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtps_ph^{⚠}  Experimental avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed half-precision (16-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtpslo_pd^{⚠}  Experimental avx512f Performs element-by-element conversion of the lower half of packed single-precision (32-bit) floating-point elements in v2 to packed double-precision (64-bit) floating-point elements, storing the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtsepi32_epi8^{⚠}  Experimental avx512f Convert packed signed 32-bit integers in a to packed 8-bit integers with signed saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtsepi32_epi16^{⚠}  Experimental avx512f Convert packed signed 32-bit integers in a to packed 16-bit integers with signed saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtsepi64_epi8^{⚠}  Experimental avx512f Convert packed signed 64-bit integers in a to packed 8-bit integers with signed saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtsepi64_epi16^{⚠}  Experimental avx512f Convert packed signed 64-bit integers in a to packed 16-bit integers with signed saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtsepi64_epi32^{⚠}  Experimental avx512f Convert packed signed 64-bit integers in a to packed 32-bit integers with signed saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtt_roundpd_epi32^{⚠}  Experimental avx512f Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtt_roundpd_epu32^{⚠}  Experimental avx512f Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtt_roundps_epi32^{⚠}  Experimental avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtt_roundps_epu32^{⚠}  Experimental avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvttpd_epi32^{⚠}  Experimental avx512f Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvttpd_epu32^{⚠}  Experimental avx512f Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvttps_epi32^{⚠}  Experimental avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvttps_epu32^{⚠}  Experimental avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtusepi32_epi8^{⚠}  Experimental avx512f Convert packed unsigned 32-bit integers in a to packed unsigned 8-bit integers with unsigned saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtusepi32_epi16^{⚠}  Experimental avx512f Convert packed unsigned 32-bit integers in a to packed unsigned 16-bit integers with unsigned saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtusepi64_epi8^{⚠}  Experimental avx512f Convert packed unsigned 64-bit integers in a to packed unsigned 8-bit integers with unsigned saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtusepi64_epi16^{⚠}  Experimental avx512f Convert packed unsigned 64-bit integers in a to packed unsigned 16-bit integers with unsigned saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_cvtusepi64_epi32^{⚠}  Experimental avx512f Convert packed unsigned 64-bit integers in a to packed unsigned 32-bit integers with unsigned saturation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_div_pd^{⚠}  Experimental avx512f Divide packed double-precision (64-bit) floating-point elements in a by packed elements in b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_div_ps^{⚠}  Experimental avx512f Divide packed single-precision (32-bit) floating-point elements in a by packed elements in b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_div_round_pd^{⚠}  Experimental avx512f Divide packed double-precision (64-bit) floating-point elements in a by packed elements in b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_div_round_ps^{⚠}  Experimental avx512f Divide packed single-precision (32-bit) floating-point elements in a by packed elements in b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_expand_epi32^{⚠}  Experimental avx512f Load contiguous active 32-bit integers from a (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_expand_epi64^{⚠}  Experimental avx512f Load contiguous active 64-bit integers from a (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_expand_pd^{⚠}  Experimental avx512f Load contiguous active double-precision (64-bit) floating-point elements from a (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_expand_ps^{⚠}  Experimental avx512f Load contiguous active single-precision (32-bit) floating-point elements from a (those with their respective bit set in mask k), and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_extractf32x4_ps^{⚠}  Experimental avx512f Extract 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from a, selected with imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_extractf64x4_pd^{⚠}  Experimental avx512f Extract 256 bits (composed of 4 packed double-precision (64-bit) floating-point elements) from a, selected with imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_extracti32x4_epi32^{⚠}  Experimental avx512f Extract 128 bits (composed of 4 packed 32-bit integers) from a, selected with imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_extracti64x4_epi64^{⚠}  Experimental avx512f Extract 256 bits (composed of 4 packed 64-bit integers) from a, selected with imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_fixupimm_pd^{⚠}  Experimental avx512f Fix up packed double-precision (64-bit) floating-point elements in a and b using packed 64-bit integers in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). imm8 is used to set the required flags reporting.
_mm512_mask_fixupimm_ps^{⚠}  Experimental avx512f Fix up packed single-precision (32-bit) floating-point elements in a and b using packed 32-bit integers in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). imm8 is used to set the required flags reporting.
_mm512_mask_fixupimm_round_pd^{⚠}  Experimental avx512f Fix up packed double-precision (64-bit) floating-point elements in a and b using packed 64-bit integers in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). imm8 is used to set the required flags reporting.
_mm512_mask_fixupimm_round_ps^{⚠}  Experimental avx512f Fix up packed single-precision (32-bit) floating-point elements in a and b using packed 32-bit integers in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). imm8 is used to set the required flags reporting.
_mm512_mask_fmadd_pd^{⚠}  Experimental avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
_mm512_mask_fmadd_ps^{⚠}  Experimental avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
_mm512_mask_fmadd_round_pd^{⚠}  Experimental avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
_mm512_mask_fmadd_round_ps^{⚠}  Experimental avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
_mm512_mask_fmaddsub_pd^{⚠}  Experimental avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
_mm512_mask_fmaddsub_ps^{⚠}  Experimental avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
_mm512_mask_fmaddsub_round_pd^{⚠}  Experimental avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
_mm512_mask_fmaddsub_round_ps^{⚠}  Experimental avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
_mm512_mask_fmsub_pd^{⚠}  Experimental avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
_mm512_mask_fmsub_ps^{⚠}  Experimental avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
_mm512_mask_fmsub_round_pd^{⚠}  Experimental avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
_mm512_mask_fmsub_round_ps^{⚠}  Experimental avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
_mm512_mask_fmsubadd_pd^{⚠}  Experimental avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
_mm512_mask_fmsubadd_ps^{⚠}  Experimental avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
_mm512_mask_fmsubadd_round_pd^{⚠}  Experimental avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
_mm512_mask_fmsubadd_round_ps^{⚠}  Experimental avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
_mm512_mask_fnmadd_pd^{⚠}  Experimental avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
_mm512_mask_fnmadd_ps^{⚠}  Experimental avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
_mm512_mask_fnmadd_round_pd^{⚠}  Experimental avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
_mm512_mask_fnmadd_round_ps^{⚠}  Experimental avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
_mm512_mask_fnmsub_pd^{⚠}  Experimental avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
_mm512_mask_fnmsub_ps^{⚠}  Experimental avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
_mm512_mask_fnmsub_round_pd^{⚠}  Experimental avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
_mm512_mask_fnmsub_round_ps^{⚠}  Experimental avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set).
_mm512_mask_getexp_pd^{⚠}  Experimental avx512f Convert the exponent of each packed double-precision (64-bit) floating-point element in a to a double-precision (64-bit) floating-point number representing the integer exponent, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element.
_mm512_mask_getexp_ps^{⚠}  Experimental avx512f Convert the exponent of each packed single-precision (32-bit) floating-point element in a to a single-precision (32-bit) floating-point number representing the integer exponent, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element.
_mm512_mask_getexp_round_pd^{⚠}  Experimental avx512f Convert the exponent of each packed double-precision (64-bit) floating-point element in a to a double-precision (64-bit) floating-point number representing the integer exponent, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element.
_mm512_mask_getexp_round_ps^{⚠}  Experimental avx512f Convert the exponent of each packed single-precision (32-bit) floating-point element in a to a single-precision (32-bit) floating-point number representing the integer exponent, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element.
_mm512_mask_getmant_pd^{⚠}  Experimental avx512f Normalize the mantissas of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
_mm512_mask_getmant_ps^{⚠}  Experimental avx512f Normalize the mantissas of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
_mm512_mask_getmant_round_pd^{⚠}  Experimental avx512f Normalize the mantissas of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
_mm512_mask_getmant_round_ps^{⚠}  Experimental avx512f Normalize the mantissas of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign.
_mm512_mask_i32gather_epi32^{⚠}  Experimental avx512f Gather 32-bit integers from memory using 32-bit indices. 32-bit elements are loaded from addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
_mm512_mask_i32gather_epi64^{⚠}  Experimental avx512f Gather 64-bit integers from memory using 32-bit indices. 64-bit elements are loaded from addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
_mm512_mask_i32gather_pd^{⚠}  Experimental avx512f Gather double-precision (64-bit) floating-point elements from memory using 32-bit indices. 64-bit elements are loaded from addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
_mm512_mask_i32gather_ps^{⚠}  Experimental avx512f Gather single-precision (32-bit) floating-point elements from memory using 32-bit indices. 32-bit elements are loaded from addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
_mm512_mask_i32scatter_epi32^{⚠}  Experimental avx512f Scatter 32-bit integers from a into memory using 32-bit indices. 32-bit elements are stored at addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale) subject to mask k (elements are not stored when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
_mm512_mask_i32scatter_epi64^{⚠}  Experimental avx512f Scatter 64-bit integers from a into memory using 32-bit indices. 64-bit elements are stored at addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale) subject to mask k (elements are not stored when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
_mm512_mask_i32scatter_pd^{⚠}  Experimental avx512f Scatter double-precision (64-bit) floating-point elements from a into memory using 32-bit indices. 64-bit elements are stored at addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale) subject to mask k (elements are not stored when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
_mm512_mask_i32scatter_ps^{⚠}  Experimental avx512f Scatter single-precision (32-bit) floating-point elements from a into memory using 32-bit indices. 32-bit elements are stored at addresses starting at base_addr and offset by each 32-bit element in vindex (each index is scaled by the factor in scale) subject to mask k (elements are not stored when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
_mm512_mask_i64gather_epi32^{⚠}  Experimental avx512f Gather 32-bit integers from memory using 64-bit indices. 32-bit elements are loaded from addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
_mm512_mask_i64gather_epi64^{⚠}  Experimental avx512f Gather 64-bit integers from memory using 64-bit indices. 64-bit elements are loaded from addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
_mm512_mask_i64gather_pd^{⚠}  Experimental avx512f Gather double-precision (64-bit) floating-point elements from memory using 64-bit indices. 64-bit elements are loaded from addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
_mm512_mask_i64gather_ps^{⚠}  Experimental avx512f Gather single-precision (32-bit) floating-point elements from memory using 64-bit indices. 32-bit elements are loaded from addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale). Gathered elements are merged into dst using writemask k (elements are copied from src when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
_mm512_mask_i64scatter_epi32^{⚠}  Experimental avx512f Scatter 32-bit integers from a into memory using 64-bit indices. 32-bit elements are stored at addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale) subject to mask k (elements are not stored when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
_mm512_mask_i64scatter_epi64^{⚠}  Experimental avx512f Scatter 64-bit integers from a into memory using 64-bit indices. 64-bit elements are stored at addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale) subject to mask k (elements are not stored when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
_mm512_mask_i64scatter_pd^{⚠}  Experimental avx512f Scatter double-precision (64-bit) floating-point elements from a into memory using 64-bit indices. 64-bit elements are stored at addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale) subject to mask k (elements are not stored when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
_mm512_mask_i64scatter_ps^{⚠}  Experimental avx512f Scatter single-precision (32-bit) floating-point elements from a into memory using 64-bit indices. 32-bit elements are stored at addresses starting at base_addr and offset by each 64-bit element in vindex (each index is scaled by the factor in scale) subject to mask k (elements are not stored when the corresponding mask bit is not set). scale should be 1, 2, 4 or 8.
_mm512_mask_insertf32x4^{⚠}  Experimental avx512f Copy a to tmp, then insert 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from b into tmp at the location specified by imm8. Store tmp to dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_insertf64x4^{⚠}  Experimental avx512f Copy a to tmp, then insert 256 bits (composed of 4 packed double-precision (64-bit) floating-point elements) from b into tmp at the location specified by imm8. Store tmp to dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_inserti32x4^{⚠}  Experimental avx512f Copy a to tmp, then insert 128 bits (composed of 4 packed 32-bit integers) from b into tmp at the location specified by imm8. Store tmp to dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_inserti64x4^{⚠}  Experimental avx512f Copy a to tmp, then insert 256 bits (composed of 4 packed 64-bit integers) from b into tmp at the location specified by imm8. Store tmp to dst using writemask k (elements are copied from src when the corresponding mask bit is not set).
_mm512_mask_max_epi32^{⚠}  Experimentalavx512f Compare packed signed 32bit integers in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_max_epi64^{⚠}  Experimentalavx512f Compare packed signed 64bit integers in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_max_epu32^{⚠}  Experimentalavx512f Compare packed unsigned 32bit integers in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_max_epu64^{⚠}  Experimentalavx512f Compare packed unsigned 64bit integers in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_max_pd^{⚠}  Experimentalavx512f Compare packed doubleprecision (64bit) floatingpoint elements in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_max_ps^{⚠}  Experimentalavx512f Compare packed singleprecision (32bit) floatingpoint elements in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_max_round_pd^{⚠}  Experimentalavx512f Compare packed doubleprecision (64bit) floatingpoint elements in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_max_round_ps^{⚠}  Experimentalavx512f Compare packed singleprecision (32bit) floatingpoint elements in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_min_epi32^{⚠}  Experimentalavx512f Compare packed signed 32bit integers in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_min_epi64^{⚠}  Experimentalavx512f Compare packed signed 64bit integers in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_min_epu32^{⚠}  Experimentalavx512f Compare packed unsigned 32bit integers in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_min_epu64^{⚠}  Experimentalavx512f Compare packed unsigned 64bit integers in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_min_pd^{⚠}  Experimentalavx512f Compare packed doubleprecision (64bit) floatingpoint elements in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_min_ps^{⚠}  Experimentalavx512f Compare packed singleprecision (32bit) floatingpoint elements in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_min_round_pd^{⚠}  Experimentalavx512f Compare packed doubleprecision (64bit) floatingpoint elements in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_min_round_ps^{⚠}  Experimentalavx512f Compare packed singleprecision (32bit) floatingpoint elements in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
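Most `_mm512_mask_*` entries share the writemask convention described above: a lane receives the computed result when its bit in k is set, and is copied from src otherwise. The following is a minimal scalar sketch of that convention for the signed 32-bit maximum. The function name `mask_max_epi32_model` and the use of plain arrays in place of `__m512i` are assumptions for illustration only.

```rust
/// Scalar model of a writemasked lane-wise maximum over sixteen signed 32-bit integers.
fn mask_max_epi32_model(src: [i32; 16], k: u16, a: [i32; 16], b: [i32; 16]) -> [i32; 16] {
    // Start from src so that lanes with a clear mask bit keep the src value.
    let mut dst = src;
    for j in 0..16 {
        if (k >> j) & 1 == 1 {
            dst[j] = a[j].max(b[j]);
        }
    }
    dst
}
```

Passing `k = 0x00FF` computes the maximum in the low eight lanes and leaves the high eight lanes equal to src.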
_mm512_mask_mov_epi32^{⚠}  Experimentalavx512f Move packed 32bit integers from a to dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_mov_epi64^{⚠}  Experimentalavx512f Move packed 64bit integers from a to dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_mov_pd^{⚠}  Experimentalavx512f Move packed doubleprecision (64bit) floatingpoint elements from a to dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_mov_ps^{⚠}  Experimentalavx512f Move packed singleprecision (32bit) floatingpoint elements from a to dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_movedup_pd^{⚠}  Experimentalavx512f Duplicate evenindexed doubleprecision (64bit) floatingpoint elements from a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_movehdup_ps^{⚠}  Experimentalavx512f Duplicate oddindexed singleprecision (32bit) floatingpoint elements from a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_moveldup_ps^{⚠}  Experimentalavx512f Duplicate evenindexed singleprecision (32bit) floatingpoint elements from a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_mul_epi32^{⚠}  Experimentalavx512f Multiply the low signed 32bit integers from each packed 64bit element in a and b, and store the signed 64bit results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_mul_epu32^{⚠}  Experimentalavx512f Multiply the low unsigned 32bit integers from each packed 64bit element in a and b, and store the unsigned 64bit results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_mul_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_mul_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_mul_round_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_mul_round_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_mullo_epi32^{⚠}  Experimentalavx512f Multiply the packed 32bit integers in a and b, producing intermediate 64bit integers, and store the low 32 bits of the intermediate integers in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_mullox_epi64^{⚠}  Experimentalavx512f Multiplies elements in packed 64bit integer vectors a and b together, storing the lower 64 bits of the result in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_or_epi32^{⚠}  Experimentalavx512f Compute the bitwise OR of packed 32bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_or_epi64^{⚠}  Experimentalavx512f Compute the bitwise OR of packed 64bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_permute_pd^{⚠}  Experimentalavx512f Shuffle doubleprecision (64bit) floatingpoint elements in a within 128bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_permute_ps^{⚠}  Experimentalavx512f Shuffle singleprecision (32bit) floatingpoint elements in a within 128bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_permutevar_epi32^{⚠}  Experimentalavx512f Shuffle 32bit integers in a across lanes using the corresponding index in idx, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). Note that this intrinsic shuffles across 128bit lanes, unlike past intrinsics that use the permutevar name. This intrinsic is identical to _mm512_mask_permutexvar_epi32, and it is recommended that you use that intrinsic name. 
_mm512_mask_permutevar_pd^{⚠}  Experimentalavx512f Shuffle doubleprecision (64bit) floatingpoint elements in a within 128bit lanes using the control in b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_permutevar_ps^{⚠}  Experimentalavx512f Shuffle singleprecision (32bit) floatingpoint elements in a within 128bit lanes using the control in b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_permutex2var_epi32^{⚠}  Experimentalavx512f Shuffle 32bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_permutex2var_epi64^{⚠}  Experimentalavx512f Shuffle 64bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_permutex2var_pd^{⚠}  Experimentalavx512f Shuffle doubleprecision (64bit) floatingpoint elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_permutex2var_ps^{⚠}  Experimentalavx512f Shuffle singleprecision (32bit) floatingpoint elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
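The two-source permutes above differ from the other masked entries in that unselected lanes are copied from a rather than from src. The following is a minimal scalar sketch for the 32-bit integer case. It assumes, following Intel's description of the operation, that the low four bits of each idx element give the lane offset and that bit 4 is the selector between a and b. The function name `mask_permutex2var_epi32_model` is hypothetical.

```rust
/// Scalar model of a writemasked two-source permute over sixteen 32-bit integers.
fn mask_permutex2var_epi32_model(
    a: [i32; 16],
    k: u16,
    idx: [i32; 16],
    b: [i32; 16],
) -> [i32; 16] {
    // Lanes with a clear mask bit are copied from a.
    let mut dst = a;
    for j in 0..16 {
        if (k >> j) & 1 == 1 {
            // Assumption: bits 3:0 of idx[j] are the lane offset.
            let offset = (idx[j] & 0xF) as usize;
            // Assumption: bit 4 of idx[j] selects b when set and a when clear.
            let from_b = (idx[j] & 0x10) != 0;
            dst[j] = if from_b { b[offset] } else { a[offset] };
        }
    }
    dst
}
```

For example, an idx element of `0x13` picks lane 3 of b, while an idx element of `0x05` picks lane 5 of a.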
_mm512_mask_permutex_epi64^{⚠}  Experimentalavx512f Shuffle 64bit integers in a within 256bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_permutex_pd^{⚠}  Experimentalavx512f Shuffle doubleprecision (64bit) floatingpoint elements in a within 256bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_permutexvar_epi32^{⚠}  Experimentalavx512f Shuffle 32bit integers in a across lanes using the corresponding index in idx, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_permutexvar_epi64^{⚠}  Experimentalavx512f Shuffle 64bit integers in a across lanes using the corresponding index in idx, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_permutexvar_pd^{⚠}  Experimentalavx512f Shuffle doubleprecision (64bit) floatingpoint elements in a across lanes using the corresponding index in idx, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_permutexvar_ps^{⚠}  Experimentalavx512f Shuffle singleprecision (32bit) floatingpoint elements in a across lanes using the corresponding index in idx, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_rcp14_pd^{⚠}  Experimentalavx512f Compute the approximate reciprocal of packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14. 
_mm512_mask_rcp14_ps^{⚠}  Experimentalavx512f Compute the approximate reciprocal of packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14. 
_mm512_mask_reduce_add_epi32^{⚠}  Experimentalavx512f Reduce the packed 32bit integers in a by addition using mask k. Returns the sum of all active elements in a. 
_mm512_mask_reduce_add_epi64^{⚠}  Experimentalavx512f Reduce the packed 64bit integers in a by addition using mask k. Returns the sum of all active elements in a. 
_mm512_mask_reduce_add_pd^{⚠}  Experimentalavx512f Reduce the packed doubleprecision (64bit) floatingpoint elements in a by addition using mask k. Returns the sum of all active elements in a. 
_mm512_mask_reduce_add_ps^{⚠}  Experimentalavx512f Reduce the packed singleprecision (32bit) floatingpoint elements in a by addition using mask k. Returns the sum of all active elements in a. 
_mm512_mask_reduce_and_epi32^{⚠}  Experimentalavx512f Reduce the packed 32bit integers in a by bitwise AND using mask k. Returns the bitwise AND of all active elements in a. 
_mm512_mask_reduce_and_epi64^{⚠}  Experimentalavx512f Reduce the packed 64bit integers in a by bitwise AND using mask k. Returns the bitwise AND of all active elements in a. 
_mm512_mask_reduce_max_epi32^{⚠}  Experimentalavx512f Reduce the packed signed 32bit integers in a by maximum using mask k. Returns the maximum of all active elements in a. 
_mm512_mask_reduce_max_epi64^{⚠}  Experimentalavx512f Reduce the packed signed 64bit integers in a by maximum using mask k. Returns the maximum of all active elements in a. 
_mm512_mask_reduce_max_epu32^{⚠}  Experimentalavx512f Reduce the packed unsigned 32bit integers in a by maximum using mask k. Returns the maximum of all active elements in a. 
_mm512_mask_reduce_max_epu64^{⚠}  Experimentalavx512f Reduce the packed unsigned 64bit integers in a by maximum using mask k. Returns the maximum of all active elements in a. 
_mm512_mask_reduce_max_pd^{⚠}  Experimentalavx512f Reduce the packed doubleprecision (64bit) floatingpoint elements in a by maximum using mask k. Returns the maximum of all active elements in a. 
_mm512_mask_reduce_max_ps^{⚠}  Experimentalavx512f Reduce the packed singleprecision (32bit) floatingpoint elements in a by maximum using mask k. Returns the maximum of all active elements in a. 
_mm512_mask_reduce_min_epi32^{⚠}  Experimentalavx512f Reduce the packed signed 32bit integers in a by minimum using mask k. Returns the minimum of all active elements in a. 
_mm512_mask_reduce_min_epi64^{⚠}  Experimentalavx512f Reduce the packed signed 64bit integers in a by minimum using mask k. Returns the minimum of all active elements in a. 
_mm512_mask_reduce_min_epu32^{⚠}  Experimentalavx512f Reduce the packed unsigned 32bit integers in a by minimum using mask k. Returns the minimum of all active elements in a. 
_mm512_mask_reduce_min_epu64^{⚠}  Experimentalavx512f Reduce the packed unsigned 64bit integers in a by minimum using mask k. Returns the minimum of all active elements in a. 
_mm512_mask_reduce_min_pd^{⚠}  Experimentalavx512f Reduce the packed doubleprecision (64bit) floatingpoint elements in a by minimum using mask k. Returns the minimum of all active elements in a. 
_mm512_mask_reduce_min_ps^{⚠}  Experimentalavx512f Reduce the packed singleprecision (32bit) floatingpoint elements in a by minimum using mask k. Returns the minimum of all active elements in a. 
_mm512_mask_reduce_mul_epi32^{⚠}  Experimentalavx512f Reduce the packed 32bit integers in a by multiplication using mask k. Returns the product of all active elements in a. 
_mm512_mask_reduce_mul_epi64^{⚠}  Experimentalavx512f Reduce the packed 64bit integers in a by multiplication using mask k. Returns the product of all active elements in a. 
_mm512_mask_reduce_mul_pd^{⚠}  Experimentalavx512f Reduce the packed doubleprecision (64bit) floatingpoint elements in a by multiplication using mask k. Returns the product of all active elements in a. 
_mm512_mask_reduce_mul_ps^{⚠}  Experimentalavx512f Reduce the packed singleprecision (32bit) floatingpoint elements in a by multiplication using mask k. Returns the product of all active elements in a. 
_mm512_mask_reduce_or_epi32^{⚠}  Experimentalavx512f Reduce the packed 32bit integers in a by bitwise OR using mask k. Returns the bitwise OR of all active elements in a. 
_mm512_mask_reduce_or_epi64^{⚠}  Experimentalavx512f Reduce the packed 64bit integers in a by bitwise OR using mask k. Returns the bitwise OR of all active elements in a. 
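The masked reductions above fold only the active elements, meaning those whose bit in k is set. The following is a minimal scalar sketch for three of the 32-bit integer cases. The `_model` function names are hypothetical. Starting each fold from the operation's identity value (so that an all-clear mask returns that identity) and using wrapping addition are assumptions of this sketch rather than statements taken from the entries above.

```rust
/// Scalar model: sum of the active 32-bit lanes (identity 0, wrapping on overflow).
fn mask_reduce_add_epi32_model(k: u16, a: [i32; 16]) -> i32 {
    let mut acc = 0i32;
    for j in 0..16 {
        if (k >> j) & 1 == 1 {
            acc = acc.wrapping_add(a[j]);
        }
    }
    acc
}

/// Scalar model: bitwise AND of the active 32-bit lanes (identity is all ones).
fn mask_reduce_and_epi32_model(k: u16, a: [i32; 16]) -> i32 {
    let mut acc = -1i32;
    for j in 0..16 {
        if (k >> j) & 1 == 1 {
            acc &= a[j];
        }
    }
    acc
}

/// Scalar model: minimum of the active signed 32-bit lanes (identity is i32::MAX).
fn mask_reduce_min_epi32_model(k: u16, a: [i32; 16]) -> i32 {
    let mut acc = i32::MAX;
    for j in 0..16 {
        if (k >> j) & 1 == 1 {
            acc = acc.min(a[j]);
        }
    }
    acc
}
```

With `a = [1, 2, ..., 16]` and `k = 0b0111`, the addition model sums only the first three lanes.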
_mm512_mask_rol_epi32^{⚠}  Experimentalavx512f Rotate the bits in each packed 32bit integer in a to the left by the number of bits specified in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_rol_epi64^{⚠}  Experimentalavx512f Rotate the bits in each packed 64bit integer in a to the left by the number of bits specified in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_rolv_epi32^{⚠}  Experimentalavx512f Rotate the bits in each packed 32bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_rolv_epi64^{⚠}  Experimentalavx512f Rotate the bits in each packed 64bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_ror_epi32^{⚠}  Experimentalavx512f Rotate the bits in each packed 32bit integer in a to the right by the number of bits specified in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_ror_epi64^{⚠}  Experimentalavx512f Rotate the bits in each packed 64bit integer in a to the right by the number of bits specified in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_rorv_epi32^{⚠}  Experimentalavx512f Rotate the bits in each packed 32bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_rorv_epi64^{⚠}  Experimentalavx512f Rotate the bits in each packed 64bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_roundscale_pd^{⚠}  Experimentalavx512f Round packed doubleprecision (64bit) floatingpoint elements in a to the number of fraction bits specified by imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_roundscale_ps^{⚠}  Experimentalavx512f Round packed singleprecision (32bit) floatingpoint elements in a to the number of fraction bits specified by imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_roundscale_round_pd^{⚠}  Experimentalavx512f Round packed doubleprecision (64bit) floatingpoint elements in a to the number of fraction bits specified by imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_roundscale_round_ps^{⚠}  Experimentalavx512f Round packed singleprecision (32bit) floatingpoint elements in a to the number of fraction bits specified by imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_rsqrt14_pd^{⚠}  Experimentalavx512f Compute the approximate reciprocal square root of packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14. 
_mm512_mask_rsqrt14_ps^{⚠}  Experimentalavx512f Compute the approximate reciprocal square root of packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14. 
_mm512_mask_scalef_pd^{⚠}  Experimentalavx512f Scale the packed doubleprecision (64bit) floatingpoint elements in a using values from b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_scalef_ps^{⚠}  Experimentalavx512f Scale the packed singleprecision (32bit) floatingpoint elements in a using values from b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_scalef_round_pd^{⚠}  Experimentalavx512f Scale the packed doubleprecision (64bit) floatingpoint elements in a using values from b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_scalef_round_ps^{⚠}  Experimentalavx512f Scale the packed singleprecision (32bit) floatingpoint elements in a using values from b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_set1_epi32^{⚠}  Experimentalavx512f Broadcast 32bit integer a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_set1_epi64^{⚠}  Experimentalavx512f Broadcast 64bit integer a to all elements of dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_shuffle_epi32^{⚠}  Experimentalavx512f Shuffle 32bit integers in a within 128bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_shuffle_f32x4^{⚠}  Experimentalavx512f Shuffle 128bits (composed of 4 singleprecision (32bit) floatingpoint elements) selected by imm8 from a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_shuffle_f64x2^{⚠}  Experimentalavx512f Shuffle 128bits (composed of 2 doubleprecision (64bit) floatingpoint elements) selected by imm8 from a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_shuffle_i32x4^{⚠}  Experimentalavx512f Shuffle 128bits (composed of 4 32bit integers) selected by imm8 from a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_shuffle_i64x2^{⚠}  Experimentalavx512f Shuffle 128bits (composed of 2 64bit integers) selected by imm8 from a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_shuffle_pd^{⚠}  Experimentalavx512f Shuffle doubleprecision (64bit) floatingpoint elements within 128bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_shuffle_ps^{⚠}  Experimentalavx512f Shuffle singleprecision (32bit) floatingpoint elements in a within 128bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sll_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a left by count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sll_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a left by count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_slli_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a left by imm8 while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_slli_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a left by imm8 while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sllv_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a left by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sllv_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a left by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sqrt_pd^{⚠}  Experimentalavx512f Compute the square root of packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sqrt_ps^{⚠}  Experimentalavx512f Compute the square root of packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sqrt_round_pd^{⚠}  Experimentalavx512f Compute the square root of packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sqrt_round_ps^{⚠}  Experimentalavx512f Compute the square root of packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sra_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a right by count while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sra_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a right by count while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_srai_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a right by imm8 while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_srai_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a right by imm8 while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_srav_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_srav_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_srl_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a right by count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_srl_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a right by count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_srli_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a right by imm8 while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_srli_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a right by imm8 while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_srlv_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a right by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_srlv_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a right by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sub_epi32^{⚠}  Experimentalavx512f Subtract packed 32bit integers in b from packed 32bit integers in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sub_epi64^{⚠}  Experimentalavx512f Subtract packed 64bit integers in b from packed 64bit integers in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sub_pd^{⚠}  Experimentalavx512f Subtract packed doubleprecision (64bit) floatingpoint elements in b from packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sub_ps^{⚠}  Experimentalavx512f Subtract packed singleprecision (32bit) floatingpoint elements in b from packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sub_round_pd^{⚠}  Experimentalavx512f Subtract packed doubleprecision (64bit) floatingpoint elements in b from packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sub_round_ps^{⚠}  Experimentalavx512f Subtract packed singleprecision (32bit) floatingpoint elements in b from packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_ternarylogic_epi32^{⚠}  Experimentalavx512f Bitwise ternary logic that provides the capability to implement any threeoperand binary function; the specific binary function is specified by value in imm8. For each bit in each packed 32bit integer, the corresponding bit from src, a, and b are used to form a 3 bit index into imm8, and the value at that bit in imm8 is written to the corresponding bit in dst using writemask k at 32bit granularity (32bit elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_ternarylogic_epi64^{⚠}  Experimentalavx512f Bitwise ternary logic that provides the capability to implement any threeoperand binary function; the specific binary function is specified by value in imm8. For each bit in each packed 64bit integer, the corresponding bit from src, a, and b are used to form a 3 bit index into imm8, and the value at that bit in imm8 is written to the corresponding bit in dst using writemask k at 64bit granularity (64bit elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_test_epi32_mask^{⚠}  Experimentalavx512f Compute the bitwise AND of packed 32bit integers in a and b, producing intermediate 32bit values, and set the corresponding bit in result mask k (subject to writemask k) if the intermediate value is nonzero. 
_mm512_mask_test_epi64_mask^{⚠}  Experimentalavx512f Compute the bitwise AND of packed 64bit integers in a and b, producing intermediate 64bit values, and set the corresponding bit in result mask k (subject to writemask k) if the intermediate value is nonzero. 
_mm512_mask_testn_epi32_mask^{⚠}  Experimentalavx512f Compute the bitwise AND of packed 32bit integers in a and b, producing intermediate 32bit values, and set the corresponding bit in result mask k (subject to writemask k) if the intermediate value is zero. 
_mm512_mask_testn_epi64_mask^{⚠}  Experimentalavx512f Compute the bitwise AND of packed 64bit integers in a and b, producing intermediate 64bit values, and set the corresponding bit in result mask k (subject to writemask k) if the intermediate value is zero. 
_mm512_mask_unpackhi_epi32^{⚠}  Experimentalavx512f Unpack and interleave 32bit integers from the high half of each 128bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_unpackhi_epi64^{⚠}  Experimentalavx512f Unpack and interleave 64bit integers from the high half of each 128bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_unpackhi_pd^{⚠}  Experimentalavx512f Unpack and interleave doubleprecision (64bit) floatingpoint elements from the high half of each 128bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_unpackhi_ps^{⚠}  Experimentalavx512f Unpack and interleave singleprecision (32bit) floatingpoint elements from the high half of each 128bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_unpacklo_epi32^{⚠}  Experimentalavx512f Unpack and interleave 32bit integers from the low half of each 128bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_unpacklo_epi64^{⚠}  Experimentalavx512f Unpack and interleave 64bit integers from the low half of each 128bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_unpacklo_pd^{⚠}  Experimentalavx512f Unpack and interleave doubleprecision (64bit) floatingpoint elements from the low half of each 128bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_unpacklo_ps^{⚠}  Experimentalavx512f Unpack and interleave singleprecision (32bit) floatingpoint elements from the low half of each 128bit lane in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_xor_epi32^{⚠}  Experimentalavx512f Compute the bitwise XOR of packed 32bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_xor_epi64^{⚠}  Experimentalavx512f Compute the bitwise XOR of packed 64bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
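_mm512_maskz_abs_epi32^{⚠}  Experimentalavx512f Compute the absolute value of packed signed 32bit integers in a, and store the unsigned results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 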
_mm512_maskz_abs_epi64^{⚠}  Experimentalavx512f Compute the absolute value of packed signed 64bit integers in a, and store the unsigned results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_add_epi32^{⚠}  Experimentalavx512f Add packed 32bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_add_epi64^{⚠}  Experimentalavx512f Add packed 64bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_add_pd^{⚠}  Experimentalavx512f Add packed doubleprecision (64bit) floatingpoint elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_add_ps^{⚠}  Experimentalavx512f Add packed singleprecision (32bit) floatingpoint elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_add_round_pd^{⚠}  Experimentalavx512f Add packed doubleprecision (64bit) floatingpoint elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_add_round_ps^{⚠}  Experimentalavx512f Add packed singleprecision (32bit) floatingpoint elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_alignr_epi32^{⚠}  Experimentalavx512f Concatenate a and b into a 128byte intermediate result, shift the result right by imm8 32bit elements, and store the low 64 bytes (16 elements) in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_alignr_epi64^{⚠}  Experimentalavx512f Concatenate a and b into a 128byte intermediate result, shift the result right by imm8 64bit elements, and store the low 64 bytes (8 elements) in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_and_epi32^{⚠}  Experimentalavx512f Compute the bitwise AND of packed 32bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_and_epi64^{⚠}  Experimentalavx512f Compute the bitwise AND of packed 64bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_andnot_epi32^{⚠}  Experimentalavx512f Compute the bitwise NOT of packed 32bit integers in a and then AND with b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_andnot_epi64^{⚠}  Experimentalavx512f Compute the bitwise NOT of packed 64bit integers in a and then AND with b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_broadcast_f32x4^{⚠}  Experimentalavx512f Broadcast the 4 packed singleprecision (32bit) floatingpoint elements from a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_broadcast_f64x4^{⚠}  Experimentalavx512f Broadcast the 4 packed doubleprecision (64bit) floatingpoint elements from a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_broadcast_i32x4^{⚠}  Experimentalavx512f Broadcast the 4 packed 32bit integers from a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_broadcast_i64x4^{⚠}  Experimentalavx512f Broadcast the 4 packed 64bit integers from a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_broadcastd_epi32^{⚠}  Experimentalavx512f Broadcast the low packed 32bit integer from a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_broadcastq_epi64^{⚠}  Experimentalavx512f Broadcast the low packed 64bit integer from a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_broadcastsd_pd^{⚠}  Experimentalavx512f Broadcast the low doubleprecision (64bit) floatingpoint element from a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_broadcastss_ps^{⚠}  Experimentalavx512f Broadcast the low singleprecision (32bit) floatingpoint element from a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_compress_epi32^{⚠}  Experimentalavx512f Contiguously store the active 32bit integers in a (those with their respective bit set in zeromask k) to dst, and set the remaining elements to zero. 
_mm512_maskz_compress_epi64^{⚠}  Experimentalavx512f Contiguously store the active 64bit integers in a (those with their respective bit set in zeromask k) to dst, and set the remaining elements to zero. 
_mm512_maskz_compress_pd^{⚠}  Experimentalavx512f Contiguously store the active doubleprecision (64bit) floatingpoint elements in a (those with their respective bit set in zeromask k) to dst, and set the remaining elements to zero. 
_mm512_maskz_compress_ps^{⚠}  Experimentalavx512f Contiguously store the active singleprecision (32bit) floatingpoint elements in a (those with their respective bit set in zeromask k) to dst, and set the remaining elements to zero. 
_mm512_maskz_cvt_roundepi32_ps^{⚠}  Experimentalavx512f Convert packed signed 32bit integers in a to packed singleprecision (32bit) floatingpoint elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvt_roundepu32_ps^{⚠}  Experimentalavx512f Convert packed unsigned 32bit integers in a to packed singleprecision (32bit) floatingpoint elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvt_roundpd_epi32^{⚠}  Experimentalavx512f Convert packed doubleprecision (64bit) floatingpoint elements in a to packed 32bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvt_roundpd_epu32^{⚠}  Experimentalavx512f Convert packed doubleprecision (64bit) floatingpoint elements in a to packed unsigned 32bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvt_roundpd_ps^{⚠}  Experimentalavx512f Convert packed doubleprecision (64bit) floatingpoint elements in a to packed singleprecision (32bit) floatingpoint elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvt_roundph_ps^{⚠}  Experimentalavx512f Convert packed halfprecision (16bit) floatingpoint elements in a to packed singleprecision (32bit) floatingpoint elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvt_roundps_epi32^{⚠}  Experimentalavx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed 32bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvt_roundps_epu32^{⚠}  Experimentalavx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed unsigned 32bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvt_roundps_pd^{⚠}  Experimentalavx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed doubleprecision (64bit) floatingpoint elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvt_roundps_ph^{⚠}  Experimentalavx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed halfprecision (16bit) floatingpoint elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtepi8_epi32^{⚠}  Experimentalavx512f Sign extend packed 8bit integers in a to packed 32bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtepi8_epi64^{⚠}  Experimentalavx512f Sign extend packed 8bit integers in the low 8 bytes of a to packed 64bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtepi16_epi32^{⚠}  Experimentalavx512f Sign extend packed 16bit integers in a to packed 32bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtepi16_epi64^{⚠}  Experimentalavx512f Sign extend packed 16bit integers in a to packed 64bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtepi32_epi8^{⚠}  Experimentalavx512f Convert packed 32bit integers in a to packed 8bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtepi32_epi16^{⚠}  Experimentalavx512f Convert packed 32bit integers in a to packed 16bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtepi32_epi64^{⚠}  Experimentalavx512f Sign extend packed 32bit integers in a to packed 64bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtepi32_pd^{⚠}  Experimentalavx512f Convert packed signed 32bit integers in a to packed doubleprecision (64bit) floatingpoint elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtepi32_ps^{⚠}  Experimentalavx512f Convert packed signed 32bit integers in a to packed singleprecision (32bit) floatingpoint elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtepi64_epi8^{⚠}  Experimentalavx512f Convert packed 64bit integers in a to packed 8bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtepi64_epi16^{⚠}  Experimentalavx512f Convert packed 64bit integers in a to packed 16bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtepi64_epi32^{⚠}  Experimentalavx512f Convert packed 64bit integers in a to packed 32bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtepu8_epi32^{⚠}  Experimentalavx512f Zero extend packed unsigned 8bit integers in a to packed 32bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtepu8_epi64^{⚠}  Experimentalavx512f Zero extend packed unsigned 8bit integers in the low 8 bytes of a to packed 64bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtepu16_epi32^{⚠}  Experimentalavx512f Zero extend packed unsigned 16bit integers in a to packed 32bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtepu16_epi64^{⚠}  Experimentalavx512f Zero extend packed unsigned 16bit integers in a to packed 64bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtepu32_epi64^{⚠}  Experimentalavx512f Zero extend packed unsigned 32bit integers in a to packed 64bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtepu32_pd^{⚠}  Experimentalavx512f Convert packed unsigned 32bit integers in a to packed doubleprecision (64bit) floatingpoint elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtepu32_ps^{⚠}  Experimentalavx512f Convert packed unsigned 32bit integers in a to packed singleprecision (32bit) floatingpoint elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtpd_ps^{⚠}  Experimentalavx512f Convert packed doubleprecision (64bit) floatingpoint elements in a to packed singleprecision (32bit) floatingpoint elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtph_ps^{⚠}  Experimentalavx512f Convert packed halfprecision (16bit) floatingpoint elements in a to packed singleprecision (32bit) floatingpoint elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtps_epi32^{⚠}  Experimentalavx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed 32bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtps_epu32^{⚠}  Experimentalavx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed unsigned 32bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtps_pd^{⚠}  Experimentalavx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed doubleprecision (64bit) floatingpoint elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtps_ph^{⚠}  Experimentalavx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed halfprecision (16bit) floatingpoint elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtsepi32_epi8^{⚠}  Experimentalavx512f Convert packed signed 32bit integers in a to packed 8bit integers with signed saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtsepi32_epi16^{⚠}  Experimentalavx512f Convert packed signed 32bit integers in a to packed 16bit integers with signed saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtsepi64_epi8^{⚠}  Experimentalavx512f Convert packed signed 64bit integers in a to packed 8bit integers with signed saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtsepi64_epi16^{⚠}  Experimentalavx512f Convert packed signed 64bit integers in a to packed 16bit integers with signed saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtsepi64_epi32^{⚠}  Experimentalavx512f Convert packed signed 64bit integers in a to packed 32bit integers with signed saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtt_roundpd_epi32^{⚠}  Experimentalavx512f Convert packed doubleprecision (64bit) floatingpoint elements in a to packed 32bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtt_roundpd_epu32^{⚠}  Experimentalavx512f Convert packed doubleprecision (64bit) floatingpoint elements in a to packed unsigned 32bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtt_roundps_epi32^{⚠}  Experimentalavx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed 32bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtt_roundps_epu32^{⚠}  Experimentalavx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed unsigned 32bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvttpd_epi32^{⚠}  Experimentalavx512f Convert packed doubleprecision (64bit) floatingpoint elements in a to packed 32bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvttpd_epu32^{⚠}  Experimentalavx512f Convert packed doubleprecision (64bit) floatingpoint elements in a to packed unsigned 32bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvttps_epi32^{⚠}  Experimentalavx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed 32bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvttps_epu32^{⚠}  Experimentalavx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed unsigned 32bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtusepi32_epi8^{⚠}  Experimentalavx512f Convert packed unsigned 32bit integers in a to packed unsigned 8bit integers with unsigned saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtusepi32_epi16^{⚠}  Experimentalavx512f Convert packed unsigned 32bit integers in a to packed unsigned 16bit integers with unsigned saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtusepi64_epi8^{⚠}  Experimentalavx512f Convert packed unsigned 64bit integers in a to packed unsigned 8bit integers with unsigned saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtusepi64_epi16^{⚠}  Experimentalavx512f Convert packed unsigned 64bit integers in a to packed unsigned 16bit integers with unsigned saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtusepi64_epi32^{⚠}  Experimentalavx512f Convert packed unsigned 64bit integers in a to packed unsigned 32bit integers with unsigned saturation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_div_pd^{⚠}  Experimentalavx512f Divide packed doubleprecision (64bit) floatingpoint elements in a by packed elements in b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_div_ps^{⚠}  Experimentalavx512f Divide packed singleprecision (32bit) floatingpoint elements in a by packed elements in b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_div_round_pd^{⚠}  Experimentalavx512f Divide packed doubleprecision (64bit) floatingpoint elements in a by packed elements in b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_div_round_ps^{⚠}  Experimentalavx512f Divide packed singleprecision (32bit) floatingpoint elements in a by packed elements in b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_expand_epi32^{⚠}  Experimentalavx512f Load contiguous active 32bit integers from a (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_expand_epi64^{⚠}  Experimentalavx512f Load contiguous active 64bit integers from a (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_expand_pd^{⚠}  Experimentalavx512f Load contiguous active doubleprecision (64bit) floatingpoint elements from a (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_expand_ps^{⚠}  Experimentalavx512f Load contiguous active singleprecision (32bit) floatingpoint elements from a (those with their respective bit set in mask k), and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_extractf32x4_ps^{⚠}  Experimentalavx512f Extract 128 bits (composed of 4 packed singleprecision (32bit) floatingpoint elements) from a, selected with imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_extractf64x4_pd^{⚠}  Experimentalavx512f Extract 256 bits (composed of 4 packed doubleprecision (64bit) floatingpoint elements) from a, selected with imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_extracti32x4_epi32^{⚠}  Experimentalavx512f Extract 128 bits (composed of 4 packed 32bit integers) from a, selected with imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_extracti64x4_epi64^{⚠}  Experimentalavx512f Extract 256 bits (composed of 4 packed 64bit integers) from a, selected with imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fixupimm_pd^{⚠}  Experimentalavx512f Fix up packed doubleprecision (64bit) floatingpoint elements in a and b using packed 64bit integers in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). imm8 is used to set the required flags reporting. 
_mm512_maskz_fixupimm_ps^{⚠}  Experimentalavx512f Fix up packed singleprecision (32bit) floatingpoint elements in a and b using packed 32bit integers in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). imm8 is used to set the required flags reporting. 
_mm512_maskz_fixupimm_round_pd^{⚠}  Experimentalavx512f Fix up packed doubleprecision (64bit) floatingpoint elements in a and b using packed 64bit integers in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). imm8 is used to set the required flags reporting. 
_mm512_maskz_fixupimm_round_ps^{⚠}  Experimentalavx512f Fix up packed singleprecision (32bit) floatingpoint elements in a and b using packed 32bit integers in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). imm8 is used to set the required flags reporting. 
_mm512_maskz_fmadd_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmadd_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmadd_round_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmadd_round_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmaddsub_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, alternatively add and subtract packed elements in c to/from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmaddsub_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, alternatively add and subtract packed elements in c to/from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmaddsub_round_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, alternatively add and subtract packed elements in c to/from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmaddsub_round_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, alternatively add and subtract packed elements in c to/from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmsub_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmsub_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmsub_round_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmsub_round_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmsubadd_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, alternatively subtract and add packed elements in c from/to the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmsubadd_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, alternatively subtract and add packed elements in c from/to the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmsubadd_round_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, alternatively subtract and add packed elements in c from/to the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmsubadd_round_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, alternatively subtract and add packed elements in c from/to the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
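The difference between the fmaddsub and fmsubadd families is only which lanes add and which subtract. The following is a minimal scalar sketch of that alternation over eight f64 lanes, not the intrinsics themselves. The function names are hypothetical, and the lane parity (fmaddsub subtracts on even lanes, fmsubadd adds on even lanes) is stated here as an assumption to check against the vendor reference.

```rust
/// Hypothetical scalar model shared by both families.
/// `subtract_on_even` = true models fmaddsub, false models fmsubadd.
fn maskz_fma_alternating_model(
    k: u8,
    a: [f64; 8],
    b: [f64; 8],
    c: [f64; 8],
    subtract_on_even: bool,
) -> [f64; 8] {
    // Zeromask: lanes whose mask bit is clear stay 0.0.
    let mut dst = [0.0f64; 8];
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            let even = i % 2 == 0;
            let addend = if even == subtract_on_even { -c[i] } else { c[i] };
            // mul_add rounds once, like a fused multiply-add.
            dst[i] = a[i].mul_add(b[i], addend);
        }
    }
    dst
}

/// Model of a zero-masked fmaddsub: even lanes a*b - c, odd lanes a*b + c.
fn maskz_fmaddsub_model(k: u8, a: [f64; 8], b: [f64; 8], c: [f64; 8]) -> [f64; 8] {
    maskz_fma_alternating_model(k, a, b, c, true)
}

/// Model of a zero-masked fmsubadd: even lanes a*b + c, odd lanes a*b - c.
fn maskz_fmsubadd_model(k: u8, a: [f64; 8], b: [f64; 8], c: [f64; 8]) -> [f64; 8] {
    maskz_fma_alternating_model(k, a, b, c, false)
}
```

With a = 2.0, b = 3.0, c = 1.0 in every lane and only the low four mask bits set, the first model yields 5, 7, 5, 7 in the low lanes and the second yields 7, 5, 7, 5; the upper four lanes are zero in both.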
_mm512_maskz_fnmadd_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fnmadd_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fnmadd_round_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fnmadd_round_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fnmsub_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fnmsub_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fnmsub_round_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fnmsub_round_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_getexp_pd^{⚠}  Experimentalavx512f Convert the exponent of each packed doubleprecision (64bit) floatingpoint element in a to a doubleprecision (64bit) floatingpoint number representing the integer exponent, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element. 
_mm512_maskz_getexp_ps^{⚠}  Experimentalavx512f Convert the exponent of each packed singleprecision (32bit) floatingpoint element in a to a singleprecision (32bit) floatingpoint number representing the integer exponent, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element. 
_mm512_maskz_getexp_round_pd^{⚠}  Experimentalavx512f Convert the exponent of each packed doubleprecision (64bit) floatingpoint element in a to a doubleprecision (64bit) floatingpoint number representing the integer exponent, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element. 
_mm512_maskz_getexp_round_ps^{⚠}  Experimentalavx512f Convert the exponent of each packed singleprecision (32bit) floatingpoint element in a to a singleprecision (32bit) floatingpoint number representing the integer exponent, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element. 
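The floor(log2(x)) description of the getexp entries can be illustrated with a scalar sketch that reads the exponent field of an f64 directly. This is only a model under stated assumptions: the function names are hypothetical, and inputs are assumed to be normal, finite, and nonzero (zero, denormal, infinity, and NaN handling is not modeled).

```rust
/// Hypothetical scalar model of getexp for one normal, finite, nonzero f64.
/// Returns the unbiased exponent as a float, i.e. floor(log2(|x|)).
fn getexp_model(x: f64) -> f64 {
    // Bits 52..=62 hold the biased exponent; the bias for f64 is 1023.
    let biased = ((x.to_bits() >> 52) & 0x7ff) as i64;
    (biased - 1023) as f64
}

/// Zero-masked version over eight lanes: lanes with a clear mask bit are 0.0.
fn maskz_getexp_model(k: u8, a: [f64; 8]) -> [f64; 8] {
    let mut dst = [0.0f64; 8];
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            dst[i] = getexp_model(a[i]);
        }
    }
    dst
}
```

For example, 10.0 lies in [8, 16), so its model result is 3.0, and 0.3 lies in [0.25, 0.5), so its result is -2.0. The sign of the input does not affect the result.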
_mm512_maskz_getmant_pd^{⚠}  Experimentalavx512f Normalize the mantissas of packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. 
_mm512_maskz_getmant_ps^{⚠}  Experimentalavx512f Normalize the mantissas of packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. 
_mm512_maskz_getmant_round_pd^{⚠}  Experimentalavx512f Normalize the mantissas of packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. 
_mm512_maskz_getmant_round_ps^{⚠}  Experimentalavx512f Normalize the mantissas of packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. 
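As a rough illustration of mantissa normalization, the sketch below models a single configuration only: the interval [1, 2) with the sign taken from the source (the cases named by `_MM_MANT_NORM_1_2` and `_MM_MANT_SIGN_SRC`). The function name is hypothetical, the other interval and sign options are not modeled, and inputs are assumed to be normal, finite, and nonzero.

```rust
/// Hypothetical scalar model of getmant for one f64, interval [1, 2),
/// sign copied from the source. Assumes a normal, finite, nonzero input.
fn getmant_norm_1_2_sign_src_model(x: f64) -> f64 {
    // Keep the sign bit (bit 63) and the 52 fraction bits.
    let sign_and_fraction = x.to_bits() & 0x800f_ffff_ffff_ffff;
    // Force the biased exponent to 1023 so the magnitude lands in [1, 2).
    f64::from_bits(sign_and_fraction | (1023u64 << 52))
}
```

Under these assumptions 10.0 maps to 1.25 (since 10 = 1.25 * 2^3) and -6.0 maps to -1.5.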
_mm512_maskz_insertf32x4^{⚠}  Experimentalavx512f Copy a to tmp, then insert 128 bits (composed of 4 packed singleprecision (32bit) floatingpoint elements) from b into tmp at the location specified by imm8. Store tmp to dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_insertf64x4^{⚠}  Experimentalavx512f Copy a to tmp, then insert 256 bits (composed of 4 packed doubleprecision (64bit) floatingpoint elements) from b into tmp at the location specified by imm8. Store tmp to dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_inserti32x4^{⚠}  Experimentalavx512f Copy a to tmp, then insert 128 bits (composed of 4 packed 32bit integers) from b into tmp at the location specified by imm8. Store tmp to dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_inserti64x4^{⚠}  Experimentalavx512f Copy a to tmp, then insert 256 bits (composed of 4 packed 64bit integers) from b into tmp at the location specified by imm8. Store tmp to dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_max_epi32^{⚠}  Experimentalavx512f Compare packed signed 32bit integers in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_max_epi64^{⚠}  Experimentalavx512f Compare packed signed 64bit integers in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_max_epu32^{⚠}  Experimentalavx512f Compare packed unsigned 32bit integers in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_max_epu64^{⚠}  Experimentalavx512f Compare packed unsigned 64bit integers in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_max_pd^{⚠}  Experimentalavx512f Compare packed doubleprecision (64bit) floatingpoint elements in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_max_ps^{⚠}  Experimentalavx512f Compare packed singleprecision (32bit) floatingpoint elements in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_max_round_pd^{⚠}  Experimentalavx512f Compare packed doubleprecision (64bit) floatingpoint elements in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_max_round_ps^{⚠}  Experimentalavx512f Compare packed singleprecision (32bit) floatingpoint elements in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_min_epi32^{⚠}  Experimentalavx512f Compare packed signed 32bit integers in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_min_epi64^{⚠}  Experimentalavx512f Compare packed signed 64bit integers in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_min_epu32^{⚠}  Experimentalavx512f Compare packed unsigned 32bit integers in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_min_epu64^{⚠}  Experimentalavx512f Compare packed unsigned 64bit integers in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_min_pd^{⚠}  Experimentalavx512f Compare packed doubleprecision (64bit) floatingpoint elements in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_min_ps^{⚠}  Experimentalavx512f Compare packed singleprecision (32bit) floatingpoint elements in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_min_round_pd^{⚠}  Experimentalavx512f Compare packed doubleprecision (64bit) floatingpoint elements in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_min_round_ps^{⚠}  Experimentalavx512f Compare packed singleprecision (32bit) floatingpoint elements in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_mov_epi32^{⚠}  Experimentalavx512f Move packed 32bit integers from a into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_mov_epi64^{⚠}  Experimentalavx512f Move packed 64bit integers from a into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_mov_pd^{⚠}  Experimentalavx512f Move packed doubleprecision (64bit) floatingpoint elements from a into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_mov_ps^{⚠}  Experimentalavx512f Move packed singleprecision (32bit) floatingpoint elements from a into dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
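The zeromask behaviour shared by every entry in this list is easiest to see on the plain move. The following scalar sketch models it for sixteen 32bit lanes; the function name is hypothetical and the code is a portable model, not a call to the intrinsic.

```rust
/// Hypothetical scalar model of a zero-masked move of sixteen i32 lanes.
/// Bit i of `k` controls lane i: set copies a[i], clear writes 0.
fn maskz_mov_epi32_model(k: u16, a: [i32; 16]) -> [i32; 16] {
    let mut dst = [0i32; 16];
    for i in 0..16 {
        if (k >> i) & 1 == 1 {
            dst[i] = a[i];
        }
    }
    dst
}
```

With lanes holding 1 through 16 and a mask of 0b1000_0000_0000_0011, only lanes 0, 1, and 15 keep their values; every other lane is zero.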
_mm512_maskz_movedup_pd^{⚠}  Experimentalavx512f Duplicate evenindexed doubleprecision (64bit) floatingpoint elements from a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_movehdup_ps^{⚠}  Experimentalavx512f Duplicate oddindexed singleprecision (32bit) floatingpoint elements from a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_moveldup_ps^{⚠}  Experimentalavx512f Duplicate evenindexed singleprecision (32bit) floatingpoint elements from a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_mul_epi32^{⚠}  Experimentalavx512f Multiply the low signed 32bit integers from each packed 64bit element in a and b, and store the signed 64bit results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_mul_epu32^{⚠}  Experimentalavx512f Multiply the low unsigned 32bit integers from each packed 64bit element in a and b, and store the unsigned 64bit results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
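The two widening multiplies differ only in how the low 32 bits of each 64bit element are extended before multiplying. A per-lane scalar sketch, with hypothetical function names, might look like this:

```rust
/// Hypothetical per-lane model of mul_epi32: sign-extend the low 32 bits
/// of each 64-bit element, then multiply to a signed 64-bit result.
fn mul_epi32_lane_model(a: i64, b: i64) -> i64 {
    // `as i32` truncates to the low 32 bits; `as i64` sign-extends.
    (a as i32 as i64) * (b as i32 as i64)
}

/// Hypothetical per-lane model of mul_epu32: zero-extend the low 32 bits
/// of each 64-bit element, then multiply to an unsigned 64-bit result.
fn mul_epu32_lane_model(a: u64, b: u64) -> u64 {
    // `as u32` truncates to the low 32 bits; `as u64` zero-extends.
    (a as u32 as u64) * (b as u32 as u64)
}
```

An element whose low half is 0xFFFF_FFFF is read as -1 by the signed model and as 4294967295 by the unsigned one, so multiplying it by 2 gives -2 and 8589934590 respectively. The upper 32 bits of each input element are ignored in both cases.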
_mm512_maskz_mul_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_mul_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_mul_round_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_mul_round_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_mullo_epi32^{⚠}  Experimentalavx512f Multiply the packed 32bit integers in a and b, producing intermediate 64bit integers, and store the low 32 bits of the intermediate integers in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_or_epi32^{⚠}  Experimentalavx512f Compute the bitwise OR of packed 32bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_or_epi64^{⚠}  Experimentalavx512f Compute the bitwise OR of packed 64bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permute_pd^{⚠}  Experimentalavx512f Shuffle doubleprecision (64bit) floatingpoint elements in a within 128bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permute_ps^{⚠}  Experimentalavx512f Shuffle singleprecision (32bit) floatingpoint elements in a within 128bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permutevar_pd^{⚠}  Experimentalavx512f Shuffle doubleprecision (64bit) floatingpoint elements in a within 128bit lanes using the control in b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permutevar_ps^{⚠}  Experimentalavx512f Shuffle singleprecision (32bit) floatingpoint elements in a within 128bit lanes using the control in b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permutex2var_epi32^{⚠}  Experimentalavx512f Shuffle 32bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permutex2var_epi64^{⚠}  Experimentalavx512f Shuffle 64bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permutex2var_pd^{⚠}  Experimentalavx512f Shuffle doubleprecision (64bit) floatingpoint elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permutex2var_ps^{⚠}  Experimentalavx512f Shuffle singleprecision (32bit) floatingpoint elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
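The "selector and index" wording of the permutex2var entries can be sketched for the 32bit integer case as follows. This is a scalar model with a hypothetical function name; it assumes that the low four bits of each idx element pick the lane and that the next bit picks the source (0 for a, 1 for b), which should be checked against the vendor reference.

```rust
/// Hypothetical scalar model of a zero-masked two-source permute of
/// sixteen i32 lanes. For each lane i:
///   idx[i] bits 0..=3 -> source lane, idx[i] bit 4 -> 0 = a, 1 = b.
fn maskz_permutex2var_epi32_model(
    k: u16,
    a: [i32; 16],
    idx: [i32; 16],
    b: [i32; 16],
) -> [i32; 16] {
    let mut dst = [0i32; 16];
    for i in 0..16 {
        if (k >> i) & 1 == 1 {
            let lane = (idx[i] & 0xF) as usize;
            let from_b = (idx[i] >> 4) & 1 == 1;
            dst[i] = if from_b { b[lane] } else { a[lane] };
        }
    }
    dst
}
```

An idx of 0, 16, 1, 17, ... interleaves the low halves of a and b, which is a common use of a two-source permute.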
_mm512_maskz_permutex_epi64^{⚠}  Experimentalavx512f Shuffle 64bit integers in a within 256bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permutex_pd^{⚠}  Experimentalavx512f Shuffle doubleprecision (64bit) floatingpoint elements in a within 256bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permutexvar_epi32^{⚠}  Experimentalavx512f Shuffle 32bit integers in a across lanes using the corresponding index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permutexvar_epi64^{⚠}  Experimentalavx512f Shuffle 64bit integers in a across lanes using the corresponding index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permutexvar_pd^{⚠}  Experimentalavx512f Shuffle doubleprecision (64bit) floatingpoint elements in a across lanes using the corresponding index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permutexvar_ps^{⚠}  Experimentalavx512f Shuffle singleprecision (32bit) floatingpoint elements in a across lanes using the corresponding index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_rcp14_pd^{⚠}  Experimentalavx512f Compute the approximate reciprocal of packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14. 
_mm512_maskz_rcp14_ps^{⚠}  Experimentalavx512f Compute the approximate reciprocal of packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14. 
_mm512_maskz_rol_epi32^{⚠}  Experimentalavx512f Rotate the bits in each packed 32bit integer in a to the left by the number of bits specified in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_rol_epi64^{⚠}  Experimentalavx512f Rotate the bits in each packed 64bit integer in a to the left by the number of bits specified in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_rolv_epi32^{⚠}  Experimentalavx512f Rotate the bits in each packed 32bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_rolv_epi64^{⚠}  Experimentalavx512f Rotate the bits in each packed 64bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_ror_epi32^{⚠}  Experimentalavx512f Rotate the bits in each packed 32bit integer in a to the right by the number of bits specified in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_ror_epi64^{⚠}  Experimentalavx512f Rotate the bits in each packed 64bit integer in a to the right by the number of bits specified in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_rorv_epi32^{⚠}  Experimentalavx512f Rotate the bits in each packed 32bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_rorv_epi64^{⚠}  Experimentalavx512f Rotate the bits in each packed 64bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
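The variable rotates map directly onto Rust's `rotate_left` and `rotate_right` on `u32`. The sketch below is a scalar model of the zero-masked 32bit forms with hypothetical function names; it assumes the per-lane count is taken modulo 32.

```rust
/// Hypothetical scalar model of a zero-masked per-lane left rotate.
/// The count for lane i comes from b[i], reduced modulo 32 (assumption).
fn maskz_rolv_epi32_model(k: u16, a: [u32; 16], b: [u32; 16]) -> [u32; 16] {
    let mut dst = [0u32; 16];
    for i in 0..16 {
        if (k >> i) & 1 == 1 {
            dst[i] = a[i].rotate_left(b[i] % 32);
        }
    }
    dst
}

/// Same model for the right rotate.
fn maskz_rorv_epi32_model(k: u16, a: [u32; 16], b: [u32; 16]) -> [u32; 16] {
    let mut dst = [0u32; 16];
    for i in 0..16 {
        if (k >> i) & 1 == 1 {
            dst[i] = a[i].rotate_right(b[i] % 32);
        }
    }
    dst
}
```

Rotating 0x8000_0001 left by one wraps the top bit around to give 0x0000_0003, while rotating it right by one gives 0xC000_0000. No bits are lost, unlike the shift entries in this list.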
_mm512_maskz_roundscale_pd^{⚠}  Experimentalavx512f Round packed doubleprecision (64bit) floatingpoint elements in a to the number of fraction bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_roundscale_ps^{⚠}  Experimentalavx512f Round packed singleprecision (32bit) floatingpoint elements in a to the number of fraction bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_roundscale_round_pd^{⚠}  Experimentalavx512f Round packed doubleprecision (64bit) floatingpoint elements in a to the number of fraction bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_roundscale_round_ps^{⚠}  Experimentalavx512f Round packed singleprecision (32bit) floatingpoint elements in a to the number of fraction bits specified by imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_rsqrt14_pd^{⚠}  Experimentalavx512f Compute the approximate reciprocal square root of packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14. 
_mm512_maskz_rsqrt14_ps^{⚠}  Experimentalavx512f Compute the approximate reciprocal square root of packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14. 
_mm512_maskz_scalef_pd^{⚠}  Experimentalavx512f Scale the packed doubleprecision (64bit) floatingpoint elements in a using values from b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_scalef_ps^{⚠}  Experimentalavx512f Scale the packed singleprecision (32bit) floatingpoint elements in a using values from b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_scalef_round_pd^{⚠}  Experimentalavx512f Scale the packed doubleprecision (64bit) floatingpoint elements in a using values from b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_scalef_round_ps^{⚠}  Experimentalavx512f Scale the packed singleprecision (32bit) floatingpoint elements in a using values from b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
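"Scale ... using values from b" is terse; the usual reading is that each element of a is multiplied by 2 raised to the floor of the matching element of b. The scalar sketch below models that reading for ordinary finite inputs only. The function names are hypothetical, and special cases (NaN, infinities, denormals, overflow) are not modeled.

```rust
/// Hypothetical scalar model of scalef for one lane: a * 2^floor(b).
/// Assumes finite inputs with a small exponent range.
fn scalef_model(a: f64, b: f64) -> f64 {
    a * 2f64.powi(b.floor() as i32)
}

/// Zero-masked version over eight lanes.
fn maskz_scalef_pd_model(k: u8, a: [f64; 8], b: [f64; 8]) -> [f64; 8] {
    let mut dst = [0.0f64; 8];
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            dst[i] = scalef_model(a[i], b[i]);
        }
    }
    dst
}
```

Under this model, scaling 3.0 by 2.0 or by 2.9 both give 12.0, and scaling 3.0 by -1.5 gives 0.75 because floor(-1.5) is -2.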
_mm512_maskz_set1_epi32^{⚠}  Experimentalavx512f Broadcast 32bit integer a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_set1_epi64^{⚠}  Experimentalavx512f Broadcast 64bit integer a to all elements of dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_shuffle_epi32^{⚠}  Experimentalavx512f Shuffle 32bit integers in a within 128bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_shuffle_f32x4^{⚠}  Experimentalavx512f Shuffle 128bits (composed of 4 singleprecision (32bit) floatingpoint elements) selected by imm8 from a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_shuffle_f64x2^{⚠}  Experimentalavx512f Shuffle 128bits (composed of 2 doubleprecision (64bit) floatingpoint elements) selected by imm8 from a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_shuffle_i32x4^{⚠}  Experimentalavx512f Shuffle 128bits (composed of 4 32bit integers) selected by imm8 from a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_shuffle_i64x2^{⚠}  Experimentalavx512f Shuffle 128bits (composed of 2 64bit integers) selected by imm8 from a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_shuffle_pd^{⚠}  Experimentalavx512f Shuffle doubleprecision (64bit) floatingpoint elements within 128bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_shuffle_ps^{⚠}  Experimentalavx512f Shuffle singleprecision (32bit) floatingpoint elements in a within 128bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sll_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a left by count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sll_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a left by count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_slli_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a left by imm8 while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_slli_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a left by imm8 while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sllv_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a left by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sllv_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a left by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sqrt_pd^{⚠}  Experimentalavx512f Compute the square root of packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sqrt_ps^{⚠}  Experimentalavx512f Compute the square root of packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sqrt_round_pd^{⚠}  Experimentalavx512f Compute the square root of packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sqrt_round_ps^{⚠}  Experimentalavx512f Compute the square root of packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sra_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a right by count while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sra_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a right by count while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_srai_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a right by imm8 while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_srai_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a right by imm8 while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_srav_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_srav_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_srl_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a right by count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_srl_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a right by count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_srli_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a right by imm8 while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_srli_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a right by imm8 while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_srlv_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a right by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_srlv_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a right by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sub_epi32^{⚠}  Experimentalavx512f Subtract packed 32bit integers in b from packed 32bit integers in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sub_epi64^{⚠}  Experimentalavx512f Subtract packed 64bit integers in b from packed 64bit integers in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sub_pd^{⚠}  Experimentalavx512f Subtract packed doubleprecision (64bit) floatingpoint elements in b from packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sub_ps^{⚠}  Experimentalavx512f Subtract packed singleprecision (32bit) floatingpoint elements in b from packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sub_round_pd^{⚠}  Experimentalavx512f Subtract packed doubleprecision (64bit) floatingpoint elements in b from packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sub_round_ps^{⚠}  Experimentalavx512f Subtract packed singleprecision (32bit) floatingpoint elements in b from packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_ternarylogic_epi32^{⚠}  Experimentalavx512f Bitwise ternary logic that provides the capability to implement any threeoperand binary function; the specific binary function is specified by the value in imm8. For each bit in each packed 32bit integer, the corresponding bits from a, b, and c are used to form a 3 bit index into imm8, and the value at that bit in imm8 is written to the corresponding bit in dst using zeromask k at 32bit granularity (32bit elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_ternarylogic_epi64^{⚠}  Experimentalavx512f Bitwise ternary logic that provides the capability to implement any threeoperand binary function; the specific binary function is specified by the value in imm8. For each bit in each packed 64bit integer, the corresponding bits from a, b, and c are used to form a 3 bit index into imm8, and the value at that bit in imm8 is written to the corresponding bit in dst using zeromask k at 64bit granularity (64bit elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_unpackhi_epi32^{⚠}  Experimentalavx512f Unpack and interleave 32bit integers from the high half of each 128bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_unpackhi_epi64^{⚠}  Experimentalavx512f Unpack and interleave 64bit integers from the high half of each 128bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_unpackhi_pd^{⚠}  Experimentalavx512f Unpack and interleave doubleprecision (64bit) floatingpoint elements from the high half of each 128bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_unpackhi_ps^{⚠}  Experimentalavx512f Unpack and interleave singleprecision (32bit) floatingpoint elements from the high half of each 128bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_unpacklo_epi32^{⚠}  Experimentalavx512f Unpack and interleave 32bit integers from the low half of each 128bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_unpacklo_epi64^{⚠}  Experimentalavx512f Unpack and interleave 64bit integers from the low half of each 128bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_unpacklo_pd^{⚠}  Experimentalavx512f Unpack and interleave doubleprecision (64bit) floatingpoint elements from the low half of each 128bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_unpacklo_ps^{⚠}  Experimentalavx512f Unpack and interleave singleprecision (32bit) floatingpoint elements from the low half of each 128bit lane in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_xor_epi32^{⚠}  Experimentalavx512f Compute the bitwise XOR of packed 32bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_xor_epi64^{⚠}  Experimentalavx512f Compute the bitwise XOR of packed 64bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_max_epi32^{⚠}  Experimentalavx512f Compare packed signed 32bit integers in a and b, and store packed maximum values in dst. 
_mm512_max_epi64^{⚠}  Experimentalavx512f Compare packed signed 64bit integers in a and b, and store packed maximum values in dst. 
_mm512_max_epu32^{⚠}  Experimentalavx512f Compare packed unsigned 32bit integers in a and b, and store packed maximum values in dst. 
_mm512_max_epu64^{⚠}  Experimentalavx512f Compare packed unsigned 64bit integers in a and b, and store packed maximum values in dst. 
_mm512_max_pd^{⚠}  Experimentalavx512f Compare packed doubleprecision (64bit) floatingpoint elements in a and b, and store packed maximum values in dst. 
_mm512_max_ps^{⚠}  Experimentalavx512f Compare packed singleprecision (32bit) floatingpoint elements in a and b, and store packed maximum values in dst. 
_mm512_max_round_pd^{⚠}  Experimentalavx512f Compare packed doubleprecision (64bit) floatingpoint elements in a and b, and store packed maximum values in dst. 
_mm512_max_round_ps^{⚠}  Experimentalavx512f Compare packed singleprecision (32bit) floatingpoint elements in a and b, and store packed maximum values in dst. 
_mm512_min_epi32^{⚠}  Experimentalavx512f Compare packed signed 32bit integers in a and b, and store packed minimum values in dst. 
_mm512_min_epi64^{⚠}  Experimentalavx512f Compare packed signed 64bit integers in a and b, and store packed minimum values in dst. 
_mm512_min_epu32^{⚠}  Experimentalavx512f Compare packed unsigned 32bit integers in a and b, and store packed minimum values in dst. 
_mm512_min_epu64^{⚠}  Experimentalavx512f Compare packed unsigned 64bit integers in a and b, and store packed minimum values in dst. 
_mm512_min_pd^{⚠}  Experimentalavx512f Compare packed doubleprecision (64bit) floatingpoint elements in a and b, and store packed minimum values in dst. 
_mm512_min_ps^{⚠}  Experimentalavx512f Compare packed singleprecision (32bit) floatingpoint elements in a and b, and store packed minimum values in dst. 
_mm512_min_round_pd^{⚠}  Experimentalavx512f Compare packed doubleprecision (64bit) floatingpoint elements in a and b, and store packed minimum values in dst. 
_mm512_min_round_ps^{⚠}  Experimentalavx512f Compare packed singleprecision (32bit) floatingpoint elements in a and b, and store packed minimum values in dst. 
_mm512_movedup_pd^{⚠}  Experimentalavx512f Duplicate evenindexed doubleprecision (64bit) floatingpoint elements from a, and store the results in dst. 
_mm512_movehdup_ps^{⚠}  Experimentalavx512f Duplicate oddindexed singleprecision (32bit) floatingpoint elements from a, and store the results in dst. 
_mm512_moveldup_ps^{⚠}  Experimentalavx512f Duplicate evenindexed singleprecision (32bit) floatingpoint elements from a, and store the results in dst. 
_mm512_mul_epi32^{⚠}  Experimentalavx512f Multiply the low signed 32bit integers from each packed 64bit element in a and b, and store the signed 64bit results in dst. 
_mm512_mul_epu32^{⚠}  Experimentalavx512f Multiply the low unsigned 32bit integers from each packed 64bit element in a and b, and store the unsigned 64bit results in dst. 
_mm512_mul_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, and store the results in dst. 
_mm512_mul_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, and store the results in dst. 
_mm512_mul_round_pd^{⚠}  Experimentalavx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, and store the results in dst. 
_mm512_mul_round_ps^{⚠}  Experimentalavx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, and store the results in dst. 
_mm512_mullo_epi32^{⚠}  Experimentalavx512f Multiply the packed 32bit integers in a and b, producing intermediate 64bit integers, and store the low 32 bits of the intermediate integers in dst. 
_mm512_mullox_epi64^{⚠}  Experimentalavx512f Multiplies elements in packed 64bit integer vectors a and b together, storing the lower 64 bits of the result in dst. 
_mm512_or_epi32^{⚠}  Experimentalavx512f Compute the bitwise OR of packed 32bit integers in a and b, and store the results in dst. 
_mm512_or_epi64^{⚠}  Experimentalavx512f Compute the bitwise OR of packed 64bit integers in a and b, and store the result in dst. 
_mm512_or_si512^{⚠}  Experimentalavx512f Compute the bitwise OR of 512 bits (representing integer data) in a and b, and store the result in dst. 
_mm512_permute_pd^{⚠}  Experimentalavx512f Shuffle doubleprecision (64bit) floatingpoint elements in a within 128bit lanes using the control in imm8, and store the results in dst. 
_mm512_permute_ps^{⚠}  Experimentalavx512f Shuffle singleprecision (32bit) floatingpoint elements in a within 128bit lanes using the control in imm8, and store the results in dst. 
_mm512_permutevar_epi32^{⚠}  Experimentalavx512f Shuffle 32bit integers in a across lanes using the corresponding index in idx, and store the results in dst. Note that this intrinsic shuffles across 128bit lanes, unlike past intrinsics that use the permutevar name. This intrinsic is identical to _mm512_permutexvar_epi32, and it is recommended that you use that intrinsic name. 
_mm512_permutevar_pd^{⚠}  Experimentalavx512f Shuffle doubleprecision (64bit) floatingpoint elements in a within 128bit lanes using the control in b, and store the results in dst. 
_mm512_permutevar_ps^{⚠}  Experimentalavx512f Shuffle singleprecision (32bit) floatingpoint elements in a within 128bit lanes using the control in b, and store the results in dst. 
_mm512_permutex2var_epi32^{⚠}  Experimentalavx512f Shuffle 32bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst. 
_mm512_permutex2var_epi64^{⚠}  Experimentalavx512f Shuffle 64bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst. 
_mm512_permutex2var_pd^{⚠}  Experimentalavx512f Shuffle doubleprecision (64bit) floatingpoint elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst. 
_mm512_permutex2var_ps^{⚠}  Experimentalavx512f Shuffle singleprecision (32bit) floatingpoint elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst. 
_mm512_permutex_epi64^{⚠}  Experimentalavx512f Shuffle 64bit integers in a within 256bit lanes using the control in imm8, and store the results in dst. 
_mm512_permutex_pd^{⚠}  Experimentalavx512f Shuffle doubleprecision (64bit) floatingpoint elements in a within 256bit lanes using the control in imm8, and store the results in dst. 
_mm512_permutexvar_epi32^{⚠}  Experimentalavx512f Shuffle 32bit integers in a across lanes using the corresponding index in idx, and store the results in dst. 
_mm512_permutexvar_epi64^{⚠}  Experimentalavx512f Shuffle 64bit integers in a across lanes using the corresponding index in idx, and store the results in dst. 
_mm512_permutexvar_pd^{⚠}  Experimentalavx512f Shuffle doubleprecision (64bit) floatingpoint elements in a across lanes using the corresponding index in idx, and store the results in dst. 
_mm512_permutexvar_ps^{⚠}  Experimentalavx512f Shuffle singleprecision (32bit) floatingpoint elements in a across lanes using the corresponding index in idx, and store the results in dst. 
_mm512_rcp14_pd^{⚠}  Experimentalavx512f Compute the approximate reciprocal of packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst. The maximum relative error for this approximation is less than 2^-14. 
_mm512_rcp14_ps^{⚠}  Experimentalavx512f Compute the approximate reciprocal of packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst. The maximum relative error for this approximation is less than 2^-14. 
_mm512_reduce_add_epi32^{⚠}  Experimentalavx512f Reduce the packed 32bit integers in a by addition. Returns the sum of all elements in a. 
_mm512_reduce_add_epi64^{⚠}  Experimentalavx512f Reduce the packed 64bit integers in a by addition. Returns the sum of all elements in a. 
_mm512_reduce_add_pd^{⚠}  Experimentalavx512f Reduce the packed doubleprecision (64bit) floatingpoint elements in a by addition. Returns the sum of all elements in a. 
_mm512_reduce_add_ps^{⚠}  Experimentalavx512f Reduce the packed singleprecision (32bit) floatingpoint elements in a by addition. Returns the sum of all elements in a. 
_mm512_reduce_and_epi32^{⚠}  Experimentalavx512f Reduce the packed 32bit integers in a by bitwise AND. Returns the bitwise AND of all elements in a. 
_mm512_reduce_and_epi64^{⚠}  Experimentalavx512f Reduce the packed 64bit integers in a by bitwise AND. Returns the bitwise AND of all elements in a. 
_mm512_reduce_max_epi32^{⚠}  Experimentalavx512f Reduce the packed signed 32bit integers in a by maximum. Returns the maximum of all elements in a. 
_mm512_reduce_max_epi64^{⚠}  Experimentalavx512f Reduce the packed signed 64bit integers in a by maximum. Returns the maximum of all elements in a. 
_mm512_reduce_max_epu32^{⚠}  Experimentalavx512f Reduce the packed unsigned 32bit integers in a by maximum. Returns the maximum of all elements in a. 
_mm512_reduce_max_epu64^{⚠}  Experimentalavx512f Reduce the packed unsigned 64bit integers in a by maximum. Returns the maximum of all elements in a. 
_mm512_reduce_max_pd^{⚠}  Experimentalavx512f Reduce the packed doubleprecision (64bit) floatingpoint elements in a by maximum. Returns the maximum of all elements in a. 
_mm512_reduce_max_ps^{⚠}  Experimentalavx512f Reduce the packed singleprecision (32bit) floatingpoint elements in a by maximum. Returns the maximum of all elements in a. 
_mm512_reduce_min_epi32^{⚠}  Experimentalavx512f Reduce the packed signed 32bit integers in a by minimum. Returns the minimum of all elements in a. 
_mm512_reduce_min_epi64^{⚠}  Experimentalavx512f Reduce the packed signed 64bit integers in a by minimum. Returns the minimum of all elements in a. 
_mm512_reduce_min_epu32^{⚠}  Experimentalavx512f Reduce the packed unsigned 32bit integers in a by minimum. Returns the minimum of all elements in a. 
_mm512_reduce_min_epu64^{⚠}  Experimentalavx512f Reduce the packed unsigned 64bit integers in a by minimum. Returns the minimum of all elements in a. 
_mm512_reduce_min_pd^{⚠}  Experimentalavx512f Reduce the packed doubleprecision (64bit) floatingpoint elements in a by minimum. Returns the minimum of all elements in a. 
_mm512_reduce_min_ps^{⚠}  Experimentalavx512f Reduce the packed singleprecision (32bit) floatingpoint elements in a by minimum. Returns the minimum of all elements in a. 
_mm512_reduce_mul_epi32^{⚠}  Experimentalavx512f Reduce the packed 32bit integers in a by multiplication. Returns the product of all elements in a. 
_mm512_reduce_mul_epi64^{⚠}  Experimentalavx512f Reduce the packed 64bit integers in a by multiplication. Returns the product of all elements in a. 
_mm512_reduce_mul_pd^{⚠}  Experimentalavx512f Reduce the packed doubleprecision (64bit) floatingpoint elements in a by multiplication. Returns the product of all elements in a. 
_mm512_reduce_mul_ps^{⚠}  Experimentalavx512f Reduce the packed singleprecision (32bit) floatingpoint elements in a by multiplication. Returns the product of all elements in a. 
_mm512_reduce_or_epi32^{⚠}  Experimentalavx512f Reduce the packed 32bit integers in a by bitwise OR. Returns the bitwise OR of all elements in a. 
_mm512_reduce_or_epi64^{⚠}  Experimentalavx512f Reduce the packed 64bit integers in a by bitwise OR. Returns the bitwise OR of all elements in a. 
_mm512_rol_epi32^{⚠}  Experimentalavx512f Rotate the bits in each packed 32bit integer in a to the left by the number of bits specified in imm8, and store the results in dst. 
_mm512_rol_epi64^{⚠}  Experimentalavx512f Rotate the bits in each packed 64bit integer in a to the left by the number of bits specified in imm8, and store the results in dst. 
_mm512_rolv_epi32^{⚠}  Experimentalavx512f Rotate the bits in each packed 32bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst. 
_mm512_rolv_epi64^{⚠}  Experimentalavx512f Rotate the bits in each packed 64bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst. 
_mm512_ror_epi32^{⚠}  Experimentalavx512f Rotate the bits in each packed 32bit integer in a to the right by the number of bits specified in imm8, and store the results in dst. 
_mm512_ror_epi64^{⚠}  Experimentalavx512f Rotate the bits in each packed 64bit integer in a to the right by the number of bits specified in imm8, and store the results in dst. 
_mm512_rorv_epi32^{⚠}  Experimentalavx512f Rotate the bits in each packed 32bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst. 
_mm512_rorv_epi64^{⚠}  Experimentalavx512f Rotate the bits in each packed 64bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst. 
_mm512_roundscale_pd^{⚠}  Experimentalavx512f Round packed doubleprecision (64bit) floatingpoint elements in a to the number of fraction bits specified by imm8, and store the results in dst. 
_mm512_roundscale_ps^{⚠}  Experimentalavx512f Round packed singleprecision (32bit) floatingpoint elements in a to the number of fraction bits specified by imm8, and store the results in dst. 
_mm512_roundscale_round_pd^{⚠}  Experimentalavx512f Round packed doubleprecision (64bit) floatingpoint elements in a to the number of fraction bits specified by imm8, and store the results in dst. 
_mm512_roundscale_round_ps^{⚠}  Experimentalavx512f Round packed singleprecision (32bit) floatingpoint elements in a to the number of fraction bits specified by imm8, and store the results in dst. 
_mm512_rsqrt14_pd^{⚠}  Experimentalavx512f Compute the approximate reciprocal square root of packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst. The maximum relative error for this approximation is less than 2^-14. 
_mm512_rsqrt14_ps^{⚠}  Experimentalavx512f Compute the approximate reciprocal square root of packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst. The maximum relative error for this approximation is less than 2^-14. 
_mm512_scalef_pd^{⚠}  Experimentalavx512f Scale the packed doubleprecision (64bit) floatingpoint elements in a using values from b, and store the results in dst. 
_mm512_scalef_ps^{⚠}  Experimentalavx512f Scale the packed singleprecision (32bit) floatingpoint elements in a using values from b, and store the results in dst. 
_mm512_scalef_round_pd^{⚠}  Experimentalavx512f Scale the packed doubleprecision (64bit) floatingpoint elements in a using values from b, and store the results in dst. 
_mm512_scalef_round_ps^{⚠}  Experimentalavx512f Scale the packed singleprecision (32bit) floatingpoint elements in a using values from b, and store the results in dst. 
_mm512_set1_epi8^{⚠}  Experimentalavx512f Broadcast 8bit integer a to all elements of dst. 
_mm512_set1_epi16^{⚠}  Experimentalavx512f Broadcast the low packed 16bit integer from a to all elements of dst. 
_mm512_set1_epi32^{⚠}  Experimentalavx512f Broadcast 32bit integer a to all elements of dst. 
_mm512_set1_epi64^{⚠}  Experimentalavx512f Broadcast 64bit integer a to all elements of dst. 
_mm512_set1_pd^{⚠}  Experimentalavx512f Broadcast 64bit float a to all elements of dst. 
_mm512_set1_ps^{⚠}  Experimentalavx512f Broadcast 32bit float a to all elements of dst. 
_mm512_set4_epi32^{⚠}  Experimentalavx512f Set packed 32bit integers in dst with the repeated 4 element sequence. 
_mm512_set4_epi64^{⚠}  Experimentalavx512f Set packed 64bit integers in dst with the repeated 4 element sequence. 
_mm512_set4_pd^{⚠}  Experimentalavx512f Set packed doubleprecision (64bit) floatingpoint elements in dst with the repeated 4 element sequence. 
_mm512_set4_ps^{⚠}  Experimentalavx512f Set packed singleprecision (32bit) floatingpoint elements in dst with the repeated 4 element sequence. 
_mm512_set_epi8^{⚠}  Experimentalavx512f Set packed 8bit integers in dst with the supplied values. 
_mm512_set_epi16^{⚠}  Experimentalavx512f Set packed 16bit integers in dst with the supplied values. 
_mm512_set_epi32^{⚠}  Experimentalavx512f Set packed 32bit integers in dst with the supplied values. 
_mm512_set_epi64^{⚠}  Experimentalavx512f Set packed 64bit integers in dst with the supplied values. 
_mm512_set_pd^{⚠}  Experimentalavx512f Set packed doubleprecision (64bit) floatingpoint elements in dst with the supplied values. 
_mm512_set_ps^{⚠}  Experimentalavx512f Set packed singleprecision (32bit) floatingpoint elements in dst with the supplied values. 
_mm512_setr4_epi32^{⚠}  Experimentalavx512f Set packed 32bit integers in dst with the repeated 4 element sequence in reverse order. 
_mm512_setr4_epi64^{⚠}  Experimentalavx512f Set packed 64bit integers in dst with the repeated 4 element sequence in reverse order. 
_mm512_setr4_pd^{⚠}  Experimentalavx512f Set packed doubleprecision (64bit) floatingpoint elements in dst with the repeated 4 element sequence in reverse order. 
_mm512_setr4_ps^{⚠}  Experimentalavx512f Set packed singleprecision (32bit) floatingpoint elements in dst with the repeated 4 element sequence in reverse order. 
_mm512_setr_epi32^{⚠}  Experimentalavx512f Set packed 32bit integers in dst with the supplied values in reverse order. 
_mm512_setr_epi64^{⚠}  Experimentalavx512f Set packed 64bit integers in dst with the supplied values in reverse order. 
_mm512_setr_pd^{⚠}  Experimentalavx512f Set packed doubleprecision (64bit) floatingpoint elements in dst with the supplied values in reverse order. 
_mm512_setr_ps^{⚠}  Experimentalavx512f Set packed singleprecision (32bit) floatingpoint elements in dst with the supplied values in reverse order. 
_mm512_setzero^{⚠}  Experimentalavx512f Return vector of type __m512 with all elements set to zero. 
_mm512_setzero_epi32^{⚠}  Experimentalavx512f Return vector of type __m512i with all elements set to zero. 
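_mm512_setzero_pd^{⚠}  Experimentalavx512f Return vector of type __m512d with all elements set to zero. 
_mm512_setzero_ps^{⚠}  Experimentalavx512f Return vector of type __m512 with all elements set to zero. 
_mm512_setzero_si512^{⚠}  Experimentalavx512f Return vector of type __m512i with all elements set to zero. 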
_mm512_shuffle_epi32^{⚠}  Experimentalavx512f Shuffle 32bit integers in a within 128bit lanes using the control in imm8, and store the results in dst. 
_mm512_shuffle_f32x4^{⚠}  Experimentalavx512f Shuffle 128bits (composed of 4 singleprecision (32bit) floatingpoint elements) selected by imm8 from a and b, and store the results in dst. 
_mm512_shuffle_f64x2^{⚠}  Experimentalavx512f Shuffle 128bits (composed of 2 doubleprecision (64bit) floatingpoint elements) selected by imm8 from a and b, and store the results in dst. 
_mm512_shuffle_i32x4^{⚠}  Experimentalavx512f Shuffle 128bits (composed of 4 32bit integers) selected by imm8 from a and b, and store the results in dst. 
_mm512_shuffle_i64x2^{⚠}  Experimentalavx512f Shuffle 128bits (composed of 2 64bit integers) selected by imm8 from a and b, and store the results in dst. 
_mm512_shuffle_pd^{⚠}  Experimentalavx512f Shuffle doubleprecision (64bit) floatingpoint elements within 128bit lanes using the control in imm8, and store the results in dst. 
_mm512_shuffle_ps^{⚠}  Experimentalavx512f Shuffle singleprecision (32bit) floatingpoint elements in a within 128bit lanes using the control in imm8, and store the results in dst. 
_mm512_sll_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a left by count while shifting in zeros, and store the results in dst. 
_mm512_sll_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a left by count while shifting in zeros, and store the results in dst. 
_mm512_slli_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a left by imm8 while shifting in zeros, and store the results in dst. 
_mm512_slli_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a left by imm8 while shifting in zeros, and store the results in dst. 
_mm512_sllv_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a left by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst. 
_mm512_sllv_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a left by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst. 
_mm512_sqrt_pd^{⚠}  Experimentalavx512f Compute the square root of packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst. 
_mm512_sqrt_ps^{⚠}  Experimentalavx512f Compute the square root of packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst. 
_mm512_sqrt_round_pd^{⚠}  Experimentalavx512f Compute the square root of packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst. 
_mm512_sqrt_round_ps^{⚠}  Experimentalavx512f Compute the square root of packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst. 
_mm512_sra_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a right by count while shifting in sign bits, and store the results in dst. 
_mm512_sra_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a right by count while shifting in sign bits, and store the results in dst. 
_mm512_srai_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a right by imm8 while shifting in sign bits, and store the results in dst. 
_mm512_srai_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a right by imm8 while shifting in sign bits, and store the results in dst. 
_mm512_srav_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst. 
_mm512_srav_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst. 
_mm512_srl_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a right by count while shifting in zeros, and store the results in dst. 
_mm512_srl_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a right by count while shifting in zeros, and store the results in dst. 
_mm512_srli_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a right by imm8 while shifting in zeros, and store the results in dst. 
_mm512_srli_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a right by imm8 while shifting in zeros, and store the results in dst. 
_mm512_srlv_epi32^{⚠}  Experimentalavx512f Shift packed 32bit integers in a right by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst. 
_mm512_srlv_epi64^{⚠}  Experimentalavx512f Shift packed 64bit integers in a right by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst. 
_mm512_store_epi32^{⚠}  Experimentalavx512f Store 512bits (composed of 16 packed 32bit integers) from a into memory. mem_addr must be aligned on a 64byte boundary or a generalprotection exception may be generated. 
_mm512_store_epi64^{⚠}  Experimentalavx512f Store 512bits (composed of 8 packed 64bit integers) from a into memory. mem_addr must be aligned on a 64byte boundary or a generalprotection exception may be generated. 
_mm512_store_pd^{⚠}  Experimentalavx512f Store 512bits (composed of 8 packed doubleprecision (64bit) floatingpoint elements) from a into memory. mem_addr must be aligned on a 64byte boundary or a generalprotection exception may be generated. 
_mm512_store_ps^{⚠}  Experimentalavx512f Store 512bits (composed of 16 packed singleprecision (32bit) floatingpoint elements) from a into memory. mem_addr must be aligned on a 64byte boundary or a generalprotection exception may be generated. 
_mm512_store_si512^{⚠}  Experimentalavx512f Store 512bits of integer data from a into memory. mem_addr must be aligned on a 64byte boundary or a generalprotection exception may be generated. 
_mm512_storeu_epi32^{⚠}  Experimentalavx512f Store 512bits (composed of 16 packed 32bit integers) from a into memory. mem_addr does not need to be aligned on any particular boundary. 
_mm512_storeu_epi64^{⚠}  Experimentalavx512f Store 512bits (composed of 8 packed 64bit integers) from a into memory. mem_addr does not need to be aligned on any particular boundary. 
_mm512_storeu_pd^{⚠}  Experimentalavx512f Store 512 bits (composed of 8 packed double-precision (64-bit) floating-point elements) from a into memory. mem_addr does not need to be aligned on any particular boundary. 
_mm512_storeu_si512^{⚠}  Experimentalavx512f Store 512bits of integer data from a into memory. mem_addr does not need to be aligned on any particular boundary. 
_mm512_stream_pd^{⚠}  Experimentalavx512f Store 512bits (composed of 8 packed doubleprecision (64bit) floatingpoint elements) from a into memory using a nontemporal memory hint. mem_addr must be aligned on a 64byte boundary or a generalprotection exception may be generated. 
_mm512_stream_ps^{⚠}  Experimentalavx512f Store 512bits (composed of 16 packed singleprecision (32bit) floatingpoint elements) from a into memory using a nontemporal memory hint. mem_addr must be aligned on a 64byte boundary or a generalprotection exception may be generated. 
_mm512_stream_si512^{⚠}  Experimentalavx512f Store 512bits of integer data from a into memory using a nontemporal memory hint. mem_addr must be aligned on a 64byte boundary or a generalprotection exception may be generated. 
_mm512_sub_epi32^{⚠}  Experimentalavx512f Subtract packed 32bit integers in b from packed 32bit integers in a, and store the results in dst. 
_mm512_sub_epi64^{⚠}  Experimentalavx512f Subtract packed 64bit integers in b from packed 64bit integers in a, and store the results in dst. 
_mm512_sub_pd^{⚠}  Experimentalavx512f Subtract packed doubleprecision (64bit) floatingpoint elements in b from packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst. 
_mm512_sub_ps^{⚠}  Experimentalavx512f Subtract packed singleprecision (32bit) floatingpoint elements in b from packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst. 
_mm512_sub_round_pd^{⚠}  Experimentalavx512f Subtract packed doubleprecision (64bit) floatingpoint elements in b from packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst. 
_mm512_sub_round_ps^{⚠}  Experimentalavx512f Subtract packed singleprecision (32bit) floatingpoint elements in b from packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst. 
_mm512_ternarylogic_epi32^{⚠}  Experimentalavx512f Bitwise ternary logic that provides the capability to implement any threeoperand binary function; the specific binary function is specified by value in imm8. For each bit in each packed 32bit integer, the corresponding bit from a, b, and c are used to form a 3 bit index into imm8, and the value at that bit in imm8 is written to the corresponding bit in dst. 
_mm512_ternarylogic_epi64^{⚠}  Experimentalavx512f Bitwise ternary logic that provides the capability to implement any threeoperand binary function; the specific binary function is specified by value in imm8. For each bit in each packed 64bit integer, the corresponding bit from a, b, and c are used to form a 3 bit index into imm8, and the value at that bit in imm8 is written to the corresponding bit in dst. 
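The imm8 lookup used by the ternarylogic intrinsics can be sketched in scalar Rust for a single 32-bit element. This is only an illustrative model, not the intrinsic itself: the function name `ternarylogic_u32` is hypothetical, and the index layout (a supplies bit 2, b bit 1, c bit 0) is an assumption taken from Intel's pseudocode rather than from the text above.

```rust
/// Scalar model of the per-bit truth-table lookup for one 32-bit element.
/// Hypothetical helper, not part of core::arch.
fn ternarylogic_u32(a: u32, b: u32, c: u32, imm8: u8) -> u32 {
    let mut dst = 0u32;
    for bit in 0..32u32 {
        // Assumed index layout: a -> bit 2, b -> bit 1, c -> bit 0.
        let idx = (((a >> bit) & 1) << 2) | (((b >> bit) & 1) << 1) | ((c >> bit) & 1);
        // imm8 acts as an 8-entry truth table; pick the entry at `idx`.
        let out = ((imm8 >> idx) & 1) as u32;
        dst |= out << bit;
    }
    dst
}
```

Under this layout, imm8 = 0x96 yields a three-way XOR and imm8 = 0xCA yields a bitwise select (b where a is set, c elsewhere).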
_mm512_test_epi32_mask^{⚠}  Experimentalavx512f Compute the bitwise AND of packed 32bit integers in a and b, producing intermediate 32bit values, and set the corresponding bit in result mask k if the intermediate value is nonzero. 
_mm512_test_epi64_mask^{⚠}  Experimentalavx512f Compute the bitwise AND of packed 64bit integers in a and b, producing intermediate 64bit values, and set the corresponding bit in result mask k if the intermediate value is nonzero. 
_mm512_testn_epi32_mask^{⚠}  Experimentalavx512f Compute the bitwise AND of packed 32-bit integers in a and b, producing intermediate 32-bit values, and set the corresponding bit in result mask k if the intermediate value is zero. 
_mm512_testn_epi64_mask^{⚠}  Experimentalavx512f Compute the bitwise AND of packed 64-bit integers in a and b, producing intermediate 64-bit values, and set the corresponding bit in result mask k if the intermediate value is zero. 
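A scalar sketch of the mask-producing tests may help here. It assumes, following Intel's pseudocode for the underlying instructions, that both variants AND the elements pairwise and differ only in whether a nonzero result (test) or a zero result (testn) sets the mask bit. The function names and the array-based signatures are hypothetical stand-ins for the 512-bit vector types.

```rust
/// Model of the 32-bit `test` variant: bit i of the mask is set
/// when a[i] & b[i] is nonzero. Hypothetical helper.
fn test_epi32_mask(a: &[u32; 16], b: &[u32; 16]) -> u16 {
    let mut k = 0u16;
    for i in 0..16 {
        if a[i] & b[i] != 0 {
            k |= 1u16 << i;
        }
    }
    k
}

/// Model of the 32-bit `testn` variant: bit i of the mask is set
/// when a[i] & b[i] is zero. Hypothetical helper.
fn testn_epi32_mask(a: &[u32; 16], b: &[u32; 16]) -> u16 {
    let mut k = 0u16;
    for i in 0..16 {
        if a[i] & b[i] == 0 {
            k |= 1u16 << i;
        }
    }
    k
}
```

In this model the two masks are always bitwise complements of each other.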
_mm512_undefined^{⚠}  Experimentalavx512f Return vector of type __m512 with undefined elements. 
_mm512_undefined_epi32^{⚠}  Experimentalavx512f Return vector of type __m512i with undefined elements. 
_mm512_undefined_pd^{⚠}  Experimentalavx512f Return vector of type __m512d with undefined elements. 
_mm512_undefined_ps^{⚠}  Experimentalavx512f Return vector of type __m512 with undefined elements. 
_mm512_unpackhi_epi32^{⚠}  Experimentalavx512f Unpack and interleave 32bit integers from the high half of each 128bit lane in a and b, and store the results in dst. 
_mm512_unpackhi_epi64^{⚠}  Experimentalavx512f Unpack and interleave 64bit integers from the high half of each 128bit lane in a and b, and store the results in dst. 
_mm512_unpackhi_pd^{⚠}  Experimentalavx512f Unpack and interleave doubleprecision (64bit) floatingpoint elements from the high half of each 128bit lane in a and b, and store the results in dst. 
_mm512_unpackhi_ps^{⚠}  Experimentalavx512f Unpack and interleave singleprecision (32bit) floatingpoint elements from the high half of each 128bit lane in a and b, and store the results in dst. 
_mm512_unpacklo_epi32^{⚠}  Experimentalavx512f Unpack and interleave 32bit integers from the low half of each 128bit lane in a and b, and store the results in dst. 
_mm512_unpacklo_epi64^{⚠}  Experimentalavx512f Unpack and interleave 64bit integers from the low half of each 128bit lane in a and b, and store the results in dst. 
_mm512_unpacklo_pd^{⚠}  Experimentalavx512f Unpack and interleave doubleprecision (64bit) floatingpoint elements from the low half of each 128bit lane in a and b, and store the results in dst. 
_mm512_unpacklo_ps^{⚠}  Experimentalavx512f Unpack and interleave singleprecision (32bit) floatingpoint elements from the low half of each 128bit lane in a and b, and store the results in dst. 
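The per-lane interleaving done by the unpack intrinsics is easier to see on plain arrays. The sketch below models the 32-bit variants on `[u32; 16]`, treating each group of four elements as one 128-bit lane. The function name `unpack_epi32` and the `high` flag are hypothetical, and the within-lane ordering (a element first, then the b element) is an assumption based on Intel's pseudocode.

```rust
/// Scalar model of unpacklo/unpackhi for 32-bit elements.
/// Each 128-bit lane holds four elements; `high` selects which half
/// of every lane is interleaved. Hypothetical helper.
fn unpack_epi32(a: &[u32; 16], b: &[u32; 16], high: bool) -> [u32; 16] {
    let mut dst = [0u32; 16];
    // Offset of the first source element inside each lane.
    let off = if high { 2 } else { 0 };
    for lane in 0..4 {
        let base = lane * 4;
        dst[base] = a[base + off];
        dst[base + 1] = b[base + off];
        dst[base + 2] = a[base + off + 1];
        dst[base + 3] = b[base + off + 1];
    }
    dst
}
```

Note that the interleave never crosses a lane boundary: elements 0..4 of the result depend only on elements 0..4 of a and b, and so on for the other three lanes.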
_mm512_xor_epi32^{⚠}  Experimentalavx512f Compute the bitwise XOR of packed 32bit integers in a and b, and store the results in dst. 
_mm512_xor_epi64^{⚠}  Experimentalavx512f Compute the bitwise XOR of packed 64bit integers in a and b, and store the results in dst. 
_mm512_xor_si512^{⚠}  Experimentalavx512f Compute the bitwise XOR of 512 bits (representing integer data) in a and b, and store the result in dst. 
_mm512_zextpd128_pd512^{⚠}  Experimentalavx512f Cast vector of type __m128d to type __m512d; the upper 384 bits of the result are zeroed. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. 
_mm512_zextpd256_pd512^{⚠}  Experimentalavx512f Cast vector of type __m256d to type __m512d; the upper 256 bits of the result are zeroed. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. 
_mm512_zextps128_ps512^{⚠}  Experimentalavx512f Cast vector of type __m128 to type __m512; the upper 384 bits of the result are zeroed. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. 
_mm512_zextps256_ps512^{⚠}  Experimentalavx512f Cast vector of type __m256 to type __m512; the upper 256 bits of the result are zeroed. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. 
_mm512_zextsi128_si512^{⚠}  Experimentalavx512f Cast vector of type __m128i to type __m512i; the upper 384 bits of the result are zeroed. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. 
_mm512_zextsi256_si512^{⚠}  Experimentalavx512f Cast vector of type __m256i to type __m512i; the upper 256 bits of the result are zeroed. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency. 
_mm_add_round_sd^{⚠}  Experimentalavx512f Add the lower doubleprecision (64bit) floatingpoint element in a and b, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. 
_mm_add_round_ss^{⚠}  Experimentalavx512f Add the lower singleprecision (32bit) floatingpoint element in a and b, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_cmp_round_sd_mask^{⚠}  Experimentalavx512f Compare the lower doubleprecision (64bit) floatingpoint element in a and b based on the comparison operand specified by imm8, and store the result in mask vector k. 
_mm_cmp_round_ss_mask^{⚠}  Experimentalavx512f Compare the lower singleprecision (32bit) floatingpoint element in a and b based on the comparison operand specified by imm8, and store the result in mask vector k. 
_mm_cmp_sd_mask^{⚠}  Experimentalavx512f Compare the lower doubleprecision (64bit) floatingpoint element in a and b based on the comparison operand specified by imm8, and store the result in mask vector k. 
_mm_cmp_ss_mask^{⚠}  Experimentalavx512f Compare the lower singleprecision (32bit) floatingpoint element in a and b based on the comparison operand specified by imm8, and store the result in mask vector k. 
_mm_comi_round_sd^{⚠}  Experimentalavx512f Compare the lower doubleprecision (64bit) floatingpoint element in a and b based on the comparison operand specified by imm8, and return the boolean result (0 or 1). 
_mm_comi_round_ss^{⚠}  Experimentalavx512f Compare the lower singleprecision (32bit) floatingpoint element in a and b based on the comparison operand specified by imm8, and return the boolean result (0 or 1). 
_mm_cvt_roundi32_ss^{⚠}  Experimentalavx512f Convert the signed 32bit integer b to a singleprecision (32bit) floatingpoint element, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_cvt_roundsd_i32^{⚠}  Experimentalavx512f Convert the lower double-precision (64-bit) floating-point element in a to a 32-bit integer, and store the result in dst. 
_mm_cvt_roundsd_si32^{⚠}  Experimentalavx512f Convert the lower doubleprecision (64bit) floatingpoint element in a to a 32bit integer, and store the result in dst. 
_mm_cvt_roundsd_ss^{⚠}  Experimentalavx512f Convert the lower doubleprecision (64bit) floatingpoint element in b to a singleprecision (32bit) floatingpoint element, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_cvt_roundsd_u32^{⚠}  Experimentalavx512f Convert the lower doubleprecision (64bit) floatingpoint element in a to an unsigned 32bit integer, and store the result in dst. 
_mm_cvt_roundsi32_ss^{⚠}  Experimentalavx512f Convert the signed 32bit integer b to a singleprecision (32bit) floatingpoint element, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_cvt_roundss_i32^{⚠}  Experimentalavx512f Convert the lower singleprecision (32bit) floatingpoint element in a to a 32bit integer, and store the result in dst. 
_mm_cvt_roundss_sd^{⚠}  Experimentalavx512f Convert the lower singleprecision (32bit) floatingpoint element in b to a doubleprecision (64bit) floatingpoint element, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. 
_mm_cvt_roundss_si32^{⚠}  Experimentalavx512f Convert the lower singleprecision (32bit) floatingpoint element in a to a 32bit integer, and store the result in dst. 
_mm_cvt_roundss_u32^{⚠}  Experimentalavx512f Convert the lower singleprecision (32bit) floatingpoint element in a to an unsigned 32bit integer, and store the result in dst. 
_mm_cvt_roundu32_ss^{⚠}  Experimentalavx512f Convert the unsigned 32bit integer b to a singleprecision (32bit) floatingpoint element, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_cvti32_sd^{⚠}  Experimentalavx512f Convert the signed 32bit integer b to a doubleprecision (64bit) floatingpoint element, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. 
_mm_cvti32_ss^{⚠}  Experimentalavx512f Convert the signed 32bit integer b to a singleprecision (32bit) floatingpoint element, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_cvtph_ps^{⚠}  Experimentalf16c Converts the 4 x 16-bit half-precision float values in the lowest 64 bits of the 128-bit vector a into 4 x 32-bit float values stored in a 128-bit wide vector. 
_mm_cvtps_ph^{⚠}  Experimentalf16c Converts the 4 x 32-bit float values in the 128-bit vector a into 4 x 16-bit half-precision float values stored in the lowest 64 bits of a 128-bit vector. 
_mm_cvtsd_i32^{⚠}  Experimentalavx512f Convert the lower doubleprecision (64bit) floatingpoint element in a to a 32bit integer, and store the result in dst. 
_mm_cvtsd_u32^{⚠}  Experimentalavx512f Convert the lower doubleprecision (64bit) floatingpoint element in a to an unsigned 32bit integer, and store the result in dst. 
_mm_cvtss_i32^{⚠}  Experimentalavx512f Convert the lower singleprecision (32bit) floatingpoint element in a to a 32bit integer, and store the result in dst. 
_mm_cvtss_u32^{⚠}  Experimentalavx512f Convert the lower singleprecision (32bit) floatingpoint element in a to an unsigned 32bit integer, and store the result in dst. 
_mm_cvtt_roundsd_i32^{⚠}  Experimentalavx512f Convert the lower doubleprecision (64bit) floatingpoint element in a to a 32bit integer with truncation, and store the result in dst. 
_mm_cvtt_roundsd_si32^{⚠}  Experimentalavx512f Convert the lower doubleprecision (64bit) floatingpoint element in a to a 32bit integer with truncation, and store the result in dst. 
_mm_cvtt_roundsd_u32^{⚠}  Experimentalavx512f Convert the lower doubleprecision (64bit) floatingpoint element in a to an unsigned 32bit integer with truncation, and store the result in dst. 
_mm_cvtt_roundss_i32^{⚠}  Experimentalavx512f Convert the lower singleprecision (32bit) floatingpoint element in a to a 32bit integer with truncation, and store the result in dst. 
_mm_cvtt_roundss_si32^{⚠}  Experimentalavx512f Convert the lower singleprecision (32bit) floatingpoint element in a to a 32bit integer with truncation, and store the result in dst. 
_mm_cvtt_roundss_u32^{⚠}  Experimentalavx512f Convert the lower singleprecision (32bit) floatingpoint element in a to an unsigned 32bit integer with truncation, and store the result in dst. 
_mm_cvttsd_i32^{⚠}  Experimentalavx512f Convert the lower doubleprecision (64bit) floatingpoint element in a to a 32bit integer with truncation, and store the result in dst. 
_mm_cvttsd_u32^{⚠}  Experimentalavx512f Convert the lower doubleprecision (64bit) floatingpoint element in a to an unsigned 32bit integer with truncation, and store the result in dst. 
_mm_cvttss_i32^{⚠}  Experimentalavx512f Convert the lower singleprecision (32bit) floatingpoint element in a to a 32bit integer with truncation, and store the result in dst. 
_mm_cvttss_u32^{⚠}  Experimentalavx512f Convert the lower singleprecision (32bit) floatingpoint element in a to an unsigned 32bit integer with truncation, and store the result in dst. 
_mm_cvtu32_sd^{⚠}  Experimentalavx512f Convert the unsigned 32bit integer b to a doubleprecision (64bit) floatingpoint element, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. 
_mm_cvtu32_ss^{⚠}  Experimentalavx512f Convert the unsigned 32bit integer b to a singleprecision (32bit) floatingpoint element, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_cvtu64_sd^{⚠}  Experimentalavx512f Convert the unsigned 64bit integer b to a doubleprecision (64bit) floatingpoint element, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. 
_mm_cvtu64_ss^{⚠}  Experimentalavx512f Convert the unsigned 64bit integer b to a singleprecision (32bit) floatingpoint element, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_div_round_sd^{⚠}  Experimentalavx512f Divide the lower doubleprecision (64bit) floatingpoint element in a by the lower doubleprecision (64bit) floatingpoint element in b, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. 
_mm_div_round_ss^{⚠}  Experimentalavx512f Divide the lower singleprecision (32bit) floatingpoint element in a by the lower singleprecision (32bit) floatingpoint element in b, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_fixupimm_round_sd^{⚠}  Experimentalavx512f Fix up the lower doubleprecision (64bit) floatingpoint elements in a and b using the lower 64bit integer in c, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. imm8 is used to set the required flags reporting. 
_mm_fixupimm_round_ss^{⚠}  Experimentalavx512f Fix up the lower singleprecision (32bit) floatingpoint elements in a and b using the lower 32bit integer in c, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. imm8 is used to set the required flags reporting. 
_mm_fixupimm_sd^{⚠}  Experimentalavx512f Fix up the lower doubleprecision (64bit) floatingpoint elements in a and b using the lower 64bit integer in c, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. imm8 is used to set the required flags reporting. 
_mm_fixupimm_ss^{⚠}  Experimentalavx512f Fix up the lower singleprecision (32bit) floatingpoint elements in a and b using the lower 32bit integer in c, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. imm8 is used to set the required flags reporting. 
_mm_fmadd_round_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. 
_mm_fmadd_round_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_fmsub_round_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. 
_mm_fmsub_round_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_fnmadd_round_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. 
_mm_fnmadd_round_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_fnmsub_round_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and subtract the lower element in c from the negated intermediate result. Store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. 
_mm_fnmsub_round_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, subtract the lower element in c from the negated intermediate result, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. 
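The four fused multiply-add flavours listed above differ only in two signs. A minimal scalar sketch using `f64::mul_add` is shown below. The helper names are hypothetical, the explicit rounding-mode argument of the `_round_` variants is not modelled, and `fmadd_sd` only illustrates the "lower element computed, upper element copied from a" layout on a two-element array.

```rust
/// a*b + c with a single rounding.
fn fmadd(a: f64, b: f64, c: f64) -> f64 {
    a.mul_add(b, c)
}

/// a*b - c
fn fmsub(a: f64, b: f64, c: f64) -> f64 {
    a.mul_add(b, -c)
}

/// -(a*b) + c
fn fnmadd(a: f64, b: f64, c: f64) -> f64 {
    (-a).mul_add(b, c)
}

/// -(a*b) - c
fn fnmsub(a: f64, b: f64, c: f64) -> f64 {
    (-a).mul_add(b, -c)
}

/// Hypothetical model of the scalar `_sd` layout: index 0 is the lower
/// element, index 1 the upper element, which is copied from `a`.
fn fmadd_sd(a: [f64; 2], b: [f64; 2], c: [f64; 2]) -> [f64; 2] {
    [fmadd(a[0], b[0], c[0]), a[1]]
}
```

For example, with a = 2, b = 3 and c = 4 the four helpers give 10, 2, -2 and -10 respectively.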
_mm_getexp_round_sd^{⚠}  Experimentalavx512f Convert the exponent of the lower doubleprecision (64bit) floatingpoint element in b to a doubleprecision (64bit) floatingpoint number representing the integer exponent, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. This intrinsic essentially calculates floor(log2(x)) for the lower element. 
_mm_getexp_round_ss^{⚠}  Experimentalavx512f Convert the exponent of the lower singleprecision (32bit) floatingpoint element in b to a singleprecision (32bit) floatingpoint number representing the integer exponent, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. This intrinsic essentially calculates floor(log2(x)) for the lower element. 
_mm_getexp_sd^{⚠}  Experimentalavx512f Convert the exponent of the lower doubleprecision (64bit) floatingpoint element in b to a doubleprecision (64bit) floatingpoint number representing the integer exponent, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. This intrinsic essentially calculates floor(log2(x)) for the lower element. 
_mm_getexp_ss^{⚠}  Experimentalavx512f Convert the exponent of the lower singleprecision (32bit) floatingpoint element in b to a singleprecision (32bit) floatingpoint number representing the integer exponent, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. This intrinsic essentially calculates floor(log2(x)) for the lower element. 
_mm_getmant_round_sd^{⚠}  Experimentalavx512f Normalize the mantissas of the lower doubleprecision (64bit) floatingpoint element in b, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. 
_mm_getmant_round_ss^{⚠}  Experimentalavx512f Normalize the mantissas of the lower singleprecision (32bit) floatingpoint element in b, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. 
_mm_getmant_sd^{⚠}  Experimentalavx512f Normalize the mantissas of the lower doubleprecision (64bit) floatingpoint element in b, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. 
_mm_getmant_ss^{⚠}  Experimentalavx512f Normalize the mantissas of the lower singleprecision (32bit) floatingpoint element in b, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. 
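The floor(log2(x)) and significand descriptions above can be illustrated by reading the IEEE 754 fields of an `f64` directly. This is a rough sketch under stated assumptions: the input is a normal, finite, nonzero value (zeros, denormals, infinities and NaNs are not modelled), only the [1, 2) interval is shown, and the sign is dropped. All function names are hypothetical.

```rust
/// floor(log2(|x|)) for a normal, finite, nonzero f64,
/// read from the biased exponent field.
fn getexp(x: f64) -> f64 {
    let biased = ((x.to_bits() >> 52) & 0x7ff) as i64;
    (biased - 1023) as f64
}

/// Significand of a normal, finite, nonzero f64 scaled into [1, 2),
/// with the sign bit cleared.
fn getmant_1_2(x: f64) -> f64 {
    let fraction = x.to_bits() & 0x000f_ffff_ffff_ffff;
    // Re-attach an exponent of zero (biased value 1023).
    f64::from_bits(fraction | (1023u64 << 52))
}

/// Hypothetical model of the `_sd` layout: the lower element comes
/// from `b`, the upper element is copied from `a`.
fn getexp_sd(a: [f64; 2], b: [f64; 2]) -> [f64; 2] {
    [getexp(b[0]), a[1]]
}
```

For such inputs the two results recombine into the magnitude of the input: 10.0 splits into an exponent of 3 and a significand of 1.25, and 1.25 * 2^3 = 10.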
_mm_madd52hi_epu64^{⚠}  Experimentalavx512ifma,avx512vl Multiply packed unsigned 52bit integers in each 64bit element of

_mm_madd52lo_epu64^{⚠}  Experimentalavx512ifma,avx512vl Multiply packed unsigned 52bit integers in each 64bit element of

_mm_mask3_fmadd_round_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper element from c to the upper element of dst. 
_mm_mask3_fmadd_round_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper 3 packed elements from c to the upper elements of dst. 
_mm_mask3_fmadd_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper element from c to the upper element of dst. 
_mm_mask3_fmadd_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper 3 packed elements from c to the upper elements of dst. 
_mm_mask3_fmsub_round_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper element from c to the upper element of dst. 
_mm_mask3_fmsub_round_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper 3 packed elements from c to the upper elements of dst. 
_mm_mask3_fmsub_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper element from c to the upper element of dst. 
_mm_mask3_fmsub_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper 3 packed elements from c to the upper elements of dst. 
_mm_mask3_fnmadd_round_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper element from c to the upper element of dst. 
_mm_mask3_fnmadd_round_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper 3 packed elements from c to the upper elements of dst. 
_mm_mask3_fnmadd_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper element from c to the upper element of dst. 
_mm_mask3_fnmadd_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper 3 packed elements from c to the upper elements of dst. 
_mm_mask3_fnmsub_round_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and subtract the lower element in c from the negated intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper element from c to the upper element of dst. 
_mm_mask3_fnmsub_round_ss^{⚠}  Experimentalavx512f Multiply the lower single-precision (32-bit) floating-point elements in a and b, and subtract the lower element in c from the negated intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper 3 packed elements from c to the upper elements of dst. 
_mm_mask3_fnmsub_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and subtract the lower element in c from the negated intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper element from c to the upper element of dst. 
_mm_mask3_fnmsub_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, and subtract the lower element in c from the negated intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from c when mask bit 0 is not set), and copy the upper 3 packed elements from c to the upper elements of dst. 
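For the mask3 variants, the third operand c doubles as the fallback value. The sketch below models the double-precision fmadd case on two-element arrays. The function name and signature are hypothetical stand-ins, only bit 0 of the mask is consulted, and the rounding-mode argument of the `_round_` variants is not modelled.

```rust
/// Scalar model of the mask3 fmadd layout. Index 0 is the lower element.
/// When mask bit 0 is clear the lower element is taken from `c`;
/// the upper element always comes from `c`. Hypothetical helper.
fn mask3_fmadd_sd(a: [f64; 2], b: [f64; 2], c: [f64; 2], k: u8) -> [f64; 2] {
    let low = if k & 1 != 0 {
        a[0].mul_add(b[0], c[0])
    } else {
        c[0]
    };
    [low, c[1]]
}
```

With a mask of 0 the result is therefore just c, which is what makes this form convenient for accumulating into c conditionally.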
_mm_mask_add_round_sd^{⚠}  Experimentalavx512f Add the lower doubleprecision (64bit) floatingpoint element in a and b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_add_round_ss^{⚠}  Experimentalavx512f Add the lower singleprecision (32bit) floatingpoint element in a and b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_add_sd^{⚠}  Experimentalavx512f Add the lower doubleprecision (64bit) floatingpoint element in a and b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_add_ss^{⚠}  Experimentalavx512f Add the lower singleprecision (32bit) floatingpoint element in a and b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_cmp_round_sd_mask^{⚠}  Experimentalavx512f Compare the lower doubleprecision (64bit) floatingpoint element in a and b based on the comparison operand specified by imm8, and store the result in mask vector k using zeromask k1 (the element is zeroed out when mask bit 0 is not set). 
_mm_mask_cmp_round_ss_mask^{⚠}  Experimentalavx512f Compare the lower single-precision (32-bit) floating-point element in a and b based on the comparison operand specified by imm8, and store the result in mask vector k using zeromask k1 (the element is zeroed out when mask bit 0 is not set). 
_mm_mask_cmp_sd_mask^{⚠}  Experimentalavx512f Compare the lower doubleprecision (64bit) floatingpoint element in a and b based on the comparison operand specified by imm8, and store the result in mask vector k using zeromask k1 (the element is zeroed out when mask bit 0 is not set). 
_mm_mask_cmp_ss_mask^{⚠}  Experimentalavx512f Compare the lower singleprecision (32bit) floatingpoint element in a and b based on the comparison operand specified by imm8, and store the result in mask vector k using zeromask k1 (the element is zeroed out when mask bit 0 is not set). 
_mm_mask_cvt_roundsd_ss^{⚠}  Experimentalavx512f Convert the lower doubleprecision (64bit) floatingpoint element in b to a singleprecision (32bit) floatingpoint element, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_cvt_roundss_sd^{⚠}  Experimentalavx512f Convert the lower singleprecision (32bit) floatingpoint element in b to a doubleprecision (64bit) floatingpoint element, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_cvtsd_ss^{⚠}  Experimentalavx512f Convert the lower doubleprecision (64bit) floatingpoint element in b to a singleprecision (32bit) floatingpoint element, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_cvtss_sd^{⚠}  Experimentalavx512f Convert the lower singleprecision (32bit) floatingpoint element in b to a doubleprecision (64bit) floatingpoint element, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_div_round_sd^{⚠}  Experimentalavx512f Divide the lower doubleprecision (64bit) floatingpoint element in a by the lower doubleprecision (64bit) floatingpoint element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_div_round_ss^{⚠}  Experimentalavx512f Divide the lower singleprecision (32bit) floatingpoint element in a by the lower singleprecision (32bit) floatingpoint element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_div_sd^{⚠}  Experimentalavx512f Divide the lower doubleprecision (64bit) floatingpoint element in a by the lower doubleprecision (64bit) floatingpoint element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_div_ss^{⚠}  Experimentalavx512f Divide the lower singleprecision (32bit) floatingpoint element in a by the lower singleprecision (32bit) floatingpoint element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_fixupimm_round_sd^{⚠}  Experimentalavx512f Fix up the lower doubleprecision (64bit) floatingpoint elements in a and b using the lower 64bit integer in c, store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. imm8 is used to set the required flags reporting. 
_mm_mask_fixupimm_round_ss^{⚠}  Experimentalavx512f Fix up the lower singleprecision (32bit) floatingpoint elements in a and b using the lower 32bit integer in c, store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. imm8 is used to set the required flags reporting. 
_mm_mask_fixupimm_sd^{⚠}  Experimentalavx512f Fix up the lower doubleprecision (64bit) floatingpoint elements in a and b using the lower 64bit integer in c, store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. imm8 is used to set the required flags reporting. 
_mm_mask_fixupimm_ss^{⚠}  Experimentalavx512f Fix up the lower singleprecision (32bit) floatingpoint elements in a and b using the lower 32bit integer in c, store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. imm8 is used to set the required flags reporting. 
_mm_mask_fmadd_round_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_fmadd_round_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_fmadd_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_fmadd_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_fmsub_round_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_fmsub_round_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_fmsub_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_fmsub_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_fnmadd_round_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_fnmadd_round_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_fnmadd_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_fnmadd_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_fnmsub_round_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and subtract the lower element in c from the negated intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_fnmsub_round_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, and subtract the lower element in c from the negated intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_fnmsub_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and subtract the lower element in c from the negated intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_fnmsub_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, and subtract the lower element in c from the negated intermediate result. Store the result in the lower element of dst using writemask k (the element is copied from a when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_getexp_round_sd^{⚠}  Experimentalavx512f Convert the exponent of the lower doubleprecision (64bit) floatingpoint element in b to a doubleprecision (64bit) floatingpoint number representing the integer exponent, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. This intrinsic essentially calculates floor(log2(x)) for the lower element. 
_mm_mask_getexp_round_ss^{⚠}  Experimentalavx512f Convert the exponent of the lower singleprecision (32bit) floatingpoint element in b to a singleprecision (32bit) floatingpoint number representing the integer exponent, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. This intrinsic essentially calculates floor(log2(x)) for the lower element. 
_mm_mask_getexp_sd^{⚠}  Experimentalavx512f Convert the exponent of the lower doubleprecision (64bit) floatingpoint element in b to a doubleprecision (64bit) floatingpoint number representing the integer exponent, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. This intrinsic essentially calculates floor(log2(x)) for the lower element. 
_mm_mask_getexp_ss^{⚠}  Experimentalavx512f Convert the exponent of the lower singleprecision (32bit) floatingpoint element in b to a singleprecision (32bit) floatingpoint number representing the integer exponent, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. This intrinsic essentially calculates floor(log2(x)) for the lower element. 
_mm_mask_getmant_round_sd^{⚠}  Experimentalavx512f Normalize the mantissas of the lower doubleprecision (64bit) floatingpoint element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. 
_mm_mask_getmant_round_ss^{⚠}  Experimentalavx512f Normalize the mantissas of the lower singleprecision (32bit) floatingpoint element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. 
_mm_mask_getmant_sd^{⚠}  Experimentalavx512f Normalize the mantissas of the lower doubleprecision (64bit) floatingpoint element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. 
_mm_mask_getmant_ss^{⚠}  Experimentalavx512f Normalize the mantissas of the lower singleprecision (32bit) floatingpoint element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. 
_mm_mask_max_round_sd^{⚠}  Experimentalavx512f Compare the lower doubleprecision (64bit) floatingpoint elements in a and b, store the maximum value in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_max_round_ss^{⚠}  Experimentalavx512f Compare the lower singleprecision (32bit) floatingpoint elements in a and b, store the maximum value in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_max_sd^{⚠}  Experimentalavx512f Compare the lower doubleprecision (64bit) floatingpoint elements in a and b, store the maximum value in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_max_ss^{⚠}  Experimentalavx512f Compare the lower singleprecision (32bit) floatingpoint elements in a and b, store the maximum value in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_min_round_sd^{⚠}  Experimentalavx512f Compare the lower doubleprecision (64bit) floatingpoint elements in a and b, store the minimum value in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_min_round_ss^{⚠}  Experimentalavx512f Compare the lower singleprecision (32bit) floatingpoint elements in a and b, store the minimum value in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_min_sd^{⚠}  Experimentalavx512f Compare the lower doubleprecision (64bit) floatingpoint elements in a and b, store the minimum value in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_min_ss^{⚠}  Experimentalavx512f Compare the lower singleprecision (32bit) floatingpoint elements in a and b, store the minimum value in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_move_sd^{⚠}  Experimentalavx512f Move the lower doubleprecision (64bit) floatingpoint element from b to the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_move_ss^{⚠}  Experimentalavx512f Move the lower singleprecision (32bit) floatingpoint element from b to the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_mul_round_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint element in a and b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_mul_round_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint element in a and b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_mul_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint element in a and b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_mul_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint element in a and b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_rcp14_sd^{⚠}  Experimentalavx512f Compute the approximate reciprocal of the lower doubleprecision (64bit) floatingpoint element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. The maximum relative error for this approximation is less than 2^-14. 
_mm_mask_rcp14_ss^{⚠}  Experimentalavx512f Compute the approximate reciprocal of the lower singleprecision (32bit) floatingpoint element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. The maximum relative error for this approximation is less than 2^-14. 
_mm_mask_roundscale_round_sd^{⚠}  Experimentalavx512f Round the lower doubleprecision (64bit) floatingpoint element in b to the number of fraction bits specified by imm8, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_roundscale_round_ss^{⚠}  Experimentalavx512f Round the lower singleprecision (32bit) floatingpoint element in b to the number of fraction bits specified by imm8, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_roundscale_sd^{⚠}  Experimentalavx512f Round the lower doubleprecision (64bit) floatingpoint element in b to the number of fraction bits specified by imm8, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_roundscale_ss^{⚠}  Experimentalavx512f Round the lower singleprecision (32bit) floatingpoint element in b to the number of fraction bits specified by imm8, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_rsqrt14_sd^{⚠}  Experimentalavx512f Compute the approximate reciprocal square root of the lower doubleprecision (64bit) floatingpoint element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. The maximum relative error for this approximation is less than 2^-14. 
_mm_mask_rsqrt14_ss^{⚠}  Experimentalavx512f Compute the approximate reciprocal square root of the lower singleprecision (32bit) floatingpoint element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. The maximum relative error for this approximation is less than 2^-14. 
_mm_mask_scalef_round_sd^{⚠}  Experimentalavx512f Scale the lower doubleprecision (64bit) floatingpoint element in a using the lower element of b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_scalef_round_ss^{⚠}  Experimentalavx512f Scale the lower singleprecision (32bit) floatingpoint element in a using the lower element of b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_scalef_sd^{⚠}  Experimentalavx512f Scale the lower doubleprecision (64bit) floatingpoint element in a using the lower element of b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_scalef_ss^{⚠}  Experimentalavx512f Scale the lower singleprecision (32bit) floatingpoint element in a using the lower element of b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_sqrt_round_sd^{⚠}  Experimentalavx512f Compute the square root of the lower doubleprecision (64bit) floatingpoint element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_sqrt_round_ss^{⚠}  Experimentalavx512f Compute the square root of the lower singleprecision (32bit) floatingpoint element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_sqrt_sd^{⚠}  Experimentalavx512f Compute the square root of the lower doubleprecision (64bit) floatingpoint element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_sqrt_ss^{⚠}  Experimentalavx512f Compute the square root of the lower singleprecision (32bit) floatingpoint element in b, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_sub_round_sd^{⚠}  Experimentalavx512f Subtract the lower doubleprecision (64bit) floatingpoint element in b from the lower doubleprecision (64bit) floatingpoint element in a, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_sub_round_ss^{⚠}  Experimentalavx512f Subtract the lower singleprecision (32bit) floatingpoint element in b from the lower singleprecision (32bit) floatingpoint element in a, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mask_sub_sd^{⚠}  Experimentalavx512f Subtract the lower doubleprecision (64bit) floatingpoint element in b from the lower doubleprecision (64bit) floatingpoint element in a, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_mask_sub_ss^{⚠}  Experimentalavx512f Subtract the lower singleprecision (32bit) floatingpoint element in b from the lower singleprecision (32bit) floatingpoint element in a, store the result in the lower element of dst using writemask k (the element is copied from src when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
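The _mm_mask_* entries above and the _mm_maskz_* entries below differ only in what the lower element becomes when mask bit 0 is clear. The following is a minimal scalar sketch of that difference for the double-precision add, in plain Rust. It does not call the intrinsics themselves: the function names mask_add_sd_model and maskz_add_sd_model are hypothetical, and a 128-bit vector is modelled as [f64; 2] with index 0 as the lower element. That layout is an assumption made for illustration only.

```rust
/// Hypothetical scalar model of the writemask behaviour described for
/// `_mm_mask_add_sd`: the lower element is `a + b` when bit 0 of `k` is set,
/// otherwise it is copied from `src`. The upper element always comes from `a`.
fn mask_add_sd_model(src: [f64; 2], k: u8, a: [f64; 2], b: [f64; 2]) -> [f64; 2] {
    let lower = if k & 1 != 0 { a[0] + b[0] } else { src[0] };
    [lower, a[1]]
}

/// Hypothetical scalar model of the zeromask behaviour described for
/// `_mm_maskz_add_sd`: the lower element is `a + b` when bit 0 of `k` is set,
/// otherwise it is zeroed. The upper element always comes from `a`.
fn maskz_add_sd_model(k: u8, a: [f64; 2], b: [f64; 2]) -> [f64; 2] {
    let lower = if k & 1 != 0 { a[0] + b[0] } else { 0.0 };
    [lower, a[1]]
}

fn main() {
    let src = [9.0, 9.0];
    let a = [1.0, 10.0];
    let b = [2.0, 20.0];

    // Mask bit 0 set: both variants compute the sum in the lower element.
    println!("{:?}", mask_add_sd_model(src, 1, a, b));
    println!("{:?}", maskz_add_sd_model(1, a, b));

    // Mask bit 0 clear: writemask falls back to src, zeromask writes 0.0.
    println!("{:?}", mask_add_sd_model(src, 0, a, b));
    println!("{:?}", maskz_add_sd_model(0, a, b));
}
```

Only bit 0 of the mask matters for these scalar forms, which is why the model tests k & 1 and ignores the remaining bits.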
_mm_maskz_add_round_sd^{⚠}  Experimentalavx512f Add the lower doubleprecision (64bit) floatingpoint element in a and b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_add_round_ss^{⚠}  Experimentalavx512f Add the lower singleprecision (32bit) floatingpoint element in a and b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_add_sd^{⚠}  Experimentalavx512f Add the lower doubleprecision (64bit) floatingpoint element in a and b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_add_ss^{⚠}  Experimentalavx512f Add the lower singleprecision (32bit) floatingpoint element in a and b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_cvt_roundsd_ss^{⚠}  Experimentalavx512f Convert the lower doubleprecision (64bit) floatingpoint element in b to a singleprecision (32bit) floatingpoint element, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_cvt_roundss_sd^{⚠}  Experimentalavx512f Convert the lower singleprecision (32bit) floatingpoint element in b to a doubleprecision (64bit) floatingpoint element, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_cvtsd_ss^{⚠}  Experimentalavx512f Convert the lower doubleprecision (64bit) floatingpoint element in b to a singleprecision (32bit) floatingpoint element, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_cvtss_sd^{⚠}  Experimentalavx512f Convert the lower singleprecision (32bit) floatingpoint element in b to a doubleprecision (64bit) floatingpoint element, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_div_round_sd^{⚠}  Experimentalavx512f Divide the lower doubleprecision (64bit) floatingpoint element in a by the lower doubleprecision (64bit) floatingpoint element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_div_round_ss^{⚠}  Experimentalavx512f Divide the lower singleprecision (32bit) floatingpoint element in a by the lower singleprecision (32bit) floatingpoint element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_div_sd^{⚠}  Experimentalavx512f Divide the lower doubleprecision (64bit) floatingpoint element in a by the lower doubleprecision (64bit) floatingpoint element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_div_ss^{⚠}  Experimentalavx512f Divide the lower singleprecision (32bit) floatingpoint element in a by the lower singleprecision (32bit) floatingpoint element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_fixupimm_round_sd^{⚠}  Experimentalavx512f Fix up the lower doubleprecision (64bit) floatingpoint elements in a and b using the lower 64bit integer in c, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. imm8 is used to set the required flags reporting. 
_mm_maskz_fixupimm_round_ss^{⚠}  Experimentalavx512f Fix up the lower singleprecision (32bit) floatingpoint elements in a and b using the lower 32bit integer in c, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. imm8 is used to set the required flags reporting. 
_mm_maskz_fixupimm_sd^{⚠}  Experimentalavx512f Fix up the lower doubleprecision (64bit) floatingpoint elements in a and b using the lower 64bit integer in c, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. imm8 is used to set the required flags reporting. 
_mm_maskz_fixupimm_ss^{⚠}  Experimentalavx512f Fix up the lower singleprecision (32bit) floatingpoint elements in a and b using the lower 32bit integer in c, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. imm8 is used to set the required flags reporting. 
_mm_maskz_fmadd_round_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_fmadd_round_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_fmadd_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_fmadd_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, and add the intermediate result to the lower element in c. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_fmsub_round_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_fmsub_round_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_fmsub_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_fmsub_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, and subtract the lower element in c from the intermediate result. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_fnmadd_round_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_fnmadd_round_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_fnmadd_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_fnmadd_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, and add the negated intermediate result to the lower element in c. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_fnmsub_round_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and subtract the lower element in c from the negated intermediate result. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_fnmsub_round_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, and subtract the lower element in c from the negated intermediate result. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_fnmsub_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, and subtract the lower element in c from the negated intermediate result. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_fnmsub_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, and subtract the lower element in c from the negated intermediate result. Store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_getexp_round_sd^{⚠}  Experimentalavx512f Convert the exponent of the lower doubleprecision (64bit) floatingpoint element in b to a doubleprecision (64bit) floatingpoint number representing the integer exponent, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. This intrinsic essentially calculates floor(log2(x)) for the lower element. 
_mm_maskz_getexp_round_ss^{⚠}  Experimentalavx512f Convert the exponent of the lower singleprecision (32bit) floatingpoint element in b to a singleprecision (32bit) floatingpoint number representing the integer exponent, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. This intrinsic essentially calculates floor(log2(x)) for the lower element. 
_mm_maskz_getexp_sd^{⚠}  Experimentalavx512f Convert the exponent of the lower doubleprecision (64bit) floatingpoint element in b to a doubleprecision (64bit) floatingpoint number representing the integer exponent, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. This intrinsic essentially calculates floor(log2(x)) for the lower element. 
_mm_maskz_getexp_ss^{⚠}  Experimentalavx512f Convert the exponent of the lower singleprecision (32bit) floatingpoint element in b to a singleprecision (32bit) floatingpoint number representing the integer exponent, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. This intrinsic essentially calculates floor(log2(x)) for the lower element. 
_mm_maskz_getmant_round_sd^{⚠}  Experimentalavx512f Normalize the mantissas of the lower doubleprecision (64bit) floatingpoint element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. 
_mm_maskz_getmant_round_ss^{⚠}  Experimentalavx512f Normalize the mantissas of the lower singleprecision (32bit) floatingpoint element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. 
_mm_maskz_getmant_sd^{⚠}  Experimentalavx512f Normalize the mantissas of the lower doubleprecision (64bit) floatingpoint element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. 
_mm_maskz_getmant_ss^{⚠}  Experimentalavx512f Normalize the mantissas of the lower singleprecision (32bit) floatingpoint element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. 
_mm_maskz_max_round_sd^{⚠}  Experimentalavx512f Compare the lower doubleprecision (64bit) floatingpoint elements in a and b, store the maximum value in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_max_round_ss^{⚠}  Experimentalavx512f Compare the lower singleprecision (32bit) floatingpoint elements in a and b, store the maximum value in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_max_sd^{⚠}  Experimentalavx512f Compare the lower doubleprecision (64bit) floatingpoint elements in a and b, store the maximum value in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_max_ss^{⚠}  Experimentalavx512f Compare the lower singleprecision (32bit) floatingpoint elements in a and b, store the maximum value in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_min_round_sd^{⚠}  Experimentalavx512f Compare the lower doubleprecision (64bit) floatingpoint elements in a and b, store the minimum value in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_min_round_ss^{⚠}  Experimentalavx512f Compare the lower singleprecision (32bit) floatingpoint elements in a and b, store the minimum value in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_min_sd^{⚠}  Experimentalavx512f Compare the lower doubleprecision (64bit) floatingpoint elements in a and b, store the minimum value in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_min_ss^{⚠}  Experimentalavx512f Compare the lower singleprecision (32bit) floatingpoint elements in a and b, store the minimum value in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_move_sd^{⚠}  Experimentalavx512f Move the lower doubleprecision (64bit) floatingpoint element from b to the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_move_ss^{⚠}  Experimentalavx512f Move the lower singleprecision (32bit) floatingpoint element from b to the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_mul_round_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_mul_round_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_mul_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_mul_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_rcp14_sd^{⚠}  Experimentalavx512f Compute the approximate reciprocal of the lower doubleprecision (64bit) floatingpoint element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. The maximum relative error for this approximation is less than 2^-14. 
_mm_maskz_rcp14_ss^{⚠}  Experimentalavx512f Compute the approximate reciprocal of the lower singleprecision (32bit) floatingpoint element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. The maximum relative error for this approximation is less than 2^-14. 
_mm_maskz_roundscale_round_sd^{⚠}  Experimentalavx512f Round the lower doubleprecision (64bit) floatingpoint element in b to the number of fraction bits specified by imm8, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_roundscale_round_ss^{⚠}  Experimentalavx512f Round the lower singleprecision (32bit) floatingpoint element in b to the number of fraction bits specified by imm8, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_roundscale_sd^{⚠}  Experimentalavx512f Round the lower doubleprecision (64bit) floatingpoint element in b to the number of fraction bits specified by imm8, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_roundscale_ss^{⚠}  Experimentalavx512f Round the lower singleprecision (32bit) floatingpoint element in b to the number of fraction bits specified by imm8, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_rsqrt14_sd^{⚠}  Experimentalavx512f Compute the approximate reciprocal square root of the lower doubleprecision (64bit) floatingpoint element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. The maximum relative error for this approximation is less than 2^-14. 
_mm_maskz_rsqrt14_ss^{⚠}  Experimentalavx512f Compute the approximate reciprocal square root of the lower singleprecision (32bit) floatingpoint element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. The maximum relative error for this approximation is less than 2^-14. 
_mm_maskz_scalef_round_sd^{⚠}  Experimentalavx512f Scale the lower doubleprecision (64bit) floatingpoint element in a using the lower element of b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_scalef_round_ss^{⚠}  Experimentalavx512f Scale the lower singleprecision (32bit) floatingpoint element in a using the lower element of b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_scalef_sd^{⚠}  Experimentalavx512f Scale the lower doubleprecision (64bit) floatingpoint element in a using the lower element of b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_scalef_ss^{⚠}  Experimentalavx512f Scale the lower singleprecision (32bit) floatingpoint element in a using the lower element of b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_sqrt_round_sd^{⚠}  Experimentalavx512f Compute the square root of the lower doubleprecision (64bit) floatingpoint element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_sqrt_round_ss^{⚠}  Experimentalavx512f Compute the square root of the lower singleprecision (32bit) floatingpoint element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_sqrt_sd^{⚠}  Experimentalavx512f Compute the square root of the lower doubleprecision (64bit) floatingpoint element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_sqrt_ss^{⚠}  Experimentalavx512f Compute the square root of the lower singleprecision (32bit) floatingpoint element in b, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_sub_round_sd^{⚠}  Experimentalavx512f Subtract the lower doubleprecision (64bit) floatingpoint element in b from the lower doubleprecision (64bit) floatingpoint element in a, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_sub_round_ss^{⚠}  Experimentalavx512f Subtract the lower singleprecision (32bit) floatingpoint element in b from the lower singleprecision (32bit) floatingpoint element in a, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_maskz_sub_sd^{⚠}  Experimentalavx512f Subtract the lower doubleprecision (64bit) floatingpoint element in b from the lower doubleprecision (64bit) floatingpoint element in a, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper element from a to the upper element of dst. 
_mm_maskz_sub_ss^{⚠}  Experimentalavx512f Subtract the lower singleprecision (32bit) floatingpoint element in b from the lower singleprecision (32bit) floatingpoint element in a, store the result in the lower element of dst using zeromask k (the element is zeroed out when mask bit 0 is not set), and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_max_round_sd^{⚠}  Experimentalavx512f Compare the lower doubleprecision (64bit) floatingpoint elements in a and b, store the maximum value in the lower element of dst, and copy the upper element from a to the upper element of dst. 
_mm_max_round_ss^{⚠}  Experimentalavx512f Compare the lower singleprecision (32bit) floatingpoint elements in a and b, store the maximum value in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_min_round_sd^{⚠}  Experimentalavx512f Compare the lower doubleprecision (64bit) floatingpoint elements in a and b, store the minimum value in the lower element of dst, and copy the upper element from a to the upper element of dst. 
_mm_min_round_ss^{⚠}  Experimentalavx512f Compare the lower singleprecision (32bit) floatingpoint elements in a and b, store the minimum value in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_mul_round_sd^{⚠}  Experimentalavx512f Multiply the lower doubleprecision (64bit) floatingpoint elements in a and b, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. 
_mm_mul_round_ss^{⚠}  Experimentalavx512f Multiply the lower singleprecision (32bit) floatingpoint elements in a and b, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_rcp14_sd^{⚠}  Experimentalavx512f Compute the approximate reciprocal of the lower doubleprecision (64bit) floatingpoint element in b, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. The maximum relative error for this approximation is less than 2^-14. 
_mm_rcp14_ss^{⚠}  Experimentalavx512f Compute the approximate reciprocal of the lower singleprecision (32bit) floatingpoint element in b, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. The maximum relative error for this approximation is less than 2^-14. 
_mm_roundscale_round_sd^{⚠}  Experimentalavx512f Round the lower doubleprecision (64bit) floatingpoint element in b to the number of fraction bits specified by imm8, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. 
_mm_roundscale_round_ss^{⚠}  Experimentalavx512f Round the lower singleprecision (32bit) floatingpoint element in b to the number of fraction bits specified by imm8, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_roundscale_sd^{⚠}  Experimentalavx512f Round the lower doubleprecision (64bit) floatingpoint element in b to the number of fraction bits specified by imm8, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. 
_mm_roundscale_ss^{⚠}  Experimentalavx512f Round the lower singleprecision (32bit) floatingpoint element in b to the number of fraction bits specified by imm8, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_rsqrt14_sd^{⚠}  Experimentalavx512f Compute the approximate reciprocal square root of the lower doubleprecision (64bit) floatingpoint element in b, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. The maximum relative error for this approximation is less than 2^-14. 
_mm_rsqrt14_ss^{⚠}  Experimentalavx512f Compute the approximate reciprocal square root of the lower singleprecision (32bit) floatingpoint element in b, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. The maximum relative error for this approximation is less than 2^-14. 
_mm_scalef_round_sd^{⚠}  Experimentalavx512f Scale the lower doubleprecision (64bit) floatingpoint element in a using the lower element of b, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. 
_mm_scalef_round_ss^{⚠}  Experimentalavx512f Scale the lower singleprecision (32bit) floatingpoint element in a using the lower element of b, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_scalef_sd^{⚠}  Experimentalavx512f Scale the lower doubleprecision (64bit) floatingpoint element in a using the lower element of b, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. 
_mm_scalef_ss^{⚠}  Experimentalavx512f Scale the lower singleprecision (32bit) floatingpoint element in a using the lower element of b, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_sqrt_round_sd^{⚠}  Experimentalavx512f Compute the square root of the lower doubleprecision (64bit) floatingpoint element in b, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. 
_mm_sqrt_round_ss^{⚠}  Experimentalavx512f Compute the square root of the lower singleprecision (32bit) floatingpoint element in b, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. 
_mm_sub_round_sd^{⚠}  Experimentalavx512f Subtract the lower doubleprecision (64bit) floatingpoint element in b from the lower doubleprecision (64bit) floatingpoint element in a, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst. 
_mm_sub_round_ss^{⚠}  Experimentalavx512f Subtract the lower singleprecision (32bit) floatingpoint element in b from the lower singleprecision (32bit) floatingpoint element in a, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst. 
_xabort^{⚠}  Experimentalrtm Forces a restricted transactional memory (RTM) region to abort. 
_xabort_code  Experimental Retrieves the parameter passed to 
_xbegin^{⚠}  Experimentalrtm Specifies the start of a restricted transactional memory (RTM) code region and returns a value indicating status. 
_xend^{⚠}  Experimentalrtm Specifies the end of a restricted transactional memory (RTM) code region. 
_xtest^{⚠}  Experimentalrtm Queries whether the processor is executing in a transactional region identified by restricted transactional memory (RTM) or hardware lock elision (HLE). 
cmpxchg16b^{⚠}  Experimentalcmpxchg16b Compares and exchanges 16 bytes (128 bits) of data atomically. 
has_cpuid  Experimental Does the host support the 
ud2^{⚠}  Experimental Generates the trap instruction 
_MM_GET_EXCEPTION_MASK^{⚠}  sse See 
_MM_GET_EXCEPTION_STATE^{⚠}  sse See 
_MM_GET_FLUSH_ZERO_MODE^{⚠}  sse See 
_MM_GET_ROUNDING_MODE^{⚠}  sse See 
_MM_SET_EXCEPTION_MASK^{⚠}  sse See 
_MM_SET_EXCEPTION_STATE^{⚠}  sse See 
_MM_SET_FLUSH_ZERO_MODE^{⚠}  sse See 
_MM_SET_ROUNDING_MODE^{⚠}  sse See 
_MM_TRANSPOSE4_PS^{⚠}  sse Transpose the 4x4 matrix formed by 4 rows of __m128 in place. 
__cpuid^{⚠}  See 
__cpuid_count^{⚠}  Returns the result of the 
__get_cpuid_max^{⚠}  Returns the highestsupported 
__rdtscp^{⚠}  Reads the current value of the processor’s timestamp counter and the 
_addcarry_u32^{⚠}  Adds unsigned 32bit integers 
_addcarry_u64^{⚠}  Adds unsigned 64bit integers 
_addcarryx_u32^{⚠}  adx Adds unsigned 32bit integers 
_addcarryx_u64^{⚠}  adx Adds unsigned 64bit integers 
_andn_u32^{⚠}  bmi1 Bitwise logical 
_andn_u64^{⚠}  bmi1 Bitwise logical 
_bextr2_u32^{⚠}  bmi1 Extracts bits of 
_bextr2_u64^{⚠}  bmi1 Extracts bits of 
_bextr_u32^{⚠}  bmi1 Extracts bits in range [ 
_bextr_u64^{⚠}  bmi1 Extracts bits in range [ 
_blcfill_u32^{⚠}  tbm Clears all bits below the least significant zero bit of 
_blcfill_u64^{⚠}  tbm Clears all bits below the least significant zero bit of 
_blci_u32^{⚠}  tbm Sets all bits of 
_blci_u64^{⚠}  tbm Sets all bits of 
_blcic_u32^{⚠}  tbm Sets the least significant zero bit of 
_blcic_u64^{⚠}  tbm Sets the least significant zero bit of 
_blcmsk_u32^{⚠}  tbm Sets the least significant zero bit of 
_blcmsk_u64^{⚠}  tbm Sets the least significant zero bit of 
_blcs_u32^{⚠}  tbm Sets the least significant zero bit of 
_blcs_u64^{⚠}  tbm Sets the least significant zero bit of 
_blsfill_u32^{⚠}  tbm Sets all bits of 
_blsfill_u64^{⚠}  tbm Sets all bits of 
_blsi_u32^{⚠}  bmi1 Extracts lowest set isolated bit. 
_blsi_u64^{⚠}  bmi1 Extracts lowest set isolated bit. 
_blsic_u32^{⚠}  tbm Clears least significant bit and sets all other bits. 
_blsic_u64^{⚠}  tbm Clears least significant bit and sets all other bits. 
_blsmsk_u32^{⚠}  bmi1 Gets mask up to lowest set bit. 
_blsmsk_u64^{⚠}  bmi1 Gets mask up to lowest set bit. 
_blsr_u32^{⚠}  bmi1 Resets the lowest set bit of 
_blsr_u64^{⚠}  bmi1 Resets the lowest set bit of 
_bswap^{⚠}  Returns an integer with the reversed byte order of x 
_bswap64^{⚠}  Returns an integer with the reversed byte order of x 
_bzhi_u32^{⚠}  bmi2 Zeroes higher bits of 
_bzhi_u64^{⚠}  bmi2 Zeroes higher bits of 
_fxrstor^{⚠}  fxsr Restores the 
_fxrstor64^{⚠}  fxsr Restores the 
_fxsave^{⚠}  fxsr Saves the 
_fxsave64^{⚠}  fxsr Saves the 
_lzcnt_u32^{⚠}  lzcnt Counts the leading most significant zero bits. 
_lzcnt_u64^{⚠}  lzcnt Counts the leading most significant zero bits. 
_mm256_abs_epi8^{⚠}  avx2 Computes the absolute values of packed 8bit integers in 
_mm256_abs_epi16^{⚠}  avx2 Computes the absolute values of packed 16bit integers in 
_mm256_abs_epi32^{⚠}  avx2 Computes the absolute values of packed 32bit integers in 
_mm256_add_epi8^{⚠}  avx2 Adds packed 8bit integers in 
_mm256_add_epi16^{⚠}  avx2 Adds packed 16bit integers in 
_mm256_add_epi32^{⚠}  avx2 Adds packed 32bit integers in 
_mm256_add_epi64^{⚠}  avx2 Adds packed 64bit integers in 
_mm256_add_pd^{⚠}  avx Adds packed doubleprecision (64bit) floatingpoint elements in 
_mm256_add_ps^{⚠}  avx Adds packed singleprecision (32bit) floatingpoint elements in 
_mm256_adds_epi8^{⚠}  avx2 Adds packed 8bit integers in 
_mm256_adds_epi16^{⚠}  avx2 Adds packed 16bit integers in 
_mm256_adds_epu8^{⚠}  avx2 Adds packed unsigned 8bit integers in 
_mm256_adds_epu16^{⚠}  avx2 Adds packed unsigned 16bit integers in 
_mm256_addsub_pd^{⚠}  avx Alternately adds and subtracts packed doubleprecision (64bit) floatingpoint elements in 
_mm256_addsub_ps^{⚠}  avx Alternately adds and subtracts packed singleprecision (32bit) floatingpoint elements in 
_mm256_alignr_epi8^{⚠}  avx2 Concatenates pairs of 16byte blocks in 
_mm256_and_pd^{⚠}  avx Computes the bitwise AND of packed doubleprecision (64bit) floatingpoint elements in 
_mm256_and_ps^{⚠}  avx Computes the bitwise AND of packed singleprecision (32bit) floatingpoint elements in 
_mm256_and_si256^{⚠}  avx2 Computes the bitwise AND of 256 bits (representing integer data) in 
_mm256_andnot_pd^{⚠}  avx Computes the bitwise NOT of packed doubleprecision (64bit) floatingpoint elements in 
_mm256_andnot_ps^{⚠}  avx Computes the bitwise NOT of packed singleprecision (32bit) floatingpoint elements in 
_mm256_andnot_si256^{⚠}  avx2 Computes the bitwise NOT of 256 bits (representing integer data) in 
_mm256_avg_epu8^{⚠}  avx2 Averages packed unsigned 8-bit integers in 
_mm256_avg_epu16^{⚠}  avx2 Averages packed unsigned 16-bit integers in 
_mm256_blend_epi16^{⚠}  avx2 Blends packed 16-bit integers from 
_mm256_blend_epi32^{⚠}  avx2 Blends packed 32-bit integers from 
_mm256_blend_pd^{⚠}  avx Blends packed double-precision (64-bit) floating-point elements from 
_mm256_blend_ps^{⚠}  avx Blends packed single-precision (32-bit) floating-point elements from 
_mm256_blendv_epi8^{⚠}  avx2 Blends packed 8-bit integers from 
_mm256_blendv_pd^{⚠}  avx Blends packed double-precision (64-bit) floating-point elements from 
_mm256_blendv_ps^{⚠}  avx Blends packed single-precision (32-bit) floating-point elements from 
_mm256_broadcast_pd^{⚠}  avx Broadcasts 128 bits from memory (composed of 2 packed double-precision (64-bit) floating-point elements) to all elements of the returned vector. 
_mm256_broadcast_ps^{⚠}  avx Broadcasts 128 bits from memory (composed of 4 packed single-precision (32-bit) floating-point elements) to all elements of the returned vector. 
_mm256_broadcast_sd^{⚠}  avx Broadcasts a double-precision (64-bit) floating-point element from memory to all elements of the returned vector. 
_mm256_broadcast_ss^{⚠}  avx Broadcasts a single-precision (32-bit) floating-point element from memory to all elements of the returned vector. 
_mm256_broadcastb_epi8^{⚠}  avx2 Broadcasts the low packed 8-bit integer from 
_mm256_broadcastd_epi32^{⚠}  avx2 Broadcasts the low packed 32-bit integer from 
_mm256_broadcastq_epi64^{⚠}  avx2 Broadcasts the low packed 64-bit integer from 
_mm256_broadcastsd_pd^{⚠}  avx2 Broadcasts the low double-precision (64-bit) floating-point element from 
_mm256_broadcastsi128_si256^{⚠}  avx2 Broadcasts 128 bits of integer data from a to all 128-bit lanes in the 256-bit returned value. 
_mm256_broadcastss_ps^{⚠}  avx2 Broadcasts the low single-precision (32-bit) floating-point element from 
_mm256_broadcastw_epi16^{⚠}  avx2 Broadcasts the low packed 16-bit integer from a to all elements of the 256-bit returned value. 
_mm256_bslli_epi128^{⚠}  avx2 Shifts 128-bit lanes in 
_mm256_bsrli_epi128^{⚠}  avx2 Shifts 128-bit lanes in 
_mm256_castpd128_pd256^{⚠}  avx Casts vector of type __m128d to type __m256d; the upper 128 bits of the result are undefined. 
_mm256_castpd256_pd128^{⚠}  avx Casts vector of type __m256d to type __m128d. 
_mm256_castpd_ps^{⚠}  avx Casts vector of type __m256d to type __m256. 
_mm256_castpd_si256^{⚠}  avx Casts vector of type __m256d to type __m256i. 
_mm256_castps128_ps256^{⚠}  avx Casts vector of type __m128 to type __m256; the upper 128 bits of the result are undefined. 
_mm256_castps256_ps128^{⚠} 