Module core::arch::x86_64 (since 1.27.0)
Platform-specific intrinsics for the x86_64 platform.
See the module documentation for more details.
Structs
CpuidResult  x86-64 Result of the `cpuid` instruction.
__m128  x86-64 128-bit wide set of four `f32` types, x86-specific
__m128d  x86-64 128-bit wide set of two `f64` types, x86-specific
__m128i  x86-64 128-bit wide integer vector type, x86-specific
__m256  x86-64 256-bit wide set of eight `f32` types, x86-specific
__m256d  x86-64 256-bit wide set of four `f64` types, x86-specific
__m256i  x86-64 256-bit wide integer vector type, x86-specific
__m512  Experimental x86-64 512-bit wide set of sixteen `f32` types, x86-specific
__m512d  Experimental x86-64 512-bit wide set of eight `f64` types, x86-specific
__m512i  Experimental x86-64 512-bit wide integer vector type, x86-specific
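The vector types above are opaque; values are normally built and inspected through the `set`/`store` intrinsics. A minimal sketch (not from the original docs), assuming an x86_64 host with AVX available at runtime; on other targets `main` does nothing:

```rust
// Round-trip a __m256i through memory with the AVX set/store intrinsics.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx")]
unsafe fn roundtrip() -> [i32; 8] {
    use std::arch::x86_64::*;
    // _mm256_setr_epi32 takes elements in memory order (lowest element first).
    let v = _mm256_setr_epi32(1, 2, 3, 4, 5, 6, 7, 8);
    let mut out = [0i32; 8];
    // Unaligned store; no alignment requirement on `out`.
    _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, v);
    out
}

fn main() {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx") {
            assert_eq!(unsafe { roundtrip() }, [1, 2, 3, 4, 5, 6, 7, 8]);
        }
    }
}
```

Note the idiom: `#[target_feature(enable = "avx")]` on the `unsafe fn`, with `is_x86_feature_detected!` guarding the call site.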
Constants
_CMP_EQ_OQ  x86-64 Equal (ordered, non-signaling)
_CMP_EQ_OS  x86-64 Equal (ordered, signaling)
_CMP_EQ_UQ  x86-64 Equal (unordered, non-signaling)
_CMP_EQ_US  x86-64 Equal (unordered, signaling)
_CMP_FALSE_OQ  x86-64 False (ordered, non-signaling)
_CMP_FALSE_OS  x86-64 False (ordered, signaling)
_CMP_GE_OQ  x86-64 Greater-than-or-equal (ordered, non-signaling)
_CMP_GE_OS  x86-64 Greater-than-or-equal (ordered, signaling)
_CMP_GT_OQ  x86-64 Greater-than (ordered, non-signaling)
_CMP_GT_OS  x86-64 Greater-than (ordered, signaling)
_CMP_LE_OQ  x86-64 Less-than-or-equal (ordered, non-signaling)
_CMP_LE_OS  x86-64 Less-than-or-equal (ordered, signaling)
_CMP_LT_OQ  x86-64 Less-than (ordered, non-signaling)
_CMP_LT_OS  x86-64 Less-than (ordered, signaling)
_CMP_NEQ_OQ  x86-64 Not-equal (ordered, non-signaling)
_CMP_NEQ_OS  x86-64 Not-equal (ordered, signaling)
_CMP_NEQ_UQ  x86-64 Not-equal (unordered, non-signaling)
_CMP_NEQ_US  x86-64 Not-equal (unordered, signaling)
_CMP_NGE_UQ  x86-64 Not-greater-than-or-equal (unordered, non-signaling)
_CMP_NGE_US  x86-64 Not-greater-than-or-equal (unordered, signaling)
_CMP_NGT_UQ  x86-64 Not-greater-than (unordered, non-signaling)
_CMP_NGT_US  x86-64 Not-greater-than (unordered, signaling)
_CMP_NLE_UQ  x86-64 Not-less-than-or-equal (unordered, non-signaling)
_CMP_NLE_US  x86-64 Not-less-than-or-equal (unordered, signaling)
_CMP_NLT_UQ  x86-64 Not-less-than (unordered, non-signaling)
_CMP_NLT_US  x86-64 Not-less-than (unordered, signaling)
_CMP_ORD_Q  x86-64 Ordered (non-signaling)
_CMP_ORD_S  x86-64 Ordered (signaling)
_CMP_TRUE_UQ  x86-64 True (unordered, non-signaling)
_CMP_TRUE_US  x86-64 True (unordered, signaling)
_CMP_UNORD_Q  x86-64 Unordered (non-signaling)
_CMP_UNORD_S  x86-64 Unordered (signaling)
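These `_CMP_*` predicates are passed as the immediate operand of the AVX `cmp` intrinsics. A sketch assuming a current Rust toolchain, where the predicate is a const generic (older releases took it as a plain argument):

```rust
// Compare four f64 lanes with _CMP_LT_OQ and collect the result as a bitmask.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx")]
unsafe fn lt_mask(a: [f64; 4], b: [f64; 4]) -> i32 {
    use std::arch::x86_64::*;
    let va = _mm256_loadu_pd(a.as_ptr());
    let vb = _mm256_loadu_pd(b.as_ptr());
    // All-ones per lane where a < b; ordered + quiet, so NaN lanes compare false.
    let m = _mm256_cmp_pd::<_CMP_LT_OQ>(va, vb);
    _mm256_movemask_pd(m) // one bit per 64-bit lane, lane 0 in bit 0
}

fn main() {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx") {
            let mask = unsafe { lt_mask([1.0, 5.0, f64::NAN, 2.0], [2.0, 4.0, 1.0, 2.0]) };
            assert_eq!(mask, 0b0001); // only lane 0 satisfies a < b
        }
    }
}
```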
_MM_EXCEPT_DENORM  x86-64 See …
_MM_EXCEPT_DIV_ZERO  x86-64 See …
_MM_EXCEPT_INEXACT  x86-64 See …
_MM_EXCEPT_INVALID  x86-64 See …
_MM_EXCEPT_MASK  x86-64
_MM_EXCEPT_OVERFLOW  x86-64 See …
_MM_EXCEPT_UNDERFLOW  x86-64 See …
_MM_FLUSH_ZERO_MASK  x86-64
_MM_FLUSH_ZERO_OFF  x86-64 See …
_MM_FLUSH_ZERO_ON  x86-64 See …
_MM_FROUND_CEIL  x86-64 round up and do not suppress exceptions
_MM_FROUND_CUR_DIRECTION  x86-64 use MXCSR.RC; see …
_MM_FROUND_FLOOR  x86-64 round down and do not suppress exceptions
_MM_FROUND_NEARBYINT  x86-64 use MXCSR.RC and suppress exceptions; see …
_MM_FROUND_NINT  x86-64 round to nearest and do not suppress exceptions
_MM_FROUND_NO_EXC  x86-64 suppress exceptions
_MM_FROUND_RAISE_EXC  x86-64 do not suppress exceptions
_MM_FROUND_RINT  x86-64 use MXCSR.RC and do not suppress exceptions; see …
_MM_FROUND_TO_NEAREST_INT  x86-64 round to nearest
_MM_FROUND_TO_NEG_INF  x86-64 round down
_MM_FROUND_TO_POS_INF  x86-64 round up
_MM_FROUND_TO_ZERO  x86-64 truncate
_MM_FROUND_TRUNC  x86-64 truncate and do not suppress exceptions
_MM_HINT_NTA  x86-64 See …
_MM_HINT_T0  x86-64 See …
_MM_HINT_T1  x86-64 See …
_MM_HINT_T2  x86-64 See …
_MM_MASK_DENORM  x86-64 See …
_MM_MASK_DIV_ZERO  x86-64 See …
_MM_MASK_INEXACT  x86-64 See …
_MM_MASK_INVALID  x86-64 See …
_MM_MASK_MASK  x86-64
_MM_MASK_OVERFLOW  x86-64 See …
_MM_MASK_UNDERFLOW  x86-64 See …
_MM_ROUND_DOWN  x86-64 See …
_MM_ROUND_MASK  x86-64
_MM_ROUND_NEAREST  x86-64 See …
_MM_ROUND_TOWARD_ZERO  x86-64 See …
_MM_ROUND_UP  x86-64 See …
_SIDD_BIT_MASK  x86-64 Mask only: return the bit mask
_SIDD_CMP_EQUAL_ANY  x86-64 For each character in …
_SIDD_CMP_EQUAL_EACH  x86-64 The strings defined by …
_SIDD_CMP_EQUAL_ORDERED  x86-64 Search for the defined substring in the target
_SIDD_CMP_RANGES  x86-64 For each character in …
_SIDD_LEAST_SIGNIFICANT  x86-64 Index only: return the least significant bit (Default)
_SIDD_MASKED_NEGATIVE_POLARITY  x86-64 Negates results only before the end of the string
_SIDD_MASKED_POSITIVE_POLARITY  x86-64 Do not negate results before the end of the string
_SIDD_MOST_SIGNIFICANT  x86-64 Index only: return the most significant bit
_SIDD_NEGATIVE_POLARITY  x86-64 Negates results
_SIDD_POSITIVE_POLARITY  x86-64 Do not negate results (Default)
_SIDD_SBYTE_OPS  x86-64 String contains signed 8-bit characters
_SIDD_SWORD_OPS  x86-64 String contains signed 16-bit characters
_SIDD_UBYTE_OPS  x86-64 String contains unsigned 8-bit characters (Default)
_SIDD_UNIT_MASK  x86-64 Mask only: return the byte mask
_SIDD_UWORD_OPS  x86-64 String contains unsigned 16-bit characters
_XCR_XFEATURE_ENABLED_MASK  x86-64
_MM_CMPINT_EQ  Experimental x86-64 Equal
_MM_CMPINT_FALSE  Experimental x86-64 False
_MM_CMPINT_LE  Experimental x86-64 Less-than-or-equal
_MM_CMPINT_LT  Experimental x86-64 Less-than
_MM_CMPINT_NE  Experimental x86-64 Not-equal
_MM_CMPINT_NLE  Experimental x86-64 Not less-than-or-equal
_MM_CMPINT_NLT  Experimental x86-64 Not less-than
_MM_CMPINT_TRUE  Experimental x86-64 True
_MM_MANT_NORM_1_2  Experimental x86-64 interval [1, 2)
_MM_MANT_NORM_P5_1  Experimental x86-64 interval [0.5, 1)
_MM_MANT_NORM_P5_2  Experimental x86-64 interval [0.5, 2)
_MM_MANT_NORM_P75_1P5  Experimental x86-64 interval [0.75, 1.5)
_MM_MANT_SIGN_NAN  Experimental x86-64 DEST = NaN if sign(SRC) = 1
_MM_MANT_SIGN_SRC  Experimental x86-64 sign = sign(SRC)
_MM_MANT_SIGN_ZERO  Experimental x86-64 sign = 0
_MM_PERM_AAAA  Experimental x86-64
_MM_PERM_AAAB  Experimental x86-64
_MM_PERM_AAAC  Experimental x86-64
_MM_PERM_AAAD  Experimental x86-64
_MM_PERM_AABA  Experimental x86-64
_MM_PERM_AABB  Experimental x86-64
_MM_PERM_AABC  Experimental x86-64
_MM_PERM_AABD  Experimental x86-64
_MM_PERM_AACA  Experimental x86-64
_MM_PERM_AACB  Experimental x86-64
_MM_PERM_AACC  Experimental x86-64
_MM_PERM_AACD  Experimental x86-64
_MM_PERM_AADA  Experimental x86-64
_MM_PERM_AADB  Experimental x86-64
_MM_PERM_AADC  Experimental x86-64
_MM_PERM_AADD  Experimental x86-64
_MM_PERM_ABAA  Experimental x86-64
_MM_PERM_ABAB  Experimental x86-64
_MM_PERM_ABAC  Experimental x86-64
_MM_PERM_ABAD  Experimental x86-64
_MM_PERM_ABBA  Experimental x86-64
_MM_PERM_ABBB  Experimental x86-64
_MM_PERM_ABBC  Experimental x86-64
_MM_PERM_ABBD  Experimental x86-64
_MM_PERM_ABCA  Experimental x86-64
_MM_PERM_ABCB  Experimental x86-64
_MM_PERM_ABCC  Experimental x86-64
_MM_PERM_ABCD  Experimental x86-64
_MM_PERM_ABDA  Experimental x86-64
_MM_PERM_ABDB  Experimental x86-64
_MM_PERM_ABDC  Experimental x86-64
_MM_PERM_ABDD  Experimental x86-64
_MM_PERM_ACAA  Experimental x86-64
_MM_PERM_ACAB  Experimental x86-64
_MM_PERM_ACAC  Experimental x86-64
_MM_PERM_ACAD  Experimental x86-64
_MM_PERM_ACBA  Experimental x86-64
_MM_PERM_ACBB  Experimental x86-64
_MM_PERM_ACBC  Experimental x86-64
_MM_PERM_ACBD  Experimental x86-64
_MM_PERM_ACCA  Experimental x86-64
_MM_PERM_ACCB  Experimental x86-64
_MM_PERM_ACCC  Experimental x86-64
_MM_PERM_ACCD  Experimental x86-64
_MM_PERM_ACDA  Experimental x86-64
_MM_PERM_ACDB  Experimental x86-64
_MM_PERM_ACDC  Experimental x86-64
_MM_PERM_ACDD  Experimental x86-64
_MM_PERM_ADAA  Experimental x86-64
_MM_PERM_ADAB  Experimental x86-64
_MM_PERM_ADAC  Experimental x86-64
_MM_PERM_ADAD  Experimental x86-64
_MM_PERM_ADBA  Experimental x86-64
_MM_PERM_ADBB  Experimental x86-64
_MM_PERM_ADBC  Experimental x86-64
_MM_PERM_ADBD  Experimental x86-64
_MM_PERM_ADCA  Experimental x86-64
_MM_PERM_ADCB  Experimental x86-64
_MM_PERM_ADCC  Experimental x86-64
_MM_PERM_ADCD  Experimental x86-64
_MM_PERM_ADDA  Experimental x86-64
_MM_PERM_ADDB  Experimental x86-64
_MM_PERM_ADDC  Experimental x86-64
_MM_PERM_ADDD  Experimental x86-64
_MM_PERM_BAAA  Experimental x86-64
_MM_PERM_BAAB  Experimental x86-64
_MM_PERM_BAAC  Experimental x86-64
_MM_PERM_BAAD  Experimental x86-64
_MM_PERM_BABA  Experimental x86-64
_MM_PERM_BABB  Experimental x86-64
_MM_PERM_BABC  Experimental x86-64
_MM_PERM_BABD  Experimental x86-64
_MM_PERM_BACA  Experimental x86-64
_MM_PERM_BACB  Experimental x86-64
_MM_PERM_BACC  Experimental x86-64
_MM_PERM_BACD  Experimental x86-64
_MM_PERM_BADA  Experimental x86-64
_MM_PERM_BADB  Experimental x86-64
_MM_PERM_BADC  Experimental x86-64
_MM_PERM_BADD  Experimental x86-64
_MM_PERM_BBAA  Experimental x86-64
_MM_PERM_BBAB  Experimental x86-64
_MM_PERM_BBAC  Experimental x86-64
_MM_PERM_BBAD  Experimental x86-64
_MM_PERM_BBBA  Experimental x86-64
_MM_PERM_BBBB  Experimental x86-64
_MM_PERM_BBBC  Experimental x86-64
_MM_PERM_BBBD  Experimental x86-64
_MM_PERM_BBCA  Experimental x86-64
_MM_PERM_BBCB  Experimental x86-64
_MM_PERM_BBCC  Experimental x86-64
_MM_PERM_BBCD  Experimental x86-64
_MM_PERM_BBDA  Experimental x86-64
_MM_PERM_BBDB  Experimental x86-64
_MM_PERM_BBDC  Experimental x86-64
_MM_PERM_BBDD  Experimental x86-64
_MM_PERM_BCAA  Experimental x86-64
_MM_PERM_BCAB  Experimental x86-64
_MM_PERM_BCAC  Experimental x86-64
_MM_PERM_BCAD  Experimental x86-64
_MM_PERM_BCBA  Experimental x86-64
_MM_PERM_BCBB  Experimental x86-64
_MM_PERM_BCBC  Experimental x86-64
_MM_PERM_BCBD  Experimental x86-64
_MM_PERM_BCCA  Experimental x86-64
_MM_PERM_BCCB  Experimental x86-64
_MM_PERM_BCCC  Experimental x86-64
_MM_PERM_BCCD  Experimental x86-64
_MM_PERM_BCDA  Experimental x86-64
_MM_PERM_BCDB  Experimental x86-64
_MM_PERM_BCDC  Experimental x86-64
_MM_PERM_BCDD  Experimental x86-64
_MM_PERM_BDAA  Experimental x86-64
_MM_PERM_BDAB  Experimental x86-64
_MM_PERM_BDAC  Experimental x86-64
_MM_PERM_BDAD  Experimental x86-64
_MM_PERM_BDBA  Experimental x86-64
_MM_PERM_BDBB  Experimental x86-64
_MM_PERM_BDBC  Experimental x86-64
_MM_PERM_BDBD  Experimental x86-64
_MM_PERM_BDCA  Experimental x86-64
_MM_PERM_BDCB  Experimental x86-64
_MM_PERM_BDCC  Experimental x86-64
_MM_PERM_BDCD  Experimental x86-64
_MM_PERM_BDDA  Experimental x86-64
_MM_PERM_BDDB  Experimental x86-64
_MM_PERM_BDDC  Experimental x86-64
_MM_PERM_BDDD  Experimental x86-64
_MM_PERM_CAAA  Experimental x86-64
_MM_PERM_CAAB  Experimental x86-64
_MM_PERM_CAAC  Experimental x86-64
_MM_PERM_CAAD  Experimental x86-64
_MM_PERM_CABA  Experimental x86-64
_MM_PERM_CABB  Experimental x86-64
_MM_PERM_CABC  Experimental x86-64
_MM_PERM_CABD  Experimental x86-64
_MM_PERM_CACA  Experimental x86-64
_MM_PERM_CACB  Experimental x86-64
_MM_PERM_CACC  Experimental x86-64
_MM_PERM_CACD  Experimental x86-64
_MM_PERM_CADA  Experimental x86-64
_MM_PERM_CADB  Experimental x86-64
_MM_PERM_CADC  Experimental x86-64
_MM_PERM_CADD  Experimental x86-64
_MM_PERM_CBAA  Experimental x86-64
_MM_PERM_CBAB  Experimental x86-64
_MM_PERM_CBAC  Experimental x86-64
_MM_PERM_CBAD  Experimental x86-64
_MM_PERM_CBBA  Experimental x86-64
_MM_PERM_CBBB  Experimental x86-64
_MM_PERM_CBBC  Experimental x86-64
_MM_PERM_CBBD  Experimental x86-64
_MM_PERM_CBCA  Experimental x86-64
_MM_PERM_CBCB  Experimental x86-64
_MM_PERM_CBCC  Experimental x86-64
_MM_PERM_CBCD  Experimental x86-64
_MM_PERM_CBDA  Experimental x86-64
_MM_PERM_CBDB  Experimental x86-64
_MM_PERM_CBDC  Experimental x86-64
_MM_PERM_CBDD  Experimental x86-64
_MM_PERM_CCAA  Experimental x86-64
_MM_PERM_CCAB  Experimental x86-64
_MM_PERM_CCAC  Experimental x86-64
_MM_PERM_CCAD  Experimental x86-64
_MM_PERM_CCBA  Experimental x86-64
_MM_PERM_CCBB  Experimental x86-64
_MM_PERM_CCBC  Experimental x86-64
_MM_PERM_CCBD  Experimental x86-64
_MM_PERM_CCCA  Experimental x86-64
_MM_PERM_CCCB  Experimental x86-64
_MM_PERM_CCCC  Experimental x86-64
_MM_PERM_CCCD  Experimental x86-64
_MM_PERM_CCDA  Experimental x86-64
_MM_PERM_CCDB  Experimental x86-64
_MM_PERM_CCDC  Experimental x86-64
_MM_PERM_CCDD  Experimental x86-64
_MM_PERM_CDAA  Experimental x86-64
_MM_PERM_CDAB  Experimental x86-64
_MM_PERM_CDAC  Experimental x86-64
_MM_PERM_CDAD  Experimental x86-64
_MM_PERM_CDBA  Experimental x86-64
_MM_PERM_CDBB  Experimental x86-64
_MM_PERM_CDBC  Experimental x86-64
_MM_PERM_CDBD  Experimental x86-64
_MM_PERM_CDCA  Experimental x86-64
_MM_PERM_CDCB  Experimental x86-64
_MM_PERM_CDCC  Experimental x86-64
_MM_PERM_CDCD  Experimental x86-64
_MM_PERM_CDDA  Experimental x86-64
_MM_PERM_CDDB  Experimental x86-64
_MM_PERM_CDDC  Experimental x86-64
_MM_PERM_CDDD  Experimental x86-64
_MM_PERM_DAAA  Experimental x86-64
_MM_PERM_DAAB  Experimental x86-64
_MM_PERM_DAAC  Experimental x86-64
_MM_PERM_DAAD  Experimental x86-64
_MM_PERM_DABA  Experimental x86-64
_MM_PERM_DABB  Experimental x86-64
_MM_PERM_DABC  Experimental x86-64
_MM_PERM_DABD  Experimental x86-64
_MM_PERM_DACA  Experimental x86-64
_MM_PERM_DACB  Experimental x86-64
_MM_PERM_DACC  Experimental x86-64
_MM_PERM_DACD  Experimental x86-64
_MM_PERM_DADA  Experimental x86-64
_MM_PERM_DADB  Experimental x86-64
_MM_PERM_DADC  Experimental x86-64
_MM_PERM_DADD  Experimental x86-64
_MM_PERM_DBAA  Experimental x86-64
_MM_PERM_DBAB  Experimental x86-64
_MM_PERM_DBAC  Experimental x86-64
_MM_PERM_DBAD  Experimental x86-64
_MM_PERM_DBBA  Experimental x86-64
_MM_PERM_DBBB  Experimental x86-64
_MM_PERM_DBBC  Experimental x86-64
_MM_PERM_DBBD  Experimental x86-64
_MM_PERM_DBCA  Experimental x86-64
_MM_PERM_DBCB  Experimental x86-64
_MM_PERM_DBCC  Experimental x86-64
_MM_PERM_DBCD  Experimental x86-64
_MM_PERM_DBDA  Experimental x86-64
_MM_PERM_DBDB  Experimental x86-64
_MM_PERM_DBDC  Experimental x86-64
_MM_PERM_DBDD  Experimental x86-64
_MM_PERM_DCAA  Experimental x86-64
_MM_PERM_DCAB  Experimental x86-64
_MM_PERM_DCAC  Experimental x86-64
_MM_PERM_DCAD  Experimental x86-64
_MM_PERM_DCBA  Experimental x86-64
_MM_PERM_DCBB  Experimental x86-64
_MM_PERM_DCBC  Experimental x86-64
_MM_PERM_DCBD  Experimental x86-64
_MM_PERM_DCCA  Experimental x86-64
_MM_PERM_DCCB  Experimental x86-64
_MM_PERM_DCCC  Experimental x86-64
_MM_PERM_DCCD  Experimental x86-64
_MM_PERM_DCDA  Experimental x86-64
_MM_PERM_DCDB  Experimental x86-64
_MM_PERM_DCDC  Experimental x86-64
_MM_PERM_DCDD  Experimental x86-64
_MM_PERM_DDAA  Experimental x86-64
_MM_PERM_DDAB  Experimental x86-64
_MM_PERM_DDAC  Experimental x86-64
_MM_PERM_DDAD  Experimental x86-64
_MM_PERM_DDBA  Experimental x86-64
_MM_PERM_DDBB  Experimental x86-64
_MM_PERM_DDBC  Experimental x86-64
_MM_PERM_DDBD  Experimental x86-64
_MM_PERM_DDCA  Experimental x86-64
_MM_PERM_DDCB  Experimental x86-64
_MM_PERM_DDCC  Experimental x86-64
_MM_PERM_DDCD  Experimental x86-64
_MM_PERM_DDDA  Experimental x86-64
_MM_PERM_DDDB  Experimental x86-64
_MM_PERM_DDDC  Experimental x86-64
_MM_PERM_DDDD  Experimental x86-64
_XABORT_CAPACITY  Experimental x86-64 Transaction abort due to the transaction using too much memory.
_XABORT_CONFLICT  Experimental x86-64 Transaction abort due to a memory conflict with another thread.
_XABORT_DEBUG  Experimental x86-64 Transaction abort due to a debug trap.
_XABORT_EXPLICIT  Experimental x86-64 Transaction explicitly aborted with xabort. The parameter passed to xabort is available with …
_XABORT_NESTED  Experimental x86-64 Transaction abort in an inner nested transaction.
_XABORT_RETRY  Experimental x86-64 Transaction retry is possible.
_XBEGIN_STARTED  Experimental x86-64 Transaction successfully started.
Functions
_MM_GET_EXCEPTION_MASK⚠  x86-64 and sse See …
_MM_GET_EXCEPTION_STATE⚠  x86-64 and sse See …
_MM_GET_FLUSH_ZERO_MODE⚠  x86-64 and sse See …
_MM_GET_ROUNDING_MODE⚠  x86-64 and sse See …
_MM_SET_EXCEPTION_MASK⚠  x86-64 and sse See …
_MM_SET_EXCEPTION_STATE⚠  x86-64 and sse See …
_MM_SET_FLUSH_ZERO_MODE⚠  x86-64 and sse See …
_MM_SET_ROUNDING_MODE⚠  x86-64 and sse See …
_MM_TRANSPOSE4_PS⚠  x86-64 and sse Transposes the 4x4 matrix formed by 4 rows of __m128 in place.
__cpuid⚠  x86-64 See …
__cpuid_count⚠  x86-64 Returns the result of the …
__get_cpuid_max⚠  x86-64 Returns the highest-supported …
__rdtscp⚠  x86-64 Reads the current value of the processor's time-stamp counter and the …
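The CPUID helpers pair naturally: `__get_cpuid_max` bounds the valid leaves, then `__cpuid` queries one. A sketch (not from the original docs); `cpuid` is part of the x86_64 baseline, so no feature detection is needed:

```rust
#[cfg(target_arch = "x86_64")]
fn main() {
    use std::arch::x86_64::{__cpuid, __get_cpuid_max};
    // Highest supported basic leaf, plus the ebx value of leaf 0.
    let (max_leaf, _ebx) = unsafe { __get_cpuid_max(0) };
    assert!(max_leaf >= 1);
    let r = unsafe { __cpuid(0) };
    // Leaf 0 packs the 12-byte vendor string into ebx, edx, ecx (in that order).
    let mut vendor = Vec::with_capacity(12);
    for reg in &[r.ebx, r.edx, r.ecx] {
        vendor.extend_from_slice(&reg.to_le_bytes());
    }
    println!("max leaf: {}, vendor: {}", max_leaf, String::from_utf8_lossy(&vendor));
}

#[cfg(not(target_arch = "x86_64"))]
fn main() {}
```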
_addcarry_u32⚠  x86-64 Adds unsigned 32-bit integers …
_addcarry_u64⚠  x86-64 Adds unsigned 64-bit integers …
_addcarryx_u32⚠  x86-64 and adx Adds unsigned 32-bit integers …
_addcarryx_u64⚠  x86-64 and adx Adds unsigned 64-bit integers …
_andn_u32⚠  x86-64 and bmi1 Bitwise logical …
_andn_u64⚠  x86-64 and bmi1 Bitwise logical …
_bextr2_u32⚠  x86-64 and bmi1 Extracts bits of …
_bextr2_u64⚠  x86-64 and bmi1 Extracts bits of …
_bextr_u32⚠  x86-64 and bmi1 Extracts bits in range [ …
_bextr_u64⚠  x86-64 and bmi1 Extracts bits in range [ …
_blcfill_u32⚠  x86-64 and tbm Clears all bits below the least significant zero bit of …
_blcfill_u64⚠  x86-64 and tbm Clears all bits below the least significant zero bit of …
_blci_u32⚠  x86-64 and tbm Sets all bits of …
_blci_u64⚠  x86-64 and tbm Sets all bits of …
_blcic_u32⚠  x86-64 and tbm Sets the least significant zero bit of …
_blcic_u64⚠  x86-64 and tbm Sets the least significant zero bit of …
_blcmsk_u32⚠  x86-64 and tbm Sets the least significant zero bit of …
_blcmsk_u64⚠  x86-64 and tbm Sets the least significant zero bit of …
_blcs_u32⚠  x86-64 and tbm Sets the least significant zero bit of …
_blcs_u64⚠  x86-64 and tbm Sets the least significant zero bit of …
_blsfill_u32⚠  x86-64 and tbm Sets all bits of …
_blsfill_u64⚠  x86-64 and tbm Sets all bits of …
_blsi_u32⚠  x86-64 and bmi1 Extracts the lowest set isolated bit.
_blsi_u64⚠  x86-64 and bmi1 Extracts the lowest set isolated bit.
_blsic_u32⚠  x86-64 and tbm Clears the least significant bit and sets all other bits.
_blsic_u64⚠  x86-64 and tbm Clears the least significant bit and sets all other bits.
_blsmsk_u32⚠  x86-64 and bmi1 Gets the mask up to the lowest set bit.
_blsmsk_u64⚠  x86-64 and bmi1 Gets the mask up to the lowest set bit.
_blsr_u32⚠  x86-64 and bmi1 Resets the lowest set bit of …
_blsr_u64⚠  x86-64 and bmi1 Resets the lowest set bit of …
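The BMI1 single-bit helpers (`_blsi`, `_blsmsk`, `_blsr`) are easiest to grasp next to their scalar identities. A sketch with runtime detection; the identities in the comments are the standard definitions:

```rust
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "bmi1")]
unsafe fn bmi1_demo(x: u32) -> (u32, u32, u32) {
    use std::arch::x86_64::{_blsi_u32, _blsmsk_u32, _blsr_u32};
    (_blsi_u32(x), _blsmsk_u32(x), _blsr_u32(x))
}

fn main() {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("bmi1") {
            let x = 0b0101_1000u32;
            let (isolate, mask, reset) = unsafe { bmi1_demo(x) };
            assert_eq!(isolate, 0b0000_1000); // lowest set bit: x & x.wrapping_neg()
            assert_eq!(mask, 0b0000_1111);    // up to and including it: x ^ (x - 1)
            assert_eq!(reset, 0b0101_0000);   // with it cleared: x & (x - 1)
        }
    }
}
```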
_bswap⚠  x86-64 Returns an integer with the reversed byte order of x
_bswap64⚠  x86-64 Returns an integer with the reversed byte order of x
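`_bswap` and `_bswap64` behave like the standard library's `swap_bytes` and need no CPU feature beyond x86_64 itself. A minimal sketch:

```rust
#[cfg(target_arch = "x86_64")]
fn main() {
    use std::arch::x86_64::{_bswap, _bswap64};
    unsafe {
        // Byte order is fully reversed.
        assert_eq!(_bswap(0x12345678), 0x78563412);
        assert_eq!(_bswap64(0x0102030405060708), 0x0807060504030201);
        // Equivalent to the portable std method.
        assert_eq!(_bswap(0x12345678), 0x12345678i32.swap_bytes());
    }
}

#[cfg(not(target_arch = "x86_64"))]
fn main() {}
```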
_bzhi_u32⚠  x86-64 and bmi2 Zeroes the higher bits of …
_bzhi_u64⚠  x86-64 and bmi2 Zeroes the higher bits of …
_fxrstor⚠  x86-64 and fxsr Restores the …
_fxrstor64⚠  x86-64 and fxsr Restores the …
_fxsave⚠  x86-64 and fxsr Saves the …
_fxsave64⚠  x86-64 and fxsr Saves the …
_lzcnt_u32⚠  x86-64 and lzcnt Counts the leading most significant zero bits.
_lzcnt_u64⚠  x86-64 and lzcnt Counts the leading most significant zero bits.
_mm256_abs_epi8⚠  x86-64 and avx2 Computes the absolute values of packed 8-bit integers in …
_mm256_abs_epi16⚠  x86-64 and avx2 Computes the absolute values of packed 16-bit integers in …
_mm256_abs_epi32⚠  x86-64 and avx2 Computes the absolute values of packed 32-bit integers in …
_mm256_add_epi8⚠  x86-64 and avx2 Adds packed 8-bit integers in …
_mm256_add_epi16⚠  x86-64 and avx2 Adds packed 16-bit integers in …
_mm256_add_epi32⚠  x86-64 and avx2 Adds packed 32-bit integers in …
_mm256_add_epi64⚠  x86-64 and avx2 Adds packed 64-bit integers in …
_mm256_add_pd⚠  x86-64 and avx Adds packed double-precision (64-bit) floating-point elements in …
_mm256_add_ps⚠  x86-64 and avx Adds packed single-precision (32-bit) floating-point elements in …
_mm256_adds_epi8⚠  x86-64 and avx2 Adds packed 8-bit integers in …
_mm256_adds_epi16⚠  x86-64 and avx2 Adds packed 16-bit integers in …
_mm256_adds_epu8⚠  x86-64 and avx2 Adds packed unsigned 8-bit integers in …
_mm256_adds_epu16⚠  x86-64 and avx2 Adds packed unsigned 16-bit integers in …
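The `adds`/`adds_epu` family differs from plain `add` by saturating instead of wrapping. A sketch showing the unsigned 8-bit case, assuming AVX2 is available at runtime:

```rust
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn saturating_sum() -> [u8; 32] {
    use std::arch::x86_64::*;
    let a = _mm256_set1_epi8(250u8 as i8); // bit pattern 0xFA in every lane
    let b = _mm256_set1_epi8(10);
    // Unsigned saturating add: 250 + 10 clamps to 255 rather than wrapping to 4.
    let sum = _mm256_adds_epu8(a, b);
    let mut out = [0u8; 32];
    _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, sum);
    out
}

fn main() {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            assert_eq!(unsafe { saturating_sum() }, [255u8; 32]);
        }
    }
}
```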
_mm256_addsub_pd⚠  x86-64 and avx Alternately adds and subtracts packed double-precision (64-bit) floating-point elements in …
_mm256_addsub_ps⚠  x86-64 and avx Alternately adds and subtracts packed single-precision (32-bit) floating-point elements in …
_mm256_alignr_epi8⚠  x86-64 and avx2 Concatenates pairs of 16-byte blocks in …
_mm256_and_pd⚠  x86-64 and avx Computes the bitwise AND of packed double-precision (64-bit) floating-point elements in …
_mm256_and_ps⚠  x86-64 and avx Computes the bitwise AND of packed single-precision (32-bit) floating-point elements in …
_mm256_and_si256⚠  x86-64 and avx2 Computes the bitwise AND of 256 bits (representing integer data) in …
_mm256_andnot_pd⚠  x86-64 and avx Computes the bitwise NOT of packed double-precision (64-bit) floating-point elements in …
_mm256_andnot_ps⚠  x86-64 and avx Computes the bitwise NOT of packed single-precision (32-bit) floating-point elements in …
_mm256_andnot_si256⚠  x86-64 and avx2 Computes the bitwise NOT of 256 bits (representing integer data) in …
_mm256_avg_epu8⚠  x86-64 and avx2 Averages packed unsigned 8-bit integers in …
_mm256_avg_epu16⚠  x86-64 and avx2 Averages packed unsigned 16-bit integers in …
_mm256_blend_epi16⚠  x86-64 and avx2 Blends packed 16-bit integers from …
_mm256_blend_epi32⚠  x86-64 and avx2 Blends packed 32-bit integers from …
_mm256_blend_pd⚠  x86-64 and avx Blends packed double-precision (64-bit) floating-point elements from …
_mm256_blend_ps⚠  x86-64 and avx Blends packed single-precision (32-bit) floating-point elements from …
_mm256_blendv_epi8⚠  x86-64 and avx2 Blends packed 8-bit integers from …
_mm256_blendv_pd⚠  x86-64 and avx Blends packed double-precision (64-bit) floating-point elements from …
_mm256_blendv_ps⚠  x86-64 and avx Blends packed single-precision (32-bit) floating-point elements from …
_mm256_broadcast_pd⚠  x86-64 and avx Broadcasts 128 bits from memory (composed of 2 packed double-precision (64-bit) floating-point elements) to all elements of the returned vector.
_mm256_broadcast_ps⚠  x86-64 and avx Broadcasts 128 bits from memory (composed of 4 packed single-precision (32-bit) floating-point elements) to all elements of the returned vector.
_mm256_broadcast_sd⚠  x86-64 and avx Broadcasts a double-precision (64-bit) floating-point element from memory to all elements of the returned vector.
_mm256_broadcast_ss⚠  x86-64 and avx Broadcasts a single-precision (32-bit) floating-point element from memory to all elements of the returned vector.
_mm256_broadcastb_epi8⚠  x86-64 and avx2 Broadcasts the low packed 8-bit integer from …
_mm256_broadcastd_epi32⚠  x86-64 and avx2 Broadcasts the low packed 32-bit integer from …
_mm256_broadcastq_epi64⚠  x86-64 and avx2 Broadcasts the low packed 64-bit integer from …
_mm256_broadcastsd_pd⚠  x86-64 and avx2 Broadcasts the low double-precision (64-bit) floating-point element from …
_mm256_broadcastsi128_si256⚠  x86-64 and avx2 Broadcasts 128 bits of integer data from a to all 128-bit lanes in the 256-bit returned value.
_mm256_broadcastss_ps⚠  x86-64 and avx2 Broadcasts the low single-precision (32-bit) floating-point element from …
_mm256_broadcastw_epi16⚠  x86-64 and avx2 Broadcasts the low packed 16-bit integer from a to all elements of the 256-bit returned value
_mm256_bslli_epi128⚠  x86-64 and avx2 Shifts 128-bit lanes in …
_mm256_bsrli_epi128⚠  x86-64 and avx2 Shifts 128-bit lanes in …
_mm256_castpd128_pd256⚠  x86-64 and avx Casts vector of type __m128d to type __m256d; the upper 128 bits of the result are undefined.
_mm256_castpd256_pd128⚠  x86-64 and avx Casts vector of type __m256d to type __m128d.
_mm256_castpd_ps⚠  x86-64 and avx Casts vector of type __m256d to type __m256.
_mm256_castpd_si256⚠  x86-64 and avx Casts vector of type __m256d to type __m256i.
_mm256_castps128_ps256⚠  x86-64 and avx Casts vector of type __m128 to type __m256; the upper 128 bits of the result are undefined.
_mm256_castps256_ps128⚠  x86-64 and avx Casts vector of type __m256 to type __m128.
_mm256_castps_pd⚠  x86-64 and avx Casts vector of type __m256 to type __m256d.
_mm256_castps_si256⚠  x86-64 and avx Casts vector of type __m256 to type __m256i.
_mm256_castsi128_si256⚠  x86-64 and avx Casts vector of type __m128i to type __m256i; the upper 128 bits of the result are undefined.
_mm256_castsi256_pd⚠  x86-64 and avx Casts vector of type __m256i to type __m256d.
_mm256_castsi256_ps⚠  x86-64 and avx Casts vector of type __m256i to type __m256.
_mm256_castsi256_si128⚠  x86-64 and avx Casts vector of type __m256i to type __m128i.
_mm256_ceil_pd⚠  x86-64 and avx Rounds packed double-precision (64-bit) floating-point elements in …
_mm256_ceil_ps⚠  x86-64 and avx Rounds packed single-precision (32-bit) floating-point elements in …
_mm256_cmp_pd⚠  x86-64 and avx Compares packed double-precision (64-bit) floating-point elements in …
_mm256_cmp_ps⚠  x86-64 and avx Compares packed single-precision (32-bit) floating-point elements in …
_mm256_cmpeq_epi8⚠  x86-64 and avx2 Compares packed 8-bit integers in …
_mm256_cmpeq_epi16⚠  x86-64 and avx2 Compares packed 16-bit integers in …
_mm256_cmpeq_epi32⚠  x86-64 and avx2 Compares packed 32-bit integers in …
_mm256_cmpeq_epi64⚠  x86-64 and avx2 Compares packed 64-bit integers in …
_mm256_cmpgt_epi8⚠  x86-64 and avx2 Compares packed 8-bit integers in …
_mm256_cmpgt_epi16⚠  x86-64 and avx2 Compares packed 16-bit integers in …
_mm256_cmpgt_epi32⚠  x86-64 and avx2 Compares packed 32-bit integers in …
_mm256_cmpgt_epi64⚠  x86-64 and avx2 Compares packed 64-bit integers in …
_mm256_cvtepi8_epi16⚠  x86-64 and avx2 Sign-extends 8-bit integers to 16-bit integers.
_mm256_cvtepi8_epi32⚠  x86-64 and avx2 Sign-extends 8-bit integers to 32-bit integers.
_mm256_cvtepi8_epi64⚠  x86-64 and avx2 Sign-extends 8-bit integers to 64-bit integers.
_mm256_cvtepi16_epi32⚠  x86-64 and avx2 Sign-extends 16-bit integers to 32-bit integers.
_mm256_cvtepi16_epi64⚠  x86-64 and avx2 Sign-extends 16-bit integers to 64-bit integers.
_mm256_cvtepi32_epi64⚠  x86-64 and avx2 Sign-extends 32-bit integers to 64-bit integers.
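The widening converts take a narrower `__m128i` input and produce a `__m256i`. A sketch of the sign-extending case, assuming AVX2:

```rust
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn widen() -> [i32; 8] {
    use std::arch::x86_64::*;
    // Lowest byte first; only the low 8 of the 16 input bytes are consumed.
    let bytes = _mm_setr_epi8(-1, 2, -3, 4, -5, 6, -7, 8, 0, 0, 0, 0, 0, 0, 0, 0);
    // Sign-extend each i8 lane to an i32 lane.
    let wide = _mm256_cvtepi8_epi32(bytes);
    let mut out = [0i32; 8];
    _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, wide);
    out
}

fn main() {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            assert_eq!(unsafe { widen() }, [-1, 2, -3, 4, -5, 6, -7, 8]);
        }
    }
}
```

The `cvtepu` variants differ only in zero-extending, so `-1i8` would come back as `255` instead of `-1`.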
_mm256_cvtepi32_pd⚠  x86-64 and avx Converts packed 32-bit integers in …
_mm256_cvtepi32_ps⚠  x86-64 and avx Converts packed 32-bit integers in …
_mm256_cvtepu8_epi16⚠  x86-64 and avx2 Zero-extends unsigned 8-bit integers in …
_mm256_cvtepu8_epi32⚠  x86-64 and avx2 Zero-extends the lower eight unsigned 8-bit integers in …
_mm256_cvtepu8_epi64⚠  x86-64 and avx2 Zero-extends the lower four unsigned 8-bit integers in …
_mm256_cvtepu16_epi32⚠  x86-64 and avx2 Zero-extends packed unsigned 16-bit integers in …
_mm256_cvtepu16_epi64⚠  x86-64 and avx2 Zero-extends the lower four unsigned 16-bit integers in …
_mm256_cvtepu32_epi64⚠  x86-64 and avx2 Zero-extends unsigned 32-bit integers in …
_mm256_cvtpd_epi32⚠  x86-64 and avx Converts packed double-precision (64-bit) floating-point elements in …
_mm256_cvtpd_ps⚠  x86-64 and avx Converts packed double-precision (64-bit) floating-point elements in …
_mm256_cvtps_epi32⚠  x86-64 and avx Converts packed single-precision (32-bit) floating-point elements in …
_mm256_cvtps_pd⚠  x86-64 and avx Converts packed single-precision (32-bit) floating-point elements in …
_mm256_cvtsd_f64⚠  x86-64 and avx2 Returns the first element of the input vector of …
_mm256_cvtsi256_si32⚠  x86-64 and avx2 Returns the first element of the input vector of …
_mm256_cvtss_f32⚠  x86-64 and avx Returns the first element of the input vector of …
_mm256_cvttpd_epi32⚠  x86-64 and avx Converts packed double-precision (64-bit) floating-point elements in …
_mm256_cvttps_epi32⚠  x86-64 and avx Converts packed single-precision (32-bit) floating-point elements in …
_mm256_div_pd⚠  x86-64 and avx Computes the division of each of the 4 packed 64-bit floating-point elements in …
_mm256_div_ps⚠  x86-64 and avx Computes the division of each of the 8 packed 32-bit floating-point elements in …
_mm256_dp_ps⚠  x86-64 and avx Conditionally multiplies the packed single-precision (32-bit) floating-point elements in …
_mm256_extract_epi8⚠  x86-64 and avx2 Extracts an 8-bit integer from …
_mm256_extract_epi16⚠  x86-64 and avx2 Extracts a 16-bit integer from …
_mm256_extract_epi32⚠  x86-64 and avx2 Extracts a 32-bit integer from …
_mm256_extract_epi64⚠  x86-64 and avx2 Extracts a 64-bit integer from …
_mm256_extractf128_pd⚠  x86-64 and avx Extracts 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from …
_mm256_extractf128_ps⚠  x86-64 and avx Extracts 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from …
_mm256_extractf128_si256⚠  x86-64 and avx Extracts 128 bits (composed of integer data) from …
_mm256_extracti128_si256⚠  x86-64 and avx2 Extracts 128 bits (of integer data) from …
_mm256_floor_pd⚠  x86-64 and avx Rounds packed double-precision (64-bit) floating-point elements in …
_mm256_floor_ps⚠  x86-64 and avx Rounds packed single-precision (32-bit) floating-point elements in …
_mm256_fmadd_pd⚠  x86-64 and fma Multiplies packed double-precision (64-bit) floating-point elements in …
_mm256_fmadd_ps⚠  x86-64 and fma Multiplies packed single-precision (32-bit) floating-point elements in …
_mm256_fmaddsub_pd⚠  x86-64 and fma Multiplies packed double-precision (64-bit) floating-point elements in …
_mm256_fmaddsub_ps⚠  x86-64 and fma Multiplies packed single-precision (32-bit) floating-point elements in …
_mm256_fmsub_pd⚠  x86-64 and fma Multiplies packed double-precision (64-bit) floating-point elements in …
_mm256_fmsub_ps⚠  x86-64 and fma Multiplies packed single-precision (32-bit) floating-point elements in …
_mm256_fmsubadd_pd⚠  x86-64 and fma Multiplies packed double-precision (64-bit) floating-point elements in …
_mm256_fmsubadd_ps⚠  x86-64 and fma Multiplies packed single-precision (32-bit) floating-point elements in …
_mm256_fnmadd_pd⚠  x86-64 and fma Multiplies packed double-precision (64-bit) floating-point elements in …
_mm256_fnmadd_ps⚠  x86-64 and fma Multiplies packed single-precision (32-bit) floating-point elements in …
_mm256_fnmsub_pd⚠  x86-64 and fma Multiplies packed double-precision (64-bit) floating-point elements in …
_mm256_fnmsub_ps⚠  x86-64 and fma Multiplies packed single-precision (32-bit) floating-point elements in …
_mm256_hadd_epi16⚠  x86-64 and avx2 Horizontally adds adjacent pairs of 16-bit integers in …
_mm256_hadd_epi32⚠  x86-64 and avx2 Horizontally adds adjacent pairs of 32-bit integers in …
_mm256_hadd_pd⚠  x86-64 and avx Horizontal addition of adjacent pairs in the two packed vectors of 4 64-bit floating points …
_mm256_hadd_ps⚠  x86-64 and avx Horizontal addition of adjacent pairs in the two packed vectors of 8 32-bit floating points …
_mm256_hadds_epi16⚠  x86-64 and avx2 Horizontally adds adjacent pairs of 16-bit integers in …
_mm256_hsub_epi16⚠  x86-64 and avx2 Horizontally subtracts adjacent pairs of 16-bit integers in …
_mm256_hsub_epi32⚠  x86-64 and avx2 Horizontally subtracts adjacent pairs of 32-bit integers in …
_mm256_hsub_pd⚠  x86-64 and avx Horizontal subtraction of adjacent pairs in the two packed vectors of 4 64-bit floating points …
_mm256_hsub_ps⚠  x86-64 and avx Horizontal subtraction of adjacent pairs in the two packed vectors of 8 32-bit floating points …
_mm256_hsubs_epi16⚠  x86-64 and avx2 Horizontally subtracts adjacent pairs of 16-bit integers in …
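The 256-bit horizontal operations work within each 128-bit lane and interleave the two inputs, which is a common source of surprises. A sketch for `_mm256_hadd_epi32`, assuming AVX2:

```rust
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn hadd() -> [i32; 8] {
    use std::arch::x86_64::*;
    let a = _mm256_setr_epi32(1, 2, 3, 4, 5, 6, 7, 8);
    let b = _mm256_setr_epi32(10, 20, 30, 40, 50, 60, 70, 80);
    let s = _mm256_hadd_epi32(a, b);
    let mut out = [0i32; 8];
    _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, s);
    out
}

fn main() {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // Low lane: a0+a1, a2+a3, b0+b1, b2+b3; high lane likewise with
            // the upper halves. The result is NOT a simple pairwise sweep of a.
            assert_eq!(unsafe { hadd() }, [3, 7, 30, 70, 11, 15, 110, 150]);
        }
    }
}
```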
_mm256_i32gather_epi32⚠  x86_64 and avx2 Returns values from `slice` at offsets determined by `offsets * scale`.
_mm256_i32gather_epi64⚠  x86_64 and avx2 Returns values from `slice` at offsets determined by `offsets * scale`.
_mm256_i32gather_pd⚠  x86_64 and avx2 Returns values from `slice` at offsets determined by `offsets * scale`.
_mm256_i32gather_ps⚠  x86_64 and avx2 Returns values from `slice` at offsets determined by `offsets * scale`.
_mm256_i64gather_epi32⚠  x86_64 and avx2 Returns values from `slice` at offsets determined by `offsets * scale`.
_mm256_i64gather_epi64⚠  x86_64 and avx2 Returns values from `slice` at offsets determined by `offsets * scale`.
_mm256_i64gather_pd⚠  x86_64 and avx2 Returns values from `slice` at offsets determined by `offsets * scale`.
_mm256_i64gather_ps⚠  x86_64 and avx2 Returns values from `slice` at offsets determined by `offsets * scale`.
_mm256_insert_epi8⚠  x86_64 and avx Copies `a` to the result and inserts the 8-bit integer `i` at the location specified by `index`.
_mm256_insert_epi16⚠  x86_64 and avx Copies `a` to the result and inserts the 16-bit integer `i` at the location specified by `index`.
_mm256_insert_epi32⚠  x86_64 and avx Copies `a` to the result and inserts the 32-bit integer `i` at the location specified by `index`.
_mm256_insert_epi64⚠  x86_64 and avx Copies `a` to the result and inserts the 64-bit integer `i` at the location specified by `index`.
_mm256_insertf128_pd⚠  x86_64 and avx Copies `a` to the result, then inserts 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from `b` at the location specified by `imm8`.
_mm256_insertf128_ps⚠  x86_64 and avx Copies `a` to the result, then inserts 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from `b` at the location specified by `imm8`.
_mm256_insertf128_si256⚠  x86_64 and avx Copies `a` to the result, then inserts 128 bits (composed of integer data) from `b` at the location specified by `imm8`.
_mm256_inserti128_si256⚠  x86_64 and avx2 Copies `a` to the result, then inserts 128 bits (composed of integer data) from `b` at the location specified by `imm8`.
_mm256_lddqu_si256⚠  x86_64 and avx Loads 256 bits of integer data from unaligned memory into the result. This intrinsic may perform better than `_mm256_loadu_si256` when the data crosses a cache line boundary.
_mm256_load_pd⚠  x86_64 and avx Loads 256 bits (composed of 4 packed double-precision (64-bit) floating-point elements) from memory into the result. `mem_addr` must be aligned on a 32-byte boundary or a general-protection exception may be generated.
_mm256_load_ps⚠  x86_64 and avx Loads 256 bits (composed of 8 packed single-precision (32-bit) floating-point elements) from memory into the result. `mem_addr` must be aligned on a 32-byte boundary or a general-protection exception may be generated.
_mm256_load_si256⚠  x86_64 and avx Loads 256 bits of integer data from memory into the result. `mem_addr` must be aligned on a 32-byte boundary or a general-protection exception may be generated.
_mm256_loadu2_m128⚠  x86_64 and avx,sse Loads two 128-bit values (composed of 4 packed single-precision (32-bit) floating-point elements) from memory and combines them into a 256-bit value. The addresses do not need to be aligned on any particular boundary.
_mm256_loadu2_m128d⚠  x86_64 and avx,sse2 Loads two 128-bit values (composed of 2 packed double-precision (64-bit) floating-point elements) from memory and combines them into a 256-bit value. The addresses do not need to be aligned on any particular boundary.
_mm256_loadu2_m128i⚠  x86_64 and avx,sse2 Loads two 128-bit values (composed of integer data) from memory and combines them into a 256-bit value. The addresses do not need to be aligned on any particular boundary.
_mm256_loadu_pd⚠  x86_64 and avx Loads 256 bits (composed of 4 packed double-precision (64-bit) floating-point elements) from memory into the result. `mem_addr` does not need to be aligned on any particular boundary.
_mm256_loadu_ps⚠  x86_64 and avx Loads 256 bits (composed of 8 packed single-precision (32-bit) floating-point elements) from memory into the result. `mem_addr` does not need to be aligned on any particular boundary.
_mm256_loadu_si256⚠  x86_64 and avx Loads 256 bits of integer data from memory into the result. `mem_addr` does not need to be aligned on any particular boundary.
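A minimal sketch of the unaligned load/store pair (helper name `roundtrip` is my own): `_mm256_loadu_ps` and `_mm256_storeu_ps` place no alignment requirement on their pointers, so an ordinary slice can be read and written directly. Assumes AVX is detected at runtime; returns `None` otherwise.

```rust
// Round-trip 8 floats through a __m256 with unaligned load/store (AVX).
#[cfg(target_arch = "x86_64")]
fn roundtrip(src: &[f32; 8]) -> Option<[f32; 8]> {
    if !is_x86_feature_detected!("avx") {
        return None;
    }
    use std::arch::x86_64::*;
    unsafe {
        // loadu/storeu accept any pointer alignment.
        let v = _mm256_loadu_ps(src.as_ptr());
        let mut dst = [0.0f32; 8];
        _mm256_storeu_ps(dst.as_mut_ptr(), v);
        Some(dst)
    }
}

#[cfg(not(target_arch = "x86_64"))]
fn roundtrip(_src: &[f32; 8]) -> Option<[f32; 8]> {
    None
}

fn main() {
    let data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0];
    if let Some(out) = roundtrip(&data) {
        assert_eq!(out, data);
    }
}
```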
_mm256_madd_epi16⚠  x86_64 and avx2 Multiplies packed signed 16-bit integers in `a` and `b`, producing intermediate signed 32-bit integers, then horizontally adds adjacent pairs of those intermediates.
_mm256_maddubs_epi16⚠  x86_64 and avx2 Vertically multiplies each unsigned 8-bit integer from `a` with the corresponding signed 8-bit integer from `b`, producing intermediate signed 16-bit integers, then horizontally adds adjacent pairs using saturation.
_mm256_mask_i32gather_epi32⚠  x86_64 and avx2 Returns values from `slice` at offsets determined by `offsets * scale`; where the highest bit of the corresponding element of `mask` is not set, the element is taken from `src` instead.
_mm256_mask_i32gather_epi64⚠  x86_64 and avx2 Returns values from `slice` at offsets determined by `offsets * scale`; where the highest bit of the corresponding element of `mask` is not set, the element is taken from `src` instead.
_mm256_mask_i32gather_pd⚠  x86_64 and avx2 Returns values from `slice` at offsets determined by `offsets * scale`; where the highest bit of the corresponding element of `mask` is not set, the element is taken from `src` instead.
_mm256_mask_i32gather_ps⚠  x86_64 and avx2 Returns values from `slice` at offsets determined by `offsets * scale`; where the highest bit of the corresponding element of `mask` is not set, the element is taken from `src` instead.
_mm256_mask_i64gather_epi32⚠  x86_64 and avx2 Returns values from `slice` at offsets determined by `offsets * scale`; where the highest bit of the corresponding element of `mask` is not set, the element is taken from `src` instead.
_mm256_mask_i64gather_epi64⚠  x86_64 and avx2 Returns values from `slice` at offsets determined by `offsets * scale`; where the highest bit of the corresponding element of `mask` is not set, the element is taken from `src` instead.
_mm256_mask_i64gather_pd⚠  x86_64 and avx2 Returns values from `slice` at offsets determined by `offsets * scale`; where the highest bit of the corresponding element of `mask` is not set, the element is taken from `src` instead.
_mm256_mask_i64gather_ps⚠  x86_64 and avx2 Returns values from `slice` at offsets determined by `offsets * scale`; where the highest bit of the corresponding element of `mask` is not set, the element is taken from `src` instead.
_mm256_maskload_epi32⚠  x86_64 and avx2 Loads packed 32-bit integers from memory pointed to by `mem_addr` using `mask` (elements are zeroed out when the highest bit of the corresponding element is not set).
_mm256_maskload_epi64⚠  x86_64 and avx2 Loads packed 64-bit integers from memory pointed to by `mem_addr` using `mask` (elements are zeroed out when the highest bit of the corresponding element is not set).
_mm256_maskload_pd⚠  x86_64 and avx Loads packed double-precision (64-bit) floating-point elements from memory into the result using `mask` (elements are zeroed out when the high bit of the corresponding element is not set).
_mm256_maskload_ps⚠  x86_64 and avx Loads packed single-precision (32-bit) floating-point elements from memory into the result using `mask` (elements are zeroed out when the high bit of the corresponding element is not set).
_mm256_maskstore_epi32⚠  x86_64 and avx2 Stores packed 32-bit integers from `a` into memory pointed to by `mem_addr` using `mask` (elements are not stored when the highest bit of the corresponding element is not set).
_mm256_maskstore_epi64⚠  x86_64 and avx2 Stores packed 64-bit integers from `a` into memory pointed to by `mem_addr` using `mask` (elements are not stored when the highest bit of the corresponding element is not set).
_mm256_maskstore_pd⚠  x86_64 and avx Stores packed double-precision (64-bit) floating-point elements from `a` into memory using `mask`.
_mm256_maskstore_ps⚠  x86_64 and avx Stores packed single-precision (32-bit) floating-point elements from `a` into memory using `mask`.
_mm256_max_epi8⚠  x86_64 and avx2 Compares packed 8-bit integers in `a` and `b`, and returns the packed maximum values.
_mm256_max_epi16⚠  x86_64 and avx2 Compares packed 16-bit integers in `a` and `b`, and returns the packed maximum values.
_mm256_max_epi32⚠  x86_64 and avx2 Compares packed 32-bit integers in `a` and `b`, and returns the packed maximum values.
_mm256_max_epu8⚠  x86_64 and avx2 Compares packed unsigned 8-bit integers in `a` and `b`, and returns the packed maximum values.
_mm256_max_epu16⚠  x86_64 and avx2 Compares packed unsigned 16-bit integers in `a` and `b`, and returns the packed maximum values.
_mm256_max_epu32⚠  x86_64 and avx2 Compares packed unsigned 32-bit integers in `a` and `b`, and returns the packed maximum values.
_mm256_max_pd⚠  x86_64 and avx Compares packed double-precision (64-bit) floating-point elements in `a` and `b`, and returns the packed maximum values.
_mm256_max_ps⚠  x86_64 and avx Compares packed single-precision (32-bit) floating-point elements in `a` and `b`, and returns the packed maximum values.
_mm256_min_epi8⚠  x86_64 and avx2 Compares packed 8-bit integers in `a` and `b`, and returns the packed minimum values.
_mm256_min_epi16⚠  x86_64 and avx2 Compares packed 16-bit integers in `a` and `b`, and returns the packed minimum values.
_mm256_min_epi32⚠  x86_64 and avx2 Compares packed 32-bit integers in `a` and `b`, and returns the packed minimum values.
_mm256_min_epu8⚠  x86_64 and avx2 Compares packed unsigned 8-bit integers in `a` and `b`, and returns the packed minimum values.
_mm256_min_epu16⚠  x86_64 and avx2 Compares packed unsigned 16-bit integers in `a` and `b`, and returns the packed minimum values.
_mm256_min_epu32⚠  x86_64 and avx2 Compares packed unsigned 32-bit integers in `a` and `b`, and returns the packed minimum values.
_mm256_min_pd⚠  x86_64 and avx Compares packed double-precision (64-bit) floating-point elements in `a` and `b`, and returns the packed minimum values.
_mm256_min_ps⚠  x86_64 and avx Compares packed single-precision (32-bit) floating-point elements in `a` and `b`, and returns the packed minimum values.
_mm256_movedup_pd⚠  x86_64 and avx Duplicates even-indexed double-precision (64-bit) floating-point elements from `a`, and returns the results.
_mm256_movehdup_ps⚠  x86_64 and avx Duplicates odd-indexed single-precision (32-bit) floating-point elements from `a`, and returns the results.
_mm256_moveldup_ps⚠  x86_64 and avx Duplicates even-indexed single-precision (32-bit) floating-point elements from `a`, and returns the results.
_mm256_movemask_epi8⚠  x86_64 and avx2 Creates a mask from the most significant bit of each 8-bit element in `a`, and returns the result.
_mm256_movemask_pd⚠  x86_64 and avx Sets each bit of the returned mask based on the most significant bit of the corresponding packed double-precision (64-bit) floating-point element in `a`.
_mm256_movemask_ps⚠  x86_64 and avx Sets each bit of the returned mask based on the most significant bit of the corresponding packed single-precision (32-bit) floating-point element in `a`.
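A common pattern built from two of the entries above (sketch; the helper name `eq_mask` is my own): compare bytes for equality with `_mm256_cmpeq_epi8`, then compress the per-byte results into one bit each with `_mm256_movemask_epi8`. Assumes AVX2 at runtime.

```rust
// Bitmask of which of 32 bytes equal `needle` (AVX2).
#[cfg(target_arch = "x86_64")]
fn eq_mask(needle: u8, hay: &[u8; 32]) -> Option<u32> {
    if !is_x86_feature_detected!("avx2") {
        return None;
    }
    use std::arch::x86_64::*;
    unsafe {
        let v = _mm256_loadu_si256(hay.as_ptr() as *const __m256i);
        let n = _mm256_set1_epi8(needle as i8);
        // 0xFF in every matching byte, 0x00 elsewhere.
        let eq = _mm256_cmpeq_epi8(v, n);
        // One bit per byte: the MSB of each 8-bit element.
        Some(_mm256_movemask_epi8(eq) as u32)
    }
}

#[cfg(not(target_arch = "x86_64"))]
fn eq_mask(_needle: u8, _hay: &[u8; 32]) -> Option<u32> {
    None
}

fn main() {
    let mut hay = [0u8; 32];
    hay[0] = b'x';
    hay[5] = b'x';
    if let Some(m) = eq_mask(b'x', &hay) {
        assert_eq!(m, (1 << 0) | (1 << 5));
    }
}
```

This is the core of many vectorized `memchr`-style searches: the returned bitmask can be scanned with `trailing_zeros` to find match positions.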
_mm256_mpsadbw_epu8⚠  x86_64 and avx2 Computes the sum of absolute differences (SADs) of quadruplets of unsigned 8-bit integers in `a` compared to those in `b`.
_mm256_mul_epi32⚠  x86_64 and avx2 Multiplies the low 32-bit integers from each packed 64-bit element in `a` and `b`, and returns the signed 64-bit results.
_mm256_mul_epu32⚠  x86_64 and avx2 Multiplies the low unsigned 32-bit integers from each packed 64-bit element in `a` and `b`, and returns the unsigned 64-bit results.
_mm256_mul_pd⚠  x86_64 and avx Multiplies packed double-precision (64-bit) floating-point elements in `a` and `b`.
_mm256_mul_ps⚠  x86_64 and avx Multiplies packed single-precision (32-bit) floating-point elements in `a` and `b`.
_mm256_mulhi_epi16⚠  x86_64 and avx2 Multiplies the packed 16-bit integers in `a` and `b`, producing intermediate 32-bit integers, and returns the high 16 bits of those intermediates.
_mm256_mulhi_epu16⚠  x86_64 and avx2 Multiplies the packed unsigned 16-bit integers in `a` and `b`, producing intermediate 32-bit integers, and returns the high 16 bits of those intermediates.
_mm256_mulhrs_epi16⚠  x86_64 and avx2 Multiplies packed 16-bit integers in `a` and `b`, producing rounded and scaled signed 16-bit results.
_mm256_mullo_epi16⚠  x86_64 and avx2 Multiplies the packed 16-bit integers in `a` and `b`, producing intermediate 32-bit integers, and returns the low 16 bits of those intermediates.
_mm256_mullo_epi32⚠  x86_64 and avx2 Multiplies the packed 32-bit integers in `a` and `b`, producing intermediate 64-bit integers, and returns the low 32 bits of those intermediates.
_mm256_or_pd⚠  x86_64 and avx Computes the bitwise OR of packed double-precision (64-bit) floating-point elements in `a` and `b`.
_mm256_or_ps⚠  x86_64 and avx Computes the bitwise OR of packed single-precision (32-bit) floating-point elements in `a` and `b`.
_mm256_or_si256⚠  x86_64 and avx2 Computes the bitwise OR of 256 bits (representing integer data) in `a` and `b`.
_mm256_packs_epi16⚠  x86_64 and avx2 Converts packed 16-bit integers from `a` and `b` to packed 8-bit integers using signed saturation.
_mm256_packs_epi32⚠  x86_64 and avx2 Converts packed 32-bit integers from `a` and `b` to packed 16-bit integers using signed saturation.
_mm256_packus_epi16⚠  x86_64 and avx2 Converts packed 16-bit integers from `a` and `b` to packed 8-bit integers using unsigned saturation.
_mm256_packus_epi32⚠  x86_64 and avx2 Converts packed 32-bit integers from `a` and `b` to packed 16-bit integers using unsigned saturation.
_mm256_permute2f128_pd⚠  x86_64 and avx Shuffles 256 bits (composed of 4 packed double-precision (64-bit) floating-point elements) selected by `imm8` from `a` and `b`.
_mm256_permute2f128_ps⚠  x86_64 and avx Shuffles 256 bits (composed of 8 packed single-precision (32-bit) floating-point elements) selected by `imm8` from `a` and `b`.
_mm256_permute2f128_si256⚠  x86_64 and avx Shuffles 128-bit blocks (composed of integer data) selected by `imm8` from `a` and `b`.
_mm256_permute2x128_si256⚠  x86_64 and avx2 Shuffles 128 bits of integer data selected by `imm8` from `a` and `b`.
_mm256_permute4x64_epi64⚠  x86_64 and avx2 Permutes 64-bit integers from `a` using the control in `imm8`.
_mm256_permute4x64_pd⚠  x86_64 and avx2 Shuffles 64-bit floating-point elements in `a` across lanes using the control in `imm8`.
_mm256_permute_pd⚠  x86_64 and avx Shuffles double-precision (64-bit) floating-point elements in `a` within 128-bit lanes using the control in `imm8`.
_mm256_permute_ps⚠  x86_64 and avx Shuffles single-precision (32-bit) floating-point elements in `a` within 128-bit lanes using the control in `imm8`.
_mm256_permutevar8x32_epi32⚠  x86_64 and avx2 Permutes packed 32-bit integers from `a` according to the content of `b`.
_mm256_permutevar8x32_ps⚠  x86_64 and avx2 Shuffles eight 32-bit floating-point elements in `a` across lanes using the corresponding 32-bit integer index in `idx`.
_mm256_permutevar_pd⚠  x86_64 and avx Shuffles double-precision (64-bit) floating-point elements in `a` within 128-bit lanes using the control in `b`.
_mm256_permutevar_ps⚠  x86_64 and avx Shuffles single-precision (32-bit) floating-point elements in `a` within 128-bit lanes using the control in `b`.
_mm256_rcp_ps⚠  x86_64 and avx Computes the approximate reciprocal of packed single-precision (32-bit) floating-point elements in `a`.
_mm256_round_pd⚠  x86_64 and avx Rounds packed double-precision (64-bit) floating-point elements in `a` according to the rounding parameter.
_mm256_round_ps⚠  x86_64 and avx Rounds packed single-precision (32-bit) floating-point elements in `a` according to the rounding parameter.
_mm256_rsqrt_ps⚠  x86_64 and avx Computes the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in `a`.
_mm256_sad_epu8⚠  x86_64 and avx2 Computes the absolute differences of packed unsigned 8-bit integers in `a` and `b`, then horizontally sums the differences in groups of 8.
_mm256_set1_epi8⚠  x86_64 and avx Broadcasts 8-bit integer `a` to all elements of the returned vector.
_mm256_set1_epi16⚠  x86_64 and avx Broadcasts 16-bit integer `a` to all elements of the returned vector.
_mm256_set1_epi32⚠  x86_64 and avx Broadcasts 32-bit integer `a` to all elements of the returned vector.
_mm256_set1_epi64x⚠  x86_64 and avx Broadcasts 64-bit integer `a` to all elements of the returned vector.
_mm256_set1_pd⚠  x86_64 and avx Broadcasts double-precision (64-bit) floating-point value `a` to all elements of the returned vector.
_mm256_set1_ps⚠  x86_64 and avx Broadcasts single-precision (32-bit) floating-point value `a` to all elements of the returned vector.
_mm256_set_epi8⚠  x86_64 and avx Sets packed 8-bit integers in the returned vector with the supplied values in reverse order.
_mm256_set_epi16⚠  x86_64 and avx Sets packed 16-bit integers in the returned vector with the supplied values.
_mm256_set_epi32⚠  x86_64 and avx Sets packed 32-bit integers in the returned vector with the supplied values.
_mm256_set_epi64x⚠  x86_64 and avx Sets packed 64-bit integers in the returned vector with the supplied values.
_mm256_set_m128⚠  x86_64 and avx Sets packed `__m256` returned vector with the supplied values.
_mm256_set_m128d⚠  x86_64 and avx Sets packed `__m256d` returned vector with the supplied values.
_mm256_set_m128i⚠  x86_64 and avx Sets packed `__m256i` returned vector with the supplied values.
_mm256_set_pd⚠  x86_64 and avx Sets packed double-precision (64-bit) floating-point elements in the returned vector with the supplied values.
_mm256_set_ps⚠  x86_64 and avx Sets packed single-precision (32-bit) floating-point elements in the returned vector with the supplied values.
_mm256_setr_epi8⚠  x86_64 and avx Sets packed 8-bit integers in the returned vector with the supplied values in reverse order.
_mm256_setr_epi16⚠  x86_64 and avx Sets packed 16-bit integers in the returned vector with the supplied values in reverse order.
_mm256_setr_epi32⚠  x86_64 and avx Sets packed 32-bit integers in the returned vector with the supplied values in reverse order.
_mm256_setr_epi64x⚠  x86_64 and avx Sets packed 64-bit integers in the returned vector with the supplied values in reverse order.
_mm256_setr_m128⚠  x86_64 and avx Sets packed `__m256` returned vector with the supplied values.
_mm256_setr_m128d⚠  x86_64 and avx Sets packed `__m256d` returned vector with the supplied values.
_mm256_setr_m128i⚠  x86_64 and avx Sets packed `__m256i` returned vector with the supplied values.
_mm256_setr_pd⚠  x86_64 and avx Sets packed double-precision (64-bit) floating-point elements in the returned vector with the supplied values in reverse order.
_mm256_setr_ps⚠  x86_64 and avx Sets packed single-precision (32-bit) floating-point elements in the returned vector with the supplied values in reverse order.
_mm256_setzero_pd⚠  x86_64 and avx Returns vector of type `__m256d` with all elements set to zero.
_mm256_setzero_ps⚠  x86_64 and avx Returns vector of type `__m256` with all elements set to zero.
_mm256_setzero_si256⚠  x86_64 and avx Returns vector of type `__m256i` with all elements set to zero.
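The `set_*`/`setr_*` argument ordering trips people up, so here is a sketch (helper name `ordering` is my own): `_mm256_set_epi32` takes its arguments highest element first, while `_mm256_setr_epi32` takes them in memory order, so the two calls below build the same vector. Assumes AVX at runtime.

```rust
// set_* takes arguments high-to-low; setr_* takes them low-to-high.
#[cfg(target_arch = "x86_64")]
fn ordering() -> Option<([i32; 8], [i32; 8])> {
    if !is_x86_feature_detected!("avx") {
        return None;
    }
    use std::arch::x86_64::*;
    unsafe {
        let hi_first = _mm256_set_epi32(7, 6, 5, 4, 3, 2, 1, 0);
        let mem_order = _mm256_setr_epi32(0, 1, 2, 3, 4, 5, 6, 7);
        let (mut a, mut b) = ([0i32; 8], [0i32; 8]);
        _mm256_storeu_si256(a.as_mut_ptr() as *mut __m256i, hi_first);
        _mm256_storeu_si256(b.as_mut_ptr() as *mut __m256i, mem_order);
        Some((a, b))
    }
}

#[cfg(not(target_arch = "x86_64"))]
fn ordering() -> Option<([i32; 8], [i32; 8])> {
    None
}

fn main() {
    if let Some((a, b)) = ordering() {
        // Both spell 0..=7 in memory order.
        assert_eq!(a, b);
        assert_eq!(a, [0, 1, 2, 3, 4, 5, 6, 7]);
    }
}
```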
_mm256_shuffle_epi8⚠  x86_64 and avx2 Shuffles bytes from `a` according to the content of `b`.
_mm256_shuffle_epi32⚠  x86_64 and avx2 Shuffles 32-bit integers in 128-bit lanes of `a` using the control in `imm8`.
_mm256_shuffle_pd⚠  x86_64 and avx Shuffles double-precision (64-bit) floating-point elements within 128-bit lanes using the control in `imm8`.
_mm256_shuffle_ps⚠  x86_64 and avx Shuffles single-precision (32-bit) floating-point elements in `a` within 128-bit lanes using the control in `imm8`.
_mm256_shufflehi_epi16⚠  x86_64 and avx2 Shuffles 16-bit integers in the high 64 bits of 128-bit lanes of `a` using the control in `imm8`.
_mm256_shufflelo_epi16⚠  x86_64 and avx2 Shuffles 16-bit integers in the low 64 bits of 128-bit lanes of `a` using the control in `imm8`.
_mm256_sign_epi8⚠  x86_64 and avx2 Negates packed 8-bit integers in `a` when the corresponding signed 8-bit integer in `b` is negative.
_mm256_sign_epi16⚠  x86_64 and avx2 Negates packed 16-bit integers in `a` when the corresponding signed 16-bit integer in `b` is negative.
_mm256_sign_epi32⚠  x86_64 and avx2 Negates packed 32-bit integers in `a` when the corresponding signed 32-bit integer in `b` is negative.
_mm256_sll_epi16⚠  x86_64 and avx2 Shifts packed 16-bit integers in `a` left by `count` while shifting in zeros.
_mm256_sll_epi32⚠  x86_64 and avx2 Shifts packed 32-bit integers in `a` left by `count` while shifting in zeros.
_mm256_sll_epi64⚠  x86_64 and avx2 Shifts packed 64-bit integers in `a` left by `count` while shifting in zeros.
_mm256_slli_epi16⚠  x86_64 and avx2 Shifts packed 16-bit integers in `a` left by `imm8` while shifting in zeros.
_mm256_slli_epi32⚠  x86_64 and avx2 Shifts packed 32-bit integers in `a` left by `imm8` while shifting in zeros.
_mm256_slli_epi64⚠  x86_64 and avx2 Shifts packed 64-bit integers in `a` left by `imm8` while shifting in zeros.
_mm256_slli_si256⚠  x86_64 and avx2 Shifts 128-bit lanes in `a` left by `imm8` bytes while shifting in zeros.
_mm256_sllv_epi32⚠  x86_64 and avx2 Shifts packed 32-bit integers in `a` left by the amount specified by the corresponding element in `count` while shifting in zeros.
_mm256_sllv_epi64⚠  x86_64 and avx2 Shifts packed 64-bit integers in `a` left by the amount specified by the corresponding element in `count` while shifting in zeros.
_mm256_sqrt_pd⚠  x86_64 and avx Returns the square root of packed double-precision (64-bit) floating-point elements in `a`.
_mm256_sqrt_ps⚠  x86_64 and avx Returns the square root of packed single-precision (32-bit) floating-point elements in `a`.
_mm256_sra_epi16⚠  x86_64 and avx2 Shifts packed 16-bit integers in `a` right by `count` while shifting in sign bits.
_mm256_sra_epi32⚠  x86_64 and avx2 Shifts packed 32-bit integers in `a` right by `count` while shifting in sign bits.
_mm256_srai_epi16⚠  x86_64 and avx2 Shifts packed 16-bit integers in `a` right by `imm8` while shifting in sign bits.
_mm256_srai_epi32⚠  x86_64 and avx2 Shifts packed 32-bit integers in `a` right by `imm8` while shifting in sign bits.
_mm256_srav_epi32⚠  x86_64 and avx2 Shifts packed 32-bit integers in `a` right by the amount specified by the corresponding element in `count` while shifting in sign bits.
_mm256_srl_epi16⚠  x86_64 and avx2 Shifts packed 16-bit integers in `a` right by `count` while shifting in zeros.
_mm256_srl_epi32⚠  x86_64 and avx2 Shifts packed 32-bit integers in `a` right by `count` while shifting in zeros.
_mm256_srl_epi64⚠  x86_64 and avx2 Shifts packed 64-bit integers in `a` right by `count` while shifting in zeros.
_mm256_srli_epi16⚠  x86_64 and avx2 Shifts packed 16-bit integers in `a` right by `imm8` while shifting in zeros.
_mm256_srli_epi32⚠  x86_64 and avx2 Shifts packed 32-bit integers in `a` right by `imm8` while shifting in zeros.
_mm256_srli_epi64⚠  x86_64 and avx2 Shifts packed 64-bit integers in `a` right by `imm8` while shifting in zeros.
_mm256_srli_si256⚠  x86_64 and avx2 Shifts 128-bit lanes in `a` right by `imm8` bytes while shifting in zeros.
_mm256_srlv_epi32⚠  x86_64 and avx2 Shifts packed 32-bit integers in `a` right by the amount specified by the corresponding element in `count` while shifting in zeros.
_mm256_srlv_epi64⚠  x86_64 and avx2 Shifts packed 64-bit integers in `a` right by the amount specified by the corresponding element in `count` while shifting in zeros.
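Unlike the `sll`/`slli` forms, which apply one shift count to every element, the `sllv`/`srlv` variable-shift forms take a per-element count vector. A sketch (helper name `sllv_demo` is my own), assuming AVX2 at runtime:

```rust
// Each 32-bit element of `a` is shifted by its own count: 1 << i here.
#[cfg(target_arch = "x86_64")]
fn sllv_demo() -> Option<[i32; 8]> {
    if !is_x86_feature_detected!("avx2") {
        return None;
    }
    use std::arch::x86_64::*;
    unsafe {
        let a = _mm256_set1_epi32(1);
        let counts = _mm256_setr_epi32(0, 1, 2, 3, 4, 5, 6, 7);
        let r = _mm256_sllv_epi32(a, counts); // per-element 1 << count
        let mut out = [0i32; 8];
        _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, r);
        Some(out)
    }
}

#[cfg(not(target_arch = "x86_64"))]
fn sllv_demo() -> Option<[i32; 8]> {
    None
}

fn main() {
    if let Some(out) = sllv_demo() {
        assert_eq!(out, [1, 2, 4, 8, 16, 32, 64, 128]);
    }
}
```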
_mm256_store_pd⚠  x86_64 and avx Stores 256 bits (composed of 4 packed double-precision (64-bit) floating-point elements) from `a` into memory. `mem_addr` must be aligned on a 32-byte boundary or a general-protection exception may be generated.
_mm256_store_ps⚠  x86_64 and avx Stores 256 bits (composed of 8 packed single-precision (32-bit) floating-point elements) from `a` into memory. `mem_addr` must be aligned on a 32-byte boundary or a general-protection exception may be generated.
_mm256_store_si256⚠  x86_64 and avx Stores 256 bits of integer data from `a` into memory. `mem_addr` must be aligned on a 32-byte boundary or a general-protection exception may be generated.
_mm256_storeu2_m128⚠  x86_64 and avx,sse Stores the high and low 128-bit halves (each composed of 4 packed single-precision (32-bit) floating-point elements) from `a` into memory at two different locations. The addresses do not need to be aligned on any particular boundary.
_mm256_storeu2_m128d⚠  x86_64 and avx,sse2 Stores the high and low 128-bit halves (each composed of 2 packed double-precision (64-bit) floating-point elements) from `a` into memory at two different locations. The addresses do not need to be aligned on any particular boundary.
_mm256_storeu2_m128i⚠  x86_64 and avx,sse2 Stores the high and low 128-bit halves (each composed of integer data) from `a` into memory at two different locations. The addresses do not need to be aligned on any particular boundary.
_mm256_storeu_pd⚠  x86_64 and avx Stores 256 bits (composed of 4 packed double-precision (64-bit) floating-point elements) from `a` into memory. `mem_addr` does not need to be aligned on any particular boundary.
_mm256_storeu_ps⚠  x86_64 and avx Stores 256 bits (composed of 8 packed single-precision (32-bit) floating-point elements) from `a` into memory. `mem_addr` does not need to be aligned on any particular boundary.
_mm256_storeu_si256⚠  x86_64 and avx Stores 256 bits of integer data from `a` into memory. `mem_addr` does not need to be aligned on any particular boundary.
_mm256_stream_pd⚠  x86_64 and avx Moves double-precision values from a 256-bit vector of `[4 x double]` to a 32-byte aligned memory location. To minimize caching, the data is flagged as non-temporal (unlikely to be used again soon).
_mm256_stream_ps⚠  x86_64 and avx Moves single-precision floating-point values from a 256-bit vector of `[8 x float]` to a 32-byte aligned memory location. To minimize caching, the data is flagged as non-temporal (unlikely to be used again soon).
_mm256_stream_si256⚠  x86_64 and avx Moves integer data from a 256-bit integer vector to a 32-byte aligned memory location. To minimize caching, the data is flagged as non-temporal (unlikely to be used again soon).
_mm256_sub_epi8⚠  x86_64 and avx2 Subtracts packed 8-bit integers in `b` from packed 8-bit integers in `a`.
_mm256_sub_epi16⚠  x86_64 and avx2 Subtracts packed 16-bit integers in `b` from packed 16-bit integers in `a`.
_mm256_sub_epi32⚠  x86_64 and avx2 Subtracts packed 32-bit integers in `b` from packed 32-bit integers in `a`.
_mm256_sub_epi64⚠  x86_64 and avx2 Subtracts packed 64-bit integers in `b` from packed 64-bit integers in `a`.
_mm256_sub_pd⚠  x86_64 and avx Subtracts packed double-precision (64-bit) floating-point elements in `b` from those in `a`.
_mm256_sub_ps⚠  x86_64 and avx Subtracts packed single-precision (32-bit) floating-point elements in `b` from those in `a`.
_mm256_subs_epi8⚠  x86_64 and avx2 Subtracts packed 8-bit integers in `b` from packed 8-bit integers in `a` using saturation.
_mm256_subs_epi16⚠  x86_64 and avx2 Subtracts packed 16-bit integers in `b` from packed 16-bit integers in `a` using saturation.
_mm256_subs_epu8⚠  x86_64 and avx2 Subtracts packed unsigned 8-bit integers in `b` from packed unsigned 8-bit integers in `a` using saturation.
_mm256_subs_epu16⚠  x86_64 and avx2 Subtracts packed unsigned 16-bit integers in `b` from packed unsigned 16-bit integers in `a` using saturation.
_mm256_testc_pd⚠  x86_64 and avx Computes the bitwise AND of 256 bits (representing double-precision (64-bit) floating-point elements) in `a` and `b`, tests the sign bits, and returns the CF value.
_mm256_testc_ps⚠  x86_64 and avx Computes the bitwise AND of 256 bits (representing single-precision (32-bit) floating-point elements) in `a` and `b`, tests the sign bits, and returns the CF value.
_mm256_testc_si256⚠  x86_64 and avx Computes the bitwise AND of 256 bits (representing integer data) in `a` and `b`, and returns the CF value.
_mm256_testnzc_pd⚠  x86_64 and avx Computes the bitwise AND of 256 bits (representing double-precision (64-bit) floating-point elements) in `a` and `b`, tests the sign bits, and returns 1 if both the ZF and CF values are zero, otherwise 0.
_mm256_testnzc_ps⚠  x86_64 and avx Computes the bitwise AND of 256 bits (representing single-precision (32-bit) floating-point elements) in `a` and `b`, tests the sign bits, and returns 1 if both the ZF and CF values are zero, otherwise 0.
_mm256_testnzc_si256⚠  x86_64 and avx Computes the bitwise AND of 256 bits (representing integer data) in `a` and `b`, and returns 1 if both the ZF and CF values are zero, otherwise 0.
_mm256_testz_pd⚠  x86_64 and avx Computes the bitwise AND of 256 bits (representing double-precision (64-bit) floating-point elements) in `a` and `b`, tests the sign bits, and returns the ZF value.
_mm256_testz_ps⚠  x86_64 and avx Computes the bitwise AND of 256 bits (representing single-precision (32-bit) floating-point elements) in `a` and `b`, tests the sign bits, and returns the ZF value.
_mm256_testz_si256⚠  x86_64 and avx Computes the bitwise AND of 256 bits (representing integer data) in `a` and `b`, and returns the ZF value (1 if the result is zero, otherwise 0).
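A sketch of how the `test*` family is typically used (helper name `disjoint` is my own): `_mm256_testz_si256` returns 1 exactly when the AND of its operands is all zeros, which answers "do these two 256-bit sets share any bits?" without materializing the AND result. Assumes AVX at runtime.

```rust
// Returns whether two 256-bit bit sets have no bits in common (AVX).
#[cfg(target_arch = "x86_64")]
fn disjoint(a: &[u64; 4], b: &[u64; 4]) -> Option<bool> {
    if !is_x86_feature_detected!("avx") {
        return None;
    }
    use std::arch::x86_64::*;
    unsafe {
        let va = _mm256_loadu_si256(a.as_ptr() as *const __m256i);
        let vb = _mm256_loadu_si256(b.as_ptr() as *const __m256i);
        // ZF is set (return value 1) iff (a AND b) == 0.
        Some(_mm256_testz_si256(va, vb) == 1)
    }
}

#[cfg(not(target_arch = "x86_64"))]
fn disjoint(_a: &[u64; 4], _b: &[u64; 4]) -> Option<bool> {
    None
}

fn main() {
    let a = [0b0101u64, 0, 0, 0];
    let b = [0b1010u64, 0, 0, 0];
    if let Some(d) = disjoint(&a, &b) {
        assert!(d); // no common bits set
    }
}
```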
_mm256_undefined_pd⚠  x86_64 and avx Returns vector of type `__m256d` with undefined elements.
_mm256_undefined_ps⚠  x86_64 and avx Returns vector of type `__m256` with undefined elements.
_mm256_undefined_si256⚠  x86_64 and avx Returns vector of type `__m256i` with undefined elements.
_mm256_unpackhi_epi8⚠  x86_64 and avx2 Unpacks and interleaves 8-bit integers from the high half of each 128-bit lane in `a` and `b`.
_mm256_unpackhi_epi16⚠  x86_64 and avx2 Unpacks and interleaves 16-bit integers from the high half of each 128-bit lane of `a` and `b`.
_mm256_unpackhi_epi32⚠  x86_64 and avx2 Unpacks and interleaves 32-bit integers from the high half of each 128-bit lane of `a` and `b`.
_mm256_unpackhi_epi64⚠  x86_64 and avx2 Unpacks and interleaves 64-bit integers from the high half of each 128-bit lane of `a` and `b`.
_mm256_unpackhi_pd⚠  x86_64 and avx Unpacks and interleaves double-precision (64-bit) floating-point elements from the high half of each 128-bit lane in `a` and `b`.
_mm256_unpackhi_ps⚠  x86_64 and avx Unpacks and interleaves single-precision (32-bit) floating-point elements from the high half of each 128-bit lane in `a` and `b`.
_mm256_unpacklo_epi8⚠  x86_64 and avx2 Unpacks and interleaves 8-bit integers from the low half of each 128-bit lane of `a` and `b`.
_mm256_unpacklo_epi16⚠  x86_64 and avx2 Unpacks and interleaves 16-bit integers from the low half of each 128-bit lane of `a` and `b`.
_mm256_unpacklo_epi32⚠  x86_64 and avx2 Unpacks and interleaves 32-bit integers from the low half of each 128-bit lane of `a` and `b`.
_mm256_unpacklo_epi64⚠  x86_64 and avx2 Unpacks and interleaves 64-bit integers from the low half of each 128-bit lane of `a` and `b`.
_mm256_unpacklo_pd⚠  x86_64 and avx Unpacks and interleaves double-precision (64-bit) floating-point elements from the low half of each 128-bit lane in `a` and `b`.
_mm256_unpacklo_ps⚠  x86_64 and avx Unpacks and interleaves single-precision (32-bit) floating-point elements from the low half of each 128-bit lane in `a` and `b`.
_mm256_xor_pd⚠  x86_64 and avx Computes the bitwise XOR of packed double-precision (64-bit) floating-point elements in `a` and `b`.
_mm256_xor_ps⚠  x86_64 and avx Computes the bitwise XOR of packed single-precision (32-bit) floating-point elements in `a` and `b`.
_mm256_xor_si256⚠  x86_64 and avx2 Computes the bitwise XOR of 256 bits (representing integer data) in `a` and `b`.
_mm256_zeroall⚠  x86_64 and avx Zeroes the contents of all XMM or YMM registers.
_mm256_zeroupper⚠  x86_64 and avx Zeroes the upper 128 bits of all YMM registers; the lower 128 bits of the registers are unmodified.
_mm256_zextpd128_pd256⚠  x86_64 and avx,sse2 Constructs a 256-bit floating-point vector of `[4 x double]` from a 128-bit floating-point vector of `[2 x double]`. The lower 128 bits contain the value of the source vector. The upper 128 bits are set to zero.
_mm256_zextps128_ps256⚠  x86_64 and avx,sse Constructs a 256-bit floating-point vector of `[8 x float]` from a 128-bit floating-point vector of `[4 x float]`. The lower 128 bits contain the value of the source vector. The upper 128 bits are set to zero.
_mm256_zextsi128_si256⚠  x86_64 and avx,sse2 Constructs a 256-bit integer vector from a 128-bit integer vector. The lower 128 bits contain the value of the source vector. The upper 128 bits are set to zero.
_mm512_storeu_ps⚠  x86_64 and avx512f Stores 512 bits (composed of 16 packed single-precision (32-bit) floating-point elements) from `a` into memory. `mem_addr` does not need to be aligned on any particular boundary.
_mm_abs_epi8⚠  x86_64 and ssse3 Computes the absolute value of packed 8-bit signed integers in `a` and returns the unsigned results.
_mm_abs_epi16⚠  x86_64 and ssse3 Computes the absolute value of each of the packed 16-bit signed integers in `a` and returns the unsigned results.
_mm_abs_epi32⚠  x86_64 and ssse3 Computes the absolute value of each of the packed 32-bit signed integers in `a` and returns the unsigned results.
_mm_add_epi8⚠  x86_64 and sse2 Adds packed 8-bit integers in `a` and `b`.
_mm_add_epi16⚠  x86_64 and sse2 Adds packed 16-bit integers in `a` and `b`.
_mm_add_epi32⚠  x86_64 and sse2 Adds packed 32-bit integers in `a` and `b`.
_mm_add_epi64⚠  x86_64 and sse2 Adds packed 64-bit integers in `a` and `b`.
_mm_add_pd⚠  x86_64 and sse2 Adds packed double-precision (64-bit) floating-point elements in `a` and `b`.
_mm_add_ps⚠  x86_64 and sse Adds `__m128` vectors.
_mm_add_sd⚠  x86_64 and sse2 Returns a new vector with the low element of `a` replaced by the sum of the low elements of `a` and `b`.
_mm_add_ss⚠  x86_64 and sse Adds the first component of `a` and `b`; the other components are copied from `a`.
_mm_adds_epi8⚠  x86_64 and sse2 Adds packed 8-bit integers in `a` and `b` using saturation.
_mm_adds_epi16⚠  x86_64 and sse2 Adds packed 16-bit integers in `a` and `b` using saturation.
_mm_adds_epu8⚠  x86_64 and sse2 Adds packed unsigned 8-bit integers in `a` and `b` using saturation.
_mm_adds_epu16⚠  x86_64 and sse2 Adds packed unsigned 16-bit integers in `a` and `b` using saturation.
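A sketch contrasting saturating and wrapping byte addition (helper name `add_compare` is my own): `_mm_adds_epu8` clamps each unsigned sum at 255, while `_mm_add_epi8` wraps modulo 256. Assumes SSE2 at runtime.

```rust
// 200 + 100 per byte: saturating gives 255, wrapping gives 44 (300 mod 256).
#[cfg(target_arch = "x86_64")]
fn add_compare() -> Option<(u8, u8)> {
    if !is_x86_feature_detected!("sse2") {
        return None;
    }
    use std::arch::x86_64::*;
    unsafe {
        let a = _mm_set1_epi8(200u8 as i8);
        let b = _mm_set1_epi8(100u8 as i8);
        let sat = _mm_adds_epu8(a, b); // clamps at 255
        let wrap = _mm_add_epi8(a, b); // wraps modulo 256
        let (mut s, mut w) = ([0u8; 16], [0u8; 16]);
        _mm_storeu_si128(s.as_mut_ptr() as *mut __m128i, sat);
        _mm_storeu_si128(w.as_mut_ptr() as *mut __m128i, wrap);
        Some((s[0], w[0]))
    }
}

#[cfg(not(target_arch = "x86_64"))]
fn add_compare() -> Option<(u8, u8)> {
    None
}

fn main() {
    if let Some((s, w)) = add_compare() {
        assert_eq!((s, w), (255, 44));
    }
}
```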
_mm_addsub_pd⚠  x86_64 and sse3 Alternatively adds and subtracts packed double-precision (64-bit) floating-point elements in `a` to/from packed elements in `b`.
_mm_addsub_ps⚠  x86_64 and sse3 Alternatively adds and subtracts packed single-precision (32-bit) floating-point elements in `a` to/from packed elements in `b`.
_mm_aesdec_si128⚠  x86_64 and aes Performs one round of an AES decryption flow on data (state) in `a` using the round key in `round_key`.
_mm_aesdeclast_si128⚠  x86_64 and aes Performs the last round of an AES decryption flow on data (state) in `a` using the round key in `round_key`.
_mm_aesenc_si128⚠  x86_64 and aes Performs one round of an AES encryption flow on data (state) in `a` using the round key in `round_key`.
_mm_aesenclast_si128⚠  x86_64 and aes Performs the last round of an AES encryption flow on data (state) in `a` using the round key in `round_key`.
_mm_aesimc_si128⚠  x86_64 and aes Performs the `InvMixColumns` transformation on `a`.
_mm_aeskeygenassist_si128⚠  x86_64 and aes Assists in expanding the AES cipher key.
_mm_alignr_epi8⚠  x86_64 and ssse3 Concatenates 16-byte blocks in `a` and `b` into a 32-byte temporary result, shifts the result right by `n` bytes, and returns the low 16 bytes.
_mm_and_pd⚠  x86_64 and sse2 Computes the bitwise AND of packed double-precision (64-bit) floating-point elements in `a` and `b`.
_mm_and_ps⚠  x86_64 and sse Bitwise AND of packed single-precision (32-bit) floating-point elements.
_mm_and_si128⚠  x86_64 and sse2 Computes the bitwise AND of 128 bits (representing integer data) in `a` and `b`.
_mm_andnot_pd⚠  x86_64 and sse2 Computes the bitwise NOT of `a` and then AND with `b`.
_mm_andnot_ps⚠  x86_64 and sse Bitwise AND-NOT of packed single-precision (32-bit) floating-point elements.
_mm_andnot_si128⚠  x86_64 and sse2 Computes the bitwise NOT of 128 bits (representing integer data) in `a` and then AND with `b`.
_mm_avg_epu8⚠  x86_64 and sse2 Averages packed unsigned 8-bit integers in `a` and `b`.
_mm_avg_epu16⚠  x86_64 and sse2 Averages packed unsigned 16-bit integers in `a` and `b`.
_mm_blend_epi16⚠  x86_64 and sse4.1 Blends packed 16-bit integers from `a` and `b` using the mask `imm8`.
_mm_blend_epi32⚠  x86_64 and avx2 Blends packed 32-bit integers from `a` and `b` using control mask `imm8`.
_mm_blend_pd⚠  x86_64 and sse4.1 Blends packed double-precision (64-bit) floating-point elements from `a` and `b` using control mask `imm8`.
_mm_blend_ps⚠  x86_64 and sse4.1 Blends packed single-precision (32-bit) floating-point elements from `a` and `b` using control mask `imm8`.
_mm_blendv_epi8⚠  x86_64 and sse4.1 Blends packed 8-bit integers from `a` and `b` using `mask`.
_mm_blendv_pd⚠  x86_64 and sse4.1 Blends packed double-precision (64-bit) floating-point elements from `a` and `b` using `mask`.
_mm_blendv_ps⚠  x86_64 and sse4.1 Blends packed single-precision (32-bit) floating-point elements from `a` and `b` using `mask`.
_mm_broadcast_ss⚠  x86_64 and avx Broadcasts a single-precision (32-bit) floating-point element from memory to all elements of the returned vector.
_mm_broadcastb_epi8⚠  x86_64 and avx2 Broadcasts the low packed 8-bit integer from `a` to all elements of the 128-bit returned value.
_mm_broadcastd_epi32⚠  x86_64 and avx2 Broadcasts the low packed 32-bit integer from `a` to all elements of the 128-bit returned value.
_mm_broadcastq_epi64⚠  x86_64 and avx2 Broadcasts the low packed 64-bit integer from `a` to all elements of the 128-bit returned value.
_mm_broadcastsd_pd⚠  x86_64 and avx2 Broadcasts the low double-precision (64-bit) floating-point element from `a` to all elements of the 128-bit returned value.
_mm_broadcastss_ps⚠  x86_64 and avx2 Broadcasts the low single-precision (32-bit) floating-point element from `a` to all elements of the 128-bit returned value.
_mm_broadcastw_epi16⚠  x86_64 and avx2 Broadcasts the low packed 16-bit integer from `a` to all elements of the 128-bit returned value.
_mm_bslli_si128⚠  x86_64 and sse2 Shifts `a` left by `imm8` bytes while shifting in zeros.
_mm_bsrli_si128⚠  x86_64 and sse2 Shifts `a` right by `imm8` bytes while shifting in zeros.
_mm_castpd_ps⚠  x86_64 and sse2 Casts a 128-bit floating-point vector of `[2 x double]` into a 128-bit floating-point vector of `[4 x float]`.
_mm_castpd_si128⚠  x86_64 and sse2 Casts a 128-bit floating-point vector of `[2 x double]` into a 128-bit integer vector.
_mm_castps_pd⚠  x86_64 and sse2 Casts a 128-bit floating-point vector of `[4 x float]` into a 128-bit floating-point vector of `[2 x double]`.
_mm_castps_si128⚠  x86_64 and sse2 Casts a 128-bit floating-point vector of `[4 x float]` into a 128-bit integer vector.
_mm_castsi128_pd⚠  x86_64 and sse2 Casts a 128-bit integer vector into a 128-bit floating-point vector of `[2 x double]`.
_mm_castsi128_ps⚠  x86_64 and sse2 Casts a 128-bit integer vector into a 128-bit floating-point vector of `[4 x float]`.
_mm_ceil_pd⚠  x86_64 and sse4.1 Rounds the packed double-precision (64-bit) floating-point elements in `a` up to an integer value.
_mm_ceil_ps⚠  x86_64 and sse4.1 Rounds the packed single-precision (32-bit) floating-point elements in `a` up to an integer value.
_mm_ceil_sd⚠  x86_64 and sse4.1 Rounds the lower double-precision (64-bit) floating-point element in `b` up to an integer value, and copies the upper element from `a` to the result.
_mm_ceil_ss⚠  x86_64 and sse4.1 Rounds the lower single-precision (32-bit) floating-point element in `b` up to an integer value, and copies the upper elements from `a` to the result.
_mm_clflush⚠  x86_64 and sse2 Invalidates and flushes the cache line that contains `p` from all levels of the cache hierarchy.
_mm_clmulepi64_si128⚠  x86_64 and pclmulqdq Performs a carry-less multiplication of two 64-bit polynomials over the finite field GF(2^k).
_mm_cmp_pd⚠  x86_64 and avx,sse2 Compares packed double-precision (64-bit) floating-point elements in `a` and `b` based on the comparison operand specified by `imm8`.
_mm_cmp_ps⚠  x86_64 and avx,sse Compares packed single-precision (32-bit) floating-point elements in `a` and `b` based on the comparison operand specified by `imm8`.
_mm_cmp_sd⚠  x86_64 and avx,sse2 Compares the lower double-precision (64-bit) floating-point element in `a` and `b` based on the comparison operand specified by `imm8`.
_mm_cmp_ss⚠  x86_64 and avx,sse Compares the lower single-precision (32-bit) floating-point element in `a` and `b` based on the comparison operand specified by `imm8`.
_mm_cmpeq_epi8⚠  x86_64 and sse2 Compares packed 8-bit integers in `a` and `b` for equality.
_mm_cmpeq_epi16⚠  x86_64 and sse2 Compares packed 16-bit integers in `a` and `b` for equality.
_mm_cmpeq_epi32⚠  x86_64 and sse2 Compares packed 32-bit integers in `a` and `b` for equality.
_mm_cmpeq_epi64⚠  x86_64 and sse4.1 Compares packed 64-bit integers in `a` and `b` for equality.
_mm_cmpeq_pd⚠  x86_64 and sse2 Compares corresponding elements in `a` and `b` for equality.
_mm_cmpeq_ps⚠  x86_64 and sse Compares each of the four floats in `a` to the corresponding element in `b` for equality.
_mm_cmpeq_sd⚠  x86_64 and sse2 Returns a new vector with the low element of `a` replaced by the equality comparison of the lower elements of `a` and `b`.
_mm_cmpeq_ss⚠  x86_64 and sse Compares the lowest `f32` of both inputs for equality.
_mm_cmpestra⚠  x86_64 and sse4.2 Compares packed strings in `a` and `b` with lengths `la` and `lb` using the control in `imm8`, and returns 1 if `b` did not contain a null character and the resulting mask was zero.
_mm_cmpestrc⚠  x86_64 and sse4.2 Compares packed strings in `a` and `b` with lengths `la` and `lb` using the control in `imm8`, and returns 1 if the resulting mask was non-zero.
_mm_cmpestri⚠  x86_64 and sse4.2 Compares packed strings `a` and `b` with lengths `la` and `lb` using the control in `imm8`, and returns the generated index.
_mm_cmpestrm⚠  x86_64 and sse4.2 Compares packed strings in `a` and `b` with lengths `la` and `lb` using the control in `imm8`, and returns the generated mask.
_mm_cmpestro⚠  x86_64 and sse4.2 Compares packed strings in `a` and `b` with lengths `la` and `lb` using the control in `imm8`, and returns bit 0 of the resulting bit mask.
_mm_cmpestrs⚠  x86_64 and sse4.2 Compares packed strings in `a` and `b` with lengths `la` and `lb` using the control in `imm8`, and returns 1 if any character in `a` was null.
_mm_cmpestrz⚠  x86_64 and sse4.2 Compares packed strings in `a` and `b` with lengths `la` and `lb` using the control in `imm8`, and returns 1 if any character in `b` was null.
_mm_cmpge_pd⚠  x86_64 and sse2 Compares corresponding elements in `a` and `b` for greater-than-or-equal.
_mm_cmpge_ps⚠  x86_64 and sse Compares each of the four floats in `a` to the corresponding element in `b` for greater-than-or-equal.
_mm_cmpge_sd⚠  x86_64 and sse2 Returns a new vector with the low element of `a` replaced by the greater-than-or-equal comparison of the lower elements of `a` and `b`.
_mm_cmpge_ss⚠  x86_64 and sse Compares the lowest `f32` of both inputs for greater-than-or-equal.
_mm_cmpgt_epi8⚠  x86_64 and sse2 Compares packed 8-bit integers in `a` and `b` for greater-than.
_mm_cmpgt_epi16⚠  x86_64 and sse2 Compares packed 16-bit integers in `a` and `b` for greater-than.
_mm_cmpgt_epi32⚠  x86_64 and sse2 Compares packed 32-bit integers in `a` and `b` for greater-than.
_mm_cmpgt_epi64⚠  x86_64 and sse4.2 Compares packed 64-bit integers in `a` and `b` for greater-than.
_mm_cmpgt_pd⚠  x86_64 and sse2 Compares corresponding elements in `a` and `b` for greater-than.
_mm_cmpgt_ps⚠  x86_64 and sse Compares each of the four floats in `a` to the corresponding element in `b` for greater-than.
_mm_cmpgt_sd⚠  x86_64 and sse2 Returns a new vector with the low element of `a` replaced by the greater-than comparison of the lower elements of `a` and `b`.
_mm_cmpgt_ss⚠  x86_64 and sse Compares the lowest `f32` of both inputs for greater-than.
_mm_cmpistra⚠  x86_64 and sse4.2 Compares packed strings with implicit lengths in `a` and `b` using the control in `imm8`, and returns 1 if `b` did not contain a null character and the resulting mask was zero.
_mm_cmpistrc⚠  x86_64 and sse4.2 Compares packed strings with implicit lengths in `a` and `b` using the control in `imm8`, and returns 1 if the resulting mask was non-zero.
_mm_cmpistri⚠  x86_64 and sse4.2 Compares packed strings with implicit lengths in `a` and `b` using the control in `imm8`, and returns the generated index.
_mm_cmpistrm⚠  x86_64 and sse4.2 Compares packed strings with implicit lengths in `a` and `b` using the control in `imm8`, and returns the generated mask.
_mm_cmpistro⚠  x86_64 and sse4.2 Compares packed strings with implicit lengths in `a` and `b` using the control in `imm8`, and returns bit 0 of the resulting bit mask.
_mm_cmpistrs⚠  x86_64 and sse4.2 Compares packed strings with implicit lengths in `a` and `b` using the control in `imm8`, and returns 1 if any character in `a` was null.
_mm_cmpistrz⚠  x86_64 and sse4.2 Compares packed strings with implicit lengths in `a` and `b` using the control in `imm8`, and returns 1 if any character in `b` was null.
_mm_cmple_pd⚠  x86_64 and sse2 Compares corresponding elements in `a` and `b` for less-than-or-equal.
_mm_cmple_ps⚠  x86_64 and sse Compares each of the four floats in `a` to the corresponding element in `b` for less-than-or-equal.
_mm_cmple_sd⚠  x86_64 and sse2 Returns a new vector with the low element of `a` replaced by the less-than-or-equal comparison of the lower elements of `a` and `b`.
_mm_cmple_ss⚠  x86_64 and sse Compares the lowest `f32` of both inputs for less-than-or-equal.
_mm_cmplt_epi8⚠  x86_64 and sse2 Compares packed 8-bit integers in `a` and `b` for less-than.
_mm_cmplt_epi16⚠  x86_64 and sse2 Compares packed 16-bit integers in `a` and `b` for less-than.
_mm_cmplt_epi32⚠  x86_64 and sse2 Compares packed 32-bit integers in `a` and `b` for less-than.
_mm_cmplt_pd⚠  x86_64 and sse2 Compares corresponding elements in `a` and `b` for less-than.
_mm_cmplt_ps⚠  x86_64 and sse Compares each of the four floats in `a` to the corresponding element in `b` for less-than.
_mm_cmplt_sd⚠  x86_64 and sse2 Returns a new vector with the low element of `a` replaced by the less-than comparison of the lower elements of `a` and `b`.
_mm_cmplt_ss⚠  x86_64 and sse Compares the lowest `f32` of both inputs for less-than.
_mm_cmpneq_pd⚠  x86_64 and sse2 Compares corresponding elements in `a` and `b` for not-equal.
_mm_cmpneq_ps⚠  x86_64 and sse Compares each of the four floats in `a` to the corresponding element in `b` for not-equal.
_mm_cmpneq_sd⚠  x86_64 and sse2 Returns a new vector with the low element of `a` replaced by the not-equal comparison of the lower elements of `a` and `b`.
_mm_cmpneq_ss⚠  x86_64 and sse Compares the lowest `f32` of both inputs for not-equal.
_mm_cmpnge_pd⚠  x86_64 and sse2 Compares corresponding elements in `a` and `b` for not-greater-than-or-equal.
_mm_cmpnge_ps⚠  x86_64 and sse Compares each of the four floats in `a` to the corresponding element in `b` for not-greater-than-or-equal.
_mm_cmpnge_sd⚠  x86_64 and sse2 Returns a new vector with the low element of `a` replaced by the not-greater-than-or-equal comparison of the lower elements of `a` and `b`.
_mm_cmpnge_ss⚠  x86_64 and sse Compares the lowest `f32` of both inputs for not-greater-than-or-equal.
_mm_cmpngt_pd⚠  x86_64 and sse2 Compares corresponding elements in `a` and `b` for not-greater-than.
_mm_cmpngt_ps⚠  x86_64 and sse Compares each of the four floats in `a` to the corresponding element in `b` for not-greater-than.
_mm_cmpngt_sd⚠  x86_64 and sse2 Returns a new vector with the low element of `a` replaced by the not-greater-than comparison of the lower elements of `a` and `b`.
_mm_cmpngt_ss⚠  x86_64 and sse Compares the lowest `f32` of both inputs for not-greater-than.
_mm_cmpnle_pd⚠  x86_64 and sse2 Compares corresponding elements in `a` and `b` for not-less-than-or-equal.
_mm_cmpnle_ps⚠  x86_64 and sse Compares each of the four floats in `a` to the corresponding element in `b` for not-less-than-or-equal.
_mm_cmpnle_sd⚠  x86_64 and sse2 Returns a new vector with the low element of `a` replaced by the not-less-than-or-equal comparison of the lower elements of `a` and `b`.
_mm_cmpnle_ss⚠  x86_64 and sse Compares the lowest `f32` of both inputs for not-less-than-or-equal.
_mm_cmpnlt_pd⚠  x86_64 and sse2 Compares corresponding elements in `a` and `b` for not-less-than.
_mm_cmpnlt_ps⚠  x86_64 and sse Compares each of the four floats in `a` to the corresponding element in `b` for not-less-than.
_mm_cmpnlt_sd⚠  x86_64 and sse2 Returns a new vector with the low element of `a` replaced by the not-less-than comparison of the lower elements of `a` and `b`.
_mm_cmpnlt_ss⚠  x86_64 and sse Compares the lowest `f32` of both inputs for not-less-than.
_mm_cmpord_pd^{⚠}  x8664 and sse2 Compares corresponding elements in 
_mm_cmpord_ps^{⚠}  x8664 and sse Compares each of the four floats in 
_mm_cmpord_sd^{⚠}  x8664 and sse2 Returns a new vector with the low element of 
_mm_cmpord_ss^{⚠}  x8664 and sse Checks if the lowest 
_mm_cmpunord_pd^{⚠}  x8664 and sse2 Compares corresponding elements in 
_mm_cmpunord_ps^{⚠}  x8664 and sse Compares each of the four floats in 
_mm_cmpunord_sd^{⚠}  x8664 and sse2 Returns a new vector with the low element of 
_mm_cmpunord_ss^{⚠}  x8664 and sse Checks if the lowest 
_mm_comieq_sd^{⚠}  x8664 and sse2 Compares the lower element of 
_mm_comieq_ss^{⚠}  x8664 and sse Compares two 32bit floats from the loworder bits of 
_mm_comige_sd^{⚠}  x8664 and sse2 Compares the lower element of 
_mm_comige_ss^{⚠}  x8664 and sse Compares two 32bit floats from the loworder bits of 
_mm_comigt_sd^{⚠}  x8664 and sse2 Compares the lower element of 
_mm_comigt_ss^{⚠}  x8664 and sse Compares two 32bit floats from the loworder bits of 
_mm_comile_sd^{⚠}  x8664 and sse2 Compares the lower element of 
_mm_comile_ss^{⚠}  x8664 and sse Compares two 32bit floats from the loworder bits of 
_mm_comilt_sd^{⚠}  x8664 and sse2 Compares the lower element of 
_mm_comilt_ss^{⚠}  x8664 and sse Compares two 32bit floats from the loworder bits of 
_mm_comineq_sd^{⚠}  x8664 and sse2 Compares the lower element of 
_mm_comineq_ss^{⚠}  x8664 and sse Compares two 32bit floats from the loworder bits of 
_mm_crc32_u8^{⚠}  x8664 and sse4.2 Starting with the initial value in 
_mm_crc32_u16^{⚠}  x8664 and sse4.2 Starting with the initial value in 
_mm_crc32_u32^{⚠}  x8664 and sse4.2 Starting with the initial value in 
_mm_crc32_u64^{⚠}  x8664 and sse4.2 Starting with the initial value in 
_mm_cvt_si2ss⚠  x86-64 and sse Alias for _mm_cvtsi32_ss.
_mm_cvt_ss2si⚠  x86-64 and sse Alias for _mm_cvtss_si32.
_mm_cvtepi8_epi16⚠  x86-64 and sse4.1 Sign-extends packed 8-bit integers in
_mm_cvtepi8_epi32⚠  x86-64 and sse4.1 Sign-extends packed 8-bit integers in
_mm_cvtepi8_epi64⚠  x86-64 and sse4.1 Sign-extends packed 8-bit integers in the low 8 bytes of
_mm_cvtepi16_epi32⚠  x86-64 and sse4.1 Sign-extends packed 16-bit integers in
_mm_cvtepi16_epi64⚠  x86-64 and sse4.1 Sign-extends packed 16-bit integers in
_mm_cvtepi32_epi64⚠  x86-64 and sse4.1 Sign-extends packed 32-bit integers in
_mm_cvtepi32_pd⚠  x86-64 and sse2 Converts the lower two packed 32-bit integers in
_mm_cvtepi32_ps⚠  x86-64 and sse2 Converts packed 32-bit integers in
_mm_cvtepu8_epi16⚠  x86-64 and sse4.1 Zero-extends packed unsigned 8-bit integers in
_mm_cvtepu8_epi32⚠  x86-64 and sse4.1 Zero-extends packed unsigned 8-bit integers in
_mm_cvtepu8_epi64⚠  x86-64 and sse4.1 Zero-extends packed unsigned 8-bit integers in
_mm_cvtepu16_epi32⚠  x86-64 and sse4.1 Zero-extends packed unsigned 16-bit integers in
_mm_cvtepu16_epi64⚠  x86-64 and sse4.1 Zero-extends packed unsigned 16-bit integers in
_mm_cvtepu32_epi64⚠  x86-64 and sse4.1 Zero-extends packed unsigned 32-bit integers in
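The sign/zero-extension intrinsics widen the low lanes of their input into wider lanes. A minimal sketch, assuming an x86-64 target; `widen_low_u8_to_u16` is a hypothetical helper and sse4.1 is checked at runtime:

```rust
// Sketch: zero-extend the low 8 bytes of a vector into eight u16 lanes
// with _mm_cvtepu8_epi16 (sse4.1).
#[cfg(target_arch = "x86_64")]
fn widen_low_u8_to_u16(bytes: [u8; 16]) -> Option<[u16; 8]> {
    if !is_x86_feature_detected!("sse4.1") {
        return None;
    }
    use std::arch::x86_64::{__m128i, _mm_cvtepu8_epi16, _mm_loadu_si128, _mm_storeu_si128};
    let mut out = [0u16; 8];
    unsafe {
        let v = _mm_cvtepu8_epi16(_mm_loadu_si128(bytes.as_ptr() as *const __m128i));
        _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, v);
    }
    Some(out)
}
```

The `_mm_cvtepi*` forms are identical except that they replicate the sign bit instead of filling with zeros.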
_mm_cvtpd_epi32⚠  x86-64 and sse2 Converts packed double-precision (64-bit) floating-point elements in
_mm_cvtpd_ps⚠  x86-64 and sse2 Converts packed double-precision (64-bit) floating-point elements in
_mm_cvtps_epi32⚠  x86-64 and sse2 Converts packed single-precision (32-bit) floating-point elements in
_mm_cvtps_pd⚠  x86-64 and sse2 Converts packed single-precision (32-bit) floating-point elements in
_mm_cvtsd_f64⚠  x86-64 and sse2 Returns the lower double-precision (64-bit) floating-point element of
_mm_cvtsd_si32⚠  x86-64 and sse2 Converts the lower double-precision (64-bit) floating-point element in a to a 32-bit integer.
_mm_cvtsd_si64⚠  x86-64 and sse2 Converts the lower double-precision (64-bit) floating-point element in a to a 64-bit integer.
_mm_cvtsd_si64x⚠  x86-64 and sse2 Alias for _mm_cvtsd_si64.
_mm_cvtsd_ss⚠  x86-64 and sse2 Converts the lower double-precision (64-bit) floating-point element in
_mm_cvtsi32_sd⚠  x86-64 and sse2 Returns
_mm_cvtsi32_si128⚠  x86-64 and sse2 Returns a vector whose lowest element is
_mm_cvtsi32_ss⚠  x86-64 and sse Converts a 32-bit integer to a 32-bit float. The result vector is the input vector
_mm_cvtsi64_sd⚠  x86-64 and sse2 Returns
_mm_cvtsi64_si128⚠  x86-64 and sse2 Returns a vector whose lowest element is
_mm_cvtsi64_ss⚠  x86-64 and sse Converts a 64-bit integer to a 32-bit float. The result vector is the input vector
_mm_cvtsi64x_sd⚠  x86-64 and sse2 Returns
_mm_cvtsi64x_si128⚠  x86-64 and sse2 Returns a vector whose lowest element is
_mm_cvtsi128_si32⚠  x86-64 and sse2 Returns the lowest element of
_mm_cvtsi128_si64⚠  x86-64 and sse2 Returns the lowest element of
_mm_cvtsi128_si64x⚠  x86-64 and sse2 Returns the lowest element of
_mm_cvtss_f32⚠  x86-64 and sse Extracts the lowest 32-bit float from the input vector.
_mm_cvtss_sd⚠  x86-64 and sse2 Converts the lower single-precision (32-bit) floating-point element in
_mm_cvtss_si32⚠  x86-64 and sse Converts the lowest 32-bit float in the input vector to a 32-bit integer.
_mm_cvtss_si64⚠  x86-64 and sse Converts the lowest 32-bit float in the input vector to a 64-bit integer.
_mm_cvtt_ss2si⚠  x86-64 and sse Alias for _mm_cvttss_si32.
_mm_cvttpd_epi32⚠  x86-64 and sse2 Converts packed double-precision (64-bit) floating-point elements in
_mm_cvttps_epi32⚠  x86-64 and sse2 Converts packed single-precision (32-bit) floating-point elements in
_mm_cvttsd_si32⚠  x86-64 and sse2 Converts the lower double-precision (64-bit) floating-point element in
_mm_cvttsd_si64⚠  x86-64 and sse2 Converts the lower double-precision (64-bit) floating-point element in
_mm_cvttsd_si64x⚠  x86-64 and sse2 Alias for _mm_cvttsd_si64.
_mm_cvttss_si32⚠  x86-64 and sse Converts the lowest 32-bit float in the input vector to a 32-bit integer with truncation.
_mm_cvttss_si64⚠  x86-64 and sse Converts the lowest 32-bit float in the input vector to a 64-bit integer with truncation.
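The `_mm_cvtt*` family differs from `_mm_cvt*` only in rounding: the extra `t` means truncation toward zero, while the non-`t` forms round under the current MXCSR rounding mode (round-to-nearest-even by default). A minimal sketch, assuming an x86-64 target, where SSE is guaranteed so no runtime check is needed:

```rust
// Sketch: truncating float-to-int conversion with _mm_cvttss_si32.
// `truncate_f32` is our own wrapper name, not part of the API.
#[cfg(target_arch = "x86_64")]
fn truncate_f32(x: f32) -> i32 {
    use std::arch::x86_64::{_mm_cvttss_si32, _mm_set_ss};
    // _mm_set_ss places x in the lowest lane and zeroes the rest.
    unsafe { _mm_cvttss_si32(_mm_set_ss(x)) }
}
```

Swapping in `_mm_cvtss_si32` would give 4 for an input of 3.9 instead of 3.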
_mm_div_pd⚠  x86-64 and sse2 Divides packed double-precision (64-bit) floating-point elements in
_mm_div_ps⚠  x86-64 and sse Divides __m128 vectors.
_mm_div_sd⚠  x86-64 and sse2 Returns a new vector with the low element of
_mm_div_ss⚠  x86-64 and sse Divides the first component of
_mm_dp_pd⚠  x86-64 and sse4.1 Returns the dot product of two __m128d vectors.
_mm_dp_ps⚠  x86-64 and sse4.1 Returns the dot product of two __m128 vectors.
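A minimal sketch of lane-wise division with `_mm_div_ps`, assuming an x86-64 target (SSE is baseline). Note the argument order of `_mm_set_ps`: highest lane first, so the lowest lane below holds 2.0:

```rust
// Sketch: divide four f32 lanes at once and read back the lowest lane.
// `divide_lowest` is our own helper name.
#[cfg(target_arch = "x86_64")]
fn divide_lowest() -> f32 {
    use std::arch::x86_64::{_mm_cvtss_f32, _mm_div_ps, _mm_set1_ps, _mm_set_ps};
    unsafe {
        let num = _mm_set_ps(8.0, 6.0, 4.0, 2.0); // lanes: [2, 4, 6, 8] low-to-high
        let q = _mm_div_ps(num, _mm_set1_ps(2.0));
        _mm_cvtss_f32(q) // lowest lane: 2.0 / 2.0
    }
}
```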
_mm_extract_epi8⚠  x86-64 and sse4.1 Extracts an 8-bit integer from
_mm_extract_epi16⚠  x86-64 and sse2 Returns the
_mm_extract_epi32⚠  x86-64 and sse4.1 Extracts a 32-bit integer from
_mm_extract_epi64⚠  x86-64 and sse4.1 Extracts a 64-bit integer from
_mm_extract_ps⚠  x86-64 and sse4.1 Extracts a single-precision (32-bit) floating-point element from
_mm_extract_si64⚠  x86-64 and sse4a Extracts the bit range specified by
_mm_floor_pd⚠  x86-64 and sse4.1 Rounds the packed double-precision (64-bit) floating-point elements in
_mm_floor_ps⚠  x86-64 and sse4.1 Rounds the packed single-precision (32-bit) floating-point elements in
_mm_floor_sd⚠  x86-64 and sse4.1 Rounds the lower double-precision (64-bit) floating-point element in
_mm_floor_ss⚠  x86-64 and sse4.1 Rounds the lower single-precision (32-bit) floating-point element in
_mm_fmadd_pd⚠  x86-64 and fma Multiplies packed double-precision (64-bit) floating-point elements in
_mm_fmadd_ps⚠  x86-64 and fma Multiplies packed single-precision (32-bit) floating-point elements in
_mm_fmadd_sd⚠  x86-64 and fma Multiplies the lower double-precision (64-bit) floating-point elements in
_mm_fmadd_ss⚠  x86-64 and fma Multiplies the lower single-precision (32-bit) floating-point elements in
_mm_fmaddsub_pd⚠  x86-64 and fma Multiplies packed double-precision (64-bit) floating-point elements in
_mm_fmaddsub_ps⚠  x86-64 and fma Multiplies packed single-precision (32-bit) floating-point elements in
_mm_fmsub_pd⚠  x86-64 and fma Multiplies packed double-precision (64-bit) floating-point elements in
_mm_fmsub_ps⚠  x86-64 and fma Multiplies packed single-precision (32-bit) floating-point elements in
_mm_fmsub_sd⚠  x86-64 and fma Multiplies the lower double-precision (64-bit) floating-point elements in
_mm_fmsub_ss⚠  x86-64 and fma Multiplies the lower single-precision (32-bit) floating-point elements in
_mm_fmsubadd_pd⚠  x86-64 and fma Multiplies packed double-precision (64-bit) floating-point elements in
_mm_fmsubadd_ps⚠  x86-64 and fma Multiplies packed single-precision (32-bit) floating-point elements in
_mm_fnmadd_pd⚠  x86-64 and fma Multiplies packed double-precision (64-bit) floating-point elements in
_mm_fnmadd_ps⚠  x86-64 and fma Multiplies packed single-precision (32-bit) floating-point elements in
_mm_fnmadd_sd⚠  x86-64 and fma Multiplies the lower double-precision (64-bit) floating-point elements in
_mm_fnmadd_ss⚠  x86-64 and fma Multiplies the lower single-precision (32-bit) floating-point elements in
_mm_fnmsub_pd⚠  x86-64 and fma Multiplies packed double-precision (64-bit) floating-point elements in
_mm_fnmsub_ps⚠  x86-64 and fma Multiplies packed single-precision (32-bit) floating-point elements in
_mm_fnmsub_sd⚠  x86-64 and fma Multiplies the lower double-precision (64-bit) floating-point elements in
_mm_fnmsub_ss⚠  x86-64 and fma Multiplies the lower single-precision (32-bit) floating-point elements in
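The fused multiply-add intrinsics compute `a * b + c` (and sign/ordering variants) with a single rounding step, unlike a separate multiply followed by an add. A minimal sketch, assuming an x86-64 target; fma is not part of the baseline, so it is detected at runtime and the helper name `fma_lowest` is ours:

```rust
// Sketch: fused multiply-add on the lowest lane with _mm_fmadd_ss.
#[cfg(target_arch = "x86_64")]
fn fma_lowest(a: f32, b: f32, c: f32) -> Option<f32> {
    if !is_x86_feature_detected!("fma") {
        return None;
    }
    use std::arch::x86_64::{_mm_cvtss_f32, _mm_fmadd_ss, _mm_set_ss};
    unsafe {
        // Lowest lane is a * b + c; upper lanes are copied from the first operand.
        Some(_mm_cvtss_f32(_mm_fmadd_ss(_mm_set_ss(a), _mm_set_ss(b), _mm_set_ss(c))))
    }
}
```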
_mm_getcsr⚠  x86-64 and sse Gets the unsigned 32-bit value of the MXCSR control and status register.
_mm_hadd_epi16⚠  x86-64 and ssse3 Horizontally adds the adjacent pairs of values contained in 2 packed 128-bit vectors of
_mm_hadd_epi32⚠  x86-64 and ssse3 Horizontally adds the adjacent pairs of values contained in 2 packed 128-bit vectors of
_mm_hadd_pd⚠  x86-64 and sse3 Horizontally adds adjacent pairs of double-precision (64-bit) floating-point elements in
_mm_hadd_ps⚠  x86-64 and sse3 Horizontally adds adjacent pairs of single-precision (32-bit) floating-point elements in
_mm_hadds_epi16⚠  x86-64 and ssse3 Horizontally adds the adjacent pairs of values contained in 2 packed 128-bit vectors of
_mm_hsub_epi16⚠  x86-64 and ssse3 Horizontally subtracts the adjacent pairs of values contained in 2 packed 128-bit vectors of
_mm_hsub_epi32⚠  x86-64 and ssse3 Horizontally subtracts the adjacent pairs of values contained in 2 packed 128-bit vectors of
_mm_hsub_pd⚠  x86-64 and sse3 Horizontally subtracts adjacent pairs of double-precision (64-bit) floating-point elements in
_mm_hsub_ps⚠  x86-64 and sse3 Horizontally subtracts adjacent pairs of single-precision (32-bit) floating-point elements in
_mm_hsubs_epi16⚠  x86-64 and ssse3 Horizontally subtracts the adjacent pairs of values contained in 2 packed 128-bit vectors of
_mm_i32gather_epi32⚠  x86-64 and avx2 Returns values from
_mm_i32gather_epi64⚠  x86-64 and avx2 Returns values from
_mm_i32gather_pd⚠  x86-64 and avx2 Returns values from
_mm_i32gather_ps⚠  x86-64 and avx2 Returns values from
_mm_i64gather_epi32⚠  x86-64 and avx2 Returns values from
_mm_i64gather_epi64⚠  x86-64 and avx2 Returns values from
_mm_i64gather_pd⚠  x86-64 and avx2 Returns values from
_mm_i64gather_ps⚠  x86-64 and avx2 Returns values from
_mm_insert_epi8⚠  x86-64 and sse4.1 Returns a copy of
_mm_insert_epi16⚠  x86-64 and sse2 Returns a new vector where the
_mm_insert_epi32⚠  x86-64 and sse4.1 Returns a copy of
_mm_insert_epi64⚠  x86-64 and sse4.1 Returns a copy of
_mm_insert_ps⚠  x86-64 and sse4.1 Selects a single value in
_mm_insert_si64⚠  x86-64 and sse4a Inserts the
_mm_lddqu_si128⚠  x86-64 and sse3 Loads 128-bits of integer data from unaligned memory. This intrinsic may perform better than
_mm_lfence⚠  x86-64 and sse2 Performs a serializing operation on all load-from-memory instructions that were issued prior to this instruction.
_mm_load1_pd⚠  x86-64 and sse2 Loads a double-precision (64-bit) floating-point element from memory into both elements of the returned vector.
_mm_load1_ps⚠  x86-64 and sse Constructs a
_mm_load_pd⚠  x86-64 and sse2 Loads 128-bits (composed of 2 packed double-precision (64-bit) floating-point elements) from memory into the returned vector.
_mm_load_pd1⚠  x86-64 and sse2 Loads a double-precision (64-bit) floating-point element from memory into both elements of the returned vector.
_mm_load_ps⚠  x86-64 and sse Loads four
_mm_load_ps1⚠  x86-64 and sse Alias for _mm_load1_ps.
_mm_load_sd⚠  x86-64 and sse2 Loads a 64-bit double-precision value to the low element of a 128-bit integer vector and clears the upper element.
_mm_load_si128⚠  x86-64 and sse2 Loads 128-bits of integer data from memory into a new vector.
_mm_load_ss⚠  x86-64 and sse Constructs a
_mm_loaddup_pd⚠  x86-64 and sse3 Loads a double-precision (64-bit) floating-point element from memory into both elements of the returned vector.
_mm_loadh_pd⚠  x86-64 and sse2 Loads a double-precision value into the high-order bits of a 128-bit vector of
_mm_loadl_epi64⚠  x86-64 and sse2 Loads a 64-bit integer from memory into the first element of the returned vector.
_mm_loadl_pd⚠  x86-64 and sse2 Loads a double-precision value into the low-order bits of a 128-bit vector of
_mm_loadr_pd⚠  x86-64 and sse2 Loads 2 double-precision (64-bit) floating-point elements from memory into the returned vector in reverse order.
_mm_loadr_ps⚠  x86-64 and sse Loads four
_mm_loadu_pd⚠  x86-64 and sse2 Loads 128-bits (composed of 2 packed double-precision (64-bit) floating-point elements) from memory into the returned vector.
_mm_loadu_ps⚠  x86-64 and sse Loads four
_mm_loadu_si64⚠  x86-64 and sse Loads unaligned 64-bits of integer data from memory into a new vector.
_mm_loadu_si128⚠  x86-64 and sse2 Loads 128-bits of integer data from memory into a new vector.
_mm_madd_epi16⚠  x86-64 and sse2 Multiplies and then horizontally adds signed 16-bit integers in
_mm_maddubs_epi16⚠  x86-64 and ssse3 Multiplies corresponding pairs of packed 8-bit unsigned integer values contained in the first source operand and packed 8-bit signed integer values contained in the second source operand, adds pairs of contiguous products with signed saturation, and writes the 16-bit sums to the corresponding bits in the destination.
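`_mm_madd_epi16` is the classic multiply-accumulate building block: it multiplies corresponding 16-bit lanes and then adds each adjacent pair of 32-bit products, halving the lane count. A minimal sketch, assuming an x86-64 target (SSE2 is baseline); `madd_lowest` is our own helper name:

```rust
// Sketch: with both inputs broadcast, every pair of 2*3 products
// collapses into one 32-bit lane of 12.
#[cfg(target_arch = "x86_64")]
fn madd_lowest() -> i32 {
    use std::arch::x86_64::{_mm_cvtsi128_si32, _mm_madd_epi16, _mm_set1_epi16};
    unsafe {
        let prod = _mm_madd_epi16(_mm_set1_epi16(2), _mm_set1_epi16(3));
        _mm_cvtsi128_si32(prod) // lowest 32-bit lane: 2*3 + 2*3
    }
}
```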
_mm_mask_i32gather_epi32⚠  x86-64 and avx2 Returns values from
_mm_mask_i32gather_epi64⚠  x86-64 and avx2 Returns values from
_mm_mask_i32gather_pd⚠  x86-64 and avx2 Returns values from
_mm_mask_i32gather_ps⚠  x86-64 and avx2 Returns values from
_mm_mask_i64gather_epi32⚠  x86-64 and avx2 Returns values from
_mm_mask_i64gather_epi64⚠  x86-64 and avx2 Returns values from
_mm_mask_i64gather_pd⚠  x86-64 and avx2 Returns values from
_mm_mask_i64gather_ps⚠  x86-64 and avx2 Returns values from
_mm_maskload_epi32⚠  x86-64 and avx2 Loads packed 32-bit integers from memory pointed to by
_mm_maskload_epi64⚠  x86-64 and avx2 Loads packed 64-bit integers from memory pointed to by
_mm_maskload_pd⚠  x86-64 and avx Loads packed double-precision (64-bit) floating-point elements from memory into the result using
_mm_maskload_ps⚠  x86-64 and avx Loads packed single-precision (32-bit) floating-point elements from memory into the result using
_mm_maskmoveu_si128⚠  x86-64 and sse2 Conditionally stores 8-bit integer elements from
_mm_maskstore_epi32⚠  x86-64 and avx2 Stores packed 32-bit integers from
_mm_maskstore_epi64⚠  x86-64 and avx2 Stores packed 64-bit integers from
_mm_maskstore_pd⚠  x86-64 and avx Stores packed double-precision (64-bit) floating-point elements from
_mm_maskstore_ps⚠  x86-64 and avx Stores packed single-precision (32-bit) floating-point elements from
_mm_max_epi8⚠  x86-64 and sse4.1 Compares packed 8-bit integers in
_mm_max_epi16⚠  x86-64 and sse2 Compares packed 16-bit integers in
_mm_max_epi32⚠  x86-64 and sse4.1 Compares packed 32-bit integers in
_mm_max_epu8⚠  x86-64 and sse2 Compares packed unsigned 8-bit integers in
_mm_max_epu16⚠  x86-64 and sse4.1 Compares packed unsigned 16-bit integers in
_mm_max_epu32⚠  x86-64 and sse4.1 Compares packed unsigned 32-bit integers in
_mm_max_pd⚠  x86-64 and sse2 Returns a new vector with the maximum values from corresponding elements in
_mm_max_ps⚠  x86-64 and sse Compares packed single-precision (32-bit) floating-point elements in
_mm_max_sd⚠  x86-64 and sse2 Returns a new vector with the low element of
_mm_max_ss⚠  x86-64 and sse Compares the first single-precision (32-bit) floating-point element of
_mm_mfence⚠  x86-64 and sse2 Performs a serializing operation on all load-from-memory and store-to-memory instructions that were issued prior to this instruction.
_mm_min_epi8⚠  x86-64 and sse4.1 Compares packed 8-bit integers in
_mm_min_epi16⚠  x86-64 and sse2 Compares packed 16-bit integers in
_mm_min_epi32⚠  x86-64 and sse4.1 Compares packed 32-bit integers in
_mm_min_epu8⚠  x86-64 and sse2 Compares packed unsigned 8-bit integers in
_mm_min_epu16⚠  x86-64 and sse4.1 Compares packed unsigned 16-bit integers in
_mm_min_epu32⚠  x86-64 and sse4.1 Compares packed unsigned 32-bit integers in
_mm_min_pd⚠  x86-64 and sse2 Returns a new vector with the minimum values from corresponding elements in
_mm_min_ps⚠  x86-64 and sse Compares packed single-precision (32-bit) floating-point elements in
_mm_min_sd⚠  x86-64 and sse2 Returns a new vector with the low element of
_mm_min_ss⚠  x86-64 and sse Compares the first single-precision (32-bit) floating-point element of
_mm_minpos_epu16⚠  x86-64 and sse4.1 Finds the minimum unsigned 16-bit element in the 128-bit __m128i vector, returning a vector containing its value in its first position and its index in its second position; all other elements are set to zero.
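The packed min/max intrinsics compare corresponding lanes and keep the larger (or smaller) of each pair. A minimal sketch, assuming an x86-64 target (SSE2 is baseline); `max_low_lanes` is our own helper name:

```rust
// Sketch: signed 16-bit lane-wise max with _mm_max_epi16. Reading the
// result back with _mm_cvtsi128_si32 yields the two lowest 16-bit
// lanes packed into one i32.
#[cfg(target_arch = "x86_64")]
fn max_low_lanes() -> i32 {
    use std::arch::x86_64::{_mm_cvtsi128_si32, _mm_max_epi16, _mm_set1_epi16};
    unsafe {
        let m = _mm_max_epi16(_mm_set1_epi16(5), _mm_set1_epi16(9));
        _mm_cvtsi128_si32(m) // two lanes of 9: 0x0009_0009
    }
}
```

Note the signed/unsigned split (`epi` vs `epu`): picking the wrong one silently misorders values with the high bit set.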
_mm_move_epi64⚠  x86-64 and sse2 Returns a vector where the low element is extracted from
_mm_move_sd⚠  x86-64 and sse2 Constructs a 128-bit floating-point vector of
_mm_move_ss⚠  x86-64 and sse Returns a
_mm_movedup_pd⚠  x86-64 and sse3 Duplicates the low double-precision (64-bit) floating-point element from
_mm_movehdup_ps⚠  x86-64 and sse3 Duplicates odd-indexed single-precision (32-bit) floating-point elements from
_mm_movehl_ps⚠  x86-64 and sse Combines the higher half of
_mm_moveldup_ps⚠  x86-64 and sse3 Duplicates even-indexed single-precision (32-bit) floating-point elements from
_mm_movelh_ps⚠  x86-64 and sse Combines the lower half of
_mm_movemask_epi8⚠  x86-64 and sse2 Returns a mask of the most significant bit of each element in
_mm_movemask_pd⚠  x86-64 and sse2 Returns a mask of the most significant bit of each element in
_mm_movemask_ps⚠  x86-64 and sse Returns a mask of the most significant bit of each element in
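The movemask intrinsics are the usual way to get a vector comparison result back into scalar control flow: each lane's sign bit becomes one bit of an integer mask. A minimal sketch, assuming an x86-64 target (SSE2 is baseline); the helper name is ours:

```rust
// Sketch: _mm_movemask_epi8 collects the top bit of each of the 16
// bytes into the low 16 bits of the result. With every byte set to -1
// (all bits on), every sign bit is set.
#[cfg(target_arch = "x86_64")]
fn all_bytes_negative_mask() -> i32 {
    use std::arch::x86_64::{_mm_movemask_epi8, _mm_set1_epi8};
    unsafe { _mm_movemask_epi8(_mm_set1_epi8(-1)) }
}
```

A typical use pairs this with `_mm_cmpeq_epi8` and `trailing_zeros()` to find the first matching byte in a 16-byte block.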
_mm_mpsadbw_epu8⚠  x86-64 and sse4.1 Subtracts 8-bit unsigned integer values and computes the absolute values of the differences. Sums of the absolute differences are then returned according to the bit fields in the immediate operand.
_mm_mul_epi32⚠  x86-64 and sse4.1 Multiplies the low 32-bit integers from each packed 64-bit element in
_mm_mul_epu32⚠  x86-64 and sse2 Multiplies the low unsigned 32-bit integers from each packed 64-bit element in
_mm_mul_pd⚠  x86-64 and sse2 Multiplies packed double-precision (64-bit) floating-point elements in
_mm_mul_ps⚠  x86-64 and sse Multiplies __m128 vectors.
_mm_mul_sd⚠  x86-64 and sse2 Returns a new vector with the low element of
_mm_mul_ss⚠  x86-64 and sse Multiplies the first component of
_mm_mulhi_epi16⚠  x86-64 and sse2 Multiplies the packed 16-bit integers in
_mm_mulhi_epu16⚠  x86-64 and sse2 Multiplies the packed unsigned 16-bit integers in
_mm_mulhrs_epi16⚠  x86-64 and ssse3 Multiplies packed 16-bit signed integer values, truncates the 32-bit product to the 18 most significant bits by right-shifting, rounds the truncated value by adding 1, and writes bits
_mm_mullo_epi16⚠  x86-64 and sse2 Multiplies the packed 16-bit integers in
_mm_mullo_epi32⚠  x86-64 and sse4.1 Multiplies the packed 32-bit integers in
_mm_or_pd⚠  x86-64 and sse2 Computes the bitwise OR of
_mm_or_ps⚠  x86-64 and sse Bitwise OR of packed single-precision (32-bit) floating-point elements.
_mm_or_si128⚠  x86-64 and sse2 Computes the bitwise OR of 128 bits (representing integer data) in
_mm_packs_epi16⚠  x86-64 and sse2 Converts packed 16-bit integers from
_mm_packs_epi32⚠  x86-64 and sse2 Converts packed 32-bit integers from
_mm_packus_epi16⚠  x86-64 and sse2 Converts packed 16-bit integers from
_mm_packus_epi32⚠  x86-64 and sse4.1 Converts packed 32-bit integers from
_mm_pause⚠  x86-64 Provides a hint to the processor that the code sequence is a spin-wait loop.
_mm_permute_pd⚠  x86-64 and avx,sse2 Shuffles double-precision (64-bit) floating-point elements in
_mm_permute_ps⚠  x86-64 and avx,sse Shuffles single-precision (32-bit) floating-point elements in
_mm_permutevar_pd⚠  x86-64 and avx Shuffles double-precision (64-bit) floating-point elements in
_mm_permutevar_ps⚠  x86-64 and avx Shuffles single-precision (32-bit) floating-point elements in
_mm_prefetch⚠  x86-64 and sse Fetches the cache line that contains address
_mm_rcp_ps⚠  x86-64 and sse Returns the approximate reciprocal of packed single-precision (32-bit) floating-point elements in
_mm_rcp_ss⚠  x86-64 and sse Returns the approximate reciprocal of the first single-precision (32-bit) floating-point element in
_mm_round_pd⚠  x86-64 and sse4.1 Rounds the packed double-precision (64-bit) floating-point elements in
_mm_round_ps⚠  x86-64 and sse4.1 Rounds the packed single-precision (32-bit) floating-point elements in
_mm_round_sd⚠  x86-64 and sse4.1 Rounds the lower double-precision (64-bit) floating-point element in
_mm_round_ss⚠  x86-64 and sse4.1 Rounds the lower single-precision (32-bit) floating-point element in
_mm_rsqrt_ps⚠  x86-64 and sse Returns the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in
_mm_rsqrt_ss⚠  x86-64 and sse Returns the approximate reciprocal square root of the first single-precision (32-bit) floating-point element in
_mm_sad_epu8⚠  x86-64 and sse2 Sums the absolute differences of packed unsigned 8-bit integers.
_mm_set1_epi8⚠  x86-64 and sse2 Broadcasts 8-bit integer
_mm_set1_epi16⚠  x86-64 and sse2 Broadcasts 16-bit integer
_mm_set1_epi32⚠  x86-64 and sse2 Broadcasts 32-bit integer
_mm_set1_epi64x⚠  x86-64 and sse2 Broadcasts 64-bit integer
_mm_set1_pd⚠  x86-64 and sse2 Broadcasts double-precision (64-bit) floating-point value a to all elements of the return value.
_mm_set1_ps⚠  x86-64 and sse Constructs a
_mm_set_epi8⚠  x86-64 and sse2 Sets packed 8-bit integers with the supplied values.
_mm_set_epi16⚠  x86-64 and sse2 Sets packed 16-bit integers with the supplied values.
_mm_set_epi32⚠  x86-64 and sse2 Sets packed 32-bit integers with the supplied values.
_mm_set_epi64x⚠  x86-64 and sse2 Sets packed 64-bit integers with the supplied values, from highest to lowest.
_mm_set_pd⚠  x86-64 and sse2 Sets packed double-precision (64-bit) floating-point elements in the return value with the supplied values.
_mm_set_pd1⚠  x86-64 and sse2 Broadcasts double-precision (64-bit) floating-point value a to all elements of the return value.
_mm_set_ps⚠  x86-64 and sse Constructs a
_mm_set_ps1⚠  x86-64 and sse Alias for _mm_set1_ps.
_mm_set_sd⚠  x86-64 and sse2 Copies double-precision (64-bit) floating-point element
_mm_set_ss⚠  x86-64 and sse Constructs a
_mm_setcsr⚠  x86-64 and sse Sets the MXCSR register with the 32-bit unsigned integer value.
_mm_setr_epi8⚠  x86-64 and sse2 Sets packed 8-bit integers with the supplied values in reverse order.
_mm_setr_epi16⚠  x86-64 and sse2 Sets packed 16-bit integers with the supplied values in reverse order.
_mm_setr_epi32⚠  x86-64 and sse2 Sets packed 32-bit integers with the supplied values in reverse order.
_mm_setr_pd⚠  x86-64 and sse2 Sets packed double-precision (64-bit) floating-point elements in the return value with the supplied values in reverse order.
_mm_setr_ps⚠  x86-64 and sse Constructs a
_mm_setzero_pd⚠  x86-64 and sse2 Returns packed double-precision (64-bit) floating-point elements with all zeros.
_mm_setzero_ps⚠  x86-64 and sse Constructs a
_mm_setzero_si128⚠  x86-64 and sse2 Returns a vector with all elements set to zero.
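The `set`/`setr` split above is worth internalizing: `_mm_set_*` takes its arguments from the highest lane down to the lowest, while `_mm_setr_*` takes them in memory order. A minimal sketch of a set/store round trip, assuming an x86-64 target (SSE2 is baseline); `set_then_store` is our own helper name:

```rust
// Sketch: build a vector with _mm_set_epi32 and write it out with
// _mm_storeu_si128, which has no alignment requirement (unlike
// _mm_store_si128). The stored array is reversed relative to the
// argument list because _mm_set_epi32 orders arguments high-to-low.
#[cfg(target_arch = "x86_64")]
fn set_then_store() -> [i32; 4] {
    use std::arch::x86_64::{__m128i, _mm_set_epi32, _mm_storeu_si128};
    let mut out = [0i32; 4];
    unsafe {
        let v = _mm_set_epi32(3, 2, 1, 0);
        _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, v);
    }
    out
}
```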
_mm_sfence⚠  x86-64 and sse Performs a serializing operation on all store-to-memory instructions that were issued prior to this instruction.
_mm_sha1msg1_epu32⚠  x86-64 and sha Performs an intermediate calculation for the next four SHA1 message values (unsigned 32-bit integers) using previous message values from
_mm_sha1msg2_epu32⚠  x86-64 and sha Performs the final calculation for the next four SHA1 message values (unsigned 32-bit integers) using the intermediate result in
_mm_sha1nexte_epu32⚠  x86-64 and sha Calculates SHA1 state variable E after four rounds of operation from the current SHA1 state variable
_mm_sha1rnds4_epu32⚠  x86-64 and sha Performs four rounds of SHA1 operation using an initial SHA1 state (A,B,C,D) from
_mm_sha256msg1_epu32⚠  x86-64 and sha Performs an intermediate calculation for the next four SHA256 message values (unsigned 32-bit integers) using previous message values from
_mm_sha256msg2_epu32⚠  x86-64 and sha Performs the final calculation for the next four SHA256 message values (unsigned 32-bit integers) using previous message values from
_mm_sha256rnds2_epu32⚠  x86-64 and sha Performs 2 rounds of SHA256 operation using an initial SHA256 state (C,D,G,H) from
_mm_shuffle_epi8⚠  x86-64 and ssse3 Shuffles bytes from
_mm_shuffle_epi32⚠  x86-64 and sse2 Shuffles 32-bit integers in
_mm_shuffle_pd⚠  x86-64 and sse2 Constructs a 128-bit floating-point vector of
_mm_shuffle_ps⚠  x86-64 and sse Shuffles packed single-precision (32-bit) floating-point elements in
_mm_shufflehi_epi16⚠  x86-64 and sse2 Shuffles 16-bit integers in the high 64 bits of
_mm_shufflelo_epi16⚠  x86-64 and sse2 Shuffles 16-bit integers in the low 64 bits of
_mm_sign_epi8⚠  x86-64 and ssse3 Negates packed 8-bit integers in
_mm_sign_epi16⚠  x86-64 and ssse3 Negates packed 16-bit integers in
_mm_sign_epi32⚠  x86-64 and ssse3 Negates packed 32-bit integers in
_mm_sll_epi16⚠  x86-64 and sse2 Shifts packed 16-bit integers in
_mm_sll_epi32⚠  x86-64 and sse2 Shifts packed 32-bit integers in
_mm_sll_epi64⚠  x86-64 and sse2 Shifts packed 64-bit integers in
_mm_slli_epi16⚠  x86-64 and sse2 Shifts packed 16-bit integers in
_mm_slli_epi32⚠  x86-64 and sse2 Shifts packed 32-bit integers in
_mm_slli_epi64⚠  x86-64 and sse2 Shifts packed 64-bit integers in
_mm_slli_si128⚠  x86-64 and sse2 Shifts
_mm_sllv_epi32⚠  x86-64 and avx2 Shifts packed 32-bit integers in
_mm_sllv_epi64⚠  x86-64 and avx2 Shifts packed 64-bit integers in
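The shift intrinsics come in three flavors: `_mm_slli_*` takes an immediate count, `_mm_sll_*` takes the count from the low 64 bits of a second vector, and the avx2 `_mm_sllv_*` forms shift each lane by its own per-lane count. A minimal sketch of the vector-count form, assuming an x86-64 target (SSE2 is baseline); the helper name is ours:

```rust
// Sketch: shift every 32-bit lane left by 4 using _mm_sll_epi32,
// where the count comes from the low 64 bits of the second operand.
#[cfg(target_arch = "x86_64")]
fn shift_all_lanes_left() -> i32 {
    use std::arch::x86_64::{_mm_cvtsi128_si32, _mm_cvtsi32_si128, _mm_set1_epi32, _mm_sll_epi32};
    unsafe {
        let count = _mm_cvtsi32_si128(4); // count vector: low lane = 4
        let shifted = _mm_sll_epi32(_mm_set1_epi32(1), count);
        _mm_cvtsi128_si32(shifted) // 1 << 4
    }
}
```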
_mm_sqrt_pd^{⚠}  x8664 and sse2 Returns a new vector with the square root of each of the values in 
_mm_sqrt_ps^{⚠}  x8664 and sse Returns the square root of packed singleprecision (32bit) floatingpoint
elements in 
_mm_sqrt_sd^{⚠}  x8664 and sse2 Returns a new vector with the low element of 
_mm_sqrt_ss^{⚠}  x8664 and sse Returns the square root of the first singleprecision (32bit)
floatingpoint element in 
_mm_sra_epi16^{⚠}  x8664 and sse2 Shifts packed 16bit integers in 
_mm_sra_epi32^{⚠}  x8664 and sse2 Shifts packed 32bit integers in 
_mm_srai_epi16^{⚠}  x8664 and sse2 Shifts packed 16bit integers in 
_mm_srai_epi32^{⚠}  x8664 and sse2 Shifts packed 32bit integers in 
_mm_srav_epi32^{⚠}  x8664 and avx2 Shifts packed 32bit integers in 
_mm_srl_epi16^{⚠}  x8664 and sse2 Shifts packed 16bit integers in 
_mm_srl_epi32^{⚠}  x8664 and sse2 Shifts packed 32bit integers in 
_mm_srl_epi64^{⚠}  x8664 and sse2 Shifts packed 64bit integers in 
_mm_srli_epi16^{⚠}  x8664 and sse2 Shifts packed 16bit integers in 
_mm_srli_epi32^{⚠}  x8664 and sse2 Shifts packed 32bit integers in 
_mm_srli_epi64^{⚠}  x8664 and sse2 Shifts packed 64bit integers in 
_mm_srli_si128^{⚠}  x8664 and sse2 Shifts 
_mm_srlv_epi32^{⚠}  x8664 and avx2 Shifts packed 32bit integers in 
_mm_srlv_epi64^{⚠}  x8664 and avx2 Shifts packed 64bit integers in 
_mm_store1_pd^{⚠}  x8664 and sse2 Stores the lower doubleprecision (64bit) floatingpoint element from 
_mm_store1_ps^{⚠}  x8664 and sse Stores the lowest 32 bit float of 
_mm_store_pd^{⚠}  x8664 and sse2 Stores 128bits (composed of 2 packed doubleprecision (64bit)
floatingpoint elements) from 
_mm_store_pd1^{⚠}  x8664 and sse2 Stores the lower doubleprecision (64bit) floatingpoint element from 
_mm_store_ps^{⚠}  x8664 and sse Stores four 32bit floats into aligned memory. 
_mm_store_ps1^{⚠}  x8664 and sse Alias for 
_mm_store_sd^{⚠}  x8664 and sse2 Stores the lower 64 bits of a 128bit vector of 
_mm_store_si128^{⚠}  x8664 and sse2 Stores 128bits of integer data from 
_mm_store_ss^{⚠}  x8664 and sse Stores the lowest 32 bit float of 
_mm_storeh_pd^{⚠}  x8664 and sse2 Stores the upper 64 bits of a 128bit vector of 
_mm_storel_epi64^{⚠}  x8664 and sse2 Stores the lower 64bit integer 
_mm_storel_pd^{⚠}  x8664 and sse2 Stores the lower 64 bits of a 128bit vector of 
_mm_storer_pd^{⚠}  x8664 and sse2 Stores 2 doubleprecision (64bit) floatingpoint elements from 
_mm_storer_ps^{⚠}  x8664 and sse Stores four 32bit floats into aligned memory in reverse order. 
_mm_storeu_pd^{⚠}  x8664 and sse2 Stores 128bits (composed of 2 packed doubleprecision (64bit)
floatingpoint elements) from 
_mm_storeu_ps^{⚠}  x8664 and sse Stores four 32bit floats into memory. There are no restrictions on memory
alignment. For aligned memory 
_mm_storeu_si128^{⚠}  x8664 and sse2 Stores 128bits of integer data from 
_mm_stream_pd^{⚠}  x8664 and sse2 Stores a 128bit floating point vector of 
_mm_stream_ps^{⚠}  x8664 and sse Stores 
_mm_stream_sd^{⚠}  x8664 and sse4a Nontemporal store of 
_mm_stream_si32^{⚠}  x8664 and sse2 Stores a 32bit integer value in the specified memory location. To minimize caching, the data is flagged as nontemporal (unlikely to be used again soon). 
_mm_stream_si64^{⚠}  x8664 and sse2 Stores a 64bit integer value in the specified memory location. To minimize caching, the data is flagged as nontemporal (unlikely to be used again soon). 
_mm_stream_si128^{⚠}  x8664 and sse2 Stores a 128bit integer vector to a 128bit aligned memory location. To minimize caching, the data is flagged as nontemporal (unlikely to be used again soon). 
_mm_stream_ss^{⚠}  x8664 and sse4a Nontemporal store of 
_mm_sub_epi8^{⚠}  x8664 and sse2 Subtracts packed 8bit integers in 
_mm_sub_epi16^{⚠}  x8664 and sse2 Subtracts packed 16bit integers in 
_mm_sub_epi32^{⚠}  x8664 and sse2 Subtract packed 32bit integers in 
_mm_sub_epi64^{⚠}  x8664 and sse2 Subtract packed 64bit integers in 
_mm_sub_pd^{⚠}  x8664 and sse2 Subtract packed doubleprecision (64bit) floatingpoint elements in 
_mm_sub_ps^{⚠}  x8664 and sse Subtracts __m128 vectors. 
_mm_sub_sd^{⚠}  x8664 and sse2 Returns a new vector with the low element of 
_mm_sub_ss^{⚠}  x8664 and sse Subtracts the first component of 
_mm_subs_epi8^{⚠}  x8664 and sse2 Subtract packed 8bit integers in 
_mm_subs_epi16^{⚠}  x8664 and sse2 Subtract packed 16bit integers in 
_mm_subs_epu8^{⚠}  x8664 and sse2 Subtract packed unsigned 8bit integers in 
_mm_subs_epu16^{⚠}  x8664 and sse2 Subtract packed unsigned 16bit integers in 
_mm_test_all_ones^{⚠}  x8664 and sse4.1 Tests whether the specified bits in 
_mm_test_all_zeros^{⚠}  x8664 and sse4.1 Tests whether the specified bits in a 128bit integer vector are all zeros. 
_mm_test_mix_ones_zeros^{⚠}  x8664 and sse4.1 Tests whether the specified bits in a 128bit integer vector are neither all zeros nor all ones. 
_mm_testc_pd^{⚠}  x8664 and avx Computes the bitwise AND of 128 bits (representing doubleprecision (64bit)
floatingpoint elements) in 
_mm_testc_ps^{⚠}  x8664 and avx Computes the bitwise AND of 128 bits (representing singleprecision (32bit)
floatingpoint elements) in 
_mm_testc_si128^{⚠}  x8664 and sse4.1 Tests whether the specified bits in a 128bit integer vector are all ones. 
_mm_testnzc_pd^{⚠}  x8664 and avx Computes the bitwise AND of 128 bits (representing doubleprecision (64bit)
floatingpoint elements) in 
_mm_testnzc_ps^{⚠}  x8664 and avx Computes the bitwise AND of 128 bits (representing singleprecision (32bit)
floatingpoint elements) in 
_mm_testnzc_si128^{⚠}  x8664 and sse4.1 Tests whether the specified bits in a 128bit integer vector are neither all zeros nor all ones. 
_mm_testz_pd^{⚠}  x8664 and avx Computes the bitwise AND of 128 bits (representing doubleprecision (64bit)
floatingpoint elements) in 
_mm_testz_ps^{⚠}  x8664 and avx Computes the bitwise AND of 128 bits (representing singleprecision (32bit)
floatingpoint elements) in 
_mm_testz_si128^{⚠}  x8664 and sse4.1 Tests whether the specified bits in a 128bit integer vector are all zeros. 
_mm_tzcnt_32^{⚠}  x8664 and bmi1 Counts the number of trailing least significant zero bits. 
_mm_tzcnt_64^{⚠}  x8664 and bmi1 Counts the number of trailing least significant zero bits. 
_mm_ucomieq_sd^{⚠}  x8664 and sse2 Compares the lower element of 
_mm_ucomieq_ss^{⚠}  x8664 and sse Compares two 32bit floats from the loworder bits of 
_mm_ucomige_sd^{⚠}  x8664 and sse2 Compares the lower element of 
_mm_ucomige_ss^{⚠}  x8664 and sse Compares two 32bit floats from the loworder bits of 
_mm_ucomigt_sd^{⚠}  x8664 and sse2 Compares the lower element of 
_mm_ucomigt_ss^{⚠}  x8664 and sse Compares two 32bit floats from the loworder bits of 
_mm_ucomile_sd^{⚠}  x8664 and sse2 Compares the lower element of 
_mm_ucomile_ss^{⚠}  x8664 and sse Compares two 32bit floats from the loworder bits of 
_mm_ucomilt_sd^{⚠}  x8664 and sse2 Compares the lower element of 
_mm_ucomilt_ss^{⚠}  x8664 and sse Compares two 32bit floats from the loworder bits of 
_mm_ucomineq_sd^{⚠}  x8664 and sse2 Compares the lower element of 
_mm_ucomineq_ss^{⚠}  x8664 and sse Compares two 32bit floats from the loworder bits of 
_mm_undefined_pd^{⚠}  x8664 and sse2 Returns vector of type __m128d with undefined elements. 
_mm_undefined_ps^{⚠}  x8664 and sse Returns vector of type __m128 with undefined elements. 
_mm_undefined_si128^{⚠}  x8664 and sse2 Returns vector of type __m128i with undefined elements. 
_mm_unpackhi_epi8^{⚠}  x86-64 and sse2 Unpacks and interleaves 8-bit integers from the high half of a and b. 
_mm_unpackhi_epi16^{⚠}  x86-64 and sse2 Unpacks and interleaves 16-bit integers from the high half of a and b. 
_mm_unpackhi_epi32^{⚠}  x86-64 and sse2 Unpacks and interleaves 32-bit integers from the high half of a and b. 
_mm_unpackhi_epi64^{⚠}  x86-64 and sse2 Unpacks and interleaves 64-bit integers from the high half of a and b. 
_mm_unpackhi_pd^{⚠}  x86-64 and sse2 The resulting __m128d element is composed by the high-order values of the two __m128d interleaved input elements. 
_mm_unpackhi_ps^{⚠}  x86-64 and sse Unpacks and interleaves single-precision (32-bit) floating-point elements from the higher half of a and b. 
_mm_unpacklo_epi8^{⚠}  x86-64 and sse2 Unpacks and interleaves 8-bit integers from the low half of a and b. 
_mm_unpacklo_epi16^{⚠}  x86-64 and sse2 Unpacks and interleaves 16-bit integers from the low half of a and b. 
_mm_unpacklo_epi32^{⚠}  x86-64 and sse2 Unpacks and interleaves 32-bit integers from the low half of a and b. 
_mm_unpacklo_epi64^{⚠}  x86-64 and sse2 Unpacks and interleaves 64-bit integers from the low half of a and b. 
_mm_unpacklo_pd^{⚠}  x86-64 and sse2 The resulting __m128d element is composed by the low-order values of the two __m128d interleaved input elements. 
_mm_unpacklo_ps^{⚠}  x86-64 and sse Unpacks and interleaves single-precision (32-bit) floating-point elements from the lower half of a and b. 
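The unpack/interleave pattern can be modeled on a 4-lane vector in plain Rust (an illustrative model of the lane movement, not the intrinsics themselves):

```rust
// unpacklo takes the low-half lanes of a and b and interleaves them;
// unpackhi does the same with the high-half lanes.
fn unpacklo4(a: [u32; 4], b: [u32; 4]) -> [u32; 4] {
    [a[0], b[0], a[1], b[1]]
}

fn unpackhi4(a: [u32; 4], b: [u32; 4]) -> [u32; 4] {
    [a[2], b[2], a[3], b[3]]
}

fn main() {
    let a = [0, 1, 2, 3];
    let b = [10, 11, 12, 13];
    assert_eq!(unpacklo4(a, b), [0, 10, 1, 11]);
    assert_eq!(unpackhi4(a, b), [2, 12, 3, 13]);
}
```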
_mm_xor_pd^{⚠}  x86-64 and sse2 Computes the bitwise XOR of packed double-precision (64-bit) floating-point elements in a and b. 
_mm_xor_ps^{⚠}  x86-64 and sse Bitwise exclusive OR of packed single-precision (32-bit) floating-point elements. 
_mm_xor_si128^{⚠}  x86-64 and sse2 Computes the bitwise XOR of 128 bits (representing integer data) in a and b. 
_mulx_u32^{⚠}  x86-64 and bmi2 Unsigned multiply without affecting flags. 
_mulx_u64^{⚠}  x86-64 and bmi2 Unsigned multiply without affecting flags. 
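mulx produces the full double-width product of two unsigned operands, split into a high and a low half. A plain-Rust model of the 32-bit form (the helper name `mulx_u32_model` is illustrative):

```rust
// Full 64-bit product of two u32 operands; the intrinsic returns the high
// half and writes the low half through an out pointer, all without
// modifying the arithmetic flags.
fn mulx_u32_model(a: u32, b: u32) -> (u32, u32) {
    let wide = a as u64 * b as u64;
    ((wide >> 32) as u32, wide as u32) // (hi, lo)
}

fn main() {
    let (hi, lo) = mulx_u32_model(0xFFFF_FFFF, 2);
    assert_eq!((hi, lo), (1, 0xFFFF_FFFE)); // 0xFFFFFFFF * 2 = 0x1_FFFF_FFFE
}
```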
_pdep_u32^{⚠}  x86-64 and bmi2 Scatters contiguous low-order bits of a to the result at the positions specified by the mask. 
_pdep_u64^{⚠}  x86-64 and bmi2 Scatters contiguous low-order bits of a to the result at the positions specified by the mask. 
_pext_u32^{⚠}  x86-64 and bmi2 Gathers the bits of a specified by the mask into the contiguous low-order bit positions of the result. 
_pext_u64^{⚠}  x86-64 and bmi2 Gathers the bits of a specified by the mask into the contiguous low-order bit positions of the result. 
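The pdep/pext scatter and gather operations can be expressed as a bit-by-bit software model (a sketch of the semantics; the real intrinsics are single BMI2 instructions):

```rust
// pdep: place successive low bits of `a` at the set-bit positions of `mask`.
fn pdep32(mut a: u32, mask: u32) -> u32 {
    let mut out = 0;
    let mut m = mask;
    while m != 0 {
        let bit = m & m.wrapping_neg(); // lowest remaining set bit of mask
        if a & 1 != 0 {
            out |= bit;
        }
        a >>= 1;
        m &= m - 1; // clear that mask bit
    }
    out
}

// pext: collect the bits of `a` at set-bit positions of `mask` into the
// low-order bits of the result.
fn pext32(a: u32, mask: u32) -> u32 {
    let (mut out, mut i, mut m) = (0u32, 0, mask);
    while m != 0 {
        let bit = m & m.wrapping_neg();
        if a & bit != 0 {
            out |= 1 << i;
        }
        i += 1;
        m &= m - 1;
    }
    out
}

fn main() {
    // Scatter 0b101 into mask positions 2, 3, 4 -> bits 2 and 4 set.
    assert_eq!(pdep32(0b101, 0b11100), 0b10100);
    // Gathering back inverts the operation for bits covered by the mask.
    assert_eq!(pext32(0b10100, 0b11100), 0b101);
}
```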
_popcnt32^{⚠}  x86-64 and popcnt Counts the bits that are set. 
_popcnt64^{⚠}  x86-64 and popcnt Counts the bits that are set. 
_rdrand16_step^{⚠}  x86-64 and rdrand Reads a hardware-generated 16-bit random value and stores the result in val. Returns 1 if a random value was generated, and 0 otherwise. 
_rdrand32_step^{⚠}  x86-64 and rdrand Reads a hardware-generated 32-bit random value and stores the result in val. Returns 1 if a random value was generated, and 0 otherwise. 
_rdrand64_step^{⚠}  x86-64 and rdrand Reads a hardware-generated 64-bit random value and stores the result in val. Returns 1 if a random value was generated, and 0 otherwise. 
_rdseed16_step^{⚠}  x86-64 and rdseed Reads a 16-bit NIST SP800-90B and SP800-90C compliant random value and stores it in val. Returns 1 if a random value was generated, and 0 otherwise. 
_rdseed32_step^{⚠}  x86-64 and rdseed Reads a 32-bit NIST SP800-90B and SP800-90C compliant random value and stores it in val. Returns 1 if a random value was generated, and 0 otherwise. 
_rdseed64_step^{⚠}  x86-64 and rdseed Reads a 64-bit NIST SP800-90B and SP800-90C compliant random value and stores it in val. Returns 1 if a random value was generated, and 0 otherwise. 
_rdtsc^{⚠}  x86-64 Reads the current value of the processor’s timestamp counter. 
_subborrow_u32^{⚠}  x86-64 Subtracts unsigned 32-bit integers with borrow. 
_subborrow_u64^{⚠}  x86-64 Subtracts unsigned 64-bit integers with borrow. 
_t1mskc_u32^{⚠}  x86-64 and tbm Clears all bits below the least significant zero of x and sets all other bits. 
_t1mskc_u64^{⚠}  x86-64 and tbm Clears all bits below the least significant zero of x and sets all other bits. 
_tzcnt_u32^{⚠}  x86-64 and bmi1 Counts the number of trailing least significant zero bits. 
_tzcnt_u64^{⚠}  x86-64 and bmi1 Counts the number of trailing least significant zero bits. 
_tzmsk_u32^{⚠}  x86-64 and tbm Sets all bits below the least significant one of x and clears all other bits. 
_tzmsk_u64^{⚠}  x86-64 and tbm Sets all bits below the least significant one of x and clears all other bits. 
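The TBM mask operations correspond to short bitwise identities, sketched here in plain Rust (an illustrative model under the assumption that tzmsk(x) = !x & (x - 1) and t1mskc(x) = !x | (x + 1), which matches the descriptions above):

```rust
// tzmsk: ones exactly below the least significant set bit of x.
fn tzmsk(x: u32) -> u32 {
    !x & x.wrapping_sub(1)
}

// t1mskc: zeros exactly below the least significant clear bit of x,
// ones everywhere else.
fn t1mskc(x: u32) -> u32 {
    !x | x.wrapping_add(1)
}

fn main() {
    // Lowest set bit of 0b0110_1000 is bit 3 -> bits 0..=2 become ones.
    assert_eq!(tzmsk(0b0110_1000), 0b0000_0111);
    // Lowest clear bit of 0b0011 is bit 2 -> bits 0..=1 cleared, rest set.
    assert_eq!(t1mskc(0b0000_0011), 0xFFFF_FFFC);
}
```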
_xgetbv^{⚠}  x86-64 and xsave Reads the contents of the extended control register XCR specified in xcr_no. 
_xrstor^{⚠}  x86-64 and xsave Performs a full or partial restore of the enabled processor states using the state information stored in memory at mem_addr. 
_xrstor64^{⚠}  x86-64 and xsave Performs a full or partial restore of the enabled processor states using the state information stored in memory at mem_addr. 
_xrstors^{⚠}  x86-64 and xsave,xsaves Performs a full or partial restore of the enabled processor states using the state information stored in memory at mem_addr. 
_xrstors64^{⚠}  x86-64 and xsave,xsaves Performs a full or partial restore of the enabled processor states using the state information stored in memory at mem_addr. 
_xsave^{⚠}  x86-64 and xsave Performs a full or partial save of the enabled processor states to memory at mem_addr. 
_xsave64^{⚠}  x86-64 and xsave Performs a full or partial save of the enabled processor states to memory at mem_addr. 
_xsavec^{⚠}  x86-64 and xsave,xsavec Performs a full or partial save of the enabled processor states to memory at mem_addr. 
_xsavec64^{⚠}  x86-64 and xsave,xsavec Performs a full or partial save of the enabled processor states to memory at mem_addr. 
_xsaveopt^{⚠}  x86-64 and xsave,xsaveopt Performs a full or partial save of the enabled processor states to memory at mem_addr. 
_xsaveopt64^{⚠}  x86-64 and xsave,xsaveopt Performs a full or partial save of the enabled processor states to memory at mem_addr. 
_xsaves^{⚠}  x86-64 and xsave,xsaves Performs a full or partial save of the enabled processor states to memory at mem_addr. 
_xsaves64^{⚠}  x86-64 and xsave,xsaves Performs a full or partial save of the enabled processor states to memory at mem_addr. 
_xsetbv^{⚠}  x86-64 and xsave Copies 64 bits from val to the extended control register (XCR) specified by a. 
_MM_SHUFFLE  Experimental x86-64 A utility function for creating masks to use with Intel shuffle and permute intrinsics. 
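_MM_SHUFFLE packs four 2-bit lane selectors into one control byte as (z << 6) | (y << 4) | (x << 2) | w. A sketch of that formula (the `mm_shuffle` helper here is an illustrative reimplementation, not the library item):

```rust
// Build a shuffle-control byte from four lane indices, highest lane first.
const fn mm_shuffle(z: u32, y: u32, x: u32, w: u32) -> i32 {
    ((z << 6) | (y << 4) | (x << 2) | w) as i32
}

fn main() {
    // The identity shuffle selects lanes 3, 2, 1, 0.
    assert_eq!(mm_shuffle(3, 2, 1, 0), 0b11_10_01_00);
    // Broadcasting lane 0 to every position gives a zero control byte.
    assert_eq!(mm_shuffle(0, 0, 0, 0), 0);
}
```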
_bittest^{⚠}  Experimental x86-64 Returns the bit in position b of the memory addressed by p. 
_bittest64^{⚠}  Experimental x86-64 Returns the bit in position b of the memory addressed by p. 
_bittestandcomplement^{⚠}  Experimental x86-64 Returns the bit in position b of the memory addressed by p, then inverts that bit. 
_bittestandcomplement64^{⚠}  Experimental x86-64 Returns the bit in position b of the memory addressed by p, then inverts that bit. 
_bittestandreset^{⚠}  Experimental x86-64 Returns the bit in position b of the memory addressed by p, then resets that bit to 0. 
_bittestandreset64^{⚠}  Experimental x86-64 Returns the bit in position b of the memory addressed by p, then resets that bit to 0. 
_bittestandset^{⚠}  Experimental x86-64 Returns the bit in position b of the memory addressed by p, then sets that bit to 1. 
_bittestandset64^{⚠}  Experimental x86-64 Returns the bit in position b of the memory addressed by p, then sets that bit to 1. 
_kand_mask16^{⚠}  Experimental x86-64 and avx512f Compute the bitwise AND of 16-bit masks a and b, and store the result in k. 
_kandn_mask16^{⚠}  Experimental x86-64 and avx512f Compute the bitwise NOT of 16-bit mask a and then AND with b, and store the result in k. 
_knot_mask16^{⚠}  Experimental x86-64 and avx512f Compute the bitwise NOT of 16-bit mask a, and store the result in k. 
_kor_mask16^{⚠}  Experimental x86-64 and avx512f Compute the bitwise OR of 16-bit masks a and b, and store the result in k. 
_kxnor_mask16^{⚠}  Experimental x86-64 and avx512f Compute the bitwise XNOR of 16-bit masks a and b, and store the result in k. 
_kxor_mask16^{⚠}  Experimental x86-64 and avx512f Compute the bitwise XOR of 16-bit masks a and b, and store the result in k. 
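These mask intrinsics operate on 16-bit mask registers, and their logic matches ordinary u16 bitwise operations. A plain-Rust model of the two less obvious ones (illustrative helpers, not the intrinsics):

```rust
// kandn: NOT the first mask, then AND with the second.
fn kandn(a: u16, b: u16) -> u16 {
    !a & b
}

// kxnor: the complement of XOR -- ones where the masks agree.
fn kxnor(a: u16, b: u16) -> u16 {
    !(a ^ b)
}

fn main() {
    assert_eq!(kandn(0b1100, 0b1010), 0b0010); // bits of b not covered by a
    assert_eq!(kxnor(0xFFFF, 0xFFFF), 0xFFFF); // identical masks agree everywhere
}
```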
_mm256_cvtph_ps^{⚠}  Experimental x86-64 and f16c Converts the 8 x 16-bit half-precision float values in the 128-bit vector a into 8 x 32-bit float values stored in a 256-bit wide vector. 
_mm256_cvtps_ph^{⚠}  Experimental x86-64 and f16c Converts the 8 x 32-bit float values in the 256-bit vector a into 8 x 16-bit half-precision float values stored in a 128-bit wide vector. 
_mm256_madd52hi_epu64^{⚠}  Experimental x86-64 and avx512ifma,avx512vl Multiply packed unsigned 52-bit integers in each 64-bit element of b and c to form a 104-bit intermediate result, add the high 52 bits of the intermediate result to the corresponding unsigned 64-bit integer in a, and store the results in dst. 
_mm256_madd52lo_epu64^{⚠}  Experimental x86-64 and avx512ifma,avx512vl Multiply packed unsigned 52-bit integers in each 64-bit element of b and c to form a 104-bit intermediate result, add the low 52 bits of the intermediate result to the corresponding unsigned 64-bit integer in a, and store the results in dst. 
_mm512_abs_epi32^{⚠}  Experimental x86-64 and avx512f Computes the absolute values of packed 32-bit integers in a. 
_mm512_abs_epi64^{⚠}  Experimental x86-64 and avx512f Compute the absolute value of packed signed 64-bit integers in a, and store the unsigned results in dst. 
_mm512_abs_pd^{⚠}  Experimental x86-64 and avx512f Finds the absolute value of each packed double-precision (64-bit) floating-point element in v2, storing the results in dst. 
_mm512_abs_ps^{⚠}  Experimental x86-64 and avx512f Finds the absolute value of each packed single-precision (32-bit) floating-point element in v2, storing the results in dst. 
_mm512_add_epi32^{⚠}  Experimental x86-64 and avx512f Add packed 32-bit integers in a and b, and store the results in dst. 
_mm512_add_epi64^{⚠}  Experimental x86-64 and avx512f Add packed 64-bit integers in a and b, and store the results in dst. 
_mm512_add_pd^{⚠}  Experimental x86-64 and avx512f Add packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst. 
_mm512_add_ps^{⚠}  Experimental x86-64 and avx512f Add packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst. 
_mm512_add_round_pd^{⚠}  Experimental x86-64 and avx512f Add packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst. 
_mm512_add_round_ps^{⚠}  Experimental x86-64 and avx512f Add packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst. 
_mm512_and_epi32^{⚠}  Experimental x86-64 and avx512f Compute the bitwise AND of 512 bits (composed of packed 32-bit integers) in a and b, and store the results in dst. 
_mm512_and_epi64^{⚠}  Experimental x86-64 and avx512f Compute the bitwise AND of 512 bits (composed of packed 64-bit integers) in a and b, and store the results in dst. 
_mm512_and_si512^{⚠}  Experimental x86-64 and avx512f Compute the bitwise AND of 512 bits (representing integer data) in a and b, and store the result in dst. 
_mm512_cmp_epi32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 32-bit integers in a and b based on the comparison operand specified by op. 
_mm512_cmp_epi64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 64-bit integers in a and b based on the comparison operand specified by op. 
_mm512_cmp_epu32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 32-bit integers in a and b based on the comparison operand specified by op. 
_mm512_cmp_epu64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 64-bit integers in a and b based on the comparison operand specified by op. 
_mm512_cmp_pd_mask^{⚠}  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b based on the comparison operand specified by op. 
_mm512_cmp_ps_mask^{⚠}  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b based on the comparison operand specified by op. 
_mm512_cmp_round_pd_mask^{⚠}  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b based on the comparison operand specified by op. 
_mm512_cmp_round_ps_mask^{⚠}  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b based on the comparison operand specified by op. 
_mm512_cmpeq_epi32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 32-bit integers in a and b for equality, and store the results in a mask vector. 
_mm512_cmpeq_epi64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 64-bit integers in a and b for equality, and store the results in a mask vector. 
_mm512_cmpeq_epu32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 32-bit integers in a and b for equality, and store the results in a mask vector. 
_mm512_cmpeq_epu64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 64-bit integers in a and b for equality, and store the results in a mask vector. 
_mm512_cmpeq_pd_mask^{⚠}  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b for equality, and store the results in a mask vector. 
_mm512_cmpeq_ps_mask^{⚠}  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b for equality, and store the results in a mask vector. 
_mm512_cmpge_epi32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 32-bit integers in a and b for greater-than-or-equal, and store the results in a mask vector. 
_mm512_cmpge_epi64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 64-bit integers in a and b for greater-than-or-equal, and store the results in a mask vector. 
_mm512_cmpge_epu32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 32-bit integers in a and b for greater-than-or-equal, and store the results in a mask vector. 
_mm512_cmpge_epu64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 64-bit integers in a and b for greater-than-or-equal, and store the results in a mask vector. 
_mm512_cmpgt_epi32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 32-bit integers in a and b for greater-than, and store the results in a mask vector. 
_mm512_cmpgt_epi64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 64-bit integers in a and b for greater-than, and store the results in a mask vector. 
_mm512_cmpgt_epu32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 32-bit integers in a and b for greater-than, and store the results in a mask vector. 
_mm512_cmpgt_epu64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 64-bit integers in a and b for greater-than, and store the results in a mask vector. 
_mm512_cmple_epi32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 32-bit integers in a and b for less-than-or-equal, and store the results in a mask vector. 
_mm512_cmple_epi64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 64-bit integers in a and b for less-than-or-equal, and store the results in a mask vector. 
_mm512_cmple_epu32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 32-bit integers in a and b for less-than-or-equal, and store the results in a mask vector. 
_mm512_cmple_epu64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 64-bit integers in a and b for less-than-or-equal, and store the results in a mask vector. 
_mm512_cmple_pd_mask^{⚠}  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b for less-than-or-equal, and store the results in a mask vector. 
_mm512_cmple_ps_mask^{⚠}  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b for less-than-or-equal, and store the results in a mask vector. 
_mm512_cmplt_epi32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 32-bit integers in a and b for less-than, and store the results in a mask vector. 
_mm512_cmplt_epi64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 64-bit integers in a and b for less-than, and store the results in a mask vector. 
_mm512_cmplt_epu32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 32-bit integers in a and b for less-than, and store the results in a mask vector. 
_mm512_cmplt_epu64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 64-bit integers in a and b for less-than, and store the results in a mask vector. 
_mm512_cmplt_pd_mask^{⚠}  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b for less-than, and store the results in a mask vector. 
_mm512_cmplt_ps_mask^{⚠}  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b for less-than, and store the results in a mask vector. 
_mm512_cmpneq_epi32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 32-bit integers in a and b for inequality, and store the results in a mask vector. 
_mm512_cmpneq_epi64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 64-bit integers in a and b for inequality, and store the results in a mask vector. 
_mm512_cmpneq_epu32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 32-bit integers in a and b for inequality, and store the results in a mask vector. 
_mm512_cmpneq_epu64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 64-bit integers in a and b for inequality, and store the results in a mask vector. 
_mm512_cmpneq_pd_mask^{⚠}  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b for inequality, and store the results in a mask vector. 
_mm512_cmpneq_ps_mask^{⚠}  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b for inequality, and store the results in a mask vector. 
_mm512_cmpnle_pd_mask^{⚠}  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b for not-less-than-or-equal, and store the results in a mask vector. 
_mm512_cmpnle_ps_mask^{⚠}  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b for not-less-than-or-equal, and store the results in a mask vector. 
_mm512_cmpnlt_pd_mask^{⚠}  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b for not-less-than, and store the results in a mask vector. 
_mm512_cmpnlt_ps_mask^{⚠}  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b for not-less-than, and store the results in a mask vector. 
_mm512_cmpord_pd_mask^{⚠}  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b to see if neither is NaN, and store the results in a mask vector. 
_mm512_cmpord_ps_mask^{⚠}  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b to see if neither is NaN, and store the results in a mask vector. 
_mm512_cmpunord_pd_mask^{⚠}  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b to see if either is NaN, and store the results in a mask vector. 
_mm512_cmpunord_ps_mask^{⚠}  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b to see if either is NaN, and store the results in a mask vector. 
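"Ordered" and "unordered" comparisons differ only in how they treat NaN: ordered is true only when neither operand is NaN, unordered is its complement. A scalar per-lane model:

```rust
// Per-lane model of cmpord/cmpunord (illustrative, not the intrinsics).
fn cmp_ord(a: f64, b: f64) -> bool {
    !a.is_nan() && !b.is_nan()
}

fn cmp_unord(a: f64, b: f64) -> bool {
    a.is_nan() || b.is_nan()
}

fn main() {
    assert!(cmp_ord(1.0, 2.0)); // both operands are real numbers
    assert!(cmp_unord(f64::NAN, 2.0)); // any NaN makes the pair unordered
    assert!(!cmp_ord(f64::NAN, f64::NAN));
}
```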
_mm512_cvt_roundps_epi32^{⚠}  Experimental x86-64 and avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst. 
_mm512_cvt_roundps_epu32^{⚠}  Experimental x86-64 and avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst. 
_mm512_cvt_roundps_pd^{⚠}  Experimental x86-64 and avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed double-precision (64-bit) floating-point elements, and store the results in dst. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_cvtps_epi32^{⚠}  Experimental x86-64 and avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst. 
_mm512_cvtps_epu32^{⚠}  Experimental x86-64 and avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst. 
_mm512_cvtps_pd^{⚠}  Experimental x86-64 and avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed double-precision (64-bit) floating-point elements, and store the results in dst. 
_mm512_cvtt_roundpd_epi32^{⚠}  Experimental x86-64 and avx512f Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_cvtt_roundpd_epu32^{⚠}  Experimental x86-64 and avx512f Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_cvtt_roundps_epi32^{⚠}  Experimental x86-64 and avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_cvtt_roundps_epu32^{⚠}  Experimental x86-64 and avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_cvttpd_epi32^{⚠}  Experimental x86-64 and avx512f Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst. 
_mm512_cvttpd_epu32^{⚠}  Experimental x86-64 and avx512f Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst. 
_mm512_cvttps_epi32^{⚠}  Experimental x86-64 and avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst. 
_mm512_cvttps_epu32^{⚠}  Experimental x86-64 and avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst. 
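The difference between the cvt* and cvtt* families is the rounding behavior: "with truncation" means rounding toward zero, which is the same behavior as Rust's float-to-int `as` casts:

```rust
fn main() {
    // cvtt* ("convert with truncation") rounds toward zero per lane,
    // exactly like `as` casts from float to int in Rust:
    assert_eq!(2.9_f32 as i32, 2);
    assert_eq!(-2.9_f32 as i32, -2); // toward zero, not toward negative infinity
    // The non-truncating cvt* family instead uses the current MXCSR
    // rounding mode (round-to-nearest-even by default).
}
```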
_mm512_div_pd^{⚠}  Experimental x86-64 and avx512f Divide packed double-precision (64-bit) floating-point elements in a by packed elements in b, and store the results in dst. 
_mm512_div_ps^{⚠}  Experimental x86-64 and avx512f Divide packed single-precision (32-bit) floating-point elements in a by packed elements in b, and store the results in dst. 
_mm512_div_round_pd^{⚠}  Experimental x86-64 and avx512f Divide packed double-precision (64-bit) floating-point elements in a by packed elements in b, and store the results in dst. 
_mm512_div_round_ps^{⚠}  Experimental x86-64 and avx512f Divide packed single-precision (32-bit) floating-point elements in a by packed elements in b, and store the results in dst. 
_mm512_extractf32x4_ps^{⚠}  Experimental x86-64 and avx512f Extract 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from a, selected with imm8, and store the result in dst. 
_mm512_fmadd_pd^{⚠}  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst. 
_mm512_fmadd_ps^{⚠}  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst. 
_mm512_fmadd_round_pd^{⚠}  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst. 
_mm512_fmadd_round_ps^{⚠}  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst. 
_mm512_fmaddsub_pd^{⚠}  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst. 
_mm512_fmaddsub_ps^{⚠}  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst. 
_mm512_fmaddsub_round_pd^{⚠}  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst. 
_mm512_fmaddsub_round_ps^{⚠}  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately add and subtract packed elements in c to/from the intermediate result, and store the results in dst. 
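The "alternately add and subtract" pattern of fmaddsub can be sketched per lane in plain Rust, assuming the usual convention that even-indexed lanes subtract c and odd-indexed lanes add it (fmsubadd swaps the pattern):

```rust
// Scalar loop model of fmaddsub (illustrative, not the intrinsic):
// result[i] = a[i]*b[i] - c[i] for even i, a[i]*b[i] + c[i] for odd i.
fn fmaddsub(a: &[f64], b: &[f64], c: &[f64]) -> Vec<f64> {
    a.iter()
        .zip(b)
        .zip(c)
        .enumerate()
        .map(|(i, ((&a, &b), &c))| if i % 2 == 0 { a * b - c } else { a * b + c })
        .collect()
}

fn main() {
    let r = fmaddsub(&[1.0, 1.0], &[2.0, 2.0], &[0.5, 0.5]);
    assert_eq!(r, vec![1.5, 2.5]); // lane 0: 2 - 0.5, lane 1: 2 + 0.5
}
```

This interleaved add/subtract pattern is what makes the instruction useful for complex-number multiplication.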
_mm512_fmsub_pd^{⚠}  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst. 
_mm512_fmsub_ps^{⚠}  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst. 
_mm512_fmsub_round_pd^{⚠}  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst. 
_mm512_fmsub_round_ps^{⚠}  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst. 
_mm512_fmsubadd_pd^{⚠}  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst. 
_mm512_fmsubadd_ps^{⚠}  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst. 
_mm512_fmsubadd_round_pd^{⚠}  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst. 
_mm512_fmsubadd_round_ps^{⚠}  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, alternately subtract and add packed elements in c from/to the intermediate result, and store the results in dst. 
_mm512_fnmadd_pd^{⚠}  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst. 
_mm512_fnmadd_ps^{⚠}  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst. 
_mm512_fnmadd_round_pd^{⚠}  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst. 
_mm512_fnmadd_round_ps^{⚠}  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst. 
_mm512_fnmsub_pd^{⚠}  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst. 
_mm512_fnmsub_ps^{⚠}  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst. 
_mm512_fnmsub_round_pd^{⚠}  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst. 
_mm512_fnmsub_round_ps^{⚠}  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst. 
_mm512_getexp_pd^{⚠}  Experimental x86-64 and avx512f Convert the exponent of each packed double-precision (64-bit) floating-point element in a to a double-precision (64-bit) floating-point number representing the integer exponent, and store the results in dst. This intrinsic essentially calculates floor(log2(x)) for each element. 
_mm512_getexp_ps^{⚠}  Experimental x86-64 and avx512f Convert the exponent of each packed single-precision (32-bit) floating-point element in a to a single-precision (32-bit) floating-point number representing the integer exponent, and store the results in dst. This intrinsic essentially calculates floor(log2(x)) for each element. 
_mm512_getexp_round_pd^{⚠}  Experimental x86-64 and avx512f Convert the exponent of each packed double-precision (64-bit) floating-point element in a to a double-precision (64-bit) floating-point number representing the integer exponent, and store the results in dst. This intrinsic essentially calculates floor(log2(x)) for each element. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_getexp_round_ps^{⚠}  Experimental x86-64 and avx512f Convert the exponent of each packed single-precision (32-bit) floating-point element in a to a single-precision (32-bit) floating-point number representing the integer exponent, and store the results in dst. This intrinsic essentially calculates floor(log2(x)) for each element. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
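For normal, positive values, the floor(log2(x)) that getexp computes is just the unbiased exponent field of the IEEE 754 representation. A scalar sketch for f64 (valid for normal values only; subnormals, zeros, and NaN need extra handling):

```rust
// Read the unbiased exponent of a normal, positive f64 straight from
// its bit pattern: bits 52..63 hold the biased exponent (bias 1023).
fn getexp(x: f64) -> f64 {
    (((x.to_bits() >> 52) & 0x7FF) as i64 - 1023) as f64
}

fn main() {
    assert_eq!(getexp(8.0), 3.0); // 8 = 1.0 * 2^3
    assert_eq!(getexp(6.0), 2.0); // 6 = 1.5 * 2^2, floor(log2 6) = 2
    assert_eq!(getexp(0.75), -1.0); // 0.75 = 1.5 * 2^-1
}
```

The getmant family below is the complement: it extracts the significand, renormalized to a chosen interval such as [1, 2).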
_mm512_getmant_pd^{⚠}  Experimental x86-64 and avx512f Normalize the mantissas of packed double-precision (64-bit) floating-point elements in a, and store the results in dst. This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. The mantissa is normalized to the interval specified by interv, which can take the following values: _MM_MANT_NORM_1_2 // interval [1, 2) _MM_MANT_NORM_p5_2 // interval [0.5, 2) _MM_MANT_NORM_p5_1 // interval [0.5, 1) _MM_MANT_NORM_p75_1p5 // interval [0.75, 1.5) The sign is determined by sc which can take the following values: _MM_MANT_SIGN_src // sign = sign(src) _MM_MANT_SIGN_zero // sign = 0 _MM_MANT_SIGN_nan // dst = NaN if sign(src) = 1 
_mm512_getmant_ps^{⚠}  Experimental x86-64 and avx512f Normalize the mantissas of packed single-precision (32-bit) floating-point elements in a, and store the results in dst. This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. The mantissa is normalized to the interval specified by interv, which can take the following values: _MM_MANT_NORM_1_2 // interval [1, 2) _MM_MANT_NORM_p5_2 // interval [0.5, 2) _MM_MANT_NORM_p5_1 // interval [0.5, 1) _MM_MANT_NORM_p75_1p5 // interval [0.75, 1.5) The sign is determined by sc which can take the following values: _MM_MANT_SIGN_src // sign = sign(src) _MM_MANT_SIGN_zero // sign = 0 _MM_MANT_SIGN_nan // dst = NaN if sign(src) = 1 
_mm512_getmant_round_pd^{⚠}  Experimental x86-64 and avx512f Normalize the mantissas of packed double-precision (64-bit) floating-point elements in a, and store the results in dst. This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. The mantissa is normalized to the interval specified by interv, which can take the following values: _MM_MANT_NORM_1_2 // interval [1, 2) _MM_MANT_NORM_p5_2 // interval [0.5, 2) _MM_MANT_NORM_p5_1 // interval [0.5, 1) _MM_MANT_NORM_p75_1p5 // interval [0.75, 1.5) The sign is determined by sc which can take the following values: _MM_MANT_SIGN_src // sign = sign(src) _MM_MANT_SIGN_zero // sign = 0 _MM_MANT_SIGN_nan // dst = NaN if sign(src) = 1 Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_getmant_round_ps^{⚠}  Experimental x86-64 and avx512f Normalize the mantissas of packed single-precision (32-bit) floating-point elements in a, and store the results in dst. This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. The mantissa is normalized to the interval specified by interv, which can take the following values: _MM_MANT_NORM_1_2 // interval [1, 2) _MM_MANT_NORM_p5_2 // interval [0.5, 2) _MM_MANT_NORM_p5_1 // interval [0.5, 1) _MM_MANT_NORM_p75_1p5 // interval [0.75, 1.5) The sign is determined by sc which can take the following values: _MM_MANT_SIGN_src // sign = sign(src) _MM_MANT_SIGN_zero // sign = 0 _MM_MANT_SIGN_nan // dst = NaN if sign(src) = 1 Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_i32gather_epi32^{⚠}  Experimental x86-64 and avx512f Gather 32-bit integers from memory using 32-bit indices. 
_mm512_i32gather_epi64^{⚠}  Experimental x86-64 and avx512f Gather 64-bit integers from memory using 32-bit indices. 
_mm512_i32gather_pd^{⚠}  Experimental x86-64 and avx512f Gather double-precision (64-bit) floating-point elements from memory using 32-bit indices. 
_mm512_i32gather_ps^{⚠}  Experimental x86-64 and avx512f Gather single-precision (32-bit) floating-point elements from memory using 32-bit indices. 
_mm512_i32scatter_epi32^{⚠}  Experimental x86-64 and avx512f Scatter 32-bit integers from src into memory using 32-bit indices. 
_mm512_i32scatter_epi64^{⚠}  Experimental x86-64 and avx512f Scatter 64-bit integers from src into memory using 32-bit indices. 
_mm512_i32scatter_pd^{⚠}  Experimental x86-64 and avx512f Scatter double-precision (64-bit) floating-point elements from src into memory using 32-bit indices. 
_mm512_i32scatter_ps^{⚠}  Experimental x86-64 and avx512f Scatter single-precision (32-bit) floating-point elements from src into memory using 32-bit indices. 
_mm512_i64gather_epi32^{⚠}  Experimental x86-64 and avx512f Gather 32-bit integers from memory using 64-bit indices. 
_mm512_i64gather_epi64^{⚠}  Experimental x86-64 and avx512f Gather 64-bit integers from memory using 64-bit indices. 
_mm512_i64gather_pd^{⚠}  Experimental x86-64 and avx512f Gather double-precision (64-bit) floating-point elements from memory using 64-bit indices. 
_mm512_i64gather_ps^{⚠}  Experimental x86-64 and avx512f Gather single-precision (32-bit) floating-point elements from memory using 64-bit indices. 
_mm512_i64scatter_epi32^{⚠}  Experimental x86-64 and avx512f Scatter 32-bit integers from src into memory using 64-bit indices. 
_mm512_i64scatter_epi64^{⚠}  Experimental x86-64 and avx512f Scatter 64-bit integers from src into memory using 64-bit indices. 
_mm512_i64scatter_pd^{⚠}  Experimental x86-64 and avx512f Scatter double-precision (64-bit) floating-point elements from src into memory using 64-bit indices. 
_mm512_i64scatter_ps^{⚠}  Experimental x86-64 and avx512f Scatter single-precision (32-bit) floating-point elements from src into memory using 64-bit indices. 
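All of these share one addressing model: lane i of the vector is paired with memory at base plus idx[i] (the real intrinsics additionally take a byte scale, and the masked variants skip lanes). A plain-Rust sketch of that model, with hypothetical helper names:

```rust
// Scalar model of gather/scatter addressing: dst[i] = base[idx[i]] for a
// gather, base[idx[i]] = src[i] for a scatter. Shown with 4 lanes for
// brevity; the 512-bit forms have 8 or 16 lanes.
fn gather(base: &[i32], idx: &[i32; 4]) -> [i32; 4] {
    let mut dst = [0i32; 4];
    for (d, &i) in dst.iter_mut().zip(idx) {
        *d = base[i as usize]; // the intrinsic does an unchecked load
    }
    dst
}

fn scatter(base: &mut [i32], idx: &[i32; 4], src: &[i32; 4]) {
    for (&i, &v) in idx.iter().zip(src) {
        base[i as usize] = v;
    }
}

fn main() {
    let mem = [10, 11, 12, 13, 14, 15];
    assert_eq!(gather(&mem, &[4, 0, 2, 2]), [14, 10, 12, 12]);
    let mut out = [0; 6];
    scatter(&mut out, &[1, 3, 5, 0], &[7, 8, 9, 6]);
    assert_eq!(out, [6, 7, 0, 8, 0, 9]);
    println!("ok");
}
```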
_mm512_kand^{⚠}  Experimental x86-64 and avx512f Compute the bitwise AND of 16-bit masks a and b, and store the result in k. 
_mm512_kandn^{⚠}  Experimental x86-64 and avx512f Compute the bitwise NOT of 16-bit mask a and then AND with b, and store the result in k. 
_mm512_kmov^{⚠}  Experimental x86-64 and avx512f Copy 16-bit mask a to k. 
_mm512_knot^{⚠}  Experimental x86-64 and avx512f Compute the bitwise NOT of 16-bit mask a, and store the result in k. 
_mm512_kor^{⚠}  Experimental x86-64 and avx512f Compute the bitwise OR of 16-bit masks a and b, and store the result in k. 
_mm512_kxnor^{⚠}  Experimental x86-64 and avx512f Compute the bitwise XNOR of 16-bit masks a and b, and store the result in k. 
_mm512_kxor^{⚠}  Experimental x86-64 and avx512f Compute the bitwise XOR of 16-bit masks a and b, and store the result in k. 
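The mask type behind these is an ordinary 16-bit integer, so the k-register operations are plain bitwise logic; a scalar sketch:

```rust
// The _mm512_k* operations on 16-bit masks are ordinary integer bitwise ops.
fn main() {
    let a: u16 = 0b1010_1100_0011_0101;
    let b: u16 = 0b0110_0011_0101_0110;
    assert_eq!(a & b, 0b0010_0000_0001_0100); // _mm512_kand
    assert_eq!(!a & b, 0b0100_0011_0100_0010); // _mm512_kandn: NOT a, then AND b
    assert_eq!(a ^ b, 0b1100_1111_0110_0011); // _mm512_kxor
    assert_eq!(!(a ^ b), 0b0011_0000_1001_1100); // _mm512_kxnor
    println!("ok");
}
```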
_mm512_loadu_pd^{⚠}  Experimental x86-64 and avx512f Loads 512 bits (composed of 8 packed double-precision (64-bit) floating-point elements) from memory into the result. 
_mm512_loadu_ps^{⚠}  Experimental x86-64 and avx512f Loads 512 bits (composed of 16 packed single-precision (32-bit) floating-point elements) from memory into the result. 
_mm512_madd52hi_epu64^{⚠}  Experimental x86-64 and avx512ifma Multiply packed unsigned 52-bit integers in each 64-bit element of b and c to form a 104-bit intermediate result. Add the high 52-bit unsigned integer from the intermediate result to the corresponding unsigned 64-bit integer in a, and store the results in dst. 
_mm512_madd52lo_epu64^{⚠}  Experimental x86-64 and avx512ifma Multiply packed unsigned 52-bit integers in each 64-bit element of b and c to form a 104-bit intermediate result. Add the low 52-bit unsigned integer from the intermediate result to the corresponding unsigned 64-bit integer in a, and store the results in dst. 
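The IFMA step is easy to model per lane with 128-bit arithmetic; a scalar sketch (function names are illustrative, not the intrinsics), assuming the Intel operand order in which b and c are multiplied and a accumulates:

```rust
// Scalar model of one IFMA lane: multiply the low 52 bits of b and c to a
// 104-bit product, then add either its low (madd52lo) or high (madd52hi)
// 52 bits to the 64-bit accumulator a.
const MASK52: u64 = (1 << 52) - 1;

fn madd52lo(a: u64, b: u64, c: u64) -> u64 {
    let p = (b & MASK52) as u128 * (c & MASK52) as u128;
    a.wrapping_add(p as u64 & MASK52)
}

fn madd52hi(a: u64, b: u64, c: u64) -> u64 {
    let p = (b & MASK52) as u128 * (c & MASK52) as u128;
    a.wrapping_add((p >> 52) as u64)
}

fn main() {
    assert_eq!(madd52lo(100, 3, 5), 115); // low 52 bits of 15 are 15
    assert_eq!(madd52hi(100, 3, 5), 100); // high 52 bits of 15 are 0
    assert_eq!(madd52hi(1, 1 << 51, 4), 3); // (2^51 * 4) >> 52 == 2
    println!("ok");
}
```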
_mm512_mask2_permutex2var_epi32^{⚠}  Experimental x86-64 and avx512f Shuffle 32-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from idx when the corresponding mask bit is not set). 
_mm512_mask2_permutex2var_epi64^{⚠}  Experimental x86-64 and avx512f Shuffle 64-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from idx when the corresponding mask bit is not set). 
_mm512_mask2_permutex2var_pd^{⚠}  Experimental x86-64 and avx512f Shuffle double-precision (64-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from idx when the corresponding mask bit is not set). 
_mm512_mask2_permutex2var_ps^{⚠}  Experimental x86-64 and avx512f Shuffle single-precision (32-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from idx when the corresponding mask bit is not set). 
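For the 16-lane 32-bit forms, each idx lane is a 5-bit selector: the low four bits pick a lane and the fifth bit picks between table a and table b. A scalar sketch of the unmasked selection rule (the mask2 variants then blend against idx where the mask bit is clear):

```rust
// Scalar model of permutex2var lane selection for 16 x 32-bit lanes:
// bits 0..3 of each idx lane choose a lane, bit 4 chooses table a or b.
fn permutex2var(a: &[i32; 16], idx: &[u32; 16], b: &[i32; 16]) -> [i32; 16] {
    let mut dst = [0i32; 16];
    for (d, &sel) in dst.iter_mut().zip(idx) {
        let lane = (sel & 0xF) as usize;
        *d = if sel & 0x10 != 0 { b[lane] } else { a[lane] };
    }
    dst
}

fn main() {
    let a: [i32; 16] = core::array::from_fn(|i| i as i32);
    let b: [i32; 16] = core::array::from_fn(|i| 100 + i as i32);
    let mut idx = [0u32; 16];
    idx[0] = 0x12; // bit 4 set: table b, lane 2
    idx[1] = 0x03; // bit 4 clear: table a, lane 3
    let r = permutex2var(&a, &idx, &b);
    assert_eq!(r[0], 102);
    assert_eq!(r[1], 3);
    println!("ok");
}
```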
_mm512_mask3_fmadd_pd^{⚠}  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmadd_ps^{⚠}  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmadd_round_pd^{⚠}  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmadd_round_ps^{⚠}  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmaddsub_pd^{⚠}  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, alternatively add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmaddsub_ps^{⚠}  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, alternatively add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmaddsub_round_pd^{⚠}  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, alternatively add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmaddsub_round_ps^{⚠}  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, alternatively add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmsub_pd^{⚠}  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmsub_ps^{⚠}  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmsub_round_pd^{⚠}  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmsub_round_ps^{⚠}  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmsubadd_pd^{⚠}  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, alternatively subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmsubadd_ps^{⚠}  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, alternatively subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmsubadd_round_pd^{⚠}  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, alternatively subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fmsubadd_round_ps^{⚠}  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, alternatively subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fnmadd_pd^{⚠}  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fnmadd_ps^{⚠}  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fnmadd_round_pd^{⚠}  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fnmadd_round_ps^{⚠}  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fnmsub_pd^{⚠}  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fnmsub_ps^{⚠}  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fnmsub_round_pd^{⚠}  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
_mm512_mask3_fnmsub_round_ps^{⚠}  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from c when the corresponding mask bit is not set). 
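The mask3 family above differs only in the per-lane arithmetic; the masking rule is uniform: compute the fused result where the mask bit is set, otherwise pass c through. A scalar sketch of the fmadd case for 8 double lanes:

```rust
// Scalar model of the mask3 writemask rule (fmadd case): lane i gets
// fma(a, b, c) when bit i of k is set, else it is copied from c.
fn mask3_fmadd(a: &[f64; 8], b: &[f64; 8], c: &[f64; 8], k: u8) -> [f64; 8] {
    let mut dst = [0.0f64; 8];
    for i in 0..8 {
        dst[i] = if k & (1 << i) != 0 {
            a[i].mul_add(b[i], c[i]) // fused multiply-add, single rounding
        } else {
            c[i]
        };
    }
    dst
}

fn main() {
    let (a, b, c) = ([2.0; 8], [3.0; 8], [1.0; 8]);
    let r = mask3_fmadd(&a, &b, &c, 0b0000_0101);
    assert_eq!(r[0], 7.0); // 2*3 + 1, mask bit set
    assert_eq!(r[1], 1.0); // mask bit clear: copied from c
    println!("ok");
}
```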
_mm512_mask_abs_epi32^{⚠}  Experimental x86-64 and avx512f Computes the absolute value of packed 32-bit integers in a, and store the unsigned results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_abs_epi64^{⚠}  Experimental x86-64 and avx512f Compute the absolute value of packed signed 64-bit integers in a, and store the unsigned results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_abs_pd^{⚠}  Experimental x86-64 and avx512f Finds the absolute value of each packed double-precision (64-bit) floating-point element in v2, storing the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_abs_ps^{⚠}  Experimental x86-64 and avx512f Finds the absolute value of each packed single-precision (32-bit) floating-point element in v2, storing the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_add_epi32^{⚠}  Experimental x86-64 and avx512f Add packed 32-bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_add_epi64^{⚠}  Experimental x86-64 and avx512f Add packed 64-bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_add_pd^{⚠}  Experimental x86-64 and avx512f Add packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_add_ps^{⚠}  Experimental x86-64 and avx512f Add packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_add_round_pd^{⚠}  Experimental x86-64 and avx512f Add packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_add_round_ps^{⚠}  Experimental x86-64 and avx512f Add packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_and_epi32^{⚠}  Experimental x86-64 and avx512f Performs element-by-element bitwise AND between packed 32-bit integer elements of v2 and v3, storing the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_and_epi64^{⚠}  Experimental x86-64 and avx512f Compute the bitwise AND of packed 64-bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cmp_epi32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 32-bit integers in a and b based on the comparison operand specified by op, using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmp_epi64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 64-bit integers in a and b based on the comparison operand specified by op, using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmp_epu32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 32-bit integers in a and b based on the comparison operand specified by op, using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmp_epu64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 64-bit integers in a and b based on the comparison operand specified by op, using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmp_pd_mask^{⚠}  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b based on the comparison operand specified by op, using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmp_ps_mask^{⚠}  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b based on the comparison operand specified by op, using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmp_round_pd_mask^{⚠}  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b based on the comparison operand specified by op, using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmp_round_ps_mask^{⚠}  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b based on the comparison operand specified by op, using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpeq_epi32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 32-bit integers in a and b for equality, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpeq_epi64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 64-bit integers in a and b for equality, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpeq_epu32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 32-bit integers in a and b for equality, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpeq_epu64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 64-bit integers in a and b for equality, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpeq_pd_mask^{⚠}  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b for equality, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpeq_ps_mask^{⚠}  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b for equality, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpge_epi32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 32-bit integers in a and b for greater-than-or-equal, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpge_epi64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 64-bit integers in a and b for greater-than-or-equal, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpge_epu32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 32-bit integers in a and b for greater-than-or-equal, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpge_epu64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 64-bit integers in a and b for greater-than-or-equal, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpgt_epi32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 32-bit integers in a and b for greater-than, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpgt_epi64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 64-bit integers in a and b for greater-than, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpgt_epu32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 32-bit integers in a and b for greater-than, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpgt_epu64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 64-bit integers in a and b for greater-than, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmple_epi32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 32-bit integers in a and b for less-than-or-equal, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmple_epi64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 64-bit integers in a and b for less-than-or-equal, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmple_epu32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 32-bit integers in a and b for less-than-or-equal, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmple_epu64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 64-bit integers in a and b for less-than-or-equal, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmple_pd_mask^{⚠}  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b for less-than-or-equal, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmple_ps_mask^{⚠}  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b for less-than-or-equal, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmplt_epi32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 32-bit integers in a and b for less-than, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmplt_epi64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 64-bit integers in a and b for less-than, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmplt_epu32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 32-bit integers in a and b for less-than, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmplt_epu64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 64-bit integers in a and b for less-than, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmplt_pd_mask^{⚠}  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b for less-than, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmplt_ps_mask^{⚠}  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b for less-than, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpneq_epi32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 32-bit integers in a and b for inequality, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpneq_epi64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed signed 64-bit integers in a and b for inequality, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpneq_epu32_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 32-bit integers in a and b for inequality, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpneq_epu64_mask^{⚠}  Experimental x86-64 and avx512f Compare packed unsigned 64-bit integers in a and b for inequality, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpneq_pd_mask^{⚠}  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b for inequality, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpneq_ps_mask^{⚠}  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b for inequality, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
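The cmp…_mask family returns a bitmask rather than a vector: bit i is the lane-i comparison result, and the zeromask m clears lanes up front. A scalar sketch of the cmpeq_epi32 case (the helper name is illustrative):

```rust
// Scalar model of a masked vector compare: build a 16-bit result mask and
// AND it with the zeromask m, so masked-off lanes report 0.
fn mask_cmpeq_epi32_mask(m: u16, a: &[i32; 16], b: &[i32; 16]) -> u16 {
    let mut k = 0u16;
    for i in 0..16 {
        if a[i] == b[i] {
            k |= 1 << i;
        }
    }
    k & m
}

fn main() {
    let a = [1i32; 16];
    let mut b = [0i32; 16];
    b[0] = 1;
    b[3] = 1;
    assert_eq!(mask_cmpeq_epi32_mask(0xFFFF, &a, &b), 0b1001);
    assert_eq!(mask_cmpeq_epi32_mask(0b0001, &a, &b), 0b0001); // lane 3 zeroed out
    println!("ok");
}
```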
_mm512_mask_cmpnle_pd_mask^{⚠}  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b for not-less-than-or-equal, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpnle_ps_mask^{⚠}  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b for not-less-than-or-equal, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpnlt_pd_mask^{⚠}  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b for not-less-than, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpnlt_ps_mask^{⚠}  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b for not-less-than, and store the results in a mask vector k using zeromask m (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_mask_cmpord_pd_mask^{⚠}  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b to see if neither is NaN, and store the results in a mask vector. 
_mm512_mask_cmpord_ps_mask^{⚠}  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b to see if neither is NaN, and store the results in a mask vector. 
_mm512_mask_cmpunord_pd_mask^{⚠}  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b to see if either is NaN, and store the results in a mask vector. 
_mm512_mask_cmpunord_ps_mask^{⚠}  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b to see if either is NaN, and store the results in a mask vector. 
_mm512_mask_cvt_roundps_epi32^{⚠}  Experimental x86-64 and avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvt_roundps_epu32^{⚠}  Experimental x86-64 and avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvt_roundps_pd^{⚠}  Experimental x86-64 and avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_mask_cvtps_epi32^{⚠}  Experimental x86-64 and avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvtps_epu32^{⚠}  Experimental x86-64 and avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvtps_pd^{⚠}  Experimental x86-64 and avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed double-precision (64-bit) floating-point elements, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvtt_roundpd_epi32^{⚠}  Experimental x86-64 and avx512f Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_mask_cvtt_roundpd_epu32^{⚠}  Experimental x86-64 and avx512f Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_mask_cvtt_roundps_epi32^{⚠}  Experimental x86-64 and avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_mask_cvtt_roundps_epu32^{⚠}  Experimental x86-64 and avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_mask_cvttpd_epi32^{⚠}  Experimental x86-64 and avx512f Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvttpd_epu32^{⚠}  Experimental x86-64 and avx512f Convert packed double-precision (64-bit) floating-point elements in a to packed unsigned 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvttps_epi32^{⚠}  Experimental x86-64 and avx512f Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_cvttps_epu32^{⚠}  Experimentalx8664 and avx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed unsigned 32bit integers with truncation, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_div_pd^{⚠}  Experimentalx8664 and avx512f Divide packed doubleprecision (64bit) floatingpoint elements in a by packed elements in b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_div_ps^{⚠}  Experimentalx8664 and avx512f Divide packed singleprecision (32bit) floatingpoint elements in a by packed elements in b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_div_round_pd^{⚠}  Experimentalx8664 and avx512f Divide packed doubleprecision (64bit) floatingpoint elements in a by packed elements in b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_div_round_ps^{⚠}  Experimentalx8664 and avx512f Divide packed singleprecision (32bit) floatingpoint elements in a by packed elements in b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
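The "store the results in dst using writemask k" pattern that runs through all of these masked intrinsics can be modeled per lane. A minimal scalar sketch (the `mask_merge` helper is hypothetical, not a real intrinsic):

```rust
// Scalar model of AVX-512 writemask merging: for each lane i, the result
// of `op` is kept when bit i of mask `k` is set; otherwise the lane is
// copied from `src`.
fn mask_merge(src: &[f64], k: u16, op: impl Fn(usize) -> f64) -> Vec<f64> {
    src.iter()
        .enumerate()
        .map(|(i, &s)| if (k >> i) & 1 == 1 { op(i) } else { s })
        .collect()
}

fn main() {
    let a = [8.0, 9.0, 10.0, 11.0];
    let b = [2.0, 3.0, 2.0, 11.0];
    let src = [0.0; 4];
    // Mask 0b0101 keeps the quotient in lanes 0 and 2, copies src elsewhere.
    let dst = mask_merge(&src, 0b0101, |i| a[i] / b[i]);
    assert_eq!(dst, vec![4.0, 0.0, 5.0, 0.0]);
}
```

The `_maskz_` variants of these intrinsics follow the same pattern but zero the unselected lanes instead of copying them from `src`.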
_mm512_mask_fmadd_pd^{⚠}  Experimentalx8664 and avx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_fmadd_ps^{⚠}  Experimentalx8664 and avx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_fmadd_round_pd^{⚠}  Experimentalx8664 and avx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_fmadd_round_ps^{⚠}  Experimentalx8664 and avx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_fmaddsub_pd^{⚠}  Experimentalx8664 and avx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, alternatively add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_fmaddsub_ps^{⚠}  Experimentalx8664 and avx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, alternatively add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_fmaddsub_round_pd^{⚠}  Experimentalx8664 and avx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, alternatively add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_fmaddsub_round_ps^{⚠}  Experimentalx8664 and avx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, alternatively add and subtract packed elements in c to/from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_fmsub_pd^{⚠}  Experimentalx8664 and avx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_fmsub_ps^{⚠}  Experimentalx8664 and avx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_fmsub_round_pd^{⚠}  Experimentalx8664 and avx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_fmsub_round_ps^{⚠}  Experimentalx8664 and avx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_fmsubadd_pd^{⚠}  Experimentalx8664 and avx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, alternatively subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_fmsubadd_ps^{⚠}  Experimentalx8664 and avx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, alternatively subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_fmsubadd_round_pd^{⚠}  Experimentalx8664 and avx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, alternatively subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_fmsubadd_round_ps^{⚠}  Experimentalx8664 and avx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, alternatively subtract and add packed elements in c from/to the intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
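The fmaddsub family's "alternatively add and subtract" wording means the operation depends on lane parity: fmaddsub computes a*b - c in even lanes and a*b + c in odd lanes, and fmsubadd swaps the pattern. A scalar sketch with a hypothetical helper:

```rust
// Scalar model of the fmaddsub alternation: even lanes compute a*b - c,
// odd lanes a*b + c (fmsubadd is the same with the pattern swapped).
fn fmaddsub(a: &[f64], b: &[f64], c: &[f64]) -> Vec<f64> {
    (0..a.len())
        .map(|i| {
            let m = a[i] * b[i];
            if i % 2 == 0 { m - c[i] } else { m + c[i] }
        })
        .collect()
}

fn main() {
    let r = fmaddsub(&[1.0, 1.0], &[10.0, 10.0], &[3.0, 3.0]);
    assert_eq!(r, vec![7.0, 13.0]); // 10 - 3, then 10 + 3
}
```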
_mm512_mask_fnmadd_pd^{⚠}  Experimentalx8664 and avx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_fnmadd_ps^{⚠}  Experimentalx8664 and avx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_fnmadd_round_pd^{⚠}  Experimentalx8664 and avx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_fnmadd_round_ps^{⚠}  Experimentalx8664 and avx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_fnmsub_pd^{⚠}  Experimentalx8664 and avx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_fnmsub_ps^{⚠}  Experimentalx8664 and avx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_fnmsub_round_pd^{⚠}  Experimentalx8664 and avx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_fnmsub_round_ps^{⚠}  Experimentalx8664 and avx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_getexp_pd^{⚠}  Experimentalx8664 and avx512f Convert the exponent of each packed doubleprecision (64bit) floatingpoint element in a to a doubleprecision (64bit) floatingpoint number representing the integer exponent, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element. 
_mm512_mask_getexp_ps^{⚠}  Experimentalx8664 and avx512f Convert the exponent of each packed singleprecision (32bit) floatingpoint element in a to a singleprecision (32bit) floatingpoint number representing the integer exponent, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element. 
_mm512_mask_getexp_round_pd^{⚠}  Experimentalx8664 and avx512f Convert the exponent of each packed doubleprecision (64bit) floatingpoint element in a to a doubleprecision (64bit) floatingpoint number representing the integer exponent, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_mask_getexp_round_ps^{⚠}  Experimentalx8664 and avx512f Convert the exponent of each packed singleprecision (32bit) floatingpoint element in a to a singleprecision (32bit) floatingpoint number representing the integer exponent, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
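The "floor(log2(x))" that getexp computes is just the unbiased IEEE-754 exponent field, returned as a float. A per-element sketch for normal, non-zero doubles (the helper name is illustrative):

```rust
// Sketch of what getexp computes per element: the unbiased IEEE-754
// exponent, i.e. floor(log2(|x|)), returned as a floating-point number.
fn getexp(x: f64) -> f64 {
    let bits = x.to_bits();
    let biased = ((bits >> 52) & 0x7ff) as i64; // 11-bit exponent field
    (biased - 1023) as f64 // assumes a normal, non-zero input
}

fn main() {
    assert_eq!(getexp(8.0), 3.0);   // 8 = 1.0 * 2^3
    assert_eq!(getexp(0.75), -1.0); // 0.75 = 1.5 * 2^-1
}
```

The real intrinsic also handles zeros, subnormals, infinities, and NaN per the IEEE getExponent operation; this sketch covers only the normal-number case.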
_mm512_mask_getmant_pd^{⚠}  Experimentalx8664 and avx512f Normalize the mantissas of packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. The mantissa is normalized to the interval specified by interv, which can take the following values: _MM_MANT_NORM_1_2 // interval [1, 2) _MM_MANT_NORM_p5_2 // interval [0.5, 2) _MM_MANT_NORM_p5_1 // interval [0.5, 1) _MM_MANT_NORM_p75_1p5 // interval [0.75, 1.5) The sign is determined by sc which can take the following values: _MM_MANT_SIGN_src // sign = sign(src) _MM_MANT_SIGN_zero // sign = 0 _MM_MANT_SIGN_nan // dst = NaN if sign(src) = 1 
_mm512_mask_getmant_ps^{⚠}  Experimentalx8664 and avx512f Normalize the mantissas of packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. The mantissa is normalized to the interval specified by interv, which can take the following values: _MM_MANT_NORM_1_2 // interval [1, 2) _MM_MANT_NORM_p5_2 // interval [0.5, 2) _MM_MANT_NORM_p5_1 // interval [0.5, 1) _MM_MANT_NORM_p75_1p5 // interval [0.75, 1.5) The sign is determined by sc which can take the following values: _MM_MANT_SIGN_src // sign = sign(src) _MM_MANT_SIGN_zero // sign = 0 _MM_MANT_SIGN_nan // dst = NaN if sign(src) = 1 
_mm512_mask_getmant_round_pd^{⚠}  Experimentalx8664 and avx512f Normalize the mantissas of packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. The mantissa is normalized to the interval specified by interv, which can take the following values: _MM_MANT_NORM_1_2 // interval [1, 2) _MM_MANT_NORM_p5_2 // interval [0.5, 2) _MM_MANT_NORM_p5_1 // interval [0.5, 1) _MM_MANT_NORM_p75_1p5 // interval [0.75, 1.5) The sign is determined by sc which can take the following values: _MM_MANT_SIGN_src // sign = sign(src) _MM_MANT_SIGN_zero // sign = 0 _MM_MANT_SIGN_nan // dst = NaN if sign(src) = 1 Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_mask_getmant_round_ps^{⚠}  Experimentalx8664 and avx512f Normalize the mantissas of packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. The mantissa is normalized to the interval specified by interv, which can take the following values: _MM_MANT_NORM_1_2 // interval [1, 2) _MM_MANT_NORM_p5_2 // interval [0.5, 2) _MM_MANT_NORM_p5_1 // interval [0.5, 1) _MM_MANT_NORM_p75_1p5 // interval [0.75, 1.5) The sign is determined by sc which can take the following values: _MM_MANT_SIGN_src // sign = sign(src) _MM_MANT_SIGN_zero // sign = 0 _MM_MANT_SIGN_nan // dst = NaN if sign(src) = 1 Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_mask_i32gather_epi32^{⚠}  Experimentalx8664 and avx512f Gather 32bit integers from memory using 32bit indices. 
_mm512_mask_i32gather_epi64^{⚠}  Experimentalx8664 and avx512f Gather 64bit integers from memory using 32bit indices. 
_mm512_mask_i32gather_pd^{⚠}  Experimentalx8664 and avx512f Gather doubleprecision (64bit) floatingpoint elements from memory using 32bit indices. 
_mm512_mask_i32gather_ps^{⚠}  Experimentalx8664 and avx512f Gather singleprecision (32bit) floatingpoint elements from memory using 32bit indices. 
_mm512_mask_i32scatter_epi32^{⚠}  Experimentalx8664 and avx512f Scatter 32bit integers from src into memory using 32bit indices. 
_mm512_mask_i32scatter_epi64^{⚠}  Experimentalx8664 and avx512f Scatter 64bit integers from src into memory using 32bit indices. 
_mm512_mask_i32scatter_pd^{⚠}  Experimentalx8664 and avx512f Scatter doubleprecision (64bit) floatingpoint elements from src into memory using 32bit indices. 
_mm512_mask_i32scatter_ps^{⚠}  Experimentalx8664 and avx512f Scatter singleprecision (32bit) floatingpoint elements from src into memory using 32bit indices. 
_mm512_mask_i64gather_epi32^{⚠}  Experimentalx8664 and avx512f Gather 32bit integers from memory using 64bit indices. 
_mm512_mask_i64gather_epi64^{⚠}  Experimentalx8664 and avx512f Gather 64bit integers from memory using 64bit indices. 
_mm512_mask_i64gather_pd^{⚠}  Experimentalx8664 and avx512f Gather doubleprecision (64bit) floatingpoint elements from memory using 64bit indices. 
_mm512_mask_i64gather_ps^{⚠}  Experimentalx8664 and avx512f Gather singleprecision (32bit) floatingpoint elements from memory using 64bit indices. 
_mm512_mask_i64scatter_epi32^{⚠}  Experimentalx8664 and avx512f Scatter 32bit integers from src into memory using 64bit indices. 
_mm512_mask_i64scatter_epi64^{⚠}  Experimentalx8664 and avx512f Scatter 64bit integers from src into memory using 64bit indices. 
_mm512_mask_i64scatter_pd^{⚠}  Experimentalx8664 and avx512f Scatter doubleprecision (64bit) floatingpoint elements from src into memory using 64bit indices. 
_mm512_mask_i64scatter_ps^{⚠}  Experimentalx8664 and avx512f Scatter singleprecision (32bit) floatingpoint elements from src into memory using 64bit indices. 
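A masked gather combines the indexed load with the same writemask merging as the arithmetic intrinsics: lane i loads memory at the lane's index when mask bit i is set, and keeps src[i] otherwise. A scalar sketch (the real intrinsics take a raw base pointer plus a byte scale, both omitted here):

```rust
// Scalar model of a masked gather: lane i becomes mem[idx[i]] when mask
// bit i is set, else it keeps src[i]. Scatter is the mirror image: it
// stores lane i of the source to mem[idx[i]] for each set mask bit.
fn mask_gather(src: &[i32], k: u16, idx: &[usize], mem: &[i32]) -> Vec<i32> {
    (0..src.len())
        .map(|i| if (k >> i) & 1 == 1 { mem[idx[i]] } else { src[i] })
        .collect()
}

fn main() {
    let mem = [10, 20, 30, 40, 50];
    let dst = mask_gather(&[0, 0, 0], 0b101, &[4, 1, 2], &mem);
    assert_eq!(dst, vec![50, 0, 30]); // lane 1 masked off, keeps src
}
```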
_mm512_mask_max_epi32^{⚠}  Experimentalx8664 and avx512f Compare packed signed 32bit integers in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_max_epi64^{⚠}  Experimentalx8664 and avx512f Compare packed signed 64bit integers in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_max_epu32^{⚠}  Experimentalx8664 and avx512f Compare packed unsigned 32bit integers in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_max_epu64^{⚠}  Experimentalx8664 and avx512f Compare packed unsigned 64bit integers in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_max_pd^{⚠}  Experimentalx8664 and avx512f Compare packed doubleprecision (64bit) floatingpoint elements in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_max_ps^{⚠}  Experimentalx8664 and avx512f Compare packed singleprecision (32bit) floatingpoint elements in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_max_round_pd^{⚠}  Experimentalx8664 and avx512f Compare packed doubleprecision (64bit) floatingpoint elements in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_mask_max_round_ps^{⚠}  Experimentalx8664 and avx512f Compare packed singleprecision (32bit) floatingpoint elements in a and b, and store packed maximum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_mask_min_epi32^{⚠}  Experimentalx8664 and avx512f Compare packed signed 32bit integers in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_min_epi64^{⚠}  Experimentalx8664 and avx512f Compare packed signed 64bit integers in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_min_epu32^{⚠}  Experimentalx8664 and avx512f Compare packed unsigned 32bit integers in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_min_epu64^{⚠}  Experimentalx8664 and avx512f Compare packed unsigned 64bit integers in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_min_pd^{⚠}  Experimentalx8664 and avx512f Compare packed doubleprecision (64bit) floatingpoint elements in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_min_ps^{⚠}  Experimentalx8664 and avx512f Compare packed singleprecision (32bit) floatingpoint elements in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_min_round_pd^{⚠}  Experimentalx8664 and avx512f Compare packed doubleprecision (64bit) floatingpoint elements in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_mask_min_round_ps^{⚠}  Experimentalx8664 and avx512f Compare packed singleprecision (32bit) floatingpoint elements in a and b, and store packed minimum values in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_mask_movedup_pd^{⚠}  Experimentalx8664 and avx512f Duplicate evenindexed doubleprecision (64bit) floatingpoint elements from a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_movehdup_ps^{⚠}  Experimentalx8664 and avx512f Duplicate oddindexed singleprecision (32bit) floatingpoint elements from a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_moveldup_ps^{⚠}  Experimentalx8664 and avx512f Duplicate evenindexed singleprecision (32bit) floatingpoint elements from a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
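The dup family broadcasts every even-indexed (moveldup/movedup) or odd-indexed (movehdup) lane into the adjacent pair. A scalar sketch with hypothetical helper names:

```rust
// Scalar models: moveldup duplicates each even-indexed lane into the pair,
// movehdup duplicates each odd-indexed lane.
fn moveldup(a: &[f32]) -> Vec<f32> {
    a.chunks(2).flat_map(|p| [p[0], p[0]]).collect()
}
fn movehdup(a: &[f32]) -> Vec<f32> {
    a.chunks(2).flat_map(|p| [p[1], p[1]]).collect()
}

fn main() {
    let a = [1.0, 2.0, 3.0, 4.0];
    assert_eq!(moveldup(&a), vec![1.0, 1.0, 3.0, 3.0]);
    assert_eq!(movehdup(&a), vec![2.0, 2.0, 4.0, 4.0]);
}
```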
_mm512_mask_mul_epi32^{⚠}  Experimentalx8664 and avx512f Multiply the low signed 32bit integers from each packed 64bit element in a and b, and store the signed 64bit results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_mul_epu32^{⚠}  Experimentalx8664 and avx512f Multiply the low unsigned 32bit integers from each packed 64bit element in a and b, and store the unsigned 64bit results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_mul_pd^{⚠}  Experimentalx8664 and avx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_mul_ps^{⚠}  Experimentalx8664 and avx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_mul_round_pd^{⚠}  Experimentalx8664 and avx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_mul_round_ps^{⚠}  Experimentalx8664 and avx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_mullo_epi32^{⚠}  Experimentalx8664 and avx512f Multiply the packed 32bit integers in a and b, producing intermediate 64bit integers, and store the low 32 bits of the intermediate integers in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_mullox_epi64^{⚠}  Experimentalx8664 and avx512f Multiplies elements in packed 64bit integer vectors a and b together, storing the lower 64 bits of the result in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
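The distinction between these multiplies is which bits go in and which come out: mul_epi32 reads only the low signed 32-bit half of each 64-bit lane and produces the full 64-bit product, while mullo_epi32 computes a 32x32 product and keeps only the low 32 bits, which is exactly wrapping multiplication. A scalar sketch:

```rust
// Scalar models of the two multiply flavors.
// mul_epi32: sign-extend the low 32 bits of each 64-bit lane, full product.
fn mul_epi32(a: i64, b: i64) -> i64 {
    (a as i32 as i64) * (b as i32 as i64)
}
// mullo_epi32: low 32 bits of the 32x32 product, i.e. wrapping multiply.
fn mullo_epi32(a: i32, b: i32) -> i32 {
    a.wrapping_mul(b)
}

fn main() {
    assert_eq!(mul_epi32(0x1_0000_0003, 5), 15); // high half is ignored
    assert_eq!(mullo_epi32(0x10000, 0x10000), 0); // 2^32 wraps to 0
}
```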
_mm512_mask_or_epi32^{⚠}  Experimentalx8664 and avx512f Compute the bitwise OR of packed 32bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_or_epi64^{⚠}  Experimentalx8664 and avx512f Compute the bitwise OR of packed 64bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_permute_pd^{⚠}  Experimentalx8664 and avx512f Shuffle doubleprecision (64bit) floatingpoint elements in a within 128bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_permute_ps^{⚠}  Experimentalx8664 and avx512f Shuffle singleprecision (32bit) floatingpoint elements in a within 128bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_permutevar_epi32^{⚠}  Experimentalx8664 and avx512f Shuffle 32bit integers in a across lanes using the corresponding index in idx, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). Note that this intrinsic shuffles across 128bit lanes, unlike past intrinsics that use the permutevar name. This intrinsic is identical to _mm512_mask_permutexvar_epi32, and it is recommended that you use that intrinsic name. 
_mm512_mask_permutevar_pd^{⚠}  Experimentalx8664 and avx512f Shuffle doubleprecision (64bit) floatingpoint elements in a within 128bit lanes using the control in b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_permutevar_ps^{⚠}  Experimentalx8664 and avx512f Shuffle singleprecision (32bit) floatingpoint elements in a within 128bit lanes using the control in b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_permutex2var_epi32^{⚠}  Experimentalx8664 and avx512f Shuffle 32bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_permutex2var_epi64^{⚠}  Experimentalx8664 and avx512f Shuffle 64bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_permutex2var_pd^{⚠}  Experimentalx8664 and avx512f Shuffle doubleprecision (64bit) floatingpoint elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_permutex2var_ps^{⚠}  Experimentalx8664 and avx512f Shuffle singleprecision (32bit) floatingpoint elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using writemask k (elements are copied from a when the corresponding mask bit is not set). 
_mm512_mask_permutex_epi64^{⚠}  Experimentalx8664 and avx512f Shuffle 64bit integers in a within 256bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_permutex_pd^{⚠}  Experimentalx8664 and avx512f Shuffle doubleprecision (64bit) floatingpoint elements in a within 256bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_permutexvar_epi32^{⚠}  Experimentalx8664 and avx512f Shuffle 32bit integers in a across lanes using the corresponding index in idx, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_permutexvar_epi64^{⚠}  Experimentalx8664 and avx512f Shuffle 64bit integers in a across lanes using the corresponding index in idx, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_permutexvar_pd^{⚠}  Experimentalx8664 and avx512f Shuffle doubleprecision (64bit) floatingpoint elements in a across lanes using the corresponding index in idx, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_permutexvar_ps^{⚠}  Experimentalx8664 and avx512f Shuffle singleprecision (32bit) floatingpoint elements in a across lanes using the corresponding index in idx, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
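A permutexvar shuffle is an arbitrary cross-lane gather from the source vector: lane i of the result is the element of a selected by the corresponding index in idx (reduced modulo the lane count). A scalar sketch:

```rust
// Scalar model of permutexvar: result lane i is a[idx[i] % lanes], an
// arbitrary cross-lane shuffle driven by a vector of indices.
fn permutexvar(idx: &[usize], a: &[i32]) -> Vec<i32> {
    idx.iter().map(|&j| a[j % a.len()]).collect()
}

fn main() {
    let a = [10, 20, 30, 40];
    assert_eq!(permutexvar(&[3, 3, 0, 1], &a), vec![40, 40, 10, 20]);
}
```

This is what distinguishes permutexvar from the older permutevar intrinsics, which can only move elements within their own 128-bit lane.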
_mm512_mask_rcp14_pd^{⚠}  Experimentalx8664 and avx512f Compute the approximate reciprocal of packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^14. 
_mm512_mask_rcp14_ps^{⚠}  Experimentalx8664 and avx512f Compute the approximate reciprocal of packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^14. 
_mm512_mask_rol_epi32^{⚠}  Experimentalx8664 and avx512f Rotate the bits in each packed 32bit integer in a to the left by the number of bits specified in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_rol_epi64^{⚠}  Experimentalx8664 and avx512f Rotate the bits in each packed 64bit integer in a to the left by the number of bits specified in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_rolv_epi32^{⚠}  Experimentalx8664 and avx512f Rotate the bits in each packed 32bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_rolv_epi64^{⚠}  Experimentalx8664 and avx512f Rotate the bits in each packed 64bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_ror_epi32^{⚠}  Experimentalx8664 and avx512f Rotate the bits in each packed 32bit integer in a to the right by the number of bits specified in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_ror_epi64^{⚠}  Experimentalx8664 and avx512f Rotate the bits in each packed 64bit integer in a to the right by the number of bits specified in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_rorv_epi32^{⚠}  Experimentalx8664 and avx512f Rotate the bits in each packed 32bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_rorv_epi64^{⚠}  Experimentalx8664 and avx512f Rotate the bits in each packed 64bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
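Per-lane semantics of the rotate intrinsics match Rust's built-in integer rotations; the `v` variants (rolv/rorv) take a per-lane count from a second vector instead of a single immediate. A scalar sketch:

```rust
// Scalar model of rolv: each lane of a is rotated left by the count in
// the matching lane of `counts` (rorv is the same with rotate_right).
fn rolv_epi32(a: &[u32], counts: &[u32]) -> Vec<u32> {
    a.iter().zip(counts).map(|(&x, &n)| x.rotate_left(n)).collect()
}

fn main() {
    let r = rolv_epi32(&[0x8000_0001, 1], &[1, 4]);
    assert_eq!(r, vec![3, 16]); // the top bit wraps around to bit 0
}
```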
_mm512_mask_rsqrt14_pd^{⚠}  Experimentalx8664 and avx512f Compute the approximate reciprocal square root of packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^14. 
_mm512_mask_rsqrt14_ps^{⚠}  Experimentalx8664 and avx512f Compute the approximate reciprocal square root of packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^14. 
_mm512_mask_shuffle_epi32^{⚠}  Experimentalx8664 and avx512f Shuffle 32bit integers in a within 128bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_shuffle_f32x4^{⚠}  Experimentalx8664 and avx512f Shuffle 128bits (composed of 4 singleprecision (32bit) floatingpoint elements) selected by imm8 from a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_shuffle_f64x2^{⚠}  Experimentalx8664 and avx512f Shuffle 128bits (composed of 2 doubleprecision (64bit) floatingpoint elements) selected by imm8 from a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_shuffle_i32x4^{⚠}  Experimentalx8664 and avx512f Shuffle 128bits (composed of 4 32bit integers) selected by imm8 from a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_shuffle_i64x2^{⚠}  Experimentalx8664 and avx512f Shuffle 128bits (composed of 2 64bit integers) selected by imm8 from a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_shuffle_pd^{⚠}  Experimentalx8664 and avx512f Shuffle doubleprecision (64bit) floatingpoint elements within 128bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_shuffle_ps^{⚠}  Experimentalx8664 and avx512f Shuffle singleprecision (32bit) floatingpoint elements in a within 128bit lanes using the control in imm8, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
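For the "shuffle ... within 128bit lanes using the control in imm8" entries above, the control byte is read as four two-bit source selectors, one per destination position in each 128-bit lane. A scalar sketch of one lane of _mm512_mask_shuffle_epi32 (illustrative names; the writemask step is omitted to keep the control decoding in focus):

```rust
/// One 128-bit lane (4 x 32-bit elements) shuffled by an imm8 control:
/// bits [2i+1:2i] of `imm8` select which source element lands at position i.
fn shuffle_epi32_lane(lane: [u32; 4], imm8: u8) -> [u32; 4] {
    std::array::from_fn(|i| lane[((imm8 >> (2 * i)) & 0b11) as usize])
}
```

For example, imm8 = 0x1B (0b00_01_10_11) reverses each lane. The f32x4/i64x2-style shuffles work the same way but select whole 128-bit blocks from a and b rather than elements within a lane.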
_mm512_mask_sll_epi32^{⚠}  Experimentalx8664 and avx512f Shift packed 32bit integers in a left by count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sll_epi64^{⚠}  Experimentalx8664 and avx512f Shift packed 64bit integers in a left by count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_slli_epi32^{⚠}  Experimentalx8664 and avx512f Shift packed 32bit integers in a left by imm8 while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_slli_epi64^{⚠}  Experimentalx8664 and avx512f Shift packed 64bit integers in a left by imm8 while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sllv_epi32^{⚠}  Experimentalx8664 and avx512f Shift packed 32bit integers in a left by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sllv_epi64^{⚠}  Experimentalx8664 and avx512f Shift packed 64bit integers in a left by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sqrt_pd^{⚠}  Experimentalx8664 and avx512f Compute the square root of packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sqrt_ps^{⚠}  Experimentalx8664 and avx512f Compute the square root of packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sqrt_round_pd^{⚠}  Experimentalx8664 and avx512f Compute the square root of packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sqrt_round_ps^{⚠}  Experimentalx8664 and avx512f Compute the square root of packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sra_epi32^{⚠}  Experimentalx8664 and avx512f Shift packed 32bit integers in a right by count while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sra_epi64^{⚠}  Experimentalx8664 and avx512f Shift packed 64bit integers in a right by count while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_srai_epi32^{⚠}  Experimentalx8664 and avx512f Shift packed 32bit integers in a right by imm8 while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_srai_epi64^{⚠}  Experimentalx8664 and avx512f Shift packed 64bit integers in a right by imm8 while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_srav_epi32^{⚠}  Experimentalx8664 and avx512f Shift packed 32bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_srav_epi64^{⚠}  Experimentalx8664 and avx512f Shift packed 64bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_srl_epi32^{⚠}  Experimentalx8664 and avx512f Shift packed 32bit integers in a right by count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_srl_epi64^{⚠}  Experimentalx8664 and avx512f Shift packed 64bit integers in a right by count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_srli_epi32^{⚠}  Experimentalx8664 and avx512f Shift packed 32bit integers in a right by imm8 while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_srli_epi64^{⚠}  Experimentalx8664 and avx512f Shift packed 64bit integers in a right by imm8 while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_srlv_epi32^{⚠}  Experimentalx8664 and avx512f Shift packed 32bit integers in a right by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_srlv_epi64^{⚠}  Experimentalx8664 and avx512f Shift packed 64bit integers in a right by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
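The shift entries above differ only in fill bits: sll/srl shift in zeros, while sra shifts in copies of the sign bit, which preserves the sign of negative values. A single-lane scalar sketch of the distinction (illustrative helper names; the hardware saturates counts >= 32, which this sketch asserts away instead):

```rust
/// Arithmetic right shift (the `sra` family): fills with sign bits.
fn sra_lane(a: i32, count: u32) -> i32 {
    assert!(count < 32); // hardware handles larger counts by saturating; this sketch does not
    a >> count // `>>` on a signed type is arithmetic in Rust
}

/// Logical right shift (the `srl` family): fills with zeros.
fn srl_lane(a: i32, count: u32) -> i32 {
    assert!(count < 32);
    ((a as u32) >> count) as i32 // shift as unsigned, then reinterpret
}
```

So for a = -8 and count = 1, sra yields -4 while srl yields a large positive value, because the vacated top bit is zero.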
_mm512_mask_sub_epi32^{⚠}  Experimentalx8664 and avx512f Subtract packed 32bit integers in b from packed 32bit integers in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sub_epi64^{⚠}  Experimentalx8664 and avx512f Subtract packed 64bit integers in b from packed 64bit integers in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sub_pd^{⚠}  Experimentalx8664 and avx512f Subtract packed doubleprecision (64bit) floatingpoint elements in b from packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sub_ps^{⚠}  Experimentalx8664 and avx512f Subtract packed singleprecision (32bit) floatingpoint elements in b from packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sub_round_pd^{⚠}  Experimentalx8664 and avx512f Subtract packed doubleprecision (64bit) floatingpoint elements in b from packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_sub_round_ps^{⚠}  Experimentalx8664 and avx512f Subtract packed singleprecision (32bit) floatingpoint elements in b from packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_xor_epi32^{⚠}  Experimentalx8664 and avx512f Compute the bitwise XOR of packed 32bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_mask_xor_epi64^{⚠}  Experimentalx8664 and avx512f Compute the bitwise XOR of packed 64bit integers in a and b, and store the results in dst using writemask k (elements are copied from src when the corresponding mask bit is not set). 
_mm512_maskz_abs_epi32^{⚠}  Experimentalx8664 and avx512f Compute the absolute value of packed signed 32bit integers in a, and store the unsigned results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_abs_epi64^{⚠}  Experimentalx8664 and avx512f Compute the absolute value of packed signed 64bit integers in a, and store the unsigned results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_add_epi32^{⚠}  Experimentalx8664 and avx512f Add packed 32bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_add_epi64^{⚠}  Experimentalx8664 and avx512f Add packed 64bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_add_pd^{⚠}  Experimentalx8664 and avx512f Add packed doubleprecision (64bit) floatingpoint elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_add_ps^{⚠}  Experimentalx8664 and avx512f Add packed singleprecision (32bit) floatingpoint elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_add_round_pd^{⚠}  Experimentalx8664 and avx512f Add packed doubleprecision (64bit) floatingpoint elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_add_round_ps^{⚠}  Experimentalx8664 and avx512f Add packed singleprecision (32bit) floatingpoint elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_and_epi32^{⚠}  Experimentalx8664 and avx512f Compute the bitwise AND of packed 32bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_and_epi64^{⚠}  Experimentalx8664 and avx512f Compute the bitwise AND of packed 64bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
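The listing switches here from the _mm512_mask_* (writemask) entries to the _mm512_maskz_* (zeromask) entries; the only difference is what happens to lanes whose mask bit is clear. A scalar sketch of both conventions using 32-bit addition (illustrative function names, not the real intrinsics):

```rust
/// Writemask convention (`_mm512_mask_add_epi32`): inactive lanes are
/// merged from the extra `src` operand.
fn mask_add(src: [i32; 16], k: u16, a: [i32; 16], b: [i32; 16]) -> [i32; 16] {
    std::array::from_fn(|i| if (k >> i) & 1 == 1 { a[i].wrapping_add(b[i]) } else { src[i] })
}

/// Zeromask convention (`_mm512_maskz_add_epi32`): inactive lanes are
/// zeroed out, so there is no `src` operand.
fn maskz_add(k: u16, a: [i32; 16], b: [i32; 16]) -> [i32; 16] {
    std::array::from_fn(|i| if (k >> i) & 1 == 1 { a[i].wrapping_add(b[i]) } else { 0 })
}
```

Every mask/maskz pair in this index follows this exact pattern; only the per-lane operation changes.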
_mm512_maskz_cvt_roundps_epi32^{⚠}  Experimentalx8664 and avx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed 32bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvt_roundps_epu32^{⚠}  Experimentalx8664 and avx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed unsigned 32bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvt_roundps_pd^{⚠}  Experimentalx8664 and avx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed doubleprecision (64bit) floatingpoint elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_maskz_cvtps_epi32^{⚠}  Experimentalx8664 and avx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed 32bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtps_epu32^{⚠}  Experimentalx8664 and avx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed unsigned 32bit integers, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtps_pd^{⚠}  Experimentalx8664 and avx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed doubleprecision (64bit) floatingpoint elements, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvtt_roundpd_epi32^{⚠}  Experimentalx8664 and avx512f Convert packed doubleprecision (64bit) floatingpoint elements in a to packed 32bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_maskz_cvtt_roundpd_epu32^{⚠}  Experimentalx8664 and avx512f Convert packed doubleprecision (64bit) floatingpoint elements in a to packed unsigned 32bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_maskz_cvtt_roundps_epi32^{⚠}  Experimentalx8664 and avx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed 32bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_maskz_cvtt_roundps_epu32^{⚠}  Experimentalx8664 and avx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed unsigned 32bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_maskz_cvttpd_epi32^{⚠}  Experimentalx8664 and avx512f Convert packed doubleprecision (64bit) floatingpoint elements in a to packed 32bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvttpd_epu32^{⚠}  Experimentalx8664 and avx512f Convert packed doubleprecision (64bit) floatingpoint elements in a to packed unsigned 32bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvttps_epi32^{⚠}  Experimentalx8664 and avx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed 32bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_cvttps_epu32^{⚠}  Experimentalx8664 and avx512f Convert packed singleprecision (32bit) floatingpoint elements in a to packed unsigned 32bit integers with truncation, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
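The cvtt entries above convert "with truncation", i.e. rounding toward zero regardless of the current rounding mode, whereas the plain cvt forms use the active (or explicitly passed) rounding mode. A scalar sketch of the truncating conversion combined with a zeromask (illustrative name; the real intrinsic also has defined saturating behavior for out-of-range and NaN inputs that this sketch does not model):

```rust
/// Scalar model of `_mm512_maskz_cvttps_epi32`: convert toward zero,
/// zeroing lanes whose mask bit is clear.
fn maskz_cvttps_epi32(k: u16, a: [f32; 16]) -> [i32; 16] {
    // Rust's `as i32` on an in-range f32 truncates toward zero,
    // matching the `cvtt` behavior for in-range inputs.
    std::array::from_fn(|i| if (k >> i) & 1 == 1 { a[i] as i32 } else { 0 })
}
```

Truncation is why cvtt maps 2.9 to 2 and -1.7 to -1, while round-to-nearest-even would give 3 and -2.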
_mm512_maskz_div_pd^{⚠}  Experimentalx8664 and avx512f Divide packed doubleprecision (64bit) floatingpoint elements in a by packed elements in b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_div_ps^{⚠}  Experimentalx8664 and avx512f Divide packed singleprecision (32bit) floatingpoint elements in a by packed elements in b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_div_round_pd^{⚠}  Experimentalx8664 and avx512f Divide packed doubleprecision (64bit) floatingpoint elements in a by packed elements in b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_div_round_ps^{⚠}  Experimentalx8664 and avx512f Divide packed singleprecision (32bit) floatingpoint elements in a by packed elements in b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmadd_pd^{⚠}  Experimentalx8664 and avx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmadd_ps^{⚠}  Experimentalx8664 and avx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmadd_round_pd^{⚠}  Experimentalx8664 and avx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmadd_round_ps^{⚠}  Experimentalx8664 and avx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, add the intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmaddsub_pd^{⚠}  Experimentalx8664 and avx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, alternatively add and subtract packed elements in c to/from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmaddsub_ps^{⚠}  Experimentalx8664 and avx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, alternatively add and subtract packed elements in c to/from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmaddsub_round_pd^{⚠}  Experimentalx8664 and avx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, alternatively add and subtract packed elements in c to/from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmaddsub_round_ps^{⚠}  Experimentalx8664 and avx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, alternatively add and subtract packed elements in c to/from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmsub_pd^{⚠}  Experimentalx8664 and avx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmsub_ps^{⚠}  Experimentalx8664 and avx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmsub_round_pd^{⚠}  Experimentalx8664 and avx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmsub_round_ps^{⚠}  Experimentalx8664 and avx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, subtract packed elements in c from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmsubadd_pd^{⚠}  Experimentalx8664 and avx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, alternatively add and subtract packed elements in c to/from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmsubadd_ps^{⚠}  Experimentalx8664 and avx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, alternatively subtract and add packed elements in c from/to the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmsubadd_round_pd^{⚠}  Experimentalx8664 and avx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, alternatively add and subtract packed elements in c to/from the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fmsubadd_round_ps^{⚠}  Experimentalx8664 and avx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, alternatively subtract and add packed elements in c from/to the intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
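The "alternatively add and subtract" wording in the fmaddsub/fmsubadd entries above means the lane index decides the sign: for fmaddsub, even-indexed lanes compute a*b - c and odd-indexed lanes compute a*b + c (fmsubadd is the mirror image). A scalar sketch of the zeromasked fmaddsub form, assuming that lane parity convention (the function name is illustrative, and the real intrinsic fuses the multiply-add into a single rounding):

```rust
/// Scalar model of `_mm512_maskz_fmaddsub_ps`: even lanes subtract `c`,
/// odd lanes add `c`; masked-off lanes are zeroed.
fn maskz_fmaddsub_ps(k: u16, a: [f32; 16], b: [f32; 16], c: [f32; 16]) -> [f32; 16] {
    std::array::from_fn(|i| {
        if (k >> i) & 1 == 1 {
            if i % 2 == 0 { a[i] * b[i] - c[i] } else { a[i] * b[i] + c[i] }
        } else {
            0.0
        }
    })
}
```

This alternating pattern is what makes fmaddsub useful for complex multiplication, where real and imaginary parts need opposite signs in adjacent lanes.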
_mm512_maskz_fnmadd_pd^{⚠}  Experimentalx8664 and avx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fnmadd_ps^{⚠}  Experimentalx8664 and avx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fnmadd_round_pd^{⚠}  Experimentalx8664 and avx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fnmadd_round_ps^{⚠}  Experimentalx8664 and avx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, add the negated intermediate result to packed elements in c, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fnmsub_pd^{⚠}  Experimentalx8664 and avx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fnmsub_ps^{⚠}  Experimentalx8664 and avx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fnmsub_round_pd^{⚠}  Experimentalx8664 and avx512f Multiply packed doubleprecision (64bit) floatingpoint elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_fnmsub_round_ps^{⚠}  Experimentalx8664 and avx512f Multiply packed singleprecision (32bit) floatingpoint elements in a and b, subtract packed elements in c from the negated intermediate result, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_getexp_pd^{⚠}  Experimentalx8664 and avx512f Convert the exponent of each packed doubleprecision (64bit) floatingpoint element in a to a doubleprecision (64bit) floatingpoint number representing the integer exponent, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element. 
_mm512_maskz_getexp_ps^{⚠}  Experimentalx8664 and avx512f Convert the exponent of each packed singleprecision (32bit) floatingpoint element in a to a singleprecision (32bit) floatingpoint number representing the integer exponent, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element. 
_mm512_maskz_getexp_round_pd^{⚠}  Experimentalx8664 and avx512f Convert the exponent of each packed doubleprecision (64bit) floatingpoint element in a to a doubleprecision (64bit) floatingpoint number representing the integer exponent, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_maskz_getexp_round_ps^{⚠}  Experimentalx8664 and avx512f Convert the exponent of each packed singleprecision (32bit) floatingpoint element in a to a singleprecision (32bit) floatingpoint number representing the integer exponent, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates floor(log2(x)) for each element. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
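The getexp entries above state that the intrinsic "essentially calculates floor(log2(x))", i.e. it extracts the unbiased binary exponent and returns it as a floating-point value. A one-line scalar sketch of that relationship for normal, finite, nonzero inputs (the real intrinsic has additional defined results for zero, denormal, infinite, and NaN inputs that this sketch ignores):

```rust
/// Scalar model of the getexp semantics for a normal finite input:
/// the result is floor(log2(|x|)), returned as a float.
fn getexp_sketch(x: f64) -> f64 {
    x.abs().log2().floor()
}
```

So getexp maps 8.0 to 3.0 and 0.75 to -1.0, since 0.75 lies in the binade [2^-1, 2^0).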
_mm512_maskz_getmant_pd^{⚠}  Experimentalx8664 and avx512f Normalize the mantissas of packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. The mantissa is normalized to the interval specified by interv, which can take the following values: _MM_MANT_NORM_1_2 // interval [1, 2) _MM_MANT_NORM_p5_2 // interval [0.5, 2) _MM_MANT_NORM_p5_1 // interval [0.5, 1) _MM_MANT_NORM_p75_1p5 // interval [0.75, 1.5) The sign is determined by sc which can take the following values: _MM_MANT_SIGN_src // sign = sign(src) _MM_MANT_SIGN_zero // sign = 0 _MM_MANT_SIGN_nan // dst = NaN if sign(src) = 1 
_mm512_maskz_getmant_ps^{⚠}  Experimentalx8664 and avx512f Normalize the mantissas of packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. The mantissa is normalized to the interval specified by interv, which can take the following values: _MM_MANT_NORM_1_2 // interval [1, 2) _MM_MANT_NORM_p5_2 // interval [0.5, 2) _MM_MANT_NORM_p5_1 // interval [0.5, 1) _MM_MANT_NORM_p75_1p5 // interval [0.75, 1.5) The sign is determined by sc which can take the following values: _MM_MANT_SIGN_src // sign = sign(src) _MM_MANT_SIGN_zero // sign = 0 _MM_MANT_SIGN_nan // dst = NaN if sign(src) = 1 
_mm512_maskz_getmant_round_pd^{⚠}  Experimentalx8664 and avx512f Normalize the mantissas of packed doubleprecision (64bit) floatingpoint elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. The mantissa is normalized to the interval specified by interv, which can take the following values: _MM_MANT_NORM_1_2 // interval [1, 2) _MM_MANT_NORM_p5_2 // interval [0.5, 2) _MM_MANT_NORM_p5_1 // interval [0.5, 1) _MM_MANT_NORM_p75_1p5 // interval [0.75, 1.5) The sign is determined by sc which can take the following values: _MM_MANT_SIGN_src // sign = sign(src) _MM_MANT_SIGN_zero // sign = 0 _MM_MANT_SIGN_nan // dst = NaN if sign(src) = 1 Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_maskz_getmant_round_ps^{⚠}  Experimentalx8664 and avx512f Normalize the mantissas of packed singleprecision (32bit) floatingpoint elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). This intrinsic essentially calculates ±(2^k)*x.significand, where k depends on the interval range defined by interv and the sign depends on sc and the source sign. The mantissa is normalized to the interval specified by interv, which can take the following values: _MM_MANT_NORM_1_2 // interval [1, 2) _MM_MANT_NORM_p5_2 // interval [0.5, 2) _MM_MANT_NORM_p5_1 // interval [0.5, 1) _MM_MANT_NORM_p75_1p5 // interval [0.75, 1.5) The sign is determined by sc which can take the following values: _MM_MANT_SIGN_src // sign = sign(src) _MM_MANT_SIGN_zero // sign = 0 _MM_MANT_SIGN_nan // dst = NaN if sign(src) = 1 Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_maskz_max_epi32⚠  Experimental x86-64 and avx512f Compare packed signed 32-bit integers in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_max_epi64⚠  Experimental x86-64 and avx512f Compare packed signed 64-bit integers in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_max_epu32⚠  Experimental x86-64 and avx512f Compare packed unsigned 32-bit integers in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_max_epu64⚠  Experimental x86-64 and avx512f Compare packed unsigned 64-bit integers in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_max_pd⚠  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_max_ps⚠  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_max_round_pd⚠  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_maskz_max_round_ps⚠  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed maximum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_maskz_min_epi32⚠  Experimental x86-64 and avx512f Compare packed signed 32-bit integers in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_min_epi64⚠  Experimental x86-64 and avx512f Compare packed signed 64-bit integers in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_min_epu32⚠  Experimental x86-64 and avx512f Compare packed unsigned 32-bit integers in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_min_epu64⚠  Experimental x86-64 and avx512f Compare packed unsigned 64-bit integers in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_min_pd⚠  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_min_ps⚠  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_min_round_pd⚠  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_maskz_min_round_ps⚠  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed minimum values in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
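All of the maskz entries above share the same zeromasking behavior: lane i of the result is written only when bit i of the mask k is set, and is zeroed otherwise. A scalar sketch in plain Rust of that gating for the max case — the 8-lane helper name is illustrative, not the real intrinsic, which operates on __m512i and requires avx512f:

```rust
// Scalar model of _mm512_maskz_max_epi32's zeromask semantics on a
// hypothetical 8-lane i32 "vector": lanes whose mask bit is clear
// stay zero; lanes whose mask bit is set receive max(a, b).
fn maskz_max_epi32(k: u8, a: [i32; 8], b: [i32; 8]) -> [i32; 8] {
    let mut dst = [0i32; 8];
    for i in 0..8 {
        if k & (1 << i) != 0 {
            dst[i] = a[i].max(b[i]);
        } // else: lane is zeroed out (zeromask)
    }
    dst
}

fn main() {
    let a = [1, 5, 3, 7, -2, 9, 0, 4];
    let b = [2, 4, 6, 1, -5, 8, 3, 4];
    // mask 0b0000_1011: only lanes 0, 1, and 3 are written.
    assert_eq!(maskz_max_epi32(0b0000_1011, a, b), [2, 5, 0, 7, 0, 0, 0, 0]);
    println!("ok");
}
```

The mask-merge (`_mm512_mask_*`) variants differ only in that unselected lanes come from a src operand instead of being zeroed.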
_mm512_maskz_movedup_pd⚠  Experimental x86-64 and avx512f Duplicate even-indexed double-precision (64-bit) floating-point elements from a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_movehdup_ps⚠  Experimental x86-64 and avx512f Duplicate odd-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_moveldup_ps⚠  Experimental x86-64 and avx512f Duplicate even-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_mul_epi32⚠  Experimental x86-64 and avx512f Multiply the low signed 32-bit integers from each packed 64-bit element in a and b, and store the signed 64-bit results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_mul_epu32⚠  Experimental x86-64 and avx512f Multiply the low unsigned 32-bit integers from each packed 64-bit element in a and b, and store the unsigned 64-bit results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_mul_pd⚠  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_mul_ps⚠  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_mul_round_pd⚠  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_mul_round_ps⚠  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_mullo_epi32⚠  Experimental x86-64 and avx512f Multiply the packed 32-bit integers in a and b, producing intermediate 64-bit integers, and store the low 32 bits of the intermediate integers in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
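Per lane, mullo keeps only the low half of a widened product. A scalar sketch of that truncation — the helper name is illustrative, not the real API:

```rust
// Scalar model of one lane of _mm512_mullo_epi32: widen both 32-bit
// operands to 64 bits, multiply, and keep only the low 32 bits of the
// intermediate 64-bit product.
fn mullo_epi32_lane(a: i32, b: i32) -> i32 {
    ((a as i64) * (b as i64)) as i32
}

fn main() {
    assert_eq!(mullo_epi32_lane(3, 4), 12);          // no overflow: exact
    assert_eq!(mullo_epi32_lane(65_536, 65_536), 0); // product 2^32 truncates to 0
    println!("ok");
}
```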
_mm512_maskz_or_epi32⚠  Experimental x86-64 and avx512f Compute the bitwise OR of packed 32-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_or_epi64⚠  Experimental x86-64 and avx512f Compute the bitwise OR of packed 64-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permute_pd⚠  Experimental x86-64 and avx512f Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permute_ps⚠  Experimental x86-64 and avx512f Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permutevar_pd⚠  Experimental x86-64 and avx512f Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permutevar_ps⚠  Experimental x86-64 and avx512f Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permutex2var_epi32⚠  Experimental x86-64 and avx512f Shuffle 32-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permutex2var_epi64⚠  Experimental x86-64 and avx512f Shuffle 64-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permutex2var_pd⚠  Experimental x86-64 and avx512f Shuffle double-precision (64-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permutex2var_ps⚠  Experimental x86-64 and avx512f Shuffle single-precision (32-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permutex_epi64⚠  Experimental x86-64 and avx512f Shuffle 64-bit integers in a within 256-bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permutex_pd⚠  Experimental x86-64 and avx512f Shuffle double-precision (64-bit) floating-point elements in a within 256-bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permutexvar_epi32⚠  Experimental x86-64 and avx512f Shuffle 32-bit integers in a across lanes using the corresponding index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permutexvar_epi64⚠  Experimental x86-64 and avx512f Shuffle 64-bit integers in a across lanes using the corresponding index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permutexvar_pd⚠  Experimental x86-64 and avx512f Shuffle double-precision (64-bit) floating-point elements in a across lanes using the corresponding index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_permutexvar_ps⚠  Experimental x86-64 and avx512f Shuffle single-precision (32-bit) floating-point elements in a across lanes using the corresponding index in idx, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_rcp14_pd⚠  Experimental x86-64 and avx512f Compute the approximate reciprocal of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14. 
_mm512_maskz_rcp14_ps⚠  Experimental x86-64 and avx512f Compute the approximate reciprocal of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14. 
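rcp14 trades exactness for speed, but the relative error is bounded below 2^-14. A scalar sketch of checking an approximation against that bound — the helper name is illustrative:

```rust
// Check whether `approx` satisfies the rcp14 guarantee for 1.0/x:
// relative error strictly less than 2^-14 (about 6.1e-5).
fn within_rcp14_bound(approx: f64, x: f64) -> bool {
    let exact = 1.0 / x;
    ((approx - exact) / exact).abs() < 2f64.powi(-14)
}

fn main() {
    assert!(within_rcp14_bound(1.0 / 3.0, 3.0)); // the exact value trivially passes
    assert!(!within_rcp14_bound(0.34, 3.0));     // ~2% off: well outside the bound
    println!("ok");
}
```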
_mm512_maskz_rol_epi32⚠  Experimental x86-64 and avx512f Rotate the bits in each packed 32-bit integer in a to the left by the number of bits specified in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_rol_epi64⚠  Experimental x86-64 and avx512f Rotate the bits in each packed 64-bit integer in a to the left by the number of bits specified in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_rolv_epi32⚠  Experimental x86-64 and avx512f Rotate the bits in each packed 32-bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_rolv_epi64⚠  Experimental x86-64 and avx512f Rotate the bits in each packed 64-bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_ror_epi32⚠  Experimental x86-64 and avx512f Rotate the bits in each packed 32-bit integer in a to the right by the number of bits specified in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_ror_epi64⚠  Experimental x86-64 and avx512f Rotate the bits in each packed 64-bit integer in a to the right by the number of bits specified in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_rorv_epi32⚠  Experimental x86-64 and avx512f Rotate the bits in each packed 32-bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_rorv_epi64⚠  Experimental x86-64 and avx512f Rotate the bits in each packed 64-bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
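Rust's integer rotate methods match the per-lane rol/ror semantics directly. A scalar sketch for one 32-bit lane, assuming (as the hardware does) that rotation counts are taken modulo the lane width:

```rust
// One lane of rol/ror on a 32-bit element; counts wrap modulo 32.
fn rol_lane(x: u32, count: u32) -> u32 {
    x.rotate_left(count % 32)
}
fn ror_lane(x: u32, count: u32) -> u32 {
    x.rotate_right(count % 32)
}

fn main() {
    assert_eq!(rol_lane(0x8000_0000, 1), 1);    // top bit wraps around to bit 0
    assert_eq!(ror_lane(1, 1), 0x8000_0000);    // bit 0 wraps around to the top
    assert_eq!(rol_lane(0xDEAD_BEEF, 32), 0xDEAD_BEEF); // full rotation: unchanged
    println!("ok");
}
```

The imm8 forms (rol/ror) use one count for all lanes; the v forms (rolv/rorv) take a per-lane count from b.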
_mm512_maskz_rsqrt14_pd⚠  Experimental x86-64 and avx512f Compute the approximate reciprocal square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14. 
_mm512_maskz_rsqrt14_ps⚠  Experimental x86-64 and avx512f Compute the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). The maximum relative error for this approximation is less than 2^-14. 
_mm512_maskz_shuffle_epi32⚠  Experimental x86-64 and avx512f Shuffle 32-bit integers in a within 128-bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_shuffle_f32x4⚠  Experimental x86-64 and avx512f Shuffle 128 bits (composed of 4 single-precision (32-bit) floating-point elements) selected by imm8 from a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_shuffle_f64x2⚠  Experimental x86-64 and avx512f Shuffle 128 bits (composed of 2 double-precision (64-bit) floating-point elements) selected by imm8 from a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_shuffle_i32x4⚠  Experimental x86-64 and avx512f Shuffle 128 bits (composed of 4 32-bit integers) selected by imm8 from a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_shuffle_i64x2⚠  Experimental x86-64 and avx512f Shuffle 128 bits (composed of 2 64-bit integers) selected by imm8 from a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_shuffle_pd⚠  Experimental x86-64 and avx512f Shuffle double-precision (64-bit) floating-point elements within 128-bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_shuffle_ps⚠  Experimental x86-64 and avx512f Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sll_epi32⚠  Experimental x86-64 and avx512f Shift packed 32-bit integers in a left by count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sll_epi64⚠  Experimental x86-64 and avx512f Shift packed 64-bit integers in a left by count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_slli_epi32⚠  Experimental x86-64 and avx512f Shift packed 32-bit integers in a left by imm8 while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_slli_epi64⚠  Experimental x86-64 and avx512f Shift packed 64-bit integers in a left by imm8 while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sllv_epi32⚠  Experimental x86-64 and avx512f Shift packed 32-bit integers in a left by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sllv_epi64⚠  Experimental x86-64 and avx512f Shift packed 64-bit integers in a left by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sqrt_pd⚠  Experimental x86-64 and avx512f Compute the square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sqrt_ps⚠  Experimental x86-64 and avx512f Compute the square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sqrt_round_pd⚠  Experimental x86-64 and avx512f Compute the square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sqrt_round_ps⚠  Experimental x86-64 and avx512f Compute the square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sra_epi32⚠  Experimental x86-64 and avx512f Shift packed 32-bit integers in a right by count while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sra_epi64⚠  Experimental x86-64 and avx512f Shift packed 64-bit integers in a right by count while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_srai_epi32⚠  Experimental x86-64 and avx512f Shift packed 32-bit integers in a right by imm8 while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_srai_epi64⚠  Experimental x86-64 and avx512f Shift packed 64-bit integers in a right by imm8 while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_srav_epi32⚠  Experimental x86-64 and avx512f Shift packed 32-bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_srav_epi64⚠  Experimental x86-64 and avx512f Shift packed 64-bit integers in a right by the amount specified by the corresponding element in count while shifting in sign bits, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_srl_epi32⚠  Experimental x86-64 and avx512f Shift packed 32-bit integers in a right by count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_srl_epi64⚠  Experimental x86-64 and avx512f Shift packed 64-bit integers in a right by count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_srli_epi32⚠  Experimental x86-64 and avx512f Shift packed 32-bit integers in a right by imm8 while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_srli_epi64⚠  Experimental x86-64 and avx512f Shift packed 64-bit integers in a right by imm8 while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_srlv_epi32⚠  Experimental x86-64 and avx512f Shift packed 32-bit integers in a right by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_srlv_epi64⚠  Experimental x86-64 and avx512f Shift packed 64-bit integers in a right by the amount specified by the corresponding element in count while shifting in zeros, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
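The difference between the sra* and srl* families above is what gets shifted in from the left: copies of the sign bit (arithmetic) versus zeros (logical). A scalar sketch for one 32-bit lane, assuming the hardware behavior that counts of 32 or more saturate the result (all sign bits for sra, zero for srl):

```rust
// Arithmetic right shift: fills with the sign bit. For counts >= 32
// the lane becomes all sign bits, which clamping the count models.
fn sra_lane(x: i32, count: u32) -> i32 {
    x >> count.min(31)
}

// Logical right shift: fills with zeros. For counts >= 32 the lane
// becomes zero.
fn srl_lane(x: i32, count: u32) -> i32 {
    if count >= 32 { 0 } else { ((x as u32) >> count) as i32 }
}

fn main() {
    assert_eq!(sra_lane(-8, 1), -4);          // sign-filling: stays negative
    assert_eq!(srl_lane(-8, 1), 0x7FFF_FFFC); // zero-filling: large positive
    assert_eq!(sra_lane(-8, 40), -1);         // oversized count: all sign bits
    assert_eq!(srl_lane(-8, 40), 0);          // oversized count: zero
    println!("ok");
}
```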
_mm512_maskz_sub_epi32⚠  Experimental x86-64 and avx512f Subtract packed 32-bit integers in b from packed 32-bit integers in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sub_epi64⚠  Experimental x86-64 and avx512f Subtract packed 64-bit integers in b from packed 64-bit integers in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sub_pd⚠  Experimental x86-64 and avx512f Subtract packed double-precision (64-bit) floating-point elements in b from packed double-precision (64-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sub_ps⚠  Experimental x86-64 and avx512f Subtract packed single-precision (32-bit) floating-point elements in b from packed single-precision (32-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sub_round_pd⚠  Experimental x86-64 and avx512f Subtract packed double-precision (64-bit) floating-point elements in b from packed double-precision (64-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_sub_round_ps⚠  Experimental x86-64 and avx512f Subtract packed single-precision (32-bit) floating-point elements in b from packed single-precision (32-bit) floating-point elements in a, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_xor_epi32⚠  Experimental x86-64 and avx512f Compute the bitwise XOR of packed 32-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_maskz_xor_epi64⚠  Experimental x86-64 and avx512f Compute the bitwise XOR of packed 64-bit integers in a and b, and store the results in dst using zeromask k (elements are zeroed out when the corresponding mask bit is not set). 
_mm512_max_epi32⚠  Experimental x86-64 and avx512f Compare packed signed 32-bit integers in a and b, and store packed maximum values in dst. 
_mm512_max_epi64⚠  Experimental x86-64 and avx512f Compare packed signed 64-bit integers in a and b, and store packed maximum values in dst. 
_mm512_max_epu32⚠  Experimental x86-64 and avx512f Compare packed unsigned 32-bit integers in a and b, and store packed maximum values in dst. 
_mm512_max_epu64⚠  Experimental x86-64 and avx512f Compare packed unsigned 64-bit integers in a and b, and store packed maximum values in dst. 
_mm512_max_pd⚠  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed maximum values in dst. 
_mm512_max_ps⚠  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed maximum values in dst. 
_mm512_max_round_pd⚠  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed maximum values in dst. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_max_round_ps⚠  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed maximum values in dst. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_min_epi32⚠  Experimental x86-64 and avx512f Compare packed signed 32-bit integers in a and b, and store packed minimum values in dst. 
_mm512_min_epi64⚠  Experimental x86-64 and avx512f Compare packed signed 64-bit integers in a and b, and store packed minimum values in dst. 
_mm512_min_epu32⚠  Experimental x86-64 and avx512f Compare packed unsigned 32-bit integers in a and b, and store packed minimum values in dst. 
_mm512_min_epu64⚠  Experimental x86-64 and avx512f Compare packed unsigned 64-bit integers in a and b, and store packed minimum values in dst. 
_mm512_min_pd⚠  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed minimum values in dst. 
_mm512_min_ps⚠  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed minimum values in dst. 
_mm512_min_round_pd⚠  Experimental x86-64 and avx512f Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed minimum values in dst. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_min_round_ps⚠  Experimental x86-64 and avx512f Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed minimum values in dst. Exceptions can be suppressed by passing _MM_FROUND_NO_EXC in the sae parameter. 
_mm512_movedup_pd⚠  Experimental x86-64 and avx512f Duplicate even-indexed double-precision (64-bit) floating-point elements from a, and store the results in dst. 
_mm512_movehdup_ps⚠  Experimental x86-64 and avx512f Duplicate odd-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst. 
_mm512_moveldup_ps⚠  Experimental x86-64 and avx512f Duplicate even-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst. 
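The three duplicate intrinsics above each replicate alternating lanes into their neighbors. A scalar sketch on a 4-lane f32 array (the real registers hold 16 floats for the ps forms and 8 doubles for movedup_pd, but the pattern repeats):

```rust
// moveldup: each even-indexed lane is copied into the odd lane above it.
fn moveldup(a: [f32; 4]) -> [f32; 4] {
    [a[0], a[0], a[2], a[2]]
}

// movehdup: each odd-indexed lane is copied into the even lane below it.
fn movehdup(a: [f32; 4]) -> [f32; 4] {
    [a[1], a[1], a[3], a[3]]
}

fn main() {
    let a = [1.0, 2.0, 3.0, 4.0];
    assert_eq!(moveldup(a), [1.0, 1.0, 3.0, 3.0]);
    assert_eq!(movehdup(a), [2.0, 2.0, 4.0, 4.0]);
    println!("ok");
}
```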
_mm512_mul_epi32⚠  Experimental x86-64 and avx512f Multiply the low signed 32-bit integers from each packed 64-bit element in a and b, and store the signed 64-bit results in dst. 
_mm512_mul_epu32⚠  Experimental x86-64 and avx512f Multiply the low unsigned 32-bit integers from each packed 64-bit element in a and b, and store the unsigned 64-bit results in dst. 
_mm512_mul_pd⚠  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst. 
_mm512_mul_ps⚠  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst. 
_mm512_mul_round_pd⚠  Experimental x86-64 and avx512f Multiply packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst. 
_mm512_mul_round_ps⚠  Experimental x86-64 and avx512f Multiply packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst. 
_mm512_mullo_epi32⚠  Experimental x86-64 and avx512f Multiply the packed 32-bit integers in a and b, producing intermediate 64-bit integers, and store the low 32 bits of the intermediate integers in dst. 
_mm512_mullox_epi64⚠  Experimental x86-64 and avx512f Multiplies elements in packed 64-bit integer vectors a and b together, storing the lower 64 bits of the result in dst. 
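Keeping the lower 64 bits of a full 64×64 product is exactly wrapping multiplication. Per lane:

```rust
// One lane of _mm512_mullox_epi64: the low 64 bits of the 128-bit
// product, i.e. multiplication modulo 2^64.
fn mullox_lane(a: u64, b: u64) -> u64 {
    a.wrapping_mul(b)
}

fn main() {
    assert_eq!(mullox_lane(6, 7), 42);                  // no overflow: exact
    assert_eq!(mullox_lane(u64::MAX, 2), u64::MAX - 1); // 2^65 - 2 mod 2^64
    println!("ok");
}
```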
_mm512_or_epi32⚠  Experimental x86-64 and avx512f Compute the bitwise OR of packed 32-bit integers in a and b, and store the results in dst. 
_mm512_or_epi64⚠  Experimental x86-64 and avx512f Compute the bitwise OR of packed 64-bit integers in a and b, and store the result in dst. 
_mm512_or_si512⚠  Experimental x86-64 and avx512f Compute the bitwise OR of 512 bits (representing integer data) in a and b, and store the result in dst. 
_mm512_permute_pd⚠  Experimental x86-64 and avx512f Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst. 
_mm512_permute_ps⚠  Experimental x86-64 and avx512f Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst. 
_mm512_permutevar_epi32⚠  Experimental x86-64 and avx512f Shuffle 32-bit integers in a across lanes using the corresponding index in idx, and store the results in dst. Note that this intrinsic shuffles across 128-bit lanes, unlike past intrinsics that use the permutevar name. This intrinsic is identical to _mm512_permutexvar_epi32, and it is recommended that you use that intrinsic name. 
_mm512_permutevar_pd⚠  Experimental x86-64 and avx512f Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst. 
_mm512_permutevar_ps⚠  Experimental x86-64 and avx512f Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst. 
_mm512_permutex2var_epi32⚠  Experimental x86-64 and avx512f Shuffle 32-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst. 
_mm512_permutex2var_epi64⚠  Experimental x86-64 and avx512f Shuffle 64-bit integers in a and b across lanes using the corresponding selector and index in idx, and store the results in dst. 
_mm512_permutex2var_pd⚠  Experimental x86-64 and avx512f Shuffle double-precision (64-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst. 
_mm512_permutex2var_ps⚠  Experimental x86-64 and avx512f Shuffle single-precision (32-bit) floating-point elements in a and b across lanes using the corresponding selector and index in idx, and store the results in dst. 
_mm512_permutex_epi64⚠  Experimental x86-64 and avx512f Shuffle 64-bit integers in a within 256-bit lanes using the control in imm8, and store the results in dst. 
_mm512_permutex_pd⚠  Experimental x86-64 and avx512f Shuffle double-precision (64-bit) floating-point elements in a within 256-bit lanes using the control in imm8, and store the results in dst. 
_mm512_permutexvar_epi32⚠  Experimental x86-64 and avx512f Shuffle 32-bit integers in a across lanes using the corresponding index in idx, and store the results in dst. 
_mm512_permutexvar_epi64⚠  Experimental x86-64 and avx512f Shuffle 64-bit integers in a across lanes using the corresponding index in idx, and store the results in dst. 
_mm512_permutexvar_pd⚠  Experimental x86-64 and avx512f Shuffle double-precision (64-bit) floating-point elements in a across lanes using the corresponding index in idx, and store the results in dst. 
_mm512_permutexvar_ps⚠  Experimental · x86_64 and avx512f  Shuffle single-precision (32-bit) floating-point elements in a across lanes using the corresponding index in idx, and store the results in dst.
_mm512_rcp14_pd⚠  Experimental · x86_64 and avx512f  Compute the approximate reciprocal of packed double-precision (64-bit) floating-point elements in a, and store the results in dst. The maximum relative error for this approximation is less than 2^-14.
_mm512_rcp14_ps⚠  Experimental · x86_64 and avx512f  Compute the approximate reciprocal of packed single-precision (32-bit) floating-point elements in a, and store the results in dst. The maximum relative error for this approximation is less than 2^-14.
_mm512_rol_epi32⚠  Experimental · x86_64 and avx512f  Rotate the bits in each packed 32-bit integer in a to the left by the number of bits specified in imm8, and store the results in dst.
_mm512_rol_epi64⚠  Experimental · x86_64 and avx512f  Rotate the bits in each packed 64-bit integer in a to the left by the number of bits specified in imm8, and store the results in dst.
_mm512_rolv_epi32⚠  Experimental · x86_64 and avx512f  Rotate the bits in each packed 32-bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst.
_mm512_rolv_epi64⚠  Experimental · x86_64 and avx512f  Rotate the bits in each packed 64-bit integer in a to the left by the number of bits specified in the corresponding element of b, and store the results in dst.
_mm512_ror_epi32⚠  Experimental · x86_64 and avx512f  Rotate the bits in each packed 32-bit integer in a to the right by the number of bits specified in imm8, and store the results in dst.
_mm512_ror_epi64⚠  Experimental · x86_64 and avx512f  Rotate the bits in each packed 64-bit integer in a to the right by the number of bits specified in imm8, and store the results in dst.
_mm512_rorv_epi32⚠  Experimental · x86_64 and avx512f  Rotate the bits in each packed 32-bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst.
_mm512_rorv_epi64⚠  Experimental · x86_64 and avx512f  Rotate the bits in each packed 64-bit integer in a to the right by the number of bits specified in the corresponding element of b, and store the results in dst.
_mm512_rsqrt14_pd⚠  Experimental · x86_64 and avx512f  Compute the approximate reciprocal square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst. The maximum relative error for this approximation is less than 2^-14.
_mm512_rsqrt14_ps⚠  Experimental · x86_64 and avx512f  Compute the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst. The maximum relative error for this approximation is less than 2^-14.
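The variable-count rotates above match Rust's scalar `rotate_left`/`rotate_right`, with the count taken per lane (modulo the lane width). A scalar sketch of `_mm512_rolv_epi32`, using the hypothetical helper name `rolv_epi32_model`:

```rust
// Scalar model of _mm512_rolv_epi32: each 32-bit lane of `a` is rotated
// left by the count in the corresponding lane of `b`, modulo 32.
fn rolv_epi32_model(a: [u32; 16], b: [u32; 16]) -> [u32; 16] {
    core::array::from_fn(|i| a[i].rotate_left(b[i] % 32))
}

fn main() {
    let a = [0x8000_0001u32; 16];
    let mut b = [0u32; 16];
    b[0] = 1; // rotate lane 0 left by one bit
    let r = rolv_epi32_model(a, b);
    assert_eq!(r[0], 0x0000_0003); // the top bit wraps around to bit 0
    assert_eq!(r[1], 0x8000_0001); // count 0 leaves the lane unchanged
}
```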
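A coarse estimate such as `_mm512_rcp14_*` or `_mm512_rsqrt14_*` (relative error below 2^-14) is typically refined with one Newton-Raphson step, which roughly squares the relative error. A scalar sketch for the reciprocal case, where `newton_refine` is an illustrative helper (not part of the API) and the starting guess stands in for the hardware estimate:

```rust
// One Newton-Raphson refinement of a reciprocal estimate: y' = y * (2 - x*y).
// If y has relative error e, y' has relative error about e^2.
fn newton_refine(x: f64, y: f64) -> f64 {
    y * (2.0 - x * y)
}

fn main() {
    let x = 7.0f64;
    let exact = 1.0 / x;
    // Stand-in for the hardware estimate: off by 2^-14 relative error.
    let y0 = exact * (1.0 + 2.0f64.powi(-14));
    let y1 = newton_refine(x, y0);
    let err0 = ((y0 - exact) / exact).abs();
    let err1 = ((y1 - exact) / exact).abs();
    assert!(err1 < err0 * err0 * 1.01); // error roughly squared by one step
}
```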
_mm512_set1_epi32⚠  Experimental · x86_64 and avx512f  Broadcast 32-bit integer a to all elements of dst.
_mm512_set1_epi64⚠  Experimental · x86_64 and avx512f  Broadcast 64-bit integer a to all elements of dst.
_mm512_set1_pd⚠  Experimental · x86_64 and avx512f  Broadcast 64-bit float a to all elements of dst.
_mm512_set1_ps⚠  Experimental · x86_64 and avx512f  Broadcast 32-bit float a to all elements of dst.
_mm512_set_epi32⚠  Experimental · x86_64 and avx512f  Sets packed 32-bit integers in dst with the supplied values.
_mm512_set_epi64⚠  Experimental · x86_64 and avx512f  Sets packed 64-bit integers in dst with the supplied values.
_mm512_set_pd⚠  Experimental · x86_64 and avx512f  Sets packed double-precision (64-bit) floating-point elements in dst with the supplied values.
_mm512_set_ps⚠  Experimental · x86_64 and avx512f  Sets packed single-precision (32-bit) floating-point elements in dst with the supplied values.
_mm512_setr_epi32⚠  Experimental · x86_64 and avx512f  Sets packed 32-bit integers in dst with the supplied values in reverse order.
_mm512_setr_epi64⚠  Experimental · x86_64 and avx512f  Sets packed 64-bit integers in dst with the supplied values in reverse order.
_mm512_setr_pd⚠  Experimental · x86_64 and avx512f  Sets packed double-precision (64-bit) floating-point elements in dst with the supplied values in reverse order.
_mm512_setr_ps⚠  Experimental · x86_64 and avx512f  Sets packed single-precision (32-bit) floating-point elements in dst with the supplied values in reverse order.
_mm512_setzero_pd⚠  Experimental · x86_64 and avx512f  Returns vector of type __m512d with all elements set to zero.
_mm512_setzero_ps⚠  Experimental · x86_64 and avx512f  Returns vector of type __m512 with all elements set to zero.
_mm512_setzero_si512⚠  Experimental · x86_64 and avx512f  Returns vector of type __m512i with all elements set to zero.
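The difference between the `_mm512_set_*` and `_mm512_setr_*` constructors is argument order: `set` takes the highest lane first, while `setr` ("reversed") takes arguments in memory order, lowest lane first. A scalar model over four lanes (hypothetical helpers, shown at reduced width for brevity):

```rust
// Model of the set/setr argument ordering over 4 lanes instead of 16.
fn set_model(e3: i32, e2: i32, e1: i32, e0: i32) -> [i32; 4] {
    [e0, e1, e2, e3] // first argument lands in the highest lane
}
fn setr_model(e0: i32, e1: i32, e2: i32, e3: i32) -> [i32; 4] {
    [e0, e1, e2, e3] // arguments already in memory (lane) order
}

fn main() {
    // The same argument list produces lane-reversed results.
    assert_eq!(set_model(3, 2, 1, 0), [0, 1, 2, 3]);
    assert_eq!(setr_model(3, 2, 1, 0), [3, 2, 1, 0]);
}
```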
_mm512_shuffle_epi32⚠  Experimental · x86_64 and avx512f  Shuffle 32-bit integers in a within 128-bit lanes using the control in imm8, and store the results in dst.
_mm512_shuffle_f32x4⚠  Experimental · x86_64 and avx512f  Shuffle 128-bits (composed of 4 single-precision (32-bit) floating-point elements) selected by imm8 from a and b, and store the results in dst.
_mm512_shuffle_f64x2⚠  Experimental · x86_64 and avx512f  Shuffle 128-bits (composed of 2 double-precision (64-bit) floating-point elements) selected by imm8 from a and b, and store the results in dst.
_mm512_shuffle_i32x4⚠  Experimental · x86_64 and avx512f  Shuffle 128-bits (composed of 4 32-bit integers) selected by imm8 from a and b, and store the results in dst.
_mm512_shuffle_i64x2⚠  Experimental · x86_64 and avx512f  Shuffle 128-bits (composed of 2 64-bit integers) selected by imm8 from a and b, and store the results in dst.
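The `_mm512_shuffle_{f32x4,f64x2,i32x4,i64x2}` family treats the 512-bit vector as four 128-bit lanes and uses two imm8 bits per destination lane; the low two destination lanes select from a and the high two from b. A scalar sketch of `_mm512_shuffle_i32x4`, with `shuffle_i32x4_model` as a hypothetical helper:

```rust
// Scalar model of _mm512_shuffle_i32x4: each inner [u32; 4] is one
// 128-bit lane. Two imm8 bits per destination lane pick the source lane;
// destination lanes 0-1 come from `a`, lanes 2-3 from `b`.
fn shuffle_i32x4_model(a: [[u32; 4]; 4], b: [[u32; 4]; 4], imm8: u8) -> [[u32; 4]; 4] {
    let sel = |i: usize| ((imm8 >> (2 * i)) & 0b11) as usize;
    [a[sel(0)], a[sel(1)], b[sel(2)], b[sel(3)]]
}

fn main() {
    let a: [[u32; 4]; 4] = core::array::from_fn(|i| [i as u32; 4]);
    let b: [[u32; 4]; 4] = core::array::from_fn(|i| [10 + i as u32; 4]);
    // imm8 = 0b11_10_01_00: dst gets a[0], a[1], b[2], b[3].
    let r = shuffle_i32x4_model(a, b, 0b1110_0100);
    assert_eq!(r[0][0], 0);
    assert_eq!(r[1][0], 1);
    assert_eq!(r[2][0], 12);
    assert_eq!(r[3][0], 13);
}
```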