core::arch::x86_64

Function _mm_prefetch

Stable since 1.27.0
pub unsafe fn _mm_prefetch<const STRATEGY: i32>(p: *const i8)
Available on x86-64 with target feature sse only.

Fetch the cache line that contains address p using the given STRATEGY.

The STRATEGY must be one of:

  • _MM_HINT_T0: Fetch into all levels of the cache hierarchy.

  • _MM_HINT_T1: Fetch into L2 and higher.

  • _MM_HINT_T2: Fetch into L3 and higher or an implementation-specific choice (e.g., L2 if there is no L3).

  • _MM_HINT_NTA: Fetch data using the non-temporal access (NTA) hint. The data may be placed somewhere closer than main memory but outside of the cache hierarchy, which reduces access latency without polluting the cache.

  • _MM_HINT_ET0 and _MM_HINT_ET1: Similar to _MM_HINT_T0 and _MM_HINT_T1, but indicate that a write to the address is anticipated.

The actual implementation depends on the particular CPU. This instruction is considered a hint, so the CPU is also free to simply ignore the request.
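A minimal sketch of selecting a strategy, assuming the const-generic call form `_mm_prefetch::<STRATEGY>(p)`. The module `hint`, its functions `keep` and `stream`, and the function `demo` are hypothetical names introduced for illustration. The safe wrappers rely on the assumption that a prefetch never dereferences its pointer, and on other architectures they compile to nothing.

```rust
#[cfg(target_arch = "x86_64")]
mod hint {
    use std::arch::x86_64::{_mm_prefetch, _MM_HINT_NTA, _MM_HINT_T0};

    /// Hint that the line holding `*p` will be read soon and reused afterwards.
    pub fn keep<T>(p: *const T) {
        // Assumption: the instruction is only a hint and does not access `p`.
        unsafe { _mm_prefetch::<_MM_HINT_T0>(p as *const i8) }
    }

    /// Hint that the line holding `*p` will be read soon but is unlikely to be reused.
    pub fn stream<T>(p: *const T) {
        unsafe { _mm_prefetch::<_MM_HINT_NTA>(p as *const i8) }
    }
}

#[cfg(not(target_arch = "x86_64"))]
mod hint {
    // Fallback for other targets: the hints become no-ops.
    pub fn keep<T>(_p: *const T) {}
    pub fn stream<T>(_p: *const T) {}
}

/// Adds the first table entry to a byte checksum of `payload`.
/// The hints affect timing at most; the result is the same without them.
pub fn demo(table: &[u32], payload: &[u8]) -> u32 {
    hint::keep(table.as_ptr()); // table is expected to be reused
    hint::stream(payload.as_ptr()); // payload is read once
    let sum = payload
        .iter()
        .fold(0u32, |acc, &b| acc.wrapping_add(u32::from(b)));
    sum.wrapping_add(table.first().copied().unwrap_or(0))
}
```

Because the CPU may ignore the request, code like this should behave identically whether or not the hint takes effect; only measurement can show whether it helps.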

The amount of prefetched data depends on the cache line size of the specific CPU, but it will be at least 32 bytes.
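Since one hint covers a single cache line, a larger buffer needs one hint per line. The sketch below assumes a 64-byte line, which is common on current x86-64 CPUs but not guaranteed. `ASSUMED_LINE`, `lines_spanned`, `hint_line`, and `prefetch_span` are hypothetical names.

```rust
/// Assumed cache line size in bytes (an assumption, not a guarantee).
const ASSUMED_LINE: usize = 64;

/// Number of assumed-size cache lines touched by `len` bytes starting at `addr`.
pub fn lines_spanned(addr: usize, len: usize) -> usize {
    if len == 0 {
        return 0;
    }
    let first = addr / ASSUMED_LINE;
    let last = (addr + len - 1) / ASSUMED_LINE;
    last - first + 1
}

#[cfg(target_arch = "x86_64")]
fn hint_line(p: *const u8) {
    use std::arch::x86_64::{_mm_prefetch, _MM_HINT_T0};
    // Assumption: the instruction is only a hint and does not access `p`.
    unsafe { _mm_prefetch::<_MM_HINT_T0>(p as *const i8) }
}

#[cfg(not(target_arch = "x86_64"))]
fn hint_line(_p: *const u8) {}

/// Issues one hint per assumed cache line covered by `bytes` and returns the hint count.
pub fn prefetch_span(bytes: &[u8]) -> usize {
    let n = lines_spanned(bytes.as_ptr() as usize, bytes.len());
    for k in 0..n {
        // Any address inside a line selects that line, so clamp to the last byte
        // to stay within the slice while still reaching the final line.
        let offset = (k * ASSUMED_LINE).min(bytes.len() - 1);
        hint_line(bytes.as_ptr().wrapping_add(offset));
    }
    n
}
```

A buffer that does not start on a line boundary can straddle one more line than its length alone suggests, which is why the count is derived from the address as well as the length.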

Common caveats:

  • Most modern CPUs already automatically prefetch data based on predicted access patterns.

  • Data is usually not fetched if this would cause a TLB miss or a page fault.

  • Too much prefetching can cause unnecessary cache evictions.

  • Prefetching may also fail if there are not enough memory-subsystem resources (e.g., request buffers).
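Given these caveats, explicit hints tend to be most plausible for access patterns the hardware cannot predict, such as indexed lookups. The sketch below hints the table slot that will be needed a few iterations later. `LOOKAHEAD`, `hint_read`, and `gather_sum` are hypothetical names, and the look-ahead distance of 8 is an arbitrary starting point that would need tuning by measurement.

```rust
/// Hypothetical look-ahead distance, in loop iterations.
const LOOKAHEAD: usize = 8;

#[cfg(target_arch = "x86_64")]
fn hint_read(p: *const u64) {
    use std::arch::x86_64::{_mm_prefetch, _MM_HINT_T0};
    // Assumption: the instruction is only a hint and does not access `p`.
    unsafe { _mm_prefetch::<_MM_HINT_T0>(p as *const i8) }
}

#[cfg(not(target_arch = "x86_64"))]
fn hint_read(_p: *const u64) {}

/// Sums `table[i]` for every `i` in `indices`, hinting future slots ahead of use.
/// Panics if an index is out of range, exactly as plain indexing would.
pub fn gather_sum(table: &[u64], indices: &[usize]) -> u64 {
    let mut total = 0u64;
    for (k, &i) in indices.iter().enumerate() {
        // Look up the index that will be used LOOKAHEAD iterations from now.
        if let Some(&future) = indices.get(k + LOOKAHEAD) {
            // `get` skips the hint instead of panicking on a bad future index.
            if let Some(slot) = table.get(future) {
                hint_read(slot as *const u64);
            }
        }
        total = total.wrapping_add(table[i]);
    }
    total
}
```

Issuing one hint per iteration keeps the number of outstanding requests small, which matters because excessive prefetching can evict useful lines or exhaust request buffers.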

See Intel’s documentation for details.