
URL: https://www.phoronix.com/news/GCC-16-x86-Inline-Memmove




GCC 16 Lands Improved Memmove Behavior For x86/x86_64 CPUs

Written by Michael Larabel in GNU on 3 November 2025 at 06:17 AM EST.
H.J. Lu, a long-time compiler expert at Intel, today merged improved memmove() behavior into the GNU Compiler Collection ahead of the upcoming GCC 16 release.

The GCC x86/x86_64 change inlines memmove using overlapping unaligned loads and stores. H.J. Lu laid out the approach in the patch description:
"x86-64: Inline memmove with overlapping unaligned loads and stores

Inline memmove in 64-bit since there are much less registers available in 32-bit:

1. Load all sources into registers and store them together to avoid possible address overlap between source and destination.
2. For known size, first try to fully unroll with 8 registers.
3. For size <= 2 * MOVE_MAX, load all sources into 2 registers first and then store them together.
4. For size > 2 * MOVE_MAX and size <= 4 * MOVE_MAX, load all sources into 4 registers first and then store them together.
5. For size > 4 * MOVE_MAX and size <= 8 * MOVE_MAX, load all sources into 8 registers first and then store them together.
6. For size > 8 * MOVE_MAX,
a. If address of destination > address of source, copy backward with a 4 * MOVE_MAX loop with unaligned loads and stores. Load the first 4 * MOVE_MAX into 4 registers before the loop and store them after the loop to support overlapping addresses.
b. Otherwise, copy forward with a 4 * MOVE_MAX loop with unaligned loads and stores. Load the last 4 * MOVE_MAX into 4 registers before the loop and store them after the loop to support overlapping addresses.

Verified and benchmarked memmove implementations inlined with GPR, SSE2, AVX2 and AVX512 using glibc memmove tests.
...
Their performances are comparable with optimized memmove implementations in glibc on Intel Core i7-1195G7."
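
The strategy in the patch description can be sketched in portable C. This is only an illustration of the copy order, not the code GCC emits: `CHUNK` is a hypothetical stand-in for `MOVE_MAX` fixed at one 8-byte general-purpose register, and `sketch_memmove`, `move_ends`, and `move_loop` are made-up names. It also handles sizes below one chunk with a small temporary buffer, which is an assumption of this sketch rather than something the patch notes describe.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for MOVE_MAX: one 8-byte GPR move. */
#define CHUNK 8u

/* Unaligned load/store of one chunk via fixed-size memcpy. */
static uint64_t load_chunk(const unsigned char *p)
{
    uint64_t v;
    memcpy(&v, p, CHUNK);
    return v;
}

static void store_chunk(unsigned char *p, uint64_t v)
{
    memcpy(p, &v, CHUNK);
}

/* For k*CHUNK <= n <= 2*k*CHUNK: load k chunks from each end (they may
   overlap each other), then store them. Every load happens before any
   store, so overlap between src and dst is harmless. */
static void move_ends(unsigned char *dst, const unsigned char *src,
                      size_t n, size_t k)
{
    uint64_t head[4], tail[4];
    for (size_t i = 0; i < k; i++) {
        head[i] = load_chunk(src + i * CHUNK);
        tail[i] = load_chunk(src + n - (i + 1) * CHUNK);
    }
    for (size_t i = 0; i < k; i++) {
        store_chunk(dst + i * CHUNK, head[i]);
        store_chunk(dst + n - (i + 1) * CHUNK, tail[i]);
    }
}

/* For n > 8*CHUNK: loop in 4*CHUNK steps. The block the loop may not
   fully cover (first block when copying backward, last block when
   copying forward) is loaded up front and stored after the loop. */
static void move_loop(unsigned char *dst, const unsigned char *src, size_t n)
{
    const size_t step = 4 * CHUNK;
    int backward = (uintptr_t)dst > (uintptr_t)src;
    size_t saved_off = backward ? 0 : n - step;
    uint64_t saved[4], v[4];

    for (size_t i = 0; i < 4; i++)
        saved[i] = load_chunk(src + saved_off + i * CHUNK);

    if (backward) {
        for (size_t end = n; end > step; end -= step) {
            for (size_t i = 0; i < 4; i++)
                v[i] = load_chunk(src + end - step + i * CHUNK);
            for (size_t i = 0; i < 4; i++)
                store_chunk(dst + end - step + i * CHUNK, v[i]);
        }
    } else {
        for (size_t pos = 0; pos + step < n; pos += step) {
            for (size_t i = 0; i < 4; i++)
                v[i] = load_chunk(src + pos + i * CHUNK);
            for (size_t i = 0; i < 4; i++)
                store_chunk(dst + pos + i * CHUNK, v[i]);
        }
    }

    for (size_t i = 0; i < 4; i++)
        store_chunk(dst + saved_off + i * CHUNK, saved[i]);
}

/* Size dispatch mirroring steps 3 to 6 of the patch description. */
void *sketch_memmove(void *dstv, const void *srcv, size_t n)
{
    unsigned char *dst = dstv;
    const unsigned char *src = srcv;

    if (n < CHUNK) {
        /* Assumption of this sketch: tiny sizes go through a buffer. */
        unsigned char tmp[CHUNK];
        memcpy(tmp, src, n);
        memcpy(dst, tmp, n);
    } else if (n <= 2 * CHUNK) {
        move_ends(dst, src, n, 1);   /* 2 registers */
    } else if (n <= 4 * CHUNK) {
        move_ends(dst, src, n, 2);   /* 4 registers */
    } else if (n <= 8 * CHUNK) {
        move_ends(dst, src, n, 4);   /* 8 registers */
    } else {
        move_loop(dst, src, n);
    }
    return dstv;
}
```

The point of the overlapping loads is that a size that is not a multiple of the chunk width needs no byte-by-byte tail: the last chunk is simply read from `src + n - CHUNK`, even if it re-reads bytes the previous chunk already covered. In the real patch the same idea would apply with wider SSE2, AVX2, or AVX512 registers in place of the 8-byte chunk used here.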

The code was merged this morning ahead of GCC 16's stage 3 milestone this month. GCC 16.1, the first stable release of GCC 16, should be out around March or April.

Michael Larabel is the principal author of Phoronix.com and founded the site in 2004 with a focus on enriching the Linux hardware experience. Michael has written more than 20,000 articles covering the state of Linux hardware support, Linux performance, graphics drivers, and other topics. Michael is also the lead developer of the Phoronix Test Suite, Phoromatic, and OpenBenchmarking.org automated benchmarking software. He can be followed via Twitter, LinkedIn, or contacted via MichaelLarabel.com.