  • SIMD on x64/x86

    MMX Primer: Packed Integer SIMD on Early x86 CPUs

    MMX was the first widely adopted SIMD instruction set on x86 processors. It was introduced by Intel in the Pentium MMX generation and later supported by AMD and other x86-compatible processors. At the time, it was a major step forward for multimedia and communications software because it allowed one instruction…

  • SIMD on x64/x86

    Programming models

    Any computer, whether sequential or parallel, operates by executing instructions on data. A stream of instructions (the algorithm) tells the computer what to do at each step. A stream of data (the input to the algorithm) is affected by these instructions. A widely used classification of parallel systems, due to…

  • SIMD on x64/x86

    SSE State Management: MXCSR, FXSAVE, FXRSTOR, and FP control

    SSE state management is the part of SIMD programming concerned with the processor state used by SSE floating-point instructions. Most SSE code does not need explicit state management. If you are writing ordinary code with intrinsics such as _mm_add_ps, _mm_mul_ps, _mm_loadu_ps, and _mm_storeu_ps, the compiler, operating system, and calling convention…
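
As an illustration of the "ordinary code" case described above, here is a minimal sketch that uses only the four intrinsics named in the excerpt and never touches MXCSR or any save/restore instruction. The function name `mul_add_ps` and its signature are hypothetical, chosen for this example only.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* SSE: _mm_loadu_ps, _mm_mul_ps, _mm_add_ps, _mm_storeu_ps */

/* Hypothetical helper: out[i] = a[i] * b[i] + c[i] for n floats.
   No explicit state management appears here; the code relies on the
   floating-point environment already set up for the process. */
void mul_add_ps(const float *a, const float *b, const float *c,
                float *out, size_t n)
{
    size_t i = 0;

    /* Process four floats per iteration with unaligned loads and stores. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        __m128 vc = _mm_loadu_ps(c + i);
        _mm_storeu_ps(out + i, _mm_add_ps(_mm_mul_ps(va, vb), vc));
    }

    /* Scalar tail for the remaining 0..3 elements. */
    for (; i < n; ++i)
        out[i] = a[i] * b[i] + c[i];
}
```

Calling `mul_add_ps(a, b, c, out, n)` on plain `float` arrays is all a caller needs to do; the arrays do not have to be 16-byte aligned because the sketch uses the unaligned load and store forms.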

  • SIMD on x64/x86

    SSE Shuffle

    SHUFPS can place any two of the four single-precision floating-point (SP FP) numbers from the first source operand into the lower two destination fields; the upper two destination fields are filled with any two of the four SP FP numbers from the second source operand. By using the same register for both sources,…
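
The lane selection described above can be sketched with the `_mm_shuffle_ps` intrinsic, which compilers typically lower to SHUFPS. The helper names `mix_high_low` and `reverse_ps` are hypothetical. The same-register case shown in `reverse_ps` is only one possible use, since the excerpt is cut off before it says what the author does with it.

```c
#include <xmmintrin.h>  /* SSE: _mm_shuffle_ps, _MM_SHUFFLE */

/* _MM_SHUFFLE(d, c, b, a) builds the 8-bit selector:
     result[0] = first[a],  result[1] = first[b]   (lower two fields)
     result[2] = second[c], result[3] = second[d]  (upper two fields) */

/* Hypothetical helper: returns { a[2], a[3], b[0], b[1] }. */
__m128 mix_high_low(__m128 a, __m128 b)
{
    return _mm_shuffle_ps(a, b, _MM_SHUFFLE(1, 0, 3, 2));
}

/* Hypothetical helper: passing the same register as both sources
   allows any arrangement of its four lanes, here a full reversal:
   returns { v[3], v[2], v[1], v[0] }. */
__m128 reverse_ps(__m128 v)
{
    return _mm_shuffle_ps(v, v, _MM_SHUFFLE(0, 1, 2, 3));
}
```

The selector must be a compile-time constant, which is why each arrangement gets its own small function rather than a runtime parameter.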

  • SIMD on x64/x86

    SSE Reciprocal

    Division and square-root computation are basic building blocks of geometry code. For instance, a perspective transformation divides each x, y, z coordinate by the perspective coordinate W, and normalization, another common geometry operation, requires computing a reciprocal square root (1/sqrt). To optimize these cases, SSE introduces two…
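
The two geometry operations mentioned above can be sketched as follows. The excerpt is cut off before it names what SSE introduces, so the use of the approximate intrinsics `_mm_rcp_ps` and `_mm_rsqrt_ps` here is an assumption, and the helper names are hypothetical. Both intrinsics return low-precision estimates (roughly 12 bits), so results are close to, but not exactly, the true quotient.

```c
#include <xmmintrin.h>  /* SSE: _mm_rcp_ps, _mm_rsqrt_ps, _mm_shuffle_ps */

/* Hypothetical helper: divides all four lanes (x, y, z, w) by w using
   an approximate reciprocal. Assumes w is nonzero. */
__m128 perspective_divide_approx(__m128 xyzw)
{
    /* Broadcast lane 3 (w) to all four lanes. */
    __m128 w = _mm_shuffle_ps(xyzw, xyzw, _MM_SHUFFLE(3, 3, 3, 3));
    return _mm_mul_ps(xyzw, _mm_rcp_ps(w));
}

/* Hypothetical helper: normalizes the 3-vector held in lanes 0..2 using
   an approximate reciprocal square root. Lane 3 is ignored when computing
   the length and is scaled along with the other lanes.
   Assumes the vector is not all zeros. */
__m128 normalize3_approx(__m128 v)
{
    __m128 sq = _mm_mul_ps(v, v);
    __m128 xx = _mm_shuffle_ps(sq, sq, _MM_SHUFFLE(0, 0, 0, 0));
    __m128 yy = _mm_shuffle_ps(sq, sq, _MM_SHUFFLE(1, 1, 1, 1));
    __m128 zz = _mm_shuffle_ps(sq, sq, _MM_SHUFFLE(2, 2, 2, 2));
    __m128 len2 = _mm_add_ps(xx, _mm_add_ps(yy, zz)); /* x^2+y^2+z^2 in every lane */
    return _mm_mul_ps(v, _mm_rsqrt_ps(len2));
}
```

When full single-precision accuracy is required, the exact `_mm_div_ps` and `_mm_sqrt_ps` intrinsics can be substituted at the cost of higher latency.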