  • SIMD

    SSE Arithmetic

    ADDPS (parallel) and ADDSS (scalar) add the pair of operands. SUBPS (parallel) and SUBSS (scalar) subtract the pair of operands. MULPS (parallel) and MULSS (scalar) multiply the pair of operands. DIVPS (parallel) and DIVSS (scalar) divide the pair of operands. SQRTPS (parallel) and SQRTSS (scalar) return the square root of the…

  • SIMD

    SSE2 and MMX

    I’m quite sure that Intel would not like to see SSE2 named 128-bit MMX. In fact, MMX has a bad reputation: the Intel marketing hype pushed it as a universal solution to multimedia requirements, but at the same time the gaming industry switched from mostly 2D games to Virtual Reality-like…

  • SIMD

    MMX Shift

    The logical shift left, logical shift right and arithmetic shift right instructions shift each element by a specified number of bits. The logical left and right shifts also enable a 64-bit quantity (quadword) to be shifted as one block, assisting in data type conversions and alignment operations. PSLLW mm,…

  • SIMD

    MMX Logical

    PAND mm, mm/m64; PANDN mm, mm/m64; POR mm, mm/m64; PXOR mm, mm/m64. The PAND (Bitwise Logical And), PANDN (Bitwise Logical And Not), POR (Bitwise Logical OR), and PXOR (Bitwise Logical Exclusive OR) instructions perform bitwise logical operations on 64-bit quantities. The destination operand is an MMX register, while the source…

  • SIMD

    MMX Examples

    This section describes example uses of the MMX instruction set to implement basic coding structures. Conditional Select: Operating on multiple data operands using a single instruction presents an interesting issue: what happens when a computation is only done if the operand value passes some conditional check? For example, in an…

  • SIMD

    MMX EMMS

    The EMMS instruction empties the MMX state. This instruction must be used to clear the MMX state (i.e. empty the floating-point tag word) at the end of an MMX routine before calling other routines that can execute floating-point instructions. If a floating-point instruction loads into one of the registers before…

  • SIMD

    MMX Data Transfer

    When stored in memory, the bytes, words, and doublewords in the packed data types are stored in consecutive addresses, with the least significant byte, word, or doubleword being stored in the lowest address and the more significant bytes, words, or doublewords being stored at consecutively higher addresses. The ordering of…

  • SIMD

    MMX Conversion

    There are several cases where elements of packed data may need to be repositioned within the packed data, or the elements of two packed data operands may need to be merged. There are cases where either the input or the desired output representation of the data may not be ideal…

  • SIMD

    MMX Comparison

    These instructions generate a mask of ones or zeros which can be used by logical operations to select elements within a register: a developer can implement a packed conditional move operation without a set of branch instructions.
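
    The mask-and-select idea can be sketched in portable C. This is only a scalar model of the semantics, not MMX code: the four-element arrays stand in for the four 16-bit words of one MMX register, and the helper names (cmpgt_w, select_w, max_w) are hypothetical. The compare step mirrors what PCMPGTW produces, and the select step mirrors a PAND/PANDN/POR sequence.

```c
#include <stdint.h>

#define LANES 4  /* four 16-bit words fill one 64-bit MMX register */

/* Model of a packed signed compare (as PCMPGTW does):
   each lane becomes all ones if a > b, otherwise all zeros. */
static void cmpgt_w(const int16_t a[LANES], const int16_t b[LANES],
                    uint16_t mask[LANES]) {
    for (int i = 0; i < LANES; i++)
        mask[i] = (a[i] > b[i]) ? 0xFFFFu : 0x0000u;
}

/* Branch-free select: out = (mask AND x) OR ((NOT mask) AND y),
   the same combination a PAND / PANDN / POR sequence computes. */
static void select_w(const uint16_t mask[LANES], const int16_t x[LANES],
                     const int16_t y[LANES], int16_t out[LANES]) {
    for (int i = 0; i < LANES; i++) {
        uint16_t bits = (uint16_t)((mask[i] & (uint16_t)x[i]) |
                                   ((uint16_t)~mask[i] & (uint16_t)y[i]));
        out[i] = (int16_t)bits;
    }
}

/* Lane-wise maximum built from the two steps above
   (hypothetical helper, for illustration only). */
static void max_w(const int16_t a[LANES], const int16_t b[LANES],
                  int16_t out[LANES]) {
    uint16_t mask[LANES];
    cmpgt_w(a, b, mask);
    select_w(mask, a, b, out);
}
```

    Calling max_w on {5, -3, 100, 0} and {2, 7, 100, -1} yields {5, 7, 100, 0}: every lane picks its result through the mask alone, with no per-element branch in the select step.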

  • SIMD

    MMX Arithmetic

    The MMX technology supports both saturating and wraparound modes. In wraparound mode, results that overflow or underflow are truncated and only the lower (least significant) bits of the result are returned. In saturation mode, results of an operation that overflow or underflow are clipped (saturated) to a data-range limit for…
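
    The difference between the two modes can be shown on a single byte lane. The sketch below is plain C that models the behaviour of one element, not MMX code; the function names are hypothetical, and the instruction names in the comments (PADDB, PADDUSB, PADDSB) indicate which packed byte add each function is meant to imitate.

```c
#include <stdint.h>

/* Wraparound add of unsigned bytes (one lane of PADDB):
   the sum is truncated and only the low 8 bits are kept. */
static uint8_t add_wrap_u8(uint8_t a, uint8_t b) {
    return (uint8_t)(a + b);
}

/* Unsigned saturating add (one lane of PADDUSB):
   a sum above 255 is clipped to 255. */
static uint8_t add_sat_u8(uint8_t a, uint8_t b) {
    unsigned sum = (unsigned)a + b;
    return (uint8_t)(sum > 255u ? 255u : sum);
}

/* Signed saturating add (one lane of PADDSB):
   the sum is clipped to the range [-128, 127]. */
static int8_t add_sat_s8(int8_t a, int8_t b) {
    int sum = (int)a + (int)b;
    if (sum > 127)  sum = 127;
    if (sum < -128) sum = -128;
    return (int8_t)sum;
}
```

    With inputs 200 and 100, the wraparound version returns 44 (300 modulo 256), while the unsigned saturating version returns 255. For pixel data the saturated result is usually the wanted one, since a bright value stays bright instead of wrapping to a dark one.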