SSE Arithmetic

ADDPS (parallel) and ADDSS (scalar) add the pair of operands. SUBPS (parallel) and SUBSS (scalar) subtract the pair of operands. MULPS (parallel) and MULSS (scalar) multiply the pair of operands. DIVPS (parallel) and DIVSS (scalar) divide the pair of operands. SQRTPS…
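
To make the parallel/scalar distinction concrete, here is a minimal plain C sketch that models the lane behaviour of ADDPS and ADDSS on a four-float vector. The `vec4` type and the `addps_model` and `addss_model` names are hypothetical, introduced only for this illustration; real code would use the instructions themselves or the compiler's intrinsics.

```c
#include <stddef.h>

/* Hypothetical model of a 128-bit SSE register holding four floats. */
typedef struct { float f[4]; } vec4;

/* Models ADDPS: all four lanes are added independently. */
vec4 addps_model(vec4 dst, vec4 src) {
    for (size_t i = 0; i < 4; ++i)
        dst.f[i] += src.f[i];
    return dst;
}

/* Models ADDSS: only lane 0 is added; lanes 1..3 of the
   destination pass through unchanged. */
vec4 addss_model(vec4 dst, vec4 src) {
    dst.f[0] += src.f[0];
    return dst;
}
```

The same pattern (all lanes versus lowest lane only) carries over to the SUB, MUL and DIV pairs.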

SSE2 and MMX

layout-sse2

I’m quite sure that Intel would not like to see SSE2 named 128-bit MMX. In fact, MMX has a bad reputation: the Intel marketing hype pushed it as a universal solution to multimedia requirements, but at the same time the…

MMX Shift

The logical shift left, logical shift right and arithmetic shift right instructions shift each element by a specified number of bits. The logical left and right shifts also enable a 64-bit quantity (quadword) to be shifted as one block, assisting…
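
The per-element shifts can be sketched in portable C as follows. This is only a model of the word-sized forms (PSLLW, PSRLW, PSRAW) under the assumption that a register is viewed as four 16-bit elements; the `mmx_words` type and the function names are hypothetical. Note how the logical shifts produce zero once the count exceeds 15, while the arithmetic shift keeps replicating the sign bit.

```c
#include <stdint.h>

/* Hypothetical view of a 64-bit MMX register as four 16-bit words. */
typedef struct { uint16_t w[4]; } mmx_words;

/* Models PSLLW: logical left shift of each word; counts above 15 give 0. */
mmx_words psllw_model(mmx_words v, unsigned count) {
    for (int i = 0; i < 4; ++i)
        v.w[i] = count > 15 ? 0 : (uint16_t)((uint32_t)v.w[i] << count);
    return v;
}

/* Models PSRLW: logical right shift of each word; counts above 15 give 0. */
mmx_words psrlw_model(mmx_words v, unsigned count) {
    for (int i = 0; i < 4; ++i)
        v.w[i] = count > 15 ? 0 : (uint16_t)(v.w[i] >> count);
    return v;
}

/* Models PSRAW: arithmetic right shift; vacated bits are filled with
   the sign bit, and counts above 15 behave like a count of 15. */
mmx_words psraw_model(mmx_words v, unsigned count) {
    if (count > 15)
        count = 15;
    for (int i = 0; i < 4; ++i) {
        uint16_t shifted = (uint16_t)(v.w[i] >> count);
        uint16_t fill = (v.w[i] & 0x8000u)
            ? (uint16_t)(UINT32_C(0xFFFF) << (16 - count))
            : 0;
        v.w[i] = (uint16_t)(shifted | fill);
    }
    return v;
}
```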

MMX Logical

PAND mm, mm/m64
PANDN mm, mm/m64
POR mm, mm/m64
PXOR mm, mm/m64

The PAND (Bitwise Logical And), PANDN (Bitwise Logical And Not), POR (Bitwise Logical OR), and PXOR (Bitwise Logical Exclusive OR) instructions perform bitwise logical operations on 64-bit quantities.…
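
As a rough sketch, the four operations map onto ordinary C bitwise operators applied to a 64-bit value. The function names below are hypothetical. The one detail worth spelling out is PANDN, which inverts the destination operand (not the source) before the AND.

```c
#include <stdint.h>

/* Each function models "OP dst, src" and returns the new dst value. */
uint64_t pand_model(uint64_t dst, uint64_t src)  { return dst & src; }

/* PANDN computes (NOT dst) AND src. */
uint64_t pandn_model(uint64_t dst, uint64_t src) { return ~dst & src; }

uint64_t por_model(uint64_t dst, uint64_t src)   { return dst | src; }

uint64_t pxor_model(uint64_t dst, uint64_t src)  { return dst ^ src; }
```

A common idiom that follows directly from the XOR definition is clearing a register with PXOR of the register against itself.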

MMX Examples

This section describes example uses of the MMX instruction set to implement basic coding structures.

Conditional Select

Operating on multiple data operands using a single instruction presents an interesting issue: what happens when a computation is only done if the…
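
One common way to handle a per-element condition without branches is to build a mask with a packed compare and then combine the two candidate results with AND, AND NOT and OR. The sketch below models that idea in portable C for four signed 16-bit elements, using a per-element maximum as the example. The `lanes16` type and the function names are hypothetical, and the bit tricks assume two's complement integers.

```c
#include <stdint.h>

/* Hypothetical view of an MMX register as four signed 16-bit elements. */
typedef struct { int16_t w[4]; } lanes16;

/* Models a packed greater-than compare: each element becomes
   all ones (-1) where a > b and all zeros otherwise. */
lanes16 cmpgt_model(lanes16 a, lanes16 b) {
    lanes16 m;
    for (int i = 0; i < 4; ++i)
        m.w[i] = (a.w[i] > b.w[i]) ? -1 : 0;
    return m;
}

/* Branch-free select: takes a where the mask is set, b elsewhere.
   This mirrors the AND / AND NOT / OR instruction sequence. */
lanes16 select_model(lanes16 mask, lanes16 a, lanes16 b) {
    lanes16 r;
    for (int i = 0; i < 4; ++i)
        r.w[i] = (int16_t)((mask.w[i] & a.w[i]) | (~mask.w[i] & b.w[i]));
    return r;
}

/* Per-element maximum built from the two steps above. */
lanes16 max_model(lanes16 a, lanes16 b) {
    return select_model(cmpgt_model(a, b), a, b);
}
```

Both candidate values are always computed; the mask only decides which one survives in each element.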

MMX EMMS

The EMMS instruction empties the MMX state. This instruction must be used to clear the MMX state (i.e. empty the floating-point tag word) at the end of an MMX routine before calling other routines that can execute floating-point instructions. If…

MMX Data Transfer

When stored in memory the bytes, words, and doublewords in the packed data types are stored in consecutive addresses, with the least significant byte, word, or doubleword being stored in the lowest address and the more significant bytes, words, or…

MMX Conversion

There are several cases where elements of packed data may be required to be repositioned within the packed data, or the elements of two packed data operands may need to be merged. There are cases where either input or the…

MMX Comparison

These instructions generate a mask of ones or zeros which can be used by logical operations to select elements within a register: a developer can implement a packed conditional move operation without a set of branch instructions.

PCMPEQB mm,…

MMX Arithmetic

SIMD addition

The MMX technology supports both saturating and wraparound modes. In wraparound mode, results that overflow or underflow are truncated and only the lower (least significant) bits of the result are returned. In saturation mode, results of an operation that overflow…
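
The difference between the two modes is easiest to see on a single byte-sized element. The following plain C sketch models one lane of a wraparound add, an unsigned saturating add and a signed saturating add; the function names are hypothetical and chosen only for this example.

```c
#include <stdint.h>

/* Wraparound: keep only the low 8 bits of the sum. */
uint8_t add_wrap_u8(uint8_t a, uint8_t b) {
    return (uint8_t)(a + b);
}

/* Unsigned saturation: clamp the sum to the range 0..255. */
uint8_t add_sat_u8(uint8_t a, uint8_t b) {
    unsigned sum = (unsigned)a + (unsigned)b;
    return sum > 255u ? 255u : (uint8_t)sum;
}

/* Signed saturation: clamp the sum to the range -128..127. */
int8_t add_sat_s8(int8_t a, int8_t b) {
    int sum = (int)a + (int)b;
    if (sum > 127)
        return 127;
    if (sum < -128)
        return -128;
    return (int8_t)sum;
}
```

For example, 200 + 100 gives 44 in wraparound mode (300 modulo 256) but 255 with unsigned saturation, which is usually the preferable result for pixel data.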